---
language: en
library_name: transformers
license: mit
tags:
- finetuning
- lora
- qlora
- unsloth
- gpu
- distributed
datasets:
- wikitext
pipeline_tag: text-generation
model-index:
- name: Humigence
results:
- task:
type: text-generation
dataset:
name: WikiText-2
type: wikitext
metrics:
- type: loss
value: 1.50
---
# 🧠 Humigence CLI
**Your AI. Your pipeline. Zero code.**
A complete MLOps suite built for makers, teams, and enterprises. Humigence provides zero-config, GPU-aware fine-tuning with surgical precision and complete reproducibility.
## ✨ Key Features
- 🎯 **Interactive Wizard**: Step-by-step configuration with Basic/Advanced modes
- 🖥️ **Smart GPU Detection**: Automatic detection and selection of available GPUs
- 🔗 **Dual-GPU Training**: Multi-GPU support with Unsloth + TorchRun
- 🧪 **Training Recipes**: QLoRA (4-bit), LoRA (FP16/BF16), Full Fine-tuning
- 📊 **Intelligent Batching**: Auto-fit batch size to available VRAM
- 🔁 **Complete Reproducibility**: Config snapshots and reproduce scripts
- 📈 **Built-in Evaluation**: Curated prompts and quality gates
- 📦 **Artifact Export**: Structured outputs with run summaries
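To illustrate the auto-fit batching idea, a heuristic like the following could map free VRAM to a usable batch size. The per-sample cost, reserve, and power-of-two rounding here are illustrative assumptions, not Humigence's actual formula:

```python
def fit_batch_size(free_vram_gb: float, per_sample_gb: float = 1.5,
                   reserve_gb: float = 2.0, max_batch: int = 64) -> int:
    """Largest power-of-two batch that fits in the remaining VRAM.

    per_sample_gb is an assumed per-sample memory cost; real tooling
    would measure this empirically for the chosen model and recipe.
    """
    usable = max(free_vram_gb - reserve_gb, 0.0)
    raw = int(usable // per_sample_gb)
    # Round down to a power of two, with a floor of 1.
    fitted = 1
    while fitted * 2 <= min(raw, max_batch):
        fitted *= 2
    return max(fitted, 1)
```

For example, a 24 GB card with these assumptions yields a batch size of 8, while an 8 GB card yields 4.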
## 🚀 Quick Start
### Prerequisites
- **GPU**: NVIDIA GPU with CUDA support (RTX 5090, RTX 4080, etc.)
- **RAM**: 8GB+ recommended
- **Storage**: 10GB+ for models and datasets
- **Python**: 3.8+ with PyTorch
### Installation
```bash
# Clone the repository
git clone https://github.com/your-username/humigence.git
cd humigence
# Install dependencies
pip install -r requirements.txt
# Set up Unsloth (required for training)
python3 training/unsloth/setup_humigence_unsloth.py
# Launch the interactive wizard
python3 cli/main.py
```
### Basic Usage
```bash
# Launch the interactive wizard
python3 cli/main.py
# The wizard will guide you through:
# 1. Model selection
# 2. Dataset configuration
# 3. Training parameters
# 4. GPU selection (single or multi-GPU)
# 5. Launch training
```
## 🎯 Training Workflow
### 1. Interactive Setup
The Humigence wizard guides you through:
- **Setup Mode**: Basic (essential config) or Advanced (full control)
- **Hardware Detection**: Automatic GPU, CPU, and memory detection
- **Model Selection**: Choose from supported models or custom paths
- **Dataset Loading**: Auto-detection from `~/humigence_data/` or custom paths
- **Training Recipe**: QLoRA, LoRA, or Full Fine-tuning
- **GPU Selection**: Single-GPU auto-selection or multi-GPU prompting
### 2. GPU Selection
Humigence intelligently handles GPU selection:
- **Single GPU**: Automatically selects and uses the available GPU
- **Multiple GPUs**: Prompts you to choose:
```
🔧 Training Mode:
> Multi-GPU Training (all available GPUs)
Single GPU Training (choose specific GPU)
```
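The selection logic described above boils down to a small branch on the detected GPU count. This sketch mirrors that behavior with plain logic (the function name and return values are illustrative, not Humigence's internal API):

```python
def choose_training_mode(gpu_count: int, prefer_multi: bool = True) -> str:
    """Mirror the wizard's behavior: one GPU is used automatically,
    several GPUs trigger a mode choice between multi- and single-GPU."""
    if gpu_count == 0:
        raise RuntimeError("no CUDA GPUs detected")
    if gpu_count == 1:
        # Single GPU: auto-select device 0, no prompt.
        return "single:0"
    # Multiple GPUs: the wizard would prompt; prefer_multi stands in
    # for the user's answer here.
    return "multi" if prefer_multi else "single:choose"
```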
### 3. Training Execution
```
🚀 Humigence Training Starting...
✅ Configuration Loaded: [all settings]
🖥️ GPU Detection: 2x RTX 5090 detected
🔧 Training Mode: Multi-GPU Training
📦 Loading model: Qwen/Qwen2.5-0.5B
✅ LoRA adapters applied
📚 Loading dataset: wikitext2 (10,000 samples)
🚀 Starting training with TorchRun...
✅ Training complete – adapters saved.
```
## 📚 Supported Models
- **Qwen/Qwen2.5-0.5B**: ~0.5B parameters (recommended for testing)
- **microsoft/phi-2**: 2.7B parameters
- **TinyLlama/TinyLlama-1.1B-Chat-v1.0**: 1.1B parameters
- **Custom Models**: Any Hugging Face model ID or local path
## 🗂️ Dataset Support
- **JSONL Format**: Line-by-line JSON with instruction/output pairs
- **Auto-Detection**: Scans `~/humigence_data/` directory
- **Custom Paths**: Specify any local dataset file
- **Sample Datasets**: Includes demo datasets for testing
### Dataset Format
```json
{"instruction": "What is machine learning?", "output": "Machine learning is a subset of artificial intelligence..."}
{"instruction": "Explain quantum computing", "output": "Quantum computing uses quantum mechanical phenomena..."}
```
## 🖥️ Hardware Requirements
### Minimum Requirements
- **GPU**: NVIDIA GPU with 8GB+ VRAM
- **RAM**: 16GB+ system RAM
- **Storage**: 20GB+ free space
### Recommended Setup
- **GPU**: RTX 4080/4090/5090 or better
- **RAM**: 32GB+ system RAM
- **Storage**: 50GB+ free space
### Multi-GPU Support
- **Dual-GPU**: RTX 5090 + RTX 5090 (tested)
- **Memory**: 16GB+ VRAM per GPU recommended
- **Training**: Automatic TorchRun distribution
## 📁 Project Structure
```
humigence/
├── cli/
│   ├── main.py                 # Main CLI entry point
│   ├── config_wizard.py        # Interactive configuration wizard
│   └── lora_wizard.py          # LoRA-specific wizard
├── training/
│   └── unsloth/                # Unsloth integration
│       ├── wizard.py           # Unsloth training wizard
│       └── train_lora_dual.py  # Multi-GPU training script
├── pipelines/
│   └── lora_trainer.py         # Training pipeline
├── utils/
│   ├── device.py               # Hardware detection
│   ├── dataset_loader.py       # Dataset utilities
│   └── validators.py           # Data validation
├── config/
│   └── default_config.json     # Default configuration
└── runs/                       # Training outputs
    └── humigence/
        ├── config.snapshot.json
        ├── adapters/           # LoRA weights
        └── artifacts.zip       # Complete export
```
## 🔧 Configuration
### Basic Mode (Recommended)
Essential configuration with sensible defaults:
- **Learning Rate**: 2e-4
- **Epochs**: 1
- **Gradient Accumulation**: 4
- **LoRA Rank**: 16
- **LoRA Alpha**: 32
### Advanced Mode
Full control over all parameters:
- LoRA configuration (rank, alpha, dropout)
- Training hyperparameters
- Data processing options
- Evaluation settings
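The Basic-mode defaults above can be captured as a config fragment. The field names below are illustrative assumptions; the real schema lives in `config/default_config.json`:

```python
# Illustrative defaults mirroring Basic mode; field names are
# assumptions, not the actual default_config.json schema.
BASIC_DEFAULTS = {
    "learning_rate": 2e-4,
    "epochs": 1,
    "gradient_accumulation_steps": 4,
    "lora": {"rank": 16, "alpha": 32},
}

def merge_config(overrides: dict) -> dict:
    """Shallow-merge Advanced-mode overrides on top of the defaults."""
    return {**BASIC_DEFAULTS, **overrides}
```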
## 🚀 Training Modes
### Single-GPU Training
```
# Automatically selected when 1 GPU detected
🔧 Single GPU detected - using GPU 0: RTX 5090
🚀 Launching single-GPU training...
```
### Multi-GPU Training
```
# Prompts when multiple GPUs detected
🔧 2 GPUs detected - choose training mode
> Multi-GPU Training (all available GPUs)
  Single GPU Training (choose specific GPU)
```
## 📈 Evaluation & Monitoring
### Built-in Evaluation
- **Curated Prompts**: 5 diverse evaluation questions
- **Model Inference**: Generation with temperature sampling
- **Quality Gates**: Loss thresholds and evaluation metrics
- **Status Tracking**: ACCEPTED.txt or REJECTED.txt files
### Run Monitoring
```bash
# View training progress
tail -f runs/humigence/training.log
# Check evaluation results
cat runs/humigence/eval_results.jsonl
# View run summary
cat runs/humigence/run_summary.json
```
## 🔁 Reproducibility
Every training run generates:
- **Config Snapshot**: Complete configuration in JSON
- **Reproduce Script**: One-click rerun capability
- **Artifact Archive**: Complete export of all outputs
- **Run Summary**: Structured metadata for tracking
```bash
# Rerun any training
./runs/humigence/reproduce.sh
# Or use the config directly
python3 training/unsloth/train_lora_dual.py --config runs/humigence/config.snapshot.json
```
## 🛠️ Development
### Dependencies
Core dependency versions are constrained for stability:
```txt
transformers>=4.41.0,<5.0.0
torch>=2.1.0
unsloth @ git+https://github.com/unslothai/unsloth.git
rich>=13.0.0
inquirer>=3.1.0
```
### Local Development
```bash
# Install in development mode
pip install -e .
# Run tests
python3 -m pytest tests/
# Run specific test
python3 test_gpu_selection.py
```
## 🤝 Contributing
We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for details.
### Quick Contribution Guide
1. Fork the repository
2. Create a feature branch: `git checkout -b feature/amazing-feature`
3. Make your changes
4. Add tests if applicable
5. Commit your changes: `git commit -m 'Add amazing feature'`
6. Push to the branch: `git push origin feature/amazing-feature`
7. Open a Pull Request
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🙏 Acknowledgments
- [Unsloth](https://github.com/unslothai/unsloth) for fast LoRA training
- [HuggingFace](https://huggingface.co/) for the transformers library
- [Microsoft](https://github.com/microsoft) for PEFT and LoRA implementations
- The open-source ML community
## 📊 Comparison with Other Tools
| Feature | Humigence CLI | Other Tools |
|---------|---------------|-------------|
| **Setup** | Interactive wizard | Manual config |
| **GPU Detection** | Automatic | Manual |
| **Multi-GPU** | Built-in TorchRun | Complex setup |
| **Reproducibility** | Complete snapshots | Partial |
| **Evaluation** | Built-in prompts | External tools |
| **Artifacts** | Structured export | Manual collection |
## 🔍 Troubleshooting
### Common Issues
**GPU not detected:**
```bash
# Check CUDA installation
python3 -c "import torch; print(torch.cuda.is_available())"
# Check GPU visibility
nvidia-smi
```
**Out of memory:**
```bash
# Reduce batch size in config
# Or use QLoRA for memory efficiency
```
**Training fails:**
```bash
# Check logs
cat runs/humigence/training.log
# Verify dataset format
head -5 ~/humigence_data/your_dataset.jsonl
```
### Getting Help
- **Issues**: [GitHub Issues](https://github.com/your-username/humigence/issues)
- **Discussions**: [GitHub Discussions](https://github.com/your-username/humigence/discussions)
- **Documentation**: [Wiki](https://github.com/your-username/humigence/wiki)
## 🗺️ Roadmap
### Current Features ✅
- Interactive configuration wizard
- Single and multi-GPU training
- QLoRA and LoRA support
- Built-in evaluation
- Complete reproducibility
### Coming Soon 🚧
- RAG implementation
- EnterpriseGPT integration
- Batch inference
- Context length optimization
- Web UI interface
- Model serving
### Future Features 🔮
- Distributed training across nodes
- Advanced evaluation metrics
- Model compression
- Deployment automation
---
**Built with ❤️ for the AI community**
*Humigence – Your AI. Your pipeline. Zero code.*