GGUF Loader
🚀 NEW: Agentic Mode Now Available! Transform your local AI into an autonomous coding assistant that can read, create, edit, and organize files in your workspace.
GGUF Loader is a privacy-first desktop platform for running local LLMs with advanced agentic capabilities, floating tools, and a powerful plugin system, giving you instant AI anywhere on your screen, all running offline on your machine.
The problem:
Running open-source LLMs locally is powerful but painful. It's either command-line only or scattered across multiple tools. There's no GUI that brings it all together: no ecosystem, no UX, no quick way to make LLMs useful in daily tasks. Even worse, most local LLMs just chat; they can't autonomously read your files, write code, or automate development tasks. You're stuck doing everything manually.
The solution:
GGUF Loader gives users a beautiful desktop interface, one-click model loading, and a plugin system inspired by Blender. But it goes further: with autonomous agentic mode, your AI can now read, create, edit, and organize files automatically. Generate entire project structures, refactor code, create tests, and automate workflows, all using fully offline models. Plus, with its built-in floating button, you can summon AI from anywhere on your screen. It's a privacy-first productivity layer that turns LLMs into true autonomous agents you can drag, click, and extend with plugins.
🎯 Mission Statement
We believe AI shouldn't live in the cloud; it should live on your screen, always on, fully yours. GGUF Loader is building the interface layer for the local LLM revolution: a plugin-based platform with floating assistants, developer extensibility, and a vision to empower millions with intelligent local tools.
📥 Download GGUF Loader
Option 1: Download ZIP (Recommended - Works on All Platforms)
⬇️ Download GGUF Loader ZIP (~5 MB)
How to Run:
- Download the ZIP file above
- Extract the ZIP file to any folder on your computer
- Run the launcher:
  - Windows: Double-click `launch.bat`
  - Linux/macOS: Double-click `launch.sh` (or run `chmod +x launch.sh && ./launch.sh` in a terminal)
- First time only: wait 1-2 minutes while dependencies install automatically
- Done! The app will start. Next time, just double-click the launcher again; it starts instantly.
Option 2: Windows Executable (No Installation Required)
⬇️ Download GGUFLoader_v2.1.1.agentic_mode.exe (~150-300 MB)
Just download and double-click to run! No Python, no dependencies, no setup needed.
Option 3: Install via pip
pip install ggufloader
ggufloader
🔗 Links
- GitHub: https://github.com/GGUFloader/gguf-loader
- Website: https://ggufloader.github.io
- PyPI: https://pypi.org/project/ggufloader/
🚀 What's New in v2.1.1 (February 2026)
🤖 Agentic Mode - Autonomous AI Assistant
Transform your local AI into an autonomous coding assistant that can:
- 📖 Read Files - Analyze code, documentation, and project structure
- ✏️ Create Files - Generate new source files, configs, and documentation
- 📝 Edit Files - Modify existing code and update configurations
- 📂 Organize Files - Create folders, move files, and restructure projects
- ⚡ Automate Tasks - Execute multi-step workflows without manual intervention
🎯 Key Features
- Toggle Agent Mode - Enable/disable agentic capabilities with a single checkbox
- Workspace Selector - Choose your project folder with an intuitive file browser
- Real-time Status - Visual indicators show agent state (Ready, Processing, Error)
- Tool Execution - Agent can use various tools to accomplish complex tasks
- Multi-step Reasoning - Break down complex problems into manageable steps
- 100% Private - All processing happens locally on your machine
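The tool-execution loop behind these features can be sketched as follows. This is an illustrative sketch, not GGUF Loader's actual API: `WORKSPACE`, `read_file`, `create_file`, and `run_agent` are hypothetical names. The key idea is that every model-proposed tool call is executed by the host and confined to the selected workspace folder:

```python
from pathlib import Path

WORKSPACE = Path("./workspace")  # hypothetical workspace root chosen by the user

def _resolve(rel_path: str) -> Path:
    """Map a model-supplied relative path into the workspace, refusing escapes."""
    root = WORKSPACE.resolve()
    target = (root / rel_path).resolve()
    target.relative_to(root)  # raises ValueError if the path escapes the workspace
    return target

def read_file(rel_path: str) -> str:
    """Tool: return the contents of a workspace file."""
    return _resolve(rel_path).read_text()

def create_file(rel_path: str, content: str) -> str:
    """Tool: create (or overwrite) a workspace file."""
    target = _resolve(rel_path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content)
    return f"wrote {rel_path}"

# Tool registry the agent loop dispatches into.
TOOLS = {"read_file": read_file, "create_file": create_file}

def run_agent(plan):
    """Execute a list of (tool_name, kwargs) steps the model proposed."""
    return [TOOLS[name](**kwargs) for name, kwargs in plan]
```

In a real agent, `plan` would be produced step by step by the local model; the workspace-confinement check is what makes the explicit workspace selection meaningful.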
🧩 Floating Assistant Button
A persistent, system-wide AI helper that hovers over your screen: summon AI from anywhere to summarize, reply, translate, or search, all using fully offline models.
🔌 Add-on System (Blender-style Plugins)
Build your own AI tools inside GGUF Loader! Addons live directly in the chat UI with toggle switches: think PDF summarizers, spreadsheet bots, email assistants, and more.
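A Blender-style addon system typically boils down to a registry plus per-addon toggles. The sketch below is hypothetical (the real GGUF Loader plugin API may differ); it only illustrates the register-and-toggle pattern:

```python
# Hypothetical addon registry, not GGUF Loader's actual plugin API.
class Addon:
    def __init__(self, name, handler):
        self.name = name
        self.handler = handler   # callable: prompt text -> reply text
        self.enabled = False     # flipped by the toggle switch in the chat UI

REGISTRY = {}

def register(name):
    """Decorator that registers a handler function as a named addon."""
    def wrap(fn):
        REGISTRY[name] = Addon(name, fn)
        return fn
    return wrap

@register("Summarize PDF")
def summarize_pdf(text):
    # Stand-in logic: a real addon would call the loaded local model.
    return "Summary: " + text[:40]
```

The UI would then list `REGISTRY` entries with their toggle state and route chat input through every enabled addon's handler.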
📸 Screenshot
Modern GUI with agentic mode, floating assistant, and plugin system, all running locally on your machine.
📋 Model Card
This "model" repository hosts the Model Card and optional demo Space for GGUF Loader, a desktop application that loads, manages, and chats with GGUF-format large language models entirely offline.
📖 Description
GGUF Loader, with its floating button, is lightweight software that lets you easily run advanced AI language models (LLMs) like Mistral, LLaMA, and DeepSeek on Windows, macOS, and Linux. Its drag-and-drop graphical interface makes loading models quick and easy.
- ✨ GUI-First: No terminal commands; point-and-click interface
- 🔌 Plugin System: Extend with addons (PDF summarizer, email assistant, spreadsheet automator...)
- ⚡ Lightweight: Runs on machines as modest as an Intel i5 with 16 GB RAM
- 🔒 Offline & Private: All inference happens locally; no cloud calls
🎯 Intended Uses
- Autonomous Development: Let AI generate code, create project structures, and automate refactoring
- Local AI Prototyping: Experiment with open GGUF models without API costs
- Privacy-Focused Workflows: Chat and automate tasks privately on your own machine
- Code Generation: Generate boilerplate code, unit tests, and documentation
- File Automation: Automate file organization, configuration updates, and project restructuring
- Plugin Workflows: Build custom data-processing addons (e.g. summarization, code assistant)
⚠️ Limitations
- No cloud integration: Purely local, no access to OpenAI or Hugging Face inference APIs
- Workspace access: Agentic mode requires explicit workspace folder selection for file operations
- Model dependent: Agentic capabilities work best with instruction-tuned models (Mistral-7B recommended)
- Requires Python 3.8+ for pip installation (not needed for Windows .exe)
- GUI only: No headless server/CLI-only mode (coming soon)
- Dependencies: the pip install pulls in `llama-cpp-python` and `PySide6`
🚀 How to Use
1. Install
pip install ggufloader
2. Launch GUI
ggufloader
3. Load Your Model
- Drag & drop your `.gguf` model file into the window
- Select plugin(s) from the sidebar (e.g. "Summarize PDF")
- Start chatting!
4. Python API
from ggufloader import chat
# Ensure you have a GGUF model in ./models/mistral.gguf
chat("Hello offline world!", model_path="./models/mistral.gguf")
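If you keep several models locally, a small helper can pick one before calling `chat()`. The helper below is hypothetical (it is not part of the ggufloader package), shown only as a convenience sketch:

```python
from pathlib import Path

def find_gguf(models_dir="./models"):
    """Return the path of the first .gguf file in models_dir, or None if empty."""
    candidates = sorted(Path(models_dir).glob("*.gguf"))
    return str(candidates[0]) if candidates else None
```

If it returns a path, pass it as `model_path` to `chat()` as shown above; if it returns None, download a model first (see the model links below).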
📦 Features
| Feature | Description |
|---|---|
| GUI for GGUF LLMs | Point-and-click model loading & chatting |
| Plugin Addons | Summarization, code helper, email reply, more |
| Cross-Platform | Windows, macOS, Linux |
| Multi-Model Support | Mistral, LLaMA, DeepSeek, Yi, Gemma, OpenHermes |
| Memory-Efficient | Designed to run on 16 GB RAM or higher |
💡 Comparison
| Tool | GUI | Agentic Mode | Plugins | Pip Install | Offline | Notes |
|---|---|---|---|---|---|---|
| GGUF Loader | ✅ | ✅ | ✅ | ✅ | ✅ | Autonomous agent, extensible UI |
| LM Studio | ✅ | ❌ | ❌ | ❌ | ✅ | Polished, but no automation |
| Ollama | ❌ | ❌ | ❌ | ❌ | ✅ | CLI-first, narrow use case |
| GPT4All | ✅ | ❌ | ❌ | ❌ | ✅ | Limited extensibility |
🌐 Demo Space
Try a static demo or minimal Gradio embed (no live inference) here:
https://huggingface.co/spaces/Hussain2050/gguf-loader-demo
📚 Citation
If you use GGUF Loader in your research or project, please cite:
@misc{ggufloader2026,
title = {GGUF Loader: Agentic AI Platform with GUI & Plugin System for Local LLMs},
author = {Hussain Nazary},
year = {2026},
howpublished = {\url{https://github.com/GGUFloader/gguf-loader}},
note = {Version 2.1.1, PyPI: ggufloader}
}
💻 System Requirements
Minimum Requirements
- OS: Windows 10/11, Linux (Ubuntu 20.04+), or macOS 10.15+
- RAM: 8 GB (for Q4_0 models)
- Storage: 2 GB free space + model size (4-50 GB depending on model)
- CPU: Intel i5 or equivalent (any modern processor)
Recommended Requirements
- RAM: 16 GB or more
- GPU: NVIDIA GPU with CUDA support (optional, for faster inference)
- Storage: SSD for better performance
Model Size Guide
- 7B models (Q4_0): ~4-5 GB, needs 8 GB RAM
- 7B models (Q6_K): ~6-7 GB, needs 12 GB RAM
- 20B models (Q4_K): ~7-8 GB, needs 16 GB RAM
- 120B models (Q4_K): ~46 GB, needs 64 GB RAM
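As a rough sketch of the pattern in the guide above: a model needs roughly its file size in RAM plus headroom for the KV cache and the OS. The 3 GB headroom and tier list below are illustrative assumptions, not measurements, so the guide above takes precedence:

```python
def recommended_ram_gb(model_file_gb: float) -> int:
    """Map a GGUF file size (GB) to the nearest common RAM tier (rule of thumb)."""
    for tier in (8, 12, 16, 32, 64, 128):
        # ~3 GB of headroom assumed for KV cache, runtime, and the OS.
        if model_file_gb + 3 <= tier:
            return tier
    return 256
```

For example, a 4.4 GB Q4_K_M file lands in the 8 GB tier, while a 46 GB 120B quant lands in the 64 GB tier, matching those rows of the guide.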
📥 Download GGUF Models
⚡ Click a link below to download the model file directly (no Hugging Face page in between).
🧠 GPT-OSS Models (Open Source GPTs)
High-quality, Apache 2.0 licensed, reasoning-focused models for local/enterprise use.
🧠 GPT-OSS 120B (Dense)
🧠 GPT-OSS 20B (Dense)
🧠 Mistral-7B Instruct (Recommended for Agentic Mode)
- ⬇️ Download Q4_K_M (4.37 GB) - BEST for agentic workflows
- ⬇️ Download Q4_0 (4.23 GB) - Faster, slightly lower quality
- ⬇️ Download Q6_K (6.23 GB) - Higher quality, needs more RAM
Q4_K_M offers a strong balance of quality and speed for code generation and multi-step problem solving.
Qwen 1.5-7B Chat
- ⬇️ Download Q4_K_M (4.92 GB) - Recommended
- ⬇️ Download Q4_K (4.88 GB)
- ⬇️ Download Q6_K (6.83 GB)
DeepSeek 7B Chat
- ⬇️ Download Q4_K_M (4.92 GB) - Recommended
- ⬇️ Download Q4_0 (4.87 GB)
- ⬇️ Download Q8_0 (9.33 GB)
LLaMA 3 8B Instruct
- ⬇️ Download Q4_K_M (4.92 GB) - Recommended
- ⬇️ Download Q4_0 (4.68 GB)
- ⬇️ Download Q6_K (6.91 GB)
🗂️ More Model Collections
🤝 Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
🆘 Support & Issues
- Report Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: hossainnazary475@gmail.com
⚖️ License
This project is licensed under the MIT License. See LICENSE for details.
Built with ❤️ by the GGUF Loader community
Last updated: February 7, 2026
