# Flask API Documentation

## Overview

The Research AI Assistant API provides a RESTful interface for interacting with an AI-powered research assistant. The API uses local GPU models for inference and supports conversational interactions with context management.

**Base URL (HF Spaces):** `https://jatinautonomouslabs-research-ai-assistant-api.hf.space`

**Alternative Base URL:** `https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API`

**API Version:** 1.0

**Content-Type:** `application/json`

> **Note:** For Hugging Face Spaces Docker deployments, use the `.hf.space` domain format. The space name is converted to lowercase with hyphens.

## Features

- 🤖 **AI-Powered Responses** - Local GPU model inference (Tesla T4)
- 💬 **Conversational Context** - Maintains conversation history and user context
- 🔒 **CORS Enabled** - Ready for web integration
- ⚡ **Async Processing** - Efficient request handling
- 📊 **Transparent Reasoning** - Returns reasoning chains and performance metrics

---

## Authentication

Currently, the API does not require authentication. For production use, however, you should:

1. Set the `HF_TOKEN` environment variable for Hugging Face model access
2. Implement API key authentication if needed

---

## Endpoints

### 1. Get API Information

**Endpoint:** `GET /`

**Description:** Returns API information, version, and available endpoints.

**Request:**

```http
GET / HTTP/1.1
Host: jatinautonomouslabs-research-ai-assistant-api.hf.space
```

**Response:**

```json
{
  "name": "AI Assistant Flask API",
  "version": "1.0",
  "status": "running",
  "orchestrator_ready": true,
  "features": {
    "local_gpu_models": true,
    "max_workers": 4,
    "hardware": "NVIDIA T4 Medium"
  },
  "endpoints": {
    "health": "GET /api/health",
    "chat": "POST /api/chat",
    "initialize": "POST /api/initialize"
  }
}
```

**Status Codes:**

- `200 OK` - Success

---

### 2. Health Check

**Endpoint:** `GET /api/health`

**Description:** Checks if the API and orchestrator are ready to handle requests.
**Request:**

```http
GET /api/health HTTP/1.1
Host: jatinautonomouslabs-research-ai-assistant-api.hf.space
```

**Response:**

```json
{
  "status": "healthy",
  "orchestrator_ready": true
}
```

**Status Codes:**

- `200 OK` - API is healthy
  - `orchestrator_ready: true` - Ready to process requests
  - `orchestrator_ready: false` - Still initializing

**Example Response (Initializing):**

```json
{
  "status": "initializing",
  "orchestrator_ready": false
}
```

---

### 3. Chat Endpoint

**Endpoint:** `POST /api/chat`

**Description:** Send a message to the AI assistant and receive a response with reasoning and context.

**Request Headers:**

```http
Content-Type: application/json
```

**Request Body:**

```json
{
  "message": "Explain quantum entanglement in simple terms",
  "history": [
    ["User message 1", "Assistant response 1"],
    ["User message 2", "Assistant response 2"]
  ],
  "session_id": "session-123",
  "user_id": "user-456"
}
```

**Request Fields:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `message` | string | ✅ Yes | User's message/question (max 10,000 characters) |
| `history` | array | ❌ No | Conversation history as array of `[user, assistant]` pairs |
| `session_id` | string | ❌ No | Unique session identifier for context continuity |
| `user_id` | string | ❌ No | User identifier (defaults to "anonymous") |

**Response (Success):**

```json
{
  "success": true,
  "message": "Quantum entanglement is when two particles become linked...",
  "history": [
    ["Explain quantum entanglement", "Quantum entanglement is when two particles become linked..."]
  ],
  "reasoning": {
    "intent": "educational_query",
    "steps": ["Understanding request", "Gathering information", "Synthesizing response"],
    "confidence": 0.95
  },
  "performance": {
    "response_time_ms": 2345,
    "tokens_generated": 156,
    "model_used": "mistralai/Mistral-7B-Instruct-v0.2"
  }
}
```

**Response Fields:**

| Field | Type | Description |
|-------|------|-------------|
| `success` | boolean | Whether the request was successful |
| `message` | string | AI assistant's response |
| `history` | array | Updated conversation history including the new exchange |
| `reasoning` | object | AI reasoning process and confidence metrics |
| `performance` | object | Performance metrics (response time, tokens, model used) |

**Status Codes:**

- `200 OK` - Request processed successfully
- `400 Bad Request` - Invalid request (missing message, empty message, too long, wrong type)
- `500 Internal Server Error` - Server error processing request
- `503 Service Unavailable` - Orchestrator not ready (still initializing)

**Error Response:**

```json
{
  "success": false,
  "error": "Message is required",
  "message": "Error processing your request. Please try again."
}
```

---

### 4. Initialize Orchestrator

**Endpoint:** `POST /api/initialize`

**Description:** Manually trigger orchestrator initialization (useful if initialization failed on startup).

**Request:**

```http
POST /api/initialize HTTP/1.1
Host: jatinautonomouslabs-research-ai-assistant-api.hf.space
Content-Type: application/json
```

**Request Body:**

```json
{}
```

**Response (Success):**

```json
{
  "success": true,
  "message": "Orchestrator initialized successfully"
}
```

**Response (Failure):**

```json
{
  "success": false,
  "message": "Initialization failed. Check logs for details."
}
```
**Status Codes:**

- `200 OK` - Initialization successful
- `500 Internal Server Error` - Initialization failed

---

## Code Examples

### Python

```python
import requests

BASE_URL = "https://jatinautonomouslabs-research-ai-assistant-api.hf.space"

# Check health
def check_health():
    response = requests.get(f"{BASE_URL}/api/health")
    return response.json()

# Send chat message
def send_message(message, session_id=None, user_id=None, history=None):
    payload = {
        "message": message,
        "session_id": session_id,
        "user_id": user_id or "anonymous",
        "history": history or []
    }

    response = requests.post(
        f"{BASE_URL}/api/chat",
        json=payload,
        headers={"Content-Type": "application/json"}
    )

    if response.status_code == 200:
        return response.json()
    else:
        raise Exception(f"API Error: {response.status_code} - {response.text}")

# Example usage
if __name__ == "__main__":
    # Check if API is ready
    health = check_health()
    print(f"API Status: {health}")

    if health.get("orchestrator_ready"):
        # Send a message
        result = send_message(
            message="What is machine learning?",
            session_id="my-session-123",
            user_id="user-456"
        )
        print(f"Response: {result['message']}")
        print(f"Reasoning: {result.get('reasoning', {})}")

        # Continue conversation
        history = result['history']
        result2 = send_message(
            message="Can you explain neural networks?",
            session_id="my-session-123",
            user_id="user-456",
            history=history
        )
        print(f"Follow-up Response: {result2['message']}")
```

### JavaScript (Fetch API)

```javascript
const BASE_URL = 'https://jatinautonomouslabs-research-ai-assistant-api.hf.space';

// Check health
async function checkHealth() {
  const response = await fetch(`${BASE_URL}/api/health`);
  return await response.json();
}

// Send chat message
async function sendMessage(message, sessionId = null, userId = null, history = []) {
  const payload = {
    message: message,
    session_id: sessionId,
    user_id: userId || 'anonymous',
    history: history
  };

  const response = await fetch(`${BASE_URL}/api/chat`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(payload)
  });

  if (!response.ok) {
    const error = await response.json();
    throw new Error(`API Error: ${response.status} - ${error.error || error.message}`);
  }

  return await response.json();
}

// Example usage
async function main() {
  try {
    // Check if API is ready
    const health = await checkHealth();
    console.log('API Status:', health);

    if (health.orchestrator_ready) {
      // Send a message
      const result = await sendMessage(
        'What is machine learning?',
        'my-session-123',
        'user-456'
      );
      console.log('Response:', result.message);
      console.log('Reasoning:', result.reasoning);

      // Continue conversation
      const result2 = await sendMessage(
        'Can you explain neural networks?',
        'my-session-123',
        'user-456',
        result.history
      );
      console.log('Follow-up Response:', result2.message);
    }
  } catch (error) {
    console.error('Error:', error);
  }
}

main();
```

### cURL

```bash
# Check health
curl -X GET "https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/health"

# Send chat message
curl -X POST "https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What is machine learning?",
    "session_id": "my-session-123",
    "user_id": "user-456",
    "history": []
  }'

# Continue conversation
curl -X POST "https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Can you explain neural networks?",
    "session_id": "my-session-123",
    "user_id": "user-456",
    "history": [
      ["What is machine learning?", "Machine learning is a subset of artificial intelligence..."]
    ]
  }'
```

### Node.js (Axios)

```javascript
const axios = require('axios');

const BASE_URL = 'https://jatinautonomouslabs-research-ai-assistant-api.hf.space';

// Check health
async function checkHealth() {
  const response = await axios.get(`${BASE_URL}/api/health`);
  return response.data;
}

// Send chat message
async function sendMessage(message, sessionId = null, userId = null, history = []) {
  try {
    const response = await axios.post(`${BASE_URL}/api/chat`, {
      message: message,
      session_id: sessionId,
      user_id: userId || 'anonymous',
      history: history
    }, {
      headers: {
        'Content-Type': 'application/json'
      }
    });

    return response.data;
  } catch (error) {
    if (error.response) {
      throw new Error(`API Error: ${error.response.status} - ${error.response.data.error || error.response.data.message}`);
    }
    throw error;
  }
}

// Example usage
(async () => {
  try {
    const health = await checkHealth();
    console.log('API Status:', health);

    if (health.orchestrator_ready) {
      const result = await sendMessage(
        'What is machine learning?',
        'my-session-123',
        'user-456'
      );
      console.log('Response:', result.message);
    }
  } catch (error) {
    console.error('Error:', error.message);
  }
})();
```

---

## Error Handling

### Common Error Responses

#### 400 Bad Request

**Missing Message:**

```json
{
  "success": false,
  "error": "Message is required"
}
```

**Empty Message:**

```json
{
  "success": false,
  "error": "Message cannot be empty"
}
```

**Message Too Long:**

```json
{
  "success": false,
  "error": "Message too long. Maximum length is 10000 characters"
}
```

**Invalid Type:**

```json
{
  "success": false,
  "error": "Message must be a string"
}
```

#### 503 Service Unavailable

**Orchestrator Not Ready:**

```json
{
  "success": false,
  "error": "Orchestrator not ready",
  "message": "AI system is initializing. Please try again in a moment."
}
```

**Solution:** Wait a few seconds and retry, or check the `/api/health` endpoint.

#### 500 Internal Server Error

**Generic Error:**

```json
{
  "success": false,
  "error": "Error message here",
  "message": "Error processing your request. Please try again."
}
```

---

## Best Practices
### 1. Session Management

- **Use consistent session IDs** for maintaining conversation context
- **Generate unique session IDs** per user conversation thread
- **Include conversation history** in subsequent requests for better context

```python
# Good: Maintains context
session_id = "user-123-session-1"
history = []

# First message
result1 = send_message("What is AI?", session_id=session_id, history=history)
history = result1['history']

# Follow-up message (includes context)
result2 = send_message("Can you explain more?", session_id=session_id, history=history)
```

### 2. Error Handling

Always implement retry logic for 503 errors:

```python
import time

def send_message_with_retry(message, max_retries=3, retry_delay=2):
    for attempt in range(max_retries):
        try:
            result = send_message(message)
            return result
        except Exception as e:
            if "503" in str(e) and attempt < max_retries - 1:
                time.sleep(retry_delay)
                continue
            raise
```

### 3. Health Checks

Check API health before sending requests:

```python
def is_api_ready():
    try:
        health = check_health()
        return health.get("orchestrator_ready", False)
    except Exception:
        return False

if is_api_ready():
    # Send request
    result = send_message("Hello")
else:
    print("API is not ready yet")
```

### 4. Rate Limiting

- **No explicit rate limits** are currently enforced
- **Recommended:** Implement client-side rate limiting (e.g., 1 request per second)
- **Consider:** Implementing request queuing for high-volume applications

### 5. Message Length

- **Maximum:** 10,000 characters per message
- **Recommended:** Keep messages concise for faster processing
- **For long content:** Split into multiple messages or summarize
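The client-side rate limiting recommended above can be sketched with a minimal throttle: a small helper that enforces a minimum interval between outgoing requests. This is an illustrative sketch, not part of the API; the `Throttle` class name and the 1-second interval are assumptions:

```python
import time

class Throttle:
    """Enforce a minimum interval between successive calls (client-side rate limit)."""

    def __init__(self, min_interval=1.0):
        # min_interval: seconds to wait between requests (illustrative default)
        self.min_interval = min_interval
        self._last_call = float("-inf")

    def wait(self):
        # Sleep just long enough to honor the minimum interval, then record the call time
        remaining = self.min_interval - (time.monotonic() - self._last_call)
        if remaining > 0:
            time.sleep(remaining)
        self._last_call = time.monotonic()

# Usage: call throttle.wait() immediately before each send_message(...) call
throttle = Throttle(min_interval=1.0)
```

For high-volume applications, the same idea extends naturally to a request queue drained by a worker that calls `throttle.wait()` before each dequeue.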
### 6. Context Management

- **Include history** in requests to maintain conversation context
- **Session IDs** help track conversations across multiple requests
- **User IDs** enable personalization and user-specific context

---

## Integration Examples

### React Component

```jsx
import React, { useState } from 'react';

const AIAssistant = () => {
  const [message, setMessage] = useState('');
  const [history, setHistory] = useState([]);
  const [loading, setLoading] = useState(false);
  const [sessionId] = useState(`session-${Date.now()}`);

  const sendMessage = async () => {
    if (!message.trim()) return;

    setLoading(true);
    try {
      const response = await fetch('https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/chat', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({
          message: message,
          session_id: sessionId,
          user_id: 'user-123',
          history: history
        })
      });

      const data = await response.json();
      if (data.success) {
        setHistory(data.history);
        setMessage('');
      }
    } catch (error) {
      console.error('Error:', error);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div className="ai-assistant">
      <div className="chat-history">
        {history.map(([user, assistant], idx) => (
          <div key={idx} className="exchange">
            <p><strong>You:</strong> {user}</p>
            <p><strong>Assistant:</strong> {assistant}</p>
          </div>
        ))}
      </div>
      <input
        type="text"
        value={message}
        onChange={(e) => setMessage(e.target.value)}
        onKeyPress={(e) => e.key === 'Enter' && sendMessage()}
        disabled={loading}
      />
    </div>
  );
};
```

### Python CLI Tool

```python
#!/usr/bin/env python3
import uuid

import requests

BASE_URL = "https://jatinautonomouslabs-research-ai-assistant-api.hf.space"

class ChatCLI:
    def __init__(self):
        # Generate a unique session ID for this CLI run
        self.session_id = f"cli-session-{uuid.uuid4().hex}"
        self.history = []

    def chat(self, message):
        response = requests.post(
            f"{BASE_URL}/api/chat",
            json={
                "message": message,
                "session_id": self.session_id,
                "user_id": "cli-user",
                "history": self.history
            }
        )

        if response.status_code == 200:
            data = response.json()
            self.history = data['history']
            return data['message']
        else:
            return f"Error: {response.status_code} - {response.text}"

    def run(self):
        print("AI Assistant CLI (Type 'exit' to quit)")
        print("=" * 50)

        while True:
            user_input = input("\nYou: ").strip()
            if user_input.lower() in ['exit', 'quit']:
                break

            print("Assistant: ", end="", flush=True)
            response = self.chat(user_input)
            print(response)

if __name__ == "__main__":
    cli = ChatCLI()
    cli.run()
```

---

## Response Times

- **Typical Response:** 2-10 seconds
- **First Request:** May take longer due to model loading (10-30 seconds)
- **Subsequent Requests:** Faster due to cached models (2-5 seconds)

**Factors Affecting Response Time:**

- Message length
- Model loading (first request)
- GPU availability
- Concurrent requests

---

## Troubleshooting

### Common Issues

#### 404 Not Found

**Problem:** Getting 404 when accessing the API

**Solutions:**

1. **Verify the Space is running:**
   - Check the Hugging Face Space page to ensure it's built and running
   - Wait for the initial build to complete (5-10 minutes)

2. **Check URL format:**
   - ✅ Correct: `https://jatinautonomouslabs-research-ai-assistant-api.hf.space`
   - ❌ Wrong: `https://jatinautonomouslabs-research_ai_assistant_api.hf.space` (underscores)
   - ✅ Alternative: `https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API`

3. **Verify endpoint paths:**
   - Health: `GET /api/health`
   - Chat: `POST /api/chat`
   - Root: `GET /`

4. **Test with root endpoint first:**

   ```bash
   curl https://jatinautonomouslabs-research-ai-assistant-api.hf.space/
   ```

#### 503 Service Unavailable

**Problem:** Orchestrator not ready

**Solutions:**

1. Wait 30-60 seconds for initialization
2. Check the `/api/health` endpoint
3. Use `/api/initialize` to manually trigger initialization

#### CORS Errors

**Problem:** CORS errors in browser

**Solutions:**

- The API has CORS enabled for all origins
- If issues persist, check the browser console for specific errors
- Ensure you're using the correct base URL

### Testing API Connectivity

**Quick Health Check:**

```bash
# Test root endpoint
curl https://jatinautonomouslabs-research-ai-assistant-api.hf.space/

# Test health endpoint
curl https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/health
```

**Python Test Script:**

```python
import requests

BASE_URL = "https://jatinautonomouslabs-research-ai-assistant-api.hf.space"

# Test root
try:
    response = requests.get(f"{BASE_URL}/", timeout=10)
    print(f"Root endpoint: {response.status_code} - {response.json()}")
except Exception as e:
    print(f"Root endpoint failed: {e}")

# Test health
try:
    response = requests.get(f"{BASE_URL}/api/health", timeout=10)
    print(f"Health endpoint: {response.status_code} - {response.json()}")
except Exception as e:
    print(f"Health endpoint failed: {e}")
```

## Support

For issues, questions, or contributions:

- **Repository:** [GitHub Repository URL]
- **Hugging Face Space:** [https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API](https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API)

---

## Changelog

### Version 1.0 (Current)

- Initial API release
- Chat endpoint with context management
- Health check endpoint
- Local GPU model inference
- CORS enabled for web integration

---

## License

This API is provided as-is. Please refer to the main project README for license information.