Flask API Documentation
Overview
The Research AI Assistant API provides a RESTful interface for interacting with an AI-powered research assistant. The API uses local GPU models for inference and supports conversational interactions with context management.
Base URL: https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API
API Version: 1.0
Content-Type: application/json
Features
- 🤖 AI-Powered Responses - Local GPU model inference (Tesla T4)
- 💬 Conversational Context - Maintains conversation history and user context
- 🔒 CORS Enabled - Ready for web integration
- ⚡ Async Processing - Efficient request handling
- 📊 Transparent Reasoning - Returns reasoning chains and performance metrics
Authentication
Currently, the API does not require authentication. However, for production use, you should:
- Set the `HF_TOKEN` environment variable for Hugging Face model access
- Implement API key authentication if needed (see the sketch below)
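If you do add API key checks, a Flask before-request hook is one lightweight pattern. The sketch below is illustrative only: the `X-API-Key` header and `API_KEY` environment variable are hypothetical names, not part of the current API.

```python
import os
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.before_request
def require_api_key():
    # Hypothetical scheme: reject requests whose X-API-Key header does not
    # match the API_KEY env var. Both names are illustrative, not part of this API.
    expected = os.environ.get("API_KEY")
    if expected and request.headers.get("X-API-Key") != expected:
        return jsonify({"success": False, "error": "Invalid API key"}), 401
```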
Endpoints
1. Get API Information
Endpoint: GET /
Description: Returns API information, version, and available endpoints.
Request:
```http
GET / HTTP/1.1
Host: huggingface.co
```
Response:
```json
{
  "name": "AI Assistant Flask API",
  "version": "1.0",
  "status": "running",
  "orchestrator_ready": true,
  "features": {
    "local_gpu_models": true,
    "max_workers": 4,
    "hardware": "NVIDIA T4 Medium"
  },
  "endpoints": {
    "health": "GET /api/health",
    "chat": "POST /api/chat",
    "initialize": "POST /api/initialize"
  }
}
```
Status Codes:
- `200 OK` - Success
2. Health Check
Endpoint: GET /api/health
Description: Checks if the API and orchestrator are ready to handle requests.
Request:
```http
GET /api/health HTTP/1.1
Host: huggingface.co
```
Response:
```json
{
  "status": "healthy",
  "orchestrator_ready": true
}
```
Status Codes:
- `200 OK` - API is healthy
  - `orchestrator_ready: true` - Ready to process requests
  - `orchestrator_ready: false` - Still initializing
Example Response (Initializing):
```json
{
  "status": "initializing",
  "orchestrator_ready": false
}
```
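Because the orchestrator loads models in the background on startup, a client can poll this endpoint until it reports ready. A minimal sketch (the timeout and interval values are arbitrary choices):

```python
import time

import requests

BASE_URL = "https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API"

def wait_until_ready(timeout=60, interval=2):
    """Poll /api/health until orchestrator_ready is true, or the timeout expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            health = requests.get(f"{BASE_URL}/api/health", timeout=5).json()
            if health.get("orchestrator_ready"):
                return True
        except requests.RequestException:
            pass  # transient network error; keep polling
        time.sleep(interval)
    return False
```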
3. Chat Endpoint
Endpoint: POST /api/chat
Description: Send a message to the AI assistant and receive a response with reasoning and context.
Request Headers:
```http
Content-Type: application/json
```
Request Body:
```json
{
  "message": "Explain quantum entanglement in simple terms",
  "history": [
    ["User message 1", "Assistant response 1"],
    ["User message 2", "Assistant response 2"]
  ],
  "session_id": "session-123",
  "user_id": "user-456"
}
```
Request Fields:
| Field | Type | Required | Description |
|---|---|---|---|
| `message` | string | ✅ Yes | User's message/question (max 10,000 characters) |
| `history` | array | ❌ No | Conversation history as an array of `[user, assistant]` pairs |
| `session_id` | string | ❌ No | Unique session identifier for context continuity |
| `user_id` | string | ❌ No | User identifier (defaults to `"anonymous"`) |
Response (Success):
```json
{
  "success": true,
  "message": "Quantum entanglement is when two particles become linked...",
  "history": [
    ["Explain quantum entanglement", "Quantum entanglement is when two particles become linked..."]
  ],
  "reasoning": {
    "intent": "educational_query",
    "steps": ["Understanding request", "Gathering information", "Synthesizing response"],
    "confidence": 0.95
  },
  "performance": {
    "response_time_ms": 2345,
    "tokens_generated": 156,
    "model_used": "mistralai/Mistral-7B-Instruct-v0.2"
  }
}
```
Response Fields:
| Field | Type | Description |
|---|---|---|
| `success` | boolean | Whether the request was successful |
| `message` | string | AI assistant's response |
| `history` | array | Updated conversation history including the new exchange |
| `reasoning` | object | AI reasoning process and confidence metrics |
| `performance` | object | Performance metrics (response time, tokens, model used) |
Status Codes:
- `200 OK` - Request processed successfully
- `400 Bad Request` - Invalid request (missing message, empty message, too long, or wrong type)
- `500 Internal Server Error` - Server error while processing the request
- `503 Service Unavailable` - Orchestrator not ready (still initializing)
Error Response:
```json
{
  "success": false,
  "error": "Message is required",
  "message": "Error processing your request. Please try again."
}
```
4. Initialize Orchestrator
Endpoint: POST /api/initialize
Description: Manually trigger orchestrator initialization (useful if initialization failed on startup).
Request:
```http
POST /api/initialize HTTP/1.1
Host: huggingface.co
Content-Type: application/json
```
Request Body:
```json
{}
```
Response (Success):
```json
{
  "success": true,
  "message": "Orchestrator initialized successfully"
}
```
Response (Failure):
```json
{
  "success": false,
  "message": "Initialization failed. Check logs for details."
}
```
Status Codes:
- `200 OK` - Initialization successful
- `500 Internal Server Error` - Initialization failed
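The Code Examples section below does not cover this endpoint, so here is a minimal Python call for it:

```python
import requests

BASE_URL = "https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API"

# Manually trigger orchestrator initialization, e.g. after a failed startup.
response = requests.post(f"{BASE_URL}/api/initialize", json={})
print(response.status_code, response.json())
```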
Code Examples
Python
```python
import requests

BASE_URL = "https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API"

# Check health
def check_health():
    response = requests.get(f"{BASE_URL}/api/health")
    return response.json()

# Send chat message
def send_message(message, session_id=None, user_id=None, history=None):
    payload = {
        "message": message,
        "session_id": session_id,
        "user_id": user_id or "anonymous",
        "history": history or []
    }
    response = requests.post(
        f"{BASE_URL}/api/chat",
        json=payload,
        headers={"Content-Type": "application/json"}
    )
    if response.status_code == 200:
        return response.json()
    raise Exception(f"API Error: {response.status_code} - {response.text}")

# Example usage
if __name__ == "__main__":
    # Check if API is ready
    health = check_health()
    print(f"API Status: {health}")

    if health.get("orchestrator_ready"):
        # Send a message
        result = send_message(
            message="What is machine learning?",
            session_id="my-session-123",
            user_id="user-456"
        )
        print(f"Response: {result['message']}")
        print(f"Reasoning: {result.get('reasoning', {})}")

        # Continue conversation
        history = result['history']
        result2 = send_message(
            message="Can you explain neural networks?",
            session_id="my-session-123",
            user_id="user-456",
            history=history
        )
        print(f"Follow-up Response: {result2['message']}")
```
JavaScript (Fetch API)
```javascript
const BASE_URL = 'https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API';

// Check health
async function checkHealth() {
  const response = await fetch(`${BASE_URL}/api/health`);
  return await response.json();
}

// Send chat message
async function sendMessage(message, sessionId = null, userId = null, history = []) {
  const payload = {
    message: message,
    session_id: sessionId,
    user_id: userId || 'anonymous',
    history: history
  };

  const response = await fetch(`${BASE_URL}/api/chat`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(payload)
  });

  if (!response.ok) {
    const error = await response.json();
    throw new Error(`API Error: ${response.status} - ${error.error || error.message}`);
  }

  return await response.json();
}

// Example usage
async function main() {
  try {
    // Check if API is ready
    const health = await checkHealth();
    console.log('API Status:', health);

    if (health.orchestrator_ready) {
      // Send a message
      const result = await sendMessage(
        'What is machine learning?',
        'my-session-123',
        'user-456'
      );
      console.log('Response:', result.message);
      console.log('Reasoning:', result.reasoning);

      // Continue conversation
      const result2 = await sendMessage(
        'Can you explain neural networks?',
        'my-session-123',
        'user-456',
        result.history
      );
      console.log('Follow-up Response:', result2.message);
    }
  } catch (error) {
    console.error('Error:', error);
  }
}

main();
```
cURL
```bash
# Check health
curl -X GET "https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API/api/health"

# Send chat message
curl -X POST "https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API/api/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What is machine learning?",
    "session_id": "my-session-123",
    "user_id": "user-456",
    "history": []
  }'

# Continue conversation
curl -X POST "https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API/api/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Can you explain neural networks?",
    "session_id": "my-session-123",
    "user_id": "user-456",
    "history": [
      ["What is machine learning?", "Machine learning is a subset of artificial intelligence..."]
    ]
  }'
```
Node.js (Axios)
```javascript
const axios = require('axios');

const BASE_URL = 'https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API';

// Check health
async function checkHealth() {
  const response = await axios.get(`${BASE_URL}/api/health`);
  return response.data;
}

// Send chat message
async function sendMessage(message, sessionId = null, userId = null, history = []) {
  try {
    const response = await axios.post(`${BASE_URL}/api/chat`, {
      message: message,
      session_id: sessionId,
      user_id: userId || 'anonymous',
      history: history
    }, {
      headers: {
        'Content-Type': 'application/json'
      }
    });
    return response.data;
  } catch (error) {
    if (error.response) {
      throw new Error(`API Error: ${error.response.status} - ${error.response.data.error || error.response.data.message}`);
    }
    throw error;
  }
}

// Example usage
(async () => {
  try {
    const health = await checkHealth();
    console.log('API Status:', health);

    if (health.orchestrator_ready) {
      const result = await sendMessage(
        'What is machine learning?',
        'my-session-123',
        'user-456'
      );
      console.log('Response:', result.message);
    }
  } catch (error) {
    console.error('Error:', error.message);
  }
})();
```
Error Handling
Common Error Responses
400 Bad Request
Missing Message:
```json
{
  "success": false,
  "error": "Message is required"
}
```
Empty Message:
```json
{
  "success": false,
  "error": "Message cannot be empty"
}
```
Message Too Long:
```json
{
  "success": false,
  "error": "Message too long. Maximum length is 10000 characters"
}
```
Invalid Type:
```json
{
  "success": false,
  "error": "Message must be a string"
}
```
503 Service Unavailable
Orchestrator Not Ready:
```json
{
  "success": false,
  "error": "Orchestrator not ready",
  "message": "AI system is initializing. Please try again in a moment."
}
```
Solution: Wait a few seconds and retry, or check the `/api/health` endpoint.
500 Internal Server Error
Generic Error:
```json
{
  "success": false,
  "error": "Error message here",
  "message": "Error processing your request. Please try again."
}
```
Best Practices
1. Session Management
- Use consistent session IDs for maintaining conversation context
- Generate unique session IDs per user conversation thread
- Include conversation history in subsequent requests for better context
```python
# Good: Maintains context
session_id = "user-123-session-1"
history = []

# First message
result1 = send_message("What is AI?", session_id=session_id, history=history)
history = result1['history']

# Follow-up message (includes context)
result2 = send_message("Can you explain more?", session_id=session_id, history=history)
```
2. Error Handling
Always implement retry logic for 503 errors:
```python
import time

def send_message_with_retry(message, max_retries=3, retry_delay=2):
    for attempt in range(max_retries):
        try:
            return send_message(message)
        except Exception as e:
            if "503" in str(e) and attempt < max_retries - 1:
                time.sleep(retry_delay)
                continue
            raise
```
3. Health Checks
Check API health before sending requests:
```python
def is_api_ready():
    try:
        health = check_health()
        return health.get("orchestrator_ready", False)
    except Exception:
        return False

if is_api_ready():
    # Send request
    result = send_message("Hello")
else:
    print("API is not ready yet")
```
4. Rate Limiting
- No explicit rate limits are currently enforced
- Recommended: Implement client-side rate limiting (e.g., 1 request per second; see the sketch below)
- Consider: Implementing request queuing for high-volume applications
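A minimal sketch of client-side pacing, reusing the `send_message` helper from the Python example above (the one-second interval reflects the recommendation here, not a server-enforced limit):

```python
import time

class RateLimitedClient:
    """Spaces out chat requests by at least min_interval seconds."""

    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval
        self._last_call = 0.0

    def send(self, message, **kwargs):
        # Sleep just long enough to keep at least min_interval between calls.
        wait = self.min_interval - (time.time() - self._last_call)
        if wait > 0:
            time.sleep(wait)
        self._last_call = time.time()
        return send_message(message, **kwargs)  # helper from the Python example above
```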
5. Message Length
- Maximum: 10,000 characters per message
- Recommended: Keep messages concise for faster processing
- For long content: Split into multiple messages or summarize (see the sketch below)
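One way to stay under the limit is to split long content on paragraph boundaries before sending. A sketch (the 9,000-character budget is an arbitrary margin below the 10,000 limit):

```python
def split_message(text, max_chars=9000):
    """Split text into chunks under max_chars, preferring paragraph breaks.

    Oversized single paragraphs are hard-cut at the limit.
    """
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = paragraph[:max_chars]
    if current:
        chunks.append(current)
    return chunks
```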
6. Context Management
- Include history in requests to maintain conversation context
- Session IDs help track conversations across multiple requests
- User IDs enable personalization and user-specific context
Integration Examples
React Component
```jsx
import React, { useState } from 'react';

const AIAssistant = () => {
  const [message, setMessage] = useState('');
  const [history, setHistory] = useState([]);
  const [loading, setLoading] = useState(false);
  const [sessionId] = useState(`session-${Date.now()}`);

  const sendMessage = async () => {
    if (!message.trim()) return;

    setLoading(true);
    try {
      const response = await fetch('https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          message: message,
          session_id: sessionId,
          user_id: 'user-123',
          history: history
        })
      });
      const data = await response.json();
      if (data.success) {
        setHistory(data.history);
        setMessage('');
      }
    } catch (error) {
      console.error('Error:', error);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div>
      <div className="chat-history">
        {history.map(([user, assistant], idx) => (
          <div key={idx}>
            <div><strong>You:</strong> {user}</div>
            <div><strong>Assistant:</strong> {assistant}</div>
          </div>
        ))}
      </div>
      <input
        value={message}
        onChange={(e) => setMessage(e.target.value)}
        onKeyDown={(e) => e.key === 'Enter' && sendMessage()}
        disabled={loading}
      />
      <button onClick={sendMessage} disabled={loading}>
        {loading ? 'Sending...' : 'Send'}
      </button>
    </div>
  );
};

export default AIAssistant;
```
Python CLI Tool
```python
#!/usr/bin/env python3
import requests

BASE_URL = "https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API"

class ChatCLI:
    def __init__(self):
        self.session_id = f"cli-session-{hash(__file__)}"
        self.history = []

    def chat(self, message):
        response = requests.post(
            f"{BASE_URL}/api/chat",
            json={
                "message": message,
                "session_id": self.session_id,
                "user_id": "cli-user",
                "history": self.history
            }
        )
        if response.status_code == 200:
            data = response.json()
            self.history = data['history']
            return data['message']
        return f"Error: {response.status_code} - {response.text}"

    def run(self):
        print("AI Assistant CLI (Type 'exit' to quit)")
        print("=" * 50)
        while True:
            user_input = input("\nYou: ").strip()
            if user_input.lower() in ['exit', 'quit']:
                break
            print("Assistant: ", end="", flush=True)
            response = self.chat(user_input)
            print(response)

if __name__ == "__main__":
    cli = ChatCLI()
    cli.run()
```
Response Times
- Typical Response: 2-10 seconds
- First Request: May take longer due to model loading (10-30 seconds)
- Subsequent Requests: Faster due to cached models (2-5 seconds)
Factors Affecting Response Time:
- Message length
- Model loading (first request)
- GPU availability
- Concurrent requests
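To see these factors in practice, you can time requests client-side and compare against the server-reported `performance.response_time_ms` field; the difference is mostly network and queueing overhead. A minimal sketch:

```python
import time

import requests

BASE_URL = "https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API"

# Measure end-to-end latency for a single chat request.
start = time.perf_counter()
result = requests.post(f"{BASE_URL}/api/chat", json={"message": "Hello"}).json()
wall_ms = (time.perf_counter() - start) * 1000

print(f"Wall-clock: {wall_ms:.0f} ms")
print(f"Server-side: {result.get('performance', {}).get('response_time_ms')} ms")
```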
Support
For issues, questions, or contributions:
- Repository: [GitHub Repository URL]
- Hugging Face Space: https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API
Changelog
Version 1.0 (Current)
- Initial API release
- Chat endpoint with context management
- Health check endpoint
- Local GPU model inference
- CORS enabled for web integration
License
This API is provided as-is. Please refer to the main project README for license information.