
title: NexusGraph AI
emoji: 🧠
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false

🧠 NexusGraph AI

High Distinction Project: An advanced agentic Retrieval-Augmented Generation (RAG) system that uses graph-based orchestration (LangGraph), structured retrieval (SQL), and self-correction to answer complex queries.

🚀 The "Master's Level" Difference

Unlike basic RAG scripts that just "search and dump," this system acts like a Consulting Firm:

Supervisor Agent (Hybrid): Uses Gemini 2.5 Flash Lite (Fast) to decide which tool to use (PDF, Web, or SQL).
Responder Agent (Expert): Uses Gemini 3 Flash Preview (Smart) to synthesize the final answer.
Self-Correction: If the answer is bad, the agent rewrites the query and tries again.
Hybrid Retrieval: Combines Unstructured Data (PDFs) with Structured Data (SQL Database).
Audit System: Calculates Faithfulness and Relevancy scores post hoc (RAGAS-style).

🏛️ Architecture

```mermaid
graph TD
    User --> Supervisor
    Supervisor -->|Policy?| PDF[Librarian: Vectors]
    Supervisor -->|Stats?| SQL[Analyst: SQL DB]
    Supervisor -->|News?| Web[Journalist: Web Search]

    PDF & SQL & Web --> Verifier[Auditor Agent]
    Verifier --> Responder[Writer Agent]

    Responder -->|Good?| End
    Responder -->|Bad?| Supervisor
```
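
The flow in the diagram can be read as a small state machine: route, run a tool, check the draft, then finish or loop back. The sketch below is a plain-Python stand-in, not the project's LangGraph code. `route`, `answer`, `run_tool`, `is_good`, `rewrite`, and `MAX_RETRIES` are hypothetical names, and the keyword routing only imitates a decision that the Supervisor model would make.

```python
from typing import Callable, Dict

MAX_RETRIES = 2  # assumption: the README does not state a retry limit


def route(query: str) -> str:
    """Stand-in for the Supervisor: pick a tool from simple keywords."""
    q = query.lower()
    if any(word in q for word in ("average", "highest", "how many")):
        return "sql"   # quantitative question -> Analyst
    if any(word in q for word in ("news", "latest", "today")):
        return "web"   # current events -> Journalist
    return "pdf"       # default -> Librarian


def answer(query: str,
           run_tool: Callable[[str, str], str],
           is_good: Callable[[str], bool],
           rewrite: Callable[[str], str]) -> Dict[str, object]:
    """Route, run the tool, check the draft; loop back with a rewritten query if it fails."""
    draft, tool = "", ""
    for attempt in range(MAX_RETRIES + 1):
        tool = route(query)
        draft = run_tool(tool, query)
        if is_good(draft):
            return {"answer": draft, "tool": tool, "attempts": attempt + 1}
        query = rewrite(query)  # self-correction: try again with a new query
    # Retries exhausted: return the last draft rather than looping forever.
    return {"answer": draft, "tool": tool, "attempts": MAX_RETRIES + 1}
```

Capping the number of retries keeps a persistently bad answer from cycling between the Responder and the Supervisor indefinitely.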

✨ New Features

1. 📊 Data Analyst (SQL Tool)

The system can now answer quantitative questions like "Who pays the highest fees?" or "What is the average GPA?" by querying a local SQLite database.
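
As a rough illustration of the two example questions, the sketch below runs them against an in-memory SQLite database. The `students` table, its columns, and the sample rows are invented for this example. The project's real schema is not shown in this README.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, fees REAL, gpa REAL)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("Asha", 12000.0, 3.6), ("Ben", 15000.0, 3.2), ("Chen", 9000.0, 3.8)],
)


def highest_fee_payer(db: sqlite3.Connection):
    """Answer: who pays the highest fees?"""
    return db.execute(
        "SELECT name, fees FROM students ORDER BY fees DESC LIMIT 1"
    ).fetchone()


def average_gpa(db: sqlite3.Connection) -> float:
    """Answer: what is the average GPA?"""
    (avg,) = db.execute("SELECT AVG(gpa) FROM students").fetchone()
    return round(avg, 2)


print(highest_fee_payer(conn))  # ('Ben', 15000.0)
print(average_gpa(conn))        # 3.53
```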

2. 🛡️ Resilience (Circuit Breaker)

If the Google Gemini API quota is exceeded (HTTP 429), the system catches the error and returns a graceful "System Busy" message instead of failing with an HTTP 500 error.
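
A minimal sketch of that behaviour, assuming the quota error surfaces as an exception. `QuotaExceededError` and `safe_generate` are hypothetical names, and the 503 status is an assumption, since the README only says a "System Busy" message is returned.

```python
from typing import Callable, Dict


class QuotaExceededError(Exception):
    """Hypothetical stand-in for the client error raised on HTTP 429."""


def safe_generate(call_model: Callable[[str], str], prompt: str) -> Dict[str, object]:
    """Call the model, turning a quota error into a friendly response."""
    try:
        return {"status": 200, "answer": call_model(prompt)}
    except QuotaExceededError:
        # Degrade gracefully instead of letting the exception become a 500.
        # The 503 status code is an assumption made for this sketch.
        return {"status": 503, "answer": "System Busy: please try again shortly."}
```

Only the quota error is caught here, so unrelated bugs still surface as errors instead of being masked by the "System Busy" message.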

3. ⚖️ Hybrid Agent Architecture

Optimized for Speed and Intelligence:

Routing: Handled by lightweight gemini-2.5-flash-lite.
Reasoning: Handled by powerful gemini-3-flash-preview.
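
One way to keep this split in a single place is a small configuration object. This is a sketch only: `ModelConfig` and `model_for` are hypothetical names, and only the two model identifiers come from this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Model identifiers as named in this README; the class itself is hypothetical."""
    router: str = "gemini-2.5-flash-lite"      # fast, used for routing
    responder: str = "gemini-3-flash-preview"  # stronger, used for reasoning


def model_for(role: str, config: ModelConfig = ModelConfig()) -> str:
    """Return the model name for a role ("routing" or "reasoning")."""
    if role == "routing":
        return config.router
    if role == "reasoning":
        return config.responder
    raise ValueError(f"unknown role: {role!r}")
```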

4. 🚀 CI/CD Pipeline

Automated deployment from GitHub to Hugging Face using GitHub Actions. Commits to main are automatically verified and deployed to production.

5. 🧪 Automated Testing

Includes a tests/ suite:

test_api.py: Integration tests for endpoints.
test_rag.py: Unit tests for retrieval logic.
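
To give a flavour of what a retrieval unit test can look like, here is a self-contained sketch in pytest style. It is not taken from `test_rag.py`: `cosine` and `top_k` are hypothetical helpers that use brute-force cosine similarity in place of the FAISS index credited below.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float],
          doc_vecs: Sequence[Sequence[float]],
          k: int = 2) -> List[int]:
    """Return indices of the k documents most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]


def test_top_k_returns_closest_first():
    docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
    assert top_k([1.0, 0.1], docs, k=2) == [0, 2]


def test_cosine_handles_zero_vector():
    assert cosine([0.0, 0.0], [1.0, 0.0]) == 0.0
```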

6. 🐳 Dockerized

Fully containerized for "Run Anywhere" capability.

🛠️ How to Run

Option A: Local Python

Install: pip install -r requirements.txt
Environment: Create .env with GEMINI_API_KEY and TAVILY_API_KEY.
Run Service:
```
uvicorn main:app --reload
```
Run Evaluation Audit:
```
python run_evals.py
```
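
A missing key is an easy thing to trip over, so a quick check before starting the service may help. The sketch below only inspects environment variables that are already set; it does not read the .env file itself. `REQUIRED_KEYS` and `missing_keys` are hypothetical names.

```python
import os
from typing import List, Mapping

# The two keys named in the Environment step above.
REQUIRED_KEYS = ("GEMINI_API_KEY", "TAVILY_API_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required keys that are unset or empty in env."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
```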

Option B: Docker (Recommended)

Build:
```
docker-compose build
```
Run:
```
docker-compose up
```

Option C: Run Tests

```
pytest
```

📊 Evaluation (The Science)

We use an LLM-as-a-Judge approach (run_evals.py) to measure:

Faithfulness: Is the answer grounded in the retrieved context, or hallucinated?
Relevancy: Did we answer the prompt?
Current Benchmarks: ~0.92 Faithfulness / 0.89 Relevancy.
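
The aggregation side of such an evaluation can be sketched as below. This is not the contents of run_evals.py: `evaluate`, the `Judge` signature, and the sample field names are assumptions, and the judge is passed in as a callable so that a real LLM call or a test stub can be used.

```python
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical judge signature: (question, context, answer) -> per-metric scores in [0, 1].
Judge = Callable[[str, str, str], Dict[str, float]]


def evaluate(samples: List[Dict[str, str]], judge: Judge) -> Dict[str, float]:
    """Score every sample with the judge and average each metric."""
    scores = [judge(s["question"], s["context"], s["answer"]) for s in samples]
    return {
        "faithfulness": round(mean(x["faithfulness"] for x in scores), 2),
        "relevancy": round(mean(x["relevancy"] for x in scores), 2),
    }
```

In practice the judge would prompt a model to grade each answer. Keeping it injectable means the averaging logic can be tested without spending API quota.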

📜 Credits

Built by Vignesh Ladar Vidyananda. Powered by FastAPI, LangGraph, FAISS, and Google Gemini.