---
title: AI Research Paper Explainer
emoji: 📄
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 8080
pinned: false
---

# AI Research Paper Explainer

> A **Decoding Data Science** build, learner edition.
>
> An AI agent that reads a research paper (PDF), explains the concepts in plain language, and draws flowcharts to make them easier to understand.

Built with Google's **Agent Development Kit (ADK)** and **Gemini**.

This guide is written so a learner can take the project from **running on their own laptop (VS Code)** all the way to a **live, shareable app on Hugging Face Spaces, for free**.

---

## What it does

- **PDF Analysis**: upload and analyze a research paper in PDF format
- **Concept Explanation**: clear, accessible explanations of complex ideas
- **Visual Learning**: automatic flowcharts to illustrate processes and relationships
- **Live Research Context**: pulls in related work and follow-up directions when useful
- **Interactive Q&A**: ask follow-up questions in the same conversation

---

## How it works (the two halves)

The app is made of two parts that now run together as **one service**:

1. **The brain (backend)**: a FastAPI app running a Google ADK agent. It reads the PDF, talks to Gemini, and generates flowcharts.
2. **The face (frontend)**: a simple HTML chat page in the `public/` folder where you upload the PDF and ask questions.

When deployed (locally or on Hugging Face), **a single server serves both the page and the AI**, so there is nothing to wire together between separate services.

---

## Part 1: Run it locally (on your laptop)

### Prerequisites

- **Python 3.10+**
- A **Google AI Studio API key** (free); get one from Google AI Studio

> Note: This project uses **only the Google API key** (the Gemini Developer API). It does **not** use Google Cloud, Vertex AI, or any paid Google Cloud service.

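
The two prerequisites above can be checked before you start. The script below is a minimal sketch and not part of the repo: `check_prerequisites` is a hypothetical helper that only looks at the Python version and at the process environment. It does not read a `.env` file, so the key check will report a problem until `GOOGLE_API_KEY` is exported in your shell.

```python
import os
import sys


def check_prerequisites(version_info=None, env=None):
    """Return a list of human-readable problems; an empty list means ready."""
    # Hypothetical helper for illustration; defaults to the running interpreter
    # and the current process environment.
    version_info = sys.version_info if version_info is None else version_info
    env = os.environ if env is None else env
    problems = []
    # The guide asks for Python 3.10 or newer.
    if tuple(version_info[:2]) < (3, 10):
        problems.append("Python 3.10+ is required")
    # The guide uses a single key named GOOGLE_API_KEY.
    if not env.get("GOOGLE_API_KEY", "").strip():
        problems.append("GOOGLE_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing:", problem)
```

The function takes the version and environment as parameters so it can be tried with sample values without changing your real setup.
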
### Step 1: Get the code

```bash
git clone https://github.com//research-paper-explainer.git
cd research-paper-explainer
```

### Step 2: Install the dependencies

```bash
pip install -r requirements.txt
```

> You also need **Graphviz** installed on your machine for flowcharts. On Mac: `brew install graphviz`. On Ubuntu/Debian: `sudo apt-get install graphviz`. (On Hugging Face this is installed automatically; see Part 2.)

### Step 3: Add your API key

Create a `.env` file in the project root (you can copy the template):

```bash
cp env.example .env
```

Then open `.env` and set just one line:

```
GOOGLE_API_KEY=your-api-key-here
```

### Step 4: Run it

```bash
python main.py
```

Now open the app's local address in your browser. The chat page loads **and** the AI works, all from one command, because the app serves the page and the API together.

Upload a PDF, type a question like *"Explain the main method in this paper,"* and you should get an explanation with a flowchart.

---

## Part 2: Put it on Hugging Face Spaces (free)

Once it works locally, here is exactly how to take that same project and host it for free on Hugging Face, so anyone can use it from a public link.

### Why Docker?

Think of **Docker as a sealed lunchbox**: it packs the code, the language, and the tools (like Graphviz) into one box that runs the same way on any computer. This repo already includes a `Dockerfile` (the packing recipe), so we use a **Docker Space**. Hugging Face opens the lunchbox and runs your app.

### What had to change to go from "two services" to "one Space"

The original project was designed to run the page and the brain in **two separate places**.
A Space is **one container**, so the project was adjusted as follows (these changes are already in this repo):

| Change | File | Why |
|---|---|---|
| Backend now also serves the web page | `main.py` | One container serves both the page and the API |
| Page calls `/api/explain` (relative), not `localhost` | `public/index.html` | So it works on any server, not just your laptop |
| Stop excluding the `public/` folder from the image | `.dockerignore` | The web page must be packed inside the container |
| Added the Space config header | `README.md` | Tells Hugging Face: `sdk: docker`, `app_port: 8080` |
| Confirmed the agent is fully defined (`root_agent`) | `research_explainer/agent.py` | The app crashes on startup if the agent code is missing |

> Tip for learners: if you ever see `ImportError: cannot import name 'root_agent'`, it means the agent object in `research_explainer/agent.py` is missing or renamed. Make sure the file ends with a `root_agent = Agent(...)` definition.

### Step-by-step deployment

1. **Create a Space**
   - Go to Hugging Face and start creating a new Space
   - **SDK:** Docker → template **Blank**
   - **Hardware:** **CPU Basic (Free)**
   - Click **Create Space**
2. **Upload the project files**
   - Open the **Files** tab → **Add file → Upload files**
   - Upload everything so that `README.md`, `Dockerfile`, and `main.py` sit at the **top level** (the root) of the Space, not inside a subfolder, and include the `public/` and `research_explainer/` folders.
   - Keep the auto-created `.gitattributes` (leave it untouched). Let your `README.md` **overwrite** the default one, since yours carries the Space settings.
   - Add a commit message and **Commit changes**.
3. **Add your API key as a Secret** (never upload the key in a file)
   - Go to the Space's **Settings** tab → **Variables and secrets** → **New secret**
   - **Name:** `GOOGLE_API_KEY` (exact spelling) · **Value:** your key → **Save**
4. **Let it build**
   - The Space rebuilds automatically. Watch the **Logs**.
   - First build takes ~3–5 minutes (it installs Graphviz and the Python packages).
   - When the status turns **Running**, your app is live on a public `…hf.space` URL.
5. **Test it**
   - Open the App, upload a PDF, ask a question, and confirm you get an explanation with a flowchart.

### Common issues (and fixes)

- **`403 - cpu-basic quota limit`** → On the free plan only a limited number of your Spaces can run at once. Open your other Spaces marked **"Running"**, **pause** one you don't need (Settings → Pause), then **Restart** this Space. (Spaces marked "Sleeping" don't count.)
- **Runtime crash on startup / `ImportError: root_agent`** → Make sure `research_explainer/agent.py` actually defines the agent (`root_agent = Agent(...)`).
- **App loads but answers fail / "Failed to fetch"** → Check the secret is named exactly `GOOGLE_API_KEY`.

---

## Response structure (what the agent returns)

Each explanation generally follows this shape:

1. **Brief Overview**: what the concept is and why it matters
2. **Detailed Explanation**: step-by-step breakdown with technical details
3. **Paper Context**: how this fits into the broader research
4. **Visual Aid / Research Context**: a flowchart or related work, placed where most useful
5. **Key Takeaways**: a short summary

---

## Tools the agent uses

- **Flowchart Generator (`generate_flowchart`)**: builds clean flowcharts (via Graphviz) to show processes, architectures, and relationships.
- **Research Context (`find_research_context`)**: finds related papers and follow-up directions and ties them back to the uploaded paper.
- **Diagram Generator (`generate_diagram`)**: *optional* image-based diagrams via Gemini image generation. Left **off by default** in this build; enable it in `research_explainer/agent.py` by un-commenting it in the agent's `tools` list.

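
To give a feel for what a Graphviz-based flowchart tool has to produce, here is a minimal sketch that turns an ordered list of steps into Graphviz DOT source. It is an illustration only and not the repo's `generate_flowchart`: the name `steps_to_dot`, its parameters, and the linear top-to-bottom layout are all assumptions. The returned text can be saved to a file and rendered with the Graphviz command line, for example `dot -Tpng flow.dot -o flow.png`.

```python
def steps_to_dot(steps, title="Flowchart"):
    """Build Graphviz DOT source for a linear, top-to-bottom flowchart."""
    # Hypothetical helper for illustration; not the project's actual tool.

    def quote(text):
        # Escape backslashes and double quotes so the text cannot end the
        # DOT string early.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    lines = [f"digraph {quote(title)} {{", "  rankdir=TB;", "  node [shape=box];"]
    # One box per step, named n0, n1, ... in the order given.
    for index, step in enumerate(steps):
        lines.append(f"  n{index} [label={quote(step)}];")
    # Connect each step to the one that follows it.
    for index in range(len(steps) - 1):
        lines.append(f"  n{index} -> n{index + 1};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(steps_to_dot(["Upload PDF", "Read paper", "Explain concept"]))
```
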

---

## Technical details

- **Model:** Gemini 2.5 Flash (text generation and analysis)
- **Flowcharts:** Graphviz (programmatic generation)
- **Backend:** FastAPI + Google ADK, in-memory sessions (runs as a single instance, which is exactly what one Space provides)
- **Auth:** Google AI Studio API key (`GOOGLE_API_KEY`); no Vertex AI, no Google Cloud

---

## Sample questions to try

- "Explain the machine learning method described in this paper."
- "How does the proposed approach work, step by step?"
- "What is the architecture of the system described?"
- "What are the key contributions of this research?"

---

## Extending it

This agent is easy to build on. You can:

- Add new tools (e.g. different kinds of visualizations)
- Adjust the prompt in `research_explainer/agent.py` to specialize in a domain
- Enable the image-diagram tool
- Improve the PDF handling or support more file types

---

## Credits & ownership

Maintained by **Mohammad Arshad** · **Decoding Data Science**

© 2026 Decoding Data Science. Built for the DDS community and learners.

Original "Research Paper Explainer" concept built with Google's Agent Development Kit (ADK).