---
title: AI Research Paper Explainer
emoji: 📄
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 8080
pinned: false
---

AI Research Paper Explainer

A Decoding Data Science build, learner edition. An AI agent that reads a research paper (PDF), explains its concepts in plain language, and draws flowcharts to make them easier to understand. Built with Google's Agent Development Kit (ADK) and Gemini.

This guide is written so a learner can take the project from running on their own laptop (VS Code) all the way to a live, shareable app on Hugging Face Spaces, for free.


What it does

  • PDF Analysis: upload and analyze a research paper in PDF format
  • Concept Explanation: clear, accessible explanations of complex ideas
  • Visual Learning: automatic flowcharts to illustrate processes and relationships
  • Live Research Context: pulls in related work and follow-up directions when useful
  • Interactive Q&A: ask follow-up questions in the same conversation

How it works (the two halves)

The app is made of two parts that now run together as one service:

  1. The brain (backend): a FastAPI app running a Google ADK agent. It reads the PDF, talks to Gemini, and generates flowcharts.
  2. The face (frontend): a simple HTML chat page in the public/ folder where you upload the PDF and ask questions.

When deployed (locally or on Hugging Face), a single server serves both the page and the AI, so there is nothing to wire together between separate services.
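
To make the "one server, two jobs" idea concrete, here is a minimal sketch. The project itself uses FastAPI; this stand-in uses only the Python standard library so it runs without extra packages. The /api/explain path comes from this README, while the POST method, the placeholder page, the stub reply, and the names OneServiceHandler and make_server are assumptions made for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Placeholder page; the real project serves public/index.html instead.
PAGE = b"<!doctype html><title>Explainer</title><p>chat page placeholder</p>"


class OneServiceHandler(BaseHTTPRequestHandler):
    """Serves the page on GET / and a JSON reply on POST /api/explain."""

    def _send(self, status: int, content_type: str, body: bytes) -> None:
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self) -> None:
        if self.path == "/":
            self._send(200, "text/html; charset=utf-8", PAGE)
        else:
            self._send(404, "text/plain; charset=utf-8", b"not found")

    def do_POST(self) -> None:
        if self.path != "/api/explain":
            self._send(404, "text/plain; charset=utf-8", b"not found")
            return
        length = int(self.headers.get("Content-Length", 0))
        question = self.rfile.read(length).decode("utf-8")
        # Stub reply (assumption): the real backend would call the agent here.
        reply = json.dumps({"answer": f"(stub) you asked: {question}"})
        self._send(200, "application/json", reply.encode("utf-8"))

    def log_message(self, format: str, *args) -> None:
        pass  # keep the sketch quiet


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    """Create one server that answers both the page and the API."""
    return ThreadingHTTPServer(("127.0.0.1", port), OneServiceHandler)
```

Calling make_server().serve_forever() would start it on port 8080. Because the page and the API share one origin, the page can call the API with a relative path and nothing needs to be wired between services.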


Part 1: Run it locally (on your laptop)

Prerequisites

Note: This project uses only the Google API key (the Gemini Developer API). It does not use Google Cloud, Vertex AI, or any paid Google Cloud service.

Step 1: Get the code

git clone https://github.com/<your-username>/research-paper-explainer.git
cd research-paper-explainer

Step 2: Install the dependencies

pip install -r requirements.txt

You also need Graphviz installed on your machine for flowcharts. On Mac: brew install graphviz. On Ubuntu/Debian: sudo apt-get install graphviz. (On Hugging Face this is installed automatically; see Part 2.)
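
If you are not sure whether Graphviz is installed, a quick check like the one below can help. It is a sketch that looks for dot, the Graphviz command-line program, on your PATH. The function name graphviz_available is made up for this example and is not part of the project.

```python
import shutil


def graphviz_available(executable: str = "dot") -> bool:
    """Return True if the given Graphviz program is found on the PATH."""
    return shutil.which(executable) is not None


if graphviz_available():
    print("Graphviz found: flowcharts should be able to render.")
else:
    print("Graphviz not found: install it before running the app.")
```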

Step 3: Add your API key

Create a .env file in the project root (you can copy the template):

cp env.example .env

Then open .env and set just one line:

GOOGLE_API_KEY=your-api-key-here
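
To confirm the key is in place before you run the app, you could use a small check like this sketch. It reads simple KEY=value lines from .env with the standard library only. How the app itself loads .env is not shown in this guide, so read_env_file and api_key_is_set are hypothetical helpers, not project code.

```python
import os
from pathlib import Path


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def api_key_is_set(path: str = ".env") -> bool:
    """True if GOOGLE_API_KEY is set to something other than the placeholder."""
    key = os.environ.get("GOOGLE_API_KEY") or read_env_file(path).get("GOOGLE_API_KEY", "")
    return bool(key) and key != "your-api-key-here"
```

Running api_key_is_set() from the project root returns False if the file is missing, the variable name is misspelled, or the template placeholder was left unchanged.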

Step 4: Run it

python main.py

Now open http://localhost:8080 in your browser. The chat page loads and the AI works, all from one command, because the app serves the page and the API together.

Upload a PDF, type a question like "Explain the main method in this paper," and you should get an explanation with a flowchart.


Part 2: Put it on Hugging Face Spaces (free)

Once it works locally, here is exactly how to take that same project and host it for free on Hugging Face, so anyone can use it from a public link.

Why Docker?

Think of Docker as a sealed lunchbox: it packs the code, the language, and the tools (like Graphviz) into one box that runs the same way on any computer. This repo already includes a Dockerfile (the packing recipe), so we use a Docker Space. Hugging Face opens the lunchbox and runs your app.

What had to change to go from "two services" to "one Space"

The original project was designed to run the page and the brain in two separate places. A Space is one container, so the project was adjusted as follows (these changes are already in this repo):

| Change | File | Why |
| --- | --- | --- |
| Backend now also serves the web page | main.py | One container serves both the page and the API |
| Page calls /api/explain (relative), not localhost | public/index.html | So it works on any server, not just your laptop |
| Stop excluding the public/ folder from the image | .dockerignore | The web page must be packed inside the container |
| Added the Space config header | README.md | Tells Hugging Face: sdk: docker, app_port: 8080 |
| Confirmed the agent is fully defined (root_agent) | research_explainer/agent.py | The app crashes on startup if the agent code is missing |
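
The relative-path change listed above is worth a closer look. A browser resolves /api/explain against whatever address the page was loaded from, and Python's urljoin follows the same rule, so it can illustrate the effect. The hf.space hostname below is a made-up example, not the real Space URL.

```python
from urllib.parse import urljoin

API_PATH = "/api/explain"  # the relative path the page calls


def resolve_api_url(page_url: str) -> str:
    """Resolve the relative API path against the address the page came from."""
    return urljoin(page_url, API_PATH)


# Same page code, two different hosts:
print(resolve_api_url("http://localhost:8080/"))
# prints http://localhost:8080/api/explain
print(resolve_api_url("https://example-space.hf.space/"))  # hypothetical host
# prints https://example-space.hf.space/api/explain
```

A hard-coded http://localhost:8080/api/explain would only ever reach a server on the visitor's own machine, which is why it fails once the app is hosted elsewhere.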

Tip for learners: if you ever see ImportError: cannot import name 'root_agent', it means the agent object in research_explainer/agent.py is missing or renamed. Make sure the file ends with a root_agent = Agent(...) definition.
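
One way to check for this before deploying, without importing the agent code or needing an API key, is to parse the file and look for a top-level root_agent assignment. This is a sketch using Python's ast module; defines_root_agent is a hypothetical helper, not part of the project.

```python
import ast


def defines_root_agent(source: str) -> bool:
    """Return True if the source has a top-level assignment to root_agent."""
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.Assign):
            targets = node.targets
        elif isinstance(node, ast.AnnAssign) and node.value is not None:
            targets = [node.target]
        else:
            continue
        for target in targets:
            if isinstance(target, ast.Name) and target.id == "root_agent":
                return True
    return False
```

For example, defines_root_agent(open("research_explainer/agent.py", encoding="utf-8").read()) should return True for a complete agent file. It only checks that the name is assigned, not that the agent is configured correctly.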

Step-by-step deployment

  1. Create a Space

  2. Upload the project files

    • Open the Files tab → Add file → Upload files
    • Upload everything so that README.md, Dockerfile, and main.py sit at the top level (the root) of the Space, not inside a subfolder, and include the public/ and research_explainer/ folders.
    • Keep the auto-created .gitattributes (leave it untouched). Let your README.md overwrite the default one, because yours carries the Space settings.
    • Add a commit message and Commit changes.
  3. Add your API key as a Secret (never put the key in a file)

    • Go to the Space's Settings tab → Variables and secrets → New secret
    • Name: GOOGLE_API_KEY (exact spelling) · Value: your key → Save
  4. Let it build

    • The Space rebuilds automatically. Watch the Logs.
    • First build takes ~3–5 minutes (it installs Graphviz and the Python packages).
    • When the status turns Running, your app is live on a public …hf.space URL.
  5. Test it

    • Open the App, upload a PDF, ask a question, and confirm you get an explanation with a flowchart.
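
Before uploading in step 2, you can check that your local folder has the layout the Space expects. The sketch below tests only for the files and folders named in that step; missing_items is a hypothetical helper, and the real project may need other files too (requirements.txt, for instance).

```python
from pathlib import Path

# Items that step 2 says must sit at the root of the Space.
REQUIRED_FILES = ["README.md", "Dockerfile", "main.py"]
REQUIRED_DIRS = ["public", "research_explainer"]


def missing_items(root: str = ".") -> list[str]:
    """List required files and folders that are not directly under root."""
    base = Path(root)
    missing = [name for name in REQUIRED_FILES if not (base / name).is_file()]
    missing += [name + "/" for name in REQUIRED_DIRS if not (base / name).is_dir()]
    return missing
```

Run missing_items() from the project root. An empty list means the top-level layout matches; anything listed is either absent or sitting inside a subfolder.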

Common issues (and fixes)

  • 403 (cpu-basic quota limit) → On the free plan only a limited number of your Spaces can run at once. Open your other Spaces marked "Running", pause one you don't need (Settings → Pause), then Restart this Space. (Spaces marked "Sleeping" don't count.)
  • Runtime crash on startup / ImportError: root_agent → Make sure research_explainer/agent.py actually defines the agent (root_agent = Agent(...)).
  • App loads but answers fail / "Failed to fetch" → Check the secret is named exactly GOOGLE_API_KEY.

Response structure (what the agent returns)

Each explanation generally follows this shape:

  1. Brief Overview: what the concept is and why it matters
  2. Detailed Explanation: step-by-step breakdown with technical details
  3. Paper Context: how this fits into the broader research
  4. Visual Aid / Research Context: a flowchart or related work, placed where most useful
  5. Key Takeaways: a short summary

Tools the agent uses

  • Flowchart Generator (generate_flowchart): builds clean flowcharts (via Graphviz) to show processes, architectures, and relationships.
  • Research Context (find_research_context): finds related papers and follow-up directions and ties them back to the uploaded paper.
  • Diagram Generator (generate_diagram): optional image-based diagrams via Gemini image generation. Left off by default in this build; enable it in research_explainer/agent.py by un-commenting it in the agent's tools list.
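
To give a feel for what "programmatic" flowchart generation with Graphviz involves, here is a minimal sketch that turns a list of steps into DOT, the plain-text graph language Graphviz reads. It is not the project's generate_flowchart tool, whose code is not shown here; steps_to_dot and the sample steps are made up for this example.

```python
def steps_to_dot(steps: list[str], name: str = "flowchart") -> str:
    """Build DOT source for a top-to-bottom chain of boxes, one per step."""
    lines = [f"digraph {name} {{", "  rankdir=TB;", "  node [shape=box];"]
    for index, label in enumerate(steps):
        # Escape backslashes and quotes so the label stays valid DOT.
        escaped = label.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'  n{index} [label="{escaped}"];')
    for index in range(len(steps) - 1):
        lines.append(f"  n{index} -> n{index + 1};")
    lines.append("}")
    return "\n".join(lines)


print(steps_to_dot(["Read PDF", "Extract method", "Explain result"]))
```

If you save the printed text as flow.dot, the Graphviz command dot -Tpng flow.dot -o flow.png renders it as an image.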

Technical details

  • Model: Gemini 2.5 Flash (text generation and analysis)
  • Flowcharts: Graphviz (programmatic generation)
  • Backend: FastAPI + Google ADK, in-memory sessions (runs as a single instance, which is exactly what one Space provides)
  • Auth: Google AI Studio API key (GOOGLE_API_KEY); no Vertex AI, no Google Cloud

Sample questions to try

  • "Explain the machine learning method described in this paper."
  • "How does the proposed approach work, step by step?"
  • "What is the architecture of the system described?"
  • "What are the key contributions of this research?"

Extending it

This agent is easy to build on. You can:

  • Add new tools (e.g. different kinds of visualizations)
  • Adjust the prompt in research_explainer/agent.py to specialize in a domain
  • Enable the image-diagram tool
  • Improve the PDF handling or support more file types

Credits & ownership

Maintained by Mohammad Arshad · Decoding Data Science © 2026 Decoding Data Science. Built for the DDS community and learners.

Original "Research Paper Explainer" concept built with Google's Agent Development Kit (ADK).