Commit 8bd78d1 by decodingdatascience · verified · 1 parent: af8d4e3

Upload 15 files

.dockerignore ADDED
@@ -0,0 +1,11 @@
+ .venv
+ __pycache__
+ **/__pycache__
+ *.pyc
+ *.pyo
+ .git
+ .env
+ .firebaserc
+ firebase.json
+ WORKSHOP.md
+ README.md
.gcloudignore ADDED
@@ -0,0 +1,11 @@
+ .gcloudignore
+ .git
+ .venv
+ __pycache__
+ **/__pycache__
+ *.pyc
+ *.pyo
+ .env
+ public/
+ WORKSHOP.md
+ README.md
.gitignore ADDED
@@ -0,0 +1,3 @@
+ *.pyc
+ .env
+ .vscode/settings.json
Dockerfile ADDED
@@ -0,0 +1,12 @@
+ FROM python:3.11-slim
+
+ RUN apt-get update && apt-get install -y --no-install-recommends graphviz && rm -rf /var/lib/apt/lists/*
+
+ WORKDIR /app
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ COPY . .
+
+ ENV PORT=8080
+ CMD ["python", "main.py"]
README.md ADDED
@@ -0,0 +1,202 @@
1
+ ---
2
+ title: AI Research Paper Explainer
3
+ emoji: 📄
4
+ colorFrom: indigo
5
+ colorTo: purple
6
+ sdk: docker
7
+ app_port: 8080
8
+ pinned: false
9
+ ---
10
+
11
+ # Research Paper Explainer Agent
12
+
13
+ A specialized ADK-based AI agent that analyzes research papers and explains them in detail with visual aids. Upload a PDF research paper, ask questions about specific concepts, and receive comprehensive explanations accompanied by flowcharts and diagrams. Built with Google's Agent Development Kit (ADK).
14
+
15
+ **My motivation to make this:** I often need to read several research papers and learn new advanced concepts in machine learning directly from highly technical papers to keep up with the literature and implement these new concepts at work and in my research. This tool will help me focus on the important parts of the paper and give me illustrations and diagrams to help me learn and visualize things faster. The agent is designed to use multiple diagrams to explain the concept, giving me more details than the one or two diagrams that are normally included in research papers.
16
+
17
+ ## Features
18
+
19
+ - **PDF Analysis**: Upload and analyze research papers in PDF format
20
+ - **Concept Explanation**: Get detailed, accessible explanations of complex research concepts
21
+ - **Visual Learning**: Automatic generation of flowcharts and diagrams to enhance understanding
22
+ - **Context-Aware**: Explanations are grounded in the specific paper being analyzed
23
+ - **Interactive Q&A**: Ask follow-up questions and get clarifications
24
+
25
+ ## Quick Start
26
+
27
+ ### Prerequisites
28
+
29
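+ - Python 3.10+ (`main.py` uses `X | None` annotations that FastAPI evaluates at runtime; the Dockerfile pins 3.11)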
30
+ - Google Cloud Project with Vertex AI enabled OR
31
+ - [Google AI Studio](https://aistudio.google.com/app/apikey) API key
32
+ - ADK (Agent Development Kit) installed
33
+
34
+ ### Installation
35
+
36
+ 1. Clone or download this project
37
+ 2. Install dependencies:
38
+ ```bash
39
+ pip install -r requirements.txt
40
+ ```
41
+
42
+ 3. Set up environment variables:
43
+ ```bash
44
+ cp env.example .env
45
+ ```
46
+ Edit `.env` and add either your Google Cloud (Vertex AI) configuration or your Google AI Studio API key:
47
+ ```
48
+ GOOGLE_GENAI_USE_VERTEXAI=TRUE
49
+ GOOGLE_CLOUD_PROJECT=your-project-id
50
+ GOOGLE_CLOUD_LOCATION=your-region
51
+ ```
52
+ OR
53
+ ```
54
+ GOOGLE_API_KEY=your-api-key
55
+ ```
56
+
57
+ ### Running locally
58
+
59
+ The backend (FastAPI + ADK agent) and frontend (static HTML) are served separately — mirroring how they're deployed in production (Cloud Run + Firebase Hosting).
60
+
61
+ **Terminal 1 — backend:**
62
+ ```bash
63
+ fastapi dev main.py
64
+ ```
65
+ The API will be available at `http://localhost:8000`.
66
+
67
+ **Terminal 2 — frontend:**
68
+ ```bash
69
+ cd public
70
+ python3 -m http.server 3000
71
+ ```
72
+ Open `http://localhost:3000` in your browser.
73
+
74
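+ > In this commit, `BACKEND_URL` in `public/index.html` is the relative path `/api/explain`, which only resolves when the page is served by the backend itself. `main.py` mounts `public/` at `/`, so opening `http://localhost:8000` works with no extra config. If you serve the frontend separately on port 3000, set `BACKEND_URL` to `http://localhost:8000/api/explain`.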
75
+
76
+ ## How It Works
77
+
78
+ ### Core Functionality
79
+
80
+ The Research Explainer agent follows a structured workflow:
81
+
82
+ 1. **Paper Analysis**: Reads and understands the uploaded PDF research paper
83
+ 2. **Concept Identification**: Identifies the specific concept you're asking about
84
+ 3. **Detailed Explanation**: Provides a clear, structured explanation including:
85
+ - Definition of the concept
86
+ - How it works (step-by-step if applicable)
87
+ - Why it's important in the context of the paper
88
+ - Key mathematical formulas or technical details
89
+ 4. **Visual Generation**: Creates appropriate flowcharts or diagrams to illustrate the concept
90
+ 5. **Integration**: Seamlessly integrates visual aids into the explanation
91
+
92
+ ### Response Structure
93
+
94
+ Each explanation follows this format:
95
+ - **Brief Overview**: What the concept is and why it matters
96
+ - **Detailed Explanation**: Step-by-step breakdown with technical details
97
+ - **Paper Context**: How this concept fits into the broader research
98
+ - **Visual Aid**: Flowchart or diagram (integrated at the most relevant point)
99
+ - **Key Takeaways**: Summary of the most important points
100
+
101
+ ## Tools
102
+
103
+ The agent is equipped with two specialized tools for visual learning:
104
+
105
+ ### 1. Flowchart Generator (`generate_flowchart`)
106
+
107
+ Creates programmatically generated flowcharts to illustrate processes, workflows, and relationships between concepts.
108
+
109
+ **Features:**
110
+ - Customizable node colors and labels
111
+ - Flexible connection patterns
112
+ - Professional styling with clean typography
113
+ - Automatic layout optimization
114
+
115
+ **Use Cases:**
116
+ - Algorithm workflows
117
+ - Process diagrams
118
+ - System architectures
119
+ - Decision trees
120
+ - Data flow diagrams
121
+
122
+ ### 2. Diagram Generator (`generate_diagram`)
123
+
124
+ Creates abstract diagrams and illustrations to explain complex concepts that don't fit into flowchart format.
125
+
126
+ **Features:**
127
+ - AI-generated technical illustrations
128
+ - High-resolution, clean design
129
+ - Context-aware visualizations
130
+ - Support for abstract concepts
131
+
132
+ **Use Cases:**
133
+ - Mathematical concepts
134
+ - Scientific phenomena
135
+ - Abstract relationships
136
+ - Conceptual models
137
+ - Technical illustrations
138
+
139
+ ## Example Usage
140
+
141
+ ### Sample Questions
142
+
143
+ - "Explain the machine learning algorithm described in this paper"
144
+ - "How does the proposed method work step by step?"
145
+ - "What is the architecture of the system described?"
146
+ - "Can you explain the mathematical formulation in section 3?"
147
+ - "What are the key contributions of this research?"
148
+
149
+ ### Sample Response
150
+
151
+ The agent will provide:
152
+ 1. Paper title and main contributions
153
+ 2. Detailed explanation of the requested concept
154
+ 3. Relevant flowcharts showing the process flow
155
+ 4. Additional diagrams illustrating key concepts
156
+ 5. Page references and citations from the paper
157
+
158
+ ## Technical Details
159
+
160
+ ### Model
161
+ - **Primary Model**: Gemini 2.5 Pro for text generation and analysis
162
+ - **Image Generation**: Gemini 2.5 Flash Image Preview for diagram creation
163
+ - **Flowchart Engine**: Graphviz for programmatic flowchart generation
164
+
165
+
166
+ ## Troubleshooting
167
+
168
+ ### Common Issues
169
+
170
+ 1. **PDF Upload Fails**: Ensure the PDF is not password-protected and is readable
171
+ 2. **No Visuals Generated**: The agent may determine that a concept doesn't need visual aids
172
+ 3. **Environment Errors**: Verify your Google Cloud credentials and project configuration, or check that `GOOGLE_API_KEY` is set in `.env`
173
+
174
+ ### Getting Help
175
+
176
+ If you encounter issues:
177
+ 1. Check that `GOOGLE_API_KEY` is set in `.env`
178
+ 2. Ensure all dependencies are properly installed
179
+ 3. Check the console output for detailed error messages
180
+
181
+ If using Vertex AI:
182
+ 1. Check your Google Cloud project configuration
183
+ 2. Verify that Vertex AI is enabled in your project
184
+ 3. Ensure all dependencies are properly installed
185
+ 4. Check the console output for detailed error messages
186
+
187
+ ## Contributing
188
+
189
+ This agent is designed to be easily extensible. You can:
190
+ - Add new tools for different types of visualizations
191
+ - Modify the prompt to specialize in specific research domains
192
+ - Enhance the PDF processing capabilities
193
+ - Add support for additional file formats
194
+
195
+ ## License
196
+
197
+ Created by Rohan Mitra (rohanmitra8@gmail.com)
198
+ Copyright © 2025
199
+
200
+ ---
201
+
202
+ **Note**: This agent requires either a Google Cloud project with Vertex AI enabled and authentication configured, or a Google AI Studio API key (`GOOGLE_API_KEY` in `.env`).
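
The README's Flowchart Generator section names Graphviz as the flowchart engine, but the `generate_flowchart` tool itself is not part of the files shown here. As a rough illustration only, the sketch below builds Graphviz DOT source from node labels, optional colors, and connections. The function name `build_flowchart_dot`, its parameters, and the styling defaults are hypothetical and are not taken from the repository.

```python
def _quote(text):
    """Quote a string for DOT, escaping backslashes and double quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_flowchart_dot(nodes, edges, colors=None, default_color="lightblue"):
    """Return DOT source for a top-to-bottom flowchart.

    nodes:  dict mapping node id -> label
    edges:  iterable of (source id, destination id) pairs
    colors: optional dict mapping node id -> fill color name
    """
    colors = colors or {}
    lines = [
        "digraph flowchart {",
        "  rankdir=TB;",
        '  node [shape=box, style="rounded,filled", fontname="Helvetica"];',
    ]
    for node_id, label in nodes.items():
        color = colors.get(node_id, default_color)
        lines.append(
            f"  {_quote(node_id)} [label={_quote(label)}, fillcolor={_quote(color)}];"
        )
    for src, dst in edges:
        # Reject edges that point at nodes that were never declared.
        if src not in nodes or dst not in nodes:
            raise ValueError(f"edge ({src!r}, {dst!r}) references an unknown node")
        lines.append(f"  {_quote(src)} -> {_quote(dst)};")
    lines.append("}")
    return "\n".join(lines)
```

The returned string can be written to a file and rendered with the `dot` command from the Graphviz package that the Dockerfile installs, for example `dot -Tpng flow.dot -o flow.png`.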
env.example ADDED
@@ -0,0 +1,8 @@
+ # Google Cloud Configuration
+ # GOOGLE_GENAI_USE_VERTEXAI=TRUE
+ # GOOGLE_CLOUD_PROJECT=<your-project-id>
+ # GOOGLE_CLOUD_LOCATION=<region>
+
+ # Google AI Studio (Gemini Developer API)
+ # Get a key at https://aistudio.google.com/app/apikey
+ GOOGLE_API_KEY=<your-api-key>
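
The two configuration styles in `env.example` (Vertex AI project settings versus a single `GOOGLE_API_KEY`) are easy to mix up. The sketch below shows one way a startup check could report which mode a given environment selects. The helper `describe_auth_mode` and its set of accepted truthy strings are assumptions made for illustration; how the Google SDK itself parses `GOOGLE_GENAI_USE_VERTEXAI` is not shown in this commit.

```python
import os

# Strings treated as "on" here; this list is an assumption of the sketch.
_TRUTHY = {"1", "true", "yes"}


def describe_auth_mode(env=None):
    """Return "vertex_ai" or "api_key" for a mapping of environment variables.

    Raises RuntimeError when neither configuration from env.example is complete.
    """
    env = os.environ if env is None else env
    use_vertex = env.get("GOOGLE_GENAI_USE_VERTEXAI", "").strip().lower() in _TRUTHY
    if use_vertex:
        required = ("GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION")
        missing = [name for name in required if not env.get(name)]
        if missing:
            raise RuntimeError(
                "Vertex AI mode is on but these are unset: " + ", ".join(missing)
            )
        return "vertex_ai"
    if env.get("GOOGLE_API_KEY"):
        return "api_key"
    raise RuntimeError(
        "Set GOOGLE_API_KEY, or enable Vertex AI with a project and location."
    )
```

Calling `describe_auth_mode()` with no argument inspects the real process environment, so it could run right after `dotenv.load_dotenv()` to fail early with a readable message.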
main.py ADDED
@@ -0,0 +1,334 @@
+ # Copyright 2026 Google LLC
+ #
+ # Licensed under the Apache License, Version 2.0 (the "License");
+ # you may not use this file except in compliance with the License.
+ # You may obtain a copy of the License at
+ #
+ #     http://www.apache.org/licenses/LICENSE-2.0
+ #
+ # Unless required by applicable law or agreed to in writing, software
+ # distributed under the License is distributed on an "AS IS" BASIS,
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ # See the License for the specific language governing permissions and
+ # limitations under the License.
+
+ """
+ FastAPI entrypoint for an ADK agent using **in-memory** session and artifact storage.
+
+ Intended for **single-instance** deployment (e.g. one Cloud Run instance with min/max
+ instances = 1). State is lost on restart and is not shared across replicas.
+
+ Swap points for your project:
+   - Import: replace `paper_agent` / the fallback import with your real agent symbol
+     (this repo exposes `root_agent` in `research_explainer.agent`).
+
+ Images in the JSON response are **data URLs** (`data:image/png;base64,...`) loaded from
+ the in-memory artifact store after the run, so a browser or frontend can render them
+ without GCS.
+
+ Session TTL: not implemented in this file. Nothing here reads ``SESSION_TTL_SECONDS``, so
+ sessions and their artifacts live until the process restarts. If you add inactivity-based
+ expiry, remove only **session-scoped** artifacts and leave ``user:`` namespaced artifacts
+ intact so other sessions for the same ``user_id`` are not affected.
+ """
+
+ from __future__ import annotations
+
+ import base64
+ import logging
+ import os
+ from typing import Any, Iterable
+
+ import dotenv
+
+ dotenv.load_dotenv()
+
+ from fastapi import FastAPI, File, Form, HTTPException, UploadFile
+ from fastapi.middleware.cors import CORSMiddleware
+ from fastapi.staticfiles import StaticFiles
+ from google.adk.artifacts import InMemoryArtifactService
+ from google.adk.events.event import Event
+ from google.adk.runners import Runner
+ from google.adk.sessions import InMemorySessionService
+ from google.genai import types
+ from pydantic import BaseModel
+ import uvicorn
+
+ # -----------------------------------------------------------------------------
+ # SWAP: import your root ADK agent here.
+ # Example for this repository:
+ #   from research_explainer.agent import root_agent as paper_agent
+ # -----------------------------------------------------------------------------
+ try:
+     from agent import paper_agent
+ except ImportError:  # pragma: no cover - convenience for this repo layout
+     from research_explainer.agent import root_agent as paper_agent
+
+ logger = logging.getLogger(__name__)
+
+ # Max PDF size for UploadFile (bytes).
+ MAX_PDF_BYTES = int(os.environ.get("MAX_PDF_BYTES", str(25 * 1024 * 1024)))
+
+ APP_NAME = os.environ.get("ADK_APP_NAME", "research_explainer")
+ DEFAULT_USER_ID = os.environ.get("ADK_USER_ID", "web")
+ # Set RUNNING_LOCALLY=1 for verbose session logging (similar to local dev flags).
+ RUNNING_LOCALLY = os.environ.get("RUNNING_LOCALLY", "").lower() in (
+     "1",
+     "true",
+     "yes",
+ )
+
+ artifact_service = InMemoryArtifactService()
+ session_service = InMemorySessionService()
+
+ runner = Runner(
+     app_name=APP_NAME,
+     agent=paper_agent,
+     session_service=session_service,
+     artifact_service=artifact_service,
+ )
+
+
+ async def resolve_session(
+     user_id: str,
+     session_id: str,
+     *,
+     initial_state: dict[str, Any] | None = None,
+ ) -> None:
+     """
+     Load an existing session or create one with the given id.
+
+     Use `initial_state` when you need to seed session-scoped state on first creation
+     (e.g. tool flags). Omitted here by default; extend the call site if your app needs it.
+     """
+     sess = await session_service.get_session(
+         app_name=APP_NAME,
+         user_id=user_id,
+         session_id=session_id,
+     )
+     if sess is not None:
+         if RUNNING_LOCALLY:
+             logger.info(
+                 "Session already exists: app=%r user=%r session=%r",
+                 APP_NAME,
+                 user_id,
+                 session_id,
+             )
+         return
+
+     await session_service.create_session(
+         app_name=APP_NAME,
+         user_id=user_id,
+         session_id=session_id,
+         state=initial_state,
+     )
+     logger.info(
+         "New session created: app=%r user=%r session=%r",
+         APP_NAME,
+         user_id,
+         session_id,
+     )
+
+
+ app = FastAPI(title="Research Explainer API", version="1.0.0")
+
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],
+     allow_credentials=False,
+     allow_methods=["*"],
+     allow_headers=["*"],
+ )
+
+
+ def _gather_text_for_response(events: Iterable[Event]) -> str:
+     """Collects user-visible assistant text from streamed events.
+
+     Do not skip events just because they also include tool calls/responses; the model
+     often emits explanation text in the same turn as ``function_call`` parts. Skipping
+     those events previously dropped the entire explanation while images still appeared.
+     """
+     final_chunks: list[str] = []
+     assistant_chunks: list[str] = []
+
+     for event in events:
+         if event.partial:
+             continue
+         if not event.content or not event.content.parts:
+             continue
+         # User turn events can appear in the stream; only aggregate assistant output.
+         if event.author == "user":
+             continue
+
+         pieces: list[str] = []
+         for part in event.content.parts:
+             if part.text:
+                 pieces.append(part.text)
+         segment = "".join(pieces).strip()
+         if not segment:
+             continue
+
+         assistant_chunks.append(segment)
+         if event.is_final_response():
+             final_chunks.append(segment)
+
+     if final_chunks:
+         return "\n\n".join(final_chunks)
+     if assistant_chunks:
+         return "\n\n".join(assistant_chunks)
+     return ""
+
+
+ async def _collect_images_as_data_urls(
+     events: Iterable[Event],
+     *,
+     app_name: str,
+     user_id: str,
+     session_id: str,
+ ) -> list[str]:
+     """
+     Loads image artifacts touched during this run from the in-memory artifact service
+     and returns them as data URLs for the frontend.
+     """
+     seen: set[tuple[str, int]] = set()
+     ordered: list[str] = []
+
+     for event in events:
+         if not event.actions or not event.actions.artifact_delta:
+             continue
+         for filename, version in event.actions.artifact_delta.items():
+             key = (filename, version)
+             if key in seen:
+                 continue
+             seen.add(key)
+
+             load_session_id = None if filename.startswith("user:") else session_id
+             part = await artifact_service.load_artifact(
+                 app_name=app_name,
+                 user_id=user_id,
+                 filename=filename,
+                 session_id=load_session_id,
+                 version=version,
+             )
+             if not part or not part.inline_data or not part.inline_data.data:
+                 continue
+             mime = (part.inline_data.mime_type or "application/octet-stream").lower()
+             if not mime.startswith("image/"):
+                 continue
+             b64 = base64.b64encode(part.inline_data.data).decode("ascii")
+             ordered.append(f"data:{mime};base64,{b64}")
+
+     return ordered
+
+
+ class ExplainResponse(BaseModel):
+     text: str
+     images: list[str]
+
+
+ @app.post("/api/explain", response_model=ExplainResponse)
+ async def explain(
+     session_id: str = Form(...),
+     user_input: str = Form(""),
+     file: UploadFile | None = File(default=None),
+ ) -> ExplainResponse:
+     """
+     Runs one agent turn for the given ``session_id``.
+
+     Send JSON-compatible fields via **multipart/form-data**: ``session_id``, ``user_input``,
+     and optional ``file`` (PDF). The PDF is attached to the user message as inline bytes
+     for the model. A PDF is only accepted on the **first** turn of a session (no prior
+     events); later turns must omit ``file``.
+     """
+     session_id = session_id.strip()
+     user_input = (user_input or "").strip()
+     user_id = DEFAULT_USER_ID
+
+     pdf_bytes: bytes | None = None
+     if file is not None and getattr(file, "filename", None):
+         if not str(file.filename).lower().endswith(".pdf"):
+             raise HTTPException(
+                 status_code=400, detail="Only PDF uploads are supported (.pdf)."
+             )
+         pdf_bytes = await file.read()
+         if len(pdf_bytes) > MAX_PDF_BYTES:
+             raise HTTPException(
+                 status_code=400,
+                 detail=f"PDF exceeds maximum size of {MAX_PDF_BYTES // (1024 * 1024)} MB.",
+             )
+         if not pdf_bytes:
+             raise HTTPException(status_code=400, detail="Uploaded PDF is empty.")
+
+     if not user_input and not pdf_bytes:
+         raise HTTPException(
+             status_code=400,
+             detail="Provide non-empty user_input and/or a PDF file.",
+         )
+
+     try:
+         await resolve_session(user_id, session_id)
+     except Exception as exc:  # pragma: no cover - runtime guard
+         logger.exception("Session resolution failed for session_id=%s", session_id)
+         raise HTTPException(status_code=500, detail=str(exc)) from exc
+
+     existing = await session_service.get_session(
+         app_name=APP_NAME,
+         user_id=user_id,
+         session_id=session_id,
+     )
+     if (
+         pdf_bytes is not None
+         and existing
+         and existing.events
+         and len(existing.events) > 0
+     ):
+         raise HTTPException(
+             status_code=400,
+             detail="A PDF can only be attached on the first message of a conversation.",
+         )
+
+     parts: list[types.Part] = []
+     if pdf_bytes is not None:
+         parts.append(
+             types.Part.from_bytes(data=pdf_bytes, mime_type="application/pdf")
+         )
+     if user_input:
+         parts.append(types.Part.from_text(text=user_input))
+
+     new_message = types.Content(role="user", parts=parts)
+
+     collected: list[Event] = []
+
+     try:
+         async for event in runner.run_async(
+             user_id=user_id,
+             session_id=session_id,
+             new_message=new_message,
+         ):
+             collected.append(event)
+     except HTTPException:
+         raise
+     except Exception as exc:  # pragma: no cover - runtime guard
+         logger.exception("Runner failed for session_id=%s", session_id)
+         raise HTTPException(status_code=500, detail=str(exc)) from exc
+
+     text = _gather_text_for_response(collected)
+     images = await _collect_images_as_data_urls(
+         collected,
+         app_name=APP_NAME,
+         user_id=user_id,
+         session_id=session_id,
+     )
+
+     return ExplainResponse(text=text, images=images)
+
+
+
+ # Serve the static frontend in public/ at the site root, so a SINGLE container
+ # serves both the web page and the /api/explain endpoint (Hugging Face Spaces).
+ app.mount("/", StaticFiles(directory="public", html=True), name="static")
+
+ if __name__ == "__main__":
+     logging.basicConfig(level=os.environ.get("LOG_LEVEL", "INFO"))
+     port = int(os.environ.get("PORT", "8080"))
+     uvicorn.run("main:app", host="0.0.0.0", port=port)
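
The `/api/explain` endpoint in `main.py` takes `session_id`, `user_input`, and an optional `file` as multipart form fields, and returns images as base64 data URLs. For readers who want to call it without the bundled frontend, here is a minimal standard-library sketch that builds such a request and decodes a returned data URL. The helper names `build_explain_request` and `decode_data_url`, the default file name, and the base URL in the usage note are hypothetical; only the field names and the data URL shape come from `main.py`.

```python
import base64
import uuid
from urllib import request


def build_explain_request(base_url, session_id, user_input, pdf_bytes=None,
                          pdf_name="paper.pdf"):
    """Build (but do not send) a multipart/form-data POST for /api/explain."""
    boundary = uuid.uuid4().hex
    chunks = []

    def add_field(name, value):
        # One plain-text form field per multipart section.
        chunks.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )

    add_field("session_id", session_id)
    add_field("user_input", user_input)
    if pdf_bytes is not None:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{pdf_name}"\r\n'
            "Content-Type: application/pdf\r\n\r\n"
        ).encode("utf-8")
        chunks.append(header + pdf_bytes + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))

    return request.Request(
        url=base_url.rstrip("/") + "/api/explain",
        data=b"".join(chunks),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def decode_data_url(data_url):
    """Split a base64 data URL into (mime_type, raw_bytes)."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    mime = header[len("data:"):-len(";base64")]
    return mime, base64.b64decode(payload)
```

To send the request against a locally running backend, pass the result to `urllib.request.urlopen`, parse the body with `json.loads`, and feed each entry of the `images` list to `decode_data_url`. Because the server rejects a PDF on any turn after the first, follow-up questions in the same session should leave `pdf_bytes` as `None`.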
public/index.html ADDED
@@ -0,0 +1,550 @@
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="UTF-8" />
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0" />
6
+ <title>AI Research Paper Explainer</title>
7
+ <script src="https://cdn.tailwindcss.com"></script>
8
+ <script>
9
+ tailwind.config = { darkMode: "class" };
10
+ </script>
11
+ <script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
12
+ <style>
13
+ .drop-zone-active {
14
+ border-color: #6366f1 !important;
15
+ background-color: #eef2ff !important;
16
+ box-shadow: 0 0 0 3px rgba(99, 102, 241, 0.25);
17
+ }
18
+ .dark .drop-zone-active {
19
+ background-color: #1e1b4b !important;
20
+ }
21
+ </style>
22
+ </head>
23
+ <body class="min-h-screen bg-stone-100 text-stone-900 antialiased transition-colors dark:bg-stone-950 dark:text-stone-100">
24
+ <div class="mx-auto flex min-h-screen max-w-3xl flex-col px-4 py-6">
25
+
26
+ <!-- Header -->
27
+ <header class="mb-5 flex flex-shrink-0 items-center justify-between">
28
+ <div>
29
+ <h1 class="font-serif text-2xl font-bold tracking-tight text-stone-800 dark:text-stone-100 md:text-3xl">
30
+ AI Research Paper Explainer
31
+ </h1>
32
+ <p class="mt-0.5 text-sm text-stone-500 dark:text-stone-400">
33
+ Powered by Gemini 2.5 Pro · Google ADK
34
+ </p>
35
+ </div>
36
+ <div class="flex items-center gap-2">
37
+ <button
38
+ type="button"
39
+ id="theme-toggle"
40
+ class="rounded-lg border border-stone-300 bg-white px-3 py-2 text-sm text-stone-600 shadow-sm transition hover:bg-stone-50 dark:border-stone-600 dark:bg-stone-800 dark:text-stone-300 dark:hover:bg-stone-700"
41
+ title="Toggle dark mode"
42
+ >
43
+ <span id="theme-toggle-label">Dark</span>
44
+ </button>
45
+ <button
46
+ type="button"
47
+ id="new-chat-btn"
48
+ class="rounded-lg border border-stone-300 bg-white px-3 py-2 text-sm text-stone-600 shadow-sm transition hover:bg-stone-50 dark:border-stone-600 dark:bg-stone-800 dark:text-stone-300 dark:hover:bg-stone-700"
49
+ >
50
+ New chat
51
+ </button>
52
+ </div>
53
+ </header>
54
+
55
+ <!-- Chat panel -->
56
+ <div class="flex min-h-0 flex-1 flex-col overflow-hidden rounded-2xl border border-stone-200 bg-white shadow-sm dark:border-stone-700 dark:bg-stone-900">
57
+
58
+ <!-- Message list -->
59
+ <div
60
+ id="chat-messages"
61
+ class="flex-1 space-y-5 overflow-y-auto p-5"
62
+ aria-live="polite"
63
+ >
64
+ <div id="chat-empty" class="flex flex-col items-center justify-center py-16 text-center">
65
+ <div class="mb-3 flex h-14 w-14 items-center justify-center rounded-full bg-indigo-50 dark:bg-indigo-950">
66
+ <svg class="h-7 w-7 text-indigo-500" fill="none" viewBox="0 0 24 24" stroke-width="1.5" stroke="currentColor">
67
+ <path stroke-linecap="round" stroke-linejoin="round" d="M19.5 14.25v-2.625a3.375 3.375 0 0 0-3.375-3.375h-1.5A1.125 1.125 0 0 1 13.5 7.125v-1.5a3.375 3.375 0 0 0-3.375-3.375H8.25m0 12.75h7.5m-7.5 3H12M10.5 2.25H5.625c-.621 0-1.125.504-1.125 1.125v17.25c0 .621.504 1.125 1.125 1.125h12.75c.621 0 1.125-.504 1.125-1.125V11.25a9 9 0 0 0-9-9Z" />
68
+ </svg>
69
+ </div>
70
+ <p class="text-sm font-medium text-stone-700 dark:text-stone-300">Upload a paper and start asking questions</p>
71
+ <p class="mt-1 text-xs text-stone-400 dark:text-stone-500">
72
+ Press <kbd class="rounded bg-stone-100 px-1.5 py-0.5 text-stone-600 dark:bg-stone-800 dark:text-stone-300">Enter</kbd> to send &nbsp;·&nbsp;
73
+ <kbd class="rounded bg-stone-100 px-1.5 py-0.5 text-stone-600 dark:bg-stone-800 dark:text-stone-300">Shift+Enter</kbd> for a new line
74
+ </p>
75
+ </div>
76
+ </div>
77
+
78
+ <!-- Typing indicator -->
79
+ <div
80
+ id="typing-indicator"
81
+ class="hidden border-t border-stone-100 px-5 py-2.5 text-sm text-stone-500 dark:border-stone-800 dark:text-stone-400"
82
+ >
83
+ <span class="inline-flex items-center gap-2">
84
+ <span class="flex gap-0.5">
85
+ <span class="h-1.5 w-1.5 animate-bounce rounded-full bg-indigo-400 [animation-delay:-0.3s]"></span>
86
+ <span class="h-1.5 w-1.5 animate-bounce rounded-full bg-indigo-400 [animation-delay:-0.15s]"></span>
87
+ <span class="h-1.5 w-1.5 animate-bounce rounded-full bg-indigo-400"></span>
88
+ </span>
89
+ Agent is thinking…
90
+ </span>
91
+ </div>
92
+
93
+ <!-- Input area -->
94
+ <form id="chat-form" class="border-t border-stone-200 bg-stone-50 p-4 dark:border-stone-700 dark:bg-stone-900/60" novalidate>
95
+
96
+ <!-- PDF drop zone (hidden after first turn) -->
97
+ <div id="pdf-zone-wrap" class="mb-3">
98
+ <div
99
+ id="pdf-drop-zone"
100
+ class="relative flex cursor-pointer flex-col items-center justify-center gap-2 rounded-xl border-2 border-dashed border-stone-300 bg-white px-4 py-5 transition dark:border-stone-600 dark:bg-stone-800 hover:border-indigo-400 dark:hover:border-indigo-500"
101
+ >
102
+ <!-- Default (no file selected) -->
103
+ <div id="pdf-drop-default" class="flex flex-col items-center gap-1 text-center">
104
+ <svg class="h-8 w-8 text-stone-400 dark:text-stone-500" fill="none" viewBox="0 0 24 24" stroke-width="1.5" stroke="currentColor">
105
+ <path stroke-linecap="round" stroke-linejoin="round" d="M3 16.5v2.25A2.25 2.25 0 0 0 5.25 21h13.5A2.25 2.25 0 0 0 21 18.75V16.5m-13.5-9L12 3m0 0 4.5 4.5M12 3v13.5" />
106
+ </svg>
107
+ <p class="text-sm font-medium text-stone-600 dark:text-stone-300">
108
+ Drag &amp; drop your PDF here, or <span class="text-indigo-600 underline dark:text-indigo-400">click to upload</span>
109
+ </p>
110
+ <p class="text-xs text-stone-400 dark:text-stone-500">One PDF per conversation · Max 25 MB</p>
111
+ </div>
112
+
113
+ <!-- File selected state -->
114
+ <div id="pdf-drop-selected" class="hidden w-full items-center justify-between gap-3">
115
+ <div class="flex min-w-0 items-center gap-2">
116
+ <svg class="h-5 w-5 shrink-0 text-indigo-500" fill="none" viewBox="0 0 24 24" stroke-width="1.5" stroke="currentColor">
117
+ <path stroke-linecap="round" stroke-linejoin="round" d="M19.5 14.25v-2.625a3.375 3.375 0 0 0-3.375-3.375h-1.5A1.125 1.125 0 0 1 13.5 7.125v-1.5a3.375 3.375 0 0 0-3.375-3.375H8.25m2.25 0H5.625c-.621 0-1.125.504-1.125 1.125v17.25c0 .621.504 1.125 1.125 1.125h12.75c.621 0 1.125-.504 1.125-1.125V11.25a9 9 0 0 0-9-9Z" />
118
+ </svg>
119
+ <span id="pdf-filename" class="truncate text-sm font-medium text-stone-700 dark:text-stone-200"></span>
120
+ </div>
121
+ <button
122
+ type="button"
123
+ id="pdf-clear-btn"
124
+ class="shrink-0 rounded-full p-1 text-stone-400 transition hover:bg-stone-100 hover:text-stone-600 dark:hover:bg-stone-700 dark:hover:text-stone-200"
125
+ title="Remove file"
126
+ >
127
+ <svg class="h-4 w-4" fill="none" viewBox="0 0 24 24" stroke-width="2" stroke="currentColor">
128
+ <path stroke-linecap="round" stroke-linejoin="round" d="M6 18 18 6M6 6l12 12" />
129
+ </svg>
130
+ </button>
131
+ </div>
132
+
133
+ <!-- Hidden file input -->
134
+ <input
135
+ type="file"
136
+ id="pdf-input"
137
+ accept=".pdf,application/pdf"
138
+ class="absolute inset-0 cursor-pointer opacity-0"
139
+ />
140
+ </div>
141
+ </div>
142
+
143
+ <!-- Locked PDF banner (shown after first turn) -->
144
+ <div id="pdf-locked-banner" class="mb-3 hidden items-center gap-2 rounded-xl border border-stone-200 bg-stone-50 px-3 py-2 text-xs text-stone-500 dark:border-stone-700 dark:bg-stone-800 dark:text-stone-400">
145
+ <svg class="h-3.5 w-3.5 shrink-0" fill="none" viewBox="0 0 24 24" stroke-width="2" stroke="currentColor">
146
+ <path stroke-linecap="round" stroke-linejoin="round" d="M16.5 10.5V6.75a4.5 4.5 0 1 0-9 0v3.75m-.75 11.25h10.5a2.25 2.25 0 0 0 2.25-2.25v-6.75a2.25 2.25 0 0 0-2.25-2.25H6.75a2.25 2.25 0 0 0-2.25 2.25v6.75a2.25 2.25 0 0 0 2.25 2.25Z" />
147
+ </svg>
148
+ <span id="pdf-locked-name"></span>
149
+ <span class="text-stone-400 dark:text-stone-500">· Click <strong>New chat</strong> to use a different paper.</span>
150
+ </div>
151
+
152
+ <!-- Text input + send -->
153
+ <div class="flex items-end gap-2">
154
+ <textarea
155
+ id="chat-input"
156
+ rows="2"
157
+ class="min-h-[4rem] flex-1 resize-none rounded-xl border border-stone-300 bg-white px-3.5 py-2.5 text-sm text-stone-800 shadow-inner placeholder:text-stone-400 focus:border-indigo-500 focus:outline-none focus:ring-2 focus:ring-indigo-500/25 dark:border-stone-600 dark:bg-stone-800 dark:text-stone-100 dark:placeholder:text-stone-500"
158
+ placeholder="Ask about the paper…"
159
+ ></textarea>
160
+ <button
161
+ type="submit"
162
+ id="send-btn"
163
+ class="inline-flex h-10 shrink-0 items-center justify-center gap-1.5 rounded-xl bg-indigo-600 px-4 text-sm font-semibold text-white shadow transition hover:bg-indigo-700 focus:outline-none focus:ring-2 focus:ring-indigo-500 focus:ring-offset-2 focus:ring-offset-stone-50 disabled:cursor-not-allowed disabled:opacity-50 dark:focus:ring-offset-stone-900"
164
+ >
165
+ <span id="send-label">Send</span>
166
+ <span
167
+ id="send-spinner"
168
+ class="hidden h-4 w-4 animate-spin rounded-full border-2 border-white border-t-transparent"
169
+ aria-hidden="true"
170
+ ></span>
171
+ </button>
172
+ </div>
173
+
174
+ <p id="validation-msg" class="mt-2 hidden text-xs text-red-600 dark:text-red-400" role="alert"></p>
175
+ </form>
176
+ </div>
177
+ </div>
178
+
179
+ <!-- Lightbox -->
180
+ <div
181
+ id="lightbox"
182
+ class="fixed inset-0 z-[100] hidden cursor-zoom-out items-center justify-center bg-black/85 p-4"
183
+ role="dialog"
184
+ aria-modal="true"
185
+ aria-label="Enlarged image"
186
+ >
187
+ <button
188
+ type="button"
189
+ id="lightbox-close"
190
+ class="absolute right-4 top-4 rounded-full bg-white/10 px-3 py-1 text-sm text-white hover:bg-white/20"
191
+ >
192
+ Close
193
+ </button>
194
+ <img id="lightbox-img" src="" alt="" class="max-h-[92vh] max-w-full rounded object-contain shadow-2xl" />
195
+ </div>
196
+
197
+ <script>
198
+ // TODO: After deploying the backend to Cloud Run, replace this with your Cloud Run HTTPS URL.
199
+ const BACKEND_URL = "/api/explain";
200
+
201
+ const SESSION_KEY = "research-explainer-session-id";
202
+ const THEME_KEY = "research-explainer-theme";
203
+
204
+ const chatMessages = document.getElementById("chat-messages");
205
+ const chatForm = document.getElementById("chat-form");
206
+ const chatInput = document.getElementById("chat-input");
207
+ const sendBtn = document.getElementById("send-btn");
208
+ const sendLabel = document.getElementById("send-label");
209
+ const sendSpinner = document.getElementById("send-spinner");
210
+ const typingIndicator = document.getElementById("typing-indicator");
211
+ const validationMsg = document.getElementById("validation-msg");
212
+ const newChatBtn = document.getElementById("new-chat-btn");
213
+ const pdfInput = document.getElementById("pdf-input");
214
+ const pdfDropZone = document.getElementById("pdf-drop-zone");
215
+ const pdfZoneWrap = document.getElementById("pdf-zone-wrap");
216
+ const pdfDropDefault = document.getElementById("pdf-drop-default");
217
+ const pdfDropSelected = document.getElementById("pdf-drop-selected");
218
+ const pdfFilename = document.getElementById("pdf-filename");
219
+ const pdfClearBtn = document.getElementById("pdf-clear-btn");
220
+ const pdfLockedBanner = document.getElementById("pdf-locked-banner");
221
+ const pdfLockedName = document.getElementById("pdf-locked-name");
222
+ const lightbox = document.getElementById("lightbox");
223
+ const lightboxImg = document.getElementById("lightbox-img");
224
+ const lightboxClose = document.getElementById("lightbox-close");
225
+ const themeToggle = document.getElementById("theme-toggle");
226
+ const themeToggleLabel = document.getElementById("theme-toggle-label");
227
+
228
+ var hasCompletedFirstTurn = false;
229
+
230
+ const mdClasses =
231
+ "prose-msg space-y-2 leading-relaxed text-stone-800 dark:text-stone-100 [&_a]:text-indigo-600 dark:[&_a]:text-indigo-400 [&_a]:underline [&_code]:rounded [&_code]:bg-stone-100 dark:[&_code]:bg-stone-900 [&_code]:px-1 [&_pre]:overflow-x-auto [&_pre]:rounded-lg [&_pre]:bg-stone-100 dark:[&_pre]:bg-stone-900 [&_pre]:p-3 [&_ul]:list-disc [&_ul]:pl-5 [&_ol]:list-decimal [&_ol]:pl-5 [&_h1]:text-lg [&_h2]:text-base [&_h3]:text-sm [&_blockquote]:border-l-4 [&_blockquote]:border-stone-300 dark:[&_blockquote]:border-stone-600 [&_blockquote]:pl-4 [&_blockquote]:italic";
232
+
233
+ // ── Theme ──────────────────────────────────────────────────────────────
234
+
235
+ function isDarkMode() { return document.documentElement.classList.contains("dark"); }
236
+
237
+ function syncThemeUi() {
238
+ var dark = isDarkMode();
239
+ themeToggleLabel.textContent = dark ? "Light" : "Dark";
240
+ }
241
+
242
+ function applyStoredTheme() {
243
+ if (localStorage.getItem(THEME_KEY) === "dark") {
244
+ document.documentElement.classList.add("dark");
245
+ } else {
246
+ document.documentElement.classList.remove("dark");
247
+ }
248
+ syncThemeUi();
249
+ }
250
+
251
+ applyStoredTheme();
252
+ themeToggle.addEventListener("click", function () {
253
+ document.documentElement.classList.toggle("dark");
254
+ localStorage.setItem(THEME_KEY, isDarkMode() ? "dark" : "light");
255
+ syncThemeUi();
256
+ });
257
+
258
+ // ── Session ────────────────────────────────────────────────────────────
259
+
260
+ function getSessionId() {
261
+ var id = sessionStorage.getItem(SESSION_KEY);
262
+ if (!id) {
263
+ id = typeof crypto !== "undefined" && crypto.randomUUID
264
+ ? crypto.randomUUID()
265
+ : "sess-" + Date.now() + "-" + String(Math.random()).slice(2, 10);
266
+ sessionStorage.setItem(SESSION_KEY, id);
267
+ }
268
+ return id;
269
+ }
270
+
271
+ // ── PDF zone state ─────────────────────────────────────────────────────
272
+
273
+ function showFileSelected(name) {
274
+ pdfDropDefault.classList.add("hidden");
275
+ pdfDropSelected.classList.remove("hidden");
276
+ pdfDropSelected.classList.add("flex");
277
+ pdfFilename.textContent = name;
278
+ }
279
+
280
+ function showFileDefault() {
281
+ pdfDropDefault.classList.remove("hidden");
282
+ pdfDropSelected.classList.add("hidden");
283
+ pdfDropSelected.classList.remove("flex");
284
+ pdfFilename.textContent = "";
285
+ }
286
+
287
+ function lockPdfZone(name) {
288
+ pdfZoneWrap.classList.add("hidden");
289
+ pdfLockedBanner.classList.remove("hidden");
290
+ pdfLockedBanner.classList.add("flex");
291
+ pdfLockedName.textContent = name || "No PDF attached";
292
+ }
293
+
294
+ function unlockPdfZone() {
295
+ pdfZoneWrap.classList.remove("hidden");
296
+ pdfLockedBanner.classList.add("hidden");
297
+ pdfLockedBanner.classList.remove("flex");
298
+ showFileDefault();
299
+ }
300
+
301
+ pdfClearBtn.addEventListener("click", function (e) {
302
+ e.stopPropagation();
303
+ pdfInput.value = "";
304
+ showFileDefault();
305
+ });
306
+
307
+ pdfInput.addEventListener("change", function () {
308
+ if (pdfInput.files && pdfInput.files[0]) {
309
+ showFileSelected(pdfInput.files[0].name);
310
+ } else {
311
+ showFileDefault();
312
+ }
313
+ });
314
+
315
+ // Drag-and-drop
316
+ ["dragenter", "dragover"].forEach(function (evt) {
317
+ pdfDropZone.addEventListener(evt, function (e) {
318
+ e.preventDefault();
319
+ pdfDropZone.classList.add("drop-zone-active");
320
+ });
321
+ });
322
+ ["dragleave", "drop"].forEach(function (evt) {
323
+ pdfDropZone.addEventListener(evt, function (e) {
324
+ e.preventDefault();
325
+ pdfDropZone.classList.remove("drop-zone-active");
326
+ });
327
+ });
328
+ pdfDropZone.addEventListener("drop", function (e) {
329
+ var files = e.dataTransfer && e.dataTransfer.files;
330
+ if (files && files[0]) {
331
+ var dt = new DataTransfer();
332
+ dt.items.add(files[0]);
333
+ pdfInput.files = dt.files;
334
+ showFileSelected(files[0].name);
335
+ }
336
+ });
337
+
338
+ // ── New chat ───────────────────────────────────────────────────────────
339
+
340
+ function startNewChat() {
341
+ sessionStorage.removeItem(SESSION_KEY);
342
+ hasCompletedFirstTurn = false;
343
+ pdfInput.value = "";
344
+ unlockPdfZone();
345
+ chatMessages.innerHTML = "";
346
+ var empty = document.createElement("div");
347
+ empty.id = "chat-empty";
348
+ empty.className = "flex flex-col items-center justify-center py-16 text-center";
349
+ empty.innerHTML =
350
+ '<div class="mb-3 flex h-14 w-14 items-center justify-center rounded-full bg-indigo-50 dark:bg-indigo-950">' +
351
+ '<svg class="h-7 w-7 text-indigo-500" fill="none" viewBox="0 0 24 24" stroke-width="1.5" stroke="currentColor"><path stroke-linecap="round" stroke-linejoin="round" d="M19.5 14.25v-2.625a3.375 3.375 0 0 0-3.375-3.375h-1.5A1.125 1.125 0 0 1 13.5 7.125v-1.5a3.375 3.375 0 0 0-3.375-3.375H8.25m0 12.75h7.5m-7.5 3H12M10.5 2.25H5.625c-.621 0-1.125.504-1.125 1.125v17.25c0 .621.504 1.125 1.125 1.125h12.75c.621 0 1.125-.504 1.125-1.125V11.25a9 9 0 0 0-9-9Z" /></svg>' +
352
+ '</div>' +
353
+ '<p class="text-sm font-medium text-stone-700 dark:text-stone-300">Upload a paper and start asking questions</p>' +
354
+ '<p class="mt-1 text-xs text-stone-400 dark:text-stone-500">Press <kbd class="rounded bg-stone-100 px-1.5 py-0.5 text-stone-600 dark:bg-stone-800 dark:text-stone-300">Enter</kbd> to send &nbsp;·&nbsp; <kbd class="rounded bg-stone-100 px-1.5 py-0.5 text-stone-600 dark:bg-stone-800 dark:text-stone-300">Shift+Enter</kbd> for a new line</p>';
355
+ chatMessages.appendChild(empty);
356
+ chatInput.focus();
357
+ }
358
+
359
+ newChatBtn.addEventListener("click", startNewChat);
360
+
361
+ // ── Loading state ──────────────────────────────────────────────────────
362
+
363
+ function scrollToBottom() { chatMessages.scrollTop = chatMessages.scrollHeight; }
364
+
365
+ function setLoading(on) {
366
+ sendBtn.disabled = on;
367
+ chatInput.disabled = on;
368
+ pdfInput.disabled = on;
369
+ if (on) {
370
+ sendSpinner.classList.remove("hidden");
371
+ sendLabel.textContent = "…";
372
+ typingIndicator.classList.remove("hidden");
373
+ } else {
374
+ sendSpinner.classList.add("hidden");
375
+ sendLabel.textContent = "Send";
376
+ typingIndicator.classList.add("hidden");
377
+ }
378
+ scrollToBottom();
379
+ }
380
+
381
+ // ── Chat bubbles ───────────────────────────────────────────────────────
382
+
383
+ function appendUserBubble(text, withPdf) {
384
+ var emptyEl = document.getElementById("chat-empty");
385
+ if (emptyEl) emptyEl.remove();
386
+ var wrap = document.createElement("div");
387
+ wrap.className = "flex justify-end";
388
+ var bubble = document.createElement("div");
389
+ bubble.className =
390
+ "max-w-[82%] rounded-2xl rounded-br-md bg-indigo-600 px-4 py-2.5 text-sm text-white shadow-sm dark:bg-indigo-700";
391
+ bubble.textContent = text || "(PDF only)";
392
+ if (withPdf) {
393
+ var note = document.createElement("div");
394
+ note.className = "mt-1.5 flex items-center gap-1 border-t border-indigo-400/50 pt-1.5 text-xs text-indigo-200";
395
+ note.innerHTML = '<svg class="h-3 w-3" fill="none" viewBox="0 0 24 24" stroke-width="2" stroke="currentColor"><path stroke-linecap="round" stroke-linejoin="round" d="M19.5 14.25v-2.625a3.375 3.375 0 0 0-3.375-3.375h-1.5A1.125 1.125 0 0 1 13.5 7.125v-1.5a3.375 3.375 0 0 0-3.375-3.375H8.25m2.25 0H5.625c-.621 0-1.125.504-1.125 1.125v17.25c0 .621.504 1.125 1.125 1.125h12.75c.621 0 1.125-.504 1.125-1.125V11.25a9 9 0 0 0-9-9Z"/></svg> PDF attached';
396
+ bubble.appendChild(note);
397
+ }
398
+ wrap.appendChild(bubble);
399
+ chatMessages.appendChild(wrap);
400
+ scrollToBottom();
401
+ }
402
+
403
+ function appendAssistantBubble(html, images) {
404
+ var wrap = document.createElement("div");
405
+ wrap.className = "flex justify-start";
406
+ var bubble = document.createElement("div");
407
+ bubble.className =
408
+ "max-w-[88%] rounded-2xl rounded-bl-md border border-stone-200 bg-stone-50 px-4 py-3 text-sm shadow-sm dark:border-stone-700 dark:bg-stone-800";
409
+ var textEl = document.createElement("div");
410
+ textEl.className = mdClasses;
411
+ textEl.innerHTML = html;
412
+ bubble.appendChild(textEl);
413
+ if (images && images.length > 0) {
414
+ var grid = document.createElement("div");
415
+ grid.className = "mt-3 grid gap-3 sm:grid-cols-2";
416
+ images.forEach(function (src, i) {
417
+ if (!src) return;
418
+ var img = document.createElement("img");
419
+ img.src = src;
420
+ img.alt = "Diagram " + (i + 1);
421
+ img.className =
422
+ "diagram-img max-h-80 w-full cursor-zoom-in rounded-xl border border-stone-200 bg-white object-contain shadow-sm transition hover:ring-2 hover:ring-indigo-300 dark:border-stone-600 dark:bg-stone-900 dark:hover:ring-indigo-500";
423
+ img.loading = "lazy";
424
+ img.title = "Click to enlarge";
425
+ grid.appendChild(img);
426
+ });
427
+ bubble.appendChild(grid);
428
+ }
429
+ wrap.appendChild(bubble);
430
+ chatMessages.appendChild(wrap);
431
+ scrollToBottom();
432
+ }
433
+
434
+ function appendErrorBubble(msg) {
435
+ var safe = String(msg).replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;");
436
+ var wrap = document.createElement("div");
437
+ wrap.className = "flex justify-start";
438
+ var bubble = document.createElement("div");
439
+ bubble.className =
440
+ "max-w-[88%] rounded-2xl rounded-bl-md border border-red-200 bg-red-50 px-4 py-3 text-sm text-red-800 shadow-sm dark:border-red-900 dark:bg-red-950/40 dark:text-red-300";
441
+ bubble.innerHTML = "<p>" + safe + "</p>";
442
+ wrap.appendChild(bubble);
443
+ chatMessages.appendChild(wrap);
444
+ scrollToBottom();
445
+ }
446
+
447
+ // ── Lightbox ───────────────────────────────────────────────────────────
448
+
449
+ chatMessages.addEventListener("click", function (e) {
450
+ if (e.target && e.target.classList && e.target.classList.contains("diagram-img")) {
451
+ lightboxImg.src = e.target.src;
452
+ lightboxImg.alt = e.target.alt || "Diagram";
453
+ lightbox.classList.remove("hidden");
454
+ lightbox.classList.add("flex");
455
+ }
456
+ });
457
+
458
+ function closeLightbox() {
459
+ lightbox.classList.add("hidden");
460
+ lightbox.classList.remove("flex");
461
+ lightboxImg.src = "";
462
+ }
463
+
464
+ lightboxClose.addEventListener("click", closeLightbox);
465
+ lightbox.addEventListener("click", function (e) {
466
+ if (e.target === lightbox || e.target === lightboxImg) closeLightbox();
467
+ });
468
+ document.addEventListener("keydown", function (e) {
469
+ if (e.key === "Escape" && !lightbox.classList.contains("hidden")) closeLightbox();
470
+ });
471
+
472
+ // ── Keyboard shortcut ──────────────────────────────────────────────────
473
+
474
+ chatInput.addEventListener("keydown", function (e) {
475
+ if (e.key === "Enter" && !e.shiftKey) {
476
+ e.preventDefault();
477
+ if (!sendBtn.disabled) chatForm.requestSubmit();
478
+ }
479
+ });
480
+
481
+ // ── Submit ─────────────────────────────────────────────────────────────
482
+
483
+ function formatDetail(detail) {
484
+ if (detail == null) return "";
485
+ if (typeof detail === "string") return detail;
486
+ if (Array.isArray(detail)) return detail.map(function (x) { return x && x.msg ? x.msg : JSON.stringify(x); }).join("; ");
487
+ return JSON.stringify(detail);
488
+ }
489
+
490
+ chatForm.addEventListener("submit", async function (e) {
491
+ e.preventDefault();
492
+ validationMsg.classList.add("hidden");
493
+
494
+ var text = chatInput.value.trim();
495
+ var file = !hasCompletedFirstTurn && pdfInput.files && pdfInput.files[0] ? pdfInput.files[0] : null;
496
+
497
+ if (!text && !file) {
498
+ validationMsg.textContent = "Enter a message and/or attach a PDF.";
499
+ validationMsg.classList.remove("hidden");
500
+ return;
501
+ }
502
+
503
+ appendUserBubble(text, !!file);
504
+ chatInput.value = "";
505
+ setLoading(true);
506
+
507
+ var fd = new FormData();
508
+ fd.append("session_id", getSessionId());
509
+ fd.append("user_input", text);
510
+ if (file) fd.append("file", file, file.name);
511
+
512
+ var parseMd = typeof marked.parse === "function" ? marked.parse.bind(marked) : marked;
513
+
514
+ try {
515
+ var response = await fetch(BACKEND_URL, { method: "POST", body: fd });
516
+ var rawBody = await response.text();
517
+ var data = null;
518
+ try { data = rawBody ? JSON.parse(rawBody) : null; } catch (_) {}
519
+
520
+ if (!response.ok) {
521
+ var msg = "Request failed (" + response.status + " " + response.statusText + ").";
522
+ if (data && data.detail !== undefined) msg += " " + formatDetail(data.detail);
523
+ else if (rawBody && !data) msg += " " + rawBody.slice(0, 200);
524
+ appendErrorBubble(msg);
525
+ return;
526
+ }
527
+
528
+ if (!data || typeof data !== "object") { appendErrorBubble("Unexpected response from server."); return; }
529
+
530
+ var md = data.text != null ? String(data.text) : "";
531
+ var images = Array.isArray(data.images) ? data.images : [];
532
+ if (!md.trim() && images.length > 0) md = "*The model returned diagrams without explanation text.*";
533
+ appendAssistantBubble(parseMd(md, { breaks: true }), images);
534
+
535
+ if (!hasCompletedFirstTurn) {
536
+ hasCompletedFirstTurn = true;
537
+ lockPdfZone(file ? file.name : "");
538
+ }
539
+ } catch (err) {
540
+ appendErrorBubble(err && err.message ? err.message : "Network error — check the backend URL and CORS.");
541
+ } finally {
542
+ setLoading(false);
543
+ chatInput.focus();
544
+ }
545
+ });
546
+
547
+ getSessionId();
548
+ </script>
549
+ </body>
550
+ </html>
requirements.txt ADDED
@@ -0,0 +1,9 @@
+ google-adk
+ google-genai==1.27.0
+ python-dotenv
+ requests
+ fastapi[standard]
+ pydantic
+ python-multipart
+ uvicorn
+ graphviz
research_explainer/__init__.py ADDED
@@ -0,0 +1 @@
+ # Research Explainer Agent Package
research_explainer/agent.py ADDED
@@ -0,0 +1,96 @@
+ """
+ Author: Rohan Mitra (rohanmitra8@gmail.com)
+ agent.py (c) 2025
+ Desc: The Research Explainer ADK agent
+ Created: 2025-09-05T18:00:00.000Z
+ Modified: 2026-06-20T06:53:56.136Z
+ """
+
+ from google.adk.agents import Agent
+ from .tools import generate_flowchart, generate_diagram, find_research_context
+ import dotenv
+
+ dotenv.load_dotenv()
+
+
+ BASE_PROMPT = """
+ You are a Research Paper Explainer agent. Your goal is to help users understand specific concepts from research papers by providing clear, detailed explanations and generating appropriate diagrams when needed.
+
+ ## Core Capabilities
+ - Analyze the uploaded PDF research paper
+ - Explain complex concepts in simple, understandable terms
+ - Generate flowcharts for visual learning alongside your explanations
+ - Find live external research context, related papers, and follow-up directions when useful
+ - Provide context-aware explanations based on the specific paper
+
+ ## Behavior and Style
+ - Be thorough but accessible in your explanations
+ - Break down complex concepts into digestible parts
+ - Use analogies and examples when helpful
+ - Always cite specific sections or pages from the paper when relevant
+ - Ask clarifying questions if the user's request is ambiguous
+
+ ## Workflow for Concept Explanation
+ 1. Read the uploaded PDF paper and understand the content. You must output the title of the paper and its main contributions in at most 3 lines.
+ 2. **Concept Explanation**: Provide a clear, structured explanation that includes:
+    - Definition of the concept
+    - How it works (step-by-step if applicable)
+    - Why it's important in the context of the paper
+    - Key mathematical formulas or technical details
+ 3. **Visual Learning**: When a visual would help, use the `generate_flowchart` tool to generate a flowchart. The tool requires a dict of all the nodes in the flowchart and their background colors, as well as a list of all the connections between the nodes.
+ 4. **External Research Context**: When the user asks where a concept leads, what uses it, related work, follow-up reading, future directions, or broader impact, use the `find_research_context` tool. Give it the concept, paper-specific context from the uploaded PDF, and the research domain. You can also volunteer this information if you think it would be helpful and relevant to the explanation.
+ 5. **Integration**: Include the flowchart or research context naturally in your response. It can be at any point in the explanation, not just the end.
+
+ ## Flowchart Integration
+ When you determine that a diagram would enhance understanding, make the `generate_flowchart` tool call and include the flowchart in your response.
+ You need to first give it a dictionary of all the nodes to be included in the flowchart, and their background colors in hexadecimal format (#000000 - #FFFFFF). Keep the names of the nodes simple and make sure the arrows show the relationship between the nodes. The relationships can be complex and don't have to form a linear chain.
+ You also need to give it a list of all the connections between the nodes, listing each connection as a [source, destination] pair. Make sure to use the same names for the nodes as in the dictionary.
+ The datatypes are as follows:
+ - nodes_and_colors: dict[str, str]
+ - edges: list[list[str]]
+ - title: str
+
+ A sample generate_flowchart call is given below:
+ ```
+ nodes_and_colors = {
+     'A': '#c1e5f5',
+     'B': '#ffb76e',
+     'C': '#c1e5f5',
+     'D': '#88cc99',
+     'E': '#ffffff',
+ }
+ edges = [
+     ['A', 'B'],  # Connects A to B
+     ['C', 'D'],  # Connects C to D
+     ['D', 'E'],  # Connects D to E
+     ['E', 'B'],  # Connects E to B
+ ]
+ ```
+
+ ## Research Context Integration
+ When the user wants to understand how a paper concept connects to the broader field, call `find_research_context`.
+ Use a precise `concept` and include a short `paper_context` string that captures the method, task, domain, and surrounding terminology from the uploaded paper.
+ After the tool returns results, explain how the external papers or directions relate back to the original paper. Do not overstate the connection if a result is only loosely related.
+
+ ## Response Structure
+ Your explanations should follow this structure:
+ 1. **Brief Overview**: What the concept is and why it matters
+ 2. **Detailed Explanation**: Step-by-step breakdown with technical details
+ 3. **Paper Context**: How this concept fits into the broader research
+ 4. **Visual Aid or Research Context**: Include a flowchart or live research context if helpful - this can be at any point in the explanation, not just the end.
+ 5. **Key Takeaways**: Summary of the most important points
+
+ ## Technical Guidelines
+ - Always read the PDF paper first before attempting to explain concepts
+ - Provide page numbers or section references when available
+ - If a concept isn't clearly explained in the paper, acknowledge this limitation
+ - For mathematical concepts, include the relevant formulas and explain their meaning
+
+ ## Error Handling
+ - If a concept isn't found in the paper, suggest related concepts that are discussed
+ - If the explanation becomes too technical, offer to simplify it further
+
+ Remember: Your goal is to make complex research accessible while maintaining accuracy and depth. Always ground your explanations in the specific paper being analyzed.
+ """
+
+
+
research_explainer/tools/__init__.py ADDED
@@ -0,0 +1,13 @@
+ """
+ Tool exports for the Research Explainer ADK agent.
+ """
+
+ from .diagram import generate_diagram
+ from .flowchart import generate_flowchart
+ from .research_context import find_research_context
+
+ __all__ = [
+     "find_research_context",
+     "generate_diagram",
+     "generate_flowchart",
+ ]
research_explainer/tools/diagram.py ADDED
@@ -0,0 +1,102 @@
+ """
+ AI-generated diagram tool.
+ """
+
+ import os
+
+ import dotenv
+ from google import genai
+ from google.adk.tools import ToolContext
+ from google.genai import types
+
+ dotenv.load_dotenv()
+
+ client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])
+
+
+ async def generate_diagram(
+     image_gen_prompt: str, concept_to_explain: str, tool_context: ToolContext
+ ) -> dict:
+     """
+     Generates a technical diagram for a research paper based on a detailed prompt describing the flow and design requirements.
+     Returns a dictionary with the status and filename or error detail.
+
+     Args:
+         image_gen_prompt (str): The detailed prompt describing the flow and design requirements.
+         concept_to_explain (str): The concept to explain.
+         tool_context (ToolContext): The context for the tool execution.
+
+     Returns:
+         dict: Contains 'status' ('success' or 'failed'), and either 'filename' or 'detail'.
+     """
+     print("Generate diagram tool called!")
+     try:
+         # Create a comprehensive prompt for diagram generation
+         enhanced_prompt = f"""
+         Create a technical, high-quality diagram for a research paper to explain the concept "{concept_to_explain}", based on the following specifications:
+
+         {image_gen_prompt}
+
+         Requirements:
+         - Clear, readable, and precise design
+         - High resolution and clean design
+         - Use of appropriate colors and typography
+         - Helps explain the concept to a student
+
+         Generate a diagram that represents the flow and design requirements.
+         """
+
+         content = types.Content(
+             role="user",
+             parts=[
+                 types.Part.from_text(text=enhanced_prompt),
+             ],
+         )
+
+         response = client.models.generate_content(
+             model="gemini-2.5-flash-image-preview",
+             contents=content,
+             config=types.GenerateContentConfig(
+                 temperature=0.8,
+                 top_p=0.95,
+                 max_output_tokens=8192,
+                 response_modalities=["TEXT", "IMAGE"],
+             ),
+         )
+
+         if not response or not getattr(response, "candidates", None):
+             return {"status": "failed", "detail": "No response or candidates from model."}
+
+         image_bytes_out = None
+         candidate = response.candidates[0] if response.candidates else None
+         content_out = getattr(candidate, "content", None) if candidate is not None else None
+
+         if content_out is not None:
+             for part in getattr(content_out, "parts", []):
+                 part_inline = getattr(part, "inline_data", None)
+                 part_data = (
+                     getattr(part_inline, "data", None)
+                     if part_inline is not None
+                     else None
+                 )
+                 if part_data:
+                     image_bytes_out = part_data
+                     break
+
+         if not image_bytes_out:
+             return {"status": "failed", "detail": "No image bytes found in model response."}
+
+         # Save the generated diagram
+         await tool_context.save_artifact(
+             "diagram.png",
+             types.Part.from_bytes(data=image_bytes_out, mime_type="image/png"),
+         )
+
+         return {
+             "status": "success",
+             "detail": "Diagram generated successfully and stored in artifacts.",
+             "filename": "diagram.png",
+         }
+
+     except Exception as e:
+         return {"status": "failed", "detail": f"Error generating diagram: {str(e)}"}
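Reviewer note: `generate_diagram` is declared `async`, while the model request inside it goes through the synchronous `client.models` interface, so the event loop waits for the whole request. One possible way to avoid that is to move the blocking call to a worker thread. The sketch below is only an illustration of that pattern using the standard library; `slow_generate`, `generate_without_blocking`, and `demo` are hypothetical names, and `slow_generate` merely stands in for the real SDK call.

```python
import asyncio
import time


def slow_generate(prompt: str) -> dict:
    """Hypothetical stand-in for a blocking SDK call (not the real API)."""
    time.sleep(0.05)  # simulate network latency
    return {"status": "success", "prompt": prompt}


async def generate_without_blocking(prompt: str) -> dict:
    # asyncio.to_thread runs the blocking function in a worker thread,
    # so the event loop can keep serving other coroutines in the meantime.
    return await asyncio.to_thread(slow_generate, prompt)


async def demo() -> tuple[dict, int]:
    ticks = 0

    async def heartbeat() -> None:
        # Counts how often the loop gets control while the request is pending.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.005)

    beat = asyncio.create_task(heartbeat())
    result = await generate_without_blocking("attention flow")
    beat.cancel()
    return result, ticks
```

Running `asyncio.run(demo())` returns the stand-in result together with a tick count greater than zero, which indicates the loop stayed responsive while the blocking call ran.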
research_explainer/tools/flowchart.py ADDED
@@ -0,0 +1,66 @@
+ """
+ Programmatic flowchart generation tool.
+ """
+
+ import graphviz
+ from google.adk.tools import ToolContext
+ from google.genai import types
+
+
+ async def generate_flowchart(
+     nodes_and_colors: dict[str, str],
+     edges: list[list[str]],
+     title: str,
+     tool_context: ToolContext,
+ ) -> dict:
+     """
+     Generates a flowchart for a research paper based on the nodes to be included, and the connections between them.
+     Returns a dictionary with the status and filename or error detail.
+
+     Args:
+         nodes_and_colors (dict[str, str]): dictionary of the nodes to be included in the flowchart, and their background colors in hex format. E.g.: {'node1': '#c1e5f5', 'node2': '#88cc99'}
+         edges (list[list[str]]): list of [source, destination] pairs naming the nodes to connect. E.g.: [['node1', 'node2'], ['node2', 'node3']] draws an arrow from node1 to node2, and from node2 to node3.
+         title (str): the title of the flowchart.
+         tool_context (ToolContext): The context for the tool execution.
+
+     Returns:
+         dict: Contains 'status' ('success' or 'failed'), and either 'filename' or 'detail'.
+     """
+
+     try:
+         # Initialize the Digraph with a specific engine and global attributes
+         dot = graphviz.Digraph(comment=title, engine="dot")
+         dot.attr(rankdir="TB", splines="ortho", pad="0.5", nodesep="0.5")
+         dot.attr("node", style="filled", fontname="Times-Roman", fontsize="16")
+         dot.attr("edge", fontname="Times-Roman", fontsize="16")
+
+         # Define a cluster for the main flow
+         with dot.subgraph(name="cluster_1") as c:
+             c.attr(rankdir="TB", splines="ortho")
+             c.attr("node", shape="box", fontname="Times-Roman", fontsize="16")
+
+             # Define the nodes with their specific colors and labels
+             for node, color in nodes_and_colors.items():
+                 c.node(node, node, fillcolor=color)
+
+             # Add the edges as unlabeled arrows from source to destination
+             for edge in edges:
+                 c.edge(edge[0], edge[1])
+
+         # Add the title at the top
+         dot.attr(label=title, labelloc="t", fontname="Times-Roman", fontsize="20")
+
+         png_bytes = dot.pipe(format="png")
+         # Save the generated flowchart
+         await tool_context.save_artifact(
+             "flowchart.png",
+             types.Part.from_bytes(data=png_bytes, mime_type="image/png"),
+         )
+         return {
+             "status": "success",
+             "detail": "Flowchart generated successfully and stored in artifacts.",
+             "filename": "flowchart.png",
+         }
+
+     except Exception as e:
+         return {"status": "failed", "detail": f"Error generating flowchart: {str(e)}"}
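Reviewer note: the agent prompt asks the model to reuse the node names from `nodes_and_colors` in `edges`, but nothing in `generate_flowchart` checks this. Graphviz does not raise an error for an edge that names an undeclared node; it creates that node with default attributes, so a typo shows up as an extra, unstyled box. A minimal sketch of a pre-check follows, assuming it would be called before the graph is built; `validate_flowchart_inputs` is a hypothetical helper, not part of the commit.

```python
def validate_flowchart_inputs(
    nodes_and_colors: dict[str, str],
    edges: list[list[str]],
) -> list[str]:
    """Return a list of problems; an empty list means the inputs look consistent."""
    problems: list[str] = []
    for index, edge in enumerate(edges):
        # Each edge must be a [source, destination] pair.
        if len(edge) != 2:
            problems.append(
                f"edge {index} must have exactly 2 endpoints, got {len(edge)}"
            )
            continue
        # Both endpoints must be declared in the node dictionary.
        for endpoint in edge:
            if endpoint not in nodes_and_colors:
                problems.append(f"edge {index} references unknown node {endpoint!r}")
    return problems


# Example: 'C' is used in an edge but never declared as a node.
problems = validate_flowchart_inputs(
    {"A": "#c1e5f5", "B": "#ffb76e"},
    [["A", "B"], ["B", "C"]],
)
print(problems)  # ["edge 1 references unknown node 'C'"]
```

Returning the problems in the tool's `detail` field with `status` set to `'failed'` would give the model a chance to correct the call, though whether that is preferable to rendering a partial chart is a design choice for the author.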
research_explainer/tools/research_context.py ADDED
@@ -0,0 +1,236 @@
1
+ """
2
+ Live research-context lookup tools.
3
+ """
4
+
5
+ import re
6
+ import xml.etree.ElementTree as ET
7
+
8
+ import requests
9
+
10
+ SEMANTIC_SCHOLAR_SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"
11
+ ARXIV_SEARCH_URL = "https://export.arxiv.org/api/query"
12
+ PAPER_FIELDS = "title,abstract,year,authors,url,citationCount,venue,fieldsOfStudy"
13
+
14
+
15
+ def _clean_text(value: str | None, max_length: int | None = None) -> str:
16
+ text = re.sub(r"\s+", " ", value or "").strip()
17
+ if max_length and len(text) > max_length:
18
+ return f"{text[: max_length - 3].rstrip()}..."
19
+ return text
20
+
21
+
22
+ def _build_research_query(concept: str, paper_context: str, domain: str) -> str:
23
+ query_parts = [
24
+ _clean_text(concept, 120),
25
+ _clean_text(paper_context, 220),
26
+ _clean_text(domain, 80),
27
+ ]
28
+ return " ".join(part for part in query_parts if part)
29
+
30
+
31
+ def _normalize_max_results(max_results: int) -> int:
32
+ return max(1, min(int(max_results or 5), 10))
33
+
34
+
35
+ def _semantic_scholar_papers(query: str, max_results: int) -> list[dict]:
36
+ response = requests.get(
37
+ SEMANTIC_SCHOLAR_SEARCH_URL,
38
+ params={
39
+ "query": query,
40
+ "limit": max_results,
41
+ "fields": PAPER_FIELDS,
42
+ },
43
+ timeout=10,
44
+ )
45
+ response.raise_for_status()
46
+ data = response.json()
47
+
48
+ papers: list[dict] = []
49
+ for paper in data.get("data", []):
50
+ title = _clean_text(paper.get("title"))
51
+ if not title:
52
+ continue
53
+
54
+ papers.append(
55
+ {
56
+ "title": title,
57
+ "year": paper.get("year"),
58
+ "authors": [
59
+ _clean_text(author.get("name"))
60
+ for author in paper.get("authors", [])[:5]
61
+ if author.get("name")
62
+ ],
63
+ "venue": _clean_text(paper.get("venue")),
64
+ "url": paper.get("url"),
65
+ "citation_count": paper.get("citationCount"),
66
+ "fields_of_study": paper.get("fieldsOfStudy") or [],
67
+ "abstract": _clean_text(paper.get("abstract"), 700),
68
+ "source": "Semantic Scholar",
69
+ }
70
+ )
71
+
72
+ return papers
73
+
74
+
+ def _arxiv_papers(query: str, max_results: int) -> list[dict]:
+     response = requests.get(
+         ARXIV_SEARCH_URL,
+         params={
+             "search_query": f"all:{query}",
+             "start": 0,
+             "max_results": max_results,
+             "sortBy": "relevance",
+             "sortOrder": "descending",
+         },
+         timeout=10,
+     )
+     response.raise_for_status()
+
+     root = ET.fromstring(response.text)
+     namespace = {"atom": "http://www.w3.org/2005/Atom"}
+     papers: list[dict] = []
+
+     for entry in root.findall("atom:entry", namespace):
+         title = _clean_text(
+             entry.findtext("atom:title", default="", namespaces=namespace)
+         )
+         if not title:
+             continue
+
+         authors = [
+             _clean_text(author.findtext("atom:name", default="", namespaces=namespace))
+             for author in entry.findall("atom:author", namespace)[:5]
+         ]
+         papers.append(
+             {
+                 "title": title,
+                 "year": (
+                     entry.findtext("atom:published", default="", namespaces=namespace)
+                     or ""
+                 )[:4],
+                 "authors": [author for author in authors if author],
+                 "venue": "arXiv",
+                 "url": entry.findtext("atom:id", default="", namespaces=namespace),
+                 "citation_count": None,
+                 "fields_of_study": [],
+                 "abstract": _clean_text(
+                     entry.findtext("atom:summary", default="", namespaces=namespace),
+                     700,
+                 ),
+                 "source": "arXiv",
+             }
+         )
+
+     return papers
+
+
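The namespace-qualified lookups used in `_arxiv_papers` can be tried offline. The sketch below runs the same `findall` / `findtext` calls against a hand-written Atom snippet; the feed contents (ID, title, author names) are placeholders, not a real API response.

```python
import xml.etree.ElementTree as ET

# Hand-written sample feed; every value in it is a placeholder.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <published>2020-01-15T00:00:00Z</published>
    <title>An Example
      Title</title>
    <summary>Placeholder abstract.</summary>
    <author><name>Ada Example</name></author>
    <author><name>Bob Sample</name></author>
  </entry>
</feed>
"""

namespace = {"atom": "http://www.w3.org/2005/Atom"}
root = ET.fromstring(SAMPLE_FEED)

entries = []
for entry in root.findall("atom:entry", namespace):
    raw_title = entry.findtext("atom:title", default="", namespaces=namespace)
    published = entry.findtext("atom:published", default="", namespaces=namespace)
    authors = [
        author.findtext("atom:name", default="", namespaces=namespace)
        for author in entry.findall("atom:author", namespace)
    ]
    entries.append(
        {
            # Titles in Atom feeds may wrap across lines, so collapse whitespace.
            "title": " ".join(raw_title.split()),
            # Slicing the ISO timestamp yields the year as a string.
            "year": published[:4],
            "authors": authors,
        }
    )

print(entries)
```

The year comes out as a string here (`"2020"`), whereas `_semantic_scholar_papers` passes through whatever the API returns for `year`, which is typically an integer. Code that sorts or compares results from both sources may want to normalize the type.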
+ def _suggest_research_directions(concept: str, papers: list[dict]) -> list[str]:
+     title_and_abstract = " ".join(
+         f"{paper.get('title', '')} {paper.get('abstract', '')}" for paper in papers
+     ).lower()
+     directions: list[str] = []
+
+     keyword_directions = [
+         (
+             ("efficient", "linear", "sparse", "compression"),
+             f"More efficient versions of {concept}",
+         ),
+         (
+             ("scaling", "large-scale", "foundation", "pretraining"),
+             f"Scaling {concept} to larger models or datasets",
+         ),
+         (
+             ("vision", "image", "multimodal", "video"),
+             f"Using {concept} in vision or multimodal systems",
+         ),
+         (
+             ("retrieval", "knowledge", "rag", "memory"),
+             f"Combining {concept} with retrieval or external knowledge",
+         ),
+         (
+             ("robust", "safety", "bias", "privacy"),
+             f"Studying robustness, safety, or privacy around {concept}",
+         ),
+     ]
+
+     for keywords, direction in keyword_directions:
+         if any(keyword in title_and_abstract for keyword in keywords):
+             directions.append(direction)
+
+     if not directions:
+         directions = [
+             f"Foundational papers that introduced or popularized {concept}",
+             f"Recent applications that adapt {concept} to new tasks",
+             f"Limitations and follow-up methods that improve on {concept}",
+         ]
+
+     return directions[:5]
+
+
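One caveat of the substring check in `_suggest_research_directions`: `keyword in title_and_abstract` also matches inside longer words, so `"rag"` fires on "storage" or "averaged". A possible alternative, sketched below on the assumption that whole-word matching is what is wanted, uses a regex with word boundaries. The helper name `has_keyword` and the sample sentence are hypothetical and not part of the commit.

```python
from __future__ import annotations

import re


def has_keyword(text: str, keywords: tuple[str, ...]) -> bool:
    """Return True if any keyword occurs as a whole word in text."""
    alternatives = "|".join(re.escape(keyword) for keyword in keywords)
    pattern = rf"\b(?:{alternatives})\b"
    return re.search(pattern, text.lower()) is not None


sample = "we study storage costs of averaged models"
keywords = ("retrieval", "knowledge", "rag", "memory")

# Substring test, as in the diff: "rag" is found inside "storage".
substring_hit = any(keyword in sample for keyword in keywords)
# Whole-word test: no keyword stands on its own in the sample.
word_hit = has_keyword(sample, keywords)

print(substring_hit, word_hit)  # True False
```

The trade-off is that stems stop matching their inflections: `"robust"` would no longer match "robustness", so the keyword tuples would need the variants listed explicitly.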
+ async def find_research_context(
+     concept: str,
+     paper_context: str,
+     domain: str = "machine learning",
+     max_results: int = 5,
+ ) -> dict:
+     """
+     Finds external research context for a concept discussed in the uploaded paper.
+
+     Use this when the user asks where a concept leads, what uses it, related work,
+     follow-up reading, or how the idea connects to broader research.
+
+     Args:
+         concept (str): The concept or method to investigate.
+         paper_context (str): Paper-specific context that makes the search precise.
+         domain (str): The broader research domain, such as machine learning.
+         max_results (int): Maximum number of papers to return, capped at 10.
+
+     Returns:
+         dict: Related papers, suggested directions, source metadata, or error details.
+     """
+     concept = _clean_text(concept, 120)
+     paper_context = _clean_text(paper_context, 500)
+     domain = _clean_text(domain, 80) or "machine learning"
+     max_results = _normalize_max_results(max_results)
+
+     if not concept:
+         return {
+             "status": "failed",
+             "detail": "Provide a non-empty concept to search for research context.",
+         }
+
+     query = _build_research_query(concept, paper_context, domain)
+     errors: list[str] = []
+
+     try:
+         papers = _semantic_scholar_papers(query, max_results)
+         source = "Semantic Scholar"
+     except Exception as exc:
+         papers = []
+         source = "arXiv"
+         errors.append(f"Semantic Scholar search failed: {exc}")
+
+     if not papers:
+         try:
+             papers = _arxiv_papers(query, max_results)
+             source = "arXiv"
+         except Exception as exc:
+             errors.append(f"arXiv search failed: {exc}")
+
+     if not papers:
+         return {
+             "status": "failed",
+             "query": query,
+             "detail": "No related papers found from Semantic Scholar or arXiv.",
+             "errors": errors,
+         }
+
+     return {
+         "status": "success",
+         "concept": concept,
+         "query": query,
+         "source": source,
+         "suggested_directions": _suggest_research_directions(concept, papers),
+         "papers": papers[:max_results],
+         "errors": errors,
+     }
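`find_research_context` is declared `async`, but the `requests.get` calls inside its helpers are blocking, so the event loop is stalled for the duration of each HTTP request (up to the 10 second timeout). A minimal sketch of one way around this, assuming Python 3.9+ for `asyncio.to_thread`: here `slow_search` is a hypothetical stand-in for `_semantic_scholar_papers` that sleeps instead of touching the network, and `search_without_blocking` is likewise an illustrative name.

```python
import asyncio
import time


def slow_search(query: str, max_results: int) -> list:
    # Stand-in for a blocking HTTP helper such as _semantic_scholar_papers.
    time.sleep(0.2)
    return [{"title": f"{query} result {i}"} for i in range(max_results)]


async def search_without_blocking(query: str, max_results: int) -> list:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(slow_search, query, max_results)


async def main() -> tuple:
    ticks = 0

    async def ticker() -> None:
        # Counts how often the loop gets to run while the search is pending.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.05)
            ticks += 1

    task = asyncio.create_task(ticker())
    papers = await search_without_blocking("attention", 2)
    task.cancel()
    return papers, ticks


papers, ticks = asyncio.run(main())
print(len(papers), ticks)
```

If the ticker advances while the search is in flight, other coroutines were able to run in the meantime. Calling `slow_search` directly inside the coroutine would leave `ticks` at zero.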