decodingdatascience committed on
Commit
6afa3d0
·
verified ·
1 Parent(s): 8ff3856

Create app2.py

Files changed (1)
  1. app2.py +506 -0
app2.py ADDED
@@ -0,0 +1,506 @@
+ # If needed in Colab, install first:
+ # !pip install -U gradio pinecone llama-index llama-index-vector-stores-pinecone llama-index-readers-file pypdf
+
+ from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext, Settings
+
+ # --- Imports ---
+ import logging
+ import sys
+ import gradio as gr
+ import os
+
+ from pinecone import Pinecone, ServerlessSpec
+ from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext, Settings
+ from llama_index.vector_stores.pinecone import PineconeVectorStore
+ from llama_index.readers.file import PDFReader
+ from llama_index.llms.openai import OpenAI
+ from llama_index.embeddings.openai import OpenAIEmbedding
+
+
+ # --- Logging ---
+ logging.basicConfig(stream=sys.stdout, level=logging.INFO)
+
+
+ Settings.llm = OpenAI(model="gpt-4o-mini", temperature=0.2)
+ Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
+ Settings.chunk_size = 600
+ Settings.chunk_overlap = 200
+
+
+ # Define a system prompt
+ system_prompt = '''
+ You are AYesha, the Decoding Data Science (DDS) Enterprise HR Chatbot. Answer questions exclusively using the attached DDS HR Handbook. Base all responses on the most up-to-date information available in the handbook. Only respond to queries directly related to DDS HR policies as outlined in the handbook.
+
+ - If a question pertains to topics outside DDS HR policies, respond politely, clarifying that you are a human resources bot and only answer DDS HR questions.
+ - For questions you cannot answer (e.g., requests for old policies, salary details, or confidential information), politely decline and direct the user to email connect@decodingdatascience.com.
+ - Never answer questions about anything outside of your scope.
+ - Persist in following these constraints for any follow-up questions.
+ - Before answering, carefully check that the information and query are within the allowed scope. Follow chain-of-thought reasoning:
+ 1. First, reason step-by-step whether the question is covered in the current handbook and is within HR.
+ 2. Only after confirming, produce a final answer.
+
+ Format answers as concise, professional responses. Do not wrap answers in code blocks or any special formatting.
+
+ Output requirements:
+ - For allowed HR questions, answer concisely based only on the latest DDS HR handbook information.
+ - For forbidden topics, output: “I’m sorry, I can only answer questions about the latest DDS HR policies. For confidential or other queries, please email connect@decodingdatascience.com.”
+
+
+ **Example 1**
+ User: What is the leave encashment policy at DDS?
+ Reasoning: This is an HR policy question found in the latest handbook.
+ Final Answer: [Provide answer summarized from the latest handbook’s section on leave encashment]
+
+ **Example 2**
+ User: Can you tell me the salary range for Data Scientists?
+ Reasoning: Salary details are confidential and not shared by this bot.
+ Final Answer: I’m sorry, I can only answer questions about the latest DDS HR policies. For confidential or other queries, please email connect@decodingdatascience.com.
+
+ **Example 3**
+ User: Can you explain what DDS does as a company overall?
+ Reasoning: This is not an HR question, so it cannot be answered.
+ Final Answer: I’m sorry, I only answer DDS HR policy questions as outlined in the handbook.
+
+ (Real-world examples should be longer and use precise wording from the handbook where appropriate.)
+
+ **Important instructions:**
+ - Only answer questions directly supported by the latest DDS HR handbook.
+ - Decline politely and redirect to the provided email address for any questions outside scope or for confidential information.
+ - Always reason before concluding. Only present the answer after checking scope and source.
+
+ Remember: As AYesha, the DDS HR Enterprise Chatbot, you must never provide information outside authorized HR handbook content and always respond respectfully according to these constraints.
+
+ '''
+
+
+ # --- Load API keys from the Hugging Face environment ---
+ OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
+ PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
+
+
+ # --- Initialize Pinecone ---
+ pc = Pinecone(api_key=PINECONE_API_KEY)
+ index_name = "quickstart"
+ dimension = 1536
+
+
+ # --- Delete index if it already exists (optional) ---
+ existing_indexes = [idx["name"] for idx in pc.list_indexes()]
+
+ if index_name in existing_indexes:
+     pc.delete_index(index_name)
+
+
+ # --- Create Pinecone index ---
+ pc.create_index(
+     name=index_name,
+     dimension=dimension,
+     metric="euclidean",
+     spec=ServerlessSpec(cloud="aws", region="us-east-1"),
+ )
+
+ pinecone_index = pc.Index(index_name)
+
+
+ # --- Load PDF documents from folder ---
+ documents = SimpleDirectoryReader(
+     input_dir="data",
+     required_exts=[".pdf"],
+     file_extractor={".pdf": PDFReader()}
+ ).load_data()
+
+ if not documents:
+     raise ValueError("No PDF documents were loaded from the 'data' folder.")
+
+
+ # --- Create Vector Index ---
+ vector_store = PineconeVectorStore(pinecone_index=pinecone_index)
+ storage_context = StorageContext.from_defaults(vector_store=vector_store)
+
+ index = VectorStoreIndex.from_documents(
+     documents,
+     storage_context=storage_context
+ )
+
+
+ # --- Query Engine ---
+ query_engine = index.as_query_engine(system_prompt=system_prompt)
+
+
+ # --- Gradio App ---
+ def query_doc(prompt):
+     try:
+         response = query_engine.query(prompt)
+         return str(response)
+     except Exception as e:
+         return f"Error: {str(e)}"
+
+
+ # -------------------------------------------------------------------
+ # Professional Gradio UI
+ # Only the Gradio interface is updated below.
+ # No RAG logic, LLM, embeddings, Pinecone, PDF loading, or prompt rules changed.
+ # -------------------------------------------------------------------
+
+ CUSTOM_CSS = """
+ .gradio-container {
+     max-width: 1180px !important;
+     margin: 0 auto !important;
+ }
+
+ .dds-hero {
+     border: 1px solid var(--border-color-primary);
+     background: var(--block-background-fill);
+     border-radius: 22px;
+     padding: 28px;
+     margin-bottom: 18px;
+ }
+
+ .dds-title {
+     font-size: 2rem;
+     font-weight: 750;
+     letter-spacing: -0.02em;
+     margin-bottom: 8px;
+ }
+
+ .dds-subtitle {
+     font-size: 1rem;
+     color: var(--body-text-color-subdued);
+     max-width: 880px;
+     line-height: 1.6;
+ }
+
+ .dds-badges {
+     display: flex;
+     flex-wrap: wrap;
+     gap: 8px;
+     margin-top: 18px;
+ }
+
+ .dds-badge {
+     border: 1px solid var(--border-color-primary);
+     background: var(--background-fill-secondary);
+     color: var(--body-text-color);
+     border-radius: 999px;
+     padding: 7px 12px;
+     font-size: 0.86rem;
+ }
+
+ .dds-card {
+     border: 1px solid var(--border-color-primary);
+     background: var(--block-background-fill);
+     border-radius: 18px;
+     padding: 18px;
+     margin-bottom: 12px;
+ }
+
+ .dds-muted {
+     color: var(--body-text-color-subdued);
+     font-size: 0.92rem;
+     line-height: 1.55;
+ }
+
+ .dds-small-heading {
+     font-size: 1rem;
+     font-weight: 700;
+     margin-bottom: 8px;
+ }
+
+ .dds-footer {
+     text-align: center;
+     color: var(--body-text-color-subdued);
+     font-size: 0.86rem;
+     margin-top: 16px;
+ }
+
+ textarea {
+     border-radius: 14px !important;
+ }
+
+ button {
+     border-radius: 12px !important;
+ }
+ """
+
+
+ example_questions = [
+     "What is the leave policy at DDS?",
+     "How can I apply for annual leave?",
+     "What should I do if I have an HR-related concern?",
+     "Can you explain the employee code of conduct?",
+     "What is the process for reporting a workplace issue?"
+ ]
+
+
+ def respond(message, history):
+     """
+     UI wrapper only.
+     Calls the existing query_doc() function without changing backend logic.
+     """
+     if history is None:
+         history = []
+
+     message = (message or "").strip()
+
+     if not message:
+         return history, ""
+
+     answer = query_doc(message)
+
+     history = history + [
+         {"role": "user", "content": message},
+         {"role": "assistant", "content": answer}
+     ]
+
+     return history, ""
+
+
+ theme = gr.themes.Default(
+     primary_hue="slate",
+     secondary_hue="gray",
+     neutral_hue="gray",
+     spacing_size="md",
+     radius_size="lg",
+     text_size="md"
+ )
+
+
+ with gr.Blocks(
+     theme=theme,
+     css=CUSTOM_CSS,
+     title="DDS HR Enterprise Chatbot",
+     fill_width=True
+ ) as demo:
+
+     gr.HTML("""
+     <div class="dds-hero">
+         <div class="dds-title">DDS HR Enterprise Chatbot</div>
+         <div class="dds-subtitle">
+             A professional HR assistant for Decoding Data Science employees.
+             Ask questions related to DDS HR policies, employee guidelines, workplace processes,
+             and handbook-supported information.
+         </div>
+         <div class="dds-badges">
+             <span class="dds-badge">DDS HR Handbook</span>
+             <span class="dds-badge">Policy-Grounded Answers</span>
+             <span class="dds-badge">Professional HR Support</span>
+             <span class="dds-badge">Light & Dark Mode Friendly</span>
+         </div>
+     </div>
+     """)
+
+     with gr.Tabs():
+
+         # -------------------------------------------------------------
+         # Layout 1: Conversational Chat Layout
+         # -------------------------------------------------------------
+         with gr.Tab("HR Chat Assistant"):
+
+             with gr.Row():
+                 with gr.Column(scale=3):
+
+                     chatbot = gr.Chatbot(
+                         label="DDS HR Assistant",
+                         height=520,
+                         layout="panel",
+                         placeholder=(
+                             "Ask a DDS HR policy question to get started. "
+                             "Example: What is the leave policy at DDS?"
+                         ),
+                         render_markdown=True,
+                         sanitize_html=True,
+                         buttons=["copy", "copy_all"]
+                     )
+
+                     user_message = gr.Textbox(
+                         label="Your HR Question",
+                         placeholder="Type your DDS HR policy question here...",
+                         lines=3
+                     )
+
+                     with gr.Row():
+                         submit_btn = gr.Button("Ask DDS HR", variant="primary")
+                         clear_btn = gr.ClearButton(
+                             components=[chatbot, user_message],
+                             value="Clear Chat"
+                         )
+
+                 with gr.Column(scale=1):
+
+                     gr.HTML("""
+                     <div class="dds-card">
+                         <div class="dds-small-heading">How to use this assistant</div>
+                         <div class="dds-muted">
+                             Ask questions that are directly related to DDS HR policies.
+                             The assistant answers using the approved HR handbook context.
+                             For confidential or unsupported topics, it redirects users to the DDS HR contact email.
+                         </div>
+                     </div>
+                     """)
+
+                     gr.Markdown("### Quick HR Questions")
+
+                     for question in example_questions:
+                         example_btn = gr.Button(question)
+                         example_btn.click(
+                             fn=lambda q=question: q,
+                             inputs=[],
+                             outputs=user_message,
+                             queue=False
+                         )
+
+             submit_btn.click(
+                 fn=respond,
+                 inputs=[user_message, chatbot],
+                 outputs=[chatbot, user_message],
+                 show_progress="minimal"
+             )
+
+             user_message.submit(
+                 fn=respond,
+                 inputs=[user_message, chatbot],
+                 outputs=[chatbot, user_message],
+                 show_progress="minimal"
+             )
+
+         # -------------------------------------------------------------
+         # Layout 2: Classic Single Q&A Layout
+         # -------------------------------------------------------------
+         with gr.Tab("Classic Q&A View"):
+
+             with gr.Row():
+
+                 with gr.Column(scale=1):
+
+                     gr.HTML("""
+                     <div class="dds-card">
+                         <div class="dds-small-heading">Single Question Mode</div>
+                         <div class="dds-muted">
+                             Use this layout when you want a simple question-and-answer experience.
+                             This mode calls the same query_doc() backend function used in the original app.
+                         </div>
+                     </div>
+                     """)
+
+                     single_question = gr.Textbox(
+                         label="Ask a DDS HR Question",
+                         placeholder="Example: What is the leave policy at DDS?",
+                         lines=5
+                     )
+
+                     with gr.Row():
+                         single_submit = gr.Button("Get Answer", variant="primary")
+                         single_clear = gr.ClearButton(
+                             components=[single_question],
+                             value="Clear Question"
+                         )
+
+                     gr.Examples(
+                         examples=[[q] for q in example_questions],
+                         inputs=single_question,
+                         label="Try sample HR questions"
+                     )
+
+                 with gr.Column(scale=1):
+
+                     single_answer = gr.Textbox(
+                         label="DDS HR Answer",
+                         lines=16,
+                         buttons=["copy"]
+                     )
+
+             single_submit.click(
+                 fn=query_doc,
+                 inputs=single_question,
+                 outputs=single_answer,
+                 show_progress="minimal"
+             )
+
+             single_question.submit(
+                 fn=query_doc,
+                 inputs=single_question,
+                 outputs=single_answer,
+                 show_progress="minimal"
+             )
+
+         # -------------------------------------------------------------
+         # FAQ and Scope Section
+         # -------------------------------------------------------------
+         with gr.Tab("FAQs & Scope"):
+
+             gr.Markdown("## Frequently Asked Questions")
+
+             with gr.Accordion("What can this chatbot answer?", open=True):
+                 gr.Markdown(
+                     """
+                     This chatbot can answer questions related to DDS HR policies and handbook-supported employee information.
+
+                     Examples include:
+                     - Leave policy
+                     - Workplace conduct
+                     - HR procedures
+                     - Employee policy guidance
+                     - Handbook-supported HR processes
+                     """
+                 )
+
+             with gr.Accordion("Can it answer salary or confidential employee questions?", open=False):
+                 gr.Markdown(
+                     """
+                     No. Salary details, confidential employee records, private HR decisions,
+                     and unsupported internal information should not be answered by this assistant.
+
+                     For such questions, employees should email:
+
+                     **connect@decodingdatascience.com**
+                     """
+                 )
+
+             with gr.Accordion("Can it answer general questions about DDS as a company?", open=False):
+                 gr.Markdown(
+                     """
+                     No. This assistant is scoped specifically to DDS HR policy questions.
+
+                     General business, marketing, training, sales, or company overview questions are outside its HR scope.
+                     """
+                 )
+
+             with gr.Accordion("What happens if the answer is not in the handbook?", open=False):
+                 gr.Markdown(
+                     """
+                     The assistant should clearly say that it cannot answer based on the available HR handbook content
+                     and redirect the user to:
+
+                     **connect@decodingdatascience.com**
+                     """
+                 )
+
+             with gr.Accordion("Is this suitable for light mode and dark mode?", open=False):
+                 gr.Markdown(
+                     """
+                     Yes. The interface uses Gradio theme variables and neutral styling so it works cleanly
+                     with both light and dark display modes.
+                     """
+                 )
+
+             gr.HTML("""
+             <div class="dds-card">
+                 <div class="dds-small-heading">Responsible Use Notice</div>
+                 <div class="dds-muted">
+                     This chatbot is designed to support HR policy understanding.
+                     It should not be used as a replacement for official HR decisions,
+                     confidential HR discussions, legal advice, or direct communication with DDS HR.
+                 </div>
+             </div>
+             """)
+
+     gr.HTML("""
+     <div class="dds-footer">
+         DDS HR Enterprise Chatbot · Powered by LlamaIndex, Pinecone, OpenAI, and Gradio
+     </div>
+     """)
+
+
+ # For Hugging Face Spaces, demo.launch() is enough.
+ # If you test locally or in Colab and want a public temporary link, you can use: demo.launch(share=True)
+ demo.launch()
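
The `respond()` wrapper added in app2.py can be exercised without OpenAI, Pinecone, or a running Gradio server by swapping in a stub backend. The sketch below is a minimal illustration, not part of the commit: `make_respond` and `fake_query_doc` are hypothetical names introduced here, and the stub stands in for `query_doc()`. It keeps the same behaviour as the committed wrapper (strip the input, skip blank messages, append a user/assistant pair in the role/content message format, and return an empty string to clear the textbox).

```python
from typing import Callable, Dict, List, Optional, Tuple

Message = Dict[str, str]


def make_respond(answer_fn: Callable[[str], str]):
    """Build a chat handler around any question -> answer function (hypothetical helper)."""

    def respond(
        message: Optional[str], history: Optional[List[Message]]
    ) -> Tuple[List[Message], str]:
        history = history or []
        message = (message or "").strip()
        if not message:
            # Blank input: leave the transcript untouched and clear the textbox.
            return history, ""
        answer = answer_fn(message)
        # Build a new list instead of mutating the caller's history in place.
        updated = history + [
            {"role": "user", "content": message},
            {"role": "assistant", "content": answer},
        ]
        return updated, ""

    return respond


def fake_query_doc(prompt: str) -> str:
    # Hypothetical stand-in for query_doc(); no LLM or vector-store call is made.
    return f"(stub answer for: {prompt})"


respond = make_respond(fake_query_doc)
history, textbox = respond("  What is the leave policy at DDS?  ", None)
history, textbox = respond("", history)  # blank follow-up is ignored
```

Passing the backend in as a parameter is only one possible design; it lets the chat-history logic be checked in isolation before wiring it to the real query engine.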