Danialebrat committed on
Commit
5f1963f
·
1 Parent(s): 82272c5

Adding members sections


Adding members filter and dashboard and analysis page
Modifying PDF report.

visualization/README.md CHANGED
@@ -121,6 +121,7 @@ The social media table and `DIM_CONTENT` share column names. Any `WHERE` clause
121
  - Lightweight query from `SOCIAL_MEDIA_DB.ML_FEATURES.HELPSCOUT_CONVERSATION_FEATURES`.
122
  - Columns: `conversation_id, status, source, created_at, updated_at, duration_hours, sentiment_polarity, topics, is_refund_request, is_cancellation, is_membership, customer_email`.
123
  - Merges demographics (age/timezone/experience) via email join (`LOWER(customer_email) = LOWER(usora_users.email)`).
 
124
  - Cached **24 hours**. Stored in `st.session_state['helpscout_df']`.
125
 
126
  #### `load_analysis_data(date_start, date_end, topics, sentiments, statuses, sources, is_refund, is_cancellation, is_membership)`
@@ -188,6 +189,7 @@ The app has **5 pages** navigated via the sidebar radio:
188
  **Reads from:** `st.session_state['helpscout_df']` (loaded at app startup).
189
 
190
  **Key sections:**
 
191
  - PDF export button (HelpScout Dashboard PDF)
192
  - 6 KPI metrics: total conversations, average duration, refund requests, cancellations, negative rate, membership joins
193
  - Sentiment distribution (pie + bar)
@@ -196,6 +198,7 @@ The app has **5 pages** navigated via the sidebar radio:
196
  - Status and source breakdown
197
  - Timelines expander (daily conversation volume, refund/cancel trend)
198
  - Depth expander (topic co-occurrence, escalation funnel)
 
199
  - Demographics (age, timezone, experience)
200
 
201
  > **Note:** Global sidebar filters (brand, platform, sentiment, date) do **not** apply to HelpScout pages - HelpScout is brand-agnostic and uses its own filter panel.
@@ -207,15 +210,16 @@ The app has **5 pages** navigated via the sidebar radio:
207
  **Receives:** `helpscout_loader` instance.
208
 
209
  **Flow:**
210
- 1. **Filter panel** - date range, top_n, topics (multi-select with human-readable labels), sentiments, statuses, sources, and 3 boolean checkboxes (refund / cancellation / membership).
211
- 2. **Fetch Data** button - calls `helpscout_loader.load_analysis_data(...)`, stale-checked via `fetch_key` tuple.
212
  3. **KPI row** + distribution charts (sentiment, topics, flags, status).
213
- 4. **AI Summary section:**
 
214
  - "Generate AI Summary" button → calls `HelpScoutSummaryAgent`, stores result in `st.session_state['hs_analysis_summary']`.
215
  - Renders: executive summary, top themes, top complaints, unexpected insights, notable quotes.
216
  - "Export Analysis PDF" button → generates `HelpScoutAnalysisPDF`.
217
- 5. **Paginated conversation cards** - 10 per page; each card shows customer name, status, topics (label-mapped), summary, sentiment/topic notes.
218
- 6. **CSV export** button.
219
 
220
  **Pagination:** `st.session_state['hs_analysis_page']`. Reset on new fetch.
221
 
@@ -251,7 +255,7 @@ st.session_state['global_filters'] = {
251
  | `sentiment_page` | SA page / fetch | SA pagination |
252
  | `reply_page` | RR page / fetch | RR pagination |
253
  | `content_summaries` | SA AI buttons | SA AI analysis display |
254
- | `helpscout_df` | `app.py` startup | helpscout_dashboard.py, dashboard.py compact summary |
255
  | `hs_analysis_df` | HS Analysis fetch | helpscout_analysis.py charts + cards |
256
  | `hs_analysis_fetch_key` | HS Analysis fetch | HS Analysis stale-check |
257
  | `hs_analysis_filter_desc` | HS Analysis fetch | human-readable filter string for PDF + agent |
@@ -319,13 +323,13 @@ Sections: cover, executive summary, sentiment, brand, platform, intent, cross-di
319
 
320
  ### HelpScout Dashboard PDF (`utils/helpscout_pdf.py` - `HelpScoutDashboardPDF`)
321
 
322
- Generated from the HelpScout Dashboard page. Sections: cover, KPI summary, sentiment, topics, flags & escalation, status & source, timelines, demographics.
323
 
324
  ### HelpScout Analysis PDF (`utils/helpscout_pdf.py` - `HelpScoutAnalysisPDF`)
325
 
326
  Generated from the "Export Analysis PDF" button on the HelpScout Analysis page (only available after an AI Summary has been generated).
327
 
328
- Sections: cover, filter summary, KPI summary, chart snapshots, AI summary (executive summary, top themes, top complaints, unexpected insights, notable quotes), conversation cards sample, metadata.
329
 
330
  **Dependencies:** `fpdf2`, `kaleido` (for Plotly PNG rendering at 3× scale).
331
 
@@ -375,6 +379,8 @@ Produces a **page-level** executive report from the filtered HelpScout conversat
375
  2. Include the new value in the `fetch_key` tuple.
376
  3. Add the corresponding `WHERE` clause condition to `_build_analysis_query()` in `helpscout_data_loader.py`.
377
 
 
 
378
  ### Add a new HelpScout topic
379
  - Edit `process_helpscout/config_files/topics.json` (the taxonomy file).
380
  - `helpscout_utils.load_topic_taxonomy()` reloads it on each app start; no other changes needed.
 
121
  - Lightweight query from `SOCIAL_MEDIA_DB.ML_FEATURES.HELPSCOUT_CONVERSATION_FEATURES`.
122
  - Columns: `conversation_id, status, source, created_at, updated_at, duration_hours, sentiment_polarity, topics, is_refund_request, is_cancellation, is_membership, customer_email`.
123
  - Merges demographics (age/timezone/experience) via email join (`LOWER(customer_email) = LOWER(usora_users.email)`).
124
+ - After the demographics merge, adds **`is_member`** boolean: `True` when the customer email matched a Musora user record, `False` otherwise.
125
  - Cached **24 hours**. Stored in `st.session_state['helpscout_df']`.
126
 
127
  #### `load_analysis_data(date_start, date_end, topics, sentiments, statuses, sources, is_refund, is_cancellation, is_membership)`
 
189
  **Reads from:** `st.session_state['helpscout_df']` (loaded at app startup).
190
 
191
  **Key sections:**
192
+ - **Member status filter** (radio at top): "All Customers / Members Only / Non-Members Only" - filters the entire dashboard view before any section renders
193
  - PDF export button (HelpScout Dashboard PDF)
194
  - 6 KPI metrics: total conversations, average duration, refund requests, cancellations, negative rate, membership joins
195
  - Sentiment distribution (pie + bar)
 
198
  - Status and source breakdown
199
  - Timelines expander (daily conversation volume, refund/cancel trend)
200
  - Depth expander (topic co-occurrence, escalation funnel)
201
+ - **Member vs Non-Member section**: KPI metrics (member count, non-member count, email match rate) + member share pie chart + sentiment by member status stacked bar + top topics by member status grouped bar
202
  - Demographics (age, timezone, experience)
203
 
204
  > **Note:** Global sidebar filters (brand, platform, sentiment, date) do **not** apply to HelpScout pages - HelpScout is brand-agnostic and uses its own filter panel.
 
210
  **Receives:** `helpscout_loader` instance.
211
 
212
  **Flow:**
213
+ 1. **Filter panel** - date range, top_n, topics (multi-select with human-readable labels), sentiments, statuses, sources, 3 boolean checkboxes (refund / cancellation / membership), and a **"Customer Type" selectbox** (All / Members Only / Non-Members Only).
214
+ 2. **Fetch Data** button - calls `helpscout_loader.load_analysis_data(...)`, stale-checked via `fetch_key` tuple. The Customer Type filter is **not** part of the Snowflake query - it is applied in Python after fetching, using the member email set derived from `st.session_state['helpscout_df']`.
215
  3. **KPI row** + distribution charts (sentiment, topics, flags, status).
216
+ 4. **Member vs Non-Member section** - always rendered when member data is available; shows share pie, sentiment stacked bar, and top-topics grouped bar split by member status.
217
+ 5. **AI Summary section:**
218
  - "Generate AI Summary" button → calls `HelpScoutSummaryAgent`, stores result in `st.session_state['hs_analysis_summary']`.
219
  - Renders: executive summary, top themes, top complaints, unexpected insights, notable quotes.
220
  - "Export Analysis PDF" button → generates `HelpScoutAnalysisPDF`.
221
+ 6. **Paginated conversation cards** - 10 per page; each card shows customer name, status, topics (label-mapped), summary, sentiment/topic notes.
222
+ 7. **CSV export** button.
223
 
224
  **Pagination:** `st.session_state['hs_analysis_page']`. Reset on new fetch.
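
The pagination rule described here (10 cards per page, page index reset on every new fetch) can be sketched without Streamlit. This is a minimal illustration, not the app's code: the plain `state` dict stands in for `st.session_state`, the helpers `page_slice` and `on_new_fetch` are hypothetical names, and the 1-based page index is an assumption.

```python
import math

PAGE_SIZE = 10  # the README states 10 conversation cards per page


def page_slice(items, page, page_size=PAGE_SIZE):
    """Return (items on the page, clamped page number, total pages); pages are 1-based."""
    total_pages = max(1, math.ceil(len(items) / page_size))
    page = min(max(1, page), total_pages)  # clamp out-of-range requests
    start = (page - 1) * page_size
    return items[start:start + page_size], page, total_pages


def on_new_fetch(state):
    """Reset pagination whenever a new fetch replaces the data."""
    state["hs_analysis_page"] = 1


state = {"hs_analysis_page": 3}  # plain dict standing in for st.session_state
cards = list(range(25))          # 25 hypothetical conversation cards
on_new_fetch(state)
visible, page, total_pages = page_slice(cards, state["hs_analysis_page"])
```

Clamping the page number keeps a stale index from pointing past the end of a smaller result set, which is the failure the reset on fetch is meant to avoid.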
225
 
 
255
  | `sentiment_page` | SA page / fetch | SA pagination |
256
  | `reply_page` | RR page / fetch | RR pagination |
257
  | `content_summaries` | SA AI buttons | SA AI analysis display |
258
+ | `helpscout_df` | `app.py` startup | helpscout_dashboard.py (includes `is_member`), dashboard.py compact summary, helpscout_analysis.py member filter |
259
  | `hs_analysis_df` | HS Analysis fetch | helpscout_analysis.py charts + cards |
260
  | `hs_analysis_fetch_key` | HS Analysis fetch | HS Analysis stale-check |
261
  | `hs_analysis_filter_desc` | HS Analysis fetch | human-readable filter string for PDF + agent |
 
323
 
324
  ### HelpScout Dashboard PDF (`utils/helpscout_pdf.py` - `HelpScoutDashboardPDF`)
325
 
326
+ Generated from the HelpScout Dashboard page. Sections: cover, KPI summary, sentiment, topics, flags & escalation, status & source, timelines, depth, **member vs non-member** (metrics + pie + sentiment bar + topic grouped bar), demographics.
327
 
328
  ### HelpScout Analysis PDF (`utils/helpscout_pdf.py` - `HelpScoutAnalysisPDF`)
329
 
330
  Generated from the "Export Analysis PDF" button on the HelpScout Analysis page (only available after an AI Summary has been generated).
331
 
332
+ Sections: cover, filter summary, KPI summary (including member/non-member counts when available), chart snapshots, **member vs non-member breakdown** (pie + sentiment bar + topic grouped bar), AI summary (executive summary, top themes, top complaints, unexpected insights, notable quotes), conversation cards sample, metadata.
333
 
334
  **Dependencies:** `fpdf2`, `kaleido` (for Plotly PNG rendering at 3× scale).
335
 
 
379
  2. Include the new value in the `fetch_key` tuple.
380
  3. Add the corresponding `WHERE` clause condition to `_build_analysis_query()` in `helpscout_data_loader.py`.
381
 
382
+ > **Python-side filters** (those whose data is not in the Snowflake HelpScout table) are applied after fetching rather than in SQL. The member/non-member filter is the canonical example: `is_member` is derived from `st.session_state['helpscout_df']` after the Snowflake fetch. Such filters should **not** be included in the `fetch_key` tuple.
383
+
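
The split between SQL-side and Python-side filters can be illustrated with a small sketch. It assumes a plain dict in place of `st.session_state`; `make_fetch_key`, `needs_refetch`, and `apply_member_filter` are hypothetical helper names, and the key contents are a reduced stand-in for the real `fetch_key` tuple.

```python
import pandas as pd


def make_fetch_key(date_start, date_end, topics, refund_only):
    """Only values that change the SQL query belong in the key (reduced example)."""
    return (date_start, date_end, tuple(sorted(topics)), refund_only)


def needs_refetch(state, fetch_key):
    """Stale-check: refetch only when the stored key differs from the new one."""
    return state.get("hs_analysis_fetch_key") != fetch_key


def apply_member_filter(df, member_status):
    """Python-side filter applied after the fetch; deliberately not part of the key."""
    if member_status == "Members Only":
        return df[df["is_member"]]
    if member_status == "Non-Members Only":
        return df[~df["is_member"]]
    return df


state = {}  # plain dict standing in for st.session_state
key = make_fetch_key("2024-01-01", "2024-01-31", ["billing"], False)
first = needs_refetch(state, key)    # nothing cached yet
state["hs_analysis_fetch_key"] = key

# Switching the customer type does not alter the key, so no new query is needed.
second = needs_refetch(state, make_fetch_key("2024-01-01", "2024-01-31", ["billing"], False))

fetched = pd.DataFrame({"conversation_id": [1, 2, 3], "is_member": [True, False, True]})
members = apply_member_filter(fetched, "Members Only")
```

Keeping the member filter out of the key means toggling the customer type re-slices the cached frame instead of triggering another Snowflake round trip.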
384
  ### Add a new HelpScout topic
385
  - Edit `process_helpscout/config_files/topics.json` (the taxonomy file).
386
  - `helpscout_utils.load_topic_taxonomy()` reloads it on each app start; no other changes needed.
visualization/components/helpscout_analysis.py CHANGED
@@ -108,13 +108,21 @@ def render_helpscout_analysis(data_loader):
108
  key="hs_analysis_sources",
109
  )
110
 
111
- row3_col1, row3_col2, row3_col3 = st.columns(3)
112
  with row3_col1:
113
  refund_only = st.checkbox("Refund Requests Only", key="hs_analysis_refund")
114
  with row3_col2:
115
  cancel_only = st.checkbox("Cancellations Only", key="hs_analysis_cancel")
116
  with row3_col3:
117
  membership_only = st.checkbox("Membership Joins Only", key="hs_analysis_membership")
118
 
119
  st.markdown("---")
120
 
@@ -170,6 +178,7 @@ def render_helpscout_analysis(data_loader):
170
  "refund_only": refund_only,
171
  "cancel_only": cancel_only,
172
  "membership_only": membership_only,
 
173
  }
174
  st.session_state["hs_analysis_df"] = result_df
175
  st.session_state["hs_analysis_fetch_key"] = fetch_key
@@ -183,9 +192,25 @@ def render_helpscout_analysis(data_loader):
183
  if not has_data and not fetch_clicked:
184
  return
185
 
186
- analysis_df = st.session_state.get("hs_analysis_df", pd.DataFrame())
187
  filter_desc = st.session_state.get("hs_analysis_filter_desc", "No filters applied")
188
 
189
  if analysis_df.empty:
190
  st.warning("No conversations found for the selected filters. Try adjusting and re-fetching.")
191
  return
@@ -238,6 +263,22 @@ def render_helpscout_analysis(data_loader):
238
  st.plotly_chart(charts.create_volume_timeline(analysis_df, title="Volume Over Time"),
239
  use_container_width=True, key="hs_analysis_vol_timeline2")
240
 
241
  st.markdown("---")
242
 
243
  # ── AI Summary Report ─────────────────────────────────────────────────────
 
108
  key="hs_analysis_sources",
109
  )
110
 
111
+ row3_col1, row3_col2, row3_col3, row3_col4 = st.columns(4)
112
  with row3_col1:
113
  refund_only = st.checkbox("Refund Requests Only", key="hs_analysis_refund")
114
  with row3_col2:
115
  cancel_only = st.checkbox("Cancellations Only", key="hs_analysis_cancel")
116
  with row3_col3:
117
  membership_only = st.checkbox("Membership Joins Only", key="hs_analysis_membership")
118
+ with row3_col4:
119
+ member_status_filter = st.selectbox(
120
+ "Customer Type",
121
+ options=["All", "Members Only", "Non-Members Only"],
122
+ index=0,
123
+ help="Members are customers whose email matches a Musora user account.",
124
+ key="hs_analysis_member_status",
125
+ )
126
 
127
  st.markdown("---")
128
 
 
178
  "refund_only": refund_only,
179
  "cancel_only": cancel_only,
180
  "membership_only": membership_only,
181
+ "member_status": member_status_filter,
182
  }
183
  st.session_state["hs_analysis_df"] = result_df
184
  st.session_state["hs_analysis_fetch_key"] = fetch_key
 
192
  if not has_data and not fetch_clicked:
193
  return
194
 
195
+ analysis_df = st.session_state.get("hs_analysis_df", pd.DataFrame()).copy()
196
  filter_desc = st.session_state.get("hs_analysis_filter_desc", "No filters applied")
197
 
198
+ # Derive is_member from dashboard df (always, so breakdown charts work on "All" too)
199
+ if "customer_email" in analysis_df.columns:
200
+ hs_dashboard = st.session_state.get("helpscout_df", pd.DataFrame())
201
+ if "is_member" in hs_dashboard.columns and not hs_dashboard.empty:
202
+ member_emails = set(
203
+ hs_dashboard[hs_dashboard["is_member"]]["customer_email"].str.lower().dropna()
204
+ )
205
+ analysis_df["is_member"] = analysis_df["customer_email"].str.lower().isin(member_emails)
206
+ # Apply filter when a specific group is selected
207
+ if member_status_filter == "Members Only":
208
+ analysis_df = analysis_df[analysis_df["is_member"]]
209
+ elif member_status_filter == "Non-Members Only":
210
+ analysis_df = analysis_df[~analysis_df["is_member"]]
211
+ elif member_status_filter != "All":
212
+ st.warning("Member data not available - customer emails could not be matched to Musora records.")
213
+
214
  if analysis_df.empty:
215
  st.warning("No conversations found for the selected filters. Try adjusting and re-fetching.")
216
  return
 
263
  st.plotly_chart(charts.create_volume_timeline(analysis_df, title="Volume Over Time"),
264
  use_container_width=True, key="hs_analysis_vol_timeline2")
265
 
266
+ # Member vs Non-Member breakdown (rendered whenever the is_member column is available)
267
+ if "is_member" in analysis_df.columns:
268
+ st.markdown("### 👤 Member vs Non-Member")
269
+ col1, col2 = st.columns(2)
270
+ with col1:
271
+ st.plotly_chart(charts.create_member_status_chart(analysis_df,
272
+ title="Member vs Non-Member"),
273
+ use_container_width=True, key="hs_analysis_member_pie")
274
+ with col2:
275
+ st.plotly_chart(charts.create_member_sentiment_chart(analysis_df,
276
+ title="Sentiment by Member Status"),
277
+ use_container_width=True, key="hs_analysis_member_sentiment")
278
+ st.plotly_chart(charts.create_member_topic_chart(analysis_df,
279
+ title="Top Topics by Member Status"),
280
+ use_container_width=True, key="hs_analysis_member_topics")
281
+
282
  st.markdown("---")
283
 
284
  # ── AI Summary Report ─────────────────────────────────────────────────────
visualization/components/helpscout_dashboard.py CHANGED
@@ -57,6 +57,27 @@ def render_helpscout_dashboard(data_loader, date_range=None):
57
  charts = HelpScoutCharts()
58
  taxonomy = load_topic_taxonomy()
59
 
60
  # ── PDF Export ────────────────────────────────────────────────────────────
61
  with st.expander("📄 Export PDF Report", expanded=False):
62
  st.markdown(
@@ -201,6 +222,37 @@ def render_helpscout_dashboard(data_loader, date_range=None):
201
  with col2:
202
  st.plotly_chart(charts.create_thread_count_histogram(hs_df), use_container_width=True)
203
 
204
  # ── Demographics ─────────────────────────────────────────────────────────
205
  has_demographics = (
206
  "age_group" in hs_df.columns
 
57
  charts = HelpScoutCharts()
58
  taxonomy = load_topic_taxonomy()
59
 
60
+ # ── Member Status Filter ───────────────────────────────────────────────────
61
+ has_member_data = "is_member" in hs_df.columns
62
+ if has_member_data:
63
+ member_filter = st.radio(
64
+ "Show conversations for:",
65
+ options=["All Customers", "Members Only", "Non-Members Only"],
66
+ horizontal=True,
67
+ key="hs_dash_member_filter",
68
+ )
69
+ if member_filter == "Members Only":
70
+ hs_df = hs_df[hs_df["is_member"]]
71
+ elif member_filter == "Non-Members Only":
72
+ hs_df = hs_df[~hs_df["is_member"]]
73
+ if member_filter != "All Customers" and hs_df.empty:
74
+ st.warning(f"No conversations found for {member_filter.lower().replace(' only', '')}.")
75
+ return
76
+ if member_filter != "All Customers":
77
+ st.info(f"Filtered to **{len(hs_df):,}** {member_filter.lower().replace(' only', '')} conversations.")
78
+ else:
79
+ st.info("ℹ️ Member data not available - customer emails could not be matched to Musora user records.")
80
+
81
  # ── PDF Export ────────────────────────────────────────────────────────────
82
  with st.expander("📄 Export PDF Report", expanded=False):
83
  st.markdown(
 
222
  with col2:
223
  st.plotly_chart(charts.create_thread_count_histogram(hs_df), use_container_width=True)
224
 
225
+ # ── Member vs Non-Member ─────────────────────────────────────────────────
226
+ if "is_member" in hs_df.columns:
227
+ st.markdown("---")
228
+ st.markdown("## 👤 Member vs Non-Member")
229
+ st.caption(
230
+ "Conversations are classified as **Member** when the customer email matches "
231
+ "a Musora user account, and **Non-Member** otherwise."
232
+ )
233
+
234
+ member_count = int(hs_df["is_member"].sum())
235
+ non_member_count = total - member_count
236
+ match_pct = member_count / total * 100 if total else 0
237
+
238
+ mv1, mv2, mv3 = st.columns(3)
239
+ mv1.metric("Members", f"{member_count:,}",
240
+ f"{match_pct:.1f}% of conversations" if total else None)
241
+ mv2.metric("Non-Members", f"{non_member_count:,}",
242
+ f"{100 - match_pct:.1f}% of conversations" if total else None)
243
+ mv3.metric("Email Match Rate", f"{match_pct:.1f}%")
244
+
245
+ mem_col1, mem_col2 = st.columns(2)
246
+ with mem_col1:
247
+ st.plotly_chart(charts.create_member_status_chart(hs_df),
248
+ use_container_width=True, key="hs_dash_member_pie")
249
+ with mem_col2:
250
+ st.plotly_chart(charts.create_member_sentiment_chart(hs_df),
251
+ use_container_width=True, key="hs_dash_member_sentiment")
252
+
253
+ st.plotly_chart(charts.create_member_topic_chart(hs_df),
254
+ use_container_width=True, key="hs_dash_member_topics")
255
+
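
The KPI arithmetic in the section above reduces to a few lines of pandas. The following is a sketch under stated assumptions: `member_kpis` and `group_label` are hypothetical helper names that restate the diff's expressions, and the sample frame is invented for illustration.

```python
import pandas as pd


def member_kpis(df):
    """Return (member_count, non_member_count, match_pct), guarding the empty case."""
    total = len(df)
    member_count = int(df["is_member"].sum())
    non_member_count = total - member_count
    match_pct = member_count / total * 100 if total else 0.0
    return member_count, non_member_count, match_pct


def group_label(member_filter):
    """Turn a radio option such as 'Members Only' into the label used in messages."""
    return member_filter.lower().replace(" only", "")


sample = pd.DataFrame({"is_member": [True, True, False, True]})
kpis = member_kpis(sample)
empty_kpis = member_kpis(pd.DataFrame({"is_member": pd.Series([], dtype=bool)}))
```

Note that the "Email Match Rate" metric and the member share are the same number here, since both divide the member count by the total number of conversations in view.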
256
  # ── Demographics ─────────────────────────────────────────────────────────
257
  has_demographics = (
258
  "age_group" in hs_df.columns
visualization/data/helpscout_data_loader.py CHANGED
@@ -297,11 +297,13 @@ class HelpScoutDataLoader:
297
  if demo_df.empty or "customer_email" not in df.columns:
298
  for col, val in [("age", None), ("age_group", "Unknown"),
299
  ("timezone", None), ("timezone_region", "Unknown"),
300
- ("experience_level", None), ("experience_group", "Unknown")]:
 
301
  df[col] = val
302
  return df
303
 
304
  if "customer_email" not in demo_df.columns:
 
305
  return df
306
 
307
  merge_cols = ["customer_email"]
@@ -311,6 +313,10 @@ class HelpScoutDataLoader:
311
 
312
  merged = df.merge(demo_df[merge_cols], on="customer_email", how="left")
313
 
 
 
 
 
314
  for col in ["age_group", "timezone_region", "experience_group"]:
315
  if col in merged.columns:
316
  merged[col] = merged[col].fillna("Unknown")
 
297
  if demo_df.empty or "customer_email" not in df.columns:
298
  for col, val in [("age", None), ("age_group", "Unknown"),
299
  ("timezone", None), ("timezone_region", "Unknown"),
300
+ ("experience_level", None), ("experience_group", "Unknown"),
301
+ ("is_member", False)]:
302
  df[col] = val
303
  return df
304
 
305
  if "customer_email" not in demo_df.columns:
306
+ df["is_member"] = False
307
  return df
308
 
309
  merge_cols = ["customer_email"]
 
313
 
314
  merged = df.merge(demo_df[merge_cols], on="customer_email", how="left")
315
 
316
+ # is_member: True when the customer email matched a Musora user record
317
+ member_emails = set(demo_df["customer_email"].str.lower().dropna())
318
+ merged["is_member"] = merged["customer_email"].str.lower().isin(member_emails)
319
+
320
  for col in ["age_group", "timezone_region", "experience_group"]:
321
  if col in merged.columns:
322
  merged[col] = merged[col].fillna("Unknown")
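
The `is_member` derivation in this hunk is a case-insensitive set-membership test on email addresses. A self-contained sketch follows; `add_is_member` is a hypothetical wrapper name, and the two sample frames are invented stand-ins for the conversation and user tables.

```python
import pandas as pd


def add_is_member(conversations, users):
    """Flag conversations whose email appears among the user emails (case-insensitive)."""
    out = conversations.copy()
    if users.empty or "customer_email" not in users.columns:
        out["is_member"] = False  # no user data to match against
        return out
    member_emails = set(users["customer_email"].str.lower().dropna())
    out["is_member"] = out["customer_email"].str.lower().isin(member_emails)
    return out


conversations = pd.DataFrame({
    "conversation_id": [1, 2, 3],
    "customer_email": ["Ann@Example.com", "bob@example.com", None],
})
users = pd.DataFrame({"customer_email": ["ann@example.com"]})
flagged = add_is_member(conversations, users)
```

Rows with a missing email end up as non-members, because a missing value never matches an entry in the lower-cased set.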
visualization/utils/helpscout_pdf.py CHANGED
@@ -89,6 +89,7 @@ class HelpScoutDashboardPDF:
89
  self._status_source_section(df)
90
  self._timelines_section(df)
91
  self._depth_section(df)
 
92
  self._data_summary(df, filter_info)
93
  return bytes(self.pdf.output())
94
  finally:
@@ -237,6 +238,33 @@ class HelpScoutDashboardPDF:
237
  thd = self.charts.create_thread_count_histogram(df)
238
  self._add_two_charts(dur, thd)
239
 
240
  def _data_summary(self, df, filter_info):
241
  self.pdf.add_page()
242
  self.pdf.section_header("Data Summary")
@@ -395,6 +423,14 @@ class HelpScoutAnalysisPDF:
395
  ("Cancellations", f"{flags['is_cancellation']:,}"),
396
  ("Membership Joins", f"{flags['is_membership']:,}"),
397
  ])
398
 
399
  def _distributions_section(self, df):
400
  self.pdf.add_page()
@@ -403,6 +439,17 @@ class HelpScoutAnalysisPDF:
403
  tbar = self.charts.create_topic_bar_chart(df, title="Topic Distribution")
404
  self._add_two_charts(pie, tbar)
405
  self._add_chart(self.charts.create_topic_sentiment_heatmap(df), img_h=500)
406
 
407
  def _summary_section(self, result: dict):
408
  self.pdf.add_page()
 
89
  self._status_source_section(df)
90
  self._timelines_section(df)
91
  self._depth_section(df)
92
+ self._member_section(df)
93
  self._data_summary(df, filter_info)
94
  return bytes(self.pdf.output())
95
  finally:
 
238
  thd = self.charts.create_thread_count_histogram(df)
239
  self._add_two_charts(dur, thd)
240
 
241
+ def _member_section(self, df):
242
+ if "is_member" not in df.columns:
243
+ return
244
+ self.pdf.add_page()
245
+ self.pdf.section_header("Member vs Non-Member Analysis")
246
+ total = len(df)
247
+ member_count = int(df["is_member"].sum())
248
+ non_member_count = total - member_count
249
+ match_pct = member_count / total * 100 if total else 0
250
+ self.pdf.metric_row([
251
+ ("Members", f"{member_count:,}"),
252
+ ("Non-Members", f"{non_member_count:,}"),
253
+ ("Email Match Rate", f"{match_pct:.1f}%"),
254
+ ])
255
+ self.pdf.body_text(
256
+ "Members are customers whose email was matched against Musora user records. "
257
+ "Non-Members contacted support without an associated Musora account."
258
+ )
259
+ self._add_two_charts(
260
+ self.charts.create_member_status_chart(df, title="Member vs Non-Member"),
261
+ self.charts.create_member_sentiment_chart(df, title="Sentiment by Member Status"),
262
+ )
263
+ self._add_chart(
264
+ self.charts.create_member_topic_chart(df, title="Top Topics by Member Status"),
265
+ img_h=500,
266
+ )
267
+
268
  def _data_summary(self, df, filter_info):
269
  self.pdf.add_page()
270
  self.pdf.section_header("Data Summary")
 
423
  ("Cancellations", f"{flags['is_cancellation']:,}"),
424
  ("Membership Joins", f"{flags['is_membership']:,}"),
425
  ])
426
+ if "is_member" in df.columns:
427
+ member_count = int(df["is_member"].sum())
428
+ non_member_count = total - member_count
429
+ self.pdf.metric_row([
430
+ ("Members", f"{member_count:,}"),
431
+ ("Non-Members", f"{non_member_count:,}"),
432
+ ("Email Match Rate", f"{member_count / total * 100:.1f}%" if total else "N/A"),
433
+ ])
434
 
435
  def _distributions_section(self, df):
436
  self.pdf.add_page()
 
439
  tbar = self.charts.create_topic_bar_chart(df, title="Topic Distribution")
440
  self._add_two_charts(pie, tbar)
441
  self._add_chart(self.charts.create_topic_sentiment_heatmap(df), img_h=500)
442
+ if "is_member" in df.columns:
443
+ self.pdf.add_page()
444
+ self.pdf.section_header("Member vs Non-Member Breakdown")
445
+ self._add_two_charts(
446
+ self.charts.create_member_status_chart(df, title="Member vs Non-Member"),
447
+ self.charts.create_member_sentiment_chart(df, title="Sentiment by Member Status"),
448
+ )
449
+ self._add_chart(
450
+ self.charts.create_member_topic_chart(df, title="Top Topics by Member Status"),
451
+ img_h=500,
452
+ )
453
 
454
  def _summary_section(self, result: dict):
455
  self.pdf.add_page()
visualization/utils/helpscout_utils.py CHANGED
@@ -104,4 +104,7 @@ def build_filter_description(filters: dict, taxonomy: dict) -> str:
104
  parts.append("Cancellations only")
105
  if filters.get("membership_only"):
106
  parts.append("Membership requests only")
 
 
 
107
  return "; ".join(parts) if parts else "No filters applied - showing all conversations"
 
104
  parts.append("Cancellations only")
105
  if filters.get("membership_only"):
106
  parts.append("Membership requests only")
107
+ member_status = filters.get("member_status", "All")
108
+ if member_status and member_status != "All":
109
+ parts.append(f"Customer type: {member_status}")
110
  return "; ".join(parts) if parts else "No filters applied - showing all conversations"
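
The effect of the new `member_status` branch on the filter description can be seen in a reduced sketch. `describe_filters` is a hypothetical stand-in that covers only the branches visible in this hunk; the real `build_filter_description` also takes a taxonomy and handles more filters.

```python
def describe_filters(filters):
    """Build a human-readable filter string from the branches shown in the hunk."""
    parts = []
    if filters.get("cancel_only"):
        parts.append("Cancellations only")
    if filters.get("membership_only"):
        parts.append("Membership requests only")
    member_status = filters.get("member_status", "All")
    if member_status and member_status != "All":
        parts.append(f"Customer type: {member_status}")
    return "; ".join(parts) if parts else "No filters applied - showing all conversations"


default_desc = describe_filters({"member_status": "All"})
combined_desc = describe_filters({"cancel_only": True, "member_status": "Members Only"})
```

Treating "All" like a missing key keeps the default description unchanged for callers that never pass `member_status`.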
visualization/visualizations/helpscout_charts.py CHANGED
@@ -415,6 +415,73 @@ class HelpScoutCharts:
415
  yaxis_title="Conversations", height=self.chart_height)
416
  return fig
417
 
418
  # ─────────────────────────────────────────────────────────────
419
  # Helpers
420
  # ─────────────────────────────────────────────────────────────
 
415
  yaxis_title="Conversations", height=self.chart_height)
416
  return fig
417
 
418
+ # ─────────────────────────────────────────────────────────────
419
+ # Member vs Non-Member charts
420
+ # ─────────────────────────────────────────────────────────────
421
+
422
+ def create_member_status_chart(self, df, title="Member vs Non-Member"):
423
+ """Pie chart: proportion of conversations from Musora members vs non-members."""
424
+ if "is_member" not in df.columns:
425
+ return self._empty_fig(title, "No member data available")
426
+ label_map = {True: "Member", False: "Non-Member"}
427
+ counts = df["is_member"].map(label_map).value_counts()
428
+ color_map = {"Member": "#1982C4", "Non-Member": "#FF6B35"}
429
+ colors = [color_map.get(l, "#CCCCCC") for l in counts.index]
430
+ fig = go.Figure(go.Pie(
431
+ labels=counts.index, values=counts.values,
432
+ marker=dict(colors=colors),
433
+ textinfo="label+percent",
434
+ hovertemplate="<b>%{label}</b><br>Count: %{value}<br>%{percent}<extra></extra>",
435
+ ))
436
+ fig.update_layout(title=title, height=self.chart_height,
437
+ legend=dict(orientation="v", yanchor="middle", y=0.5))
438
+ return fig
439
+
440
+ def create_member_sentiment_chart(self, df, title="Sentiment by Member Status"):
441
+ """Stacked bar: sentiment distribution split by member vs non-member."""
442
+ if "is_member" not in df.columns or "sentiment_polarity" not in df.columns:
443
+ return self._empty_fig(title, "No member/sentiment data available")
444
+ df_c = df.copy()
445
+ df_c["member_status"] = df_c["is_member"].map({True: "Member", False: "Non-Member"})
446
+ pivot = pd.crosstab(df_c["member_status"], df_c["sentiment_polarity"])
447
+ ordered_cols = [s for s in self.sentiment_order if s in pivot.columns]
448
+ pivot = pivot[ordered_cols] if ordered_cols else pivot
449
+ fig = go.Figure()
450
+ for s in (ordered_cols or pivot.columns.tolist()):
451
+ fig.add_trace(go.Bar(
452
+ name=s, x=pivot.index, y=pivot[s],
453
+ marker_color=self.sentiment_colors.get(s, "#CCCCCC"),
454
+ hovertemplate="<b>%{x}</b><br>%{y}<extra></extra>",
455
+ ))
456
+ fig.update_layout(title=title, barmode="stack", xaxis_title="Customer Type",
457
+ yaxis_title="Conversations", height=self.chart_height)
458
+ return fig
459
+
460
+ def create_member_topic_chart(self, df, title="Top Topics by Member Status"):
461
+ """Grouped bar: top-10 topics split by member vs non-member."""
462
+ if "is_member" not in df.columns:
463
+ return self._empty_fig(title, "No member data available")
464
+ exploded = explode_topics(df)
465
+ if exploded.empty:
466
+ return self._empty_fig(title, "No topic data")
467
+ exploded["member_status"] = exploded["is_member"].map({True: "Member", False: "Non-Member"})
468
+ top_topics = exploded["topic_id"].value_counts().head(10).index.tolist()
469
+ exploded = exploded[exploded["topic_id"].isin(top_topics)]
470
+ pivot = pd.crosstab(exploded["topic_id"], exploded["member_status"])
471
+ pivot.index = [topic_label(t, self.taxonomy) for t in pivot.index]
472
+ fig = go.Figure()
473
+ color_map = {"Member": "#1982C4", "Non-Member": "#FF6B35"}
474
+ for col in pivot.columns:
475
+ fig.add_trace(go.Bar(
476
+ name=col, y=pivot.index, x=pivot[col], orientation="h",
477
+ marker_color=color_map.get(col, "#CCCCCC"),
478
+ hovertemplate="<b>%{y}</b><br>%{x}<extra></extra>",
479
+ ))
480
+ fig.update_layout(title=title, barmode="group", xaxis_title="Conversations",
481
+ yaxis_title="Topic", height=self.chart_height + 80,
482
+ yaxis={"categoryorder": "total ascending"})
483
+ return fig
484
+
485
  # ─────────────────────────────────────────────────────────────
486
  # Helpers
487
  # ─────────────────────────────────────────────────────────────
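
The data preparation behind the stacked sentiment chart is a `pd.crosstab` with the columns put into a fixed order. The sketch below shows that step without Plotly. `sentiment_by_member_status` is a hypothetical helper name, and `SENTIMENT_ORDER` is an assumed value, since the contents of `self.sentiment_order` are not shown in the diff.

```python
import pandas as pd

SENTIMENT_ORDER = ["negative", "neutral", "positive"]  # assumed; the real list is not in the diff


def sentiment_by_member_status(df):
    """Count conversations per (member status, sentiment), with columns in a fixed order."""
    labelled = df.assign(
        member_status=df["is_member"].map({True: "Member", False: "Non-Member"})
    )
    pivot = pd.crosstab(labelled["member_status"], labelled["sentiment_polarity"])
    ordered = [s for s in SENTIMENT_ORDER if s in pivot.columns]
    return pivot[ordered] if ordered else pivot


sample = pd.DataFrame({
    "is_member": [True, True, False, False, True],
    "sentiment_polarity": ["positive", "negative", "negative", "neutral", "positive"],
})
pivot = sentiment_by_member_status(sample)
```

Each column of the resulting table would become one bar trace, so a combination with no conversations shows up as a zero-height segment instead of a missing one.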