Commit ·
5f1963f
1
Parent(s): 82272c5
Adding members sections
Adding members filter and dashboard and analysis page
Modifying PDF report.
- visualization/README.md +14 -8
- visualization/components/helpscout_analysis.py +43 -2
- visualization/components/helpscout_dashboard.py +52 -0
- visualization/data/helpscout_data_loader.py +7 -1
- visualization/utils/helpscout_pdf.py +47 -0
- visualization/utils/helpscout_utils.py +3 -0
- visualization/visualizations/helpscout_charts.py +67 -0
visualization/README.md
CHANGED
|
@@ -121,6 +121,7 @@ The social media table and `DIM_CONTENT` share column names. Any `WHERE` clause
|
|
| 121 |
- Lightweight query from `SOCIAL_MEDIA_DB.ML_FEATURES.HELPSCOUT_CONVERSATION_FEATURES`.
|
| 122 |
- Columns: `conversation_id, status, source, created_at, updated_at, duration_hours, sentiment_polarity, topics, is_refund_request, is_cancellation, is_membership, customer_email`.
|
| 123 |
- Merges demographics (age/timezone/experience) via email join (`LOWER(customer_email) = LOWER(usora_users.email)`).
|
|
|
|
| 124 |
- Cached **24 hours**. Stored in `st.session_state['helpscout_df']`.
|
| 125 |
|
| 126 |
#### `load_analysis_data(date_start, date_end, topics, sentiments, statuses, sources, is_refund, is_cancellation, is_membership)`
|
|
@@ -188,6 +189,7 @@ The app has **5 pages** navigated via the sidebar radio:
|
|
| 188 |
**Reads from:** `st.session_state['helpscout_df']` (loaded at app startup).
|
| 189 |
|
| 190 |
**Key sections:**
|
|
|
|
| 191 |
- PDF export button (HelpScout Dashboard PDF)
|
| 192 |
- 6 KPI metrics: total conversations, average duration, refund requests, cancellations, negative rate, membership joins
|
| 193 |
- Sentiment distribution (pie + bar)
|
|
@@ -196,6 +198,7 @@ The app has **5 pages** navigated via the sidebar radio:
|
|
| 196 |
- Status and source breakdown
|
| 197 |
- Timelines expander (daily conversation volume, refund/cancel trend)
|
| 198 |
- Depth expander (topic co-occurrence, escalation funnel)
|
|
|
|
| 199 |
- Demographics (age, timezone, experience)
|
| 200 |
|
| 201 |
> **Note:** Global sidebar filters (brand, platform, sentiment, date) do **not** apply to HelpScout pages; HelpScout is brand-agnostic and uses its own filter panel.
|
|
@@ -207,15 +210,16 @@ The app has **5 pages** navigated via the sidebar radio:
|
|
| 207 |
**Receives:** `helpscout_loader` instance.
|
| 208 |
|
| 209 |
**Flow:**
|
| 210 |
-
1. **Filter panel**: date range, top_n, topics (multi-select with human-readable labels), sentiments, statuses, sources,
|
| 211 |
-
2. **Fetch Data** button: calls `helpscout_loader.load_analysis_data(...)`, stale-checked via `fetch_key` tuple.
|
| 212 |
3. **KPI row** + distribution charts (sentiment, topics, flags, status).
|
| 213 |
-
4. **
|
|
|
|
| 214 |
- "Generate AI Summary" button: calls `HelpScoutSummaryAgent`, stores result in `st.session_state['hs_analysis_summary']`.
|
| 215 |
- Renders: executive summary, top themes, top complaints, unexpected insights, notable quotes.
|
| 216 |
- "Export Analysis PDF" button: generates `HelpScoutAnalysisPDF`.
|
| 217 |
-
|
| 218 |
-
|
| 219 |
|
| 220 |
**Pagination:** `st.session_state['hs_analysis_page']`. Reset on new fetch.
|
| 221 |
|
|
@@ -251,7 +255,7 @@ st.session_state['global_filters'] = {
|
|
| 251 |
| `sentiment_page` | SA page / fetch | SA pagination |
|
| 252 |
| `reply_page` | RR page / fetch | RR pagination |
|
| 253 |
| `content_summaries` | SA AI buttons | SA AI analysis display |
|
| 254 |
-
| `helpscout_df` | `app.py` startup | helpscout_dashboard.py, dashboard.py compact summary |
|
| 255 |
| `hs_analysis_df` | HS Analysis fetch | helpscout_analysis.py charts + cards |
|
| 256 |
| `hs_analysis_fetch_key` | HS Analysis fetch | HS Analysis stale-check |
|
| 257 |
| `hs_analysis_filter_desc` | HS Analysis fetch | human-readable filter string for PDF + agent |
|
|
@@ -319,13 +323,13 @@ Sections: cover, executive summary, sentiment, brand, platform, intent, cross-di
|
|
| 319 |
|
| 320 |
### HelpScout Dashboard PDF (`utils/helpscout_pdf.py` → `HelpScoutDashboardPDF`)
|
| 321 |
|
| 322 |
-
Generated from the HelpScout Dashboard page. Sections: cover, KPI summary, sentiment, topics, flags & escalation, status & source, timelines, demographics.
|
| 323 |
|
| 324 |
### HelpScout Analysis PDF (`utils/helpscout_pdf.py` → `HelpScoutAnalysisPDF`)
|
| 325 |
|
| 326 |
Generated from the "Export Analysis PDF" button on the HelpScout Analysis page (only available after an AI Summary has been generated).
|
| 327 |
|
| 328 |
-
Sections: cover, filter summary, KPI summary, chart snapshots, AI summary (executive summary, top themes, top complaints, unexpected insights, notable quotes), conversation cards sample, metadata.
|
| 329 |
|
| 330 |
**Dependencies:** `fpdf2`, `kaleido` (for Plotly PNG rendering at 3× scale).
|
| 331 |
|
|
@@ -375,6 +379,8 @@ Produces a **page-level** executive report from the filtered HelpScout conversat
|
|
| 375 |
2. Include the new value in the `fetch_key` tuple.
|
| 376 |
3. Add the corresponding `WHERE` clause condition to `_build_analysis_query()` in `helpscout_data_loader.py`.
|
| 377 |
|
|
|
|
|
|
|
| 378 |
### Add a new HelpScout topic
|
| 379 |
- Edit `process_helpscout/config_files/topics.json` (the taxonomy file).
|
| 380 |
- `helpscout_utils.load_topic_taxonomy()` reloads it on each app start; no other changes needed.
|
|
|
|
| 121 |
- Lightweight query from `SOCIAL_MEDIA_DB.ML_FEATURES.HELPSCOUT_CONVERSATION_FEATURES`.
|
| 122 |
- Columns: `conversation_id, status, source, created_at, updated_at, duration_hours, sentiment_polarity, topics, is_refund_request, is_cancellation, is_membership, customer_email`.
|
| 123 |
- Merges demographics (age/timezone/experience) via email join (`LOWER(customer_email) = LOWER(usora_users.email)`).
|
| 124 |
+
- After the demographics merge, adds **`is_member`** boolean: `True` when the customer email matched a Musora user record, `False` otherwise.
|
| 125 |
- Cached **24 hours**. Stored in `st.session_state['helpscout_df']`.
|
| 126 |
|
| 127 |
#### `load_analysis_data(date_start, date_end, topics, sentiments, statuses, sources, is_refund, is_cancellation, is_membership)`
|
|
|
|
| 189 |
**Reads from:** `st.session_state['helpscout_df']` (loaded at app startup).
|
| 190 |
|
| 191 |
**Key sections:**
|
| 192 |
+
- **Member status filter** (radio at top): "All Customers / Members Only / Non-Members Only"; filters the entire dashboard view before any section renders
|
| 193 |
- PDF export button (HelpScout Dashboard PDF)
|
| 194 |
- 6 KPI metrics: total conversations, average duration, refund requests, cancellations, negative rate, membership joins
|
| 195 |
- Sentiment distribution (pie + bar)
|
|
|
|
| 198 |
- Status and source breakdown
|
| 199 |
- Timelines expander (daily conversation volume, refund/cancel trend)
|
| 200 |
- Depth expander (topic co-occurrence, escalation funnel)
|
| 201 |
+
- **Member vs Non-Member section**: KPI metrics (member count, non-member count, email match rate) + member share pie chart + sentiment by member status stacked bar + top topics by member status grouped bar
|
| 202 |
- Demographics (age, timezone, experience)
|
| 203 |
|
| 204 |
> **Note:** Global sidebar filters (brand, platform, sentiment, date) do **not** apply to HelpScout pages; HelpScout is brand-agnostic and uses its own filter panel.
|
|
|
|
| 210 |
**Receives:** `helpscout_loader` instance.
|
| 211 |
|
| 212 |
**Flow:**
|
| 213 |
+
1. **Filter panel**: date range, top_n, topics (multi-select with human-readable labels), sentiments, statuses, sources, 3 boolean checkboxes (refund / cancellation / membership), and a **"Customer Type" selectbox** (All / Members Only / Non-Members Only).
|
| 214 |
+
2. **Fetch Data** button: calls `helpscout_loader.load_analysis_data(...)`, stale-checked via `fetch_key` tuple. The Customer Type filter is **not** part of the Snowflake query; it is applied in Python after fetching, using the member email set derived from `st.session_state['helpscout_df']`.
|
| 215 |
3. **KPI row** + distribution charts (sentiment, topics, flags, status).
|
| 216 |
+
4. **Member vs Non-Member section**: always rendered when member data is available; shows share pie, sentiment stacked bar, and top-topics grouped bar split by member status.
|
| 217 |
+
5. **AI Summary section:**
|
| 218 |
- "Generate AI Summary" button: calls `HelpScoutSummaryAgent`, stores result in `st.session_state['hs_analysis_summary']`.
|
| 219 |
- Renders: executive summary, top themes, top complaints, unexpected insights, notable quotes.
|
| 220 |
- "Export Analysis PDF" button: generates `HelpScoutAnalysisPDF`.
|
| 221 |
+
6. **Paginated conversation cards**: 10 per page; each card shows customer name, status, topics (label-mapped), summary, sentiment/topic notes.
|
| 222 |
+
7. **CSV export** button.
|
| 223 |
|
| 224 |
**Pagination:** `st.session_state['hs_analysis_page']`. Reset on new fetch.
|
| 225 |
|
|
|
|
| 255 |
| `sentiment_page` | SA page / fetch | SA pagination |
|
| 256 |
| `reply_page` | RR page / fetch | RR pagination |
|
| 257 |
| `content_summaries` | SA AI buttons | SA AI analysis display |
|
| 258 |
+
| `helpscout_df` | `app.py` startup | helpscout_dashboard.py (includes `is_member`), dashboard.py compact summary, helpscout_analysis.py member filter |
|
| 259 |
| `hs_analysis_df` | HS Analysis fetch | helpscout_analysis.py charts + cards |
|
| 260 |
| `hs_analysis_fetch_key` | HS Analysis fetch | HS Analysis stale-check |
|
| 261 |
| `hs_analysis_filter_desc` | HS Analysis fetch | human-readable filter string for PDF + agent |
|
|
|
|
| 323 |
|
| 324 |
### HelpScout Dashboard PDF (`utils/helpscout_pdf.py` → `HelpScoutDashboardPDF`)
|
| 325 |
|
| 326 |
+
Generated from the HelpScout Dashboard page. Sections: cover, KPI summary, sentiment, topics, flags & escalation, status & source, timelines, depth, **member vs non-member** (metrics + pie + sentiment bar + topic grouped bar), demographics.
|
| 327 |
|
| 328 |
### HelpScout Analysis PDF (`utils/helpscout_pdf.py` → `HelpScoutAnalysisPDF`)
|
| 329 |
|
| 330 |
Generated from the "Export Analysis PDF" button on the HelpScout Analysis page (only available after an AI Summary has been generated).
|
| 331 |
|
| 332 |
+
Sections: cover, filter summary, KPI summary (including member/non-member counts when available), chart snapshots, **member vs non-member breakdown** (pie + sentiment bar + topic grouped bar), AI summary (executive summary, top themes, top complaints, unexpected insights, notable quotes), conversation cards sample, metadata.
|
| 333 |
|
| 334 |
**Dependencies:** `fpdf2`, `kaleido` (for Plotly PNG rendering at 3× scale).
|
| 335 |
|
|
|
|
| 379 |
2. Include the new value in the `fetch_key` tuple.
|
| 380 |
3. Add the corresponding `WHERE` clause condition to `_build_analysis_query()` in `helpscout_data_loader.py`.
|
| 381 |
|
| 382 |
+
> **Python-side filters** (those whose data is not in the Snowflake HelpScout table) are applied after fetching rather than in SQL. The member/non-member filter is the canonical example: `is_member` is derived from `st.session_state['helpscout_df']` after the Snowflake fetch. Such filters should **not** be included in the `fetch_key` tuple.
|
| 383 |
+
|
| 384 |
### Add a new HelpScout topic
|
| 385 |
- Edit `process_helpscout/config_files/topics.json` (the taxonomy file).
|
| 386 |
- `helpscout_utils.load_topic_taxonomy()` reloads it on each app start; no other changes needed.
|
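The stale-check described in the README hunk above can be sketched as follows, assuming `fetch_key` is a plain tuple of the SQL-side filter values. `make_fetch_key` and `needs_refetch` are hypothetical names used for illustration, not functions from the repository. The point is only that the Customer Type selection stays out of the key, so changing it does not force a new Snowflake query.

```python
from datetime import date


def make_fetch_key(date_start, date_end, topics, sentiments, statuses, sources,
                   refund_only, cancel_only, membership_only):
    """Build a hashable key from the SQL-side filters only (hypothetical helper).

    The member/non-member selection is deliberately absent: it is applied
    in Python after the fetch, so it must not invalidate the fetched data.
    """
    return (
        date_start, date_end,
        tuple(sorted(topics)), tuple(sorted(sentiments)),
        tuple(sorted(statuses)), tuple(sorted(sources)),
        bool(refund_only), bool(cancel_only), bool(membership_only),
    )


def needs_refetch(stored_key, current_key):
    """True when any SQL-side filter changed since the last fetch."""
    return stored_key != current_key


# Same SQL-side filters twice, then one with an extra topic.
key_a = make_fetch_key(date(2024, 1, 1), date(2024, 1, 31), ["billing"], [], [], [],
                       False, False, False)
key_b = make_fetch_key(date(2024, 1, 1), date(2024, 1, 31), ["billing"], [], [], [],
                       False, False, False)
key_c = make_fetch_key(date(2024, 1, 1), date(2024, 1, 31), ["billing", "login"], [], [], [],
                       False, False, False)
```

Here `needs_refetch(key_a, key_b)` is false and `needs_refetch(key_a, key_c)` is true. Switching between "All" and "Members Only" never reaches this function at all.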
visualization/components/helpscout_analysis.py
CHANGED
|
@@ -108,13 +108,21 @@ def render_helpscout_analysis(data_loader):
|
|
| 108 |
key="hs_analysis_sources",
|
| 109 |
)
|
| 110 |
|
| 111 |
-
row3_col1, row3_col2, row3_col3 = st.columns(
|
| 112 |
with row3_col1:
|
| 113 |
refund_only = st.checkbox("Refund Requests Only", key="hs_analysis_refund")
|
| 114 |
with row3_col2:
|
| 115 |
cancel_only = st.checkbox("Cancellations Only", key="hs_analysis_cancel")
|
| 116 |
with row3_col3:
|
| 117 |
membership_only = st.checkbox("Membership Joins Only", key="hs_analysis_membership")
|
| 118 |
|
| 119 |
st.markdown("---")
|
| 120 |
|
|
@@ -170,6 +178,7 @@ def render_helpscout_analysis(data_loader):
|
|
| 170 |
"refund_only": refund_only,
|
| 171 |
"cancel_only": cancel_only,
|
| 172 |
"membership_only": membership_only,
|
|
|
|
| 173 |
}
|
| 174 |
st.session_state["hs_analysis_df"] = result_df
|
| 175 |
st.session_state["hs_analysis_fetch_key"] = fetch_key
|
|
@@ -183,9 +192,25 @@ def render_helpscout_analysis(data_loader):
|
|
| 183 |
if not has_data and not fetch_clicked:
|
| 184 |
return
|
| 185 |
|
| 186 |
-
analysis_df = st.session_state.get("hs_analysis_df", pd.DataFrame())
|
| 187 |
filter_desc = st.session_state.get("hs_analysis_filter_desc", "No filters applied")
|
| 188 |
|
| 189 |
if analysis_df.empty:
|
| 190 |
st.warning("No conversations found for the selected filters. Try adjusting and re-fetching.")
|
| 191 |
return
|
|
@@ -238,6 +263,22 @@ def render_helpscout_analysis(data_loader):
|
|
| 238 |
st.plotly_chart(charts.create_volume_timeline(analysis_df, title="Volume Over Time"),
|
| 239 |
use_container_width=True, key="hs_analysis_vol_timeline2")
|
| 240 |
|
| 241 |
st.markdown("---")
|
| 242 |
|
| 243 |
# ── AI Summary Report ─────────────────────────────────────────────────────
|
|
|
|
| 108 |
key="hs_analysis_sources",
|
| 109 |
)
|
| 110 |
|
| 111 |
+
row3_col1, row3_col2, row3_col3, row3_col4 = st.columns(4)
|
| 112 |
with row3_col1:
|
| 113 |
refund_only = st.checkbox("Refund Requests Only", key="hs_analysis_refund")
|
| 114 |
with row3_col2:
|
| 115 |
cancel_only = st.checkbox("Cancellations Only", key="hs_analysis_cancel")
|
| 116 |
with row3_col3:
|
| 117 |
membership_only = st.checkbox("Membership Joins Only", key="hs_analysis_membership")
|
| 118 |
+
with row3_col4:
|
| 119 |
+
member_status_filter = st.selectbox(
|
| 120 |
+
"Customer Type",
|
| 121 |
+
options=["All", "Members Only", "Non-Members Only"],
|
| 122 |
+
index=0,
|
| 123 |
+
help="Members are customers whose email matches a Musora user account.",
|
| 124 |
+
key="hs_analysis_member_status",
|
| 125 |
+
)
|
| 126 |
|
| 127 |
st.markdown("---")
|
| 128 |
|
|
|
|
| 178 |
"refund_only": refund_only,
|
| 179 |
"cancel_only": cancel_only,
|
| 180 |
"membership_only": membership_only,
|
| 181 |
+
"member_status": member_status_filter,
|
| 182 |
}
|
| 183 |
st.session_state["hs_analysis_df"] = result_df
|
| 184 |
st.session_state["hs_analysis_fetch_key"] = fetch_key
|
|
|
|
| 192 |
if not has_data and not fetch_clicked:
|
| 193 |
return
|
| 194 |
|
| 195 |
+
analysis_df = st.session_state.get("hs_analysis_df", pd.DataFrame()).copy()
|
| 196 |
filter_desc = st.session_state.get("hs_analysis_filter_desc", "No filters applied")
|
| 197 |
|
| 198 |
+
# Derive is_member from dashboard df (always, so breakdown charts work on "All" too)
|
| 199 |
+
if "customer_email" in analysis_df.columns:
|
| 200 |
+
hs_dashboard = st.session_state.get("helpscout_df", pd.DataFrame())
|
| 201 |
+
if "is_member" in hs_dashboard.columns and not hs_dashboard.empty:
|
| 202 |
+
member_emails = set(
|
| 203 |
+
hs_dashboard[hs_dashboard["is_member"]]["customer_email"].str.lower().dropna()
|
| 204 |
+
)
|
| 205 |
+
analysis_df["is_member"] = analysis_df["customer_email"].str.lower().isin(member_emails)
|
| 206 |
+
# Apply filter when a specific group is selected
|
| 207 |
+
if member_status_filter == "Members Only":
|
| 208 |
+
analysis_df = analysis_df[analysis_df["is_member"]]
|
| 209 |
+
elif member_status_filter == "Non-Members Only":
|
| 210 |
+
analysis_df = analysis_df[~analysis_df["is_member"]]
|
| 211 |
+
elif member_status_filter != "All":
|
| 212 |
+
st.warning("Member data not available - customer emails could not be matched to Musora records.")
|
| 213 |
+
|
| 214 |
if analysis_df.empty:
|
| 215 |
st.warning("No conversations found for the selected filters. Try adjusting and re-fetching.")
|
| 216 |
return
|
|
|
|
| 263 |
st.plotly_chart(charts.create_volume_timeline(analysis_df, title="Volume Over Time"),
|
| 264 |
use_container_width=True, key="hs_analysis_vol_timeline2")
|
| 265 |
|
| 266 |
+
# Member vs Non-Member breakdown (rendered whenever is_member was derived above)
|
| 267 |
+
if "is_member" in analysis_df.columns:
|
| 268 |
+
st.markdown("### 👤 Member vs Non-Member")
|
| 269 |
+
col1, col2 = st.columns(2)
|
| 270 |
+
with col1:
|
| 271 |
+
st.plotly_chart(charts.create_member_status_chart(analysis_df,
|
| 272 |
+
title="Member vs Non-Member"),
|
| 273 |
+
use_container_width=True, key="hs_analysis_member_pie")
|
| 274 |
+
with col2:
|
| 275 |
+
st.plotly_chart(charts.create_member_sentiment_chart(analysis_df,
|
| 276 |
+
title="Sentiment by Member Status"),
|
| 277 |
+
use_container_width=True, key="hs_analysis_member_sentiment")
|
| 278 |
+
st.plotly_chart(charts.create_member_topic_chart(analysis_df,
|
| 279 |
+
title="Top Topics by Member Status"),
|
| 280 |
+
use_container_width=True, key="hs_analysis_member_topics")
|
| 281 |
+
|
| 282 |
st.markdown("---")
|
| 283 |
|
| 284 |
# ── AI Summary Report ─────────────────────────────────────────────────────
|
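The Python-side member filter in this hunk can be isolated into a small function for testing. The sketch below assumes the same two inputs the diff uses (the fetched analysis frame and the cached dashboard frame carrying `is_member`); `apply_member_filter` is a hypothetical name, and the sample frames are made up for illustration.

```python
import pandas as pd


def apply_member_filter(analysis_df, dashboard_df, choice):
    """Derive is_member from the dashboard frame, then filter by choice.

    choice is one of "All", "Members Only", "Non-Members Only".
    Returns a copy; the input frame is left untouched.
    """
    out = analysis_df.copy()
    if "customer_email" not in out.columns:
        return out
    if dashboard_df.empty or "is_member" not in dashboard_df.columns:
        return out  # no member data: leave the frame unfiltered

    # Lowercased emails of every conversation flagged as a member.
    member_emails = set(
        dashboard_df.loc[dashboard_df["is_member"], "customer_email"]
        .dropna()
        .str.lower()
    )
    # Missing emails compare as non-members (isin returns False for NaN here).
    out["is_member"] = out["customer_email"].str.lower().isin(member_emails)

    if choice == "Members Only":
        out = out[out["is_member"]]
    elif choice == "Non-Members Only":
        out = out[~out["is_member"]]
    return out


# Made-up sample data for illustration only.
dashboard = pd.DataFrame({
    "customer_email": ["A@x.com", "b@x.com", None],
    "is_member": [True, False, False],
})
analysis = pd.DataFrame({
    "conversation_id": [1, 2, 3],
    "customer_email": ["a@X.com", "b@x.com", None],
})
members = apply_member_filter(analysis, dashboard, "Members Only")
non_members = apply_member_filter(analysis, dashboard, "Non-Members Only")
```

Because `is_member` is derived even when the choice is "All", the breakdown charts have the column they need in every mode.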
visualization/components/helpscout_dashboard.py
CHANGED
|
@@ -57,6 +57,27 @@ def render_helpscout_dashboard(data_loader, date_range=None):
|
|
| 57 |
charts = HelpScoutCharts()
|
| 58 |
taxonomy = load_topic_taxonomy()
|
| 59 |
|
| 60 |
# ── PDF Export ────────────────────────────────────────────────────────────
|
| 61 |
with st.expander("π Export PDF Report", expanded=False):
|
| 62 |
st.markdown(
|
|
@@ -201,6 +222,37 @@ def render_helpscout_dashboard(data_loader, date_range=None):
|
|
| 201 |
with col2:
|
| 202 |
st.plotly_chart(charts.create_thread_count_histogram(hs_df), use_container_width=True)
|
| 203 |
|
| 204 |
# ── Demographics ─────────────────────────────────────────────────────────
|
| 205 |
has_demographics = (
|
| 206 |
"age_group" in hs_df.columns
|
|
|
|
| 57 |
charts = HelpScoutCharts()
|
| 58 |
taxonomy = load_topic_taxonomy()
|
| 59 |
|
| 60 |
+
# ── Member Status Filter ───────────────────────────────────────────────────
|
| 61 |
+
has_member_data = "is_member" in hs_df.columns
|
| 62 |
+
if has_member_data:
|
| 63 |
+
member_filter = st.radio(
|
| 64 |
+
"Show conversations for:",
|
| 65 |
+
options=["All Customers", "Members Only", "Non-Members Only"],
|
| 66 |
+
horizontal=True,
|
| 67 |
+
key="hs_dash_member_filter",
|
| 68 |
+
)
|
| 69 |
+
if member_filter == "Members Only":
|
| 70 |
+
hs_df = hs_df[hs_df["is_member"]]
|
| 71 |
+
elif member_filter == "Non-Members Only":
|
| 72 |
+
hs_df = hs_df[~hs_df["is_member"]]
|
| 73 |
+
if member_filter != "All Customers" and hs_df.empty:
|
| 74 |
+
st.warning(f"No conversations found for {member_filter.lower().replace(' only', '')}.")
|
| 75 |
+
return
|
| 76 |
+
if member_filter != "All Customers":
|
| 77 |
+
st.info(f"Filtered to **{len(hs_df):,}** {member_filter.lower().replace(' only', '')} conversations.")
|
| 78 |
+
else:
|
| 79 |
+
st.info("ℹ️ Member data not available - customer emails could not be matched to Musora user records.")
|
| 80 |
+
|
| 81 |
# ── PDF Export ────────────────────────────────────────────────────────────
|
| 82 |
with st.expander("π Export PDF Report", expanded=False):
|
| 83 |
st.markdown(
|
|
|
|
| 222 |
with col2:
|
| 223 |
st.plotly_chart(charts.create_thread_count_histogram(hs_df), use_container_width=True)
|
| 224 |
|
| 225 |
+
# ── Member vs Non-Member ─────────────────────────────────────────────────
|
| 226 |
+
if "is_member" in hs_df.columns:
|
| 227 |
+
st.markdown("---")
|
| 228 |
+
st.markdown("## 👤 Member vs Non-Member")
|
| 229 |
+
st.caption(
|
| 230 |
+
"Conversations are classified as **Member** when the customer email matches "
|
| 231 |
+
"a Musora user account, and **Non-Member** otherwise."
|
| 232 |
+
)
|
| 233 |
+
|
| 234 |
+
member_count = int(hs_df["is_member"].sum())
|
| 235 |
+
non_member_count = total - member_count
|
| 236 |
+
match_pct = member_count / total * 100 if total else 0
|
| 237 |
+
|
| 238 |
+
mv1, mv2, mv3 = st.columns(3)
|
| 239 |
+
mv1.metric("Members", f"{member_count:,}",
|
| 240 |
+
f"{match_pct:.1f}% of conversations" if total else None)
|
| 241 |
+
mv2.metric("Non-Members", f"{non_member_count:,}",
|
| 242 |
+
f"{100 - match_pct:.1f}% of conversations" if total else None)
|
| 243 |
+
mv3.metric("Email Match Rate", f"{match_pct:.1f}%")
|
| 244 |
+
|
| 245 |
+
mem_col1, mem_col2 = st.columns(2)
|
| 246 |
+
with mem_col1:
|
| 247 |
+
st.plotly_chart(charts.create_member_status_chart(hs_df),
|
| 248 |
+
use_container_width=True, key="hs_dash_member_pie")
|
| 249 |
+
with mem_col2:
|
| 250 |
+
st.plotly_chart(charts.create_member_sentiment_chart(hs_df),
|
| 251 |
+
use_container_width=True, key="hs_dash_member_sentiment")
|
| 252 |
+
|
| 253 |
+
st.plotly_chart(charts.create_member_topic_chart(hs_df),
|
| 254 |
+
use_container_width=True, key="hs_dash_member_topics")
|
| 255 |
+
|
| 256 |
# ── Demographics ─────────────────────────────────────────────────────────
|
| 257 |
has_demographics = (
|
| 258 |
"age_group" in hs_df.columns
|
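The three member KPIs in the dashboard hunk reduce to simple arithmetic, sketched here in plain Python. `member_kpis` is a hypothetical helper, and it assumes `total` is the number of rows in the frame being displayed (the diff uses a `total` variable defined outside the shown hunk).

```python
def member_kpis(is_member_flags):
    """Compute member count, non-member count, and email match rate.

    is_member_flags: iterable of booleans, one per conversation.
    The match rate is guarded against division by zero, as in the diff.
    """
    flags = list(is_member_flags)
    total = len(flags)
    member_count = sum(1 for flag in flags if flag)
    non_member_count = total - member_count
    match_pct = member_count / total * 100 if total else 0.0
    return {
        "members": member_count,
        "non_members": non_member_count,
        "match_pct": match_pct,
    }


kpis = member_kpis([True, False, True, False])
label = f"{kpis['match_pct']:.1f}%"  # formatted the same way as the metric card
empty = member_kpis([])
```

One consequence worth checking against the real page: if `total` is taken from the already-filtered frame, selecting "Members Only" would make the match rate read 100%.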
visualization/data/helpscout_data_loader.py
CHANGED
|
@@ -297,11 +297,13 @@ class HelpScoutDataLoader:
|
|
| 297 |
if demo_df.empty or "customer_email" not in df.columns:
|
| 298 |
for col, val in [("age", None), ("age_group", "Unknown"),
|
| 299 |
("timezone", None), ("timezone_region", "Unknown"),
|
| 300 |
-
("experience_level", None), ("experience_group", "Unknown")
|
|
|
|
| 301 |
df[col] = val
|
| 302 |
return df
|
| 303 |
|
| 304 |
if "customer_email" not in demo_df.columns:
|
|
|
|
| 305 |
return df
|
| 306 |
|
| 307 |
merge_cols = ["customer_email"]
|
|
@@ -311,6 +313,10 @@ class HelpScoutDataLoader:
|
|
| 311 |
|
| 312 |
merged = df.merge(demo_df[merge_cols], on="customer_email", how="left")
|
| 313 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 314 |
for col in ["age_group", "timezone_region", "experience_group"]:
|
| 315 |
if col in merged.columns:
|
| 316 |
merged[col] = merged[col].fillna("Unknown")
|
|
|
|
| 297 |
if demo_df.empty or "customer_email" not in df.columns:
|
| 298 |
for col, val in [("age", None), ("age_group", "Unknown"),
|
| 299 |
("timezone", None), ("timezone_region", "Unknown"),
|
| 300 |
+
("experience_level", None), ("experience_group", "Unknown"),
|
| 301 |
+
("is_member", False)]:
|
| 302 |
df[col] = val
|
| 303 |
return df
|
| 304 |
|
| 305 |
if "customer_email" not in demo_df.columns:
|
| 306 |
+
df["is_member"] = False
|
| 307 |
return df
|
| 308 |
|
| 309 |
merge_cols = ["customer_email"]
|
|
|
|
| 313 |
|
| 314 |
merged = df.merge(demo_df[merge_cols], on="customer_email", how="left")
|
| 315 |
|
| 316 |
+
# is_member: True when the customer email matched a Musora user record
|
| 317 |
+
member_emails = set(demo_df["customer_email"].str.lower().dropna())
|
| 318 |
+
merged["is_member"] = merged["customer_email"].str.lower().isin(member_emails)
|
| 319 |
+
|
| 320 |
for col in ["age_group", "timezone_region", "experience_group"]:
|
| 321 |
if col in merged.columns:
|
| 322 |
merged[col] = merged[col].fillna("Unknown")
|
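The `is_member` derivation added to the loader can be sketched on its own, assuming `demo_df` holds one row per Musora user with a `customer_email` column, as the hunk implies. `add_is_member` is a hypothetical name and the sample frames are made up.

```python
import pandas as pd


def add_is_member(df, demo_df):
    """Flag conversations whose email appears in the user table.

    Falls back to is_member=False for every row when either frame lacks
    the email column or the user table is empty, mirroring the early
    returns in the diff.
    """
    out = df.copy()
    if (
        demo_df.empty
        or "customer_email" not in demo_df.columns
        or "customer_email" not in out.columns
    ):
        out["is_member"] = False
        return out

    # Case-insensitive comparison; missing emails are never members.
    member_emails = set(demo_df["customer_email"].dropna().str.lower())
    out["is_member"] = out["customer_email"].str.lower().isin(member_emails)
    return out


# Made-up sample data for illustration only.
conversations = pd.DataFrame({
    "conversation_id": [10, 11, 12],
    "customer_email": ["Ann@Example.com", "bob@example.com", None],
})
users = pd.DataFrame({"customer_email": ["ann@example.com"]})

flagged = add_is_member(conversations, users)
unmatched = add_is_member(conversations, pd.DataFrame())
```

Setting the column in the fallback paths too means downstream code can rely on `is_member` being present rather than checking for it everywhere.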
visualization/utils/helpscout_pdf.py
CHANGED
|
@@ -89,6 +89,7 @@ class HelpScoutDashboardPDF:
|
|
| 89 |
self._status_source_section(df)
|
| 90 |
self._timelines_section(df)
|
| 91 |
self._depth_section(df)
|
|
|
|
| 92 |
self._data_summary(df, filter_info)
|
| 93 |
return bytes(self.pdf.output())
|
| 94 |
finally:
|
|
@@ -237,6 +238,33 @@ class HelpScoutDashboardPDF:
|
|
| 237 |
thd = self.charts.create_thread_count_histogram(df)
|
| 238 |
self._add_two_charts(dur, thd)
|
| 239 |
|
| 240 |
def _data_summary(self, df, filter_info):
|
| 241 |
self.pdf.add_page()
|
| 242 |
self.pdf.section_header("Data Summary")
|
|
@@ -395,6 +423,14 @@ class HelpScoutAnalysisPDF:
|
|
| 395 |
("Cancellations", f"{flags['is_cancellation']:,}"),
|
| 396 |
("Membership Joins", f"{flags['is_membership']:,}"),
|
| 397 |
])
|
| 398 |
|
| 399 |
def _distributions_section(self, df):
|
| 400 |
self.pdf.add_page()
|
|
@@ -403,6 +439,17 @@ class HelpScoutAnalysisPDF:
|
|
| 403 |
tbar = self.charts.create_topic_bar_chart(df, title="Topic Distribution")
|
| 404 |
self._add_two_charts(pie, tbar)
|
| 405 |
self._add_chart(self.charts.create_topic_sentiment_heatmap(df), img_h=500)
|
| 406 |
|
| 407 |
def _summary_section(self, result: dict):
|
| 408 |
self.pdf.add_page()
|
|
|
|
| 89 |
self._status_source_section(df)
|
| 90 |
self._timelines_section(df)
|
| 91 |
self._depth_section(df)
|
| 92 |
+
self._member_section(df)
|
| 93 |
self._data_summary(df, filter_info)
|
| 94 |
return bytes(self.pdf.output())
|
| 95 |
finally:
|
|
|
|
| 238 |
thd = self.charts.create_thread_count_histogram(df)
|
| 239 |
self._add_two_charts(dur, thd)
|
| 240 |
|
| 241 |
+
def _member_section(self, df):
|
| 242 |
+
if "is_member" not in df.columns:
|
| 243 |
+
return
|
| 244 |
+
self.pdf.add_page()
|
| 245 |
+
self.pdf.section_header("Member vs Non-Member Analysis")
|
| 246 |
+
total = len(df)
|
| 247 |
+
member_count = int(df["is_member"].sum())
|
| 248 |
+
non_member_count = total - member_count
|
| 249 |
+
match_pct = member_count / total * 100 if total else 0
|
| 250 |
+
self.pdf.metric_row([
|
| 251 |
+
("Members", f"{member_count:,}"),
|
| 252 |
+
("Non-Members", f"{non_member_count:,}"),
|
| 253 |
+
("Email Match Rate", f"{match_pct:.1f}%"),
|
| 254 |
+
])
|
| 255 |
+
self.pdf.body_text(
|
| 256 |
+
"Members are customers whose email was matched against Musora user records. "
|
| 257 |
+
"Non-Members contacted support without an associated Musora account."
|
| 258 |
+
)
|
| 259 |
+
self._add_two_charts(
|
| 260 |
+
self.charts.create_member_status_chart(df, title="Member vs Non-Member"),
|
| 261 |
+
self.charts.create_member_sentiment_chart(df, title="Sentiment by Member Status"),
|
| 262 |
+
)
|
| 263 |
+
self._add_chart(
|
| 264 |
+
self.charts.create_member_topic_chart(df, title="Top Topics by Member Status"),
|
| 265 |
+
img_h=500,
|
| 266 |
+
)
|
| 267 |
+
|
| 268 |
def _data_summary(self, df, filter_info):
|
| 269 |
self.pdf.add_page()
|
| 270 |
self.pdf.section_header("Data Summary")
|
|
|
|
| 423 |
("Cancellations", f"{flags['is_cancellation']:,}"),
|
| 424 |
("Membership Joins", f"{flags['is_membership']:,}"),
|
| 425 |
])
|
| 426 |
+
if "is_member" in df.columns:
|
| 427 |
+
member_count = int(df["is_member"].sum())
|
| 428 |
+
non_member_count = total - member_count
|
| 429 |
+
self.pdf.metric_row([
|
| 430 |
+
("Members", f"{member_count:,}"),
|
| 431 |
+
("Non-Members", f"{non_member_count:,}"),
|
| 432 |
+
("Email Match Rate", f"{member_count / total * 100:.1f}%" if total else "N/A"),
|
| 433 |
+
])
|
| 434 |
|
| 435 |
def _distributions_section(self, df):
|
| 436 |
self.pdf.add_page()
|
|
|
|
| 439 |
tbar = self.charts.create_topic_bar_chart(df, title="Topic Distribution")
|
| 440 |
self._add_two_charts(pie, tbar)
|
| 441 |
self._add_chart(self.charts.create_topic_sentiment_heatmap(df), img_h=500)
|
| 442 |
+
if "is_member" in df.columns:
|
| 443 |
+
self.pdf.add_page()
|
| 444 |
+
self.pdf.section_header("Member vs Non-Member Breakdown")
|
| 445 |
+
self._add_two_charts(
|
| 446 |
+
self.charts.create_member_status_chart(df, title="Member vs Non-Member"),
|
| 447 |
+
self.charts.create_member_sentiment_chart(df, title="Sentiment by Member Status"),
|
| 448 |
+
)
|
| 449 |
+
self._add_chart(
|
| 450 |
+
self.charts.create_member_topic_chart(df, title="Top Topics by Member Status"),
|
| 451 |
+
img_h=500,
|
| 452 |
+
)
|
| 453 |
|
| 454 |
def _summary_section(self, result: dict):
|
| 455 |
self.pdf.add_page()
|
visualization/utils/helpscout_utils.py
CHANGED
|
@@ -104,4 +104,7 @@ def build_filter_description(filters: dict, taxonomy: dict) -> str:
|
|
| 104 |
parts.append("Cancellations only")
|
| 105 |
if filters.get("membership_only"):
|
| 106 |
parts.append("Membership requests only")
|
|
|
|
|
|
|
|
|
|
| 107 |
return "; ".join(parts) if parts else "No filters applied - showing all conversations"
|
|
|
|
| 104 |
parts.append("Cancellations only")
|
| 105 |
if filters.get("membership_only"):
|
| 106 |
parts.append("Membership requests only")
|
| 107 |
+
member_status = filters.get("member_status", "All")
|
| 108 |
+
if member_status and member_status != "All":
|
| 109 |
+
parts.append(f"Customer type: {member_status}")
|
| 110 |
return "; ".join(parts) if parts else "No filters applied - showing all conversations"
|
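A cut-down sketch of how the new `member_status` entry flows into the filter description. `describe_filters` is a hypothetical stand-in for `build_filter_description`: it drops the `taxonomy` argument and covers only the branches visible in this hunk, and the fallback string is shortened.

```python
def describe_filters(filters):
    """Join the active filters into one human-readable string.

    Only the branches shown in the hunk are reproduced; "All" (or a missing
    member_status) contributes nothing to the description.
    """
    parts = []
    if filters.get("cancel_only"):
        parts.append("Cancellations only")
    if filters.get("membership_only"):
        parts.append("Membership requests only")
    member_status = filters.get("member_status", "All")
    if member_status and member_status != "All":
        parts.append(f"Customer type: {member_status}")
    return "; ".join(parts) if parts else "No filters applied"


no_filters = describe_filters({"member_status": "All"})
combined = describe_filters({"cancel_only": True, "member_status": "Members Only"})
```

Defaulting to "All" keeps filter dicts stored before this commit working, since they have no `member_status` key.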
visualization/visualizations/helpscout_charts.py
CHANGED
|
@@ -415,6 +415,73 @@ class HelpScoutCharts:
|
|
| 415 |
yaxis_title="Conversations", height=self.chart_height)
|
| 416 |
return fig
|
| 417 |
|
| 418 |
# ─────────────────────────────────────────────────────────────
|
| 419 |
# Helpers
|
| 420 |
# ─────────────────────────────────────────────────────────────
|
|
|
|
| 415 |
yaxis_title="Conversations", height=self.chart_height)
|
| 416 |
return fig
|
| 417 |
|
| 418 |
+
# ─────────────────────────────────────────────────────────────
|
| 419 |
+
# Member vs Non-Member charts
|
| 420 |
+
# ─────────────────────────────────────────────────────────────
|
| 421 |
+
|
| 422 |
+
def create_member_status_chart(self, df, title="Member vs Non-Member"):
|
| 423 |
+
"""Pie chart: proportion of conversations from Musora members vs non-members."""
|
| 424 |
+
if "is_member" not in df.columns:
|
| 425 |
+
return self._empty_fig(title, "No member data available")
|
| 426 |
+
label_map = {True: "Member", False: "Non-Member"}
|
| 427 |
+
counts = df["is_member"].map(label_map).value_counts()
|
| 428 |
+
color_map = {"Member": "#1982C4", "Non-Member": "#FF6B35"}
|
| 429 |
+
colors = [color_map.get(l, "#CCCCCC") for l in counts.index]
|
| 430 |
+
fig = go.Figure(go.Pie(
|
| 431 |
+
labels=counts.index, values=counts.values,
|
| 432 |
+
marker=dict(colors=colors),
|
| 433 |
+
textinfo="label+percent",
|
| 434 |
+
hovertemplate="<b>%{label}</b><br>Count: %{value}<br>%{percent}<extra></extra>",
|
| 435 |
+
))
|
| 436 |
+
fig.update_layout(title=title, height=self.chart_height,
|
| 437 |
+
legend=dict(orientation="v", yanchor="middle", y=0.5))
|
| 438 |
+
return fig
|
| 439 |
+
|
| 440 |
+
def create_member_sentiment_chart(self, df, title="Sentiment by Member Status"):
|
| 441 |
+
"""Stacked bar: sentiment distribution split by member vs non-member."""
|
| 442 |
+
if "is_member" not in df.columns or "sentiment_polarity" not in df.columns:
|
| 443 |
+
return self._empty_fig(title, "No member/sentiment data available")
|
| 444 |
+
df_c = df.copy()
|
| 445 |
+
df_c["member_status"] = df_c["is_member"].map({True: "Member", False: "Non-Member"})
|
| 446 |
+
pivot = pd.crosstab(df_c["member_status"], df_c["sentiment_polarity"])
|
| 447 |
+
ordered_cols = [s for s in self.sentiment_order if s in pivot.columns]
|
| 448 |
+
pivot = pivot[ordered_cols] if ordered_cols else pivot
|
| 449 |
+
fig = go.Figure()
|
| 450 |
+
for s in (ordered_cols or pivot.columns.tolist()):
|
| 451 |
+
fig.add_trace(go.Bar(
|
| 452 |
+
name=s, x=pivot.index, y=pivot[s],
|
| 453 |
+
marker_color=self.sentiment_colors.get(s, "#CCCCCC"),
|
| 454 |
+
hovertemplate="<b>%{x}</b><br>%{y}<extra></extra>",
|
| 455 |
+
))
|
| 456 |
+
fig.update_layout(title=title, barmode="stack", xaxis_title="Customer Type",
|
| 457 |
+
yaxis_title="Conversations", height=self.chart_height)
|
| 458 |
+
return fig
|
| 459 |
+
|
| 460 |
+
def create_member_topic_chart(self, df, title="Top Topics by Member Status"):
|
| 461 |
+
"""Grouped bar: top-10 topics split by member vs non-member."""
|
| 462 |
+
if "is_member" not in df.columns:
|
| 463 |
+
return self._empty_fig(title, "No member data available")
|
| 464 |
+
exploded = explode_topics(df)
|
| 465 |
+
if exploded.empty:
|
| 466 |
+
return self._empty_fig(title, "No topic data")
|
| 467 |
+
exploded["member_status"] = exploded["is_member"].map({True: "Member", False: "Non-Member"})
|
| 468 |
+
top_topics = exploded["topic_id"].value_counts().head(10).index.tolist()
|
| 469 |
+
exploded = exploded[exploded["topic_id"].isin(top_topics)]
|
| 470 |
+
pivot = pd.crosstab(exploded["topic_id"], exploded["member_status"])
|
| 471 |
+
pivot.index = [topic_label(t, self.taxonomy) for t in pivot.index]
|
| 472 |
+
fig = go.Figure()
|
| 473 |
+
color_map = {"Member": "#1982C4", "Non-Member": "#FF6B35"}
|
| 474 |
+
for col in pivot.columns:
|
| 475 |
+
fig.add_trace(go.Bar(
|
| 476 |
+
name=col, y=pivot.index, x=pivot[col], orientation="h",
|
| 477 |
+
marker_color=color_map.get(col, "#CCCCCC"),
|
| 478 |
+
hovertemplate="<b>%{y}</b><br>%{x}<extra></extra>",
|
| 479 |
+
))
|
| 480 |
+
fig.update_layout(title=title, barmode="group", xaxis_title="Conversations",
|
| 481 |
+
yaxis_title="Topic", height=self.chart_height + 80,
|
| 482 |
+
yaxis={"categoryorder": "total ascending"})
|
| 483 |
+
return fig
|
| 484 |
+
|
| 485 |
# ─────────────────────────────────────────────────────────────
|
| 486 |
# Helpers
|
| 487 |
# ─────────────────────────────────────────────────────────────
|
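The data step behind `create_member_sentiment_chart` is a cross-tabulation of member status against sentiment. The sketch below reproduces only that step with pandas, leaving out the Plotly traces. `member_sentiment_pivot` is a hypothetical name, and the sentiment labels in `SENTIMENT_ORDER` are assumed; the real `self.sentiment_order` is not shown in the diff.

```python
import pandas as pd

# Assumed labels; the repository's actual sentiment_order is not visible here.
SENTIMENT_ORDER = ["negative", "neutral", "positive"]


def member_sentiment_pivot(df, sentiment_order=SENTIMENT_ORDER):
    """Count conversations per (member status, sentiment) pair.

    Rows are "Member" / "Non-Member"; columns follow sentiment_order,
    keeping only the sentiments that actually occur in the data.
    """
    labelled = df.assign(
        member_status=df["is_member"].map({True: "Member", False: "Non-Member"})
    )
    pivot = pd.crosstab(labelled["member_status"], labelled["sentiment_polarity"])
    ordered = [s for s in sentiment_order if s in pivot.columns]
    return pivot[ordered] if ordered else pivot


# Made-up sample data for illustration only.
sample = pd.DataFrame({
    "is_member": [True, True, False, False, False],
    "sentiment_polarity": ["positive", "negative", "negative", "negative", "neutral"],
})
pivot = member_sentiment_pivot(sample)
```

Each column of the resulting pivot corresponds to one stacked `go.Bar` trace in the chart, with the row index supplying the x-axis categories.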