Spaces: MusoraProductDepartment/Sentiment_analysis

Danialebrat committed 26 days ago

Commit

5f1963f

1 Parent(s): 82272c5

Adding members sections


Adding a member filter and member sections to the dashboard and analysis pages.
Modifying the PDF reports to include the member breakdown.

Files changed (7)

visualization/README.md +14 -8
visualization/components/helpscout_analysis.py +43 -2
visualization/components/helpscout_dashboard.py +52 -0
visualization/data/helpscout_data_loader.py +7 -1
visualization/utils/helpscout_pdf.py +47 -0
visualization/utils/helpscout_utils.py +3 -0
visualization/visualizations/helpscout_charts.py +67 -0

visualization/README.md CHANGED

@@ -121,6 +121,7 @@ The social media table and `DIM_CONTENT` share column names. Any `WHERE` clause
 - Lightweight query from `SOCIAL_MEDIA_DB.ML_FEATURES.HELPSCOUT_CONVERSATION_FEATURES`.
 - Columns: `conversation_id, status, source, created_at, updated_at, duration_hours, sentiment_polarity, topics, is_refund_request, is_cancellation, is_membership, customer_email`.
 - Merges demographics (age/timezone/experience) via email join (`LOWER(customer_email) = LOWER(usora_users.email)`).
 - Cached **24 hours**. Stored in `st.session_state['helpscout_df']`.
 #### `load_analysis_data(date_start, date_end, topics, sentiments, statuses, sources, is_refund, is_cancellation, is_membership)`
@@ -188,6 +189,7 @@ The app has **5 pages** navigated via the sidebar radio:
 **Reads from:** `st.session_state['helpscout_df']` (loaded at app startup).
 **Key sections:**
 - PDF export button (HelpScout Dashboard PDF)
 - 6 KPI metrics: total conversations, average duration, refund requests, cancellations, negative rate, membership joins
 - Sentiment distribution (pie + bar)
@@ -196,6 +198,7 @@ The app has **5 pages** navigated via the sidebar radio:
 - Status and source breakdown
 - Timelines expander (daily conversation volume, refund/cancel trend)
 - Depth expander (topic co-occurrence, escalation funnel)
 - Demographics (age, timezone, experience)
 > **Note:** Global sidebar filters (brand, platform, sentiment, date) do **not** apply to HelpScout pages — HelpScout is brand-agnostic and uses its own filter panel.
@@ -207,15 +210,16 @@ The app has **5 pages** navigated via the sidebar radio:
 **Receives:** `helpscout_loader` instance.
 **Flow:**
-1. **Filter panel** — date range, top_n, topics (multi-select with human-readable labels), sentiments, statuses, sources, and 3 boolean checkboxes (refund / cancellation / membership).
-2. **Fetch Data** button — calls `helpscout_loader.load_analysis_data(...)`, stale-checked via `fetch_key` tuple.
 3. **KPI row** + distribution charts (sentiment, topics, flags, status).
-4. **AI Summary section:**
    - "Generate AI Summary" button → calls `HelpScoutSummaryAgent`, stores result in `st.session_state['hs_analysis_summary']`.
    - Renders: executive summary, top themes, top complaints, unexpected insights, notable quotes.
    - "Export Analysis PDF" button → generates `HelpScoutAnalysisPDF`.
-5. **Paginated conversation cards** — 10 per page; each card shows customer name, status, topics (label-mapped), summary, sentiment/topic notes.
-6. **CSV export** button.
 **Pagination:** `st.session_state['hs_analysis_page']`. Reset on new fetch.
@@ -251,7 +255,7 @@ st.session_state['global_filters'] = {
 | `sentiment_page` | SA page / fetch | SA pagination |
 | `reply_page` | RR page / fetch | RR pagination |
 | `content_summaries` | SA AI buttons | SA AI analysis display |
-| `helpscout_df` | `app.py` startup | helpscout_dashboard.py, dashboard.py compact summary |
 | `hs_analysis_df` | HS Analysis fetch | helpscout_analysis.py charts + cards |
 | `hs_analysis_fetch_key` | HS Analysis fetch | HS Analysis stale-check |
 | `hs_analysis_filter_desc` | HS Analysis fetch | human-readable filter string for PDF + agent |
@@ -319,13 +323,13 @@ Sections: cover, executive summary, sentiment, brand, platform, intent, cross-di
 ### HelpScout Dashboard PDF (`utils/helpscout_pdf.py` — `HelpScoutDashboardPDF`)
-Generated from the HelpScout Dashboard page. Sections: cover, KPI summary, sentiment, topics, flags & escalation, status & source, timelines, demographics.
 ### HelpScout Analysis PDF (`utils/helpscout_pdf.py` — `HelpScoutAnalysisPDF`)
 Generated from the "Export Analysis PDF" button on the HelpScout Analysis page (only available after an AI Summary has been generated).
-Sections: cover, filter summary, KPI summary, chart snapshots, AI summary (executive summary, top themes, top complaints, unexpected insights, notable quotes), conversation cards sample, metadata.
 **Dependencies:** `fpdf2`, `kaleido` (for Plotly PNG rendering at 3× scale).
@@ -375,6 +379,8 @@ Produces a **page-level** executive report from the filtered HelpScout conversat
 2. Include the new value in the `fetch_key` tuple.
 3. Add the corresponding `WHERE` clause condition to `_build_analysis_query()` in `helpscout_data_loader.py`.
 ### Add a new HelpScout topic
 - Edit `process_helpscout/config_files/topics.json` (the taxonomy file).
 - `helpscout_utils.load_topic_taxonomy()` reloads it on each app start; no other changes needed.

 - Lightweight query from `SOCIAL_MEDIA_DB.ML_FEATURES.HELPSCOUT_CONVERSATION_FEATURES`.
 - Columns: `conversation_id, status, source, created_at, updated_at, duration_hours, sentiment_polarity, topics, is_refund_request, is_cancellation, is_membership, customer_email`.
 - Merges demographics (age/timezone/experience) via email join (`LOWER(customer_email) = LOWER(usora_users.email)`).
+- After the demographics merge, adds **`is_member`** boolean: `True` when the customer email matched a Musora user record, `False` otherwise.
 - Cached **24 hours**. Stored in `st.session_state['helpscout_df']`.
 #### `load_analysis_data(date_start, date_end, topics, sentiments, statuses, sources, is_refund, is_cancellation, is_membership)`
 **Reads from:** `st.session_state['helpscout_df']` (loaded at app startup).
 **Key sections:**
+- **Member status filter** (radio at top): "All Customers / Members Only / Non-Members Only" — filters the entire dashboard view before any section renders
 - PDF export button (HelpScout Dashboard PDF)
 - 6 KPI metrics: total conversations, average duration, refund requests, cancellations, negative rate, membership joins
 - Sentiment distribution (pie + bar)
 - Status and source breakdown
 - Timelines expander (daily conversation volume, refund/cancel trend)
 - Depth expander (topic co-occurrence, escalation funnel)
+- **Member vs Non-Member section**: KPI metrics (member count, non-member count, email match rate) + member share pie chart + sentiment by member status stacked bar + top topics by member status grouped bar
 - Demographics (age, timezone, experience)
 > **Note:** Global sidebar filters (brand, platform, sentiment, date) do **not** apply to HelpScout pages — HelpScout is brand-agnostic and uses its own filter panel.
 **Receives:** `helpscout_loader` instance.
 **Flow:**
+1. **Filter panel** — date range, top_n, topics (multi-select with human-readable labels), sentiments, statuses, sources, 3 boolean checkboxes (refund / cancellation / membership), and a **"Customer Type" selectbox** (All / Members Only / Non-Members Only).
+2. **Fetch Data** button — calls `helpscout_loader.load_analysis_data(...)`, stale-checked via `fetch_key` tuple. The Customer Type filter is **not** part of the Snowflake query — it is applied in Python after fetching, using the member email set derived from `st.session_state['helpscout_df']`.
 3. **KPI row** + distribution charts (sentiment, topics, flags, status).
+4. **Member vs Non-Member section** — always rendered when member data is available; shows share pie, sentiment stacked bar, and top-topics grouped bar split by member status.
+5. **AI Summary section:**
    - "Generate AI Summary" button → calls `HelpScoutSummaryAgent`, stores result in `st.session_state['hs_analysis_summary']`.
    - Renders: executive summary, top themes, top complaints, unexpected insights, notable quotes.
    - "Export Analysis PDF" button → generates `HelpScoutAnalysisPDF`.
+6. **Paginated conversation cards** — 10 per page; each card shows customer name, status, topics (label-mapped), summary, sentiment/topic notes.
+7. **CSV export** button.
 **Pagination:** `st.session_state['hs_analysis_page']`. Reset on new fetch.
 | `sentiment_page` | SA page / fetch | SA pagination |
 | `reply_page` | RR page / fetch | RR pagination |
 | `content_summaries` | SA AI buttons | SA AI analysis display |
+| `helpscout_df` | `app.py` startup | helpscout_dashboard.py (includes `is_member`), dashboard.py compact summary, helpscout_analysis.py member filter |
 | `hs_analysis_df` | HS Analysis fetch | helpscout_analysis.py charts + cards |
 | `hs_analysis_fetch_key` | HS Analysis fetch | HS Analysis stale-check |
 | `hs_analysis_filter_desc` | HS Analysis fetch | human-readable filter string for PDF + agent |
 ### HelpScout Dashboard PDF (`utils/helpscout_pdf.py` — `HelpScoutDashboardPDF`)
+Generated from the HelpScout Dashboard page. Sections: cover, KPI summary, sentiment, topics, flags & escalation, status & source, timelines, depth, **member vs non-member** (metrics + pie + sentiment bar + topic grouped bar), demographics.
 ### HelpScout Analysis PDF (`utils/helpscout_pdf.py` — `HelpScoutAnalysisPDF`)
 Generated from the "Export Analysis PDF" button on the HelpScout Analysis page (only available after an AI Summary has been generated).
+Sections: cover, filter summary, KPI summary (including member/non-member counts when available), chart snapshots, **member vs non-member breakdown** (pie + sentiment bar + topic grouped bar), AI summary (executive summary, top themes, top complaints, unexpected insights, notable quotes), conversation cards sample, metadata.
 **Dependencies:** `fpdf2`, `kaleido` (for Plotly PNG rendering at 3× scale).
 2. Include the new value in the `fetch_key` tuple.
 3. Add the corresponding `WHERE` clause condition to `_build_analysis_query()` in `helpscout_data_loader.py`.
+> **Python-side filters** (those whose data is not in the Snowflake HelpScout table) are applied after fetching rather than in SQL. The member/non-member filter is the canonical example: `is_member` is derived from `st.session_state['helpscout_df']` after the Snowflake fetch. Such filters should **not** be included in the `fetch_key` tuple.
 ### Add a new HelpScout topic
 - Edit `process_helpscout/config_files/topics.json` (the taxonomy file).
 - `helpscout_utils.load_topic_taxonomy()` reloads it on each app start; no other changes needed.
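
The README note on Python-side filters can be illustrated with a small sketch of a stale-check key that covers only the SQL-side filters. The field list and the `build_fetch_key` helper below are hypothetical; the real `fetch_key` tuple is built in `helpscout_analysis.py` and may be laid out differently.

```python
# Hypothetical subset of SQL-side filter names, for illustration only.
SQL_FILTER_FIELDS = ("date_start", "date_end", "topics", "refund_only")


def build_fetch_key(filters: dict) -> tuple:
    """Build a hashable stale-check key from SQL-side filters only.

    Python-side filters such as member_status are deliberately ignored,
    so changing them never forces a new Snowflake fetch.
    """
    key = []
    for name in SQL_FILTER_FIELDS:
        value = filters.get(name)
        # Lists are unhashable, so freeze them into sorted tuples.
        if isinstance(value, list):
            value = tuple(sorted(value))
        key.append(value)
    return tuple(key)


base = {
    "date_start": "2024-01-01",
    "date_end": "2024-01-31",
    "topics": ["billing", "login"],
    "refund_only": False,
    "member_status": "All",
}
members_only = {**base, "member_status": "Members Only"}
refunds_only = {**base, "refund_only": True}

# Only the Python-side filter changed: the key is identical, no refetch.
same_key = build_fetch_key(base) == build_fetch_key(members_only)
# A SQL-side filter changed: the key differs, a refetch is needed.
new_key = build_fetch_key(base) != build_fetch_key(refunds_only)
```

Keeping the key limited to SQL-side values means switching Customer Type only re-slices the DataFrame already held in session state.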

visualization/components/helpscout_analysis.py CHANGED

@@ -108,13 +108,21 @@ def render_helpscout_analysis(data_loader):
             key="hs_analysis_sources",
         )
-    row3_col1, row3_col2, row3_col3 = st.columns(3)
     with row3_col1:
         refund_only = st.checkbox("Refund Requests Only", key="hs_analysis_refund")
     with row3_col2:
         cancel_only = st.checkbox("Cancellations Only", key="hs_analysis_cancel")
     with row3_col3:
         membership_only = st.checkbox("Membership Joins Only", key="hs_analysis_membership")
     st.markdown("---")
@@ -170,6 +178,7 @@ def render_helpscout_analysis(data_loader):
             "refund_only": refund_only,
             "cancel_only": cancel_only,
             "membership_only": membership_only,
         }
         st.session_state["hs_analysis_df"] = result_df
         st.session_state["hs_analysis_fetch_key"] = fetch_key
@@ -183,9 +192,25 @@ def render_helpscout_analysis(data_loader):
     if not has_data and not fetch_clicked:
         return
-    analysis_df = st.session_state.get("hs_analysis_df", pd.DataFrame())
     filter_desc = st.session_state.get("hs_analysis_filter_desc", "No filters applied")
     if analysis_df.empty:
         st.warning("No conversations found for the selected filters. Try adjusting and re-fetching.")
         return
@@ -238,6 +263,22 @@ def render_helpscout_analysis(data_loader):
         st.plotly_chart(charts.create_volume_timeline(analysis_df, title="Volume Over Time"),
                         use_container_width=True, key="hs_analysis_vol_timeline2")
     st.markdown("---")
     # ── AI Summary Report ─────────────────────────────────────────────────────

             key="hs_analysis_sources",
         )
+    row3_col1, row3_col2, row3_col3, row3_col4 = st.columns(4)
     with row3_col1:
         refund_only = st.checkbox("Refund Requests Only", key="hs_analysis_refund")
     with row3_col2:
         cancel_only = st.checkbox("Cancellations Only", key="hs_analysis_cancel")
     with row3_col3:
         membership_only = st.checkbox("Membership Joins Only", key="hs_analysis_membership")
+    with row3_col4:
+        member_status_filter = st.selectbox(
+            "Customer Type",
+            options=["All", "Members Only", "Non-Members Only"],
+            index=0,
+            help="Members are customers whose email matches a Musora user account.",
+            key="hs_analysis_member_status",
+        )
     st.markdown("---")
             "refund_only": refund_only,
             "cancel_only": cancel_only,
             "membership_only": membership_only,
+            "member_status": member_status_filter,
         }
         st.session_state["hs_analysis_df"] = result_df
         st.session_state["hs_analysis_fetch_key"] = fetch_key
     if not has_data and not fetch_clicked:
         return
+    analysis_df = st.session_state.get("hs_analysis_df", pd.DataFrame()).copy()
     filter_desc = st.session_state.get("hs_analysis_filter_desc", "No filters applied")
+    # Derive is_member from dashboard df (always, so breakdown charts work on "All" too)
+    if "customer_email" in analysis_df.columns:
+        hs_dashboard = st.session_state.get("helpscout_df", pd.DataFrame())
+        if "is_member" in hs_dashboard.columns and not hs_dashboard.empty:
+            member_emails = set(
+                hs_dashboard[hs_dashboard["is_member"]]["customer_email"].str.lower().dropna()
+            )
+            analysis_df["is_member"] = analysis_df["customer_email"].str.lower().isin(member_emails)
+            # Apply filter when a specific group is selected
+            if member_status_filter == "Members Only":
+                analysis_df = analysis_df[analysis_df["is_member"]]
+            elif member_status_filter == "Non-Members Only":
+                analysis_df = analysis_df[~analysis_df["is_member"]]
+        elif member_status_filter != "All":
+            st.warning("Member data not available — customer emails could not be matched to Musora records.")
     if analysis_df.empty:
         st.warning("No conversations found for the selected filters. Try adjusting and re-fetching.")
         return
         st.plotly_chart(charts.create_volume_timeline(analysis_df, title="Volume Over Time"),
                         use_container_width=True, key="hs_analysis_vol_timeline2")
+    # Member vs Non-Member breakdown (rendered whenever is_member could be derived)
+    if "is_member" in analysis_df.columns:
+        st.markdown("### 👤 Member vs Non-Member")
+        col1, col2 = st.columns(2)
+        with col1:
+            st.plotly_chart(charts.create_member_status_chart(analysis_df,
+                            title="Member vs Non-Member"),
+                            use_container_width=True, key="hs_analysis_member_pie")
+        with col2:
+            st.plotly_chart(charts.create_member_sentiment_chart(analysis_df,
+                            title="Sentiment by Member Status"),
+                            use_container_width=True, key="hs_analysis_member_sentiment")
+        st.plotly_chart(charts.create_member_topic_chart(analysis_df,
+                        title="Top Topics by Member Status"),
+                        use_container_width=True, key="hs_analysis_member_topics")
     st.markdown("---")
     # ── AI Summary Report ─────────────────────────────────────────────────────
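
The derive-then-filter step added above can be exercised outside Streamlit. The sketch below pulls the same logic into a pure function; `apply_member_filter` is a hypothetical name, not part of this commit, and the sample data is invented for illustration.

```python
import pandas as pd


def apply_member_filter(analysis_df, dashboard_df, choice="All"):
    """Tag rows with is_member via a case-insensitive email lookup,
    then optionally keep only one customer group."""
    out = analysis_df.copy()
    if "customer_email" not in out.columns or "is_member" not in dashboard_df.columns:
        return out  # member data unavailable: leave the frame untouched
    member_emails = set(
        dashboard_df.loc[dashboard_df["is_member"], "customer_email"]
        .str.lower()
        .dropna()
    )
    # Missing emails never match, so they count as non-members.
    out["is_member"] = out["customer_email"].str.lower().isin(member_emails)
    if choice == "Members Only":
        out = out[out["is_member"]]
    elif choice == "Non-Members Only":
        out = out[~out["is_member"]]
    return out


dashboard = pd.DataFrame({
    "customer_email": ["A@x.com", "b@x.com", None],
    "is_member": [True, False, False],
})
analysis = pd.DataFrame({
    "conversation_id": [1, 2, 3],
    "customer_email": ["a@x.com", "B@x.com", None],
})

all_rows = apply_member_filter(analysis, dashboard, "All")
members = apply_member_filter(analysis, dashboard, "Members Only")
non_members = apply_member_filter(analysis, dashboard, "Non-Members Only")
```

With "All" selected the column is still added, which is what lets the breakdown charts render without any filter applied.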

visualization/components/helpscout_dashboard.py CHANGED

@@ -57,6 +57,27 @@ def render_helpscout_dashboard(data_loader, date_range=None):
     charts = HelpScoutCharts()
     taxonomy = load_topic_taxonomy()
     # ── PDF Export ────────────────────────────────────────────────────────────
     with st.expander("📄 Export PDF Report", expanded=False):
         st.markdown(
@@ -201,6 +222,37 @@ def render_helpscout_dashboard(data_loader, date_range=None):
         with col2:
             st.plotly_chart(charts.create_thread_count_histogram(hs_df), use_container_width=True)
     # ── Demographics ─────────────────────────────────────────────────────────
     has_demographics = (
         "age_group" in hs_df.columns

     charts = HelpScoutCharts()
     taxonomy = load_topic_taxonomy()
+    # ── Member Status Filter ───────────────────────────────────────────────────
+    has_member_data = "is_member" in hs_df.columns
+    if has_member_data:
+        member_filter = st.radio(
+            "Show conversations for:",
+            options=["All Customers", "Members Only", "Non-Members Only"],
+            horizontal=True,
+            key="hs_dash_member_filter",
+        )
+        if member_filter == "Members Only":
+            hs_df = hs_df[hs_df["is_member"]]
+        elif member_filter == "Non-Members Only":
+            hs_df = hs_df[~hs_df["is_member"]]
+        if member_filter != "All Customers" and hs_df.empty:
+            st.warning(f"No conversations found for {member_filter.lower().replace(' only', '')}.")
+            return
+        if member_filter != "All Customers":
+            st.info(f"Filtered to **{len(hs_df):,}** conversations from {member_filter.lower().replace(' only', '')}.")
+    else:
+        st.info("ℹ️ Member data not available — customer emails could not be matched to Musora user records.")
     # ── PDF Export ────────────────────────────────────────────────────────────
     with st.expander("📄 Export PDF Report", expanded=False):
         st.markdown(
         with col2:
             st.plotly_chart(charts.create_thread_count_histogram(hs_df), use_container_width=True)
+    # ── Member vs Non-Member ─────────────────────────────────────────────────
+    if "is_member" in hs_df.columns:
+        st.markdown("---")
+        st.markdown("## 👤 Member vs Non-Member")
+        st.caption(
+            "Conversations are classified as **Member** when the customer email matches "
+            "a Musora user account, and **Non-Member** otherwise."
+        )
+        member_count     = int(hs_df["is_member"].sum())
+        non_member_count = total - member_count
+        match_pct        = member_count / total * 100 if total else 0
+        mv1, mv2, mv3 = st.columns(3)
+        mv1.metric("Members",      f"{member_count:,}",
+                   f"{match_pct:.1f}% of conversations" if total else None)
+        mv2.metric("Non-Members",  f"{non_member_count:,}",
+                   f"{100 - match_pct:.1f}% of conversations" if total else None)
+        mv3.metric("Email Match Rate", f"{match_pct:.1f}%")
+        mem_col1, mem_col2 = st.columns(2)
+        with mem_col1:
+            st.plotly_chart(charts.create_member_status_chart(hs_df),
+                            use_container_width=True, key="hs_dash_member_pie")
+        with mem_col2:
+            st.plotly_chart(charts.create_member_sentiment_chart(hs_df),
+                            use_container_width=True, key="hs_dash_member_sentiment")
+        st.plotly_chart(charts.create_member_topic_chart(hs_df),
+                        use_container_width=True, key="hs_dash_member_topics")
     # ── Demographics ─────────────────────────────────────────────────────────
     has_demographics = (
         "age_group" in hs_df.columns
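
The three member KPIs in the section above reduce to a count and a guarded division. A minimal sketch, assuming `total` is the row count of the frame being displayed (the helper name `member_kpis` is hypothetical):

```python
import pandas as pd


def member_kpis(df):
    """Return member count, non-member count, and email match rate (%)."""
    total = len(df)
    member_count = int(df["is_member"].sum())
    non_member_count = total - member_count
    # Guard against an empty frame to avoid ZeroDivisionError.
    match_pct = member_count / total * 100 if total else 0.0
    return {
        "members": member_count,
        "non_members": non_member_count,
        "match_pct": match_pct,
    }


sample = pd.DataFrame({"is_member": [True, False, False, False]})
empty = pd.DataFrame({"is_member": pd.Series([], dtype=bool)})

sample_kpis = member_kpis(sample)
empty_kpis = member_kpis(empty)
label = f"{sample_kpis['match_pct']:.1f}%"
```

Under that assumption, selecting "Members Only" in the radio at the top would make this section report a 100% match rate, since the frame has already been narrowed before the section renders.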

visualization/data/helpscout_data_loader.py CHANGED

@@ -297,11 +297,13 @@ class HelpScoutDataLoader:
         if demo_df.empty or "customer_email" not in df.columns:
             for col, val in [("age", None), ("age_group", "Unknown"),
                              ("timezone", None), ("timezone_region", "Unknown"),
-                             ("experience_level", None), ("experience_group", "Unknown")]:
                 df[col] = val
             return df
         if "customer_email" not in demo_df.columns:
             return df
         merge_cols = ["customer_email"]
@@ -311,6 +313,10 @@ class HelpScoutDataLoader:
         merged = df.merge(demo_df[merge_cols], on="customer_email", how="left")
         for col in ["age_group", "timezone_region", "experience_group"]:
             if col in merged.columns:
                 merged[col] = merged[col].fillna("Unknown")

         if demo_df.empty or "customer_email" not in df.columns:
             for col, val in [("age", None), ("age_group", "Unknown"),
                              ("timezone", None), ("timezone_region", "Unknown"),
+                             ("experience_level", None), ("experience_group", "Unknown"),
+                             ("is_member", False)]:
                 df[col] = val
             return df
         if "customer_email" not in demo_df.columns:
+            df["is_member"] = False
             return df
         merge_cols = ["customer_email"]
         merged = df.merge(demo_df[merge_cols], on="customer_email", how="left")
+        # is_member: True when the customer email matched a Musora user record
+        member_emails = set(demo_df["customer_email"].str.lower().dropna())
+        merged["is_member"] = merged["customer_email"].str.lower().isin(member_emails)
         for col in ["age_group", "timezone_region", "experience_group"]:
             if col in merged.columns:
                 merged[col] = merged[col].fillna("Unknown")
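
In the hunk above, the merge keys on `customer_email` as-is while `is_member` compares lowercased emails, so the two could disagree if casing differs between the frames. Whether that can happen depends on how `demo_df` is built upstream, which this diff does not show. As an alternative sketch (not the commit's method), both can be driven from one normalized key using the `indicator` option of `DataFrame.merge`; the helper name and sample data are hypothetical.

```python
import pandas as pd


def merge_with_member_flag(df, demo_df):
    """Left-merge demographics on a lowercased email key and derive
    is_member from the merge indicator."""
    left = df.copy()
    right = demo_df.copy()
    left["_email_key"] = left["customer_email"].str.lower()
    right["_email_key"] = right["customer_email"].str.lower()
    # Drop missing and duplicate keys so the merge cannot match on NaN
    # or multiply rows.
    right = right.dropna(subset=["_email_key"]).drop_duplicates("_email_key")
    merged = left.merge(
        right.drop(columns=["customer_email"]),
        on="_email_key",
        how="left",
        indicator=True,
    )
    merged["is_member"] = merged["_merge"] == "both"
    return merged.drop(columns=["_email_key", "_merge"])


conversations = pd.DataFrame({
    "conversation_id": [1, 2, 3],
    "customer_email": ["Ann@X.com", "bob@x.com", None],
})
demographics = pd.DataFrame({
    "customer_email": ["ann@x.com"],
    "age_group": ["25-34"],
})

merged = merge_with_member_flag(conversations, demographics)
```

The set-lookup approach in the commit is simpler; this variant only matters if mixed-case emails are expected on either side.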

visualization/utils/helpscout_pdf.py CHANGED

@@ -89,6 +89,7 @@ class HelpScoutDashboardPDF:
             self._status_source_section(df)
             self._timelines_section(df)
             self._depth_section(df)
             self._data_summary(df, filter_info)
             return bytes(self.pdf.output())
         finally:
@@ -237,6 +238,33 @@ class HelpScoutDashboardPDF:
         thd = self.charts.create_thread_count_histogram(df)
         self._add_two_charts(dur, thd)
     def _data_summary(self, df, filter_info):
         self.pdf.add_page()
         self.pdf.section_header("Data Summary")
@@ -395,6 +423,14 @@ class HelpScoutAnalysisPDF:
             ("Cancellations",     f"{flags['is_cancellation']:,}"),
             ("Membership Joins",  f"{flags['is_membership']:,}"),
         ])
     def _distributions_section(self, df):
         self.pdf.add_page()
@@ -403,6 +439,17 @@ class HelpScoutAnalysisPDF:
         tbar = self.charts.create_topic_bar_chart(df, title="Topic Distribution")
         self._add_two_charts(pie, tbar)
         self._add_chart(self.charts.create_topic_sentiment_heatmap(df), img_h=500)
     def _summary_section(self, result: dict):
         self.pdf.add_page()

             self._status_source_section(df)
             self._timelines_section(df)
             self._depth_section(df)
+            self._member_section(df)
             self._data_summary(df, filter_info)
             return bytes(self.pdf.output())
         finally:
         thd = self.charts.create_thread_count_histogram(df)
         self._add_two_charts(dur, thd)
+    def _member_section(self, df):
+        if "is_member" not in df.columns:
+            return
+        self.pdf.add_page()
+        self.pdf.section_header("Member vs Non-Member Analysis")
+        total = len(df)
+        member_count     = int(df["is_member"].sum())
+        non_member_count = total - member_count
+        match_pct        = member_count / total * 100 if total else 0
+        self.pdf.metric_row([
+            ("Members",          f"{member_count:,}"),
+            ("Non-Members",      f"{non_member_count:,}"),
+            ("Email Match Rate", f"{match_pct:.1f}%"),
+        ])
+        self.pdf.body_text(
+            "Members are customers whose email was matched against Musora user records. "
+            "Non-Members contacted support without an associated Musora account."
+        )
+        self._add_two_charts(
+            self.charts.create_member_status_chart(df, title="Member vs Non-Member"),
+            self.charts.create_member_sentiment_chart(df, title="Sentiment by Member Status"),
+        )
+        self._add_chart(
+            self.charts.create_member_topic_chart(df, title="Top Topics by Member Status"),
+            img_h=500,
+        )
     def _data_summary(self, df, filter_info):
         self.pdf.add_page()
         self.pdf.section_header("Data Summary")
             ("Cancellations",     f"{flags['is_cancellation']:,}"),
             ("Membership Joins",  f"{flags['is_membership']:,}"),
         ])
+        if "is_member" in df.columns:
+            member_count = int(df["is_member"].sum())
+            non_member_count = total - member_count
+            self.pdf.metric_row([
+                ("Members",          f"{member_count:,}"),
+                ("Non-Members",      f"{non_member_count:,}"),
+                ("Email Match Rate", f"{member_count / total * 100:.1f}%" if total else "N/A"),
+            ])
     def _distributions_section(self, df):
         self.pdf.add_page()
         tbar = self.charts.create_topic_bar_chart(df, title="Topic Distribution")
         self._add_two_charts(pie, tbar)
         self._add_chart(self.charts.create_topic_sentiment_heatmap(df), img_h=500)
+        if "is_member" in df.columns:
+            self.pdf.add_page()
+            self.pdf.section_header("Member vs Non-Member Breakdown")
+            self._add_two_charts(
+                self.charts.create_member_status_chart(df, title="Member vs Non-Member"),
+                self.charts.create_member_sentiment_chart(df, title="Sentiment by Member Status"),
+            )
+            self._add_chart(
+                self.charts.create_member_topic_chart(df, title="Top Topics by Member Status"),
+                img_h=500,
+            )
     def _summary_section(self, result: dict):
         self.pdf.add_page()

visualization/utils/helpscout_utils.py CHANGED

@@ -104,4 +104,7 @@ def build_filter_description(filters: dict, taxonomy: dict) -> str:
         parts.append("Cancellations only")
     if filters.get("membership_only"):
         parts.append("Membership requests only")
     return "; ".join(parts) if parts else "No filters applied — showing all conversations"

         parts.append("Cancellations only")
     if filters.get("membership_only"):
         parts.append("Membership requests only")
+    member_status = filters.get("member_status", "All")
+    if member_status and member_status != "All":
+        parts.append(f"Customer type: {member_status}")
     return "; ".join(parts) if parts else "No filters applied — showing all conversations"

visualization/visualizations/helpscout_charts.py CHANGED

@@ -415,6 +415,73 @@ class HelpScoutCharts:
                               yaxis_title="Conversations", height=self.chart_height)
         return fig
     # ─────────────────────────────────────────────────────────────
     # Helpers
     # ─────────────────────────────────────────────────────────────

                               yaxis_title="Conversations", height=self.chart_height)
         return fig
+    # ─────────────────────────────────────────────────────────────
+    # Member vs Non-Member charts
+    # ─────────────────────────────────────────────────────────────
+    def create_member_status_chart(self, df, title="Member vs Non-Member"):
+        """Pie chart: proportion of conversations from Musora members vs non-members."""
+        if "is_member" not in df.columns:
+            return self._empty_fig(title, "No member data available")
+        label_map = {True: "Member", False: "Non-Member"}
+        counts = df["is_member"].map(label_map).value_counts()
+        color_map = {"Member": "#1982C4", "Non-Member": "#FF6B35"}
+        colors = [color_map.get(l, "#CCCCCC") for l in counts.index]
+        fig = go.Figure(go.Pie(
+            labels=counts.index, values=counts.values,
+            marker=dict(colors=colors),
+            textinfo="label+percent",
+            hovertemplate="<b>%{label}</b><br>Count: %{value}<br>%{percent}<extra></extra>",
+        ))
+        fig.update_layout(title=title, height=self.chart_height,
+                          legend=dict(orientation="v", yanchor="middle", y=0.5))
+        return fig
+    def create_member_sentiment_chart(self, df, title="Sentiment by Member Status"):
+        """Stacked bar: sentiment distribution split by member vs non-member."""
+        if "is_member" not in df.columns or "sentiment_polarity" not in df.columns:
+            return self._empty_fig(title, "No member/sentiment data available")
+        df_c = df.copy()
+        df_c["member_status"] = df_c["is_member"].map({True: "Member", False: "Non-Member"})
+        pivot = pd.crosstab(df_c["member_status"], df_c["sentiment_polarity"])
+        ordered_cols = [s for s in self.sentiment_order if s in pivot.columns]
+        pivot = pivot[ordered_cols] if ordered_cols else pivot
+        fig = go.Figure()
+        for s in (ordered_cols or pivot.columns.tolist()):
+            fig.add_trace(go.Bar(
+                name=s, x=pivot.index, y=pivot[s],
+                marker_color=self.sentiment_colors.get(s, "#CCCCCC"),
+                hovertemplate="<b>%{x}</b><br>%{y}<extra></extra>",
+            ))
+        fig.update_layout(title=title, barmode="stack", xaxis_title="Customer Type",
+                          yaxis_title="Conversations", height=self.chart_height)
+        return fig
+    def create_member_topic_chart(self, df, title="Top Topics by Member Status"):
+        """Grouped bar: top-10 topics split by member vs non-member."""
+        if "is_member" not in df.columns:
+            return self._empty_fig(title, "No member data available")
+        exploded = explode_topics(df)
+        if exploded.empty:
+            return self._empty_fig(title, "No topic data")
+        exploded["member_status"] = exploded["is_member"].map({True: "Member", False: "Non-Member"})
+        top_topics = exploded["topic_id"].value_counts().head(10).index.tolist()
+        exploded = exploded[exploded["topic_id"].isin(top_topics)]
+        pivot = pd.crosstab(exploded["topic_id"], exploded["member_status"])
+        pivot.index = [topic_label(t, self.taxonomy) for t in pivot.index]
+        fig = go.Figure()
+        color_map = {"Member": "#1982C4", "Non-Member": "#FF6B35"}
+        for col in pivot.columns:
+            fig.add_trace(go.Bar(
+                name=col, y=pivot.index, x=pivot[col], orientation="h",
+                marker_color=color_map.get(col, "#CCCCCC"),
+                hovertemplate="<b>%{y}</b><br>%{x}<extra></extra>",
+            ))
+        fig.update_layout(title=title, barmode="group", xaxis_title="Conversations",
+                          yaxis_title="Topic", height=self.chart_height + 80,
+                          yaxis={"categoryorder": "total ascending"})
+        return fig
     # ─────────────────────────────────────────────────────────────
     # Helpers
     # ─────────────────────────────────────────────────────────────
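
The stacked sentiment chart above is built on a `pd.crosstab` pivot. The sketch below isolates that pivot step without Plotly so its shape is easy to inspect. `SENTIMENT_ORDER` stands in for `self.sentiment_order`, whose actual values are not shown in this diff, and the sample data is invented.

```python
import pandas as pd

# Assumed ordering; the real list lives on HelpScoutCharts.
SENTIMENT_ORDER = ["negative", "neutral", "positive"]


def member_sentiment_pivot(df):
    """Count conversations per (member status, sentiment) pair."""
    status = df["is_member"].map({True: "Member", False: "Non-Member"})
    pivot = pd.crosstab(status, df["sentiment_polarity"])
    # Keep only known sentiments, in a stable display order.
    ordered = [s for s in SENTIMENT_ORDER if s in pivot.columns]
    return pivot[ordered] if ordered else pivot


sample = pd.DataFrame({
    "is_member": [True, True, False, False, False],
    "sentiment_polarity": ["positive", "negative", "negative", "negative", "neutral"],
})

pivot = member_sentiment_pivot(sample)
```

Each pivot column then becomes one `go.Bar` trace, with the row index ("Member", "Non-Member") on the x-axis.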