Akis Giannoukos committed
Commit 09716a4
1 Parent(s): 9d16b48

Added explainability

Files changed (2)
  1. README.md +45 -2
  2. app.py +327 -32
README.md CHANGED
@@ -17,6 +17,9 @@ A lightweight research demo that simulates a clinician conducting a brief conver
  ## What it does
  - Conversational assessment to infer PHQ‑9 items from natural dialogue (no explicit questionnaire).
  - Live inference of PHQ‑9 item scores, confidences, total score, and severity.
+ - Iterative light explainability after each turn to guide the next question (strong/weak/missing evidence by item).
+ - Final explainability at session end aggregating linguistic quotes and acoustic prosody.
+ - Self‑reflection step that checks consistency and may adjust low‑confidence item scores.
  - Automatic stop when minimum confidence across items reaches a threshold or risk is detected.
  - Optional TTS playback for clinician responses.
 
 
@@ -78,10 +81,50 @@ Notes:
  ## Safety
  This demo does not provide therapy or emergency counseling. If a user expresses suicidal intent or risk is inferred, the app ends the conversation and advises contacting emergency services (e.g., 988 in the U.S.).

+ ## Architecture
+ RecordingAgent → ScoringAgent → ExplainabilityModule (light/full) → ReflectionModule → ReportGenerator
+
+ - RecordingAgent: generates clinician follow‑ups; guided by light explainability when available.
+ - ScoringAgent: infers PHQ‑9 item scores and per‑item confidences from the transcript (+ prosody summary).
+ - Explainability (light): keyword‑based evidence strength per item; selects the next focus area.
+ - Explainability (full): aggregates transcript quotes and averaged prosody features into per‑item objects.
+ - Reflection: heuristic pass that reduces scores by 1 for items with confidence < τ and missing evidence.
+ - ReportGenerator: patient and clinician summaries, confidence bars, highlights, and reflection notes.
+
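+ As a minimal sketch of the per‑turn loop (using the function signatures defined in app.py; scoring and error handling omitted):
+
+ ```python
+ # After each patient turn: explain current evidence, then ask a guided follow-up.
+ # chat_history, scores, confidences, threshold are assumed to be in scope.
+ light = explainability_light(chat_history, scores, confidences, threshold)
+ reply = generate_recording_agent_reply(chat_history, guidance=light)
+ ```
+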
+ ### Output objects
+ - Explainability (light):
+ ```json
+ {
+   "evidence_strength": {"appetite": "missing", ...},
+   "recommended_focus": "appetite",
+   "quotes": {"appetite": ["..."], ...},
+   "confidences": {"appetite": 0.34, ...}
+ }
+ ```
+ - Explainability (full):
+ ```json
+ {
+   "items": [
+     {"item": "appetite", "confidence": 0.42, "evidence": ["..."], "prosody": ["rms_mean=0.012", "zcr_mean=0.065", ...]}
+   ],
+   "notes": "Heuristic placeholder"
+ }
+ ```
+ - Reflection report:
+ ```json
+ {
+   "corrected_scores": {"appetite": 1, ...},
+   "final_total": 12,
+   "severity_label": "Moderate Depression",
+   "consistency_score": 0.89,
+   "notes": "Model revised appetite score due to low confidence and missing evidence."
+ }
+ ```
+
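+ For reference, `consistency_score` is computed as 1 - (revised items / 9 PHQ‑9 items); the single revision in this example gives 8/9 ≈ 0.89.
+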
  ## Development notes
  - Framework: Gradio Blocks
  - ASR: Transformers pipeline (Whisper)
- - TTS: gTTS
- - Prosody features: librosa (lightweight proxies) for the scoring prompt
+ - TTS: gTTS or Coqui TTS
+ - Prosody features: librosa proxies; replaceable by OpenSMILE
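+
+ A minimal sketch of the librosa-based proxies (assumptions: 16 kHz mono input; the feature names mirror the `prosody` strings in the example above):
+
+ ```python
+ import librosa
+ import numpy as np
+
+ def prosody_proxies(wav_path: str) -> dict:
+     """Cheap prosody summaries: mean RMS energy and mean zero-crossing rate."""
+     y, _sr = librosa.load(wav_path, sr=16000)
+     return {
+         "rms_mean": float(np.mean(librosa.feature.rms(y=y))),
+         "zcr_mean": float(np.mean(librosa.feature.zero_crossing_rate(y))),
+     }
+ ```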

  PRs and experiments are welcome. This is a research prototype and not a clinical tool.
app.py CHANGED
@@ -288,6 +288,36 @@ def severity_from_total(total_score: int) -> str:
      return "Severe Depression"


+ # ---------------------------
+ # PHQ-9 schema and helpers
+ # ---------------------------
+ PHQ9_KEYS_ORDERED: List[str] = [
+     "interest",
+     "mood",
+     "sleep",
+     "energy",
+     "appetite",
+     "self_worth",
+     "concentration",
+     "motor",
+     "suicidal_thoughts",
+ ]
+
+ # Lightweight keyword lexicon per item for evidence extraction.
+ # Placeholder for future SHAP/attention-based attributions.
+ PHQ9_KEYWORDS: Dict[str, List[str]] = {
+     "interest": ["interest", "pleasure", "enjoy", "motivation", "hobbies"],
+     "mood": ["depressed", "down", "sad", "hopeless", "blue", "mood"],
+     "sleep": ["sleep", "insomnia", "awake", "wake up", "night", "dream"],
+     "energy": ["tired", "fatigue", "energy", "exhausted", "worn out"],
+     "appetite": ["appetite", "eat", "eating", "hungry", "food", "weight"],
+     "self_worth": ["worthless", "failure", "guilty", "guilt", "self-esteem", "ashamed"],
+     "concentration": ["concentrate", "focus", "attention", "distracted", "remember"],
+     "motor": ["restless", "slow", "slowed", "agitated", "fidget", "move"],
+     "suicidal_thoughts": ["suicide", "kill myself", "die", "end my life", "self-harm", "hurt myself"],
+ }
+
+
  def transcript_to_text(chat_history: List[Tuple[str, str]]) -> str:
      """Convert chatbot history [(user, assistant), ...] to a plain text transcript."""
      lines = []
 
@@ -299,15 +329,195 @@ def transcript_to_text(chat_history: List[Tuple[str, str]]) -> str:
      return "\n".join(lines)


+ def _patient_sentences(chat_history: List[Tuple[str, str]]) -> List[str]:
+     """Extract patient-only sentences from chat history."""
+     sentences: List[str] = []
+     for user, _assistant in chat_history:
+         if not user:
+             continue
+         parts = re.split(r"(?<=[.!?])\s+", user.strip())
+         for p in parts:
+             p = p.strip()
+             if p:
+                 sentences.append(p)
+     return sentences
+
+
+ def _extract_quotes_per_item(chat_history: List[Tuple[str, str]]) -> Dict[str, List[str]]:
+     """Heuristic extraction of per-item evidence quotes from patient sentences based on keywords."""
+     quotes: Dict[str, List[str]] = {k: [] for k in PHQ9_KEYS_ORDERED}
+     sentences = _patient_sentences(chat_history)
+     for sent in sentences:
+         s_low = sent.lower()
+         for item, kws in PHQ9_KEYWORDS.items():
+             if any(kw in s_low for kw in kws):
+                 if len(quotes[item]) < 5:
+                     quotes[item].append(sent)
+     return quotes
+
+
+ def explainability_light(
+     chat_history: List[Tuple[str, str]],
+     scores: Dict[str, int],
+     confidences: List[float],
+     threshold: float,
+ ) -> Dict[str, Any]:
+     """Lightweight explainability per turn.
+
+     - Inspects transcript for keyword-based evidence per PHQ-9 item.
+     - Classifies evidence strength as strong/weak/missing using keyword hits and confidence.
+     - Suggests next focus item based on lowest-confidence or missing evidence.
+
+     Returns a JSON-serializable dict.
+     """
+     quotes = _extract_quotes_per_item(chat_history)
+     conf_map: Dict[str, float] = {}
+     for idx, key in enumerate(PHQ9_KEYS_ORDERED):
+         conf_map[key] = float(confidences[idx] if idx < len(confidences) else 0.0)
+
+     evidence_strength: Dict[str, str] = {}
+     for key in PHQ9_KEYS_ORDERED:
+         hits = len(quotes.get(key, []))
+         conf = conf_map.get(key, 0.0)
+         if hits >= 2 and conf >= max(0.6, threshold - 0.1):
+             evidence_strength[key] = "strong"
+         elif hits >= 1 or conf >= 0.4:
+             evidence_strength[key] = "weak"
+         else:
+             evidence_strength[key] = "missing"
+
+     low_items = sorted(
+         PHQ9_KEYS_ORDERED,
+         key=lambda k: (evidence_strength[k] != "missing", conf_map.get(k, 0.0))
+     )
+     recommended = low_items[0] if low_items else None
+
+     return {
+         "evidence_strength": evidence_strength,
+         "low_confidence_items": [k for k in sorted(PHQ9_KEYS_ORDERED, key=lambda x: conf_map.get(x, 0.0))],
+         "recommended_focus": recommended,
+         "quotes": quotes,
+         "confidences": conf_map,
+     }
+
+
+ def explainability_full(
+     chat_history: List[Tuple[str, str]],
+     confidences: List[float],
+     features_history: Optional[List[Dict[str, float]]],
+ ) -> Dict[str, Any]:
+     """Aggregate linguistic and acoustic attributions at session end.
+
+     - Linguistic: keyword-based quotes per item (placeholder for SHAP/attention).
+     - Acoustic: mean of per-turn prosodic features; returned as name=value strings.
+     """
+     def _aggregate_prosody(history: List[Dict[str, float]]) -> Dict[str, float]:
+         agg: Dict[str, float] = {}
+         if not history:
+             return agg
+         keys = set().union(*[d.keys() for d in history if isinstance(d, dict)])
+         for k in keys:
+             vals = [float(d[k]) for d in history if isinstance(d, dict) and k in d]
+             if vals:
+                 agg[k] = float(np.mean(vals))
+         return agg
+
+     quotes = _extract_quotes_per_item(chat_history)
+     conf_map = {k: float(confidences[i] if i < len(confidences) else 0.0) for i, k in enumerate(PHQ9_KEYS_ORDERED)}
+     prosody_agg = _aggregate_prosody(list(features_history or []))
+     prosody_pairs = sorted(list(prosody_agg.items()), key=lambda kv: -abs(kv[1]))
+     prosody_names = [f"{k}={v:.3f}" for k, v in prosody_pairs[:8]]
+
+     items = []
+     for k in PHQ9_KEYS_ORDERED:
+         items.append({
+             "item": k,
+             "confidence": conf_map.get(k, 0.0),
+             "evidence": quotes.get(k, [])[:5],
+             "prosody": prosody_names,
+         })
+     return {
+         "items": items,
+         "notes": "Heuristic keyword and prosody aggregation; plug in SHAP/attention later.",
+     }
+
+
+ def reflection_module(
+     scores: Dict[str, int],
+     confidences: List[float],
+     exp_light: Optional[Dict[str, Any]],
+     exp_full: Optional[Dict[str, Any]],
+     threshold: float,
+ ) -> Dict[str, Any]:
+     """Self-reflection / output reevaluation.
+
+     Heuristic: if confidence for an item < threshold and evidence is missing, reduce score by 1 (min 0).
+     Returns a `reflection_report` JSON with corrected scores and final summary.
+     """
+     corrected = dict(scores or {})
+     strength = (exp_light or {}).get("evidence_strength", {}) if isinstance(exp_light, dict) else {}
+     changes: List[Tuple[str, int, int]] = []
+     for i, k in enumerate(PHQ9_KEYS_ORDERED):
+         conf = float(confidences[i] if i < len(confidences) else 0.0)
+         if conf < float(threshold) and strength.get(k) == "missing":
+             new_val = max(0, int(corrected.get(k, 0)) - 1)
+             if new_val != corrected.get(k, 0):
+                 changes.append((k, int(corrected.get(k, 0)), new_val))
+                 corrected[k] = new_val
+     final_total = int(sum(corrected.values()))
+     final_sev = severity_from_total(final_total)
+     consistency = float(1.0 - (len(changes) / max(1, len(PHQ9_KEYS_ORDERED))))
+     if changes:
+         notes = ", ".join([f"{k}: {old}->{new}" for k, old, new in changes])
+         notes = f"Model revised items due to low confidence and missing evidence: {notes}."
+     else:
+         notes = "No score revisions; explanations consistent with outputs."
+     return {
+         "corrected_scores": corrected,
+         "final_total": final_total,
+         "severity_label": final_sev,
+         "consistency_score": consistency,
+         "notes": notes,
+     }
+
+
  def build_patient_summary(chat_history: List[Tuple[str, str]], meta: Dict[str, Any], display_json: Dict[str, Any]) -> str:
      severity = meta.get("Severity") or display_json.get("Severity")
      total = meta.get("Total_Score") or display_json.get("Total_Score")
      transcript_text = transcript_to_text(chat_history)
-     return (
-         "# Summary for Patient\n\n"
-         "### Conversation Transcript\n\n"
-         f"```\n{transcript_text}\n```"
-     )
+     # Optional enriched content
+     exp_full = display_json.get("Explainability_Full") or {}
+     reflection = display_json.get("Reflection_Report") or {}
+
+     lines = []
+     lines.append("# Summary for Patient\n")
+     if total is not None and severity:
+         lines.append(f"- PHQ‑9 Total: **{total}** ")
+         lines.append(f"- Severity: **{severity}**\n")
+
+     # Highlights: show one quote per item if available
+     if exp_full and isinstance(exp_full, dict):
+         items = exp_full.get("items", [])
+         if isinstance(items, list) and items:
+             lines.append("### Highlights from our conversation\n")
+             for it in items:
+                 item = it.get("item")
+                 ev = it.get("evidence", [])
+                 if item and ev:
+                     lines.append(f"- {item}: \"{ev[0]}\"")
+             lines.append("")
+
+     if reflection:
+         note = reflection.get("notes")
+         if note:
+             lines.append("### Reflection\n")
+             lines.append(note)
+             lines.append("")
+
+     lines.append("### Conversation Transcript\n\n")
+     lines.append(f"```\n{transcript_text}\n```")
+     return "\n".join(lines)


  def build_clinician_summary(chat_history: List[Tuple[str, str]], meta: Dict[str, Any], display_json: Dict[str, Any]) -> str:
 
@@ -319,17 +529,75 @@ def build_clinician_summary(chat_history: List[Tuple[str, str]], meta: Dict[str,
      transcript_text = transcript_to_text(chat_history)
      scores_lines = "\n".join([f"- {k}: {v}" for k, v in scores.items()])
      conf_str = ", ".join([f"{c:.2f}" for c in confidences]) if confidences else ""
-     return (
-         "# Summary for Clinician\n\n"
-         f"- Severity: **{severity}** \n"
-         f"- PHQ‑9 Total: **{total}** \n"
-         # f"- High Risk: **{risk}**\n\n"
-         f"### Item Scores\n{scores_lines}\n\n"
-         "### Conversation Transcript\n\n"
-         f"```\n{transcript_text}\n```"
-     )
+     # Optional explainability
+     exp_light = display_json.get("Explainability_Light") or {}
+     exp_full = display_json.get("Explainability_Full") or {}
+     reflection = display_json.get("Reflection_Report") or {}
+
+     md = []
+     md.append("# Summary for Clinician\n")
+     md.append(f"- Severity: **{severity}** ")
+     md.append(f"- PHQ‑9 Total: **{total}** ")
+     if risk is not None:
+         md.append(f"- High Risk: **{risk}** ")
+     md.append("")
+     md.append("### Item Scores\n" + scores_lines + "\n")
+
+     # Confidence bars
+     if confidences:
+         bars = []
+         for i, k in enumerate(scores.keys()):
+             c = confidences[i] if i < len(confidences) else 0.0
+             bar_len = int(round(c * 20))
+             bars.append(f"- {k}: [{'#'*bar_len}{'.'*(20-bar_len)}] {c:.2f}")
+         md.append("### Confidence by item\n" + "\n".join(bars) + "\n")
+
+     # Light explainability snapshot
+     if exp_light:
+         strength = exp_light.get("evidence_strength", {})
+         recommended = exp_light.get("recommended_focus")
+         if strength:
+             md.append("### Evidence strength (light)\n")
+             md.extend([f"- {k}: {v}" for k, v in strength.items()])
+             md.append("")
+         if recommended:
+             md.append(f"- Next focus (if continuing): **{recommended}**\n")
+
+     # Full explainability excerpts
+     if exp_full and isinstance(exp_full, dict):
+         md.append("### Explainability (final)\n")
+         items = exp_full.get("items", [])
+         for it in items:
+             item = it.get("item")
+             conf = it.get("confidence")
+             ev = it.get("evidence", [])
+             pros = it.get("prosody", [])
+             if item:
+                 md.append(f"- {item} (conf {conf:.2f}):")
+                 for q in ev[:2]:
+                     md.append(f" - \"{q}\"")
+                 if pros:
+                     md.append(f" - prosody: {', '.join([str(p) for p in pros[:4]])}")
+         md.append("")
+
+     # Reflection summary
+     if reflection:
+         md.append("### Self-reflection\n")
+         notes = reflection.get("notes")
+         if notes:
+             md.append(notes)
+         corr = reflection.get("corrected_scores") or {}
+         if corr and corr != scores:
+             changed = [k for k in scores.keys() if corr.get(k) != scores.get(k)]
+             if changed:
+                 md.append("- Adjusted items: " + ", ".join(changed))
+         md.append("")
+
+     md.append("### Conversation Transcript\n\n")
+     md.append(f"```\n{transcript_text}\n```")
+     return "\n".join(md)
+
- def generate_recording_agent_reply(chat_history: List[Tuple[str, str]]) -> str:
+ def generate_recording_agent_reply(chat_history: List[Tuple[str, str]], guidance: Optional[Dict[str, Any]] = None) -> str:
      transcript = transcript_to_text(chat_history)
      system_prompt = (
          "You are a clinician conducting a conversational assessment to infer PHQ-9 symptoms "
 
@@ -337,9 +605,17 @@ def generate_recording_agent_reply(chat_history: List[Tuple[str, str]]) -> str:
          "Ask one concise, natural follow-up question at a time that helps infer symptoms such as mood, "
          "sleep, appetite, energy, concentration, self-worth, psychomotor changes, and suicidal thoughts."
      )
+     focus_text = ""
+     if guidance and isinstance(guidance, dict):
+         rec = guidance.get("recommended_focus")
+         if rec:
+             focus_text = (
+                 f"\n\nGuidance: Focus the next question on the patient's {str(rec).replace('_', ' ')}. "
+                 "Ask naturally about recent changes and their impact on daily life."
+             )
      user_prompt = (
          "Conversation so far (Patient and Clinician turns):\n\n" + transcript +
-         "\n\nRespond with a single short clinician-style question for the patient."
+         f"{focus_text}\n\nRespond with a single short clinician-style question for the patient."
      )
      pipe = get_textgen_pipeline()
      tokenizer = pipe.tokenizer
 
@@ -620,8 +896,10 @@ def process_turn(
          chat_history[-1] = (chat_history[-1][0], summary)
          finished = True
      else:
-         # Generate next clinician question
-         reply = generate_recording_agent_reply(chat_history)
+         # Iterative explainability (light) to guide next question
+         light_exp = explainability_light(chat_history, scores, confidences, float(threshold))
+         # Generate next clinician question with guidance
+         reply = generate_recording_agent_reply(chat_history, guidance=light_exp)
          chat_history[-1] = (chat_history[-1][0], reply)

      # TTS for the latest clinician message, if enabled
 
@@ -635,6 +913,9 @@ def process_turn(
          "Severity": severity,
          "Confidence": overall_conf,
          "High_Risk": high_risk,
+         # Include the last audio features and light explainability for downstream modules/UI
+         "Last_Audio_Features": audio_features,
+         "Explainability_Light": explainability_light(chat_history, scores, confidences, float(threshold)),
      }

      # Clear inputs after processing
 
@@ -782,6 +1063,7 @@ def create_demo():
      meta_state = gr.State()
      finished_state = gr.State()
      turns_state = gr.State()
+     feats_state = gr.State()

      # Initialize on load (no autoplay due to browser policies)
      demo.load(_on_load_init, inputs=None, outputs=[chatbot, scores_state, meta_state, finished_state, turns_state])
 
@@ -802,44 +1084,57 @@ def create_demo():
      intro_play_btn.click(fn=_play_intro_tts, inputs=[tts_enable], outputs=[tts_audio_main])

      # Wire interactions
-     def _process_with_tts(audio, text, chat, th, tts_on, finished, turns, scores, meta, provider, coqui_model, coqui_speaker):
+     def _process_with_tts(audio, text, chat, th, tts_on, finished, turns, scores, meta, provider, coqui_model, coqui_speaker, feats_hist):
          result = process_turn(audio, text, chat, th, tts_on, finished, turns, scores, meta)
          chat_history, display_json, severity, finished_o, turns_o, _, _, _, last_tts = result
+         # Accumulate last audio features
+         feats_hist = feats_hist or []
+         last_feats = (display_json or {}).get("Last_Audio_Features") or {}
+         if isinstance(last_feats, dict) and last_feats:
+             feats_hist = list(feats_hist) + [last_feats]
          if tts_on and chat_history and chat_history[-1][1]:
              new_path = synthesize_tts(chat_history[-1][1], provider=provider, coqui_model_name=coqui_model, coqui_speaker=coqui_speaker)
          else:
              new_path = None
          # If finished, hide the mic and display summaries in Main
          if finished_o:
-             patient_md = build_patient_summary(chat_history, {"Severity": severity, "Total_Score": display_json.get("Total_Score")}, display_json)
-             clinician_md = build_clinician_summary(chat_history, {"Severity": severity, "Total_Score": display_json.get("Total_Score")}, display_json)
+             # Run full explainability and reflection
+             exp_full = explainability_full(chat_history, display_json.get("Confidences", []), feats_hist)
+             reflect = reflection_module(display_json.get("PHQ9_Scores", {}), display_json.get("Confidences", []), display_json.get("Explainability_Light", {}), exp_full, float(th))
+             display_json["Explainability_Full"] = exp_full
+             display_json["Reflection_Report"] = reflect
+             # Use reflection outputs to set final meta
+             final_sev = reflect.get("severity_label") or severity
+             final_total = reflect.get("final_total") or display_json.get("Total_Score")
+             patient_md = build_patient_summary(chat_history, {"Severity": final_sev, "Total_Score": final_total}, display_json)
+             clinician_md = build_clinician_summary(chat_history, {"Severity": final_sev, "Total_Score": final_total}, display_json)
              summary_md = patient_md + "\n\n---\n\n" + clinician_md
-             return chat_history, display_json, severity, finished_o, turns_o, gr.update(visible=False), None, new_path, new_path, gr.update(value=summary_md, visible=True)
+             return chat_history, display_json, severity, finished_o, turns_o, gr.update(visible=False), None, new_path, new_path, gr.update(value=summary_md, visible=True), feats_hist
-         return chat_history, display_json, severity, finished_o, turns_o, None, None, new_path, new_path, gr.update(visible=False)
+         return chat_history, display_json, severity, finished_o, turns_o, None, None, new_path, new_path, gr.update(visible=False), feats_hist

      audio_main.stop_recording(
          fn=_process_with_tts,
-         inputs=[audio_main, text_main, chatbot, threshold, tts_enable, finished_state, turns_state, scores_state, meta_state, tts_provider_dd, coqui_model_tb, coqui_speaker_dd],
-         outputs=[chatbot, score_json, severity_label, finished_state, turns_state, audio_main, text_main, tts_audio, tts_audio_main, main_summary],
+         inputs=[audio_main, text_main, chatbot, threshold, tts_enable, finished_state, turns_state, scores_state, meta_state, tts_provider_dd, coqui_model_tb, coqui_speaker_dd, feats_state],
+         outputs=[chatbot, score_json, severity_label, finished_state, turns_state, audio_main, text_main, tts_audio, tts_audio_main, main_summary, feats_state],
          queue=True,
          api_name="message",
      )

      # Text input flow from Advanced tab
-     def _process_text_and_clear(text, chat, th, tts_on, finished, turns, scores, meta, provider, coqui_model, coqui_speaker):
-         res = _process_with_tts(None, text, chat, th, tts_on, finished, turns, scores, meta, provider, coqui_model, coqui_speaker)
+     def _process_text_and_clear(text, chat, th, tts_on, finished, turns, scores, meta, provider, coqui_model, coqui_speaker, feats_hist):
+         res = _process_with_tts(None, text, chat, th, tts_on, finished, turns, scores, meta, provider, coqui_model, coqui_speaker, feats_hist)
          return (*res, "")

      text_adv.submit(
          fn=_process_text_and_clear,
-         inputs=[text_adv, chatbot, threshold, tts_enable, finished_state, turns_state, scores_state, meta_state, tts_provider_dd, coqui_model_tb, coqui_speaker_dd],
-         outputs=[chatbot, score_json, severity_label, finished_state, turns_state, audio_main, text_main, tts_audio, tts_audio_main, main_summary, text_adv],
+         inputs=[text_adv, chatbot, threshold, tts_enable, finished_state, turns_state, scores_state, meta_state, tts_provider_dd, coqui_model_tb, coqui_speaker_dd, feats_state],
+         outputs=[chatbot, score_json, severity_label, finished_state, turns_state, audio_main, text_main, tts_audio, tts_audio_main, main_summary, feats_state, text_adv],
          queue=True,
      )
      send_adv_btn.click(
          fn=_process_text_and_clear,
-         inputs=[text_adv, chatbot, threshold, tts_enable, finished_state, turns_state, scores_state, meta_state, tts_provider_dd, coqui_model_tb, coqui_speaker_dd],
-         outputs=[chatbot, score_json, severity_label, finished_state, turns_state, audio_main, text_main, tts_audio, tts_audio_main, main_summary, text_adv],
+         inputs=[text_adv, chatbot, threshold, tts_enable, finished_state, turns_state, scores_state, meta_state, tts_provider_dd, coqui_model_tb, coqui_speaker_dd, feats_state],
+         outputs=[chatbot, score_json, severity_label, finished_state, turns_state, audio_main, text_main, tts_audio, tts_audio_main, main_summary, feats_state, text_adv],
          queue=True,
      )