care-notes / README.md

Akis Giannoukos

Updated Readme

fae1128 about 1 month ago

metadata

title: Conversational Assessment for Responsive Engagement (CARE) Notes
emoji: 🐢
colorFrom: indigo
colorTo: gray
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
short_description: AI-driven conversational module for depression-triage

PHQ-9 Clinician Agent (Voice-first)

A lightweight research demo that simulates a clinician conducting a brief conversational PHQ-9 screening. The app is voice-first: you tap a circular mic bubble to talk; the model replies and can speak back via TTS. A separate Advanced tab exposes scoring and configuration.

What it does

Conversational assessment to infer PHQ‑9 items from natural dialogue (no explicit questionnaire).
Live inference of PHQ‑9 item scores, confidences, total score, and severity.
Automatic stop when the minimum confidence across all items reaches the threshold (τ), when risk is detected, or when the turn cap (MAX_TURNS) is hit.
Optional TTS playback for clinician responses.
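
The scoring and stopping behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the names `severity_label` and `should_stop`, the label strings, and the dictionary layout are hypothetical. The severity bands are the standard PHQ-9 cutoffs, and the default threshold and turn cap mirror the defaults listed under Configuration.

```python
from typing import Dict

# Standard PHQ-9 severity bands as (upper bound of total score, label).
SEVERITY_BANDS = [
    (4, "minimal"),
    (9, "mild"),
    (14, "moderate"),
    (19, "moderately severe"),
    (27, "severe"),
]


def severity_label(total: int) -> str:
    """Map a PHQ-9 total (0-27) to its standard severity label."""
    if not 0 <= total <= 27:
        raise ValueError("PHQ-9 total must be between 0 and 27")
    for upper, label in SEVERITY_BANDS:
        if total <= upper:
            return label
    raise AssertionError("unreachable: bands cover 0-27")


def should_stop(
    confidences: Dict[str, float],  # hypothetical layout: item name -> confidence in [0, 1]
    turn: int,
    risk_detected: bool,
    threshold: float = 0.8,  # assumed to correspond to CONFIDENCE_THRESHOLD
    max_turns: int = 12,     # assumed to correspond to MAX_TURNS
) -> bool:
    """Return True when the conversation should end."""
    if risk_detected:
        return True  # safety stop takes priority over everything else
    if turn >= max_turns:
        return True  # hard cap on the number of turns
    # Stop only once all nine items have a confidence at or above the threshold.
    return len(confidences) == 9 and min(confidences.values()) >= threshold
```

For example, nine item scores of 1 each give a total of 9, which falls in the "mild" band; the conversation would continue until every item's confidence reaches τ, risk is flagged, or the turn cap is reached.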

UI overview

Main tab: Large circular mic “Record” bubble
- Tap to start, tap again to stop (processing runs on stop)
- While speaking back (TTS), the bubble shows a speaking state
Chat tab: Plain chat transcript (for reviewing turns)
Advanced tab:
- PHQ‑9 Assessment JSON (live)
- Severity label
- Confidence threshold slider (τ)
- Toggle: Speak clinician responses (TTS)
- Model ID textbox and “Apply model” button

Quick start (local)

Python 3.10+ recommended.
Install deps:
```
pip install -r requirements.txt
```
Run the app:
```
python app.py
```
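Open the URL shown in the console (port 7860 by default). If the console prints http://0.0.0.0:7860, that is the server's bind address; open http://localhost:7860 in your browser instead. Allow microphone access when prompted.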

Configuration

Environment variables (all optional):

LLM_MODEL_ID (default google/gemma-2-2b-it): chat model id
ASR_MODEL_ID (default openai/whisper-tiny.en): speech-to-text model id
CONFIDENCE_THRESHOLD (default 0.8): stop when min item confidence ≥ τ
MAX_TURNS (default 12): hard stop cap
USE_TTS (default true): enable TTS playback
MODEL_CONFIG_PATH (default model_config.json): persisted model id
PORT (default 7860): server port
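
As a rough sketch of how these variables could be read with their documented defaults (the `Settings` class, `load_settings` function, and the set of strings accepted as "true" are assumptions for illustration, not the app's actual code):

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings; anything else counts as False (assumed convention)."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    llm_model_id: str
    asr_model_id: str
    confidence_threshold: float
    max_turns: int
    use_tts: bool
    model_config_path: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment variables, falling back to the documented defaults."""
    return Settings(
        llm_model_id=env.get("LLM_MODEL_ID", "google/gemma-2-2b-it"),
        asr_model_id=env.get("ASR_MODEL_ID", "openai/whisper-tiny.en"),
        confidence_threshold=float(env.get("CONFIDENCE_THRESHOLD", "0.8")),
        max_turns=int(env.get("MAX_TURNS", "12")),
        use_tts=_as_bool(env.get("USE_TTS", "true")),
        model_config_path=env.get("MODEL_CONFIG_PATH", "model_config.json"),
        port=int(env.get("PORT", "7860")),
    )
```

Passing a plain dictionary instead of `os.environ` (for example `load_settings({"MAX_TURNS": "6"})`) makes it easy to try overrides without touching the real environment.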

Notes:

If a GPU is available, the app will use it automatically for Transformers pipelines.
Changing the model in Advanced will reload the text-generation pipeline on the next turn.
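
One plausible way to persist the model id chosen in Advanced to the file named by MODEL_CONFIG_PATH is sketched below. The JSON key `model_id` and the helper names `load_model_id` and `save_model_id` are hypothetical; the actual file layout used by the app is not documented here.

```python
import json
from pathlib import Path

DEFAULT_MODEL_ID = "google/gemma-2-2b-it"  # documented default for LLM_MODEL_ID


def load_model_id(path: str, default: str = DEFAULT_MODEL_ID) -> str:
    """Return the persisted model id, or the default if the file is missing or invalid."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return default
    if not isinstance(data, dict):
        return default
    model_id = data.get("model_id")  # hypothetical key name
    return model_id if isinstance(model_id, str) and model_id else default


def save_model_id(path: str, model_id: str) -> None:
    """Persist the chosen model id so it survives a restart."""
    payload = json.dumps({"model_id": model_id}, indent=2)
    Path(path).write_text(payload, encoding="utf-8")
```

Falling back to the default on a missing or malformed file keeps a first run, or a corrupted config, from blocking startup.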

How to use

Go to Main and tap the mic bubble. Speak naturally.
Tap again to finish your turn. The model replies; if TTS is enabled, you’ll hear it.
The Advanced tab updates live with PHQ‑9 scores and severity. Lower the confidence threshold (τ) to end the assessment earlier, or raise it to require more evidence before stopping.

Troubleshooting

No mic input detected:
- Ensure the site has microphone permission in your browser settings.
- Try refreshing the page after granting permission.
Can’t hear TTS:
- Enable the “Speak clinician responses (TTS)” toggle in Advanced.
- Ensure your system audio output is correct. Some browsers block autoplay until you interact with the page; record one turn with the mic, and playback should then work.
Model download slow or fails:
- Check internet connectivity and try again. Some models are large.
Assessment doesn’t stop:
- Lower the confidence threshold slider (τ) in Advanced so the stop condition (minimum item confidence ≥ τ) is met sooner, or wait until the turn cap (MAX_TURNS) is reached.

Safety

This demo does not provide therapy or emergency counseling. If a user expresses suicidal intent or risk is inferred, the app ends the conversation and advises contacting emergency services (e.g., 988 in the U.S.).

Development notes

Framework: Gradio Blocks
ASR: Transformers pipeline (Whisper)
TTS: gTTS
Prosody features: lightweight proxies computed with librosa and passed to the scoring prompt

PRs and experiments are welcome. This is a research prototype and not a clinical tool.