Spaces:

nvidia
/

voice-agent-examples

Running

App Files Files Community

voice-agent-examples / examples /voice_agent_webrtc_langgraph /agents /healthcare-agent /IMPROVEMENTS.md

fciannella

Added the healthcare example

2f49513 2 months ago

preview code

raw

history blame contribute delete

9.86 kB

Healthcare Agent Tool Calling Improvements

Summary

This document describes the comprehensive improvements made to the healthcare agent's tool calling system. The primary goal was to harden tool usage and fix issues with secret question handling by improving docstrings and system prompts.

Problems Identified

1. Vague Tool Docstrings

Original docstrings were too brief (e.g., "Find a patient_id by name. Returns JSON.")
No guidance on WHEN to call each tool
No explanation of parameter formats or return value interpretation
Missing information about tool call sequences
No examples showing expected inputs/outputs

2. Secret Question Flow Issues ⚠️ CRITICAL

The verify_identity tool didn't clearly explain the multi-step secret question flow:

When verify_identity returns a question field, the agent must ASK it to the caller
The agent must then collect the answer and call verify_identity AGAIN with secret_answer
This two-step process was not documented, causing authentication failures

3. System Prompt Lacked Structure

The original prompt was a paragraph of instructions
No clear step-by-step conversation flow
Verification logic was not explicit
Tool calling sequence was implied but not stated

Solutions Implemented

A. Enhanced Tool Docstrings

All 9 tools now have comprehensive docstrings with the following structure:

1. One-line summary
2. WHEN TO CALL: Exact timing/trigger for using this tool
3. PARAMETERS: Detailed explanation of each parameter with examples
4. RETURNS: JSON structure breakdown with field explanations
5. WHAT TO DO: Step-by-step instructions for handling the response
6. EXAMPLE: Real conversation flow showing inputs/outputs

B. Tools Improved

1. `find_patient`

Added: Clear instruction to call this FIRST
Added: Example showing full_name vs first_name/last_name usage
Added: Next step guidance (call verify_identity after)

2. `verify_identity` ⚠️ MOST CRITICAL

Added: 7-step secret question flow example
Added: Authentication logic explanation (DOB + MRN OR DOB + secret)
Added: Explicit instructions: "YOU MUST read the question verbatim to the caller"
Added: Example conversation showing the two-call pattern
Added: Clear decision tree for handling responses

Before:

"""Verify identity before accessing records. Provide any of: full_name, dob (YYYY-MM-DD or free-form), 
MRN last-4, secret answer. Returns JSON with verified flag, needed fields, and optional secret question."""

After:

"""Verify caller identity before accessing medical records. CRITICAL: Identity verification is MANDATORY...
    
CRITICAL SECRET QUESTION FLOW:
1. First call: verify_identity(patient_id="pt_abc", dob_yyyy_mm_dd="1960-01-01")
2. Response: {"verified": false, "needs": ["mrn_last4_or_secret"], "question": "What is your favorite color?"}
3. YOU MUST: Read the question verbatim to the caller: "What is your favorite color?"
4. Collect their answer (e.g., "blue")
5. Second call: verify_identity(patient_id="pt_abc", dob_yyyy_mm_dd="1960-01-01", secret_answer="blue")
6. Response: {"verified": true, "needs": []}
"""

3. `get_patient_profile_tool`

Added: Warning to call ONLY after verified=true
Added: CRITICAL flag for allergies (must check before medication recommendations)
Added: Guidance on using profile data when giving medical advice

4. `triage_symptoms_tool`

Added: Risk level definitions (urgent/soon/self_care)
Added: Decision tree for handling each risk level
Added: Two detailed examples (urgent and self-care cases)
Added: Reminder to consider allergies and current medications

5. `list_providers_tool` & `get_provider_slots_tool` & `schedule_appointment_tool`

Added: Clear sequence: list → slots → schedule
Added: Guidance on presenting options naturally in voice conversation
Added: Time formatting instructions (today at 8pm vs ISO timestamps)

6. `get_preferred_pharmacy_tool`

Added: When to call (before prescriptions)
Added: Example confirmation dialogue

7. `log_call_tool`

Added: Call at END of conversation
Added: What to include in notes
Added: Example with full call summary

C. Restructured System Prompt

The system prompt was completely restructured from a paragraph to a 7-step numbered flow:

## CONVERSATION FLOW (follow this sequence):

1. GREETING
   - Call find_patient(full_name=...)

2. IDENTITY VERIFICATION (CRITICAL)
   - Explicit secret question handling:
     * If 'question' field present: READ THE EXACT QUESTION
     * Wait for answer
     * Call verify_identity again with secret_answer

3. AFTER VERIFIED=TRUE
   - Call get_patient_profile_tool()
   - Ask about symptoms

4. TRIAGE AND GUIDANCE
   - Call triage_symptoms_tool()
   - Handle based on risk level

5. APPOINTMENT BOOKING
   - list_providers → get_slots → schedule

6. PHARMACY CONFIRMATION

7. CLOSING
   - Call log_call_tool()

Added sections:

Clear step-by-step conversation structure
Explicit secret question instructions in prompt (redundant with tool, intentionally)
Communication style guidelines
TTS safety rules

Key Improvements Summary

Area	Before	After
Tool docstrings	1 line, no examples	10-30 lines with WHEN/WHAT/HOW/EXAMPLE
Secret question flow	Implied	7-step explicit instructions with example
System prompt	Paragraph	7-step numbered flow with sub-bullets
Error handling	Not specified	Clear fallback instructions
Examples	None	Every tool has conversation examples
Parameter guidance	Minimal	Format examples and auto-injection notes

Testing Recommendations

To validate these improvements, test the following scenarios:

1. Secret Question Flow (Primary Fix)

Test: John Marshall
- User says: "John Marshall"
- User provides: "January 1st, 1960" (DOB)
- Agent should ASK: "For security, what is your favorite color?"
- User answers: "Blue"
- Agent should verify successfully

Expected: Agent asks the secret question and accepts the answer

2. Wrong Secret Answer

Test: John Marshall with wrong answer
- DOB: correct
- Secret answer: "red" (wrong - correct is "blue")

Expected: Agent explains verification failed, asks for MRN last-4 instead or re-attempts

3. MRN Last-4 Alternative

Test: John Marshall with MRN
- DOB: "January 1st, 1960"
- MRN last-4: "0001"

Expected: Agent verifies without asking secret question

4. Triage Urgent Case

Test: Francesco Ciannella with chest pain
- Symptoms: "I have severe chest pain and I'm short of breath"

Expected: 
- Agent calls triage_symptoms_tool
- Gets risk="urgent"
- Immediately directs to ER/911
- Does NOT offer regular appointment

5. Allergy Check

Test: John Marshall with headache
- Has Penicillin allergy
- Currently takes Acetaminophen PRN

Expected: 
- Agent retrieves profile
- Mentions "Since you're already taking acetaminophen as needed and have a penicillin allergy..."
- Doesn't recommend penicillin-based meds

6. Full Flow End-to-End

Test: Complete call with Francesco Ciannella
1. Name lookup
2. DOB verification
3. Secret question: "What city were you born in?" → "rome"
4. Profile retrieved (Lisinopril, no known allergies)
5. Symptoms: "mild headache and fatigue"
6. Triage returns self_care
7. Books appointment with Dr. Smith tomorrow 8:30am
8. Confirms pharmacy
9. Logs call

Expected: Smooth flow with no repeated questions, proper tool sequence

Files Modified

tools.py - All 9 tool docstrings completely rewritten
prompts.py - Healthcare system prompt moved here (refactored from react_agent.py)
react_agent.py - Now imports SYSTEM_PROMPT from prompts.py
README.md - Added note about improvements and visual flow diagram
IMPROVEMENTS.md - This document
TESTING_GUIDE.md - New comprehensive testing guide

Code Organization

The system prompt has been properly organized following best practices:

Before: System prompt was inline in react_agent.py (cluttering the main logic)

After:

System prompt is now in prompts.py as HEALTHCARE_SYSTEM_PROMPT
react_agent.py imports it cleanly: SYSTEM_PROMPT = hc_prompts.HEALTHCARE_SYSTEM_PROMPT
This follows the pattern used in other agents and makes prompt maintenance easier
Future prompts (e.g., for different specialties or languages) can be added to prompts.py

Backward Compatibility

✅ All changes are backward compatible

No function signatures changed
No breaking changes to logic.py
Only docstrings and prompt text modified
Existing calls will continue to work
Prompt refactoring is transparent to the agent's behavior

Next Steps (Optional Enhancements)

If further hardening is needed, consider:

Add tool call validation - Reject calls to medical tools before verification
Add structured logging - Log tool call sequences for debugging
Add conversation state tracking - Explicit state machine (NOT_VERIFIED → VERIFIED → TRIAGED → SCHEDULED)
Add retry limits - Max 3 verification attempts before ending call
Add mock data validation - Ensure test patients have valid data

Conclusion

The healthcare agent now has enterprise-grade tool documentation with:

✅ Clear, explicit instructions for every tool
✅ Fixed secret question authentication flow
✅ Step-by-step conversation structure
✅ Comprehensive examples
✅ Error handling guidance

This should significantly improve the agent's reliability and reduce authentication errors.