Spaces:
Running
Healthcare Agent Tool Calling Improvements
Summary
This document describes the comprehensive improvements made to the healthcare agent's tool calling system. The primary goal was to harden tool usage and fix issues with secret question handling by improving docstrings and system prompts.
Problems Identified
1. Vague Tool Docstrings
- Original docstrings were too brief (e.g., "Find a patient_id by name. Returns JSON.")
- No guidance on WHEN to call each tool
- No explanation of parameter formats or return value interpretation
- Missing information about tool call sequences
- No examples showing expected inputs/outputs
2. Secret Question Flow Issues β οΈ CRITICAL
The verify_identity tool didn't clearly explain the multi-step secret question flow:
- When
verify_identityreturns aquestionfield, the agent must ASK it to the caller - The agent must then collect the answer and call
verify_identityAGAIN withsecret_answer - This two-step process was not documented, causing authentication failures
3. System Prompt Lacked Structure
- The original prompt was a paragraph of instructions
- No clear step-by-step conversation flow
- Verification logic was not explicit
- Tool calling sequence was implied but not stated
Solutions Implemented
A. Enhanced Tool Docstrings
All 9 tools now have comprehensive docstrings with the following structure:
1. One-line summary
2. WHEN TO CALL: Exact timing/trigger for using this tool
3. PARAMETERS: Detailed explanation of each parameter with examples
4. RETURNS: JSON structure breakdown with field explanations
5. WHAT TO DO: Step-by-step instructions for handling the response
6. EXAMPLE: Real conversation flow showing inputs/outputs
B. Tools Improved
1. find_patient
- Added: Clear instruction to call this FIRST
- Added: Example showing full_name vs first_name/last_name usage
- Added: Next step guidance (call verify_identity after)
2. verify_identity β οΈ MOST CRITICAL
- Added: 7-step secret question flow example
- Added: Authentication logic explanation (DOB + MRN OR DOB + secret)
- Added: Explicit instructions: "YOU MUST read the question verbatim to the caller"
- Added: Example conversation showing the two-call pattern
- Added: Clear decision tree for handling responses
Before:
"""Verify identity before accessing records. Provide any of: full_name, dob (YYYY-MM-DD or free-form),
MRN last-4, secret answer. Returns JSON with verified flag, needed fields, and optional secret question."""
After:
"""Verify caller identity before accessing medical records. CRITICAL: Identity verification is MANDATORY...
CRITICAL SECRET QUESTION FLOW:
1. First call: verify_identity(patient_id="pt_abc", dob_yyyy_mm_dd="1960-01-01")
2. Response: {"verified": false, "needs": ["mrn_last4_or_secret"], "question": "What is your favorite color?"}
3. YOU MUST: Read the question verbatim to the caller: "What is your favorite color?"
4. Collect their answer (e.g., "blue")
5. Second call: verify_identity(patient_id="pt_abc", dob_yyyy_mm_dd="1960-01-01", secret_answer="blue")
6. Response: {"verified": true, "needs": []}
"""
3. get_patient_profile_tool
- Added: Warning to call ONLY after verified=true
- Added: CRITICAL flag for allergies (must check before medication recommendations)
- Added: Guidance on using profile data when giving medical advice
4. triage_symptoms_tool
- Added: Risk level definitions (urgent/soon/self_care)
- Added: Decision tree for handling each risk level
- Added: Two detailed examples (urgent and self-care cases)
- Added: Reminder to consider allergies and current medications
5. list_providers_tool & get_provider_slots_tool & schedule_appointment_tool
- Added: Clear sequence: list β slots β schedule
- Added: Guidance on presenting options naturally in voice conversation
- Added: Time formatting instructions (today at 8pm vs ISO timestamps)
6. get_preferred_pharmacy_tool
- Added: When to call (before prescriptions)
- Added: Example confirmation dialogue
7. log_call_tool
- Added: Call at END of conversation
- Added: What to include in notes
- Added: Example with full call summary
C. Restructured System Prompt
The system prompt was completely restructured from a paragraph to a 7-step numbered flow:
## CONVERSATION FLOW (follow this sequence):
1. GREETING
- Call find_patient(full_name=...)
2. IDENTITY VERIFICATION (CRITICAL)
- Explicit secret question handling:
* If 'question' field present: READ THE EXACT QUESTION
* Wait for answer
* Call verify_identity again with secret_answer
3. AFTER VERIFIED=TRUE
- Call get_patient_profile_tool()
- Ask about symptoms
4. TRIAGE AND GUIDANCE
- Call triage_symptoms_tool()
- Handle based on risk level
5. APPOINTMENT BOOKING
- list_providers β get_slots β schedule
6. PHARMACY CONFIRMATION
7. CLOSING
- Call log_call_tool()
Added sections:
- Clear step-by-step conversation structure
- Explicit secret question instructions in prompt (redundant with tool, intentionally)
- Communication style guidelines
- TTS safety rules
Key Improvements Summary
| Area | Before | After |
|---|---|---|
| Tool docstrings | 1 line, no examples | 10-30 lines with WHEN/WHAT/HOW/EXAMPLE |
| Secret question flow | Implied | 7-step explicit instructions with example |
| System prompt | Paragraph | 7-step numbered flow with sub-bullets |
| Error handling | Not specified | Clear fallback instructions |
| Examples | None | Every tool has conversation examples |
| Parameter guidance | Minimal | Format examples and auto-injection notes |
Testing Recommendations
To validate these improvements, test the following scenarios:
1. Secret Question Flow (Primary Fix)
Test: John Marshall
- User says: "John Marshall"
- User provides: "January 1st, 1960" (DOB)
- Agent should ASK: "For security, what is your favorite color?"
- User answers: "Blue"
- Agent should verify successfully
Expected: Agent asks the secret question and accepts the answer
2. Wrong Secret Answer
Test: John Marshall with wrong answer
- DOB: correct
- Secret answer: "red" (wrong - correct is "blue")
Expected: Agent explains verification failed, asks for MRN last-4 instead or re-attempts
3. MRN Last-4 Alternative
Test: John Marshall with MRN
- DOB: "January 1st, 1960"
- MRN last-4: "0001"
Expected: Agent verifies without asking secret question
4. Triage Urgent Case
Test: Francesco Ciannella with chest pain
- Symptoms: "I have severe chest pain and I'm short of breath"
Expected:
- Agent calls triage_symptoms_tool
- Gets risk="urgent"
- Immediately directs to ER/911
- Does NOT offer regular appointment
5. Allergy Check
Test: John Marshall with headache
- Has Penicillin allergy
- Currently takes Acetaminophen PRN
Expected:
- Agent retrieves profile
- Mentions "Since you're already taking acetaminophen as needed and have a penicillin allergy..."
- Doesn't recommend penicillin-based meds
6. Full Flow End-to-End
Test: Complete call with Francesco Ciannella
1. Name lookup
2. DOB verification
3. Secret question: "What city were you born in?" β "rome"
4. Profile retrieved (Lisinopril, no known allergies)
5. Symptoms: "mild headache and fatigue"
6. Triage returns self_care
7. Books appointment with Dr. Smith tomorrow 8:30am
8. Confirms pharmacy
9. Logs call
Expected: Smooth flow with no repeated questions, proper tool sequence
Files Modified
tools.py- All 9 tool docstrings completely rewrittenprompts.py- Healthcare system prompt moved here (refactored from react_agent.py)react_agent.py- Now imports SYSTEM_PROMPT from prompts.pyREADME.md- Added note about improvements and visual flow diagramIMPROVEMENTS.md- This documentTESTING_GUIDE.md- New comprehensive testing guide
Code Organization
The system prompt has been properly organized following best practices:
Before: System prompt was inline in react_agent.py (cluttering the main logic)
After:
- System prompt is now in
prompts.pyasHEALTHCARE_SYSTEM_PROMPT react_agent.pyimports it cleanly:SYSTEM_PROMPT = hc_prompts.HEALTHCARE_SYSTEM_PROMPT- This follows the pattern used in other agents and makes prompt maintenance easier
- Future prompts (e.g., for different specialties or languages) can be added to
prompts.py
Backward Compatibility
β All changes are backward compatible
- No function signatures changed
- No breaking changes to logic.py
- Only docstrings and prompt text modified
- Existing calls will continue to work
- Prompt refactoring is transparent to the agent's behavior
Next Steps (Optional Enhancements)
If further hardening is needed, consider:
- Add tool call validation - Reject calls to medical tools before verification
- Add structured logging - Log tool call sequences for debugging
- Add conversation state tracking - Explicit state machine (NOT_VERIFIED β VERIFIED β TRIAGED β SCHEDULED)
- Add retry limits - Max 3 verification attempts before ending call
- Add mock data validation - Ensure test patients have valid data
Conclusion
The healthcare agent now has enterprise-grade tool documentation with:
- β Clear, explicit instructions for every tool
- β Fixed secret question authentication flow
- β Step-by-step conversation structure
- β Comprehensive examples
- β Error handling guidance
This should significantly improve the agent's reliability and reduce authentication errors.