---
title: FutureCafe Voice Core
emoji: ☎️
colorFrom: indigo
colorTo: blue
sdk: gradio
app_file: app.py
pinned: false
license: mit
---
# ☎️ FutureCafe Voice & Chat Assistant

Welcome to **FutureCafe Voice & Chat Assistant**, an AI-powered demo that lets you interact with a virtual cafe agent using either **voice calls** or **chat messages**.

The assistant can answer questions about the cafe, help place food orders, confirm reservations, and provide opening hours or location info, all through a simple web interface.

---

## 🎯 Aim of the Project

The goal of this Space is to demonstrate how modern AI components (speech recognition, language models, text-to-speech) can be combined into a **realistic customer service experience** for restaurants and cafes.

FutureCafe Assistant acts like a friendly staff member:

- Answers menu or dietary questions.
- Helps with table reservations.
- Supports order placement and price calculation.
- Provides hours, address, and contact info.
- Handles both **chat** and **voice calls**.

---
## 🚀 How to Use

No installation or setup is required; everything runs in the browser.

1. **Voice Call (left panel):**
   - Press **Record**, speak your request, then stop recording.
   - The assistant transcribes your voice, replies with text, and speaks the response back.
2. **Chat / SMS (right panel):**
   - Type a message in the textbox and press Enter.
   - The assistant replies in the chat window.

That’s it! 🎉 You can switch freely between chat and voice. A minimal sketch of how the chat handler works is shown below.
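Under the hood, the chat panel is a plain text-in/text-out handler. The sketch below uses Gradio's `ChatInterface` to show the general shape; the `reply_from_llm` helper is a hypothetical placeholder for whatever backend `app.py` actually calls, not the exact code running in this Space.

```python
import gradio as gr


def reply_from_llm(message: str) -> str:
    # Hypothetical placeholder: forward the user's message to the LLM backend
    # (the OpenAI API in this build) and return the assistant's reply.
    return f"(assistant reply to: {message})"


def chat_fn(message, history):
    # ChatInterface passes the latest message plus the running history;
    # this sketch ignores the history and answers each message independently.
    return reply_from_llm(message)


demo = gr.ChatInterface(fn=chat_fn, title="FutureCafe Chat / SMS")

if __name__ == "__main__":
    demo.launch()
```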
---

## 🛠️ Tools & Technologies

This demo integrates several lightweight but powerful AI tools (a rough wiring sketch follows the list):

- **[Gradio](https://gradio.app/)** – User interface for voice and chat.
- **Automatic Speech Recognition (ASR)** – Converts microphone input to text using [faster-whisper](https://github.com/SYSTRAN/faster-whisper).
- **Large Language Model (LLM)** – Core conversational logic (via [OpenAI API](https://openai.com) in this build).
- **Text-to-Speech (TTS)** – Synthesizes assistant replies into natural voice using [Piper](https://github.com/rhasspy/piper) or system TTS.
- **Python & Hugging Face Spaces** – Deployment environment.
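As a rough illustration of how these pieces fit together on the voice path, the sketch below feeds a faster-whisper transcription into an OpenAI chat completion and leaves speech synthesis as a stub. The model names, system prompt, and `synthesize_speech` stub are assumptions for illustration, not the exact contents of `app.py`.

```python
from faster_whisper import WhisperModel
from openai import OpenAI

# Small CPU-friendly ASR model (assumed size; the Space may use a different one).
asr_model = WhisperModel("base", device="cpu", compute_type="int8")
# The OpenAI client reads OPENAI_API_KEY from the environment.
llm_client = OpenAI()

SYSTEM_PROMPT = (
    "You are a friendly staff member at FutureCafe. "
    "Answer menu, order, reservation, and opening-hours questions."
)


def transcribe(audio_path: str) -> str:
    # faster-whisper returns an iterator of segments plus metadata.
    segments, _info = asr_model.transcribe(audio_path)
    return " ".join(segment.text.strip() for segment in segments)


def ask_llm(user_text: str) -> str:
    # Single-turn call for brevity; the real app presumably keeps conversation history.
    response = llm_client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name, not necessarily what this Space uses
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_text},
        ],
    )
    return response.choices[0].message.content


def synthesize_speech(text: str) -> str:
    # Placeholder: the actual Space uses Piper or a system TTS engine here
    # and returns a path to the generated audio file.
    raise NotImplementedError("plug in your TTS backend")


def handle_voice_call(audio_path: str):
    # Voice pipeline: ASR -> LLM -> TTS, returning both the text reply and audio.
    user_text = transcribe(audio_path)
    reply_text = ask_llm(user_text)
    return reply_text, synthesize_speech(reply_text)
```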
---

## 👩‍💻 Credits

Developed as part of a portfolio project to explore **multimodal AI assistants** that combine speech, text, and reasoning for practical real-world scenarios.