---
title: FutureCafe Voice Core
emoji: ☎️
colorFrom: indigo
colorTo: blue
sdk: gradio
app_file: app.py
pinned: false
license: mit
---
# ☎️ FutureCafe Voice & Chat Assistant

Welcome to **FutureCafe Voice & Chat Assistant**, an AI-powered demo that lets you interact with a virtual cafe agent using either **voice calls** or **chat messages**.

The assistant can answer questions about the cafe, help place food orders, confirm reservations, and provide opening hours or location info, all through a simple web interface.

---

## 🎯 Aim of the Project

The goal of this Space is to demonstrate how modern AI components (speech recognition, language models, text-to-speech) can be combined into a **realistic customer service experience** for restaurants and cafes.

FutureCafe Assistant acts like a friendly staff member:

- Answers menu or dietary questions.
- Helps with table reservations.
- Supports order placement and price calculation.
- Provides hours, address, and contact info.
- Handles both **chat** and **voice calls**.

---
## 🚀 How to Use

No installation or setup is required; everything runs in the browser.

1. **Voice Call (left panel):**
   - Press **Record**, speak your request, then stop recording.
   - The assistant transcribes your voice, replies with text, and speaks the response back.
2. **Chat / SMS (right panel):**
   - Type a message in the textbox and press Enter.
   - The assistant replies in the chat window.

That’s it! 🎉 You can switch freely between chat and voice. A minimal sketch of how the chat handler works is shown below.
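Under the hood, the chat panel is a plain text-in/text-out handler. The sketch below uses Gradio's `ChatInterface` to show the general shape; the `reply_from_llm` helper is a hypothetical placeholder for whatever backend `app.py` actually calls, not the exact code running in this Space.

```python
import gradio as gr


def reply_from_llm(message: str) -> str:
    # Hypothetical placeholder: forward the user's message to the LLM backend
    # (the OpenAI API in this build) and return the assistant's reply.
    return f"(assistant reply to: {message})"


def chat_fn(message, history):
    # ChatInterface passes the latest message plus the running history;
    # this sketch ignores the history and answers each message independently.
    return reply_from_llm(message)


demo = gr.ChatInterface(fn=chat_fn, title="FutureCafe Chat / SMS")

if __name__ == "__main__":
    demo.launch()
```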
---

## 🛠️ Tools & Technologies

This demo integrates several lightweight but powerful AI tools (a rough wiring sketch follows the list):

- **[Gradio](https://gradio.app/)** – User interface for voice and chat.
- **Automatic Speech Recognition (ASR)** – Converts microphone input to text using [faster-whisper](https://github.com/SYSTRAN/faster-whisper).
- **Large Language Model (LLM)** – Core conversational logic (via [OpenAI API](https://openai.com) in this build).
- **Text-to-Speech (TTS)** – Synthesizes assistant replies into natural voice using [Piper](https://github.com/rhasspy/piper) or system TTS.
- **Python & Hugging Face Spaces** – Deployment environment.
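As a rough illustration of how these pieces fit together on the voice path, the sketch below feeds a faster-whisper transcription into an OpenAI chat completion and leaves speech synthesis as a stub. The model names, system prompt, and `synthesize_speech` stub are assumptions for illustration, not the exact contents of `app.py`.

```python
from faster_whisper import WhisperModel
from openai import OpenAI

# Small CPU-friendly ASR model (assumed size; the Space may use a different one).
asr_model = WhisperModel("base", device="cpu", compute_type="int8")
# The OpenAI client reads OPENAI_API_KEY from the environment.
llm_client = OpenAI()

SYSTEM_PROMPT = (
    "You are a friendly staff member at FutureCafe. "
    "Answer menu, order, reservation, and opening-hours questions."
)


def transcribe(audio_path: str) -> str:
    # faster-whisper returns an iterator of segments plus metadata.
    segments, _info = asr_model.transcribe(audio_path)
    return " ".join(segment.text.strip() for segment in segments)


def ask_llm(user_text: str) -> str:
    # Single-turn call for brevity; the real app presumably keeps conversation history.
    response = llm_client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name, not necessarily what this Space uses
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_text},
        ],
    )
    return response.choices[0].message.content


def synthesize_speech(text: str) -> str:
    # Placeholder: the actual Space uses Piper or a system TTS engine here
    # and returns a path to the generated audio file.
    raise NotImplementedError("plug in your TTS backend")


def handle_voice_call(audio_path: str):
    # Voice pipeline: ASR -> LLM -> TTS, returning both the text reply and audio.
    user_text = transcribe(audio_path)
    reply_text = ask_llm(user_text)
    return reply_text, synthesize_speech(reply_text)
```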
---

## 👩‍💻 Credits

Developed as part of a portfolio project to explore **multimodal AI assistants** that combine speech, text, and reasoning for practical real-world scenarios.