Spaces: tusker123/accent_classifier

tusker123 committed on May 23

Commit: 9c84d33 (verified)

Parent: 190f6a1

Update README.md

README.md +92 -10

@@ -1,13 +1,95 @@
 ---
-title: Accent Classifier
-emoji: 🌖
-colorFrom: pink
-colorTo: indigo
-sdk: gradio
-sdk_version: 5.31.0
-app_file: app.py
-pinned: false
-short_description: This Gradio app analyses English access from video file
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
+language: en
+tags:
+- audio-classification
+- accent-classification
+- english-accents
+- video-analysis
+- gradio
+- transformers
+license: apache-2.0
+model-index:
+- name: english-accent-classifier
+  results:
+  - task: audio-classification
+    dataset: custom
+    metric: accuracy # Or other relevant metrics if available
+    value: N/A
 ---
+# English Accent Classifier with Video Analysis
+This Gradio application analyzes English accents from audio extracted from video files. You can provide a video either via a direct URL or by uploading a file from your local machine.
+## How it Works
+1.  **Input Video:** Provide a video URL (MP4, Loom, Dropbox, Google Drive direct links) or upload a video file.
+2.  **Video Processing:** The application downloads the video from the URL or reads the uploaded file.
+3.  **Audio Extraction:** The full audio and a short segment (15 seconds) are extracted.
+4.  **Language Detection:** The short audio is transcribed, and the language is detected.
+5.  **Accent Classification (if English):** A longer audio segment (adjustable duration) is analyzed for English accent.
+6.  **Results:** The detected language, predicted accent, confidence scores, and an audio player for the full extracted audio are displayed.
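
The control flow in the steps above can be sketched as follows. This is a minimal illustration, not the code in `app.py`: the function name `analyze_audio`, the `AnalysisResult` type, the `"en"` language label, and the 30-second default are all hypothetical, and the three model calls are passed in as plain callables so the sketch stays independent of any particular library. It also assumes that language detection runs on the transcript of the 15-second segment.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class AnalysisResult:
    """Hypothetical result container for the steps described in the README."""
    language: str
    accent: Optional[str] = None
    confidence: Optional[float] = None


def analyze_audio(
    samples: Sequence[float],
    sample_rate: int,
    transcribe: Callable[[Sequence[float]], str],
    detect_language: Callable[[str], str],
    classify_accent: Callable[[Sequence[float]], Tuple[str, float]],
    analysis_seconds: int = 30,   # assumed default; the README only says "adjustable"
    detection_seconds: int = 15,  # short segment length stated in the README
) -> AnalysisResult:
    # Step 3/4: take the short segment, transcribe it, detect the language.
    short_segment = samples[: detection_seconds * sample_rate]
    transcript = transcribe(short_segment)
    language = detect_language(transcript)

    # Step 5 only runs for English audio ("en" is an assumed label).
    if language != "en":
        return AnalysisResult(language=language)

    # Step 5: classify the accent on the longer, user-adjustable segment.
    long_segment = samples[: analysis_seconds * sample_rate]
    accent, confidence = classify_accent(long_segment)
    return AnalysisResult(language=language, accent=accent, confidence=confidence)
```

Passing the models in as callables keeps the gating logic testable without downloading any weights; in a real app they would wrap the models listed under "Models Used".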
+## Features
+* **English Accent Classification:** Predicts the accent in English audio.
+* **Language Detection:** Ensures the audio is English before accent analysis.
+* **Flexible Video Input:** Supports URLs and file uploads.
+* **Adjustable Analysis Duration:** Users can set the audio analysis length.
+* **Audio Playback:** Allows users to listen to the extracted audio.
+## Tech Stack
+* [Gradio](https://gradio.app/): Interactive web UI.
+* [Hugging Face Transformers](https://huggingface.co/transformers/): Pre-trained models and pipelines.
+* [Requests](https://requests.readthedocs.io/en/latest/): Downloading video files.
+* [MoviePy](https://zulko.github.io/moviepy/): Video editing for audio extraction.
+* [PyTorch](https://pytorch.org/): Underlying deep learning framework.
+* [Soundfile](https://pysoundfile.readthedocs.io/en/latest/): Audio file handling.
+## Models Used
+* **Accent Classification:** `dima806/english_accents_classification`
+* **Language Detection:** `alexneakameni/language_detection`
+* **Automatic Speech Recognition:** `openai/whisper-tiny.en`
+## Usage
+You can interact with the application directly in your browser. Provide a video URL or upload a file, adjust the analysis duration, and click "Analyze Video". The results will be displayed below.
+### Input Formats
+* **Uploaded Video Files:** `.mp4`
+* **Video URLs:**
+    * Direct MP4 links (ending in `.mp4`)
+    * Loom video share links (`https://www.loom.com/share/...`)
+    * Dropbox direct download links (MP4 links ending in `?dl=1`)
+    * Google Drive direct download links (`https://drive.google.com/uc?id=...&export=download`)
+### Unsupported Formats
+* Webpages embedding videos (e.g., YouTube, news articles).
+* Dropbox shared folder links.
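
One way to check a URL against the supported formats listed above is sketched here. It is an assumption-laden illustration rather than the app's actual validation: the function name `classify_video_url` and its return labels are hypothetical, and the rules are read literally from the list (for example, a Dropbox link must point at an `.mp4` path and carry `dl=1`).

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def classify_video_url(url: str) -> Optional[str]:
    """Return a label for a supported URL kind, or None if unsupported."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return None
    host = parsed.hostname or ""
    path = parsed.path
    query = parse_qs(parsed.query)

    # Loom share links: https://www.loom.com/share/...
    if host == "www.loom.com" and path.startswith("/share/"):
        return "loom"

    # Dropbox direct downloads: MP4 path with dl=1; anything else on
    # Dropbox (including shared folder links) is treated as unsupported.
    if host == "dropbox.com" or host.endswith(".dropbox.com"):
        if path.lower().endswith(".mp4") and query.get("dl") == ["1"]:
            return "dropbox"
        return None

    # Google Drive direct downloads: /uc?id=...&export=download
    if host == "drive.google.com":
        if path == "/uc" and "id" in query and query.get("export") == ["download"]:
            return "gdrive"
        return None

    # Any other host: accept only direct links to an .mp4 file.
    if path.lower().endswith(".mp4"):
        return "mp4"
    return None
```

A page that merely embeds a video, such as a YouTube watch URL, falls through every branch and returns `None`, which a caller could map to the "Invalid URL" message.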
+## FFmpeg Requirement
+This application requires [FFmpeg](https://ffmpeg.org/) to be installed on your system for audio extraction from video files. Follow the installation instructions for your operating system on the FFmpeg website.
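
A quick way to confirm that FFmpeg is discoverable before running the app is to look it up on the `PATH`. The helper name `ffmpeg_available` is hypothetical, and this sketch only checks that an executable exists; it does not verify the FFmpeg version or codec support.

```python
import shutil


def ffmpeg_available(binary: str = "ffmpeg") -> bool:
    """Return True if an executable with this name is found on the PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if not ffmpeg_available():
        print("FFmpeg was not found on PATH; audio extraction is likely to fail.")
```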
+## Troubleshooting
+* **"Invalid URL"**: Ensure the URL meets the specified format requirements.
+* **Audio/Video Processing Errors**: Likely due to missing or incorrectly configured FFmpeg.
+* **Transcription Errors**: Audio may be unclear or contain little speech in the initial 15 seconds.
+* **Non-English Language Detection**: Accent classification is skipped because the model is designed for English accents only.
+## Citation
+If you use this application in your work, please consider citing the original models and the libraries used.
+```bibtex
+@misc{huggingface_transformers,
+    author = {Hugging Face Team},
+    title = {Transformers: State-of-the-art Natural Language Processing},
+    year = {2019},
+    howpublished = {\url{https://github.com/huggingface/transformers}},
+}