OpenSound committed on
Commit 49309df · verified · 1 Parent(s): f7ad1a2

Update README.md

Files changed (1)
  1. README.md +13 -60
README.md CHANGED
@@ -1,60 +1,13 @@
- # FlexSED: Towards Open-Vocabulary Sound Event Detection
-
- [![arXiv](https://img.shields.io/badge/arXiv-2409.10819-brightgreen.svg?style=flat-square)](https://arxiv.org/abs/2509.18606)
- [![Hugging Face Models](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Models-blue)](https://huggingface.co/Higobeatz/FlexSED/tree/main)
-
-
- ## News
- - Oct 2025: 📦 Released code and pretrained checkpoint
- - Sep 2025: 🎉 FlexSED Spotlighted at WASPAA 2025
-
-
- ## Installation
-
- Clone the repository:
- ```
- git clone git@github.com:JHU-LCAP/FlexSED.git
- ```
- Install the dependencies:
- ```
- cd FlexSED
- pip install -r requirements.txt
- ```
-
- ## Usage
- ```python
- from api import FlexSED
- import torch
- import soundfile as sf
-
- # load model
- flexsed = FlexSED(device='cuda')
-
- # run inference
- events = ["Dog"]
- preds = flexsed.run_inference("example.wav", events)
-
- # visualize prediciton
- flexsed.to_multi_plot(preds, events, fname="example2")
-
- # (Optional) visualize prediciton by video
- # flexsed.to_multi_video(preds, events, audio_path="example2.wav", fname="example2")
- ```
-
- ## Training
-
- WIP
-
-
- ## Reference
-
- If you find the code useful for your research, please consider citing:
-
- ```bibtex
- @article{hai2025flexsed,
- title={FlexSED: Towards Open-Vocabulary Sound Event Detection},
- author={Hai, Jiarui and Wang, Helin and Guo, Weizhe and Elhilali, Mounya},
- journal={arXiv preprint arXiv:2509.18606},
- year={2025}
- }
- ```
 
+ ---
+ title: FlexSED
+ emoji: 🎧
+ colorFrom: green
+ colorTo: indigo
+ sdk: gradio
+ sdk_version: 5.31.0
+ app_file: app.py
+ pinned: false
+ license: mit
+ short_description: State-of-the-art target speech extractor
+ tags: ["sound-event-detection"]
+ ---