justinchuby committed on
Commit bf4dd33 · verified
1 Parent(s): 2c41004

Create realtime script

Files changed (3)
  1. REALTIME_README.md +205 -0
  2. realtime_detection.py +378 -0
  3. requirements.txt +2 -0
REALTIME_README.md ADDED
@@ -0,0 +1,205 @@
+ # BirdNET Real-Time Detection
+
+ Real-time bird species detection using your microphone and the BirdNET ONNX model.
+
+ ## Features
+
+ - 🎤 **Live Microphone Input**: Continuously captures and analyzes audio from your microphone
+ - 🐦 **Real-Time Detection**: Identifies bird species as they sing, with a configurable confidence threshold
+ - 📊 **Live Display**: Dynamic terminal interface showing current detections and recent activity
+ - ⚡ **Optimized Performance**: Efficient audio processing with rolling buffers and threading
+ - 🔧 **Configurable**: Adjustable confidence threshold, update interval, and display options
+
+ ## Installation
+
+ ### Manual Installation
+
+ Install the required packages:
+
+ ```bash
+ pip install sounddevice numpy librosa onnxruntime soundfile
+ ```
+
+ ## Usage
+
+ ### Basic Usage
+
+ Start real-time detection with default settings:
+
+ ```bash
+ python realtime_detection.py
+ ```
+
+ ### Advanced Options
+
+ ```bash
+ # Higher confidence threshold for fewer false positives
+ python realtime_detection.py --confidence 0.3
+
+ # Show more detections
+ python realtime_detection.py --top-k 10
+
+ # Faster display updates
+ python realtime_detection.py --update-interval 0.5
+
+ # Custom model and labels
+ python realtime_detection.py --model custom_model.onnx --labels custom_labels.txt
+ ```
+
+ ### List Audio Devices
+
+ To see available microphones:
+
+ ```bash
+ python realtime_detection.py --list-devices
+ ```
+
+ ## Command Line Arguments
+
+ - `--model`: Path to the ONNX model file (default: `model.onnx`)
+ - `--labels`: Path to the species labels file (default: `BirdNET_GLOBAL_6K_V2.4_Labels.txt`)
+ - `--confidence`: Minimum confidence threshold (default: 0.1)
+ - `--top-k`: Number of top predictions to show (default: 5)
+ - `--update-interval`: Display update interval in seconds (default: 1.0)
+ - `--list-devices`: List available audio devices and exit
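
The `--confidence` value is compared directly against model scores, so values outside 0 to 1 are unlikely to be useful. A minimal sketch of rejecting them at parse time follows. It assumes the scores are probabilities in [0, 1]; `unit_interval` and `build_parser` are hypothetical helpers, not part of the committed script, which uses a plain `type=float`.

```python
import argparse


def unit_interval(text: str) -> float:
    """Hypothetical argparse type: accept only floats in [0, 1]."""
    try:
        value = float(text)
    except ValueError:
        raise argparse.ArgumentTypeError(f"{text!r} is not a number") from None
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError(f"{value} is outside [0, 1]")
    return value


def build_parser() -> argparse.ArgumentParser:
    # Only two of the script's options are reproduced here for brevity.
    parser = argparse.ArgumentParser(
        description="BirdNET Real-Time Audio Classification"
    )
    parser.add_argument("--confidence", type=unit_interval, default=0.1)
    parser.add_argument("--top-k", type=int, default=5)
    return parser


args = build_parser().parse_args(["--confidence", "0.3", "--top-k", "10"])
print(args.confidence, args.top_k)
```

With this type in place, `--confidence 1.5` would make argparse print a usage error and exit instead of silently producing an empty display.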
+
+ ## Display Interface
+
+ The real-time interface shows:
+
+ ### Current Detections
+
+ ```
+ 🐦 Current Detections (Top 2):
+ ----------------------------------------
+ 1. American Robin
+ █████████████████░░░ 0.8542
+
+ 2. Song Sparrow
+ ██████░░░░░░░░░░░░░░ 0.3214
+ ```
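
The committed script scales each bar with `int(confidence * 20)`, so a confidence of 1.0 fills all 20 cells. The sketch below reproduces that scaling in a standalone function. `render_bar` is a hypothetical helper name, and the clamping is an addition for safety rather than something the script does.

```python
def render_bar(confidence: float, width: int = 20) -> str:
    """Render a fixed-width confidence bar (hypothetical helper)."""
    # Clamp so that out-of-range scores cannot produce a malformed bar.
    filled = max(0, min(width, int(confidence * width)))
    return "█" * filled + "░" * (width - filled)


for species, score in [("American Robin", 0.8542), ("Song Sparrow", 0.3214)]:
    print(f"{species}\n{render_bar(score)} {score:.4f}")
```

Because `int()` truncates, 0.8542 fills 17 of 20 cells and 0.3214 fills 6.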
+
+ ### Recent Activity
+
+ ```
+ 📊 Recent Activity (Last 10):
+ ----------------------------------------
+ 14:25:32 - American Robin (0.854)
+ 14:25:28 - Song Sparrow (0.321)
+ 14:25:15 - House Finch (0.287)
+ ```
+
+ ## Technical Details
+
+ ### Audio Processing
+
+ - **Sample Rate**: 48 kHz (BirdNET requirement)
+ - **Window Size**: 3 seconds (144,000 samples)
+ - **Buffer**: 6-second rolling buffer (288,000 samples) for continuous analysis
+ - **Processing**: 100 ms audio blocks (4,800 samples) with threaded processing
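
The sizes in the list above follow directly from the script's defaults. This short calculation shows the arithmetic. The constant names are chosen here for illustration, while the formulas mirror `window_size`, the `deque` `maxlen`, and the `blocksize` argument in `realtime_detection.py`.

```python
SAMPLE_RATE = 48_000      # Hz
WINDOW_SECONDS = 3.0      # length of one analysis window
BLOCK_SECONDS = 0.1       # length of one audio callback block

window_size = int(SAMPLE_RATE * WINDOW_SECONDS)  # samples per analysis window
buffer_size = window_size * 2                    # rolling buffer capacity
block_size = int(SAMPLE_RATE * BLOCK_SECONDS)    # samples per callback block
blocks_per_window = window_size // block_size    # callbacks needed to fill a window

print(window_size, buffer_size, block_size, blocks_per_window)
# prints: 144000 288000 4800 30
```

So the first prediction can only appear after about 30 callbacks, i.e. 3 seconds of audio.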
+
+ ### Performance Features
+
+ - **Non-blocking Audio**: The audio callback only enqueues blocks; a worker thread handles buffering and inference
+ - **Efficient Buffering**: A fixed-size rolling `deque` discards the oldest samples, so memory use stays bounded
+ - **Real-time Display**: Separate thread for UI updates
+ - **Graceful Shutdown**: Ctrl+C handling for clean exit
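
The rolling buffer can be illustrated with a small standalone sketch. It uses tiny sizes instead of the script's 144,000-sample window. `latest_window` is a hypothetical helper, and it uses `itertools.islice` where the script converts the whole deque to a list before slicing.

```python
from collections import deque
from itertools import islice


def latest_window(buffer: deque, window_size: int) -> list:
    """Return the newest `window_size` items without listing the whole deque."""
    if len(buffer) < window_size:
        raise ValueError("not enough samples buffered yet")
    start = len(buffer) - window_size
    return list(islice(buffer, start, None))


# Tiny sizes for illustration only.
window_size = 3
buffer = deque(maxlen=window_size * 2)

for block in ([1, 2], [3, 4], [5, 6], [7, 8]):
    buffer.extend(block)  # oldest samples fall off once maxlen is reached

print(list(buffer))                        # [3, 4, 5, 6, 7, 8]
print(latest_window(buffer, window_size))  # [6, 7, 8]
```

The `maxlen` argument is what keeps memory bounded: appending to a full deque silently drops items from the opposite end.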
+
+ ### Requirements
+
+ - **Python**: 3.7+ (recommended: 3.9+)
+ - **Audio Input**: Working microphone or audio input device
+ - **Memory**: ~200 MB RAM for model and audio buffers
+ - **CPU**: Moderate CPU usage for real-time inference
+
+ ## Troubleshooting
+
+ ### Audio Issues
+
+ **No microphone detected:**
+
+ ```bash
+ # List available devices
+ python realtime_detection.py --list-devices
+
+ # Check system audio settings
+ # Ensure microphone permissions are granted
+ ```
+
+ **Audio quality issues:**
+
+ - Move the microphone closer to the birds
+ - Reduce background noise
+ - Adjust the confidence threshold
+
+ ### Performance Issues
+
+ **High CPU usage:**
+
+ ```bash
+ # Refresh the display less often
+ python realtime_detection.py --update-interval 2.0
+
+ # Increase confidence threshold
+ python realtime_detection.py --confidence 0.3
+ ```
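
In the committed script, `--update-interval` controls how often the display refreshes. Inference, by contrast, is attempted for every incoming audio block once the buffer holds a full window. One possible way to bound inference cost, which is not part of this commit, is to run the model only after a fixed amount of new audio has arrived. A minimal sketch, where `HopGate` and `hop_seconds` are hypothetical names:

```python
class HopGate:
    """Signal when at least `hop_seconds` of new audio has accumulated."""

    def __init__(self, sample_rate: int = 48_000, hop_seconds: float = 1.0):
        self.hop_samples = int(sample_rate * hop_seconds)
        self.pending = 0  # samples received since the last inference

    def push(self, n_samples: int) -> bool:
        """Record a new block; return True when inference should run."""
        self.pending += n_samples
        if self.pending >= self.hop_samples:
            self.pending -= self.hop_samples
            return True
        return False


gate = HopGate(hop_seconds=1.0)
runs = sum(gate.push(4_800) for _ in range(30))  # 30 blocks of 100 ms = 3 s
print(runs)  # 3
```

A processing loop could call `gate.push(len(audio_chunk))` after extending the buffer and skip inference whenever it returns `False`. The trade-off is higher latency between a call and its detection.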
+
+ **Memory issues:**
+
+ - Close other applications
+ - The detector uses fixed-size buffers to prevent memory leaks
+
+ ### Detection Issues
+
+ **No detections:**
+
+ - Lower the confidence threshold: `--confidence 0.05`
+ - Check whether birds are actually singing
+ - Verify that the microphone is working
+
+ **Too many false positives:**
+
+ - Increase the confidence threshold: `--confidence 0.3`
+ - Reduce background noise
+ - Position the microphone outdoors
+
+ ## Example Sessions
+
+ ### Backyard Birding
+
+ ```bash
+ # Conservative detection for mixed environment
+ python realtime_detection.py --confidence 0.2 --top-k 3
+ ```
+
+ ### Bird Walk
+
+ ```bash
+ # Sensitive detection for bird-rich areas
+ python realtime_detection.py --confidence 0.1 --top-k 8 --update-interval 0.5
+ ```
+
+ ### Indoor Testing
+
+ ```bash
+ # High threshold for testing with recorded sounds
+ python realtime_detection.py --confidence 0.4 --top-k 5
+ ```
+
+ ## Tips for Best Results
+
+ 1. **Positioning**: Place the microphone outdoors or near an open window
+ 2. **Timing**: Early morning and evening are typically best for bird activity
+ 3. **Environment**: Choose quiet locations with minimal human and traffic noise
+ 4. **Distance**: The microphone should be within 10-20 feet of singing birds
+ 5. **Weather**: Calm, clear conditions provide the best audio quality
+
+ ## Stopping Detection
+
+ Press `Ctrl+C` at any time to stop detection and return to the terminal.
+
+ The detector will display:
+
+ ```
+ 🛑 Detection stopped.
+ ```
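
The script signals its worker threads with a boolean `running` flag. A `threading.Event` is a common alternative, because waiting on it doubles as an interruptible sleep. The sketch below is not the committed implementation. `worker` and `stop_event` are hypothetical names, and the loop body stands in for buffering and inference.

```python
import threading
import time


def worker(stop_event: threading.Event, ticks: list) -> None:
    """Stand-in for a processing loop; exits promptly once the event is set."""
    while not stop_event.is_set():
        ticks.append(time.monotonic())
        # wait() returns early when the event is set, unlike time.sleep().
        stop_event.wait(timeout=0.05)


stop_event = threading.Event()
ticks: list = []
thread = threading.Thread(target=worker, args=(stop_event, ticks), daemon=True)
thread.start()

try:
    time.sleep(0.2)           # a real script would run here until Ctrl+C
except KeyboardInterrupt:
    pass
finally:
    stop_event.set()          # ask the worker to finish
    thread.join(timeout=1.0)  # give it a moment to exit cleanly

print("worker alive:", thread.is_alive())
```

Joining the thread before printing the final message ensures no worker output appears after the shutdown notice.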
realtime_detection.py ADDED
@@ -0,0 +1,378 @@
+ #!/usr/bin/env python3
+ """BirdNET Real-Time Audio Classification Script
+
+ This script captures audio from the microphone and uses the BirdNET ONNX model
+ to predict bird species in real-time with continuous display updates.
+
+ Created using Copilot.
+ """
+
+ from __future__ import annotations
+
+ import numpy as np
+ import sounddevice as sd
+ import onnxruntime as ort
+ import argparse
+ import os
+ import time
+ import threading
+ from collections import deque
+ from datetime import datetime
+ import queue
+
+
+ class RealTimeBirdDetector:
+     """Real-time bird detection using microphone input."""
+
+     def __init__(
+         self,
+         model_path: str = "model.onnx",
+         labels_path: str = "BirdNET_GLOBAL_6K_V2.4_Labels.txt",
+         sample_rate: int = 48000,
+         window_duration: float = 3.0,
+         confidence_threshold: float = 0.1,
+         top_k: int = 5,
+         update_interval: float = 1.0,
+     ):
+         """
+         Initialize the real-time bird detector.
+
+         Args:
+             model_path: Path to the ONNX model file
+             labels_path: Path to the species labels file
+             sample_rate: Audio sample rate (48kHz for BirdNET)
+             window_duration: Duration of each analysis window in seconds
+             confidence_threshold: Minimum confidence for detections
+             top_k: Number of top predictions to display
+             update_interval: How often to update predictions (seconds)
+         """
+         self.model_path = model_path
+         self.labels_path = labels_path
+         self.sample_rate = sample_rate
+         self.window_duration = window_duration
+         self.window_size = int(sample_rate * window_duration)
+         self.confidence_threshold = confidence_threshold
+         self.top_k = top_k
+         self.update_interval = update_interval
+
+         # Audio buffer for continuous recording
+         self.audio_buffer = deque(maxlen=self.window_size * 2)  # 6 seconds buffer
+         self.audio_queue = queue.Queue()
+
+         # Detection results
+         self.current_detections = []
+         self.detection_history = deque(maxlen=100)  # Keep last 100 detections
+         self.running = False
+
+         # Load model and labels
+         self._load_model()
+         self._load_labels()
+
+     def _load_model(self) -> None:
+         """Load the ONNX model."""
+         try:
+             print(f"Loading ONNX model: {self.model_path}")
+             self.session = ort.InferenceSession(self.model_path)
+
+             # Get model info
+             input_info = self.session.get_inputs()[0]
+             output_info = self.session.get_outputs()[0]
+             print(f"Model input: {input_info.name}, shape: {input_info.shape}")
+             print(f"Model output: {output_info.name}, shape: {output_info.shape}")
+
+         except Exception as e:
+             raise RuntimeError(f"Error loading ONNX model {self.model_path}: {e}") from e
+
+     def _load_labels(self) -> None:
+         """Load species labels from file."""
+         try:
+             print(f"Loading labels from: {self.labels_path}")
+             self.labels = []
+             with open(self.labels_path, "r", encoding="utf-8") as f:
+                 for line in f:
+                     line = line.strip()
+                     if line:
+                         # Format: "Scientific name_Common Name"; keep the common name
+                         if "_" in line:
+                             common_name = line.split("_", 1)[1]
+                             self.labels.append(common_name)
+                         else:
+                             self.labels.append(line)
+             print(f"Loaded {len(self.labels)} species labels")
+
+         except Exception as e:
+             raise RuntimeError(
+                 f"Error loading labels file {self.labels_path}: {e}"
+             ) from e
+
+     def _audio_callback(
+         self, indata: np.ndarray, frames: int, time_info, status
+     ) -> None:
+         """Callback function for audio input."""
+         if status:
+             print(f"Audio status: {status}")
+
+         # Convert stereo to mono if needed
+         if len(indata.shape) > 1:
+             audio_data = np.mean(indata, axis=1)
+         else:
+             audio_data = indata.flatten()
+
+         # Add to queue for processing
+         self.audio_queue.put(audio_data.copy())
+
+     def _process_audio_buffer(self) -> None:
+         """Process audio data from the queue."""
+         while self.running:
+             try:
+                 # Get audio data from queue (with timeout)
+                 audio_chunk = self.audio_queue.get(timeout=0.1)
+
+                 # Add to rolling buffer
+                 self.audio_buffer.extend(audio_chunk)
+
+                 # Run inference on every block once a full window is buffered
+                 if len(self.audio_buffer) >= self.window_size:
+                     # Get the most recent window
+                     window_data = np.array(list(self.audio_buffer)[-self.window_size :])
+
+                     # Run inference
+                     self._analyze_audio_window(window_data)
+
+             except queue.Empty:
+                 continue
+             except Exception as e:
+                 print(f"Error processing audio: {e}")
+
+     def _analyze_audio_window(self, audio_data: np.ndarray) -> None:
+         """Analyze a single audio window."""
+         try:
+             # Ensure correct format
+             audio_data = audio_data.astype(np.float32)
+
+             # Add batch dimension
+             input_data = np.expand_dims(audio_data, axis=0)
+
+             # Get input name from the model
+             input_name = self.session.get_inputs()[0].name
+
+             # Run inference
+             outputs = self.session.run(None, {input_name: input_data})
+             predictions = outputs[0]
+
+             # Get scores for this window
+             predictions = np.array(predictions)
+             if len(predictions.shape) > 1:
+                 scores = predictions[0]
+             else:
+                 scores = predictions
+
+             # Find detections above threshold
+             above_threshold = np.where(scores > self.confidence_threshold)[0]
+
+             # Create detection results
+             detections = []
+             for idx in above_threshold:
+                 confidence = float(scores[idx])
+                 species_name = (
+                     self.labels[idx] if idx < len(self.labels) else f"Class {idx}"
+                 )
+                 detections.append(
+                     {
+                         "species": species_name,
+                         "confidence": confidence,
+                         "timestamp": datetime.now(),
+                     }
+                 )
+
+             # Sort by confidence
+             detections.sort(key=lambda x: x["confidence"], reverse=True)
+
+             # Update current detections
+             self.current_detections = detections[: self.top_k]
+
+             # Add to history
+             if detections:
+                 self.detection_history.extend(detections[: self.top_k])
+
+         except Exception as e:
+             print(f"Error during inference: {e}")
+
+     def _display_results(self) -> None:
+         """Continuously display detection results."""
+         while self.running:
+             try:
+                 # Clear screen (works on most terminals)
+                 os.system("clear" if os.name == "posix" else "cls")
+
+                 # Display header
+                 print("🎤 BirdNET Real-Time Detection")
+                 print("=" * 50)
+                 print(f"Listening... (Confidence > {self.confidence_threshold:.2f})")
+                 print(f"Time: {datetime.now().strftime('%H:%M:%S')}")
+                 print()
+
+                 # Display current detections
+                 if self.current_detections:
+                     print(
+                         f"🐦 Current Detections (Top {len(self.current_detections)}):"
+                     )
+                     print("-" * 40)
+                     for i, detection in enumerate(self.current_detections, 1):
+                         confidence = detection["confidence"]
+                         species = detection["species"]
+                         # Add confidence bars, clamped to the 0-20 character range
+                         bar_length = max(0, min(20, int(confidence * 20)))
+                         bar = "█" * bar_length + "░" * (20 - bar_length)
+                         print(f"{i:2d}. {species}")
+                         print(f" {bar} {confidence:.4f}")
+                 else:
+                     print("🔍 No detections above threshold...")
+
+                 print()
+
+                 # Display recent activity
+                 if self.detection_history:
+                     print("📊 Recent Activity (Last 10):")
+                     print("-" * 40)
+                     recent = list(self.detection_history)[-10:]
+                     for detection in reversed(recent):
+                         timestamp = detection["timestamp"].strftime("%H:%M:%S")
+                         species = detection["species"]
+                         confidence = detection["confidence"]
+                         print(f"{timestamp} - {species} ({confidence:.3f})")
+
+                 print()
+                 print("Press Ctrl+C to stop")
+
+                 # Wait before next update
+                 time.sleep(self.update_interval)
+
+             except KeyboardInterrupt:
+                 break
+             except Exception as e:
+                 print(f"Display error: {e}")
+
+     def start_detection(self) -> None:
+         """Start real-time detection."""
+         try:
+             print("Starting real-time bird detection...")
+             print(f"Sample rate: {self.sample_rate} Hz")
+             print(f"Window size: {self.window_duration} seconds")
+             print(f"Confidence threshold: {self.confidence_threshold}")
+             print("Press Ctrl+C to stop\n")
+
+             self.running = True
+
+             # Start audio processing thread
+             audio_thread = threading.Thread(
+                 target=self._process_audio_buffer, daemon=True
+             )
+             audio_thread.start()
+
+             # Start display thread
+             display_thread = threading.Thread(target=self._display_results, daemon=True)
+             display_thread.start()
+
+             # Start audio input stream
+             with sd.InputStream(
+                 callback=self._audio_callback,
+                 channels=1,
+                 samplerate=self.sample_rate,
+                 blocksize=int(self.sample_rate * 0.1),  # 100ms blocks
+                 dtype=np.float32,
+             ):
+                 print("🎤 Microphone active - listening for birds...")
+
+                 # Keep main thread alive
+                 try:
+                     while self.running:
+                         time.sleep(0.1)
+                 except KeyboardInterrupt:
+                     pass
+
+         except Exception as e:
+             print(f"Error during detection: {e}")
+         finally:
+             self.running = False
+             print("\n🛑 Detection stopped.")
+
+     def stop_detection(self) -> None:
+         """Stop detection."""
+         self.running = False
+
+
+ def main() -> int:
+     """Main function for real-time detection."""
+     parser = argparse.ArgumentParser(
+         description="BirdNET Real-Time Audio Classification"
+     )
+     parser.add_argument(
+         "--model", default="model.onnx", help="Path to the ONNX model file"
+     )
+     parser.add_argument(
+         "--labels",
+         default="BirdNET_GLOBAL_6K_V2.4_Labels.txt",
+         help="Path to the labels file",
+     )
+     parser.add_argument(
+         "--confidence",
+         type=float,
+         default=0.1,
+         help="Minimum confidence threshold for detections (default: 0.1)",
+     )
+     parser.add_argument(
+         "--top-k",
+         type=int,
+         default=5,
+         help="Number of top predictions to show (default: 5)",
+     )
+     parser.add_argument(
+         "--update-interval",
+         type=float,
+         default=1.0,
+         help="Display update interval in seconds (default: 1.0)",
+     )
+     parser.add_argument(
+         "--list-devices", action="store_true", help="List available audio input devices"
+     )
+
+     args = parser.parse_args()
+
+     # List audio devices if requested
+     if args.list_devices:
+         print("Available audio input devices:")
+         print(sd.query_devices())
+         return 0
+
+     # Check if files exist
+     if not os.path.exists(args.model):
+         print(f"Error: Model file '{args.model}' not found.")
+         return 1
+
+     if not os.path.exists(args.labels):
+         print(f"Error: Labels file '{args.labels}' not found.")
+         return 1
+
+     try:
+         # Create detector
+         detector = RealTimeBirdDetector(
+             model_path=args.model,
+             labels_path=args.labels,
+             confidence_threshold=args.confidence,
+             top_k=args.top_k,
+             update_interval=args.update_interval,
+         )
+
+         # Start detection
+         detector.start_detection()
+
+         return 0
+
+     except Exception as e:
+         print(f"Error: {e}")
+         return 1
+
+
+ if __name__ == "__main__":
+     # SystemExit works even where the interactive-only exit() helper is absent
+     raise SystemExit(main())
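
`_load_labels` keeps only the text after the first underscore of each label line. A small standalone sketch of that rule follows. The sample lines are illustrative values in the `Scientific name_Common Name` layout and are not read from the actual labels file. `common_name` is a hypothetical helper name.

```python
def common_name(label_line: str) -> str:
    """Return the part after the first underscore, or the whole line if none."""
    line = label_line.strip()
    _, sep, rest = line.partition("_")
    return rest if sep else line


samples = [
    "Turdus migratorius_American Robin",
    "Melospiza melodia_Song Sparrow",
    "Unknown",
]
print([common_name(s) for s in samples])
# ['American Robin', 'Song Sparrow', 'Unknown']
```

Splitting on the first underscore only matters when a common name itself contains an underscore; everything after the first separator is kept intact.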
requirements.txt CHANGED
@@ -1,3 +1,5 @@
  numpy>=1.21.0
  librosa>=0.9.0
  onnxruntime>=1.20.0
+ sounddevice>=0.4.0
+ soundfile>=0.10.0