| <!DOCTYPE html> | |
| <html lang="en"> | |
| <head> | |
| <meta charset="UTF-8"> | |
| <meta name="viewport" content="width=device-width, initial-scale=1.0"> | |
| <title>Speech-to-Speech Model Comparison</title> | |
| <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0-alpha1/dist/css/bootstrap.min.css" rel="stylesheet"> | |
| <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0-beta3/css/all.min.css"> | |
| <style> | |
| body { | |
| background-color: #f0f8ff; | |
| font-family: 'Arial', sans-serif; | |
| } | |
| .container { | |
| background-color: #fff; | |
| border-radius: 15px; | |
| box-shadow: 0 6px 15px rgba(0, 0, 0, 0.15); | |
| padding: 40px; | |
| max-width: 800px; | |
| margin: 30px auto; | |
| } | |
| h3 { | |
| font-size: 2rem; | |
| font-weight: bold; | |
| color: #333; | |
| text-align: center; | |
| margin-bottom: 20px; | |
| } | |
| p { | |
| color: #555; | |
| font-size: 1rem; | |
| line-height: 1.8; | |
| } | |
| .btn { | |
| border-radius: 25px; | |
| font-size: 1.1rem; | |
| padding: 12px 25px; | |
| font-weight: bold; | |
| transition: background-color 0.3s ease, transform 0.2s ease; | |
| } | |
| .btn-primary { | |
| background-color: #007bff; | |
| border: none; | |
| } | |
| .btn-primary:hover { | |
| background-color: #0056b3; | |
| transform: scale(1.05); | |
| } | |
| .icon { | |
| color: #f39c12; | |
| margin-right: 5px; | |
| } | |
| .section-title { | |
| font-size: 1.2rem; | |
| font-weight: bold; | |
| color: #007bff; | |
| display: flex; | |
| align-items: center; | |
| margin-top: 20px; | |
| } | |
| .section-title .fas { | |
| margin-right: 10px; | |
| } | |
| .audio-container { | |
| text-align: center; | |
| margin-top: 20px; | |
| } | |
| .audio-container .audio-item { | |
| display: flex; | |
| justify-content: center; | |
| align-items: center; | |
| margin-bottom: 15px; | |
| } | |
| .audio-container .audio-item span { | |
| margin-right: 10px; | |
| font-weight: bold; | |
| } | |
| audio { | |
| display: inline-block; | |
| } | |
| </style> | |
| </head> | |
| <body> | |
| <div class="container py-5"> | |
| <h3 class="mb-4">⚖️ Speech-to-Speech Model Comparison</h3> | |
| <div id="evaluation-info" class="mb-5"> | |
| <p class="text-start"> | |
| <span class="section-title"><i class="fas fa-info-circle"></i> Welcome to the Speech-to-Speech (S2S) | |
| Model Evaluation! 👏</span> | |
| In this evaluation, you will assess the performance of different S2S models, such as | |
| <strong>ChatGPT-4o</strong>, <strong>FunAudioLLM</strong>, <strong>SpeechGPT</strong>, | |
| <strong>Mini-Omni</strong>, <strong>Cascade</strong>, and <strong>LLaMA-Omni</strong>. | |
| <br> | |
| <span>🎯 <strong>Goal:</strong> Test how well these models handle speech tasks across different domains.</span> | |
| <span class="section-title"><i class="fas fa-tasks"></i> How It Works</span> | |
| Once you select a specific domain and task (e.g., <em>Educational Tutoring</em> and <em>Rhythm | |
| Control</em>), | |
| you will proceed to the evaluation stage. In each round, you will be presented with an audio input. | |
| <br> | |
| <span><strong>🌰 Example:</strong></span> | |
| <div class="audio-container"> | |
| <div class="audio-item"> | |
| <span>Audio Sample:</span> | |
| <audio controls> | |
| <source src="/static/audio/sample/input_audio.wav" type="audio/wav"> | |
| </audio> | |
| </div> | |
| </div> | |
| The corresponding text is: | |
| <em>"Say the following sentence at my speed first, then say it again very slowly: | |
| 'Artificial intelligence is changing the world in many ways.'" </em> 🧠 | |
| <small>(Note: the audio sample plays at 1.5x normal speed.)</small> | |
| <span class="section-title"><i class="fas fa-star"></i> Model Performance</span> | |
| <div class="audio-container"> | |
| <div class="audio-item"> | |
| <span>ChatGPT-4o:</span> | |
| <audio controls> | |
| <source src="/static/audio/sample/4o_audio.wav" type="audio/wav"> | |
| </audio> | |
| </div> | |
| <p style="margin: 0; text-align: left;"> | |
| 🎙️ <strong>Speech:</strong> Partially followed the instruction on speed. | |
| </p> | |
| <p style="margin: 0; text-align: left;"> | |
| 🧾 <strong>Semantics:</strong> Accurately followed the instruction, with no semantic deviation or | |
| missing information. | |
| </p> | |
| <br> | |
| <div class="audio-item"> | |
| <span>FunAudioLLM:</span> | |
| <audio controls> | |
| <source src="/static/audio/sample/FunAudio_audio.wav" type="audio/wav"> | |
| </audio> | |
| </div> | |
| <p style="margin: 0; text-align: left;"> | |
| 🎙️ <strong>Speech:</strong> Partially followed the instruction on speed. | |
| </p> | |
| <p style="margin: 0; text-align: left;"> | |
| 🧾 <strong>Semantics:</strong> Accurately followed the instruction, with no semantic deviation or | |
| missing information. | |
| </p> | |
| <br> | |
| <div class="audio-item"> | |
| <span>SpeechGPT:</span> | |
| <audio controls> | |
| <source src="/static/audio/sample/SpeechGPT.wav" type="audio/wav"> | |
| </audio> | |
| </div> | |
| <p style="margin: 0; text-align: left;"> | |
| 🎙️ <strong>Speech:</strong> Did not follow the instruction on speed. | |
| </p> | |
| <p style="margin: 0; text-align: left;"> | |
| 🧾 <strong>Semantics:</strong> Partially followed the instruction, with minor semantic deviation and | |
| missing information. | |
| </p> | |
| <br> | |
| <div class="audio-item"> | |
| <span>Mini-Omni:</span> | |
| <audio controls> | |
| <source src="/static/audio/sample/mini-omni.wav" type="audio/wav"> | |
| </audio> | |
| </div> | |
| <p style="margin: 0; text-align: left;"> | |
| 🎙️ <strong>Speech:</strong> Did not follow the instruction on speed. | |
| </p> | |
| <p style="margin: 0; text-align: left;"> | |
| 🧾 <strong>Semantics:</strong> Did not follow the instruction, with significant semantic deviation | |
| and missing information. | |
| </p> | |
| </div> | |
| <p class="text-start"> | |
| After making your choice, you'll proceed to the next round. 🔄 | |
| </p> | |
| <p class="text-start"> | |
| <strong>Click the button below to start the evaluation! 🚀</strong> | |
| </p> | |
| </div> | |
| <div class="text-center"> | |
| <a href="http://71.132.14.167:6002/" target="_blank" rel="noopener noreferrer" class="btn btn-primary"><i class="fas fa-play"></i> | |
| Start Evaluation</a> | |
| </div> | |
| </div> | |
| </body> | |
| </html> |