xiaobin zhuang

xiaobinzhuang
·
https://scholar.google.com/citations?user=a-crUqgAAAAJ&hl=zh-CN
  • auzxb

AI & ML interests

multimodal; audio generation; post-training

Organizations

None yet

authored 3 papers 5 months ago

Sounding that Object: Interactive Object-Aware Image to Audio Generation

Paper • 2506.04214 • Published Jun 4 • 2

AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models

Paper • 2505.16211 • Published May 22 • 18

MagiCodec: Simple Masked Gaussian-Injected Codec for High-Fidelity Reconstruction and Generation

Paper • 2506.00385 • Published May 31 • 3

authored 4 papers 7 months ago

Seed-TTS: A Family of High-Quality Versatile Speech Generation Models

Paper • 2406.02430 • Published Jun 4, 2024 • 38

DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation

Paper • 2502.03930 • Published Feb 6

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

Paper • 2504.08685 • Published Apr 11 • 130

KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke

Paper • 2110.09121 • Published Oct 18, 2021