Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization • Paper • 2404.09956 • Published Apr 15, 2024
Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency • Paper • 2311.02772 • Published Nov 5, 2023
Toward Joint Language Modeling for Speech Units and Text • Paper • 2310.08715 • Published Oct 12, 2023