CoBia: Constructed Conversations Can Trigger Otherwise Concealed Societal Biases in LLMs Paper • 2510.09871 • Published 23 days ago • 2
ADAM: A Diverse Archive of Mankind for Evaluating and Enhancing LLMs in Biographical Reasoning Paper • 2509.22991 • Published Sep 26 • 1
MEENA (PersianMMMU): Multimodal-Multilingual Educational Exams for N-level Assessment Paper • 2508.17290 • Published Aug 24 • 8
The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments Paper • 2301.13771 • Published Jan 31, 2023
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26 • 73
How Programming Concepts and Neurons Are Shared in Code Language Models Paper • 2506.01074 • Published Jun 1 • 3
Tracing Multilingual Factual Knowledge Acquisition in Pretraining Paper • 2505.14824 • Published May 20 • 4
ELAB: Extensive LLM Alignment Benchmark in Persian Language Paper • 2504.12553 • Published Apr 17 • 2
Generative AI for Character Animation: A Comprehensive Survey of Techniques, Applications, and Future Directions Paper • 2504.19056 • Published Apr 27 • 18
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation Paper • 2502.08826 • Published Feb 12 • 17
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages Paper • 2410.23825 • Published Oct 31, 2024 • 4
MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment Paper • 2410.05873 • Published Oct 8, 2024 • 3