Huu Nguyen

huu-ontocord

AI & ML interests

None yet

Recent Activity

updated a dataset about 18 hours ago

mixture-vitae/funcall

published a dataset about 20 hours ago

mixture-vitae/funcall

new activity 1 day ago

fineinstructions/finetemplates: can you add a license

authored a paper about 1 month ago

MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources

Paper • 2509.25531 • Published Sep 29 • 7

authored 2 papers 4 months ago

EmoNet-Face: An Expert-Annotated Benchmark for Synthetic Emotion Recognition

Paper • 2505.20033 • Published May 26 • 4

EmoNet-Voice: A Fine-Grained, Expert-Verified Benchmark for Speech Emotion Detection

Paper • 2506.09827 • Published Jun 11 • 20

authored 3 papers 8 months ago

RedPajama: an Open Dataset for Training Large Language Models

Paper • 2411.12372 • Published Nov 19, 2024 • 56

LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps

Paper • 2412.15035 • Published Dec 19, 2024 • 4

Project Alexandria: Towards Freeing Scientific Knowledge from Copyright Burdens via LLMs

Paper • 2502.19413 • Published Feb 26 • 20

authored 4 papers over 1 year ago

ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming

Paper • 2404.08676 • Published Apr 6, 2024 • 3

OpenAssistant Conversations -- Democratizing Large Language Model Alignment

Paper • 2304.07327 • Published Apr 14, 2023 • 7

Data Governance in the Age of Large-Scale Data-Driven Language Technology

Paper • 2206.03216 • Published May 4, 2022

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Paper • 2404.00399 • Published Mar 30, 2024 • 42