Tom Zak

Tomoomo

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 days ago

Emu3.5: Native Multimodal Models are World Learners

upvoted a paper 2 days ago

The End of Manual Decoding: Towards Truly End-to-End Language Models

upvoted a paper 2 days ago

Kimi Linear: An Expressive, Efficient Attention Architecture

Organizations

None yet

upvoted 3 papers 2 days ago

upvoted 3 papers 6 days ago

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14 • 308

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30 • 522

Visual Diffusion Models are Geometric Solvers

Paper • 2510.21697 • Published 10 days ago • 18

upvoted a paper 10 days ago

LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts

Paper • 2510.19363 • Published 12 days ago • 59

commented on a paper about 2 months ago

Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth

Paper • 2509.03867 • Published Sep 4 • 209

liked a Space 3 months ago

207

MegaTTS 3 Voice Cloning

🎤

MegaTTS 3 but with voice cloning!

upvoted 4 papers 4 months ago

FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers

Paper • 2507.12956 • Published Jul 17 • 24

Voxtral

Paper • 2507.13264 • Published Jul 17 • 29

MindJourney: Test-Time Scaling with World Models for Spatial Reasoning

Paper • 2507.12508 • Published Jul 16 • 26

A Survey of Context Engineering for Large Language Models

Paper • 2507.13334 • Published Jul 17 • 258

liked a model 6 months ago

Qwen/Qwen3-235B-A22B

Text Generation • 235B • Updated Jul 26 • 243k • 1.05k

upvoted a collection 7 months ago

Cogito v1 Preview

Collection

5 items • Updated Apr 8 • 119

liked a Space 7 months ago

147

Lumina Image 2.0

🖼

Generate images from text prompts

upvoted 4 papers over 1 year ago

PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM

Paper • 2406.02884 • Published Jun 5, 2024 • 19

Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms

Paper • 2406.02900 • Published Jun 5, 2024 • 14

LiveSpeech: Low-Latency Zero-shot Text-to-Speech via Autoregressive Modeling of Audio Discrete Codes

Paper • 2406.02897 • Published Jun 5, 2024 • 16

Block Transformer: Global-to-Local Language Modeling for Fast Inference

Paper • 2406.02657 • Published Jun 4, 2024 • 41
