Xiangyu Z

PhoenixZ

AI & ML interests

None yet

Organizations

None yet

upvoted 3 papers 18 days ago

Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs

Paper • 2510.13795 • Published 18 days ago • 50

InteractiveOmni: A Unified Omni-modal Model for Audio-Visual Multi-turn Dialogue

Paper • 2510.13747 • Published 18 days ago • 29

FlashWorld: High-quality 3D Scene Generation within Seconds

Paper • 2510.13678 • Published 18 days ago • 70

upvoted a paper 24 days ago

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

Paper • 2510.08540 • Published 24 days ago • 108

upvoted a paper about 2 months ago

GenExam: A Multidisciplinary Text-to-Image Exam

Paper • 2509.14232 • Published Sep 17 • 21

upvoted a paper 2 months ago

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21 • 255

upvoted a paper 3 months ago

Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation

Paper • 2508.05635 • Published Aug 7 • 73

upvoted a paper 5 months ago

Deciphering Trajectory-Aided LLM Reasoning: An Optimization Perspective

Paper • 2505.19815 • Published May 26 • 36

upvoted a paper 6 months ago

EnerVerse-AC: Envisioning Embodied Environments with Action Condition

Paper • 2505.09723 • Published May 14 • 23

upvoted 2 papers 7 months ago

MM-IFEngine: Towards Multimodal Instruction Following

Paper • 2504.07957 • Published Apr 10 • 35

Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing

Paper • 2504.02826 • Published Apr 3 • 68

upvoted 4 papers 8 months ago

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Paper • 2503.01743 • Published Mar 3 • 89

upvoted a collection 8 months ago

FLUX.1

Collection

A collection of our FLUX.1 models and LoRAs. • 10 items • Updated 19 days ago • 231

upvoted a paper 8 months ago

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published Feb 25 • 74

upvoted a paper 9 months ago

Redundancy Principles for MLLMs Benchmarks

Paper • 2501.13953 • Published Jan 20 • 29

upvoted a paper 11 months ago

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Paper • 2412.05271 • Published Dec 6, 2024 • 159

upvoted a collection about 1 year ago

CompassJudger

Collection

4 items • Updated Jul 24 • 8
