LightBagel: A Light-weighted, Double Fusion Framework for Unified Multimodal Understanding and Generation Paper • 2510.22946 • Published 5 days ago • 16
Artificial Hippocampus Networks for Efficient Long-Context Modeling Paper • 2510.07318 • Published 24 days ago • 28
OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning Paper • 2505.04601 • Published May 7 • 28
GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset Paper • 2507.21033 • Published Jul 28 • 20