arxiv:2504.10068
Yushuo Guan
UnnamedWatcher
AI & ML interests
None yet
Recent Activity
upvoted a paper 17 days ago
AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration
upvoted a paper 5 months ago
MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios
authored a paper 7 months ago
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
Organizations
None yet