owen

kenaniah

AI & ML interests

None yet

Organizations

Recent Activity

upvoted a paper 27 days ago

Paper2Video: Automatic Video Generation from Scientific Papers

Paper • 2510.05096 • Published 28 days ago • 109

upvoted a paper about 1 month ago

Code2Video: A Code-centric Paradigm for Educational Video Generation

Paper • 2510.01174 • Published Oct 1 • 33

upvoted a paper 3 months ago

A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models

Paper • 2508.01548 • Published Aug 3 • 13

upvoted a paper 5 months ago

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published May 27 • 108

upvoted a paper 6 months ago

LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

Paper • 2504.16030 • Published Apr 22 • 37

upvoted a paper 7 months ago

Long-Context Autoregressive Video Modeling with Next-Frame Prediction

Paper • 2503.19325 • Published Mar 25 • 73

upvoted 5 papers 8 months ago

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Paper • 2503.03651 • Published Mar 5 • 16

liked 2 datasets 8 months ago

lyan62/FoodieQA

Viewer • Updated Jun 20 • 392 • 26 • 12

MMInstruction/M3IT

Updated Nov 24, 2023 • 3.14k • 130

upvoted 2 papers 11 months ago

ROICtrl: Boosting Instance Control for Visual Generation

Paper • 2411.17949 • Published Nov 27, 2024 • 87

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 88

upvoted a paper 12 months ago

ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Paper • 2411.05003 • Published Nov 7, 2024 • 71

upvoted 2 papers about 1 year ago

EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models

Paper • 2410.07133 • Published Oct 9, 2024 • 19

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

Paper • 2408.12528 • Published Aug 22, 2024 • 51

upvoted a paper over 1 year ago

VideoLLM-online: Online Video Large Language Model for Streaming Video

Paper • 2406.11816 • Published Jun 17, 2024 • 25
