- WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model (Paper 2411.17459)
- MAGVIT: Masked Generative Video Transformer (Paper 2212.05199)
- Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation (Paper 2310.05737)
- Finite Scalar Quantization: VQ-VAE Made Simple (Paper 2309.15505)
Inui
Norm
AI & ML interests
Video Diffusion; Large Language Model; Object Detection; OCR
Recent Activity
- Upvoted a paper 19 days ago: Less is More: Recursive Reasoning with Tiny Networks
- Liked a model about 1 month ago: rednote-hilab/dots.ocr
- Liked a model 2 months ago: meituan-longcat/LongCat-Flash-Chat

