- In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss (Paper • 2402.10790 • Published • 42)
- Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (Paper • 2408.03314 • Published • 63)
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking (Paper • 2403.09629 • Published • 78)
Gabriel Pendl
jompaaa
AI & ML interests
None yet
Recent Activity
- liked a model 1 day ago: LiquidAI/LFM2-ColBERT-350M
- liked a model 15 days ago: deepseek-ai/DeepSeek-V3.2-Exp
- liked a model 16 days ago: Qwen/Qwen3-Embedding-8B-GGUF
Organizations
None yet
