Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models Paper • 2412.06748 • Published Dec 9, 2024 • 2
DynaGuard: A Dynamic Guardrail Model With User-Defined Policies Paper • 2509.02563 • Published Sep 2, 2025 • 20
Exploiting Sparsity for Long Context Inference: Million Token Contexts on Commodity GPUs Paper • 2502.06766 • Published Feb 10, 2025
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7, 2025 • 150
LiveBench: A Challenging, Contamination-Free LLM Benchmark Paper • 2406.19314 • Published Jun 27, 2024 • 23
GenQA: Generating Millions of Instructions from a Handful of Prompts Paper • 2406.10323 • Published Jun 14, 2024 • 5
Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs Paper • 2406.10209 • Published Jun 14, 2024 • 8
Transformers Can Do Arithmetic with the Right Embeddings Paper • 2405.17399 • Published May 27, 2024 • 54
Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery Paper • 2302.03668 • Published Feb 7, 2023 • 1
NEFTune: Noisy Embeddings Improve Instruction Finetuning Paper • 2310.05914 • Published Oct 9, 2023 • 14
Baseline Defenses for Adversarial Attacks Against Aligned Language Models Paper • 2309.00614 • Published Sep 1, 2023 • 2
Bring Your Own Data! Self-Supervised Evaluation for Large Language Models Paper • 2306.13651 • Published Jun 23, 2023 • 15