Neel Jain

nsjain · neelsjain

Recent Activity

upvoted a paper about 1 month ago

Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLMs

authored a paper 2 months ago

Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models

authored a paper 2 months ago

DynaGuard: A Dynamic Guardrail Model With User-Defined Policies

authored 3 papers 2 months ago

Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models

Paper • 2412.06748 • Published Dec 9, 2024 • 2

DynaGuard: A Dynamic Guardrail Model With User-Defined Policies

Paper • 2509.02563 • Published Sep 2, 2025 • 20

Exploiting Sparsity for Long Context Inference: Million Token Contexts on Commodity GPUs

Paper • 2502.06766 • Published Feb 10, 2025

authored a paper 9 months ago

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published Feb 7, 2025 • 150

authored 7 papers over 1 year ago

LiveBench: A Challenging, Contamination-Free LLM Benchmark

Paper • 2406.19314 • Published Jun 27, 2024 • 23

GenQA: Generating Millions of Instructions from a Handful of Prompts

Paper • 2406.10323 • Published Jun 14, 2024 • 5

Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs

Paper • 2406.10209 • Published Jun 14, 2024 • 8

Transformers Can Do Arithmetic with the Right Embeddings

Paper • 2405.17399 • Published May 27, 2024 • 54

Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery

Paper • 2302.03668 • Published Feb 7, 2023 • 1

NEFTune: Noisy Embeddings Improve Instruction Finetuning

Paper • 2310.05914 • Published Oct 9, 2023 • 14

Baseline Defenses for Adversarial Attacks Against Aligned Language Models

Paper • 2309.00614 • Published Sep 1, 2023 • 2

authored a paper over 2 years ago

Bring Your Own Data! Self-Supervised Evaluation for Large Language Models

Paper • 2306.13651 • Published Jun 23, 2023 • 15