Simplify GeneratorExit handling - let it propagate and rely on finally block 6dc4bdb Luigi committed on Oct 12
Fix GeneratorExit exception in cancel generation by adding proper exception handling 4bc617b Luigi committed on Oct 12
Improve cancel generation with robust UI state management and orchestrator pattern 9ac7f36 Luigi committed on Oct 12
Fix cancel generation to gracefully stop ongoing response generation a73d8f4 Luigi committed on Oct 12
Fix dynamic_shapes kwargs to match inputs structure for AOT compilation 273acf8 Luigi committed on Oct 12
Fix AOT compilation dynamic_shapes to match expected arg names for torch.export.export 0a99dfc Luigi committed on Oct 12
Improve model size detection: replace ad-hoc string parsing with reliable params_b field in MODELS dict ab92e0d Luigi committed on Oct 12
Set better defaults for free-tier users: Qwen3-1.7B model, 1024 max tokens, search disabled 2cae073 Luigi committed on Oct 12
Adjust duration estimation for H200 performance - reduce conservative estimates de766da Luigi committed on Oct 12
Use actual parameter count for AOT decision instead of string matching e3e334f Luigi committed on Oct 12
Make AOT compilation conditional for models >= 2B parameters to optimize free tier usage 4500f92 Luigi committed on Oct 12
Disable two models that cannot run or run too slowly on HF Spaces with ZeroGPU 3dc7ced Luigi committed on Oct 11
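
Commits 4bc617b and 6dc4bdb describe letting GeneratorExit propagate and relying on a finally block for cleanup when generation is cancelled. The repository's actual code is not shown here, so the following is only a minimal sketch of that general Python pattern. The names `stream_tokens` and `state` are hypothetical and chosen for illustration.

```python
def stream_tokens(tokens, state):
    """Yield tokens one at a time and always reset the UI flag on exit.

    `state` is a hypothetical dict standing in for UI state.
    """
    state["generating"] = True
    try:
        for tok in tokens:
            yield tok
    finally:
        # Runs on normal completion, on errors, and when the consumer
        # calls .close(), which raises GeneratorExit at the paused yield.
        # GeneratorExit is not caught here, so it propagates as intended.
        state["generating"] = False


state = {"generating": False}
gen = stream_tokens(["a", "b", "c"], state)
first = next(gen)   # generation has started; the flag is now True
gen.close()         # simulates a cancel; the finally block resets the flag
```

Catching GeneratorExit and then yielding again would raise a RuntimeError on close(), which is one reason to leave it uncaught and keep cleanup in finally.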