wenhua cheng

wenhuach
  • wenhuach21

AI & ML interests

Model Compression, CV

Recent Activity

posted an update 1 day ago
🚀 AutoRound (https://github.com/intel/auto-round) is now supported by SGLang! After integrations with TorchAO, Transformers, and vLLM, AutoRound-quantized models are now officially compatible with SGLang, bringing faster and more flexible deployment to your LLM workflows. 💡 We've also enhanced the RTN mode (--iters 0), cutting quantization costs significantly for low-resource users. ⭐ Star our repo and stay tuned for more exciting updates!
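A minimal sketch of that low-cost path, assuming AutoRound's Python API (the AutoRound class with bits, group_size, and iters arguments) and a placeholder model name chosen only for illustration; iters=0 here corresponds to the --iters 0 RTN mode mentioned above:

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from auto_round import AutoRound

    model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder model, not from the post
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # iters=0 skips the iterative rounding optimization and falls back to the
    # fast RTN-style path (the CLI's --iters 0), trading some accuracy for speed.
    autoround = AutoRound(model, tokenizer, bits=4, group_size=128, iters=0)
    autoround.quantize()
    autoround.save_quantized("./Qwen2.5-0.5B-Instruct-int4", format="auto_round")

The saved checkpoint can then be served with SGLang, typically via something like python -m sglang.launch_server --model-path ./Qwen2.5-0.5B-Instruct-int4 (exact flags depend on your SGLang version).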
reacted to their post with 🚀 1 day ago
AutoRound keeps evolving its LLM quantization algorithm! 🚀 After enhancing W2A16 quantization, we now offer a fast algorithm to generate mixed bits/data-type schemes (~2 minutes for 8B models), great for MXFP4 and W2A16. Learn more: https://github.com/intel/auto-round/blob/main/docs/step_by_step.md#autoscheme
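A rough sketch of generating such a mixed scheme, assuming the AutoScheme helper described in the linked step_by_step doc; the parameter names (avg_bits, options), the quantize_and_save call, and the model name are assumptions for illustration, so defer to that doc for the exact API:

    from auto_round import AutoRound, AutoScheme

    # Assumed API: search per-layer bit/data-type choices so the average
    # bit-width lands near 3, mixing 2-bit and 4-bit layers.
    scheme = AutoScheme(avg_bits=3, options=("W2A16", "W4A16"))

    ar = AutoRound(model="meta-llama/Llama-3.1-8B-Instruct", scheme=scheme)  # placeholder model
    ar.quantize_and_save("./llama-3.1-8b-mixed")

W2A16 is one of the cases the post highlights; swapping the options for MXFP4 variants would follow the same pattern.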
new activity 1 day ago
Intel/Ling-flash-2.0-gguf-q2ks-mixed-AutoRound: Inference with llama.cpp + Open WebUI gives repeating `?`

Organizations

Intel, Need4Speed, Qwen

wenhuach's models

None public yet