ComfyUI will still cast all layers to fp8

#2 opened by silveroxides

With fp8_scaled models that have layers at higher precision, ComfyUI will still convert those layers to fp8 when loading by default. You need a node that uses custom ops in order to prevent this. That's how I got around it with my custom node and the fp8_scaled variant of your distill models that I made.
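The difference, as a rough standalone sketch (not ComfyUI's loader code and not my node's actual implementation; the function names are just for illustration):

```python
import torch

FP8_DTYPES = {torch.float8_e4m3fn, torch.float8_e5m2}

def cast_everything(state_dict, storage_dtype=torch.float8_e4m3fn):
    # Blanket cast: every floating-point tensor ends up in fp8, including
    # the biases/norms the checkpoint deliberately keeps at float32.
    return {k: (v.to(storage_dtype) if v.is_floating_point() else v)
            for k, v in state_dict.items()}

def cast_preserving(state_dict, storage_dtype=torch.float8_e4m3fn):
    # Dtype-preserving variant: only tensors already stored in fp8 get the
    # storage dtype; everything else stays at the precision it shipped with.
    return {k: (v.to(storage_dtype) if v.dtype in FP8_DTYPES else v)
            for k, v in state_dict.items()}
```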

Any more specifics on this? I'm not sure where the right place to check is, but I tried dumping to_load in BaseModel.load_model_weights during UNETLoader, and I see:

Loading weight: blocks.0.cross_attn.k.bias, shape: torch.Size([5120]), dtype: torch.float32
Loading weight: blocks.0.cross_attn.k.weight, shape: torch.Size([5120, 5120]), dtype: torch.float8_e4m3fn

Maybe something is happening later; it would be nice to have more info.
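One crude check is to tally the parameter dtypes on the model object after the loader has run; if something downstream casts the biases, it should show up here (the attribute path in the comment is a guess and may differ per loader):

```python
from collections import Counter
import torch

def dtype_report(module: torch.nn.Module):
    # Tally parameter dtypes of the loaded diffusion model so it's obvious
    # whether the float32 biases survive loading or get folded into fp8 later.
    counts = Counter(p.dtype for p in module.parameters())
    for dtype, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        print(f"{dtype}: {n} tensors")

# e.g. right after UNETLoader returns a model patcher
# (the .model.diffusion_model path is an assumption):
# dtype_report(model.model.diffusion_model)
```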

That's a WVW workflow; all of my programmatic workflows are built around Comfy's built-in WAN nodes, and WVW is its own separate universe. It would be helpful to know where ComfyUI is actually forcing layers to FP8, if it is; I haven't found it yet.
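If anyone wants to hunt for where that cast happens, a generic PyTorch debugging sketch like this should catch it in the act (not ComfyUI-specific; it just reports any dtype-changing copy plus the Python stack that triggered it, and the wrapped call at the bottom is a placeholder):

```python
import traceback
import torch
from torch.utils._python_dispatch import TorchDispatchMode

class CastLogger(TorchDispatchMode):
    # Logs every dtype-changing copy dispatched while the mode is active,
    # with a short Python stack so you can see which loader code did it.
    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        if func is torch.ops.aten._to_copy.default:
            src, dst = args[0], kwargs.get("dtype")
            if dst is not None and dst != src.dtype:
                print(f"cast {tuple(src.shape)} {src.dtype} -> {dst}")
                traceback.print_stack(limit=6)
        return func(*args, **kwargs)

# usage: wrap whatever call actually loads the model, e.g.
# with CastLogger():
#     run_the_unet_loader()  # placeholder for the actual load call
```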

It would also be good to know whether it matters: are these layers just kept at higher precision because they're small and it can't hurt, or was it found to help? I'm a little surprised that quantized models don't always leave smaller layers alone (or maybe they do and I haven't noticed).
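For what it's worth, keeping them at higher precision should cost almost nothing: with the shapes from the log above, a 5120x5120 fp8 weight is about 25 MiB while a 5120-element fp32 bias is about 20 KiB. A quick way to see the per-dtype share of a checkpoint (the filename is hypothetical):

```python
from collections import Counter
from safetensors import safe_open

def size_by_dtype(path):
    # Sum bytes per dtype in a .safetensors checkpoint to see how much of
    # the file the higher-precision layers actually account for.
    totals = Counter()
    with safe_open(path, framework="pt", device="cpu") as f:
        for key in f.keys():
            t = f.get_tensor(key)
            totals[str(t.dtype)] += t.numel() * t.element_size()
    for dtype, nbytes in totals.most_common():
        print(f"{dtype}: {nbytes / 2**20:.1f} MiB")

# size_by_dtype("wan_distill_fp8_scaled.safetensors")  # hypothetical filename
```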
