feat: Add CPU support

#18

by gabegoodhart - opened Aug 28

base: refs/heads/main

←

from: refs/pr/18


+38

-12

gabegoodhart

Aug 28

Description

This PR adds support to modeling_nemotron.py for running inference on CPU. It is a cleaned-up version of the edits I made while working on support for this model in llama.cpp.

Changes

Handle failed imports of rmsnorm_fn
Add an unoptimized implementation of MambaRMSNormGated.forward
Fix NemotronHMamba2Mixer.torch_forward to use repeat_interleave for B and C (see discussion here)
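
For readers who want a feel for the three changes without opening the diff, here is a minimal sketch of what they could look like. This is not the code from the PR. The import path for rmsnorm_fn, the helper names gated_rmsnorm_reference and expand_groups_to_heads, the group_size parameter, and the choice to apply the gate before normalizing are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

try:
    # Fused Triton kernel, normally only usable with a GPU build.
    # The import path is an assumption about the mamba_ssm package layout.
    from mamba_ssm.ops.triton.layernorm_gated import rmsnorm_fn
except ImportError:
    # On CPU-only installs, fall back to the pure PyTorch path below.
    rmsnorm_fn = None


def gated_rmsnorm_reference(hidden_states, gate, weight, eps=1e-5, group_size=None):
    """Unoptimized gated RMSNorm (hypothetical helper, gate applied first)."""
    input_dtype = hidden_states.dtype
    x = hidden_states.to(torch.float32)
    if gate is not None:
        x = x * F.silu(gate.to(torch.float32))
    if group_size is None:
        group_size = x.shape[-1]  # one group spanning the whole hidden dim
    # Normalize each group of `group_size` channels independently.
    grouped = x.reshape(*x.shape[:-1], -1, group_size)
    variance = grouped.pow(2).mean(-1, keepdim=True)
    grouped = grouped * torch.rsqrt(variance + eps)
    x = grouped.reshape(x.shape)
    return (weight.to(torch.float32) * x).to(input_dtype)


def expand_groups_to_heads(t, num_heads):
    """Expand (batch, seq, n_groups, state) to (batch, seq, num_heads, state).

    repeat_interleave keeps the heads of one group adjacent:
    [g0, g0, g1, g1] rather than the tiled [g0, g1, g0, g1].
    """
    n_groups = t.shape[2]
    return t.repeat_interleave(num_heads // n_groups, dim=2)


# Two groups labeled 0 and 1, expanded to four heads.
B = torch.arange(2, dtype=torch.float32).view(1, 1, 2, 1)
interleaved = expand_groups_to_heads(B, num_heads=4).flatten().tolist()
tiled = B.repeat(1, 1, 2, 1).flatten().tolist()
```

In this sketch, interleaved is [0.0, 0.0, 1.0, 1.0] while tiled is [0.0, 1.0, 0.0, 1.0]. The two orderings assign different groups to heads 1 and 2, which is why the choice between repeat and repeat_interleave matters whenever there is more than one group.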

feat: Add CPU support (3ef26b4b)


Ready to merge

This branch is ready to get merged automatically.
