	What are differences between Max Allocated Memory, Max Reserved Memory and Max Used Memory?
#21 opened by zhiminy
Could anyone explain it? Thanks!
There are multiple ways CUDA memory gets used. PyTorch, for example, allocates memory for tensors, but it also reserves extra memory for its caching allocator, so reserved = allocated + cached. It's important to look at both, because the performance you observe depends on that reserved memory; that's also why you can sometimes load a model successfully but hit an OOM when you run it.
Finally, used ≈ reserved + non-releasable, which is roughly what you'll observe in nvidia-smi, the most external view of memory usage.
More details at https://pytorch.org/docs/stable/generated/torch.cuda.memory_stats.html
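To make the distinction concrete, here is a minimal sketch that queries the allocated and reserved counters after creating a tensor. It assumes PyTorch is installed and is guarded so it degrades gracefully when no CUDA device is present; the tensor shape is arbitrary and just for illustration.

```python
import torch

if torch.cuda.is_available():
    # Allocate a tensor so the caching allocator has live memory to track.
    x = torch.empty(1024, 1024, device="cuda")  # ~4 MB of float32

    allocated = torch.cuda.memory_allocated()  # bytes held by live tensors
    reserved = torch.cuda.memory_reserved()    # bytes held by the caching allocator

    # reserved >= allocated: the allocator keeps a cached pool
    # beyond what live tensors currently occupy.
    print(f"allocated: {allocated} B, reserved: {reserved} B")
    assert reserved >= allocated
else:
    print("No CUDA device available")
```

Note that neither counter matches nvidia-smi exactly, since nvidia-smi also sees the CUDA context and non-releasable memory outside the caching allocator.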
IlyasMoutawwakil changed discussion status to closed