---
title: README
emoji: π
colorFrom: pink
colorTo: blue
sdk: static
pinned: true
license: openrail
---

The NDEM community provides pretrained models along with their checkpoints, with the purpose of:

- Studying the learning dynamics of the models
- Studying how well these learning dynamics match the learning dynamics of the brain

Models are pretrained on the Jean-Zay state-owned supercluster:
This work was granted access to the HPC resources of IDRIS under the allocations 2023-AD011014524 and 2022-AD011013176R1 made by GENCI (P. Orhan).

Models currently available are:

- Wav2vec2 base model (https://huggingface.co/facebook/wav2vec2-base), pretrained (without fine-tuning) on LibriSpeech (English speech), FMA (music), a subset of AudioSet, or all of them together. A model pretrained on the French VoxPopuli dataset is also included.
- Wav2vec2 tiny model, which uses only 3 transformer layers. These models perform surprisingly well for their size.

Scientific papers using the models provided in this repository:
Orhan, P., Boubenec, Y., & King, J.-R. (2024). Algebraic structures emerge from the self-supervised learning of natural sounds. https://doi.org/10.1101/2024.03.13.584776

Models are pretrained using HuggingFace's Trainer.
Pretraining runs are often shorter than the original pretraining (100,000 steps compared to 400,000) because of limited compute resources.
In my experience, most of the emergent phenomena I studied had appeared before 100,000 steps.

Known version compatibility issues for Wav2vec2:
Some Wav2vec2 models were trained with torch <= 2.0.1, while others were trained with torch >= 2.1.1.
This can cause a critical error when loading the model, at the step where the Wav2Vec2PositionalConvEmbedding is loaded, because this layer uses a different module for weight_norm depending on the torch version used for pretraining. This is quite unfortunate.
Consequently, I recommend checking the error message that HuggingFace reports in from_pretrained(), and changing your torch version if the weights cannot be loaded properly.
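
As a minimal sketch of how one might diagnose this mismatch without switching torch versions, the snippet below inspects the key names of a checkpoint's state dict. It relies on the naming used by PyTorch's two weight_norm implementations: the older `torch.nn.utils.weight_norm` stores `weight_g` and `weight_v`, while `torch.nn.utils.parametrizations.weight_norm` stores `parametrizations.weight.original0` and `parametrizations.weight.original1`. The helper names `weight_norm_format` and `remap_weight_norm_keys` are hypothetical, the `pos_conv_embed.conv.` substring is an assumption about where the positional convolution sits in the key path, and renaming keys is an alternative to the torch version change recommended above, not the procedure used for these models.

```python
# Key suffixes written by the two weight_norm implementations in PyTorch.
# Legacy (torch.nn.utils.weight_norm): magnitude "weight_g", direction "weight_v".
# Parametrized (torch.nn.utils.parametrizations.weight_norm):
#   magnitude "...original0", direction "...original1".
LEGACY_TO_PARAM = {
    "weight_g": "parametrizations.weight.original0",
    "weight_v": "parametrizations.weight.original1",
}
PARAM_TO_LEGACY = {new: old for old, new in LEGACY_TO_PARAM.items()}

# Assumed location of the positional convolution inside the key path.
POS_CONV_MARKER = "pos_conv_embed.conv."


def weight_norm_format(keys):
    """Return "legacy", "parametrized", or None for an iterable of state-dict keys."""
    pos_conv = [k for k in keys if POS_CONV_MARKER in k]
    if any(k.endswith(s) for k in pos_conv for s in LEGACY_TO_PARAM):
        return "legacy"
    if any(k.endswith(s) for k in pos_conv for s in PARAM_TO_LEGACY):
        return "parametrized"
    return None


def remap_weight_norm_keys(state_dict, target):
    """Return a copy of state_dict whose positional-conv keys use the `target` naming.

    Values are copied unchanged; only the key names are rewritten.
    """
    if target not in ("legacy", "parametrized"):
        raise ValueError("target must be 'legacy' or 'parametrized'")
    table = LEGACY_TO_PARAM if target == "parametrized" else PARAM_TO_LEGACY
    remapped = {}
    for key, value in state_dict.items():
        new_key = key
        if POS_CONV_MARKER in key:
            for old, new in table.items():
                if key.endswith(old):
                    new_key = key[: -len(old)] + new
                    break
        remapped[new_key] = value
    return remapped
```

Calling `weight_norm_format(state_dict.keys())` on a downloaded checkpoint indicates which torch generation produced it. If the result does not match the installed torch, either change the torch version as recommended above, or, as an untested alternative, pass the remapped dictionary to `load_state_dict` and verify that no keys are reported missing.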

Models trained with torch 2.0.1 and earlier versions:

- model_type-base*
- model_type-tiny*

Models trained with torch 2.1.1 and later versions:

- model_type-large_X
- model_type-mini_X