Diff Interpretation Tuning
Commit 4860468 (verified) by ttw
1 parent: e47f102

Update README.md
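
This commit adds a `base_model` list to the YAML front matter of the model card and fixes a typo in the link sentence. As a rough way to check the metadata change, the sketch below reads the front matter back into a dictionary. The helper `read_front_matter` and the `README` sample string are hypothetical names introduced here for illustration; the parser handles only the two shapes that appear in this diff (`key: value` scalars and `key:` followed by `- item` lines) and is not a general YAML parser. For anything beyond this shape, a real YAML library such as PyYAML (`yaml.safe_load`) would be the safer choice.

```python
def read_front_matter(text):
    """Parse the leading `---` block of a model card into a dict.

    Hypothetical helper: supports only `key: value` scalars and
    `key:` followed by `- item` list lines, as seen in this commit.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the file

    meta = {}
    current_key = None  # key whose list items are being collected
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the block
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            meta[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value
                current_key = None
            else:
                meta[key] = []  # a list is expected on the following lines
                current_key = key
    return meta


# Sample text mirroring the new side of the diff (front matter only).
README = """---
license: mit
base_model:
- Qwen/Qwen3-1.7B
- Qwen/Qwen3-4B
- Qwen/Qwen3-8B
- google/gemma-3-1b-it
- google/gemma-3-4b-it
---

This repository contains the weight diffs and DIT adapters.
"""

meta = read_front_matter(README)
print(meta["license"])
for name in meta["base_model"]:
    print(name)
```

Running the same helper on the old side of the diff would return only the `license` key, which is one simple way to confirm that the commit changed metadata and left the license untouched.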

Files changed (1)
  1. README.md +7 -1
README.md CHANGED
@@ -1,12 +1,18 @@
  ---
  license: mit
+ base_model:
+ - Qwen/Qwen3-1.7B
+ - Qwen/Qwen3-4B
+ - Qwen/Qwen3-8B
+ - google/gemma-3-1b-it
+ - google/gemma-3-4b-it
  ---
 
  This repository contains the weight diffs and DIT adapters used in the paper [Learning to Interpret Weight Differences in Language Models (Goel et al. 2025)](https://arxiv.org/abs/2510.05092).
  This paper introduces *Diff Interpretation Tuning*, a method that trains a LoRA adapter than can be applied to a model to get it to describe its own finetuning induced modifications.
 
  To play around with the weight diffs and DIT adapters from the paper, please check out our [Google Colab demo notebook](https://colab.research.google.com/drive/12YD_9GRT-y_hFOBqXzyI4eN_lJGKiXwN?usp=sharing).
- The code used to train and evaluate the weight diffs and DIT adapters can be found on at [github.com/Aviously/diff-interpretation-tuning](https://github.com/Aviously/diff-interpretation-tuning).
+ The code used to train and evaluate the weight diffs and DIT adapters can be found at [github.com/Aviously/diff-interpretation-tuning](https://github.com/Aviously/diff-interpretation-tuning).
 
  You can cite our work using the following bibtex
  ```