Improve model card: Add pipeline tag, paper, GitHub link, and description

#1
by nielsr - opened
Files changed (1)
  1. README.md +17 -2
README.md CHANGED
@@ -1,12 +1,27 @@
  ---
- license: mit
  datasets:
  - liuganghuggingface/demodiff_downstream
+ license: mit
  tags:
  - chemistry
  - biology
+ pipeline_tag: graph-ml
  ---

+ # DemoDiff: Graph Diffusion Transformers are In-Context Molecular Designers
+
+ This repository contains the DemoDiff model, a diffusion-based molecular foundation model for **in-context inverse molecular design**, as presented in the paper [Graph Diffusion Transformers are In-Context Molecular Designers](https://huggingface.co/papers/2510.08744).
+
+ DemoDiff leverages graph diffusion transformers to generate molecules based on contextual examples, enabling few-shot molecular design across diverse chemical tasks without task-specific fine-tuning. It introduces demonstration-conditioned diffusion models, which define task contexts using a small set of molecule-score examples instead of text descriptions to guide a denoising Transformer for molecule generation. A novel molecular tokenizer with Node Pair Encoding is developed for scalable pretraining, representing molecules at the motif level.
+
+ Code: https://github.com/liugangcode/DemoDiff
+
+ ## 🌟 Key Features
+
+ - **In-Context Learning**: Generate molecules using only contextual examples (no fine-tuning required)
+ - **Graph-Based Tokenization**: Novel molecular graph tokenization with BPE-style vocabulary
+ - **Comprehensive Benchmarks**: 30+ downstream tasks covering drug discovery, docking, and polymer design
+
  ### Model Configuration

  | Parameter | Value | Description |
@@ -20,4 +35,4 @@ tags:
  | **task_name** | `pretrain` | Task type for model training. |
  | **tokenizer_name** | `pretrain` | Tokenizer used for model input. |
  | **vocab_ring_len** | 300 | Length of the circular vocabulary window. |
- | **vocab_size** | 3000 | Total vocabulary size. |
+ | **vocab_size** | 3000 | Total vocabulary size. |
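
For readers of the updated card, here is a minimal sketch of the workflow the description implies: pull the checkpoint from the Hub and frame a task as a handful of molecule-score demonstrations instead of a text prompt. Only `snapshot_download` below is real `huggingface_hub` API; the repo id, the `DemoDiff` class, and its `generate` signature are hypothetical placeholders, so consult the linked GitHub repository for the actual entry points.

```python
# Sketch only. snapshot_download is real huggingface_hub API; the repo id
# and the commented-out calls are HYPOTHETICAL stand-ins for the repo's
# actual interface.
from huggingface_hub import snapshot_download

# Repo id is an assumption based on the dataset namespace in the card.
ckpt_dir = snapshot_download(repo_id="liuganghuggingface/DemoDiff")

# Per the card, a task context is a small set of (molecule, score)
# examples rather than a text description.
demonstrations = [
    ("CCO", 0.12),                    # low-scoring molecule (SMILES)
    ("c1ccccc1O", 0.87),              # high-scoring molecule
    ("CC(=O)Oc1ccccc1C(=O)O", 0.91),  # high-scoring molecule
]

# Hypothetical API, for illustration only:
# model = DemoDiff.from_pretrained(ckpt_dir)
# candidates = model.generate(context=demonstrations, num_samples=32)
```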
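The tokenization bullet in the added card text mentions a BPE-style vocabulary built via Node Pair Encoding. The toy sketch below shows only the familiar sequence form of that idea, repeatedly fusing the most frequent adjacent pair into a motif token; the paper's Node Pair Encoding generalizes this merging to molecular graphs, which this sketch does not attempt.

```python
# Toy BPE-style vocabulary construction over a sequence of atom symbols.
# NOT the paper's algorithm: Node Pair Encoding applies the same
# merge-the-most-frequent-pair idea to node pairs in molecular graphs.
from collections import Counter

def bpe_merges(seq, num_merges):
    """Repeatedly fuse the most frequent adjacent token pair into a motif."""
    vocab = set(seq)
    for _ in range(num_merges):
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        (a, b), _count = pairs.most_common(1)[0]
        motif = a + b
        vocab.add(motif)
        out, i = [], 0
        while i < len(seq):
            # Greedy left-to-right replacement of the chosen pair.
            if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
                out.append(motif)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, vocab

# Atom symbols of a toy chain; frequent pairs become motif tokens.
tokens, vocab = bpe_merges(list("CCOCCOCCN"), num_merges=2)
print(tokens)         # ['CCO', 'CCO', 'CC', 'N']
print(sorted(vocab))  # ['C', 'CC', 'CCO', 'N', 'O']
```

The motif vocabulary grows by one token per merge, which is how a fixed budget such as the card's `vocab_size` of 3000 would bound the number of merges.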