Bidhan Roy committed on
Commit
e3029ac
·
1 Parent(s): 85d1eb3

readme styling

README.md CHANGED
@@ -11,21 +11,23 @@ tags:
11
  - flow-matching
12
  ---
13
 
14
- <img src="bagel_labs_logo.png" alt="Bagel Labs" width="120"/>
15
 
16
- # Paris: A Decentralized Trained Open-Weight Diffusion Model
17
 
18
- <a href="https://huggingface.co/bageldotcom/paris">
19
- <img src="https://img.shields.io/badge/%F0%9F%A4%97%20Like%20this-model-yellow?style=for-the-badge" alt="Like on Hugging Face">
20
  </a>
21
- <a href="https://github.com/bageldotcom/paris">
22
- <img src="https://img.shields.io/github/stars/bageldotcom/paris?style=for-the-badge&logo=github&label=Star%20on%20GitHub" alt="Star on GitHub">
23
  </a>
24
- <a href="https://github.com/bageldotcom/Paris/blob/main/paper.pdf">
25
- <img src="https://img.shields.io/badge/📄%20Read-Technical%20Report-red?style=for-the-badge" alt="Read Technical Report">
26
  </a>
27
 
28
- The world's first diffusion model trained entirely through decentralized computation. The model consists of 8 expert diffusion models (129M-605M parameters each) trained in complete isolation with no gradient, parameter, or intermediate activation synchronization, achieving superior parallelism efficiency over traditional methods while using 14× less data and 16× less compute than baselines. [Read our technical report](https://github.com/bageldotcom/Paris/blob/main/paper.pdf) to learn more.
 
 
29
 
30
  # Key Characteristics
31
 
@@ -42,7 +44,7 @@ The world's first diffusion model trained entirely through decentralized computa
42
 
43
  # Examples
44
 
45
- ![Paris Generation Examples](generated_images.png)
46
 
47
  *Text-conditioned image generation samples using Paris across diverse prompts and visual styles*
48
 
@@ -77,7 +79,7 @@ Paris implements fully decentralized training where:
77
  - Router trained post-hoc on full dataset for expert selection during inference
78
  - Complete computational independence eliminates requirements for specialized interconnects (InfiniBand, NVLink)
79
 
80
- ![Training Architecture](training_architecture.png)
81
 
82
  *Paris training phase showing complete asynchronous isolation across heterogeneous compute clusters. Unlike traditional parallelization strategies (Data/Pipeline/Model Parallelism), Paris requires zero communication during training.*
83
 
@@ -101,6 +103,10 @@ This zero-communication approach enables training on fragmented compute resource
101
  - **`top-2`**: Weighted ensemble of top-2 experts. Often best quality, 2× inference cost.
102
  - **`full-ensemble`**: All 8 experts weighted by router. Highest compute (8× cost).
103
 
 
 
 
 
104
  ---
105
 
106
  # Performance Metrics
@@ -164,8 +170,4 @@ This zero-communication approach enables training on fragmented compute resource
164
 
165
  MIT License – Open for research and commercial use.
166
 
167
- <div align="center">
168
-
169
- Made with ❤️ by [Bagel Labs](https://bagel.com)
170
-
171
- </div>
 
11
  - flow-matching
12
  ---
13
 
14
+ <img src="images/bagel_labs_logo.png" alt="Bagel Labs" height="28" style="margin-bottom: 20px;"/>
15
 
16
+ <h1 style="font-size: 28px; margin-bottom: 20px;">Paris: A Decentralized Trained Open-Weight Diffusion Model</h1>
17
 
18
+ <a href="https://huggingface.co/bageldotcom/paris" target="_blank">
19
+ <img src="https://img.shields.io/badge/🤗_DOWNLOAD_MODEL_WEIGHTS-FFD21E?style=for-the-badge&logoColor=000000" alt="Download Model Weights" height="40">
20
  </a>
21
+ <a href="https://github.com/bageldotcom/paris" target="_blank">
22
+ <img src="https://img.shields.io/badge/⭐_STAR_ON_GITHUB-100000?style=for-the-badge&logo=github&logoColor=white" alt="Star on GitHub" height="40">
23
  </a>
24
+ <a href="https://github.com/bageldotcom/paris/blob/main/paper.pdf" target="_blank">
25
+ <img src="https://img.shields.io/badge/📄_READ_PAPER-FF6B6B?style=for-the-badge&logoColor=white" alt="Read Technical Report" height="40">
26
  </a>
27
 
28
+ <div style="margin-top: 20px;"></div>
29
+
30
+ The world's first open-weight diffusion model trained entirely through decentralized computation. The model consists of 8 expert diffusion models (129M-605M parameters each) trained in complete isolation with no gradient, parameter, or intermediate activation synchronization, achieving superior parallelism efficiency over traditional methods while using 14× less data and 16× less compute than baselines. [Read our technical report](https://github.com/bageldotcom/paris/blob/main/paper.pdf) to learn more.
31
 
32
  # Key Characteristics
33
 
 
44
 
45
  # Examples
46
 
47
+ ![Paris Generation Examples](images/generated_images.png)
48
 
49
  *Text-conditioned image generation samples using Paris across diverse prompts and visual styles*
50
 
 
79
  - Router trained post-hoc on full dataset for expert selection during inference
80
  - Complete computational independence eliminates requirements for specialized interconnects (InfiniBand, NVLink)
81
 
82
+ ![Training Architecture](images/training_architecture.png)
83
 
84
  *Paris training phase showing complete asynchronous isolation across heterogeneous compute clusters. Unlike traditional parallelization strategies (Data/Pipeline/Model Parallelism), Paris requires zero communication during training.*
85
 
 
103
  - **`top-2`**: Weighted ensemble of top-2 experts. Often best quality, 2× inference cost.
104
  - **`full-ensemble`**: All 8 experts weighted by router. Highest compute (8× cost).
105
 
106
+ ![Paris Inference Pipeline](images/paris_inference.png)
107
+
108
+ *Multi-expert inference pipeline showing router-based expert selection and three different routing strategies: Top-1 (fastest), Top-2 (best quality), and Full Ensemble (highest compute).*
109
+
110
  ---
111
 
112
  # Performance Metrics
 
170
 
171
  MIT License – Open for research and commercial use.
172
 
173
+ Made with ❤️ by <a href="https://twitter.com/bageldotcom" target="_blank"><img src="https://img.shields.io/badge/Bagel_Labs-1DA1F2?style=for-the-badge&logo=twitter&logoColor=white" alt="Follow Bagel Labs on Twitter" height="28"></a>
 
 
 
 
bagel_labs_logo.png → images/bagel_labs_logo.png RENAMED
File without changes
generated_images.png → images/generated_images.png RENAMED
File without changes
images/paris_inference.png ADDED

Git LFS Details

  • SHA256: b1306739ef8a36f1cc722ba688dfda297a351e56b7dd1f3d1cd883aaf42b955b
  • Pointer size: 131 Bytes
  • Size of remote file: 305 kB
training_architecture.png → images/training_architecture.png RENAMED
File without changes
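
Note on the routing strategies touched by this diff: the README lines for `top-2` and `full-ensemble`, and the new `paris_inference.png` caption, describe combining expert predictions using router weights. The following is a minimal sketch of that idea, not the Paris implementation. The function names `softmax` and `route`, the use of softmax over router logits, and the renormalization of weights over the selected experts are all assumptions made for illustration; the README only states that experts are "weighted by router".

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, expert_outputs, strategy="top-2"):
    """Combine per-expert predictions under one of three routing strategies.

    router_logits:  one score per expert (hypothetical router output).
    expert_outputs: one prediction vector per expert, all the same length.
    Returns (indices of the experts used, combined prediction vector).
    """
    probs = softmax(router_logits)
    n_experts = len(probs)
    # Number of experts evaluated per strategy (README: 1, 2, or all 8).
    k = {"top-1": 1, "top-2": 2, "full-ensemble": n_experts}[strategy]

    # Pick the k experts the router scores highest.
    chosen = sorted(range(n_experts), key=lambda i: probs[i], reverse=True)[:k]

    # Assumption: weights are renormalized over the selected experts.
    norm = sum(probs[i] for i in chosen)
    dim = len(expert_outputs[0])
    combined = [0.0] * dim
    for i in chosen:
        weight = probs[i] / norm
        for j in range(dim):
            combined[j] += weight * expert_outputs[i][j]
    return chosen, combined


if __name__ == "__main__":
    # Toy example with 3 experts and 1-dimensional predictions.
    logits = [0.0, 2.0, 1.0]
    outputs = [[0.0], [1.0], [2.0]]
    for name in ("top-1", "top-2", "full-ensemble"):
        print(name, route(logits, outputs, name))
```

In this sketch the inference cost scales with the number of selected experts, which matches the 2× and 8× cost figures quoted in the README for `top-2` and `full-ensemble`.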