Bidhan Roy
committed on
Commit · e3029ac
1 Parent(s): 85d1eb3
readme styling
README.md
CHANGED
@@ -11,21 +11,23 @@ tags:
 - flow-matching
 ---

-<img src="bagel_labs_logo.png" alt="Bagel Labs"

-

-<a href="https://huggingface.co/bageldotcom/paris">
-<img src="https://img.shields.io/badge
 </a>
-<a href="https://github.com/bageldotcom/paris">
-<img src="https://img.shields.io/
 </a>
-<a href="https://github.com/bageldotcom/
-<img src="https://img.shields.io/badge
 </a>

-

 # Key Characteristics

@@ -42,7 +44,7 @@ The world's first diffusion model trained entirely through decentralized computa

 # Examples

-

 *Text-conditioned image generation samples using Paris across diverse prompts and visual styles*

@@ -77,7 +79,7 @@ Paris implements fully decentralized training where:
 - Router trained post-hoc on full dataset for expert selection during inference
 - Complete computational independence eliminates requirements for specialized interconnects (InfiniBand, NVLink)

-

 *Paris training phase showing complete asynchronous isolation across heterogeneous compute clusters. Unlike traditional parallelization strategies (Data/Pipeline/Model Parallelism), Paris requires zero communication during training.*

@@ -101,6 +103,10 @@ This zero-communication approach enables training on fragmented compute resource
 - **`top-2`**: Weighted ensemble of top-2 experts. Often best quality, 2× inference cost.
 - **`full-ensemble`**: All 8 experts weighted by router. Highest compute (8× cost).

 ---

 # Performance Metrics

@@ -164,8 +170,4 @@

 MIT License – Open for research and commercial use.

-<
-
-Made with ❤️ by [Bagel Labs](https://bagel.com)
-
-</div>
@@ -11,21 +11,23 @@
 - flow-matching
 ---

+<img src="images/bagel_labs_logo.png" alt="Bagel Labs" height="28" style="margin-bottom: 20px;"/>

+<h1 style="font-size: 28px; margin-bottom: 20px;">Paris: A Decentralized Trained Open-Weight Diffusion Model</h1>

+<a href="https://huggingface.co/bageldotcom/paris" target="_blank">
+<img src="https://img.shields.io/badge/🤗_DOWNLOAD_MODEL_WEIGHTS-FFD21E?style=for-the-badge&logoColor=000000" alt="Download Model Weights" height="40">
 </a>
+<a href="https://github.com/bageldotcom/paris" target="_blank">
+<img src="https://img.shields.io/badge/⭐_STAR_ON_GITHUB-100000?style=for-the-badge&logo=github&logoColor=white" alt="Star on GitHub" height="40">
 </a>
+<a href="https://github.com/bageldotcom/paris/blob/main/paper.pdf" target="_blank">
+<img src="https://img.shields.io/badge/📄_READ_PAPER-FF6B6B?style=for-the-badge&logoColor=white" alt="Read Technical Report" height="40">
 </a>

+<div style="margin-top: 20px;"></div>
+
+The world's first open-weight diffusion model trained entirely through decentralized computation. The model consists of 8 expert diffusion models (129M-605M parameters each) trained in complete isolation with no gradient, parameter, or intermediate activation synchronization, achieving superior parallelism efficiency over traditional methods while using 14× less data and 16× less compute than baselines. [Read our technical report](https://github.com/bageldotcom/paris/blob/main/paper.pdf) to learn more.

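As a quick sense of scale for the figures quoted above, the snippet below is a hypothetical back-of-the-envelope calculation (plain Python, not part of the Paris release); it assumes all eight experts sit at the same end of the stated 129M-605M range, which is a simplification.

```python
# Back-of-the-envelope scale check using only the numbers quoted above:
# 8 expert denoisers, each somewhere between 129M and 605M parameters.
# Assuming all experts share one size is an illustrative simplification,
# not the released configuration.
NUM_EXPERTS = 8

for label, params_per_expert in [("lower bound", 129e6), ("upper bound", 605e6)]:
    ensemble_total = NUM_EXPERTS * params_per_expert
    active_fraction = params_per_expert / ensemble_total  # top-1 routing runs one expert per step
    print(
        f"{label}: {params_per_expert / 1e6:.0f}M per expert, "
        f"{ensemble_total / 1e9:.2f}B across the ensemble, "
        f"top-1 activates {active_fraction:.1%} of total parameters"
    )
```

The per-step cost multipliers quoted later (2× for top-2, 8× for the full ensemble) follow directly from how many of these experts a routing mode activates.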
 # Key Characteristics

@@ -42,7 +44,7 @@

 # Examples

+

 *Text-conditioned image generation samples using Paris across diverse prompts and visual styles*

@@ -77,7 +79,7 @@
 - Router trained post-hoc on full dataset for expert selection during inference
 - Complete computational independence eliminates requirements for specialized interconnects (InfiniBand, NVLink)

+

 *Paris training phase showing complete asynchronous isolation across heterogeneous compute clusters. Unlike traditional parallelization strategies (Data/Pipeline/Model Parallelism), Paris requires zero communication during training.*

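The bullets and the diagram above describe where each component trains; the sketch below is a hypothetical toy version of that zero-communication setup (small MLP experts, random tensors, a flow-matching-style interpolation loss), not the actual Paris training code. The point it illustrates is structural: each expert loops over only its own shard with its own optimizer, and no collective operation ever runs.

```python
import torch
from torch import nn, optim

# Hypothetical zero-communication training sketch: 8 experts, each with a
# private data shard and a private optimizer. Nothing below performs an
# all-reduce, parameter broadcast, or activation exchange, which is the
# property the bullets above describe. Toy sizes throughout.
NUM_EXPERTS, DIM = 8, 32

def make_expert() -> nn.Module:
    # Stand-in for a 129M-605M parameter diffusion expert.
    return nn.Sequential(nn.Linear(DIM, 128), nn.SiLU(), nn.Linear(128, DIM))

# One private shard per expert (random tensors standing in for image latents).
shards = [torch.randn(64, DIM) for _ in range(NUM_EXPERTS)]
experts = [make_expert() for _ in range(NUM_EXPERTS)]
optimizers = [optim.AdamW(e.parameters(), lr=1e-3) for e in experts]

# Each of these loops could run on a different cluster at a different time;
# the body never references any other expert's parameters or gradients.
for expert, opt, shard in zip(experts, optimizers, shards):
    for _ in range(10):
        x0 = shard                                   # clean samples from this shard only
        noise = torch.randn_like(x0)
        t = torch.rand(x0.size(0), 1)
        xt = (1 - t) * x0 + t * noise                # linear interpolation path
        target = noise - x0                          # flow-matching-style velocity target
        loss = ((expert(xt) - target) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()                                   # purely local update
```

The router from the first bullet is trained afterwards on the full dataset, so it can score experts at inference time without ever having influenced the isolated updates above.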
@@ -101,6 +103,10 @@
 - **`top-2`**: Weighted ensemble of top-2 experts. Often best quality, 2× inference cost.
 - **`full-ensemble`**: All 8 experts weighted by router. Highest compute (8× cost).

+
+
+*Multi-expert inference pipeline showing router-based expert selection and three different routing strategies: Top-1 (fastest), Top-2 (best quality), and Full Ensemble (highest compute).*
+
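For a concrete reading of the three modes named in the caption, here is a minimal, hypothetical sketch of a single routed denoising step. The toy linear modules, the `denoise_step` name, and the tensor shapes are assumptions for illustration, not the released inference API.

```python
import torch
from torch import nn

# Hypothetical routing sketch for one denoising step. `experts` stands in
# for the 8 expert denoisers and `router` for the post-hoc router; both are
# toy linear layers here, not the released Paris modules.
NUM_EXPERTS, DIM = 8, 32
experts = nn.ModuleList(nn.Linear(DIM, DIM) for _ in range(NUM_EXPERTS))
router = nn.Linear(DIM, NUM_EXPERTS)

@torch.no_grad()
def denoise_step(xt: torch.Tensor, mode: str = "top-2") -> torch.Tensor:
    weights = router(xt).softmax(dim=-1)              # (batch, 8) router scores
    k = {"top-1": 1, "top-2": 2, "full-ensemble": NUM_EXPERTS}[mode]
    topw, topi = weights.topk(k, dim=-1)              # keep the k best experts per sample
    topw = topw / topw.sum(dim=-1, keepdim=True)      # renormalize their weights

    # Run every expert once, then blend only the selected ones.
    # (A real top-1/top-2 deployment would skip the unselected experts,
    # which is where the 1x/2x versus 8x cost difference comes from.)
    all_out = torch.stack([e(xt) for e in experts], dim=1)   # (batch, 8, dim)
    idx = topi.unsqueeze(-1).expand(-1, -1, DIM)             # (batch, k, dim)
    selected = all_out.gather(1, idx)                        # outputs of chosen experts
    return (topw.unsqueeze(-1) * selected).sum(dim=1)        # router-weighted blend

x = torch.randn(4, DIM)
for mode in ("top-1", "top-2", "full-ensemble"):
    print(mode, denoise_step(x, mode).shape)
```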
 ---

 # Performance Metrics

@@ -164,8 +170,4 @@

 MIT License – Open for research and commercial use.

+Made with ❤️ by <a href="https://twitter.com/bageldotcom" target="_blank"><img src="https://img.shields.io/badge/Bagel_Labs-1DA1F2?style=for-the-badge&logo=twitter&logoColor=white" alt="Follow Bagel Labs on Twitter" height="28"></a>
bagel_labs_logo.png → images/bagel_labs_logo.png
RENAMED
File without changes

generated_images.png → images/generated_images.png
RENAMED
File without changes

images/paris_inference.png
ADDED
Git LFS Details

training_architecture.png → images/training_architecture.png
RENAMED
File without changes