Molbap (HF Staff) committed
Commit d7db8a7 · 1 Parent(s): 68e70cf
Files changed (1):
  dist/index.html +13 -9
dist/index.html CHANGED
@@ -8,22 +8,22 @@
  <script src="https://d3js.org/d3.v7.min.js"></script>
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <meta charset="utf8">
- <title>Transformers Feature Showcase</title>
+ <title>Scaling insanity: maintaining hundreds of model definitions</title>
  <link rel="stylesheet" href="style.css">
  <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.29.0/themes/prism.min.css">
  </head>
  <body>
  <d-front-matter>
  <script id='distill-front-matter' type="text/json">{
- "title": "Transformers Feature Showcase",
- "description": "An interactive demonstration of transformers library features and design philosophy.",
+ "title": "Scaling insanity: maintaining hundreds of model definitions",
+ "description": "A peek into software engineering for the transformers library",
  "published": "Aug 21, 2025",
  "authors": [{"author": "Pablo Montalvo", "authorURL": "https://huggingface.co/Molbap"}]
  }</script>
  </d-front-matter>
  <d-title>
- <h1>Transformers Feature Showcase</h1>
- <p>An interactive demonstration of transformers library features and design philosophy.</p>
+ <h1>Scaling insanity: maintaining hundreds of model definitions</h1>
+ <p>A peek into software engineering for the transformers library</p>
  </d-title>
  <d-byline></d-byline>
  <d-article>
@@ -48,9 +48,13 @@
  </nav>
  </d-contents>
  <h2>Introduction</h2>
- <p>The <code>transformers</code> library, built with <code>PyTorch</code>, supports all state-of-the-art LLMs, many VLMs, task-specific vision language models, video models, audio models, table models, classical encoders, to a global count of almost 400 models. The name of the library itself is mostly majority driven as many models are not even transformers architectures, like Mamba/RWKV. Regardless, each of these is wrought by the research and engineering team that created them, then harmonized into a now famous interface, and callable with a simple <code>.from_pretrained</code>. Inference and training are supported. The library supports ML courses, cookbooks, and several thousands other open-source libraries depend on it. All models are tested as part of a daily CI ensuring their preservation and reproducibility. Most importantly, it is open-source and has been written by the community for a large part.</p>
- <p>The ML wave has not stopped, there’s more and more models being added. <code>Transformers</code> is widely used, and we read the feedback that users post. Whether it’s about a function that had 300+ keyword arguments, duplicated code and helpers, and mentions of <code>Copied from ... </code> everywhere, along with optimisation concerns. Text-only models are relatively tamed, but multimodal models remain to be harmonized.</p>
- <p>Here we will dissect what is the design philosophy of transformers, as a continuation from the existing older <a href="https://huggingface.co/docs/transformers/en/philosophy">philosophy</a> page, and an accompanying <a href="https://huggingface.co/blog/transformers-design-philosophy">blog post from 2022</a> . Some time ago I dare not say how long, we discussed with transformers maintainers about the state of things. A lot of recent developments were satisfactory, but if we were only talking about these, self-congratulation would be the only goalpost. Reflecting on this philosophy now, as models pile up, is essential and will drive new developments.</p>
+ <p>The <code>transformers</code> library, built with <code>PyTorch</code>, supports all state-of-the-art LLMs, many VLMs, task-specific vision-language models, video models, audio models, table models, and classical encoders, for a global count of almost 400 models.</p>
+ <p>The name of the library itself is mostly majority-driven, as many models are not even transformer architectures: Mamba, Zamba, RWKV, and convolution-based models, among others.</p>
+ <p>Regardless, each of these was wrought by the research and engineering team that created it, then harmonized into a now-famous interface and made callable with a simple <code>.from_pretrained</code>.</p>
+ <p>Inference works for all models, and training is functional for most. The library is a foundation for many machine learning courses and cookbooks, and several thousand other open-source libraries depend on it. All models are tested as part of a daily CI, ensuring their preservation and reproducibility. Most importantly, it is <em>open-source</em> and has, in large part, been written by the community.</p>
+ <p>This isn’t really to brag, but to set the stakes: what does it take to keep such a ship afloat, made of so many moving, unrelated parts?</p>
+ <p>The ML wave has not stopped: more and more models are being added, at a steadily growing rate. <code>Transformers</code> is widely used, and we read the feedback that users post online, whether it’s about a function that had 300+ keyword arguments, duplicated code and helpers, mentions of <code>Copied from ...</code> everywhere, or optimisation concerns. Text-only models have been relatively tamed, but multimodal models remain to be harmonized.</p>
+ <p>Here we will dissect the new design philosophy of transformers, as a continuation of the existing, older <a href="https://huggingface.co/docs/transformers/en/philosophy">philosophy</a> page and its accompanying <a href="https://huggingface.co/blog/transformers-design-philosophy">blog post from 2022</a>. Some time ago, I dare not say how long, we discussed the state of things with the transformers maintainers. A lot of recent developments were satisfactory, but if we only talked about those, self-congratulation would be the only goalpost. Reflecting on this philosophy now, as models pile up, is essential and will drive new developments.</p>
  <h3>What you will learn</h3>
  <p>Every reader, whether an OSS maintainer, power user, or casual fine-tuner, will walk away knowing how to reason about the <code>transformers</code> code base, how to use it better, and how to meaningfully contribute to it.
  This will also showcase new features you might have missed, so you’ll be up to date.</p>
@@ -537,7 +541,7 @@ machinery is the <code>attention mask</code>, cause of confusion. Thankfully, we

  // Extract tenet text for tooltips
  const tenetTooltips = {
- 'source-of-truth': 'We should be a source of truth for all model definitions. Model implementations should be reliable, reproducible, and faithful to the original performances.',
+ 'source-of-truth': 'We aim to be a source of truth for all model definitions. Model implementations should be reliable, reproducible, and faithful to the original performances.',
  'one-model-one-file': 'All inference (and most of training, loss is separate, not a part of model) logic visible, top‑to‑bottom.',
  'code-is-product': 'Optimize for reading, diffing, and tweaking, our users are power users. Variables can be explicit, full words, even several words, readability is primordial.',
  'standardize-dont-abstract': 'If it\'s model behavior, keep it in the file; abstractions only for generic infra.',
 
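A note on the entry point the rewritten introduction singles out: .from_pretrained is the one call through which every model in the library is loaded. A minimal sketch of the usual pattern, using the real transformers Auto classes and "gpt2" purely as an example checkpoint id:

# Minimal sketch: load a tokenizer and model by Hub id, then generate.
# "gpt2" is only an example; any compatible checkpoint id works the same way.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The transformers library", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

The same pattern covers the nearly 400 architectures mentioned above: the Auto classes resolve the right model definition from the checkpoint's config.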