Doron Adler
PRO
Norod78
122 followers · 268 following
https://linktr.ee/Norod78
Norod78
Norod
AI & ML interests
Fooling around with Generative machine learning models.
Recent Activity
Posted an update about 14 hours ago
Multilingual Tokenization Showdown: Analyzing 12 LLM Tokenizers Across 204 Languages

First, I've created a dataset with Wikipedia's "Cat" article text in 272 languages: https://huggingface.co/datasets/Norod78/WikiCat-Multilingual

For each language entry with at least 100 words, I tokenized the text using 12 tokenizers and calculated the "Characters per token" and "Words per token" ratios. The higher these ratios are, the more information each token represents on average for that language (which may let an LLM learn more per parameter when trained on a dataset in that language).

You can see a slideshow summary of the results here: https://norod.github.io/wikicat-tokenizer-eval/tokenizer-slideshow.html

I hope I interpreted the results correctly. I've made the code available on GitHub, so you can re-create the raw results jsonl with this repo: https://github.com/Norod/wikicat-tokenizer-eval

Post on X: https://x.com/Norod78/status/1984366900550266999
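The two ratios described in the post can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the linked repo: `tokenization_ratios`, `toy_tokenize`, and `MIN_WORDS` are hypothetical names, words are taken to be whitespace-separated, and characters are counted including whitespace. The toy tokenizer is a stand-in so the example runs offline; in practice any real tokenizer's "text to list of tokens" function could be passed instead.

```python
from typing import Callable, Dict, List, Optional

# Assumption: mirrors the post's "at least 100 words" filter.
MIN_WORDS = 100


def tokenization_ratios(
    text: str,
    tokenize: Callable[[str], List[str]],
    min_words: int = 0,
) -> Optional[Dict[str, float]]:
    """Return characters-per-token and words-per-token for one text.

    Returns None when the text has fewer than `min_words` words.
    """
    words = text.split()  # assumption: words are whitespace-separated
    if len(words) < min_words:
        return None
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("tokenizer produced no tokens")
    return {
        # assumption: character count includes whitespace
        "chars_per_token": len(text) / len(tokens),
        "words_per_token": len(words) / len(tokens),
    }


def toy_tokenize(text: str, piece_len: int = 4) -> List[str]:
    """Hypothetical stand-in tokenizer: fixed-size pieces of each word."""
    pieces: List[str] = []
    for word in text.split():
        pieces.extend(
            word[i:i + piece_len] for i in range(0, len(word), piece_len)
        )
    return pieces


sample = "The cat is a small domesticated carnivorous mammal"
print(tokenization_ratios(sample, toy_tokenize))
# With the post's threshold, this short sample would be skipped:
print(tokenization_ratios(sample, toy_tokenize, min_words=MIN_WORDS))
```

A tokenizer that splits a language's text into fewer, longer pieces yields higher values for both ratios, which is the comparison the post makes across languages.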
Liked a model about 17 hours ago: MiniMaxAI/MiniMax-M2
Liked a Space 3 days ago: briaai/FIBO-demo
Organizations
Norod78's models (113)
Sort: Recently updated
Norod78/SDXL-JojosoStyle-Lora-v2 • Text-to-Image • Updated Nov 29, 2023 • 58 • • 7
Norod78/claymationx-sdxl-lora • Text-to-Image • Updated Nov 29, 2023 • 19 • • 6
Norod78/ClaymationX_LoRA • Text-to-Image • Updated Nov 27, 2023 • 2 • • 2
Norod78/SDXL-simpstyle-Lora-v2 • Text-to-Image • Updated Nov 26, 2023 • 19 • • 2
Norod78/weird-fashion-show-outfits-sdxl-lora • Text-to-Image • Updated Nov 26, 2023 • 18 • • 13
Norod78/sdxl-PaperCutouts-Dreambooth • Text-to-Image • Updated Nov 21, 2023 • 8 • • 36
Norod78/sdxl-muppetshow-lora • Text-to-Image • Updated Nov 14, 2023 • 3 • • 15
Norod78/SDXL-PringlesTube-Lora • Text-to-Image • Updated Nov 13, 2023 • 4 • • 6
Norod78/sdxl-hearthstone-card-style-lora • Text-to-Image • Updated Nov 10, 2023 • 8 • • 7
Norod78/sdxl-arthur-show-lora • Text-to-Image • Updated Nov 7, 2023 • • 1
Norod78/sxl-laisha-magazine-cover-lora • Text-to-Image • Updated Nov 3, 2023 • • 3
Norod78/yet-another-sdxl-tattoo-lora • Text-to-Image • Updated Nov 2, 2023 • 39 • • 5
Norod78/sdxl-futurama-style-lora • Text-to-Image • Updated Oct 6, 2023 • 6 • • 2
Norod78/SDXL-BenderBot-LoRA • Text-to-Image • Updated Oct 5, 2023 • 5 • • 1
Norod78/SD15-Pumpkinhead-LoRA • Text-to-Image • Updated Oct 5, 2023
Norod78/sdxl-pumpkin-head-lora • Text-to-Image • Updated Oct 5, 2023 • 1 • • 1
Norod78/SD15-IllusionDiffusionPattern-LoRA • Text-to-Image • Updated Sep 20, 2023 • 16 • 25
Norod78/SDXL-simpstyle-Lora • Text-to-Image • Updated Sep 19, 2023 • 9 • • 17
Norod78/SDXL-Caricaturized-Lora • Text-to-Image • Updated Sep 19, 2023 • 6 • • 10
Norod78/SDXL-VintageMagStyle-Lora • Text-to-Image • Updated Sep 19, 2023 • 20 • • 20
Norod78/SDXL-LofiGirl-Lora • Text-to-Image • Updated Sep 19, 2023 • 9 • • 7
Norod78/english-sienfeld-distilgpt2 • Text Generation • 88.2M • Updated Sep 14, 2023 • 5
Norod78/SDXL-StickerSheet-Lora • Text-to-Image • Updated Aug 31, 2023 • 11 • • 35
Norod78/SDXL-jojoso_style-Lora • Text-to-Image • Updated Aug 30, 2023 • 12 • • 4
Norod78/sdxl-BrainSlug-dreambooth • Text-to-Image • Updated Aug 9, 2023 • 3 • • 2
Norod78/sd15-bender-lora • Text-to-Image • Updated Aug 9, 2023 • 6 • 1
Norod78/llama-hebrew-tokenizer-merged • Updated Jul 27, 2023
Norod78/llama-hebrew-tokenizer-20k • Updated Jul 26, 2023 • 1
Norod78/swin-muppet-faces • Image Classification • 86.8M • Updated Jul 19, 2023 • 2
Norod78/TinyStories-3M-val-Hebrew • Text Generation • 41.8M • Updated Jul 6, 2023 • 2