Hugging Face
Doron Adler
PRO
Norod78
123 followers · 268 following
https://linktr.ee/Norod78
Norod78
Norod
AI & ML interests
Fooling around with Generative machine learning models.
Recent Activity
posted an update about 17 hours ago
Multilingual Tokenization Showdown: Analyzing 12 LLM Tokenizers Across 204 Languages

First, I created a dataset with the text of Wikipedia's "Cat" article in 272 languages: https://huggingface.co/datasets/Norod78/WikiCat-Multilingual

For each language entry with at least 100 words, I tokenized the text using 12 tokenizers and calculated the "characters per token" and "words per token" ratios. The higher these ratios are, the more information each token represents on average for that language (perhaps allowing an LLM to learn more per parameter if trained on a dataset in that language).

You can see a slideshow summary of the results here: https://norod.github.io/wikicat-tokenizer-eval/tokenizer-slideshow.html

I hope I interpreted the results correctly. I've made the code available on GitHub, so you can re-create the raw results JSONL with this repo: https://github.com/Norod/wikicat-tokenizer-eval

Post on X: https://x.com/Norod78/status/1984366900550266999
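To make the two ratios from the post concrete, here is a minimal sketch of how they might be computed for a single text. It is not the code from the linked repo. The names `token_ratios` and `toy_tokenize` are hypothetical, words are assumed to be whitespace-separated (which will not suit languages written without spaces), and the toy tokenizer is only a stand-in for a real one.

```python
from typing import Callable, Dict, List, Optional


def token_ratios(
    text: str,
    tokenize: Callable[[str], List[str]],
    min_words: int = 100,
) -> Optional[Dict[str, float]]:
    """Return characters-per-token and words-per-token for one text.

    Returns None when the text has fewer than `min_words` whitespace-separated
    words (assumed word definition) or when tokenization yields no tokens.
    """
    words = text.split()
    if len(words) < min_words:
        return None
    tokens = tokenize(text)
    if not tokens:
        return None
    return {
        "chars_per_token": len(text) / len(tokens),
        "words_per_token": len(words) / len(tokens),
    }


def toy_tokenize(text: str) -> List[str]:
    """Hypothetical stand-in tokenizer: cut each word into chunks of up to 4 characters."""
    return [w[i:i + 4] for w in text.split() for i in range(0, len(w), 4)]


if __name__ == "__main__":
    sample = "the cat sat on the mat " * 20  # 120 words, above the 100-word cutoff
    print(token_ratios(sample, toy_tokenize))
```

With the `transformers` library installed, a real tokenizer's `tokenize` method (for example from `AutoTokenizer.from_pretrained(...)`) could be passed in place of `toy_tokenize`, since it also maps a string to a list of token strings.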
liked a model about 19 hours ago: MiniMaxAI/MiniMax-M2
liked a Space 3 days ago: briaai/FIBO-demo
Organizations
Norod78's models (113)
Norod78/Kippi_Ben_Kippod_FLUX • Text-to-Image • Updated Aug 18, 2024 • 3 • 1
Norod78/Flux_1_Dev_LoRA_Paper-Cutout-Style • Text-to-Image • Updated Aug 16, 2024 • 453 • 40
Norod78/pokirl-sdxl • Text-to-Image • Updated Aug 12, 2024 • 2
Norod78/SmolLM-tokenizer-with-added-hebrew-14k • Updated Jul 18, 2024
Norod78/gpt2-tokenizer-with-added-hebrew-14k • Updated Jul 14, 2024
Norod78/Hebrew-GPT2-345M-Stage • Text Generation • 0.4B • Updated Jul 11, 2024 • 271 • 3
Norod78/SD15-Rubber-Duck-LoRA • Text-to-Image • Updated Jul 4, 2024
Norod78/CoreML-MobileCLIP-S0 • Updated Jun 20, 2024 • 4 • 4
Norod78/kippi-ben-kippod-sdxl • Text-to-Image • Updated Apr 15, 2024 • 8 • 1
Norod78/sdxl-emoji-lora • Text-to-Image • Updated Apr 11, 2024 • 24 • 9
Norod78/cctv-stlye-sdxl • Text-to-Image • Updated Apr 11, 2024 • 33 • 2
Norod78/world-of-warcraft-cinematic-style-sdxl • Text-to-Image • Updated Apr 11, 2024 • 39 • 3
Norod78/SDXL-Psychemelt-style-LoRA • Text-to-Image • Updated Feb 18, 2024 • 67 • 11
Norod78/SDXL-Fairy-Form-LoRA • Text-to-Image • Updated Feb 15, 2024 • 6
Norod78/sd15-megaphone-lora • Text-to-Image • Updated Feb 13, 2024 • 11 • 3
Norod78/fruits-and-vegetables-gone-bad-sdxl-lora • Text-to-Image • Updated Feb 6, 2024 • 15 • 3
Norod78/SDXL-Below-Huddled-LoRA • Text-to-Image • Updated Feb 3, 2024 • 2
Norod78/SD15-BambaBaby-LoRA • Text-to-Image • Updated Jan 25, 2024 • 8 • 2
Norod78/SDXL-DollZ-Style-LoRA • Text-to-Image • Updated Jan 22, 2024 • 1 • 1
Norod78/SDXL-LaundryArt-LoRA-r16 • Text-to-Image • Updated Jan 17, 2024 • 2
Norod78/SDXL-LaundryArt-LoRA-r32 • Text-to-Image • Updated Jan 17, 2024 • 8 • 6
Norod78/SDXL-HuWoof-LoRA • Text-to-Image • Updated Jan 14, 2024 • 5 • 2
Norod78/sdxl-vintage-face-style-lora • Text-to-Image • Updated Jan 11, 2024 • 10 • 1
Norod78/sdxl-chalkboarddrawing-lora • Text-to-Image • Updated Jan 9, 2024 • 75 • 6
Norod78/sdxl-humeow-lora • Text-to-Image • Updated Jan 3, 2024 • 5 • 14
Norod78/sdxl-humeow-lora-r16 • Text-to-Image • Updated Jan 3, 2024 • 2 • 1
Norod78/SDXL-YarnArtStyle-LoRA • Text-to-Image • Updated Jan 2, 2024 • 19 • 42
Norod78/SDXL-ShoshiZohar-Lora • Text-to-Image • Updated Dec 30, 2023 • 1 • 1
Norod78/dream-dis-pix-xl • Text-to-Image • Updated Dec 25, 2023 • 2 • 4
Norod78/SDXL-xmasize-Lora • Text-to-Image • Updated Dec 7, 2023 • 4 • 4