NIck Saga
nick007x
AI & ML interests
"The individual who delivers production-ready data better than corporations"
Recent Activity
Updated a dataset about 11 hours ago: nick007x/eevblog-posts
Published a dataset about 17 hours ago: nick007x/eevblog-posts
Posted an update 12 days ago:
Hey, I have just uploaded 2 new datasets for code and scientific reasoning models:
1. ArXiv Papers (4.6 TB): a massive scientific corpus of papers and metadata across all domains. Well suited to training models on academic reasoning, literature review, and scientific knowledge mining. Link: https://huggingface.co/datasets/nick007x/arxiv-papers
2. GitHub Code 2025 (1 TB): a comprehensive code dataset for code generation and analysis tasks. It mostly contains GitHub's top 1 million high-quality repos with more than 2 stars. Link: https://huggingface.co/datasets/nick007x/github-code-2025
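
Both datasets are far too large to download casually, so here is a minimal sketch of previewing a few records in streaming mode. It assumes the Hugging Face `datasets` library is installed and that the repository exposes a split named `train`; neither is confirmed by the post. The helper names `take` and `preview` are hypothetical.

```python
from itertools import islice
from typing import Any, Iterable, List


def take(records: Iterable[Any], n: int) -> List[Any]:
    """Return at most the first n items of any iterable."""
    return list(islice(records, n))


def preview(repo_id: str, n: int = 3, split: str = "train") -> List[Any]:
    """Fetch the first n records of a Hub dataset without a full download.

    Assumption: `split` names a split that exists in the repository.
    """
    # Imported lazily so `take` works even without the library installed.
    from datasets import load_dataset

    # streaming=True returns an iterable dataset that reads data on demand
    # instead of downloading the whole repository first.
    stream = load_dataset(repo_id, split=split, streaming=True)
    return take(stream, n)
```

Calling `preview("nick007x/github-code-2025")` should then return a short list of records, provided the assumed split exists. Inspecting the keys of the first record is a quick way to learn the actual schema before committing to a longer run.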