| ### Code to consider including: | |
| [flan-alpaca](https://github.com/declare-lab/flan-alpaca)<br /> | |
| [text-generation-webui](https://github.com/oobabooga/text-generation-webui)<br /> | |
| [minimal-llama](https://github.com/zphang/minimal-llama/)<br /> | |
| [finetune GPT-NeoX](https://nn.labml.ai/neox/samples/finetune.html)<br /> | |
| [GPTQ-for_LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa/compare/cuda...Digitous:GPTQ-for-GPT-NeoX:main)<br /> | |
| [OpenChatKit on multi-GPU](https://github.com/togethercomputer/OpenChatKit/issues/20)<br /> | |
| [Non-Causal LLM](https://huggingface.co/docs/transformers/main/en/model_doc/gptj#transformers.GPTJForSequenceClassification)<br /> | |
| [OpenChatKit_Offload](https://github.com/togethercomputer/OpenChatKit/commit/148b5745a57a6059231178c41859ecb09164c157)<br /> | |
| [Flan-alpaca](https://github.com/declare-lab/flan-alpaca/blob/main/training.py)<br /> | |
| ### Some open source models: | |
| [GPT-NeoXT-Chat-Base-20B](https://huggingface.co/togethercomputer/GPT-NeoXT-Chat-Base-20B/tree/main)<br /> | |
| [GPT-NeoX](https://huggingface.co/docs/transformers/model_doc/gpt_neox)<br /> | |
| [GPT-NeoX-20B](https://huggingface.co/EleutherAI/gpt-neox-20b)<br /> | |
| [Pythia-6.9B](https://huggingface.co/EleutherAI/pythia-6.9b)<br /> | |
| [Pythia-12B](https://huggingface.co/EleutherAI/neox-ckpt-pythia-12b)<br /> | |
| [Flan-T5-XXL](https://huggingface.co/google/flan-t5-xxl)<br /> | |
| [GPT-JT-Moderation-6B](https://huggingface.co/togethercomputer/GPT-JT-Moderation-6B)<br /> | |
| [OIG safety models](https://laion.ai/blog/oig-dataset/#safety-models)<br /> | |
| [BigScience-mT0](https://huggingface.co/mT0)<br /> | |
| [BigScience-XP3](https://huggingface.co/datasets/bigscience/xP3)<br /> | |
| [BigScience-Bloomz](https://huggingface.co/bigscience/bloomz)<br /> | |
| ### Some Creative Commons models that would be interesting to use: | |
| [Galactica-120B](https://huggingface.co/facebook/galactica-120b)<br /> | |
| [LLaMa-small-pt](https://huggingface.co/decapoda-research/llama-smallint-pt)<br /> | |
| [LLaMa-65b-4bit](https://huggingface.co/maderix/llama-65b-4bit/tree/main)<br /> | |
| ### Papers/Repos | |
| [Self-improve](https://arxiv.org/abs/2210.11610)<br /> | |
| [Coding](https://arxiv.org/abs/2303.17491)<br /> | |
| [self-reflection](https://arxiv.org/abs/2303.11366)<br /> | |
| [RLHF](https://arxiv.org/abs/2204.05862)<br /> | |
| [DERA](https://arxiv.org/abs/2303.17071)<br /> | |
| [HAI Index Report 2023](https://aiindex.stanford.edu/report/)<br /> | |
| [LLaMa](https://arxiv.org/abs/2302.13971)<br /> | |
| [GLM-130B](https://github.com/THUDM/GLM-130B)<br /> | |
| [RWKV RNN](https://github.com/BlinkDL/RWKV-LM)<br /> | |
| [Toolformer](https://arxiv.org/abs/2302.04761)<br /> | |
| [GPTQ](https://github.com/qwopqwop200/GPTQ-for-LLaMa)<br /> | |
| [Retro](https://www.deepmind.com/publications/improving-language-models-by-retrieving-from-trillions-of-tokens)<br /> | |
| [Clinical_outperforms](https://arxiv.org/abs/2302.08091)<br /> | |
| [Chain-Of-Thought](https://github.com/amazon-science/mm-cot)<br /> | |
| [scaling law1](https://arxiv.org/abs/2203.15556)<br /> | |
| [Big-bench](https://github.com/google/BIG-bench)<br /> | |
| [Natural-Instructions](https://github.com/allenai/natural-instructions)<br /> | |
| ### Other projects: | |
| [StackLLaMa](https://huggingface.co/blog/stackllama)<br /> | |
| [Alpaca-CoT](https://github.com/PhoebusSi/alpaca-CoT)<br /> | |
| [ColossalAIChat](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat)<br /> | |
| [EasyLM](https://github.com/young-geng/EasyLM.git)<br /> | |
| [Koala](https://bair.berkeley.edu/blog/2023/04/03/koala/)<br /> | |
| [Vicuna](https://vicuna.lmsys.org/)<br /> | |
| [Flan-Alpaca](https://github.com/declare-lab/flan-alpaca)<br /> | |
| [FastChat](https://chat.lmsys.org/)<br /> | |
| [alpaca-lora](https://github.com/h2oai/alpaca-lora)<br /> | |
| [alpaca.http](https://github.com/Nuked88/alpaca.http)<br /> | |
| [chatgpt-retrieval-plugin](https://github.com/openai/chatgpt-retrieval-plugin)<br /> | |
| [subtl.ai docs search on private docs](https://www.subtl.ai/)<br /> | |
| [gretel](https://gretel.ai/)<br /> | |
| [alpaca_lora_4bit](https://github.com/johnsmith0031/alpaca_lora_4bit)<br /> | |
| [alpaca_lora_4bit_readme](https://github.com/s4rduk4r/alpaca_lora_4bit_readme)<br /> | |
| [code alpaca](https://github.com/sahil280114/codealpaca)<br /> | |
| [serge](https://github.com/nsarrazin/serge)<br /> | |
| [BlinkDL](https://huggingface.co/spaces/BlinkDL/ChatRWKV-gradio)<br /> | |
| [RWKV-LM](https://github.com/BlinkDL/RWKV-LM)<br /> | |
| [MosaicML](https://github.com/mosaicml/examples#large-language-models-llms)<br /> | |
| [OpenAI Plugins](https://openai.com/blog/chatgpt-plugins)<br /> | |
| [GPT3.5-Turbo-PGVector](https://github.com/gannonh/gpt3.5-turbo-pgvector)<br /> | |
| [LLaMa-Adapter](https://github.com/ZrrSkywalker/LLaMA-Adapter)<br /> | |
| [llama-index](https://github.com/jerryjliu/llama_index)<br /> | |
| [minimal-llama](https://github.com/zphang/minimal-llama/)<br /> | |
| [llama.cpp](https://github.com/ggerganov/llama.cpp)<br /> | |
| [ggml](https://github.com/ggerganov/ggml)<br /> | |
| [mmap](https://justine.lol/mmap/)<br /> | |
| [llama.cpp more](https://til.simonwillison.net/llms/llama-7b-m2)<br /> | |
| [TargetedSummarization](https://github.com/helliun/targetedSummarization)<br /> | |
| [OpenFlamingo](https://laion.ai/blog/open-flamingo/)<br /> | |
| [Auto-GPT](https://github.com/Torantulino/Auto-GPT)<br /> | |
| ### Apache2/etc. Data | |
| [OIG 43M instructions](https://laion.ai/blog/oig-dataset/) [direct HF link](https://huggingface.co/datasets/laion/OIG)<br /> | |
| [More on OIG](https://laion.ai/blog/oig-dataset/)<br /> | |
| [DataSet Viewer](https://huggingface.co/datasets/viewer/?dataset=squad)<br /> | |
| [Anthropic RLHF](https://huggingface.co/datasets/Anthropic/hh-rlhf)<br /> | |
| [WebGPT_Comparisons](https://huggingface.co/datasets/openai/webgpt_comparisons)<br /> | |
| [Self_instruct](https://github.com/yizhongw/self-instruct)<br /> | |
| [20BChatModelData](https://github.com/togethercomputer/OpenDataHub)<br /> | |
| ### Apache2/MIT/BSD-3 Summarization Data | |
| [xsum for Summarization](https://huggingface.co/datasets/xsum)<br /> | |
| [Apache2 Summarization](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:apache-2.0&sort=downloads)<br /> | |
| [MIT summarization](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:mit&sort=downloads)<br /> | |
| [BSD-3 summarization](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:bsd-3-clause&sort=downloads)<br /> | |
| [OpenRail](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:openrail&sort=downloads)<br /> | |
| [Summarize_from_feedback](https://huggingface.co/datasets/openai/summarize_from_feedback)<br /> | |
| ### Ambiguous License Data | |
| [GPT-4-LLM](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM)<br /> | |
| [GPT4All](https://huggingface.co/datasets/nomic-ai/gpt4all_prompt_generations)<br /> | |
| [LinkGPT4](https://github.com/lm-sys/FastChat/issues/90#issuecomment-1493250773)<br /> | |
| [ShareGPT52K](https://huggingface.co/datasets/RyokoAI/ShareGPT52K)<br /> | |
| [ShareGPT_Vicuna](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered)<br /> | |
| [ChatLogs](https://chatlogs.net/)<br /> | |
| [Alpaca-CoT](https://github.com/PhoebusSi/alpaca-CoT)<br /> | |
| [LaMini-LM](https://github.com/mbzuai-nlp/LaMini-LM)<br /> | |
| ### Non-commercial Data | |
| [GPT-3 based Alpaca Cleaned](https://github.com/gururise/AlpacaDataCleaned)<br /> | |
| ### Prompt Engineering | |
| [Prompt/P-tuning](https://github.com/huggingface/peft)<br /> | |
| [Prompt/P-tuning NeMo/NVIDIA](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/nemo_megatron/prompt_learning.html)<br /> | |
| [Info](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/)<br /> | |
| [Info2](https://github.com/dair-ai/Prompt-Engineering-Guide)<br /> | |
| [Prompt-Tuning](https://arxiv.org/abs/2104.08691)<br /> | |
| [P-tuning v2](https://arxiv.org/abs/2110.07602)<br /> | |
| [babyagi](https://github.com/yoheinakajima/babyagi/blob/main/babyagi.py#L97-L134)<br /> | |
| [APE](https://www.promptingguide.ai/techniques/ape)<br /> | |
| ### Validation | |
| [Bleu/Rouge/Meteor/Bert-Score](https://arize.com/blog-course/generative-ai-metrics-bleu-score/)<br /> | |
| ### Generation Hyperparameters | |
| [how-to-generate](https://huggingface.co/blog/how-to-generate)<br /> | |
| [Notes_on_Transformers_Chpt5](https://christianjmills.com/posts/transformers-book-notes/chapter-5/index.html)<br /> | |
| [Notes_on_Transformers_Chpt10](https://christianjmills.com/posts/transformers-book-notes/chapter-10/index.html)<br /> | |
| ### Embeddings | |
| [OpenAI Expensive?](https://medium.com/@nils_reimers/openai-gpt-3-text-embeddings-really-a-new-state-of-the-art-in-dense-text-embeddings-6571fe3ec9d9)<br /> | |
| [Leaderboard](https://huggingface.co/spaces/mteb/leaderboard)<br /> | |
| ### Commercial products | |
| [OpenAI](https://platform.openai.com/docs/guides/fine-tuning/advanced-usage)<br /> | |
| [OpenAI Tokenizer](https://platform.openai.com/tokenizer)<br /> | |
| [OpenAI Playground](https://platform.openai.com/playground)<br /> | |
| [OpenAI Chat](https://chat.openai.com/chat?)<br /> | |
| [OpenAI GPT-4 Chat](https://chat.openai.com/chat?model=gpt-4)<br /> | |
| [cohere](https://cohere.io/)<br /> | |
| [coherefinetune](https://docs.cohere.ai/reference/finetune)<br /> | |
| [DocsBotAI](https://docsbot.ai/)<br /> | |
| [Perplexity](https://www.perplexity.ai/)<br /> | |
| [VoiceFlow](https://www.voiceflow.com/)<br /> | |
| [NLPCloud](https://nlpcloud.com/effectively-using-gpt-j-gpt-neo-gpt-3-alternatives-few-shot-learning.html)<br /> | |
| ### Multinode inference | |
| [FasterTransformer](https://github.com/triton-inference-server/fastertransformer_backend#multi-node-inference)<br /> | |
| [Kubernetes Triton](https://developer.nvidia.com/blog/deploying-nvidia-triton-at-scale-with-mig-and-kubernetes/)<br /> | |
| ### Faster inference | |
| [text-generation-inference](https://github.com/huggingface/text-generation-inference)<br /> | |
| [Optimum](https://github.com/huggingface/optimum)<br /> | |
| ### Semi-Open source Semi-Commercial products | |
| [OpenAssistant](https://open-assistant.io/)<br /> | |
| [OpenAssistant Repo](https://github.com/LAION-AI/Open-Assistant)<br /> | |
| [OpenChatKit](https://github.com/togethercomputer/OpenChatKit)<br /> | |
| [OpenChatKit2](https://github.com/togethercomputer/OpenDataHub)<br /> | |
| [OpenChatKit3](https://www.together.xyz/blog/openchatkit)<br /> | |
| [OpenChatKit4](https://github.com/togethercomputer/OpenChatKit/blob/main/training/README.md#arguments)<br /> | |
| [OpenChatKitPreview](https://api.together.xyz/open-chat?preview=1)<br /> | |
| [langchain](https://python.langchain.com/en/latest/)<br /> | |
| [langchain+pinecone](https://www.youtube.com/watch?v=nMniwlGyX-c)<br /> | |
| ### Q/A docs | |
| [HUMATA](https://www.humata.ai/)<br /> | |
| [OSSChat](https://osschat.io/)<br /> | |
| [NeuralSearchCohere](https://txt.cohere.com/embedding-archives-wikipedia/)<br /> | |
| [ue5](https://github.com/bublint/ue5-llama-lora)<br /> | |
| ### AutoGPT type projects | |
| [AgentGPT](https://github.com/reworkd/AgentGPT)<br /> | |
| [Self-DEBUG](https://arxiv.org/abs/2304.05128)<br /> | |
| [BabyAGI](https://github.com/yoheinakajima/babyagi/)<br /> | |
| [AutoPR](https://github.com/irgolic/AutoPR)<br /> | |
| ### Cloud fine-tune | |
| [AWS](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-fine-tune.html)<br /> | |
| [AWS2](https://aws.amazon.com/blogs/machine-learning/training-large-language-models-on-amazon-sagemaker-best-practices/)<br /> | |
| ### Chatbots: | |
| [GPT4ALL Chat](https://github.com/nomic-ai/gpt4all-chat)<br /> | |
| [GPT4ALL](https://github.com/nomic-ai/gpt4all)<br /> | |
| [OASST](https://open-assistant.io/chat)<br /> | |
| [FastChat](https://github.com/lm-sys/FastChat)<br /> | |
| [Dolly](https://huggingface.co/spaces/HuggingFaceH4/databricks-dolly)<br /> | |
| [HF Instructions](https://huggingface.co/spaces/HuggingFaceH4/instruction-model-outputs-filtered)<br /> | |
| [DeepSpeed Chat](https://github.com/microsoft/DeepSpeedExamples/tree/master/applications/DeepSpeed-Chat)<br /> | |
| [LoraChat](https://github.com/bupticybee/FastLoRAChat)<br /> | |
| [Tabby](https://github.com/TabbyML/tabby)<br /> | |
| [TalkToModel](https://github.com/dylan-slack/TalkToModel)<br /> | |
| [You.com](https://you.com/)<br /> | |
| ### LangChain or Agent related | |
| [Gradio Tools](https://github.com/freddyaboulton/gradio-tools)<br /> | |
| [LLM Agents](https://blog.langchain.dev/gradio-llm-agents/)<br /> | |
| [Meta Prompt](https://github.com/mbchang/meta-prompt)<br /> | |
| [HF Agents](https://huggingface.co/docs/transformers/transformers_agents)<br /> | |
| [HF Agents Colab](https://colab.research.google.com/drive/1c7MHD-T1forUPGcC_jlwsIptOzpG3hSj)<br /> | |
| [Einstein GPT](https://www.salesforce.com/products/einstein/overview/?d=cta-body-promo-8)<br /> | |
| [SMOL-AI](https://github.com/smol-ai/developer)<br /> | |
| [Pandas-AI](https://github.com/gventuri/pandas-ai/)<br /> | |
| ### Summaries | |
| [LLMs](https://github.com/Mooler0410/LLMsPracticalGuide)<br /> | |
| ### Deployment | |
| [MLC-LLM](https://github.com/mlc-ai/mlc-llm)<br /> | |
| ### Evaluations | |
| [LMSYS (check for latest blog)](https://lmsys.org/blog/2023-05-25-leaderboard/)<br /> | |
| [LMSYS Chatbot Arena](https://chat.lmsys.org/?arena)<br /> | |
| [LMSYS Add model](https://github.com/lm-sys/FastChat/blob/main/docs/arena.md#how-to-add-a-new-model)<br /> | |
| [NLL](https://blog.gopenai.com/lmflow-benchmark-an-automatic-evaluation-framework-for-open-source-llms-ef5c6f142418)<br /> | |
| [HackAPrompt](https://www.aicrowd.com/challenges/hackaprompt-2023/leaderboards)<br /> | |