kelvin-t-lu

init

dbd2ac6 about 2 years ago

	### Code to consider including:
	[flan-alpaca](https://github.com/declare-lab/flan-alpaca)<br />
	[text-generation-webui](https://github.com/oobabooga/text-generation-webui)<br />
	[minimal-llama](https://github.com/zphang/minimal-llama/)<br />
	[finetune GPT-NeoX](https://nn.labml.ai/neox/samples/finetune.html)<br />
	[GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa/compare/cuda...Digitous:GPTQ-for-GPT-NeoX:main)<br />
	[OpenChatKit on multi-GPU](https://github.com/togethercomputer/OpenChatKit/issues/20)<br />
	[Non-Causal LLM](https://huggingface.co/docs/transformers/main/en/model_doc/gptj#transformers.GPTJForSequenceClassification)<br />
	[OpenChatKit_Offload](https://github.com/togethercomputer/OpenChatKit/commit/148b5745a57a6059231178c41859ecb09164c157)<br />
	[Flan-alpaca](https://github.com/declare-lab/flan-alpaca/blob/main/training.py)<br />

	### Some open source models:
	[GPT-NeoXT-Chat-Base-20B](https://huggingface.co/togethercomputer/GPT-NeoXT-Chat-Base-20B/tree/main)<br />
	[GPT-NeoX](https://huggingface.co/docs/transformers/model_doc/gpt_neox)<br />
	[GPT-NeoX-20B](https://huggingface.co/EleutherAI/gpt-neox-20b)<br />
	[Pythia-6.9B](https://huggingface.co/EleutherAI/pythia-6.9b)<br />
	[Pythia-12B](https://huggingface.co/EleutherAI/neox-ckpt-pythia-12b)<br />
	[Flan-T5-XXL](https://huggingface.co/google/flan-t5-xxl)<br />
	[GPT-JT-Moderation-6B](https://huggingface.co/togethercomputer/GPT-JT-Moderation-6B)<br />
	[OIG safety models](https://laion.ai/blog/oig-dataset/#safety-models)<br />
	[BigScience-mT0](https://huggingface.co/mT0)<br />
	[BigScience-xP3](https://huggingface.co/datasets/bigscience/xP3)<br />
	[BigScience-Bloomz](https://huggingface.co/bigscience/bloomz)<br />

	### Some Creative Commons models that would be interesting to use:
	[Galactica-120B](https://huggingface.co/facebook/galactica-120b)<br />
	[LLaMa-smallint-pt](https://huggingface.co/decapoda-research/llama-smallint-pt)<br />
	[LLaMa-65b-4bit](https://huggingface.co/maderix/llama-65b-4bit/tree/main)<br />

	### Papers/Repos
	[Self-improve](https://arxiv.org/abs/2210.11610)<br />
	[Coding](https://arxiv.org/abs/2303.17491)<br />
	[self-reflection](https://arxiv.org/abs/2303.11366)<br />
	[RLHF](https://arxiv.org/abs/2204.05862)<br />
	[DERA](https://arxiv.org/abs/2303.17071)<br />
	[HAI Index Report 2023](https://aiindex.stanford.edu/report/)<br />
	[LLaMa](https://arxiv.org/abs/2302.13971)<br />
	[GLM-130B](https://github.com/THUDM/GLM-130B)<br />
	[RWKV RNN](https://github.com/BlinkDL/RWKV-LM)<br />
	[Toolformer](https://arxiv.org/abs/2302.04761)<br />
	[GPTQ](https://github.com/qwopqwop200/GPTQ-for-LLaMa)<br />
	[Retro](https://www.deepmind.com/publications/improving-language-models-by-retrieving-from-trillions-of-tokens)<br />
	[Clinical_outperforms](https://arxiv.org/abs/2302.08091)<br />
	[Chain-Of-Thought](https://github.com/amazon-science/mm-cot)<br />
	[scaling law1](https://arxiv.org/abs/2203.15556)<br />
	[Big-bench](https://github.com/google/BIG-bench)<br />
	[Natural-Instructions](https://github.com/allenai/natural-instructions)<br />

	### Other projects:
	[StackLLaMa](https://huggingface.co/blog/stackllama)<br />
	[Alpaca-CoT](https://github.com/PhoebusSi/alpaca-CoT)<br />
	[ColossalAIChat](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat)<br />
	[EasyLM](https://github.com/young-geng/EasyLM.git)<br />
	[Koala](https://bair.berkeley.edu/blog/2023/04/03/koala/)<br />
	[Vicuna](https://vicuna.lmsys.org/)<br />
	[Flan-Alpaca](https://github.com/declare-lab/flan-alpaca)<br />
	[FastChat](https://chat.lmsys.org/)<br />
	[alpaca-lora](https://github.com/h2oai/alpaca-lora)<br />
	[alpaca.http](https://github.com/Nuked88/alpaca.http)<br />
	[chatgpt-retrieval-plugin](https://github.com/openai/chatgpt-retrieval-plugin)<br />
	[subtl.ai docs search on private docs](https://www.subtl.ai/)<br />
	[gretel](https://gretel.ai/)<br />
	[alpaca_lora_4bit](https://github.com/johnsmith0031/alpaca_lora_4bit)<br />
	[alpaca_lora_4bit_readme](https://github.com/s4rduk4r/alpaca_lora_4bit_readme)<br />
	[code alpaca](https://github.com/sahil280114/codealpaca)<br />
	[serge](https://github.com/nsarrazin/serge)<br />
	[BlinkDL](https://huggingface.co/spaces/BlinkDL/ChatRWKV-gradio)<br />
	[RWKV-LM](https://github.com/BlinkDL/RWKV-LM)<br />
	[MosaicML](https://github.com/mosaicml/examples#large-language-models-llms)<br />
	[OpenAI Plugins](https://openai.com/blog/chatgpt-plugins)<br />
	[GPT3.5-Turbo-PGVector](https://github.com/gannonh/gpt3.5-turbo-pgvector)<br />
	[LLaMa-Adapter](https://github.com/ZrrSkywalker/LLaMA-Adapter)<br />
	[llama-index](https://github.com/jerryjliu/llama_index)<br />
	[minimal-llama](https://github.com/zphang/minimal-llama/)<br />
	[llama.cpp](https://github.com/ggerganov/llama.cpp)<br />
	[ggml](https://github.com/ggerganov/ggml)<br />
	[mmap](https://justine.lol/mmap/)<br />
	[llama.cpp more](https://til.simonwillison.net/llms/llama-7b-m2)<br />
	[TargetedSummarization](https://github.com/helliun/targetedSummarization)<br />
	[OpenFlamingo](https://laion.ai/blog/open-flamingo/)<br />
	[Auto-GPT](https://github.com/Torantulino/Auto-GPT)<br />

	### Apache2/etc. Data
	[OIG 43M instructions](https://laion.ai/blog/oig-dataset/) [direct HF link](https://huggingface.co/datasets/laion/OIG)<br />
	[More on OIG](https://laion.ai/blog/oig-dataset/)<br />
	[DataSet Viewer](https://huggingface.co/datasets/viewer/?dataset=squad)<br />
	[Anthropic RLHF](https://huggingface.co/datasets/Anthropic/hh-rlhf)<br />
	[WebGPT_Comparisons](https://huggingface.co/datasets/openai/webgpt_comparisons)<br />
	[Self_instruct](https://github.com/yizhongw/self-instruct)<br />
	[20BChatModelData](https://github.com/togethercomputer/OpenDataHub)<br />

	### Apache2/MIT/BSD-3 Summarization Data
	[xsum for Summarization](https://huggingface.co/datasets/xsum)<br />
	[Apache2 Summarization](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:apache-2.0&sort=downloads)<br />
	[MIT summarization](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:mit&sort=downloads)<br />
	[BSD-3 summarization](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:bsd-3-clause&sort=downloads)<br />
	[OpenRail](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:openrail&sort=downloads)<br />
	[Summarize_from_feedback](https://huggingface.co/datasets/openai/summarize_from_feedback)<br />

	### Ambiguous License Data
	[GPT-4-LLM](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM)<br />
	[GPT4All](https://huggingface.co/datasets/nomic-ai/gpt4all_prompt_generations)<br />
	[LinkGPT4](https://github.com/lm-sys/FastChat/issues/90#issuecomment-1493250773)<br />
	[ShareGPT52K](https://huggingface.co/datasets/RyokoAI/ShareGPT52K)<br />
	[ShareGPT_Vicuna](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered)<br />
	[ChatLogs](https://chatlogs.net/)<br />
	[Alpaca-CoT](https://github.com/PhoebusSi/alpaca-CoT)<br />
	[LaMini-LM](https://github.com/mbzuai-nlp/LaMini-LM)<br />

	### Non-commercial Data
	[GPT-3 based Alpaca Cleaned](https://github.com/gururise/AlpacaDataCleaned)<br />

	### Prompt Engineering
	[Prompt/P-tuning](https://github.com/huggingface/peft)<br />
	[Prompt/P-tuning Nemo/NVIDIA](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/nemo_megatron/prompt_learning.html)<br />
	[Info](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/)<br />
	[Info2](https://github.com/dair-ai/Prompt-Engineering-Guide)<br />
	[Prompt-Tuning](https://arxiv.org/abs/2104.08691)<br />
	[P-tuning v2](https://arxiv.org/abs/2110.07602)<br />
	[babyagi](https://github.com/yoheinakajima/babyagi/blob/main/babyagi.py#L97-L134)<br />
	[APE](https://www.promptingguide.ai/techniques/ape)<br />

	### Validation
	[Bleu/Rouge/Meteor/Bert-Score](https://arize.com/blog-course/generative-ai-metrics-bleu-score/)<br />

	### Generate Hyperparameters
	[how-to-generate](https://huggingface.co/blog/how-to-generate)<br />
	[Notes_on_Transformers_Chpt5](https://christianjmills.com/posts/transformers-book-notes/chapter-5/index.html)<br />
	[Notes_on_Transformers_Chpt10](https://christianjmills.com/posts/transformers-book-notes/chapter-10/index.html)<br />

	### Embeddings
	[OpenAI Expensive?](https://medium.com/@nils_reimers/openai-gpt-3-text-embeddings-really-a-new-state-of-the-art-in-dense-text-embeddings-6571fe3ec9d9)<br />
	[Leaderboard](https://huggingface.co/spaces/mteb/leaderboard)<br />

	### Commercial products
	[OpenAI](https://platform.openai.com/docs/guides/fine-tuning/advanced-usage)<br />
	[OpenAI Tokenizer](https://platform.openai.com/tokenizer)<br />
	[OpenAI Playground](https://platform.openai.com/playground)<br />
	[OpenAI Chat](https://chat.openai.com/chat?)<br />
	[OpenAI GPT-4 Chat](https://chat.openai.com/chat?model=gpt-4)<br />
	[cohere](https://cohere.io/)<br />
	[coherefinetune](https://docs.cohere.ai/reference/finetune)<br />
	[DocsBotAI](https://docsbot.ai/)<br />
	[Perplexity](https://www.perplexity.ai/)<br />
	[VoiceFlow](https://www.voiceflow.com/)<br />
	[NLPCloud](https://nlpcloud.com/effectively-using-gpt-j-gpt-neo-gpt-3-alternatives-few-shot-learning.html)<br />

	### Multinode inference
	[FasterTransformer](https://github.com/triton-inference-server/fastertransformer_backend#multi-node-inference)<br />
	[Kubernetes Triton](https://developer.nvidia.com/blog/deploying-nvidia-triton-at-scale-with-mig-and-kubernetes/)<br />

	### Faster inference
	[text-generation-inference](https://github.com/huggingface/text-generation-inference)<br />
	[Optimum](https://github.com/huggingface/optimum)<br />

	### Semi-Open source Semi-Commercial products
	[OpenAssistant](https://open-assistant.io/)<br />
	[OpenAssistant Repo](https://github.com/LAION-AI/Open-Assistant)<br />
	[OpenChatKit](https://github.com/togethercomputer/OpenChatKit)<br />
	[OpenChatKit2](https://github.com/togethercomputer/OpenDataHub)<br />
	[OpenChatKit3](https://www.together.xyz/blog/openchatkit)<br />
	[OpenChatKit4](https://github.com/togethercomputer/OpenChatKit/blob/main/training/README.md#arguments)<br />
	[OpenChatKitPreview](https://api.together.xyz/open-chat?preview=1)<br />
	[langchain](https://python.langchain.com/en/latest/)<br />
	[langchain+pinecone](https://www.youtube.com/watch?v=nMniwlGyX-c)<br />

	### Q/A docs
	[HUMATA](https://www.humata.ai/)<br />
	[OSSChat](https://osschat.io/)<br />
	[NeuralSearchCohere](https://txt.cohere.com/embedding-archives-wikipedia/)<br />
	[ue5](https://github.com/bublint/ue5-llama-lora)<br />

	### AutoGPT type projects
	[AgentGPT](https://github.com/reworkd/AgentGPT)<br />
	[Self-DEBUG](https://arxiv.org/abs/2304.05128)<br />
	[BabyAGI](https://github.com/yoheinakajima/babyagi/)<br />
	[AutoPR](https://github.com/irgolic/AutoPR)<br />

	### Cloud fine-tune
	[AWS](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-fine-tune.html)<br />
	[AWS2](https://aws.amazon.com/blogs/machine-learning/training-large-language-models-on-amazon-sagemaker-best-practices/)<br />

	### Chatbots:
	[GPT4ALL Chat](https://github.com/nomic-ai/gpt4all-chat)<br />
	[GPT4ALL](https://github.com/nomic-ai/gpt4all)<br />
	[OASST](https://open-assistant.io/chat)<br />
	[FastChat](https://github.com/lm-sys/FastChat)<br />
	[Dolly](https://huggingface.co/spaces/HuggingFaceH4/databricks-dolly)<br />
	[HF Instructions](https://huggingface.co/spaces/HuggingFaceH4/instruction-model-outputs-filtered)<br />
	[DeepSpeed Chat](https://github.com/microsoft/DeepSpeedExamples/tree/master/applications/DeepSpeed-Chat)<br />
	[LoraChat](https://github.com/bupticybee/FastLoRAChat)<br />
	[Tabby](https://github.com/TabbyML/tabby)<br />
	[TalkToModel](https://github.com/dylan-slack/TalkToModel)<br />
	[You.com](https://you.com/)<br />

	### LangChain or Agent related
	[Gradio Tools](https://github.com/freddyaboulton/gradio-tools)<br />
	[LLM Agents](https://blog.langchain.dev/gradio-llm-agents/)<br />
	[Meta Prompt](https://github.com/mbchang/meta-prompt)<br />
	[HF Agents](https://huggingface.co/docs/transformers/transformers_agents)<br />
	[HF Agents Colab](https://colab.research.google.com/drive/1c7MHD-T1forUPGcC_jlwsIptOzpG3hSj)<br />
	[Einstein GPT](https://www.salesforce.com/products/einstein/overview/?d=cta-body-promo-8)<br />
	[SMOL-AI](https://github.com/smol-ai/developer)<br />
	[Pandas-AI](https://github.com/gventuri/pandas-ai/)<br />

	### Summaries
	[LLMs](https://github.com/Mooler0410/LLMsPracticalGuide)<br />

	### Deployment
	[MLC-LLM](https://github.com/mlc-ai/mlc-llm)<br />

	### Evaluations
	[LMSYS (check for latest blog)](https://lmsys.org/blog/2023-05-25-leaderboard/)<br />
	[LMSYS Chatbot Arena](https://chat.lmsys.org/?arena)<br />
	[LMSYS Add model](https://github.com/lm-sys/FastChat/blob/main/docs/arena.md#how-to-add-a-new-model)<br />
	[NLL](https://blog.gopenai.com/lmflow-benchmark-an-automatic-evaluation-framework-for-open-source-llms-ef5c6f142418)<br />
	[HackAPrompt](https://www.aicrowd.com/challenges/hackaprompt-2023/leaderboards)<br />