{
"cells": [
{
"cell_type": "markdown",
"id": "04cabe4c",
"metadata": {},
"source": [
"Uncommend and run if dependencies are not installed"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "cc4d2b9b",
"metadata": {},
"outputs": [],
"source": [
"# !pip install -q pyyaml\n",
"# !pip install -q requests\n",
"# !pip install -q dotenv\n",
"# !pip install -qU langchain-community\n",
"# !pip install -q pypdf\n",
"# %pip install -qU langchain-groq\n",
"# !pip install -q chromadb\n",
"# !pip install -q sentence-transformers"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "7cdfaebc",
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"import os\n",
"\n",
"project_root = os.path.abspath(\"..\") # adjust this depending on where your notebook lives\n",
"if project_root not in sys.path:\n",
" sys.path.insert(0, project_root)\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "72e187e0",
"metadata": {},
"outputs": [],
"source": [
"from src.pipeline import ChatPipeline"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "f79416f1",
"metadata": {},
"outputs": [],
"source": [
"from src.utils import load_config"
]
},
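{
"cell_type": "markdown",
"id": "b3d91a20",
"metadata": {},
"source": [
"`load_config` is imported above but not called in this run. The next cell is a minimal, commented-out sketch of how it might be used to inspect the YAML settings; the `\"../config.yaml\"` path and the single-argument signature are assumptions, so check `src/utils.py` for the real interface before uncommenting."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c7e24f91",
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical usage sketch: the config path below is a placeholder, not taken from this repo.\n",
"# config = load_config(\"../config.yaml\")\n",
"# config"
]
},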
{
"cell_type": "code",
"execution_count": 4,
"id": "ba557b13",
"metadata": {},
"outputs": [],
"source": [
"cp = ChatPipeline()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "49dc2580",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"d:\\Thesis\\Vinayak Rana\\LLM\\RAG\\src\\embedding.py:16: LangChainDeprecationWarning: The class `HuggingFaceEmbeddings` was deprecated in LangChain 0.2.2 and will be removed in 1.0. An updated version of the class exists in the :class:`~langchain-huggingface package and should be used instead. To use it run `pip install -U :class:`~langchain-huggingface` and import as `from :class:`~langchain_huggingface import HuggingFaceEmbeddings``.\n",
" return HuggingFaceEmbeddings(model_name=self.model_name)\n",
"c:\\Users\\vinny\\Miniconda3\\envs\\scholarchatbot\\lib\\site-packages\\tqdm\\auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
" from .autonotebook import tqdm as notebook_tqdm\n",
"d:\\Thesis\\Vinayak Rana\\LLM\\RAG\\src\\pipeline.py:79: LangChainDeprecationWarning: Since Chroma 0.4.x the manual persistence method is no longer supported as docs are automatically persisted.\n",
" vector_store.persist()\n",
"d:\\Thesis\\Vinayak Rana\\LLM\\RAG\\llm\\answer_generator.py:23: LangChainDeprecationWarning: Please see the migration guide at: https://python.langchain.com/docs/versions/migrating_memory/\n",
" self.memory = ConversationBufferWindowMemory(\n"
]
}
],
"source": [
"cp.setup(arxiv_id=\"2407.05040\")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "ca77354b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Based on the provided context, here\\'s a differentiation between Self-Instruct, Evol-Instruct, and OSSInstruct:\\n\\n1. **Self-Instruct**: This technique is used to align language models with self-generated instructions. It involves generating instruction-following data points through the Self-Instruct technique, which is utilized in Codealpaca and CodeLlama. The Self-Instruct technique is described in the paper \"Self-instruct: Aligning language models with self-generated instructions\" by Yizhong Wang et al. (2022).\\n\\n2. **Evol-Instruct**: This technique is used to evolve instruction-following data in both depth and breadth dimensions. It is employed in Wizardcoder to further evolve the Codealpaca dataset. The Evol-Instruct method is described in the paper \"EvolInstruct\" by Can Xu et al. (2023a).\\n\\n3. **OSSInstruct**: This technique is used to create instruction-following data from unlabeled open-source code snippets. It is employed in Magicoder to construct a method. The OSSInstruct technique is not described in detail in the provided context, but it is mentioned as a distinct method used in Magicoder.\\n\\nIn summary, Self-Instruct generates instruction-following data points, Evol-Instruct evolves instruction-following data, and OSSInstruct creates instruction-following data from open-source code snippets.'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cp.query(\"can you differentiate between self instruct , evol instruct and OSS ?\")"
]
},
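{
"cell_type": "markdown",
"id": "f2a61c05",
"metadata": {},
"source": [
"A possible follow-up turn, sketched here rather than taken from the original run: it reuses only `cp.query`, which is shown above, and assumes the pipeline's windowed conversation memory (see the `ConversationBufferWindowMemory` warning during setup) carries the previous exchange. The question wording is illustrative and the output depends on the retrieved context, so none is shown."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d8b340e7",
"metadata": {},
"outputs": [],
"source": [
"# Illustrative follow-up question (not from the original run); relies only on cp.query() as used above.\n",
"# The answer will vary with the retrieved chunks and the conversation memory.\n",
"cp.query(\"Which of these instruction-generation techniques does the paper itself build on?\")"
]
}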
],
"metadata": {
"kernelspec": {
"display_name": "scholarchatbot",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.18"
}
},
"nbformat": 4,
"nbformat_minor": 5
}