<h1 align="center">
    🚅 LiteLLM
</h1>
<p align="center">
    <p align="center">
        <a href="https://render.com/deploy?repo=https://github.com/BerriAI/litellm" target="_blank" rel="nofollow"><img src="https://render.com/images/deploy-to-render-button.svg" alt="Deploy to Render"></a>
        <a href="https://railway.app/template/HLP0Ub?referralCode=jch2ME">
            <img src="https://railway.app/button.svg" alt="Deploy on Railway">
        </a>
    </p>
    <p align="center">Call all LLM APIs using the OpenAI format [Bedrock, Huggingface, VertexAI, TogetherAI, Azure, OpenAI, Groq etc.]
    <br>
    </p>
<h4 align="center"><a href="https://docs.litellm.ai/docs/simple_proxy" target="_blank">LiteLLM Proxy Server (LLM Gateway)</a> | <a href="https://docs.litellm.ai/docs/hosted" target="_blank">Hosted Proxy (Preview)</a> | <a href="https://docs.litellm.ai/docs/enterprise" target="_blank">Enterprise Tier</a></h4>
<h4 align="center">
    <a href="https://pypi.org/project/litellm/" target="_blank">
        <img src="https://img.shields.io/pypi/v/litellm.svg" alt="PyPI Version">
    </a>
    <a href="https://www.ycombinator.com/companies/berriai">
        <img src="https://img.shields.io/badge/Y%20Combinator-W23-orange?style=flat-square" alt="Y Combinator W23">
    </a>
    <a href="https://wa.link/huol9n">
        <img src="https://img.shields.io/static/v1?label=Chat%20on&message=WhatsApp&color=success&logo=WhatsApp&style=flat-square" alt="Whatsapp">
    </a>
    <a href="https://discord.gg/wuPM9dRgDw">
        <img src="https://img.shields.io/static/v1?label=Chat%20on&message=Discord&color=blue&logo=Discord&style=flat-square" alt="Discord">
    </a>
</h4>
LiteLLM manages:

- Translating inputs to the provider's `completion`, `embedding`, and `image_generation` endpoints
- [Consistent output](https://docs.litellm.ai/docs/completion/output): text responses are always available at `['choices'][0]['message']['content']`
- Retry/fallback logic across multiple deployments (e.g. Azure/OpenAI) with the [Router](https://docs.litellm.ai/docs/routing) (see the sketch below)
- Setting budgets & rate limits per project, API key, and model via the [LiteLLM Proxy Server (LLM Gateway)](https://docs.litellm.ai/docs/simple_proxy)
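A minimal Router sketch for the retry/fallback point above; the deployment name, keys, and endpoint below are placeholders:

```python
from litellm import Router

# two deployments behind one alias; the Router load balances across them
# and retries / falls back when one fails (placeholder credentials below)
router = Router(
    model_list=[
        {
            "model_name": "gpt-4o",  # alias callers use
            "litellm_params": {
                "model": "azure/my-gpt-4o-deployment",
                "api_key": "azure-api-key",
                "api_base": "https://my-endpoint.openai.azure.com",
            },
        },
        {
            "model_name": "gpt-4o",
            "litellm_params": {"model": "openai/gpt-4o", "api_key": "openai-api-key"},
        },
    ],
    num_retries=2,
)

response = router.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
```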
[**Jump to LiteLLM Proxy (LLM Gateway) Docs**](https://github.com/BerriAI/litellm?tab=readme-ov-file#openai-proxy---docs) <br>
[**Jump to Supported LLM Providers**](https://github.com/BerriAI/litellm?tab=readme-ov-file#supported-providers-docs)

🚨 **Stable Release:** Use docker images with the `-stable` tag. These have undergone 12-hour load tests before being published. [More information about the release cycle here](https://docs.litellm.ai/docs/proxy/release_cycle)

Support for more providers is ongoing. Missing a provider or LLM platform? Raise a [feature request](https://github.com/BerriAI/litellm/issues/new?assignees=&labels=enhancement&projects=&template=feature_request.yml&title=%5BFeature%5D%3A+).
# Usage ([**Docs**](https://docs.litellm.ai/docs/))

> [!IMPORTANT]
> LiteLLM v1.0.0 now requires `openai>=1.0.0`. Migration guide [here](https://docs.litellm.ai/docs/migration).
> LiteLLM v1.40.14+ now requires `pydantic>=2.0.0`. No changes required.

<a target="_blank" href="https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/liteLLM_Getting_Started.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

```shell
pip install litellm
```
```python
from litellm import completion
import os

## set ENV variables
os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-key"

messages = [{"content": "Hello, how are you?", "role": "user"}]

# openai call
response = completion(model="openai/gpt-4o", messages=messages)

# anthropic call
response = completion(model="anthropic/claude-3-sonnet-20240229", messages=messages)
print(response)
```
### Response (OpenAI Format)

```json
{
    "id": "chatcmpl-565d891b-a42e-4c39-8d14-82a1f5208885",
    "created": 1734366691,
    "model": "claude-3-sonnet-20240229",
    "object": "chat.completion",
    "system_fingerprint": null,
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "Hello! As an AI language model, I don't have feelings, but I'm operating properly and ready to assist you with any questions or tasks you may have. How can I help you today?",
                "role": "assistant",
                "tool_calls": null,
                "function_call": null
            }
        }
    ],
    "usage": {
        "completion_tokens": 43,
        "prompt_tokens": 13,
        "total_tokens": 56,
        "completion_tokens_details": null,
        "prompt_tokens_details": {
            "audio_tokens": null,
            "cached_tokens": 0
        },
        "cache_creation_input_tokens": 0,
        "cache_read_input_tokens": 0
    }
}
```
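Since the response object is OpenAI-compatible, the text can be read with either attribute access or the dict-style access shown earlier:

```python
# equivalent ways to read the response text
print(response.choices[0].message.content)
print(response["choices"][0]["message"]["content"])
```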
Call any model supported by a provider with `model=<provider_name>/<model_name>`. There may be provider-specific details, so refer to the [provider docs](https://docs.litellm.ai/docs/providers) for more information.
## Async ([Docs](https://docs.litellm.ai/docs/completion/stream#async-completion))

```python
from litellm import acompletion
import asyncio

async def test_get_response():
    user_message = "Hello, how are you?"
    messages = [{"content": user_message, "role": "user"}]
    response = await acompletion(model="openai/gpt-4o", messages=messages)
    return response

response = asyncio.run(test_get_response())
print(response)
```
## Streaming ([Docs](https://docs.litellm.ai/docs/completion/stream))

LiteLLM supports streaming the model response back. Pass `stream=True` to get a streaming iterator in the response.
Streaming is supported for all models (Bedrock, Huggingface, TogetherAI, Azure, OpenAI, etc.)

```python
from litellm import completion

# openai call
response = completion(model="openai/gpt-4o", messages=messages, stream=True)
for part in response:
    print(part.choices[0].delta.content or "")

# anthropic call
response = completion(model="anthropic/claude-3-sonnet-20240229", messages=messages, stream=True)
for part in response:
    print(part)
```
### Response chunk (OpenAI Format)

```json
{
    "id": "chatcmpl-2be06597-eb60-4c70-9ec5-8cd2ab1b4697",
    "created": 1734366925,
    "model": "claude-3-sonnet-20240229",
    "object": "chat.completion.chunk",
    "system_fingerprint": null,
    "choices": [
        {
            "finish_reason": null,
            "index": 0,
            "delta": {
                "content": "Hello",
                "role": "assistant",
                "function_call": null,
                "tool_calls": null,
                "audio": null
            },
            "logprobs": null
        }
    ]
}
```
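Async and streaming compose: pass `stream=True` to `acompletion` and iterate with `async for`. A minimal sketch:

```python
import asyncio
from litellm import acompletion

async def stream_response():
    response = await acompletion(
        model="openai/gpt-4o",
        messages=[{"content": "Hello, how are you?", "role": "user"}],
        stream=True,
    )
    # each chunk follows the OpenAI chunk format shown above
    async for part in response:
        print(part.choices[0].delta.content or "", end="")

asyncio.run(stream_response())
```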
## Logging Observability ([Docs](https://docs.litellm.ai/docs/observability/callbacks))

LiteLLM exposes pre-defined callbacks to send data to Lunary, MLflow, Langfuse, DynamoDB, S3 buckets, Helicone, Promptlayer, Traceloop, Athina, and Slack.

```python
import os
import litellm
from litellm import completion

## set env variables for logging tools (when using MLflow, no API key set up is required)
os.environ["LUNARY_PUBLIC_KEY"] = "your-lunary-public-key"
os.environ["HELICONE_API_KEY"] = "your-helicone-auth-key"
os.environ["LANGFUSE_PUBLIC_KEY"] = ""
os.environ["LANGFUSE_SECRET_KEY"] = ""
os.environ["ATHINA_API_KEY"] = "your-athina-api-key"
os.environ["OPENAI_API_KEY"] = "your-openai-key"

# set callbacks
litellm.success_callback = ["lunary", "mlflow", "langfuse", "athina", "helicone"]  # log input/output to lunary, mlflow, langfuse, athina, helicone

# openai call
response = completion(model="openai/gpt-4o", messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}])
```
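You can also log with your own function. A sketch of a custom success callback, assuming the `(kwargs, completion_response, start_time, end_time)` signature from the callback docs:

```python
import litellm
from litellm import completion

def log_duration_callback(kwargs, completion_response, start_time, end_time):
    # kwargs holds the original request args; start_time/end_time are datetimes
    duration = (end_time - start_time).total_seconds()
    print(f"model={kwargs.get('model')} took {duration:.2f}s")

litellm.success_callback = [log_duration_callback]

response = completion(model="openai/gpt-4o", messages=[{"role": "user", "content": "Hi"}])
```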
# LiteLLM Proxy Server (LLM Gateway) - ([Docs](https://docs.litellm.ai/docs/simple_proxy))

Track spend + load balance across multiple projects.

[Hosted Proxy (Preview)](https://docs.litellm.ai/docs/hosted)

The proxy provides:

1. [Hooks for auth](https://docs.litellm.ai/docs/proxy/virtual_keys#custom-auth)
2. [Hooks for logging](https://docs.litellm.ai/docs/proxy/logging#step-1---create-your-custom-litellm-callback-class)
3. [Cost tracking](https://docs.litellm.ai/docs/proxy/virtual_keys#tracking-spend)
4. [Rate Limiting](https://docs.litellm.ai/docs/proxy/users#set-rate-limits)
## 📖 Proxy Endpoints - [Swagger Docs](https://litellm-api.up.railway.app/)
## Quick Start Proxy - CLI

```shell
pip install 'litellm[proxy]'
```

### Step 1: Start litellm proxy

```shell
$ litellm --model huggingface/bigcode/starcoder

#INFO: Proxy running on http://0.0.0.0:4000
```
### Step 2: Make ChatCompletions Request to Proxy

> [!IMPORTANT]
> 💡 [Use LiteLLM Proxy with Langchain (Python, JS), OpenAI SDK (Python, JS), Anthropic SDK, Mistral SDK, LlamaIndex, Instructor, Curl](https://docs.litellm.ai/docs/proxy/user_keys)

```python
import openai  # openai v1.0.0+

client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000")  # set proxy to base_url

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "this is a test request, write a short poem"}],
)
print(response)
```
## Proxy Key Management ([Docs](https://docs.litellm.ai/docs/proxy/virtual_keys))

Connect the proxy with a Postgres DB to create proxy keys.

```bash
# Get the code
git clone https://github.com/BerriAI/litellm

# Go to folder
cd litellm

# Add the master key - you can change this after setup
echo 'LITELLM_MASTER_KEY="sk-1234"' > .env

# Add the litellm salt key - you cannot change this after adding a model
# It is used to encrypt / decrypt your LLM API Key credentials
# We recommend using a password generator (e.g. https://1password.com/password-generator/)
# to get a random hash for the litellm salt key
echo 'LITELLM_SALT_KEY="sk-1234"' >> .env

source .env

# Start
docker-compose up
```

The UI is available at `/ui` on your proxy server.



Set budgets and rate limits across multiple projects with `POST /key/generate`.
### Request

```shell
curl 'http://0.0.0.0:4000/key/generate' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data-raw '{"models": ["gpt-3.5-turbo", "gpt-4", "claude-2"], "duration": "20m", "metadata": {"user": "ishaan@berri.ai", "team": "core-infra"}}'
```

### Expected Response

```shell
{
    "key": "sk-kdEXbIqZRwEeEiHwdg7sFA", # Bearer token
    "expires": "2023-11-19T01:38:25.838000+00:00" # datetime object
}
```
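The generated key is then used as the `api_key` in any OpenAI-compatible client, and the proxy applies that key's model list, budget, and expiry. A sketch, reusing the example key above:

```python
import openai

# the key returned by /key/generate acts as the bearer token
client = openai.OpenAI(
    api_key="sk-kdEXbIqZRwEeEiHwdg7sFA",
    base_url="http://0.0.0.0:4000",
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # must be one of the models allowed for this key
    messages=[{"role": "user", "content": "Hello from a scoped key"}],
)
print(response.choices[0].message.content)
```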
## Supported Providers ([Docs](https://docs.litellm.ai/docs/providers))

| Provider | [Completion](https://docs.litellm.ai/docs/#basic-usage) | [Streaming](https://docs.litellm.ai/docs/completion/stream#streaming-responses) | [Async Completion](https://docs.litellm.ai/docs/completion/stream#async-completion) | [Async Streaming](https://docs.litellm.ai/docs/completion/stream#async-streaming) | [Async Embedding](https://docs.litellm.ai/docs/embedding/supported_embedding) | [Async Image Generation](https://docs.litellm.ai/docs/image_generation) |
|----------|------------|-----------|------------------|-----------------|-----------------|--------------------------|
| [openai](https://docs.litellm.ai/docs/providers/openai) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| [Meta - Llama API](https://docs.litellm.ai/docs/providers/meta_llama) | ✅ | ✅ | ✅ | ✅ | | |
| [azure](https://docs.litellm.ai/docs/providers/azure) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| [AI/ML API](https://docs.litellm.ai/docs/providers/aiml) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| [aws - sagemaker](https://docs.litellm.ai/docs/providers/aws_sagemaker) | ✅ | ✅ | ✅ | ✅ | ✅ | |
| [aws - bedrock](https://docs.litellm.ai/docs/providers/bedrock) | ✅ | ✅ | ✅ | ✅ | ✅ | |
| [google - vertex_ai](https://docs.litellm.ai/docs/providers/vertex) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| [google - palm](https://docs.litellm.ai/docs/providers/palm) | ✅ | ✅ | ✅ | ✅ | | |
| [google AI Studio - gemini](https://docs.litellm.ai/docs/providers/gemini) | ✅ | ✅ | ✅ | ✅ | | |
| [mistral ai api](https://docs.litellm.ai/docs/providers/mistral) | ✅ | ✅ | ✅ | ✅ | ✅ | |
| [cloudflare AI Workers](https://docs.litellm.ai/docs/providers/cloudflare_workers) | ✅ | ✅ | ✅ | ✅ | | |
| [cohere](https://docs.litellm.ai/docs/providers/cohere) | ✅ | ✅ | ✅ | ✅ | ✅ | |
| [anthropic](https://docs.litellm.ai/docs/providers/anthropic) | ✅ | ✅ | ✅ | ✅ | | |
| [empower](https://docs.litellm.ai/docs/providers/empower) | ✅ | ✅ | ✅ | ✅ | | |
| [huggingface](https://docs.litellm.ai/docs/providers/huggingface) | ✅ | ✅ | ✅ | ✅ | ✅ | |
| [replicate](https://docs.litellm.ai/docs/providers/replicate) | ✅ | ✅ | ✅ | ✅ | | |
| [together_ai](https://docs.litellm.ai/docs/providers/togetherai) | ✅ | ✅ | ✅ | ✅ | | |
| [openrouter](https://docs.litellm.ai/docs/providers/openrouter) | ✅ | ✅ | ✅ | ✅ | | |
| [ai21](https://docs.litellm.ai/docs/providers/ai21) | ✅ | ✅ | ✅ | ✅ | | |
| [baseten](https://docs.litellm.ai/docs/providers/baseten) | ✅ | ✅ | ✅ | ✅ | | |
| [vllm](https://docs.litellm.ai/docs/providers/vllm) | ✅ | ✅ | ✅ | ✅ | | |
| [nlp_cloud](https://docs.litellm.ai/docs/providers/nlp_cloud) | ✅ | ✅ | ✅ | ✅ | | |
| [aleph alpha](https://docs.litellm.ai/docs/providers/aleph_alpha) | ✅ | ✅ | ✅ | ✅ | | |
| [petals](https://docs.litellm.ai/docs/providers/petals) | ✅ | ✅ | ✅ | ✅ | | |
| [ollama](https://docs.litellm.ai/docs/providers/ollama) | ✅ | ✅ | ✅ | ✅ | ✅ | |
| [deepinfra](https://docs.litellm.ai/docs/providers/deepinfra) | ✅ | ✅ | ✅ | ✅ | | |
| [perplexity-ai](https://docs.litellm.ai/docs/providers/perplexity) | ✅ | ✅ | ✅ | ✅ | | |
| [Groq AI](https://docs.litellm.ai/docs/providers/groq) | ✅ | ✅ | ✅ | ✅ | | |
| [Deepseek](https://docs.litellm.ai/docs/providers/deepseek) | ✅ | ✅ | ✅ | ✅ | | |
| [anyscale](https://docs.litellm.ai/docs/providers/anyscale) | ✅ | ✅ | ✅ | ✅ | | |
| [IBM - watsonx.ai](https://docs.litellm.ai/docs/providers/watsonx) | ✅ | ✅ | ✅ | ✅ | ✅ | |
| [voyage ai](https://docs.litellm.ai/docs/providers/voyage) | | | | | ✅ | |
| [xinference [Xorbits Inference]](https://docs.litellm.ai/docs/providers/xinference) | | | | | ✅ | |
| [FriendliAI](https://docs.litellm.ai/docs/providers/friendliai) | ✅ | ✅ | ✅ | ✅ | | |
| [Galadriel](https://docs.litellm.ai/docs/providers/galadriel) | ✅ | ✅ | ✅ | ✅ | | |
| [Novita AI](https://novita.ai/models/llm?utm_source=github_litellm&utm_medium=github_readme&utm_campaign=github_link) | ✅ | ✅ | ✅ | ✅ | | |
| [Featherless AI](https://docs.litellm.ai/docs/providers/featherless_ai) | ✅ | ✅ | ✅ | ✅ | | |
| [Nebius AI Studio](https://docs.litellm.ai/docs/providers/nebius) | ✅ | ✅ | ✅ | ✅ | ✅ | |

[**Read the Docs**](https://docs.litellm.ai/docs/)
## Contributing

Interested in contributing? Contributions to the LiteLLM Python SDK, Proxy Server, and LLM integrations are all accepted and highly encouraged!

**Quick start:** `git clone` → `make install-dev` → `make format` → `make lint` → `make test-unit`

See our comprehensive [Contributing Guide (CONTRIBUTING.md)](CONTRIBUTING.md) for detailed instructions.
# Enterprise

For companies that need better security, user management, and professional support.

[Talk to founders](https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat)

This covers:

- ✅ **Features under the [LiteLLM Commercial License](https://docs.litellm.ai/docs/proxy/enterprise)**
- ✅ **Feature Prioritization**
- ✅ **Custom Integrations**
- ✅ **Professional Support - Dedicated Discord + Slack**
- ✅ **Custom SLAs**
- ✅ **Secure access with Single Sign-On**
# Contributing

We welcome contributions to LiteLLM! Whether you're fixing bugs, adding features, or improving documentation, we appreciate your help.

## Quick Start for Contributors

```bash
git clone https://github.com/BerriAI/litellm.git
cd litellm
make install-dev    # Install development dependencies
make format         # Format your code
make lint           # Run all linting checks
make test-unit      # Run unit tests
```

For detailed contributing guidelines, see [CONTRIBUTING.md](CONTRIBUTING.md).
## Code Quality / Linting

LiteLLM follows the [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html).

Our automated checks include:

- **Black** for code formatting
- **Ruff** for linting and code quality
- **MyPy** for type checking
- **Circular import detection**
- **Import safety checks**

Run all checks locally:

```bash
make lint           # Run all linting (matches CI)
make format-check   # Check formatting only
```

All these checks must pass before your PR can be merged.
# Support / talk with founders

- [Schedule Demo 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version)
- [Community Discord 💭](https://discord.gg/wuPM9dRgDw)
- Our numbers 📞 +1 (770) 8783-106 / +1 (412) 618-6238
- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai
# Why did we build this

- **Need for simplicity**: Our code started to get extremely complicated managing & translating calls between Azure, OpenAI and Cohere.
# Contributors

<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
<!-- prettier-ignore-start -->
<!-- markdownlint-disable -->
<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- ALL-CONTRIBUTORS-LIST:END -->

<a href="https://github.com/BerriAI/litellm/graphs/contributors">
  <img src="https://contrib.rocks/image?repo=BerriAI/litellm" />
</a>
## Run in Developer mode

### Services

1. Set up a `.env` file in the repo root
2. Run dependent services: `docker-compose up db prometheus`

### Backend

1. (In root) create a virtual environment: `python -m venv .venv`
2. Activate the virtual environment: `source .venv/bin/activate`
3. Install dependencies: `pip install -e ".[all]"`
4. Start the proxy backend: `uvicorn litellm.proxy.proxy_server:app --host localhost --port 4000 --reload`

### Frontend

1. Navigate to `ui/litellm-dashboard`
2. Install dependencies: `npm install`
3. Run `npm run dev` to start the dashboard