Getting Started & Usage
Once LiteLLM is deployed and the health check passes, connecting your applications and issuing your first call takes a few minutes.
How to Call LiteLLM from Any Application
LiteLLM is OpenAI-compatible. Any application that can call the OpenAI API can call LiteLLM without changes to its request logic: only the base URL and API key need updating.
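As a minimal sketch of what "only the base URL and API key change" means in practice, the snippet below builds an OpenAI-style chat completion request with the Python standard library. The gateway address, the placeholder key, and the helper names `build_chat_request` and `send` are assumptions for illustration; an OpenAI SDK client configured with the same base URL and key should behave equivalently.

```python
import json
import urllib.request

# Assumed placeholders: replace with your gateway address and virtual key.
LITELLM_BASE_URL = "http://localhost:4000/v1"
VIRTUAL_KEY = "<YOUR_VIRTUAL_KEY>"


def build_chat_request(model: str, prompt: str,
                       base_url: str = LITELLM_BASE_URL,
                       api_key: str = VIRTUAL_KEY) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request aimed at LiteLLM."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply (needs a running gateway)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `send(build_chat_request("gpt-4o", "Summarize this document."))` would issue the call; the model argument is expected to be an alias configured in LiteLLM rather than a provider model ID.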
Creating Virtual Keys
Virtual keys let you issue scoped API credentials to teams, services, or apps without exposing your provider keys.
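The sketch below shows one way to assemble a key-creation request against the /key/generate endpoint using the Python standard library. The helper name `build_key_request` is hypothetical, the model aliases and metadata are example values, and the duration string "30d" is an assumption: confirm the accepted `budget_duration` format for your LiteLLM version.

```python
import json
import urllib.request
from typing import Dict, List


def build_key_request(master_key: str, models: List[str], max_budget: float,
                      budget_duration: str, metadata: Dict[str, str],
                      base_url: str = "http://localhost:4000") -> urllib.request.Request:
    """Build a POST /key/generate request scoped to the given models and budget."""
    payload = {
        "models": models,                    # aliases this key may call
        "max_budget": max_budget,            # spend cap for the key
        "budget_duration": budget_duration,  # reset window (format is an assumption)
        "metadata": metadata,                # free-form labels for reporting
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/key/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Example values only; nothing is sent until the request is passed to urlopen.
request = build_key_request(
    master_key="<LITELLM_MASTER_KEY>",
    models=["gpt-4o", "gemini-flash"],
    max_budget=100,
    budget_duration="30d",
    metadata={"team": "risk-team", "environment": "prod"},
)
```

Passing `request` to `urllib.request.urlopen` against a running gateway would create the key. The master key should stay on the administrator side; applications only ever receive the generated virtual key.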
The returned key (sk-...) is what the application uses. LiteLLM tracks spend per key.
Viewing Available Models
curl http://localhost:4000/v1/models \
  -H "Authorization: Bearer <YOUR_KEY>"
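Because the models endpoint follows the OpenAI list format (a `data` array of objects with an `id` field), an application can check at startup that the aliases it depends on are exposed to its key. The following is a sketch under that assumption; `model_ids` and `missing_models` are hypothetical helper names.

```python
from typing import Iterable, List


def model_ids(models_response: dict) -> List[str]:
    """Return the sorted model names from an OpenAI-style /v1/models reply."""
    return sorted(entry["id"] for entry in models_response.get("data", []))


def missing_models(models_response: dict, required: Iterable[str]) -> List[str]:
    """Return the required aliases that the gateway did not list."""
    available = set(model_ids(models_response))
    return [name for name in required if name not in available]
```

An empty list from `missing_models` means every alias the application needs is visible to the key it is using.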
Checking Usage and Spend
# Per-key usage (GET request; the key is passed as a query parameter)
curl "http://localhost:4000/key/info?key=sk-..." \
  -H "Authorization: Bearer <LITELLM_MASTER_KEY>"
# All keys overview
curl http://localhost:4000/key/list \
  -H "Authorization: Bearer <LITELLM_MASTER_KEY>"
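To act on the per-key numbers, for example to warn before a budget runs out, the reply can be reduced to a single remaining-budget figure. This sketch assumes the reply nests `spend` and `max_budget` under an `info` object; that shape is an assumption to verify against your LiteLLM version, and `remaining_budget` is a hypothetical helper.

```python
from typing import Optional


def remaining_budget(key_info: dict) -> Optional[float]:
    """Return the unspent budget for a key, or None when no budget is set.

    Assumes a reply shaped like {"key": "...", "info": {"spend": ..., "max_budget": ...}}.
    """
    info = key_info.get("info") or {}
    max_budget = info.get("max_budget")
    if max_budget is None:
        return None  # the key has no spend cap configured
    spend = info.get("spend") or 0.0
    return max_budget - spend
```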
Testing a Specific Model
curl -X POST http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer <YOUR_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "What is 2+2?"}]
  }'
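The request above can be turned into a repeatable check by reading the reply text from the OpenAI-style response (`choices[0].message.content`) and comparing it with an expected substring. In this sketch, `call_model` is a hypothetical callable that sends the request and returns the decoded JSON, so the check itself stays independent of any HTTP client.

```python
from typing import Callable


def first_reply(completion: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion."""
    return completion["choices"][0]["message"]["content"]


def smoke_test(call_model: Callable[[str, str], dict], model: str,
               prompt: str = "What is 2+2?", expected: str = "4") -> bool:
    """Send one small prompt to `model` and check the reply mentions `expected`."""
    reply = first_reply(call_model(model, prompt))
    return expected in reply
```

A substring match is deliberately loose, since models phrase answers differently; it confirms routing and credentials rather than answer quality.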
Streaming Responses
curl -X POST http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer <YOUR_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Tell me a story"}],"stream":true}'
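With `"stream": true`, an OpenAI-compatible endpoint replies with server-sent events: each `data:` line carries a JSON chunk whose `choices[].delta.content` holds the next text fragment, and a final `data: [DONE]` line ends the stream. The sketch below parses such lines from any iterable of strings; `iter_stream_text` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style streaming chat completion lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

Joining the fragments with `"".join(iter_stream_text(lines))` reconstructs the full reply; printing each fragment as it arrives gives the incremental display that streaming is meant for.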
Connecting Dify to LiteLLM
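In the Dify settings, configure a new OpenAI-compatible model provider that points at LiteLLM: use the LiteLLM base URL as the API endpoint, a virtual key as the API key, and a model alias as the model name.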
Once configured, all Dify model calls route through LiteLLM.
Practical Tips
- Use model aliases (not provider model IDs) in your app code so you can swap the underlying model without code changes
- Issue separate virtual keys per application or team for usage isolation
- Set budget limits on virtual keys to prevent runaway costs
- Use the /health endpoint in your application startup health checks
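The last tip can be sketched as a small retry loop that an application runs before accepting traffic. The function names, retry counts, and the use of an Authorization header on /health are assumptions for illustration; the probe is injected so the retry logic can be exercised without a running gateway.

```python
import time
import urllib.request
from typing import Callable


def wait_until_healthy(probe: Callable[[], bool], attempts: int = 5,
                       delay_seconds: float = 2.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call `probe` until it returns True or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            if probe():
                return True
        except OSError:
            pass  # connection refused, timeouts, and HTTP errors are retried
        if attempt < attempts:
            sleep(delay_seconds)
    return False


def http_probe(url: str = "http://localhost:4000/health",
               api_key: str = "<YOUR_KEY>") -> Callable[[], bool]:
    """Return a probe that reports True when the health endpoint answers 200."""
    def probe() -> bool:
        request = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {api_key}"})
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status == 200
    return probe
```

At startup, `wait_until_healthy(http_probe())` returning False is a reasonable signal to fail fast instead of serving requests that cannot reach a model.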
Shakudo SaaS-first quick start
This section is for customers using LiteLLM as a managed component inside Shakudo. Start from the Shakudo platform instead of installing or exposing LiteLLM manually.
1. Access the component in Shakudo
- Sign in to your Shakudo workspace with your organization-approved account.
- Open the workspace or environment where this component is enabled.
- Go to the Applications or component catalog area and select LiteLLM.
- If you cannot see the component, ask your workspace administrator to confirm that it is enabled for your role and environment.
2. Open the component UI
- Use the Shakudo-provided Open, Launch, or Access action for LiteLLM.
- Let Shakudo handle authentication, networking, and workspace routing. Avoid using internal service URLs unless your administrator explicitly provides them.
- Confirm that the component opens in the expected workspace before creating or changing resources.
3. Complete a first safe use case
Open the LiteLLM dashboard or API endpoint, select an approved model, and send a small test completion through the gateway to confirm routing and credentials are working.
- Use a small non-production example first, especially when testing credentials, scans, model calls, or data connections.
- Name the test clearly so other workspace users can recognize it as a first-run validation.
4. Monitor and validate the result
- Check the component UI for run status, logs, traces, scan results, job history, or project activity, depending on the component.
- Return to Shakudo if you need platform-level status, access control changes, or administrator support.
- Record any errors, missing permissions, or unexpected results before retrying with production workloads.
5. Next steps
- Review the use cases, administration, and troubleshooting pages in this knowledge base for deeper examples.
- For production usage, follow your team’s Shakudo workspace policies for credentials, data access, resource limits, and approvals.
Previous getting-started content snapshot
The page content below was present before this SaaS-first section was added. It is retained here as an inline snapshot so existing guidance is not lost.
Getting Started & Usage
Once LiteLLM is deployed and the health check passes, connecting your applications and issuing your first call takes a few minutes.
How to Call LiteLLM from Any Application
LiteLLM is OpenAI-compatible. Any application that can call the OpenAI API can call LiteLLM with no code changes: just update the base URL and API key.
# Python, using the openai SDK
from openai import OpenAI
client = OpenAI(
    api_key="<YOUR_VIRTUAL_KEY>",
    base_url="http://litellm.hyperplane-litellm.svc.cluster.local:4000/v1",
)
response = client.chat.completions.create(
    model="gpt-4o",  # use the alias defined in litellmConfig
    messages=[{"role": "user", "content": "Summarize this document."}],
)
print(response.choices[0].message.content)
The model name is the alias from litellmConfig.model_list (e.g. gpt-4o, claude-3-5-sonnet, gemini-flash). LiteLLM routes it to the correct provider.
Creating Virtual Keys
Virtual keys let you issue scoped API credentials to teams, services, or apps without exposing your provider keys.
# Create a virtual key via the LiteLLM API
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer <LITELLM_MASTER_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "models": ["gpt-4o", "gemini-flash"],
    "max_budget": 100,
    "budget_duration": "monthly",
    "metadata": {"team": "risk-team", "environment": "prod"}
  }'
The returned key (sk-...) is what the application uses. LiteLLM tracks spend per key.
Viewing Available Models
curl http://localhost:4000/v1/models \
  -H "Authorization: Bearer <YOUR_KEY>"