Xinference [Xorbits Inference]
https://inference.readthedocs.io/en/latest/index.html
Overviewโ
| Property | Details | 
|---|---|
| Description | Xinference is an open-source platform to run inference with any open-source LLMs, image generation models, and more. | 
| Provider Route on LiteLLM | xinference/ | 
| Link to Provider Doc | Xinference โ | 
| Supported Operations | /embeddings,/images/generations | 
LiteLLM supports Xinference Embedding + Image Generation calls.
API Base, Keyโ
# env variable
os.environ['XINFERENCE_API_BASE'] = "http://127.0.0.1:9997/v1"
os.environ['XINFERENCE_API_KEY'] = "anything" #[optional] no api key required
Sample Usage - Embeddingโ
from litellm import embedding
import os
os.environ['XINFERENCE_API_BASE'] = "http://127.0.0.1:9997/v1"
response = embedding(
    model="xinference/bge-base-en",
    input=["good morning from litellm"],
)
print(response)
Sample Usage api_base paramโ
from litellm import embedding
import os
response = embedding(
    model="xinference/bge-base-en",
    api_base="http://127.0.0.1:9997/v1",
    input=["good morning from litellm"],
)
print(response)
Image Generationโ
Usage - LiteLLM Python SDKโ
from litellm import image_generation
import os
# xinference image generation call
response = image_generation(
    model="xinference/stabilityai/stable-diffusion-3.5-large",
    prompt="A beautiful sunset over a calm ocean",
    api_base="http://127.0.0.1:9997/v1",
)
print(response)
Usage - LiteLLM Proxy Serverโ
1. Setup config.yamlโ
model_list:
  - model_name: xinference-sd
    litellm_params:
      model: xinference/stabilityai/stable-diffusion-3.5-large
      api_base: http://127.0.0.1:9997/v1
      api_key: anything
  model_info:
    mode: image_generation
general_settings:
  master_key: sk-1234
2. Start the proxyโ
litellm --config config.yaml
# RUNNING on http://0.0.0.0:4000
3. Test itโ
curl --location 'http://0.0.0.0:4000/v1/images/generations' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-1234' \
--data '{
    "model": "xinference-sd",
    "prompt": "A beautiful sunset over a calm ocean",
    "n": 1,
    "size": "1024x1024",
    "response_format": "url"
}'
Advanced Usage - With Additional Parametersโ
from litellm import image_generation
import os
os.environ['XINFERENCE_API_BASE'] = "http://127.0.0.1:9997/v1"
response = image_generation(
    model="xinference/stabilityai/stable-diffusion-3.5-large",
    prompt="A beautiful sunset over a calm ocean",
    n=1,                           # number of images
    size="1024x1024",             # image size
    response_format="b64_json",   # return format
)
print(response)
Supported Image Generation Modelsโ
Xinference supports various stable diffusion models. Here are some examples:
| Model Name | Function Call | 
|---|---|
| stabilityai/stable-diffusion-3.5-large | image_generation(model="xinference/stabilityai/stable-diffusion-3.5-large", prompt="...") | 
| stabilityai/stable-diffusion-xl-base-1.0 | image_generation(model="xinference/stabilityai/stable-diffusion-xl-base-1.0", prompt="...") | 
| runwayml/stable-diffusion-v1-5 | image_generation(model="xinference/runwayml/stable-diffusion-v1-5", prompt="...") | 
For a complete list of supported image generation models, see: https://inference.readthedocs.io/en/latest/models/builtin/image/index.html
Supported Modelsโ
All models listed here https://inference.readthedocs.io/en/latest/models/builtin/embedding/index.html are supported
| Model Name | Function Call | 
|---|---|
| bge-base-en | embedding(model="xinference/bge-base-en", input) | 
| bge-base-en-v1.5 | embedding(model="xinference/bge-base-en-v1.5", input) | 
| bge-base-zh | embedding(model="xinference/bge-base-zh", input) | 
| bge-base-zh-v1.5 | embedding(model="xinference/bge-base-zh-v1.5", input) | 
| bge-large-en | embedding(model="xinference/bge-large-en", input) | 
| bge-large-en-v1.5 | embedding(model="xinference/bge-large-en-v1.5", input) | 
| bge-large-zh | embedding(model="xinference/bge-large-zh", input) | 
| bge-large-zh-noinstruct | embedding(model="xinference/bge-large-zh-noinstruct", input) | 
| bge-large-zh-v1.5 | embedding(model="xinference/bge-large-zh-v1.5", input) | 
| bge-small-en-v1.5 | embedding(model="xinference/bge-small-en-v1.5", input) | 
| bge-small-zh | embedding(model="xinference/bge-small-zh", input) | 
| bge-small-zh-v1.5 | embedding(model="xinference/bge-small-zh-v1.5", input) | 
| e5-large-v2 | embedding(model="xinference/e5-large-v2", input) | 
| gte-base | embedding(model="xinference/gte-base", input) | 
| gte-large | embedding(model="xinference/gte-large", input) | 
| jina-embeddings-v2-base-en | embedding(model="xinference/jina-embeddings-v2-base-en", input) | 
| jina-embeddings-v2-small-en | embedding(model="xinference/jina-embeddings-v2-small-en", input) | 
| multilingual-e5-large | embedding(model="xinference/multilingual-e5-large", input) |