# OpenAI compatibility
> Translated from the official Ollama documentation; in case of any discrepancy, the official documentation (https://ollama.com) prevails.
> [!NOTE]
> OpenAI compatibility is experimental and may undergo major adjustments, including breaking changes. For fully featured access to the Ollama API, see the Ollama Python library, JavaScript library, and REST API.

Ollama provides experimental compatibility with parts of the OpenAI API to help connect existing applications to Ollama.
## Usage

### OpenAI Python library
```python
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',

    # required by the client library but ignored by Ollama
    api_key='ollama',
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='llama3.2',
)

response = client.chat.completions.create(
    model="llava",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": "data:image/png;base64,******",
                },
            ],
        }
    ],
    max_tokens=300,
)

completion = client.completions.create(
    model="llama3.2",
    prompt="Say this is a test",
)

list_completion = client.models.list()

model = client.models.retrieve("llama3.2")

embeddings = client.embeddings.create(
    model="all-minilm",
    input=["why is the sky blue?", "why is the grass green?"],
)
```
### Structured outputs
```python
from pydantic import BaseModel
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# Define the schema for the response
class FriendInfo(BaseModel):
    name: str
    age: int
    is_available: bool

class FriendList(BaseModel):
    friends: list[FriendInfo]

try:
    completion = client.beta.chat.completions.parse(
        temperature=0,
        model="llama3.1:8b",
        messages=[
            {"role": "user", "content": "I have two friends. The first is Ollama 22 years old busy saving the world, and the second is Alonso 23 years old and wants to hang out. Return a list of friends in JSON format"}
        ],
        response_format=FriendList,
    )

    friends_response = completion.choices[0].message
    if friends_response.parsed:
        print(friends_response.parsed)
    elif friends_response.refusal:
        print(friends_response.refusal)
except Exception as e:
    print(f"Error: {e}")
```
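Under the hood, the `parse` helper derives a JSON schema from the Pydantic model and sends it as the response format. A minimal sketch of what that derived schema contains, assuming Pydantic v2's `model_json_schema` (class names match the example above):

```python
from pydantic import BaseModel

class FriendInfo(BaseModel):
    name: str
    age: int
    is_available: bool

class FriendList(BaseModel):
    friends: list[FriendInfo]

# The schema the client derives from the model and sends with the request
schema = FriendList.model_json_schema()

# Every field without a default is listed as required
print(sorted(schema["$defs"]["FriendInfo"]["required"]))
```

Because all three fields lack defaults, the model is steered toward emitting complete `FriendInfo` objects rather than partial ones.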
### OpenAI JavaScript library
```javascript
import OpenAI from 'openai'

const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1/',

  // required by the client library but ignored by Ollama
  apiKey: 'ollama',
})

const chatCompletion = await openai.chat.completions.create({
  messages: [{ role: 'user', content: 'Say this is a test' }],
  model: 'llama3.2',
})

const response = await openai.chat.completions.create({
  model: "llava",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What's in this image?" },
        {
          type: "image_url",
          image_url: "data:image/png;base64,******",
        },
      ],
    },
  ],
})

const completion = await openai.completions.create({
  model: "llama3.2",
  prompt: "Say this is a test.",
})

const listCompletion = await openai.models.list()

const model = await openai.models.retrieve("llama3.2")

const embedding = await openai.embeddings.create({
  model: "all-minilm",
  input: ["why is the sky blue?", "why is the grass green?"],
})
```
### `curl`
```shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama3.2",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'

curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llava",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "What'\''s in this image?"
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "data:image/png;base64,******"
                        }
                    }
                ]
            }
        ],
        "max_tokens": 300
    }'

curl http://localhost:11434/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama3.2",
        "prompt": "Say this is a test"
    }'

curl http://localhost:11434/v1/models

curl http://localhost:11434/v1/models/llama3.2

curl http://localhost:11434/v1/embeddings \
    -H "Content-Type: application/json" \
    -d '{
        "model": "all-minilm",
        "input": ["why is the sky blue?", "why is the grass green?"]
    }'
```
## Endpoints

### `/v1/chat/completions`

#### Supported features

- [x] Chat completions
- [x] Streaming
- [x] JSON mode
- [x] Reproducible outputs
- [x] Vision
- [x] Tools
- [ ] Logprobs
#### Supported request fields

- [x] `model`
- [x] `messages`
  - [x] Text `content`
  - [x] Image `content`
    - [x] Base64 encoded image
    - [ ] Image URL
  - [x] Array of `content` parts
- [x] `frequency_penalty`
- [x] `presence_penalty`
- [x] `response_format`
- [x] `seed`
- [x] `stop`
- [x] `stream`
- [x] `stream_options`
- [x] `temperature`
- [x] `top_p`
- [x] `max_tokens`
- [x] `tools`
- [ ] `tool_choice`
- [ ] `logit_bias`
- [ ] `user`
- [ ] `n`
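Several of the checked fields above can be combined in a single request body. A minimal sketch that only builds the JSON payload (the field names come from the list; the values are arbitrary illustrations, and nothing is sent to a server):

```python
import json

# Illustrative request body combining several supported fields;
# the values here are examples, not recommendations
payload = {
    "model": "llama3.2",
    "messages": [{"role": "user", "content": "Say this is a test"}],
    "temperature": 0,
    "seed": 42,          # with temperature 0, helps make outputs reproducible
    "stop": ["\n\n"],
    "max_tokens": 50,
    "stream": False,
}

# Serialize exactly as a curl -d body would be
body = json.dumps(payload)
print(len(body) > 0)
```

The same dictionary can be passed as keyword arguments to `client.chat.completions.create(**payload)` with the OpenAI Python client.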
### `/v1/completions`

#### Supported features

- [x] Completions
- [x] Streaming
- [x] JSON mode
- [x] Reproducible outputs
- [ ] Logprobs

#### Supported request fields

- [x] `model`
- [x] `prompt`
- [x] `frequency_penalty`
- [x] `presence_penalty`
- [x] `seed`
- [x] `stop`
- [x] `stream`
- [x] `stream_options`
- [x] `temperature`
- [x] `top_p`
- [x] `max_tokens`
- [x] `suffix`
- [ ] `best_of`
- [ ] `echo`
- [ ] `logit_bias`
- [ ] `user`
- [ ] `n`
#### Notes

- `prompt` currently only accepts a string
### `/v1/models`

#### Notes

- `created` corresponds to when the model was last modified
- `owned_by` corresponds to the ollama username, defaulting to `"library"`

### `/v1/models/{model}`

#### Notes

- `created` corresponds to when the model was last modified
- `owned_by` corresponds to the ollama username, defaulting to `"library"`
### `/v1/embeddings`

#### Supported request fields

- [x] `model`
- [x] `input`
  - [x] string
  - [x] array of strings
  - [ ] array of tokens
  - [ ] array of arrays of tokens
- [ ] `encoding format`
- [ ] `dimensions`
- [ ] `user`
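The vectors returned by `/v1/embeddings` are typically compared with cosine similarity. A minimal sketch in plain Python; the two short vectors below are hypothetical stand-ins for real `response.data[i].embedding` values, which are much longer:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings standing in for API output
sky = [0.1, 0.2, 0.3]
grass = [0.1, 0.2, 0.25]

print(round(cosine_similarity(sky, grass), 4))
```

Values close to 1.0 indicate semantically similar inputs; note that `encoding_format` and `dimensions` are not supported, so the raw float vectors are always returned at the model's native dimension.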
## Models

Before using a model, pull it locally with `ollama pull`:

```shell
ollama pull llama3.2
```

### Default model names

For tooling that relies on default OpenAI model names such as `gpt-3.5-turbo`, use `ollama cp` to copy an existing model name to a temporary name:

```shell
ollama cp llama3.2 gpt-3.5-turbo
```

Afterwards, this new model name can be specified in the `model` field:
```shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "messages": [
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
```
## Setting the context size

The OpenAI API does not have a way of setting the context size for a model. If you need to change the context size, create a `Modelfile` which looks like:

```modelfile
FROM <some model>
PARAMETER num_ctx <context size>
```

Use the `ollama create mymodel` command to create a new model with the updated context size, then call the API with the updated model name:
```shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "mymodel",
        "messages": [
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
```