Authorizations
API key, passed as a Bearer token. To get an API key, visit the API Key management page.
Authorization: Bearer YOUR_API_KEY
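As a minimal sketch of the header format above, the following builds (but does not send) a request using only Python's standard library. The endpoint URL is taken from the curl example on this page; `build_chat_request` is a hypothetical helper name, not part of any SDK.

```python
import json
import urllib.request

# Endpoint as shown in the curl example on this page.
API_URL = "https://aireiter.com/api/v1/chat/completions"


def build_chat_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the Bearer token."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request(
    "YOUR_API_KEY",
    {
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": False,
    },
)
print(req.get_header("Authorization"))  # Bearer YOUR_API_KEY
```

Passing the request to `urllib.request.urlopen(req)` would send it; that step is left out so the sketch runs without network access.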
Body
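Model name.
OpenAI-compatible models (recommended for tool calling):
gpt-4o-mini: lightweight and fast, suited to high-frequency simple tasks
gpt-4o: the general recommendation, balancing performance and cost
gpt-5.2: high-performance version with full tool-calling support
- For more models, query GET /api/v1/models
Claude models (forwarded over the Anthropic protocol):
claude-haiku-4-5-20251001: lightweight and fast
claude-sonnet-4-5-20250929: the general recommendation
claude-sonnet-4-6: newer Sonnet release
claude-opus-4-5-20251101: flagship reasoning model
claude-opus-4-6: newer Opus release, most capable
For the full model list, query GET /api/v1/models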
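Message list. The model generates the next reply from the message history. Each message contains a role and content.
Role types
system: system prompt (extracted from the message list and handled separately)
user: user message
assistant: AI reply (used for multi-turn conversations)
tool: tool result, linked to an earlier tool call by tool_call_id (used in the multi-turn tool-calling example)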
Plain-text message:
[{"role": "user", "content": "Hello"}]
Multi-turn conversation:
[
  {"role": "system", "content": "You are a professional code reviewer."},
  {"role": "user", "content": "Please take a look at this code"},
  {"role": "assistant", "content": "Let me analyze it..."},
  {"role": "user", "content": "Are there any performance problems?"}
]
Multi-turn tool calling:
[
  {"role": "user", "content": "What is the weather in Beijing today?"},
  {
    "role": "assistant",
    "content": null,
    "tool_calls": [
      {
        "id": "call_001",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\": \"Beijing\"}"}
      }
    ]
  },
  {
    "role": "tool",
    "tool_call_id": "call_001",
    "content": "Beijing: 25°C, sunny"
  }
]
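The tool-calling history above pairs each tool message with an earlier assistant tool call through tool_call_id. The sketch below checks that pairing before a request is sent. `check_tool_messages` is a hypothetical helper, and the rule that every tool call must be answered is an assumption carried over from OpenAI-compatible APIs, not something this page states.

```python
def check_tool_messages(messages: list[dict]) -> list[str]:
    """Return a list of problems found; an empty list means the history looks consistent."""
    problems = []
    pending_ids = set()  # ids announced by assistant tool_calls and not yet answered
    for i, msg in enumerate(messages):
        role = msg.get("role")
        if role not in {"system", "user", "assistant", "tool"}:
            problems.append(f"message {i}: unknown role {role!r}")
        elif role == "assistant":
            for call in msg.get("tool_calls") or []:
                pending_ids.add(call["id"])
        elif role == "tool":
            call_id = msg.get("tool_call_id")
            if call_id in pending_ids:
                pending_ids.discard(call_id)
            else:
                problems.append(
                    f"message {i}: tool_call_id {call_id!r} matches no earlier tool call"
                )
    # Assumption: an unanswered tool call makes the history incomplete.
    for call_id in sorted(pending_ids):
        problems.append(f"tool call {call_id!r} has no tool result")
    return problems


history = [
    {"role": "user", "content": "What is the weather in Beijing today?"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_001",
                "type": "function",
                "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'},
            }
        ],
    },
    {"role": "tool", "tool_call_id": "call_001", "content": "Beijing: 25°C, sunny"},
]
print(check_tool_messages(history))  # []
```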
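Maximum output tokens. Caps the number of tokens the model may generate; the model may finish naturally before reaching the cap. Minimum: 1.
No limit by default (bounded by the model's context window).
Maximum output tokens (the newer name for max_tokens). Fully equivalent to max_tokens; provide either one. If both are provided, max_tokens takes precedence.
Whether to stream the output. When true, the response is streamed in real time over SSE (Server-Sent Events). Defaults to true. When false, the complete response is returned in one piece after generation finishes; for a non-streaming response, pass "stream": false explicitly.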
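Temperature, range 0 to 2.
- Low values (e.g. 0.2): more deterministic, conservative output
- High values (e.g. 0.8): more random, creative output
Defaults to 1.0. Using it together with top_p is not recommended.
Nucleus sampling parameter, range 0 to 1. Samples from the set of tokens whose cumulative probability reaches top_p. Defaults to 1.0.
Using it together with temperature is not recommended.
Frequency penalty, range -2.0 to 2.0. Positive values penalize tokens in proportion to how often they have already appeared in the generated text, lowering the chance of repetition. Defaults to 0.
Presence penalty, range -2.0 to 2.0. Positive values penalize tokens that have already appeared, encouraging the model to explore new topics. Defaults to 0.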
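The ranges above can be checked on the client before sending a request. The following is a minimal sketch under that assumption; `check_sampling_params` is a hypothetical helper, and the page does not say how the server itself reacts to out-of-range values.

```python
# Ranges as documented above; parameter names as used in the examples on this page.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}


def check_sampling_params(params: dict) -> list[str]:
    """Return human-readable issues; an empty list means nothing was flagged."""
    issues = []
    for name, (low, high) in RANGES.items():
        value = params.get(name)
        if value is not None and not (low <= value <= high):
            issues.append(f"{name}={value} is outside [{low}, {high}]")
    # The reference advises against setting temperature and top_p together.
    if params.get("temperature") is not None and params.get("top_p") is not None:
        issues.append("temperature and top_p are both set")
    return issues


print(check_sampling_params({"temperature": 0.9, "frequency_penalty": 0.5}))  # []
```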
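Random seed. When set, requests with the same seed and the same parameters aim to produce deterministic output, which helps reproduce results.
Stop sequences. Generation stops as soon as the model emits this string (or any string in the array). At most 4 sequences. Per the notes at the end of this page, stop currently has no effect.
{"stop": ["\n\nUser:", "###END###"]}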
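Because the notes on this page say the underlying provider does not currently honor stop, one possible workaround is to truncate the returned text on the client. This is a sketch of that idea, not documented behavior; `truncate_at_stop` is a hypothetical helper.

```python
def truncate_at_stop(text: str, stop: list[str]) -> str:
    """Cut text at the earliest occurrence of any stop sequence (the sequence is dropped)."""
    cut = len(text)
    for seq in stop:
        if not seq:
            continue  # an empty sequence would match at position 0; ignore it
        idx = text.find(seq)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


reply = "Answer: 42\n\nUser: next question"
print(truncate_at_stop(reply, ["\n\nUser:", "###END###"]))  # Answer: 42
```

Client-side truncation does not reduce token usage, since the model has already generated the extra text.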
Tool definition list. Defines the function tools the model may call. Each tool has a name, a description, and a JSON Schema for its parameters.
{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a given city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {
              "type": "string",
              "description": "City name, e.g. Beijing"
            },
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"]
            }
          },
          "required": ["city"]
        }
      }
    }
  ]
}
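Tool choice strategy
"auto": the model decides whether to call a tool (default)
"required": the model must call some tool
"none": no tool may be called
{"type": "function", "function": {"name": "get_weather"}}: forces a call to the named tool
{"tool_choice": "required"}
Whether to allow parallel tool calls. When false, only one tool is called at a time. Defaults to true (parallel calls allowed).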
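With parallel tool calls allowed, a single assistant message may carry several entries in tool_calls. A minimal sketch of answering all of them follows, producing one tool message per call. `run_tool_calls`, `TOOL_REGISTRY`, and the local `get_weather` implementation are hypothetical stand-ins for real business logic.

```python
import json


def get_weather(city: str, unit: str = "celsius") -> dict:
    # Hypothetical placeholder; a real implementation would query a weather service.
    return {"city": city, "temperature": 25, "unit": unit}


# Maps tool names (as declared in the tools list) to local callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Execute each tool call and return one `tool` message per call, in order."""
    results = []
    for call in tool_calls:
        name = call["function"]["name"]
        args = json.loads(call["function"]["arguments"] or "{}")
        func = TOOL_REGISTRY.get(name)
        if func is None:
            output = {"error": f"unknown tool: {name}"}
        else:
            output = func(**args)
        results.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(output, ensure_ascii=False),
            }
        )
    return results


calls = [
    {"id": "call_001", "type": "function",
     "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'}},
    {"id": "call_002", "type": "function",
     "function": {"name": "get_weather", "arguments": '{"city": "Shanghai", "unit": "fahrenheit"}'}},
]
for message in run_tool_calls(calls):
    print(message["tool_call_id"], message["content"])
```

The returned messages can be appended to the history after the assistant message that requested them, as in the multi-turn tool-calling example.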
Response format. Controls the format of the model output.
Format type
"text": plain text (default)
"json_object": JSON object; the model outputs valid JSON
"json_schema": output strictly follows the given JSON Schema
Required when type is "json_schema"
JSON Schema definition; the model output strictly follows this structure
JSON object mode:
{"response_format": {"type": "json_object"}}
JSON Schema mode:
{
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "analysis_result",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "summary": {"type": "string"},
          "score": {"type": "number"},
          "tags": {"type": "array", "items": {"type": "string"}}
        },
        "required": ["summary", "score", "tags"]
      }
    }
  }
}
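Reasoning effort. Applies to models that support reasoning (e.g. gpt-5.2 and later) and controls how deeply the model reasons.
"low": fast reasoning, saves tokens
"medium": balanced reasoning
"high": deep reasoning, more accurate but uses more tokens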
Response
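Unique identifier of the completion. Example: "chatcmpl-9vKqnMf3Ax8ZpRdTw2LsYe7b"
Object type, fixed to "chat.completion" (non-streaming) or "chat.completion.chunk" (streaming)
Billing task ID (project-specific extension field), used to trace the credit consumption record for this call
Array of generated results (always exactly 1 entry)
The complete message object in a non-streaming response
role: always "assistant"
content: text content (null for tool calls)
tool_calls: list of tool calls (present when the model calls tools)
Incremental content in a streaming response
- First frame: {"role": "assistant"}
- Text frame: {"content": "..."}
- Tool-call frame: {"tool_calls": [{"index": 0, "id": "...", "type": "function", "function": {"name": "...", "arguments": ""}}]}
- Final frame: {}
Stop reason
"stop": ended naturally
"length": reached the max_tokens limit
"tool_calls": the model requested a tool call
"content_filter": content was filtered
Token usage statistics (at the top level of the response body when non-streaming; in the last frame when streaming)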
curl https://aireiter.com/api/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "stream": false,
    "messages": [
      {"role": "user", "content": "Hello, world"}
    ]
  }'
{
  "id": "chatcmpl-9vKqnMf3Ax8ZpRdTw2LsYe7b",
  "object": "chat.completion",
  "created": 1741680000,
  "model": "claude-sonnet-4-5-20250929",
  "task_id": "task_abc123",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 15,
    "total_tokens": 27
  }
}
Usage examples
Basic conversation
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://aireiter.com/api/v1"
)

response = client.chat.completions.create(
    model="claude-sonnet-4-5-20250929",
    messages=[
        {"role": "user", "content": "Implement quicksort in Python"}
    ],
    stream=False  # this API streams by default, so request a single complete response
)
print(response.choices[0].message.content)
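System prompt + multi-turn conversation
response = client.chat.completions.create(
    model="claude-sonnet-4-5-20250929",
    messages=[
        {"role": "system", "content": "You are a senior Python developer who specializes in code review and performance optimization."},
        {"role": "user", "content": "What is the GIL?"},
        {"role": "assistant", "content": "The GIL (Global Interpreter Lock) is..."},
        {"role": "user", "content": "How can I work around the GIL to get true parallelism?"}
    ],
    temperature=0.3,
    stream=False  # this API streams by default
)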
Streaming response
stream = client.chat.completions.create(
    model="claude-sonnet-4-5-20250929",
    messages=[{"role": "user", "content": "Write a technical blog post about quantum computing"}],
    stream=True
)
# Print each text fragment as it arrives
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
Tool calling (complete multi-turn flow)
import json

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",
            "description": "Get the real-time price of a stock",
            "parameters": {
                "type": "object",
                "properties": {
                    "ticker": {"type": "string", "description": "Stock ticker, e.g. AAPL"}
                },
                "required": ["ticker"]
            }
        }
    }
]

messages = [{"role": "user", "content": "What is Tesla's stock price?"}]

# Round 1: the model decides whether to call a tool
response = client.chat.completions.create(
    model="claude-sonnet-4-5-20250929",
    messages=messages,
    tools=tools,
    stream=False  # this API streams by default
)

# Handle the tool call
if response.choices[0].finish_reason == "tool_calls":
    tool_call = response.choices[0].message.tool_calls[0]

    # Run the tool (business logic); a fixed placeholder result is used here
    tool_result = {"price": 245.80, "currency": "USD"}

    # Round 2: return the result to the model
    messages += [
        response.choices[0].message,
        {
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(tool_result)
        }
    ]
    final = client.chat.completions.create(
        model="claude-sonnet-4-5-20250929",
        messages=messages,
        tools=tools,
        stream=False
    )
    print(final.choices[0].message.content)
Structured output (JSON Schema)
import json

response = client.chat.completions.create(
    model="claude-sonnet-4-5-20250929",
    messages=[
        {"role": "user", "content": "Analyze the sentiment of this product review: 'These headphones sound great, but the battery life is mediocre'"}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "sentiment_analysis",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "sentiment": {"type": "string", "enum": ["positive", "negative", "mixed"]},
                    "score": {"type": "number", "description": "Sentiment score from -1.0 to 1.0"},
                    "highlights": {"type": "array", "items": {"type": "string"}}
                },
                "required": ["sentiment", "score", "highlights"]
            }
        }
    },
    stream=False  # this API streams by default
)
result = json.loads(response.choices[0].message.content)
print(result)
# Example output: {"sentiment": "mixed", "score": 0.2, "highlights": ["sounds great", "mediocre battery life"]}
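The notes on this page warn that the model may wrap its JSON in a Markdown code block, in which case a bare json.loads call fails. The sketch below strips one surrounding fence before parsing. `parse_json_reply` is a hypothetical helper, and it assumes the reply contains at most a single fenced block and nothing else.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence marker
# Matches a reply that consists of exactly one fenced block, with an optional language tag.
FENCED_BLOCK = re.compile(
    r"^" + FENCE + r"[\w-]*[ \t]*\n(.*?)\n?" + FENCE + r"$",
    re.DOTALL,
)


def parse_json_reply(content: str):
    """Parse JSON from a model reply, tolerating one surrounding Markdown fence."""
    text = content.strip()
    match = FENCED_BLOCK.match(text)
    if match:
        text = match.group(1)
    return json.loads(text)  # raises json.JSONDecodeError if the reply is still not JSON


fenced = FENCE + 'json\n{"sentiment": "mixed", "score": 0.2}\n' + FENCE
print(parse_json_reply(fenced))
print(parse_json_reply('{"sentiment": "positive", "score": 0.8}'))
```

Schema conformance is a separate concern: since the notes say json_schema constraints may not be honored, checking required keys after parsing is still advisable.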
Sampling parameter control
# Creative writing: high temperature + frequency penalty
response = client.chat.completions.create(
    model="claude-sonnet-4-5-20250929",
    messages=[{"role": "user", "content": "Write a modern poem about autumn"}],
    temperature=0.9,
    frequency_penalty=0.5,
    presence_penalty=0.3,
    stream=False  # this API streams by default
)

# Precise output: low temperature + fixed seed
response = client.chat.completions.create(
    model="claude-sonnet-4-5-20250929",
    messages=[{"role": "user", "content": "Compute 1234 × 5678"}],
    temperature=0.0,
    seed=42,
    stream=False
)
Streaming response event format
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1741680000,"model":"claude-sonnet-4-5-20250929","task_id":"task_abc123","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1741680000,"model":"claude-sonnet-4-5-20250929","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1741680000,"model":"claude-sonnet-4-5-20250929","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1741680000,"model":"claude-sonnet-4-5-20250929","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":12,"completion_tokens":5,"total_tokens":17}}
data: [DONE]
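For clients that read the SSE stream without the OpenAI SDK, the events above can be assembled as sketched below. `collect_stream_text` is a hypothetical helper; it assumes each event arrives as a single data: line, as in the listing, and the sample lines are shortened copies of those events.

```python
import json


def collect_stream_text(lines):
    """Return (text, finish_reason, usage) assembled from an iterable of SSE lines."""
    parts = []
    finish_reason = None
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        usage = chunk.get("usage") or usage  # usage arrives in the last frame
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason, usage


sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}],'
    '"usage":{"prompt_tokens":12,"completion_tokens":5,"total_tokens":17}}',
    "data: [DONE]",
]
text, reason, usage = collect_stream_text(sample)
print(text, reason, usage["total_tokens"])  # Hello! stop 17
```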
Streaming events for tool calls:
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1741680000,"model":"claude-sonnet-4-5-20250929","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"id":"call_abc","type":"function","function":{"name":"get_weather","arguments":""}}]},"finish_reason":null}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1741680000,"model":"claude-sonnet-4-5-20250929","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\"city\""}}]},"finish_reason":null}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1741680000,"model":"claude-sonnet-4-5-20250929","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":": \"Beijing\"}"}}]},"finish_reason":null}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1741680000,"model":"claude-sonnet-4-5-20250929","choices":[{"index":0,"delta":{},"finish_reason":"tool_calls"}],"usage":{"prompt_tokens":85,"completion_tokens":22,"total_tokens":107}}
data: [DONE]
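In the tool-call stream above, the id and function name arrive in the first fragment and the arguments string is split across later fragments. A minimal sketch of merging the delta objects by index follows; `merge_tool_call_deltas` is a hypothetical helper operating on already-parsed delta dictionaries.

```python
import json


def merge_tool_call_deltas(deltas):
    """Merge streamed tool_calls fragments into complete calls, ordered by index."""
    calls = {}
    for delta in deltas:
        for frag in delta.get("tool_calls", []):
            call = calls.setdefault(
                frag["index"], {"id": None, "name": None, "arguments": ""}
            )
            if frag.get("id"):
                call["id"] = frag["id"]
            func = frag.get("function", {})
            if func.get("name"):
                call["name"] = func["name"]
            # Argument text is concatenated in arrival order.
            call["arguments"] += func.get("arguments") or ""
    return [calls[i] for i in sorted(calls)]


deltas = [
    {"tool_calls": [{"index": 0, "id": "call_abc", "type": "function",
                     "function": {"name": "get_weather", "arguments": ""}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": "{\"city\""}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": ": \"Beijing\"}"}}]},
    {},  # final frame carries an empty delta
]
merged = merge_tool_call_deltas(deltas)
print(merged[0]["name"], json.loads(merged[0]["arguments"]))  # get_weather {'city': 'Beijing'}
```

Parsing the arguments only after the frame with finish_reason "tool_calls" avoids decoding a partial JSON string.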
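Notes
- Authentication: only the Authorization: Bearer <api_key> format is supported. With the OpenAI SDK, just set api_key.
- Streaming by default: the stream parameter defaults to true. For a non-streaming response, pass "stream": false explicitly.
- Insufficient credits: when the balance is too low, the API returns HTTP 402. Top up and retry.
- stop and stop sequences: the stop parameter currently has no effect because the underlying provider does not yet support it. Passing it raises no error, but generation does not stop at the specified sequences.
- response_format: the underlying provider currently has limited support for response_format. In json_object mode the model may still output a Markdown code block instead of bare JSON; in json_schema mode the schema constraints may not be honored. For structured output, describe the required format explicitly in the prompt.
- Tool parameters: the parameters field must be a valid JSON Schema; the required array determines which parameters are mandatory.
- Model selection guidance:
  - Haiku: high-frequency simple Q&A, lowest cost
  - Sonnet: code generation and document processing, the general recommendation
  - Opus: complex reasoning and long-document analysis, most capable
  - -thinking series: math proofs, logical derivation, and other tasks that need deep thinking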
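The HTTP 402 case from the notes can be separated from other failures so that callers do not retry blindly. This is a sketch using the standard library's urllib.error.HTTPError; `explain_http_error` is a hypothetical helper, and the wording of the messages is illustrative only.

```python
import urllib.error


def explain_http_error(err: urllib.error.HTTPError) -> str:
    """Turn an HTTP error from the API into a short, actionable message."""
    if err.code == 402:
        # Documented on this page: returned when the credit balance is insufficient.
        return "Insufficient credits: top up the account, then retry."
    return f"Request failed with HTTP {err.code}: {err.reason}"


# Constructed locally for illustration; a real error would come from urllib.request.urlopen.
err = urllib.error.HTTPError(
    "https://aireiter.com/api/v1/chat/completions", 402, "Payment Required", None, None
)
print(explain_http_error(err))
```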