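Zero Errors, Input Prices Halved: OpenAI Announces Structured Outputs in the API, with 100% JSON Schema Conformance
Still racking your brain over prompts, only to get a headache from wildly inconsistent outputs?
OpenAI has finally listened to the community and shipped the feature developers have been asking for most.
OpenAI announced today that the OpenAI API now supports Structured Outputs, which constrain model responses to a developer-supplied JSON Schema.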
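JSON (JavaScript Object Notation) is an industry-standard format for files and data interchange because it is easy for humans to read and for machines to parse.
LLMs, however, often struggle with JSON: they hallucinate, produce responses that only partly follow instructions, or emit output that cannot be parsed at all.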
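Developers have had to work around this with open-source tooling, prompt variations, and repeated requests to get the output they want, which costs time and effort.
Structured Outputs, released today, addresses this by ensuring that model-generated output matches the JSON Schema the developer supplies.
Structured Outputs has long been the feature developers request most, and Sam Altman said in a post that this release came by popular demand.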
The new feature clearly struck a chord with developers, many of whom called it "a big deal".
Others left admiring comments such as "Excellent!"
Not everyone is pleased, though: the update has once again raised fears that OpenAI will swallow up startups.
Many ordinary users, however, care more about when GPT-5 will be released. As for JSON Schema: "What's that?"
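After all, without any GPT-5 news, OpenAI's DevDay this fall may be much quieter than last year's.
Ensuring schema conformance with ease
With Structured Outputs, you only need to define a JSON Schema, and the model stops improvising and returns data in the required shape.
The feature does more than make the model obedient; it also makes the output considerably more reliable.
In OpenAI's evals of complex JSON schema following, the new model gpt-4o-2024-08-06 with Structured Outputs scored a perfect 100%. By comparison, gpt-4-0613 scored less than 40%.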
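In fact, OpenAI introduced JSON mode at last year's DevDay. JSON mode makes valid JSON more likely, but it does not guarantee that the output matches a particular schema.
Now OpenAI is extending that capability in the API so that model-generated output exactly matches the JSON Schema supplied by the developer.
Generating structured data from unstructured inputs is one of the core use cases for AI in today's applications.
Developers use the OpenAI API to build powerful assistants that fetch data and answer questions via function calling, extract structured data for data entry, and build multi-step agentic workflows that allow LLMs to take actions.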
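How it works
OpenAI took a two-part approach to improving how reliably model output matches a JSON Schema.
First, the latest model, gpt-4o-2024-08-06, was trained to better understand complex schemas and to produce output that matches them.
Although model performance improved significantly, reaching 93% on OpenAI's benchmark, model behavior remains inherently non-deterministic.
Second, to give developers a stable foundation to build on, OpenAI added a stricter method that constrains the model's output and brings reliability to 100%.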
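Constrained decoding
OpenAI uses a technique known as constrained sampling or constrained decoding. By default, a model sampling its output is entirely unconstrained and may pick any token from the vocabulary as the next token.
That flexibility is what allows mistakes: for example, the model is free to sample an invalid character partway through what should be valid JSON.
To prevent such errors, OpenAI uses dynamic constrained decoding so that every generated token is valid under the supplied schema.
To do this, OpenAI converts the supplied JSON Schema into a context-free grammar (CFG).
For each JSON Schema, OpenAI computes a grammar that represents the schema and preprocesses its components so they can be accessed efficiently during sampling.
This makes the output more accurate while keeping the added latency low. The first request with a new schema may incur extra processing time, but subsequent requests are fast because the preprocessed result is cached.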
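To make the masking idea concrete, here is a toy sketch of constrained decoding. It is not OpenAI's implementation: the eight-token vocabulary, the random scorer `fake_logits` standing in for a model, and the hand-written grammar (which only accepts `{"ok":true}` or `{"ok":false}`) are all assumptions made for illustration. This toy grammar is flat, so a simple state table is enough; a real schema with nesting needs a full grammar.

```python
import json
import random

# Hypothetical toy vocabulary; real tokenizers have on the order of 100k tokens.
VOCAB = ["{", "}", '"ok"', ":", "true", "false", "hello", ","]

# Hand-written grammar for  {"ok":true} | {"ok":false}
# written as: state -> {allowed token: next state}.
GRAMMAR = {
    "start": {"{": "key"},
    "key": {'"ok"': "colon"},
    "colon": {":": "value"},
    "value": {"true": "close", "false": "close"},
    "close": {"}": "done"},
}


def fake_logits(rng):
    """Stand-in for a model: a random score for every vocabulary token."""
    return {tok: rng.random() for tok in VOCAB}


def constrained_decode(seed=0):
    """Greedy decoding where grammar-invalid tokens are masked out."""
    rng = random.Random(seed)
    state, out = "start", []
    while state != "done":
        allowed = GRAMMAR[state]
        scores = fake_logits(rng)
        # Mask: only tokens the grammar allows in this state are candidates,
        # even if the "model" scored "hello" or "," higher.
        best = max(allowed, key=lambda tok: scores[tok])
        out.append(best)
        state = allowed[best]
    return "".join(out)


def decode_and_parse(seed=0):
    """Decode, then parse; parsing cannot fail because of the mask."""
    return json.loads(constrained_decode(seed))
```

Whatever scores the stand-in model produces, the decoded string always parses, because invalid tokens are never candidates in the first place.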
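Alternative approaches
Other approaches to constrained decoding typically use finite state machines (FSMs) or regular expressions rather than a CFG.
However, these are limited in how they can dynamically update the set of valid tokens. FSMs in particular generally struggle with deeply nested or recursive data structures.
OpenAI's CFG approach can express a broader class of schemas. For example, JSON schemas with recursive definitions are supported in the OpenAI API but cannot be expressed with an FSM.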
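A small sketch of what "recursive" means here. The schema below is a hypothetical, trimmed-down UI node whose `children` refer back to the schema root through `"$ref": "#"`; `conforms` is a hand-rolled check for this one schema only, not a general JSON Schema validator, and `nested` is a helper invented for the example.

```python
# A recursive schema: each node's "children" are nodes of the same shape.
# "$ref": "#" points back at the schema root.
UI_NODE_SCHEMA = {
    "type": "object",
    "properties": {
        "type": {"type": "string", "enum": ["div", "header", "section"]},
        "label": {"type": "string"},
        "children": {"type": "array", "items": {"$ref": "#"}},
    },
    "required": ["type", "label", "children"],
    "additionalProperties": False,
}


def conforms(node, schema=UI_NODE_SCHEMA):
    """Check one node against UI_NODE_SCHEMA (this schema only)."""
    props = schema["properties"]
    if not isinstance(node, dict) or set(node) != set(schema["required"]):
        return False
    if node["type"] not in props["type"]["enum"]:
        return False
    if not isinstance(node["label"], str):
        return False
    if not isinstance(node["children"], list):
        return False
    # The recursion mirrors the "$ref": "#" in the schema; depth is unbounded.
    return all(conforms(child, schema) for child in node["children"])


def nested(depth):
    """Build a chain of divs nested `depth` levels deep."""
    node = {"type": "div", "label": "leaf", "children": []}
    for _ in range(depth):
        node = {"type": "div", "label": "", "children": [node]}
    return node
```

The checker has to recurse (in effect, keep a stack) because nesting depth is not known in advance. A machine with a fixed number of states cannot track arbitrary depth, which is the limitation the text attributes to FSM-based approaches.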
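Input costs cut in half
Structured Outputs with function calling is available on all models that support function calling, including the latest GPT-4o and GPT-4o mini models and fine-tuned models.
The feature is available in the Chat Completions API, Assistants API, and Batch API, and it is compatible with vision inputs.
Compared with gpt-4o-2024-05-13, gpt-4o-2024-08-06 is also cheaper: developers save 50% on input tokens ($2.50 per 1M tokens) and 33% on output tokens ($10.00 per 1M tokens).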
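A quick worked example of what those percentages mean for a bill. The new prices come from the text; the old prices ($5.00 and $15.00 per 1M tokens) are inferred from the stated 50% and 33% savings, and the monthly workload is invented, so treat both as assumptions.

```python
# USD per 1M tokens. New prices are from the text; old prices are inferred
# from the stated 50% / 33% savings and should be treated as assumptions.
PRICES = {
    "gpt-4o-2024-05-13": {"input": 5.00, "output": 15.00},
    "gpt-4o-2024-08-06": {"input": 2.50, "output": 10.00},
}


def cost_usd(model, input_tokens, output_tokens):
    """Total cost in USD for the given token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


def savings_pct(kind):
    """Percentage saved on 'input' or 'output' tokens, rounded."""
    old = PRICES["gpt-4o-2024-05-13"][kind]
    new = PRICES["gpt-4o-2024-08-06"][kind]
    return round(100 * (old - new) / old)


# Hypothetical monthly workload: 200M input tokens, 40M output tokens.
old_bill = cost_usd("gpt-4o-2024-05-13", 200_000_000, 40_000_000)
new_bill = cost_usd("gpt-4o-2024-08-06", 200_000_000, 40_000_000)
print(f"old: ${old_bill:,.2f}  new: ${new_bill:,.2f}")
```

Because input tokens usually dominate in extraction and retrieval workloads, the overall saving tends to sit closer to the 50% input figure than to the 33% output figure.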
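How to use Structured Outputs
Structured Outputs is exposed in the API in two forms:
Function calling
Setting strict: true in a function definition enables Structured Outputs for tool calls.
This works with all models that support tools, including gpt-4-0613, gpt-3.5-turbo-0613, and all later models.
When Structured Outputs is enabled, model output will match the supplied tool definition.
Example request: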
POST /v1/chat/completions
{
"model": "gpt-4o-2024-08-06",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant. The current date is August 6, 2024. You help users query for the data they are looking for by calling the query function."
},
{
"role": "user",
"content": "look up all my orders in may of last year that were fulfilled but not delivered on time"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "query",
"description": "Execute a query.",
"strict": true,
"parameters": {
"type": "object",
"properties": {
"table_name": {
"type": "string",
"enum": ["orders"]
},
"columns": {
"type": "array",
"items": {
"type": "string",
"enum": [
"id",
"status",
"expected_delivery_date",
"delivered_at",
"shipped_at",
"ordered_at",
"canceled_at"
]
}
},
"conditions": {
"type": "array",
"items": {
"type": "object",
"properties": {
"column": {
"type": "string"
},
"operator": {
"type": "string",
"enum": ["=", ">", "<", ">=", "<=", "!="]
},
"value": {
"anyOf": [
{
"type": "string"
},
{
"type": "number"
},
{
"type": "object",
"properties": {
"column_name": {
"type": "string"
}
},
"required": ["column_name"],
"additionalProperties": false
}
]
}
},
"required": ["column", "operator", "value"],
"additionalProperties": false
}
},
"order_by": {
"type": "string",
"enum": ["asc", "desc"]
}
},
"required": ["table_name", "columns", "conditions", "order_by"],
"additionalProperties": false
}
}
}
]
}
Example output:
{
"table_name": "orders",
"columns": ["id", "status", "expected_delivery_date", "delivered_at"],
"conditions": [
{
"column": "status",
"operator": "=",
"value": "fulfilled"
},
{
"column": "ordered_at",
"operator": ">=",
"value": "2023-05-01"
},
{
"column": "ordered_at",
"operator": "<",
"value": "2023-06-01"
},
{
"column": "delivered_at",
"operator": ">",
"value": {
"column_name": "expected_delivery_date"
}
}
],
"order_by": "asc"
}
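Because the arguments are guaranteed to have this shape, application code can consume them without defensive parsing. The sketch below turns the arguments into a parameterized SQL query. The function `to_sql`, the `?` placeholder style, and ordering by `id` are assumptions made for illustration (the schema's `order_by` only carries a direction). Note that `column` inside a condition is a free-form string in the schema, so the sketch still checks it against a whitelist: the schema guarantees structure, not that every value is safe to use.

```python
import json

ALLOWED_COLUMNS = {
    "id", "status", "expected_delivery_date", "delivered_at",
    "shipped_at", "ordered_at", "canceled_at",
}


def check_column(name):
    """Reject column names that are not part of the known table."""
    if name not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {name}")
    return name


def to_sql(arguments_json):
    """Turn tool-call arguments into (sql, params) with ? placeholders."""
    args = json.loads(arguments_json)
    clauses, params = [], []
    for cond in args["conditions"]:
        column = check_column(cond["column"])
        value = cond["value"]
        if isinstance(value, dict):
            # Comparison against another column, e.g. delivered_at > expected_...
            other = check_column(value["column_name"])
            clauses.append(f"{column} {cond['operator']} {other}")
        else:
            # Literal string or number: pass it as a bound parameter.
            clauses.append(f"{column} {cond['operator']} ?")
            params.append(value)
    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    sql = (
        f"SELECT {', '.join(args['columns'])} FROM {args['table_name']}"
        f"{where} ORDER BY id {args['order_by'].upper()}"
    )
    return sql, params
```

Passing the example output above through `to_sql` yields a query with three bound parameters and one column-to-column comparison.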
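A new option for the response_format parameter
Developers can now supply a JSON Schema through json_schema, a new option for the response_format parameter.
This is useful when the model is not calling a tool but is responding to the user in a structured way.
This feature works with the newest GPT-4o models: gpt-4o-2024-08-06, released today, and gpt-4o-mini-2024-07-18.
When a response_format is supplied with strict: true, model output will match the supplied schema.
Example request: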
POST /v1/chat/completions
{
"model": "gpt-4o-2024-08-06",
"messages": [
{
"role": "system",
"content": "You are a helpful math tutor."
},
{
"role": "user",
"content": "solve 8x + 31 = 2"
}
],
"response_format": {
"type": "json_schema",
"json_schema": {
"name": "math_response",
"strict": true,
"schema": {
"type": "object",
"properties": {
"steps": {
"type": "array",
"items": {
"type": "object",
"properties": {
"explanation": {
"type": "string"
},
"output": {
"type": "string"
}
},
"required": ["explanation", "output"],
"additionalProperties": false
}
},
"final_answer": {
"type": "string"
}
},
"required": ["steps", "final_answer"],
"additionalProperties": false
}
}
}
}
Example output:
{
"steps": [
{
"explanation": "Subtract 31 from both sides to isolate the term with x.",
"output": "8x + 31 - 31 = 2 - 31"
},
{
"explanation": "This simplifies to 8x = -29.",
"output": "8x = -29"
},
{
"explanation": "Divide both sides by 8 to solve for x.",
"output": "x = -29 / 8"
}
],
"final_answer": "x = -29 / 8"
}
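Developers can use Structured Outputs to have the model produce an answer step by step, guiding it toward the intended output.
According to OpenAI, developers no longer need to validate or retry malformed responses, and the feature allows for simpler prompts.
Native SDK support
OpenAI says its Python and Node SDKs have been updated with native support for Structured Outputs.
Supplying a schema for tools or as a response format is as easy as supplying a Pydantic or Zod object. The SDKs convert the data type to a supported JSON schema, automatically deserialize the JSON response into the typed data structure, and parse refusals.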
from enum import Enum
from typing import Union

from pydantic import BaseModel

import openai
from openai import OpenAI


class Table(str, Enum):
    orders = "orders"
    customers = "customers"
    products = "products"


class Column(str, Enum):
    id = "id"
    status = "status"
    expected_delivery_date = "expected_delivery_date"
    delivered_at = "delivered_at"
    shipped_at = "shipped_at"
    ordered_at = "ordered_at"
    canceled_at = "canceled_at"


class Operator(str, Enum):
    eq = "="
    gt = ">"
    lt = "<"
    le = "<="
    ge = ">="
    ne = "!="


class OrderBy(str, Enum):
    asc = "asc"
    desc = "desc"


class DynamicValue(BaseModel):
    column_name: str


class Condition(BaseModel):
    column: str
    operator: Operator
    value: Union[str, int, DynamicValue]


class Query(BaseModel):
    table_name: Table
    columns: list[Column]
    conditions: list[Condition]
    order_by: OrderBy


client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant. The current date is August 6, 2024. You help users query for the data they are looking for by calling the query function.",
        },
        {
            "role": "user",
            "content": "look up all my orders in may of last year that were fulfilled but not delivered on time",
        },
    ],
    tools=[
        openai.pydantic_function_tool(Query),
    ],
)

print(completion.choices[0].message.tool_calls[0].function.parsed_arguments)
Native Structured Outputs support is also available for response_format.
from pydantic import BaseModel

from openai import OpenAI


class Step(BaseModel):
    explanation: str
    output: str


class MathResponse(BaseModel):
    steps: list[Step]
    final_answer: str


client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor."},
        {"role": "user", "content": "solve 8x + 31 = 2"},
    ],
    response_format=MathResponse,
)

message = completion.choices[0].message
if message.parsed:
    print(message.parsed.steps)
    print(message.parsed.final_answer)
else:
    print(message.refusal)
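Additional use cases
Developers frequently use OpenAI's models to generate structured data for a variety of use cases.
Some other examples include:
- Dynamically generating user interfaces based on the user's intent
Developers can use Structured Outputs to build code-generating or UI-generating applications.
With the same response_format, different UIs can be generated from different user input.
For example, a landing page for a gardener can be built from the following structured output: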
{
"type": "div",
"label": "",
"children": [
{
"type": "header",
"label": "",
"children": [
{
"type": "div",
"label": "Green Thumb Gardening",
"children": [],
"attributes": [{ "name": "className", "value": "site-title" }]
},
{
"type": "div",
"label": "Bringing Life to Your Garden",
"children": [],
"attributes": [{ "name": "className", "value": "site-tagline" }]
}
],
"attributes": [{ "name": "className", "value": "header" }]
},
{
"type": "section",
"label": "",
"children": [
{
"type": "div",
"label": "",
"children": [
{
"type": "div",
"label": "About Us",
"children": [
{
"type": "div",
"label": "At Green Thumb Gardening, we specialize in transforming your outdoor spaces into beautiful, thriving gardens. Our team has decades of experience in horticulture and landscape design.",
"children": [],
"attributes": [
{ "name": "className", "value": "about-description" }
]
}
],
"attributes": [{ "name": "className", "value": "about-section" }]
}
],
"attributes": [{ "name": "className", "value": "content" }]
}
],
"attributes": [{ "name": "className", "value": "about-container" }]
},
{
"type": "section",
"label": "",
"children": [
{
"type": "div",
"label": "",
"children": [
{
"type": "div",
"label": "Our Services",
"children": [
{
"type": "div",
"label": "Garden Design",
"children": [],
"attributes": [
{ "name": "className", "value": "service-item" }
]
},
{
"type": "div",
"label": "Plant Care & Maintenance",
"children": [],
"attributes": [
{ "name": "className", "value": "service-item" }
]
},
{
"type": "div",
"label": "Seasonal Cleanup",
"children": [],
"attributes": [
{ "name": "className", "value": "service-item" }
]
},
{
"type": "div",
"label": "Custom Landscaping",
"children": [],
"attributes": [
{ "name": "className", "value": "service-item" }
]
}
],
"attributes": [{ "name": "className", "value": "services-list" }]
}
],
"attributes": [{ "name": "className", "value": "content" }]
}
],
"attributes": [{ "name": "className", "value": "services-container" }]
}
],
"attributes": [{ "name": "className", "value": "landing-page" }]
}
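One way an application might turn a tree like this into markup is a short recursive renderer. This is a sketch under assumptions: the function name `render` is invented, the React-style `className` attribute is mapped to HTML `class`, and `type` is trusted as a tag name on the assumption that the schema restricts it to an enum of allowed tags.

```python
from html import escape


def render(node):
    """Recursively render a {type, label, children, attributes} node as HTML."""
    attrs = ""
    for attr in node["attributes"]:
        # The sample uses React-style "className"; plain HTML calls it "class".
        name = "class" if attr["name"] == "className" else attr["name"]
        attrs += f' {name}="{escape(attr["value"])}"'
    # Label text comes first, followed by the rendered children.
    inner = escape(node["label"]) + "".join(render(c) for c in node["children"])
    return f"<{node['type']}{attrs}>{inner}</{node['type']}>"


page = {
    "type": "div",
    "label": "",
    "children": [
        {
            "type": "header",
            "label": "Plant Care & Maintenance",
            "children": [],
            "attributes": [{"name": "className", "value": "header"}],
        }
    ],
    "attributes": [{"name": "className", "value": "landing-page"}],
}
print(render(page))
```

Since every node is guaranteed to carry all four keys, the renderer needs no checks for missing fields; escaping the label and attribute values is still necessary because their contents are free-form strings.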
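- Separating a final answer from supporting reasoning or additional commentary
Giving the model a separate field for its chain of thought can improve the final quality of the response.
Request: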
{
"model": "gpt-4o-2024-08-06",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant"
},
{
"role": "user",
"content": "9.11 and 9.9 -- which is bigger?"
}
],
"response_format": {
"type": "json_schema",
"json_schema": {
"name": "reasoning_schema",
"strict": true,
"schema": {
"type": "object",
"properties": {
"reasoning_steps": {
"type": "array",
"items": {
"type": "string"
},
"description": "The reasoning steps leading to the final conclusion."
},
"answer": {
"type": "string",
"description": "The final answer, taking into account the reasoning steps."
}
},
"required": ["reasoning_steps", "answer"],
"additionalProperties": false
}
}
}
}
Structured output:
{
"reasoning_steps": [
"First step is to compare the numbers 9.11 and 9.9.",
"Both numbers have the same whole number part, which is 9.",
"To compare the decimal parts, convert them to the same number of decimal places.",
"9.11 has two decimal places: it is 9.11.",
"9.9 has one decimal place: it can be rewritten as 9.90.",
"Now, compare 9.11 and 9.90 by looking at the decimal parts.",
"Compare 11 with 90.",
"90 is greater than 11, so 9.90 is greater than 9.11."
],
"answer": "9.9 is bigger than 9.11."
}
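- Extracting structured data from unstructured data
For example, instructing the model to extract things like to-dos, due dates, and owners from meeting notes.
Request: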
POST /v1/chat/completions
{
"model": "gpt-4o-2024-08-06",
"messages": [
{
"role": "system",
"content": "Extract action items, due dates, and owners from meeting notes."
},
{
"role": "user",
"content": "...meeting notes go here..."
}
],
"response_format": {
"type": "json_schema",
"json_schema": {
"name": "action_items",
"strict": true,
"schema": {
"type": "object",
"properties": {
"action_items": {
"type": "array",
"items": {
"type": "object",
"properties": {
"description": {
"type": "string",
"description": "Description of the action item."
},
"due_date": {
"type": ["string", "null"],
"description": "Due date for the action item, can be null if not specified."
},
"owner": {
"type": ["string", "null"],
"description": "Owner responsible for the action item, can be null if not specified."
}
},
"required": ["description", "due_date", "owner"],
"additionalProperties": false
},
"description": "List of action items from the meeting."
}
},
"required": ["action_items"],
"additionalProperties": false
}
}
}
}
Structured output:
{
"action_items": [
{
"description": "Collaborate on optimizing the path planning algorithm",
"due_date": "2024-06-30",
"owner": "Jason Li"
},
{
"description": "Reach out to industry partners for additional datasets",
"due_date": "2024-06-25",
"owner": "Aisha Patel"
},
{
"description": "Explore alternative LIDAR sensor configurations and report findings",
"due_date": "2024-06-27",
"owner": "Kevin Nguyen"
},
{
"description": "Schedule extended stress tests for the integrated navigation system",
"due_date": "2024-06-28",
"owner": "Emily Chen"
},
{
"description": "Retest the system after bug fixes and update the team",
"due_date": "2024-07-01",
"owner": "David Park"
}
]
}
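On the consuming side, the nullable fields in this schema (`"type": ["string", "null"]`) map naturally onto optional values. The sketch below uses only the standard library; the names `ActionItem`, `parse_action_items`, and `due_on_or_before` are hypothetical, and it assumes due dates arrive as ISO strings (YYYY-MM-DD) as in the sample output, which the schema itself does not enforce.

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ActionItem:
    description: str
    due_date: Optional[date]  # null in the JSON becomes None
    owner: Optional[str]


def parse_action_items(content):
    """Parse the model's JSON content into typed ActionItem objects."""
    items = []
    for raw in json.loads(content)["action_items"]:
        due = raw["due_date"]
        items.append(ActionItem(
            description=raw["description"],
            # Assumes ISO dates; the schema only says "string", so any other
            # format would raise ValueError here.
            due_date=date.fromisoformat(due) if due is not None else None,
            owner=raw["owner"],
        ))
    return items


def due_on_or_before(items, cutoff):
    """Items with a known due date that falls on or before `cutoff`."""
    return [i for i in items if i.due_date is not None and i.due_date <= cutoff]
```

If the application depends on a date format, describing it in the field's `description` is a reasonable way to steer the model, though that is guidance rather than a guarantee.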
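Safe Structured Outputs
Safety is a top priority for OpenAI. The new Structured Outputs functionality abides by OpenAI's existing safety policies and still allows the model to refuse an unsafe request.
To make development simpler, API responses carry a new refusal string value that lets developers programmatically detect whether the model produced a refusal instead of output matching the schema.
When the response does not include a refusal and the model's response has not been cut off prematurely (as indicated by finish_reason), the response will reliably be valid JSON matching the supplied schema.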
{
"id": "chatcmpl-9nYAG9LPNonX8DAyrkwYfemr3C8HC",
"object": "chat.completion",
"created": 1721596428,
"model": "gpt-4o-2024-08-06",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"refusal": "I'm sorry, I cannot assist with that request."
},
"logprobs": null,
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 81,
"completion_tokens": 11,
"total_tokens": 92
},
"system_fingerprint": "fp_3407719c7f"
}
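The two conditions described above (no refusal, and a finish_reason that does not indicate truncation) can be checked before parsing. This is a minimal sketch over the raw response as a plain dict, for the response_format path only; the exception class `ModelRefusal` and the function `extract_structured` are hypothetical names, not part of any SDK.

```python
import json


class ModelRefusal(Exception):
    """Raised when the model declined instead of producing schema output."""


def extract_structured(response):
    """Return the parsed JSON content, or raise if the reply is unusable.

    `response` is a Chat Completions response body as a plain dict.
    """
    choice = response["choices"][0]
    message = choice["message"]
    if message.get("refusal"):
        raise ModelRefusal(message["refusal"])
    if choice["finish_reason"] != "stop":
        # For example "length": the output was cut off, so the JSON may be
        # incomplete and the schema guarantee does not apply.
        raise RuntimeError(f"incomplete response: {choice['finish_reason']}")
    return json.loads(message["content"])
```

Applied to the sample response above, `extract_structured` raises `ModelRefusal` carrying the refusal text, so the application can show it to the user instead of attempting to parse a missing payload.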
References:
https://openai.com/index/introducing-structured-outputs-in-the-api/