The Coding Revolution Has Erupted! OpenAI's Most Powerful Agent Just Landed in ChatGPT

Author: 新智元 2025-05-17 08:55:41

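OpenAI's most powerful AI coding agent is finally here! Codex has launched, powered by codex-1, a version of o3 optimized for software engineering. It runs multiple tasks in parallel and can finish in half an hour software engineering work that would otherwise take days.

Starting today, AI coding enters a new era!

Moments ago, Greg Brockman led a six-person OpenAI team in a livestream to unveil a cloud-based AI coding agent: Codex.

In Altman's words, the era in which a single person can build countless hit apps has arrived!

Codex is powered by a new model, codex-1, a specially tuned version of o3 built for software engineering.

It can safely work on many tasks in parallel inside cloud sandbox environments, and through its GitHub integration it can work directly against your repositories.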

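More than just a tool, it is billed as a "10x engineer" that can do all of the following at once:

Rapidly build features
Answer in-depth questions about a codebase
Fix bugs precisely
Open PRs
Run tests automatically to verify changes

Tasks like these used to take developers hours or even days; Codex now typically finishes them within 30 minutes.

Click the ChatGPT sidebar, enter a prompt, then click "Code" to assign a task or "Ask" to ask a question about your codebase.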

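Codex was trained with reinforcement learning on real-world coding tasks across a variety of environments, so the code it generates matches human preferences and fits into standard workflows.

In benchmarks, codex-1 scored 72.1% on SWE-bench, beating both Claude 3.7 and o3-high.

Starting today, Codex is rolling out to ChatGPT Pro, Enterprise, and Team users worldwide, with Plus and Edu users to follow soon.

The arrival of the Codex coding agent may reshape the underlying logic of software development, and it has lit the spark of a coding revolution.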

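Codex Runs Tasks in Parallel: A Super Accelerator for AI Coding

Back in 2021, OpenAI released the first Codex model, ushering in the era of "vibe coding".

This style of programming has developers working alongside AI, making code production more intuitive and efficient.

A few weeks ago, OpenAI followed up with Codex CLI, an agent that runs in the local terminal.

But that was only the beginning!

Today OpenAI launched the all-new Codex agent, taking software engineering to another level.

Here is a look at Codex in action.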

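After connecting his GitHub account, OpenAI researcher Thibault Sottiaux selected an open-source repository, the preparedness repo.

He was then presented with three tasks:

First, a question: ask the Codex agent to explain the codebase and describe its overall structure
Second, a code task: find and fix a bug somewhere in the codebase
Third, a question: walk through the codebase and proactively propose tasks that Codex could carry out

Next in the demo, Thibault handed Codex several more tasks, covering spelling and grammar fixes, intelligent task delegation, and working across multiple repositories.

For the spelling fixes, he deliberately included typos in his instruction. Codex still understood the intent, then went on to find and fix spelling and grammar problems in the codebase, with impressive attention to detail.

When Thibault stated the goal of a codebase that is "maintainable and bug-free", Codex went through the repository, found issues such as mutable default values and inconsistent timeout settings on its own, and generated tasks to fix them.

This kind of "self-delegation" shows the agent at its best.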
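
For readers unfamiliar with the "mutable default value" problem mentioned above, here is a minimal Python sketch of the general pitfall. It is not the code from the preparedness repo or from the demo; the function names `add_tag_buggy` and `add_tag_fixed` are made up for illustration.

```python
def add_tag_buggy(tag, tags=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `tags` appends to the same shared list.
    tags.append(tag)
    return tags


def add_tag_fixed(tag, tags=None):
    # Use None as a sentinel and create a fresh list on each call.
    if tags is None:
        tags = []
    tags.append(tag)
    return tags
```

Calling `add_tag_buggy("a")` and then `add_tag_buggy("b")` returns the very same list object both times, so the second result also contains "a". The fixed version returns an independent list on each call.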

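Notably, the Codex agent runs on OpenAI's compute infrastructure, sharing the same battle-tested system used for reinforcement learning.

Each task runs in its own isolated virtual sandbox with a dedicated file system, CPU, memory, and network policy, which keeps it efficient and secure.

Besides the preparedness repository, Codex also handled the Codex CLI repository smoothly, showing that it generalizes across projects.

Whether the project is open source or an internal codebase, Codex handles it with ease.

Codex was then given a user-reported bug: file names containing special characters caused the diff command to fail.

While resolving it, Codex reproduced the problem, wrote a test script, ran the linter, and produced a PR, all within a few minutes.

Thibault said plainly, "This could have taken me 30 minutes, maybe even a few hours."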
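
The article does not show the actual bug or the patch Codex wrote. As a generic illustration of how file names with special characters can break a diff invocation, the sketch below contrasts an unquoted shell command string with a quoted one. The helper names are hypothetical, and the example is in Python rather than the language of the Codex CLI codebase.

```python
import shlex


def naive_diff_command(old_path: str, new_path: str) -> str:
    # Buggy: paths are pasted into the command string unquoted, so a space
    # or shell metacharacter in a file name changes how the shell parses it.
    return f"diff -u {old_path} {new_path}"


def safe_diff_command(old_path: str, new_path: str) -> str:
    # shlex.join quotes each argument so the shell sees exactly two paths.
    return shlex.join(["diff", "-u", old_path, new_path])
```

With a file named `a b.txt`, the naive command is parsed as three path operands instead of two, while the quoted command round-trips through `shlex.split` to the original arguments. A regression test for this kind of bug would typically assert exactly that round trip.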

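In addition, OpenAI researcher Katy Shi emphasized in her demo that Codex's PRs include a detailed summary that clearly describes what was changed and cites the relevant code, with test results visible at a glance.

After the demos, Greg said Codex had given him a real sense of AGI!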

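Aligning with Human Preferences: Four Open-Source Repositories in Practice

A primary goal in training codex-1 was to make its output align closely with human coding preferences and standards.

Compared with OpenAI o3, codex-1 consistently produces cleaner patches that are ready for human review and can be integrated into standard workflows.

To show how concise and efficient Codex's code is, OpenAI provided four real examples from open-source repositories comparing Codex with o3:

astropy

astropy is an open-source Python library for astronomy.

The first issue comes from the astropy/astropy repository: separability_matrix in the modeling module does not correctly compute separability for nested CompoundModels.

Comparing the code before and after the change, the patch Codex generated is very concise.

By contrast, o3's change is somewhat verbose and even adds some "unnecessary" comments to the source code.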

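matplotlib

Matplotlib is a comprehensive Python library for creating static, animated, and interactive visualizations.

The issue here is a bug fix: the window correction in mlab._spectral_helper is incorrect.

Again, Codex's change is more concise.

django

Django is a Python-based web framework. The issue here is fixing expressions that contain only duration values, which do not work correctly on SQLite and MySQL.

Codex's fix is again elegant and, unlike o3's, it starts by adding the missing dependency call.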

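expensify

expensify is open-source software for financial collaboration centered on chat.

The issue OpenAI chose is "[HOLD for payment 2024-10-14] [$250] LHN - Members' chat room names are not updated in LHN after the cache is deleted".

Once more, Codex locates and fixes the problem more precisely and effectively, whereas o3 even makes one ineffective code change.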

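OpenAI's Own Teams Are Already Using It

OpenAI's technical teams have started using Codex as part of their daily toolkit.

OpenAI engineers most often use Codex for repetitive, well-scoped tasks such as refactoring, renaming, and writing tests, which would otherwise break their focus.

It is equally useful for scaffolding new features, wiring up components, fixing bugs, and drafting documentation.

Teams are building new habits around Codex: triaging on-call issues, planning tasks at the start of the day, and offloading background work to keep moving.

By reducing context switching and surfacing forgotten to-dos, Codex helps engineers ship faster and focus on what matters most.

Before the launch, OpenAI worked with a small group of external testers to evaluate how Codex performs across different codebases, development processes, and team environments:

Cisco, as an early design partner, is exploring how Codex can help its engineering teams bring ideas to life faster, and it helps OpenAI improve the model by evaluating real-world use cases and providing feedback.
Temporal uses Codex to speed up feature development, debugging, and writing and running tests, and to refactor large codebases. Codex can also handle complex tasks in the background, helping engineers stay focused and iterate quickly.
Superhuman uses Codex to automate small, repetitive tasks such as improving test coverage and fixing integration failures. It also lets product managers make lightweight code changes without pulling in an engineer (except for code review), making collaboration more efficient.
Kodiak uses Codex to speed up writing debugging tools, improving test coverage, and refactoring code as it develops its autonomous driving system, the Kodiak Driver. Codex also serves as a reference tool that helps engineers understand unfamiliar parts of the stack by surfacing relevant context and past changes.

Based on experience so far, OpenAI recommends assigning well-scoped tasks to multiple agents simultaneously and experimenting with different task types and prompts to explore the model's capabilities more fully.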

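Model System Message

The system message below lets developers understand codex-1's default behavior and tailor it to their own workflows.

For example, the system message directs Codex to run all the tests mentioned in AGENTS.md files, but if you are short on time you can ask Codex to skip them.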

# Instructions
- The user will provide a task.
- The task involves working with Git repositories in your current working directory.
- Wait for all terminal commands to be completed (or terminate them) before finishing.


# Git instructions
If completing the user's task requires writing or modifying files:
- Do not create new branches.
- Use git to commit your changes.
- If pre-commit fails, fix issues and retry.
- Check git status to confirm your commit. You must leave your worktree in a clean state.
- Only committed code will be evaluated.
- Do not modify or amend existing commits.


# AGENTS.md spec
- Containers often contain AGENTS.md files. These files can appear anywhere in the container's filesystem. Typical locations include `/`, `~`, and in various places inside of Git repos.
- These files are a way for humans to give you (the agent) instructions or tips for working within the container.
- Some examples might be: coding conventions, info about how code is organized, or instructions for how to run or test code.
- AGENTS.md files may provide instructions about PR messages (messages attached to a GitHub Pull Request produced by the agent, describing the PR). These instructions should be respected.
- Instructions in AGENTS.md files:
  - The scope of an AGENTS.md file is the entire directory tree rooted at the folder that contains it.
  - For every file you touch in the final patch, you must obey instructions in any AGENTS.md file whose scope includes that file.
  - Instructions about code style, structure, naming, etc. apply only to code within the AGENTS.md file's scope, unless the file states otherwise.
  - More-deeply-nested AGENTS.md files take precedence in the case of conflicting instructions.
  - Direct system/developer/user instructions (as part of a prompt) take precedence over AGENTS.md instructions.
- AGENTS.md files need not live only in Git repos. For example, you may find one in your home directory.
- If the AGENTS.md includes programmatic checks to verify your work, you MUST run all of them and make a best effort to validate that the checks pass AFTER all code changes have been made.
  - This applies even for changes that appear simple, i.e. documentation. You still must run all of the programmatic checks.


# Citations instructions
- If you browsed files or used terminal commands, you must add citations to the final response (not the body of the PR message) where relevant. Citations reference file paths and terminal outputs with the following formats:
  1) `【F:<file_path>†L<line_start>(-L<line_end>)?】`
  - File path citations must start with `F:`. `file_path` is the exact file path of the file relative to the root of the repository that contains the relevant text.
  - `line_start` is the 1-indexed start line number of the relevant output within that file.
  2) `【<chunk_id>†L<line_start>(-L<line_end>)?】`
  - Where `chunk_id` is the chunk_id of the terminal output, `line_start` and `line_end` are the 1-indexed start and end line numbers of the relevant output within that chunk.
- Line ends are optional, and if not provided, line end is the same as line start, so only 1 line is cited.
- Ensure that the line numbers are correct, and that the cited file paths or terminal outputs are directly relevant to the word or clause before the citation.
- Do not cite completely empty lines inside the chunk, only cite lines that have content.
- Only cite from file paths and terminal outputs, DO NOT cite from previous pr diffs and comments, nor cite git hashes as chunk ids.
- Use file path citations that reference any code changes, documentation or files, and use terminal citations only for relevant terminal output.
- Prefer file citations over terminal citations unless the terminal output is directly relevant to the clauses before the citation, i.e. clauses on test results.
  - For PR creation tasks, use file citations when referring to code changes in the summary section of your final response, and terminal citations in the testing section.
  - For question-answering tasks, you should only use terminal citations if you need to programmatically verify an answer (i.e. counting lines of code). Otherwise, use file citations.
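
The scoping and precedence rules in the AGENTS.md spec above can be sketched in a few lines of Python. This is only an illustration of the rules as the system message states them, not how Codex resolves them internally. The function name `applicable_agents_files` is hypothetical, and all paths are assumed to be relative to one common root.

```python
from pathlib import PurePosixPath


def applicable_agents_files(touched_file: str, agents_files: list[str]) -> list[str]:
    """Return the AGENTS.md paths whose scope covers touched_file.

    Results are ordered shallowest first, so on conflicting instructions
    the last entry (the most deeply nested file) takes precedence.
    """
    target = PurePosixPath(touched_file)
    in_scope = []
    for agents_path in agents_files:
        # An AGENTS.md file governs the directory tree rooted at its folder.
        scope_root = PurePosixPath(agents_path).parent
        if scope_root in target.parents:
            in_scope.append(agents_path)
    return sorted(in_scope, key=lambda p: len(PurePosixPath(p).parts))
```

Given AGENTS.md files at the repository root, in `src/`, in `docs/`, and in `src/app/`, a change to `src/app/main.py` falls under the root, `src/`, and `src/app/` files, with `src/app/AGENTS.md` winning any conflict. Direct instructions in the prompt would still override all of them, as the system message specifies.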

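Codex CLI Updates

Last month, OpenAI released Codex CLI, a lightweight open-source tool that brings powerful models such as o3 and o4-mini directly into the local terminal to help developers finish tasks faster.

This time, OpenAI also released a smaller version of codex-1: a version of o4-mini optimized specifically for Codex CLI.

It offers low latency, strong instruction following, and solid code editing ability. It is now the default model in Codex CLI, is also available through the API as codex-mini-latest, and will continue to be updated.

Signing in to Codex CLI has also been simplified: developers can now sign in with their ChatGPT account and select an API organization, and the API key is generated and configured automatically.

To encourage adoption, for 30 days starting today, users who sign in to Codex CLI with a ChatGPT account receive free credits: $5 in API credits for Plus users and $50 for Pro users.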

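Is Codex Expensive?

Over the coming weeks, all users with access can try Codex on generous, effectively all-you-can-eat terms.

After that, OpenAI will introduce rate limits and flexible pricing, with the option to purchase additional usage on demand.

For developers, the codex-mini-latest model is available on the Responses API, priced at:

Input tokens: $1.50 per million
Output tokens: $6.00 per million
A 75% prompt caching discount also applies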
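
To make the pricing concrete, here is a rough cost estimator built from the three figures above. It assumes the 75% discount applies only to input tokens served from the prompt cache, which the article does not spell out. The function name `estimate_cost_usd` and its parameters are illustrative, not part of any OpenAI SDK.

```python
INPUT_USD_PER_MILLION = 1.50   # price quoted in the article
OUTPUT_USD_PER_MILLION = 6.00  # price quoted in the article
CACHE_DISCOUNT = 0.75          # assumed to apply to cached input tokens only


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one request in US dollars."""
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    uncached = input_tokens - cached_input_tokens
    input_cost = uncached * INPUT_USD_PER_MILLION / 1_000_000
    cached_cost = (cached_input_tokens * INPUT_USD_PER_MILLION
                   * (1 - CACHE_DISCOUNT) / 1_000_000)
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return input_cost + cached_cost + output_cost
```

For example, a workload with 1,000,000 input tokens, of which 400,000 hit the cache, plus 200,000 output tokens would cost about $0.90 + $0.15 + $1.20 = $2.25 under these assumptions.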

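Codex is still in research preview. It does not yet support image inputs, which matter for frontend work, and it cannot yet be course-corrected while a task is running.

Delegating a task to the Codex agent also has a relatively long turnaround, so users may need time to get used to this asynchronous way of collaborating.

As model capabilities improve, Codex will handle more complex, longer-running development tasks and gradually become more like a "remote development partner".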

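What's Next

OpenAI's goal is for developers to focus on the work they do best and hand the rest to AI agents, raising efficiency and productivity.

Codex will support both real-time collaboration and asynchronous task delegation, and the two modes will gradually converge.

Tools like Codex CLI have already become standard equipment for speeding up coding, while the asynchronous, multi-agent workflow led by Codex in ChatGPT could become the new paradigm for how engineers produce high-quality code efficiently.

In the future, developers will be able to work with AI in their IDEs and everyday tools, asking questions, getting suggestions, and delegating complex tasks, all within one unified workflow.

OpenAI plans to further improve interactivity and flexibility:

Providing guidance partway through a task
Collaborating with the AI on implementation strategy
Receiving proactive progress updates
Deeper integration with common tools (such as GitHub, the CLI, issue trackers, and CI systems) to make assigning tasks easy

Software engineering is becoming one of the first industries to see major efficiency gains from AI, which will unlock enormous potential for individuals and small teams.

Meanwhile, OpenAI is working with partners to study how the widespread adoption of agents will affect development workflows, skill development, and the global distribution of talent.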

References:

https://www.youtube.com/watch?v=hhdpnbfH6NU

https://openai.com/index/introducing-codex/

Editor: 武曉燕 Source: 新智元
