Graphs Empower AI Agents: Prospects and Challenges for the Next Generation of Agents
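Abstract
AI agents have undergone a paradigm shift, from the early dominance of reinforcement learning (RL) to the rise of agents powered by large language models (LLMs), and are now moving toward a synergistic fusion of RL and LLM capabilities. This progression has given AI agents increasingly strong abilities. Despite these advances, accomplishing complex real-world tasks requires agents to plan and execute effectively, maintain reliable memory, and coordinate smoothly with other agents. Achieving these capabilities means contending with ever-present intricate information, operations, and interactions. Given this challenge, data structurization can play a promising role by transforming intricate, disorganized data into well-structured forms that agents can understand and process more effectively. In this context, graphs have a natural advantage in organizing, managing, and exploiting intricate data relationships, and they offer a powerful structurization paradigm for supporting the capabilities that advanced AI agents need. To this end, this survey presents the first systematic review of how graphs can empower AI agents. Specifically, we explore the integration of graph techniques with core agent functionalities, highlight notable applications, and identify prospective avenues for future research. By comprehensively surveying this rapidly developing intersection, we hope to inspire the development of next-generation AI agents that can tackle increasingly sophisticated challenges with graphs. Related resources are collected in the GitHub link and continuously updated for the community.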
Graphs Meet AI Agents: Taxonomy, Progress, and Future Opportunities
https://github.com/YuanchenBei/Awesome-Graphs-Meet-Agents
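Quick Overview
Research Background
- Research question: The paper asks how graph techniques can enhance the functions of AI agents, namely planning, execution, memory, and multi-agent coordination. Specifically, how can graph techniques help agents handle the information, operations, and interactions in complex tasks more effectively?
- Research difficulties: Handling unstructured data in complex tasks, achieving effective information passing and coordination in multi-agent systems, and maintaining and updating memory in dynamic environments.
- Related work: Applications of reinforcement learning (RL) and large language models (LLMs) to AI agents, and the successful use of graph techniques for data organization and knowledge extraction.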
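Research Methods
The paper organizes methods for enhancing AI agent functions with graph techniques into four areas:
1. Graphs in agent planning: Graphs can organize the form of task reasoning, arrange the task decomposition process, and structure an efficient search over task decisions.
- Task reasoning: Knowledge graph-auxiliary reasoning and structure-organized reasoning strengthen an LLM agent's understanding of the task.
- Task decomposition: A task dependency graph (TDG) represents the dependencies among subtasks and is used to optimize the execution path.
- Task decision searching: A state space graph (SSG) represents transitions between states, and search algorithms such as Monte Carlo tree search optimize the decision process.
2. Graphs in agent execution: Graphs help organize tool usage and environment interaction.
- Tool usage: A tool graph makes the connections between tools explicit, so the agent can use and manage a large number of tools efficiently.
- Environment interaction: Scene graphs encode the objects in a visual scene and their spatial or semantic relationships, which are modeled with either heuristic-based or learning-based methods.
3. Graphs in agent memory: Graph-structured memory can effectively reveal latent associations among pieces of information.
- Memory organization: Knowledge and experience are stored as interconnected representations, with knowledge graphs and other structured forms organizing long-term memory.
- Memory retrieval: Graph-based retrieval-augmented generation (graph-based RAG) retrieves useful information accurately and efficiently.
- Memory maintenance: Memory representations and graph topology are dynamically updated and refined in response to new experiences and interactions.
4. Graphs in multi-agent coordination: Graphs can model the communication paths and task assignments among agents.
- Task-specific relationships: Task dependency graphs and task assignment graphs are used to optimize task execution and information exchange.
- Environment-specific relationships: Relationship weights between agents are learned dynamically from the characteristics of the specific environment.
- Coordination topology optimization: The communication topology of a multi-agent system is optimized through edge importance measurement, graph autoencoder optimization, and reinforcement learning.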
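The edge importance idea named under coordination topology optimization can be sketched as a pruning step. This is only a minimal illustration, assuming every directed communication edge already carries an importance score; the agent names, the scores, and the `keep_ratio` parameter are hypothetical and are not taken from any of the surveyed methods, which learn such scores rather than hard-coding them.

```python
import math

# Hypothetical importance scores for directed communication edges (sender, receiver).
edge_scores = {
    ("planner", "coder"): 0.9,
    ("coder", "tester"): 0.8,
    ("planner", "tester"): 0.2,
    ("tester", "planner"): 0.1,
}

def prune_topology(scores, keep_ratio=0.5):
    """Keep the top `keep_ratio` fraction of edges, ranked by importance score."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    keep = max(1, math.ceil(len(ranked) * keep_ratio))
    return {edge for edge, _ in ranked[:keep]}

pruned = prune_topology(edge_scores, keep_ratio=0.5)
print(sorted(pruned))
```

With the scores above, the two weakest links are dropped, so the agents exchange fewer messages while the strongest communication paths remain.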
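Results and Analysis
- Scientific computing: Graph learning combined with agent systems shows significant potential in scientific computing, particularly for automated scientific discovery and bioinformatics analysis.
- Embodied AI: Graph-based representations are a powerful tool in embodied AI; they improve scene understanding and support better-informed decisions.
- Game AI: Graph structures are receiving growing attention for representation and modeling in game AI, particularly in multi-agent games and text-based games.
- Agentic information retrieval: Graph learning can support structured retrieval planning and adaptive reasoning, improving retrieval efficiency on complex tasks.
- Industrial and automation systems: Graph organization and learning can deliver efficient performance in industrial systems and improve their scalability, dynamic evolution, and robustness.
- Human society: LLM agents show great potential for analyzing and simulating human social behavior, particularly in social networks and information diffusion.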
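Overall Conclusion
The paper gives a comprehensive, systematic review of the intersection between graph techniques and AI agents, examining how graphs enhance agent planning, execution, memory, and multi-agent coordination. It also studies how the agent paradigm can in turn enhance graph learning. Based on this review, the paper summarizes meaningful applications, open problems, and future research directions, offering new potential approaches for next-generation agents that face increasingly complex and disordered task information. Related resources are organized in the GitHub link and continuously updated.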
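Paper Evaluation
Strengths and Innovations
- Systematic survey: The paper provides the first comprehensive, systematic survey of the intersection between graph techniques and AI agents, covering agent paradigms from reinforcement learning (RL) based to large language model (LLM) based.
- Novel taxonomy: The paper introduces a new taxonomic perspective on how graphs enhance the core functions of agents: planning, execution, memory, and multi-agent coordination.
- Bidirectional innovation: The paper discusses not only how graphs enhance agent functions but also how agents can in turn advance graph learning, emphasizing innovation in both directions and a holistic view.
- Applications and future opportunities: Building on the survey, the paper further discusses meaningful applications, key challenges, and future opportunities for graph-enhanced AI agents.
- Resource updates: Related resources are collected in the GitHub link and continuously updated, providing a valuable reference for the research and industrial communities.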
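Limitations and Reflections
- Benchmark evaluation: Existing benchmarks differ in task definitions and evaluation data, which makes unified evaluation challenging. In addition, graph-centric agent benchmarks are lacking for emerging complex scenarios such as multi-agent reasoning and memory and collaboration in large-scale dynamic environments.
- Graph foundation models: Although many graph learning methods exist, graph operators that can be used broadly across agent functions are still lacking. Developing effective graph foundation models (GFMs) is a promising direction, particularly designing GFMs from the perspective of effectiveness, explainability, and scalability (EES).
- Privacy and security: Because graph organization and learning establish connections between entities, they can introduce security concerns, including data privacy and defense against attacks. Future research should focus on designing more secure strategies for agent coordination and environment interaction.
- Multimodal agents: Although LLM agents have made significant progress in the language space, research on multimodal agents (agents that can understand and integrate information streams such as text, vision, and speech) is still developing. Graph learning can play an important role in abstracting and connecting multimodal data.
- Model Context Protocol (MCP): MCP provides a standardized way to integrate agent applications seamlessly with external data sources, tools, and services. Future research can explore how graph learning could enhance two promising aspects of MCP: efficient data integration and personalized recommendation.
- Open Agent Network (OAN): An open agent network would be a public, decentralized network in which agents are registered and orchestrated. Future research can explore how to apply graph learning within an OAN to improve the network's efficiency and risk control.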
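Key Questions and Answers
Question 1: What are the specific applications of graph techniques in the planning function of AI agents?
- Task reasoning: Knowledge graph-auxiliary reasoning and structure-organized reasoning strengthen an LLM agent's understanding of the task. For example, QA-GNN improves question answering by combining a language model with an auxiliary knowledge graph, while ToG and KG-CoT use knowledge-graph-based retrieval-augmented generation (RAG) to improve the task reasoning of LLM agents.
- Task decomposition: A task dependency graph (TDG) represents the dependencies among subtasks and is used to optimize the execution path. A TDG is a directed acyclic graph (DAG) over tasks, which helps the agent identify and execute valid subtask sequences.
- Task decision searching: A state space graph (SSG) represents transitions between states, and search algorithms such as Monte Carlo tree search optimize the decision process. By representing states as nodes and transitions between states as edges, an SSG helps the agent make effective decisions in complex environments.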
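The task decomposition point above can be illustrated with a small task dependency graph. This is a minimal sketch, assuming the TDG is given as a mapping from each subtask to the subtasks it depends on; the subtask names are hypothetical, and the scheduling uses Python's standard `graphlib.TopologicalSorter` rather than any method from the surveyed papers.

```python
from graphlib import TopologicalSorter

# Hypothetical subtasks: each key depends on every subtask in its set.
task_dependencies = {
    "collect_data": set(),
    "clean_data": {"collect_data"},
    "train_model": {"clean_data"},
    "write_report": {"clean_data"},
    "publish": {"train_model", "write_report"},
}

def execution_batches(dependencies):
    """Group subtasks into batches whose dependencies are already satisfied."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # subtasks that can run in parallel
        batches.append(ready)
        sorter.done(*ready)
    return batches

batches = execution_batches(task_dependencies)
print(batches)
```

Each inner list is a set of subtasks with no unmet dependencies, so an agent could run them in parallel; the order of the batches is one valid execution path through the DAG.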
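Question 2: What are the specific applications of graph techniques in the execution function of AI agents?
- Tool usage: A tool graph makes the connections between tools explicit, so the agent can use and manage a large number of tools efficiently. For example, GPTSwarm abstracts agents as a directed acyclic graph (DAG) in which each node represents a function and each edge represents information flow, helping agents invoke and manage tools efficiently.
- Environment interaction: Scene graphs encode the objects in a visual scene and their spatial or semantic relationships, which are modeled with either heuristic-based or learning-based methods. By representing objects as nodes and their relationships as edges, a scene graph helps the agent understand and interact with complex environments.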
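The tool graph idea above can be sketched as a path search. This is a minimal illustration, assuming an edge from tool A to tool B means that the output of A can be fed into B; the tool names are hypothetical, and plain breadth-first search stands in for the more elaborate planners used by the systems cited here.

```python
from collections import deque

# Hypothetical tool graph: an edge A -> B means the output of A can feed B.
tool_graph = {
    "web_search": ["page_fetcher"],
    "page_fetcher": ["summarizer", "table_extractor"],
    "table_extractor": ["chart_plotter"],
    "summarizer": [],
    "chart_plotter": [],
}

def shortest_tool_chain(graph, start, goal):
    """Breadth-first search for the shortest chain of tools from start to goal."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for next_tool in graph.get(path[-1], []):
            if next_tool not in visited:
                visited.add(next_tool)
                queue.append(path + [next_tool])
    return None  # no chain of compatible tools exists

chain = shortest_tool_chain(tool_graph, "web_search", "chart_plotter")
print(chain)
```

Because the agent only follows edges that exist in the graph, it never has to consider tool combinations whose inputs and outputs do not fit together.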
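Question 3: What are the specific applications of graph techniques in the memory function of AI agents?
- Memory organization: Knowledge and experience are stored as interconnected representations, with knowledge graphs and other structured forms organizing long-term memory. For example, AriGraph represents the agent's memory as a structured knowledge graph containing semantic facts and events, which helps the agent recall complex structures and relationships.
- Memory retrieval: Graph-based retrieval-augmented generation (graph-based RAG) retrieves useful information accurately and efficiently. For example, G-Retriever and GFM-RAG design customized retrievers that combine semantic similarity with graph metrics, improving the accuracy and efficiency of retrieval.
- Memory maintenance: Memory representations and graph topology are dynamically updated and refined in response to new experiences and interactions. For example, A-MEM builds an interconnected knowledge network through dynamic indexing and linking, allowing the agent to keep refining its memory representation in a changing environment.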
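The three memory operations above (organization, retrieval, maintenance) can be sketched with a toy graph memory. This is only an illustration under strong assumptions: facts are stored as (subject, relation, object) triples, retrieval matches entity names by exact word overlap instead of the semantic similarity used by real graph-based RAG systems, and maintenance simply overwrites an older fact that has the same subject and relation. The class and its method names are hypothetical and do not reproduce AriGraph, G-Retriever, or A-MEM.

```python
from collections import defaultdict

class GraphMemory:
    """Toy graph memory: entities are nodes, relations are labeled edges."""

    def __init__(self):
        self.edges = defaultdict(dict)  # subject -> {relation: object}

    def add_fact(self, subject, relation, obj):
        # Maintenance: a newer fact with the same subject and relation
        # replaces the older one.
        self.edges[subject][relation] = obj

    def retrieve(self, query, hops=1):
        """Return facts of entities named in the query, then follow edges `hops` more steps."""
        words = set(query.lower().split())
        frontier = {s for s in self.edges if s.lower() in words}
        seen, facts = set(), []
        for _ in range(hops + 1):
            next_frontier = set()
            for subject in sorted(frontier - seen):
                seen.add(subject)
                for relation, obj in self.edges.get(subject, {}).items():
                    facts.append((subject, relation, obj))
                    next_frontier.add(obj)
            frontier = next_frontier
        return facts

memory = GraphMemory()
memory.add_fact("alice", "works_at", "acme")
memory.add_fact("acme", "located_in", "berlin")
memory.add_fact("alice", "works_at", "globex")  # update replaces the old employer
memory.add_fact("globex", "located_in", "paris")

facts = memory.retrieve("where does alice work", hops=1)
print(facts)
```

The query only mentions `alice`, yet the one-hop expansion also surfaces where her current employer is located, which is the kind of latent association that a flat list of memories would not expose.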
Taxonomy
Graph for Agent Planning
Task Reasoning
Knowledge Graph-Auxiliary Reasoning
- (NAACL 2021) QA-GNN: Reasoning with language models and knowledge graphs for question answering [Paper] [Code]
- (ICLR 2024) Think-on-graph: Deep and responsible reasoning of large language model on knowledge graph [Paper] [Code]
- (ICLR 2024) Reasoning on graphs: Faithful and interpretable large language model reasoning [Paper] [Code]
- (IJCAI 2024) Kg-cot: Chain-of-thought prompting of large language models over knowledge graphs for knowledge-aware question answering [Paper]
- (ACL 2024) Mindmap: Knowledge graph prompting sparks graph of thoughts in large language models [Paper] [Code]
- (WWW 2025) Paths-over-graph: Knowledge graph empowered large language model reasoning [Paper]
Structure-Organized Reasoning
- (NeurIPS 2023) Tree of thoughts: Deliberate problem solving with large language models [Paper] [Code]
- (NAACL 2024) GoT: Effective Graph-of-Thought Reasoning in Language Models [Paper] [Code]
- (AAAI 2024) Graph of thoughts: Solving elaborate problems with large language models [Paper] [Code]
- (AAAI 2025) Ratt: A thought structure for coherent and correct llm reasoning [Paper] [Code]
- (Arxiv 2025) Reasoning with Graphs: Structuring Implicit Knowledge to Enhance LLMs Reasoning [Paper]
Task Decomposition
- (COLM 2024) Agentkit: Structured LLM reasoning with dynamic graphs [Paper] [Code]
- (TMLR 2024) Feudal Graph Reinforcement Learning [Paper]
- (NeurIPS 2024) Can Graph Learning Improve Planning in LLM-based Agents? [Paper] [Code]
- (ACL 2024) Villageragent: A graph-based multi-agent framework for coordinating complex task dependencies in minecraft [Paper] [Code]
- (Arxiv 2024) DAG-Plan: Generating Directed Acyclic Dependency Graphs for Dual-Arm Cooperative Planning [Paper] [Demo]
- (ICRA 2025) Enhancing Multi-Agent Systems via Reinforcement Learning with LLM-based Planner and Graph-based Policy [Paper]
- (Arxiv 2025) DynTaskMAS: A Dynamic Task Graph-driven Framework for Asynchronous and Parallel LLM-based Multi-Agent Systems [Paper]
- (Arxiv 2025) Plan-over-Graph: Towards Parallelable LLM Agent Schedule [Paper] [Code]
Task Decision Searching
- (NeurIPS 2014) Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning [Paper]
- (AAAI 2018, Best Paper) Memory-augmented monte carlo tree search [Paper]
- (ACML 2020) Monte-Carlo Graph Search: the Value of Merging Similar States [Paper]
- (ICAPS 2021) Improving alphazero using monte-carlo graph search [Paper]
- (AI 2024) Evolving interpretable decision trees for reinforcement learning [Paper]
- (AAMAS 2024) Continuous monte carlo graph search [Paper] [Code]
- (ICLR 2024) Promptagent: Strategic planning with language models enables expert-level prompt optimization [Paper] [Code]
- (ICML 2024) Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models [Paper] [Code]
Graph for Agent Execution
Tool Usage
- (ACL 2024) Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs [Paper] [Code]
- (ECCV 2024) ControlLLM: Augment Language Models with Tools by Searching on Graphs [Paper] [Code]
- (ICML 2024) GPTSwarm: Language Agents as Optimizable Graphs [Paper] [Code]
- (Arxiv 2024) ToolNet: Connecting Large Language Models with Massive Tools via Tool Graph [Paper]
- (Arxiv 2025) Graph RAG-Tool Fusion [Paper] [Code]
Environment Interaction
Heuristic-Based Relationship
- (ICRA 2023) Multiagent Reinforcement Learning for Autonomous Routing and Pickup Problem with Adaptation to Variable Demand [Paper]
- (COR 2024) A Graph Reinforcement Learning Framework for Neural Adaptive Large Neighbourhood Search [Paper]
- (AAMAS 2024) Towards Generalizability of Multi-Agent Reinforcement Learning in Graphs with Recurrent Message Passing [Paper] [Code]
- (RAL 2024) Language-Grounded Dynamic Scene Graphs for Interactive Object Search with Mobile Manipulation [Paper] [Code]
- (Arxiv 2024) PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning [Paper]
- (CVPR 2025) GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration [Paper] [Code]
- (Arxiv 2025) Multi-agent Auto-Bidding with Latent Graph Diffusion Models [Paper]
- (Arxiv 2025) A Schema-Guided Reason-while-Retrieve framework for Reasoning on Scene Graphs with Large-Language-Models (LLMs) [Paper]
Learning-Based Relationship
- (NeurIPS Workshop 2018) Deep Multi-Agent Reinforcement Learning with Relevance Graphs [Paper] [Code]
- (CoRL 2023) Learning Control Admissibility Models with Graph Neural Networks for Multi-Agent Navigation [Paper]
- (AAMAS 2023) TransfQMix: Transformers for Leveraging the Graph Structure of Multi-Agent Reinforcement Learning Problems [Paper] [Code]
- (TAI 2024) Reinforcement Learned Multi-Agent Cooperative Navigation in Hybrid Environment with Relational Graph Learning [Paper]
- (NCA 2024) Graph network-based human movement prediction for socially-aware robot navigation in shared workspaces [Paper]
Graph for Agent Memory
Memory Organization
- (JMS 2021) Towards Self-X Cognitive Manufacturing Network: An Industrial Knowledge Graph-Based Multi-Agent Reinforcement Learning Approach [Paper]
- (Arxiv 2024) AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents [Paper]
- (Arxiv 2024) On the Structural Memory of LLM Agents [Paper] [Code]
- (Arxiv 2024) From Local to Global: A GraphRAG Approach to Query-Focused Summarization [Paper] [Code]
- (Arxiv 2024) KG-Retriever: Efficient Knowledge Indexing for Retrieval-Augmented Large Language Models [Paper] [Code]
- (SIGIR 2025) Enhancing the Patent Matching Capability of Large Language Models via the Memory Graph [Paper] [Code]
- (AAAI 2025) LLM-Powered Decentralized Generative Agents with Adaptive Hierarchical Knowledge Graph for Cooperative Planning [Paper] [Code]
- (WWW 2025) Graphusion: A RAG Framework for Scientific Knowledge Graph Construction with a Global Perspective [Paper] [Code]
- (Arxiv 2025) Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research [Paper] [Code]
Memory Retrieval
- (NeurIPS 2024) G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering [Paper] [Code]
- (Arxiv 2024) LightRAG: Simple and Fast Retrieval-Augmented Generation [Paper] [Code]
- (ICLR 2025) Simple is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation [Paper] [Code]
- (NAACL 2025) GRAG: Graph Retrieval-Augmented Generation [Paper] [Code]
- (Arxiv 2025) GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation [Paper] [Code]
- (Arxiv 2025) PathRAG: Pruning Graph-based Retrieval Augmented Generation with Relational Paths [Paper] [Code]
Memory Maintenance
- (NeurIPS 2024) HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models [Paper] [Code]
- (Arxiv 2024) LightRAG: Simple and Fast Retrieval-Augmented Generation [Paper] [Code]
- (Arxiv 2024) KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph [Paper]
- (Arxiv 2024) AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents [Paper] [Code]
- (AAAI 2025) LLM-Powered Decentralized Generative Agents with Adaptive Hierarchical Knowledge Graph for Cooperative Planning [Paper] [Code]
- (Arxiv 2025) Zep: A Temporal Knowledge Graph Architecture for Agent Memory [Paper] [Code]
- (Arxiv 2025) A-Mem: Agentic Memory for LLM Agents [Paper] [Code]
- (Arxiv 2025) InstructRAG: Leveraging Retrieval-Augmented Generation on Instruction Graphs for LLM-Based Task Planning [Paper]
Graphs for Multi-Agent Coordination
Coordination Message Passing
Task-Specific Relationship
- (NeurIPS 2022) Learning NP-Hard Multi-Agent Assignment Planning using GNN: Inference on a Random Graph and Provable Auction-Fitted Q-learning [Paper]
- (ICRA 2025) Enhancing Multi-Agent Systems via Reinforcement Learning with LLM-based Planner and Graph-based Policy [Paper]
- (ICLR 2025) Scaling Large Language Model-based Multi-Agent Collaboration [Paper] [Code]
- (Arxiv 2025) DynTaskMAS: A Dynamic Task Graph-driven Framework for Asynchronous and Parallel LLM-based Multi-Agent Systems [Paper]
- (Arxiv 2025) MAGNNET: Multi-Agent Graph Neural Network-based Efficient Task Allocation for Autonomous Vehicles with Deep Reinforcement Learning [Paper]
- (Arxiv 2025) GNNs as Predictors of Agentic Workflow Performances [Paper] [Code]
Environment-Specific Relationship
- (ICASSP 2021) Graphcomm: A Graph Neural Network Based Method for Multi-Agent Reinforcement Learning [Paper]
- (ITSC 2022) Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Interactive Traffic Scenarios [Paper]
- (TITS 2022) Multi-Agent Trajectory Prediction with Heterogeneous Edge-Enhanced Graph Attention Network [Paper]
- (TCCN 2023) MAGNNETO: A Graph Neural Network-based Multi-Agent system for Traffic Engineering [Paper] [Code]
- (TPAMI 2023) Robust Multi-Agent Communication With Graph Information Bottleneck Optimization [Paper]
- (IROS 2024) Transformer-based Multi-Agent Reinforcement Learning for Generalization of Heterogeneous Multi-Robot Cooperation [Paper]
- (ICLR 2025) Exponential Topology-enabled Scalable Communication in Multi-agent Reinforcement Learning [Paper] [Code]
Coordination Topology Optimization
- (AAAI 2020) Multi-Agent Game Abstraction via Graph Attention Neural Network [Paper]
- (AAMAS 2021) Deep Implicit Coordination Graphs for Multi-agent Reinforcement Learning [Paper] [Code]
- (AAMAS 2021) Multi-Agent Graph-Attention Communication and Teaming [Paper]
- (NeurIPS 2021) Learning Distilled Collaboration Graph for Multi-Agent Perception [Paper] [Code]
- (TNNLS 2022) Online Multi-Agent Forecasting with Interpretable Collaborative Graph Neural Networks [Paper]
- (AAMAS 2023) Learning Graph-Enhanced Commander-Executor for Multi-Agent Navigation [Paper] [Code]
- (ICLR 2024) Learning Multi-Agent Communication from Graph Modeling Perspective [Paper] [Code]
- (Arxiv 2024) G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks [Paper] [Code]
- (COLM 2024) A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration [Paper] [Code]
- (ICML 2024) GPTSwarm: Language Agents as Optimizable Graphs [Paper] [Code]
- (JCISE 2025) Adaptive Network Intervention for Complex Systems: A Hierarchical Graph Reinforcement Learning Approach [Paper]
- (ICRA 2025) Reliable and Efficient Multi-Agent Coordination via Graph Neural Network Variational Autoencoders [Paper] [Code]
- (ICLR 2025) Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems [Paper] [Code]
- (Arxiv 2025) Deep Meta Coordination Graphs for Multi-agent Reinforcement Learning [Paper] [Code]
- (Arxiv 2025) Adaptive Graph Pruning for Multi-Agent Communication [Paper]
Agents for Graph Learning
Graph Annotation and Synthesis
- (NeurIPS 2020) Graph Policy Network for Transferable Active Learning on Graphs [Paper] [Code]
- (AAAI 2022) Batch Active Learning with Graph Neural Networks via Multi-Agent Deep Reinforcement Learning [Paper]
- (Arxiv 2024) Exploring the Potential of Large Language Models in Graph Generation [Paper]
- (Arxiv 2024) LLM-Based Multi-Agent Systems are Scalable Graph Generative Models [Paper] [Code]
- (ICLR Workshop 2025) IGDA: Interactive Graph Discovery through Large Language Model Agents [Paper]
- (Arxiv 2025) Plan-over-Graph: Towards Parallelable LLM Agent Schedule [Paper] [Code]
- (Arxiv 2025) GraphMaster: Automated Graph Synthesis via LLM Agents in Data-Limited Environments [Paper]
Graph Understanding
- (KDD 2020) Policy-GNN: Aggregation Optimization for Graph Neural Networks [Paper]
- (WWW 2021) SUGAR: Subgraph Neural Network with Reinforcement Pooling and Self-Supervised Mutual Information Mechanism [Paper] [Code]
- (NeurIPS 2023) MAG-GNN: Reinforcement Learning Boosted Graph Neural Network [Paper] [Code]
- (ICLR 2023) Agent-based Graph Neural Networks [Paper] [Code]
- (Arxiv 2023) Graph Agent: Explicit Reasoning Agent for Graphs [Paper]
- (Arxiv 2023) A Versatile Graph Learning Approach through LLM-based Agent [Paper]
- (KDD 2024) GraphWiz: An Instruction-Following Language Model for Graph Computational Problems [Paper] [Code]
- (SIGIR 2024) GraphGPT: Graph Instruction Tuning for Large Language Models [Paper] [Code]
- (ICLR 2024) One For All: Towards Training One Graph Model For All Classification Tasks [Paper] [Code]
- (ICML 2024) LLaGA: Large Language and Graph Assistant [Paper] [Code]
- (KDD 2024) ZeroG: Investigating Cross-dataset Zero-shot Transferability in Graphs [Paper] [Code]
- (Arxiv 2024) GraphAgent: Agentic Graph Language Assistant [Paper] [Code]
- (Arxiv 2024) Scalable and Accurate Graph Reasoning with LLM-based Multi-Agents [Paper]
- (Arxiv 2024) GraphTeam: Facilitating Large Language Model-based Graph Reasoning via Multi-Agent Collaboration [Paper]
- (Arxiv 2024) GraphInstruct: Empowering Large Language Models with Graph Understanding and Reasoning Capability [Paper] [Code]
- (AAAI 2025) Graph Agent Network: Empowering Nodes with Inference Capabilities for Adversarial Resilience [Paper]
Benchmarks and Open-Source Toolkits
General
- (NeurIPS 2021, RL Agent, Multi-Agent Coordination) Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks [Paper] [Code]
- (JMLR 2024, RL Agent, Multi-Agent Coordination) BenchMARL: Benchmarking Multi-Agent Reinforcement Learning [Paper] [Code]
- (JMLR 2025, RL Agent, Agent Memory) Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents [Paper] [Code]
- (EMNLP 2023, LLM Agent, Tool Usage) API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs [Paper] [Code]
- (NeurIPS 2023, LLM Agent, Task Reasoning, Task Decomposition) PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change [Paper] [Code]
- (ICLR 2024, LLM Agent, General) AgentBench: Evaluating LLMs as Agents [Paper] [Code]
- (ICLR 2024, LLM Agent, Tool Usage) ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs [Paper] [Code]
- (NeurIPS 2024, LLM Agent, Tool Usage) GTA: A Benchmark for General Tool Agents [Paper] [Code]
- (ICML 2024, LLM Agent, Task Reasoning, Task Decomposition, Tool Usage) TravelPlanner: A Benchmark for Real-World Planning with Language Agents [Paper] [Code]
- (ICLR 2025, LLM Agent, Tool Usage, Agent-Environment Interaction) τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains [Paper] [Code]
- (NAACL 2025, LLM Agent, Tool Usage, Agent-Environment Interaction) ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities [Paper] [Code]
- (Arxiv 2025, LLM Agent, Task Reasoning, Task Decomposition, Multi-Agent Coordination) REALM-Bench: A Real-World Planning Benchmark for LLMs and Multi-Agent Systems [Paper] [Code]
Graph-Related
- (LLM Agent, Tool Usage, Multi-Agent Coordination) LangGraph [Docs] [Code]
- (NeurIPS 2024, LLM Agent, Graph Modeling) GLBench: A Comprehensive Benchmark for Graph with Large Language Models [Paper] [Code]
- (NeurIPS 2024, LLM Agent, Graph Modeling) Can Large Language Models Analyze Graphs like Professionals? A Benchmark, Datasets and Models [Paper] [Code]
- (ICLR 2025, LLM Agent, Graph Modeling) GraphArena: Evaluating and Exploring Large Language Models on Graph Computation [Paper] [Code]
- (ICML 2024, LLM Agent, Task Reasoning, Tool Usage, Multi-Agent Coordination) GPTSwarm: Language Agents as Optimizable Graphs [Paper] [Code]
- (ICLR 2025, LLM Agent, Multi-Agent Coordination) Scaling Large Language Model-based Multi-Agent Collaboration [Paper] [Code]
- (Arxiv 2025, LLM Agent, Tool Usage) Graph RAG-Tool Fusion [Paper] [Code]
- (Arxiv 2025, LLM Agent, Task Reasoning, Task Decomposition) GNNs as Predictors of Agentic Workflow Performances [Paper] [Code]
Reposted from 知識圖譜科技; author: Wolfgang
