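Survey of the Latest Agent Research (2026-05-14)
This report was generated automatically from papers.cool/arxiv/cs.AI
Selection criterion: papers related to AI Agent systems
Generated at: 2026/5/14 14:28:35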
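📊 Today's overview
- Total papers: 25
- Agent-related: 16
Distribution by direction (a paper may carry more than one direction label)
| Direction | Papers |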
|---|---|
| other | 4 |
| multi_agent | 4 |
| planning | 6 |
| safety | 1 |
| evaluation | 1 |
| memory | 2 |
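
The per-direction counts in this table add up to 18 rather than 16, because two papers (2605.13534 and 2605.13391) are labeled both memory and planning. A minimal sketch of that tally follows. The dictionary is assembled by hand from the paper listing in this report, and the variable names are illustrative assumptions, not part of the report generator.

```python
from collections import Counter

# Direction labels per arXiv ID, copied from the listing in this report.
# Two papers carry two labels each, so they are counted once per label.
paper_directions = {
    "2605.13821": ["other"],
    "2605.13625": ["other"],
    "2605.13527": ["other"],
    "2605.13414": ["other"],
    "2605.13725": ["multi_agent"],
    "2605.13345": ["multi_agent"],
    "2605.13311": ["multi_agent"],
    "2605.13296": ["multi_agent"],
    "2605.13702": ["planning"],
    "2605.13534": ["memory", "planning"],
    "2605.13391": ["memory", "planning"],
    "2605.13335": ["planning"],
    "2605.13332": ["planning"],
    "2605.13301": ["planning"],
    "2605.13579": ["safety"],
    "2605.13542": ["evaluation"],
}

# Count every label occurrence, then compare with the number of unique papers.
direction_counts = Counter(
    label for labels in paper_directions.values() for label in labels
)
unique_papers = len(paper_directions)
label_total = sum(direction_counts.values())

print(dict(direction_counts))
print(unique_papers, label_total)
```

The gap between `label_total` and `unique_papers` equals the number of extra labels, here two.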
1️⃣ Today's Agent-related papers
OTHER (4 papers)
1. Harnessing Agentic Evolution
- arXiv ID: 2605.13821
- Research direction: other
- Key points:
  - agentic, evolution, aevo, feedback, agent, accumulated, editing, future, evidence, meta
2. How to Interpret Agent Behavior
- arXiv ID: 2605.13625
- Research direction: other
- Key points:
  - agent, onomy, taxonomy, behavior, interpret, oversight, act, agents, trajectories, runtime
3. MMSkills: Towards Multimodal Skills for General Visual Agents
- arXiv ID: 2605.13527
- Research direction: other
- Key points:
  - multimodal, mmskills, skill, reusable, visual, skills, procedural, agents, packages, agent
4. TRIAGE: Evaluating Prospective Metacognitive Control in LLMs under Resource Constraints
- arXiv ID: 2605.13414
- Research direction: other
- Key points:
  - metacognitive, triage, prospective, control, resource, language, token, budget, per, agent
MULTI_AGENT (4 papers)
1. ScioMind: Cognitively Grounded Multi-Agent Social Simulation with Anchoring-Based Belief Dynamics and Dynamic Profiles
- arXiv ID: 2605.13725
- Research direction: multi_agent
- Key points:
  - sciomind, belief, cognitively, grounded, anchoring, llm, agent, opinion, simulation, social
2. Multi-Agent Systems in Emergency Departments: Validation Study on a ED Digital Twin
- arXiv ID: 2605.13345
- Research direction: multi_agent
- Key points:
  - departments, abm, emergency, des, resource, strategies, agent, mas, validation, patient
3. IdeaForge: A Knowledge Graph-Grounded Multi-Agent Framework for Cross-Methodology Innovation Analysis and Patent Claim Generation
- arXiv ID: 2605.13311
- Research direction: multi_agent
- Key points:
  - innovation, patent, ideaforge, graph, claim, methodology, grounded, agent, triz, claims
4. Discrete Diffusion for Complex and Congested Multi-Agent Path Finding with Sparse Social Attention
- arXiv ID: 2605.13296
- Research direction: multi_agent
- Key points:
  - mapf, repair, lns2, d3pm, congested, initializer, drafts, difflns, discrete, diffusion
PLANNING (6 papers)
1. Adaptive mine planning under geological uncertainty: A POMDP framework for sequential decision-making
- arXiv ID: 2605.13702
- Research direction: planning
- Key points:
  - pomdp, geological, belief, mine, decisions, planning, uncertainty, decision, adaptive, updating
2. Scaling Retrieval-Augmented Reasoning with Parallel Search and Explicit Merging
- arXiv ID: 2605.13534
- Research direction: memory, planning
- Key points:
  - reasoning, multisearch, retrieval, merging, information, snr, query, search, retrieved, parallel
3. RS-Claw: Progressive Active Tool Exploration via Hierarchical Skill Trees for Remote Sensing Agents
- arXiv ID: 2605.13391
- Research direction: memory, planning
- Key points:
  - tool, claw, skill, rag, invocation, reasoning, agent, active, agents, remote
4. Ego2World: Compiling Egocentric Cooking Videos into Executable Worlds for Belief-State Planning
- arXiv ID: 2605.13335
- Research direction: planning
- Key points:
  - ego2world, egocentric, executable, belief, cooking, worlds, graph, videos, embodied, compiling
5. Diversity of Extensions in Abstract Argumentation
- arXiv ID: 2605.13332
- Research direction: planning
- Key points:
  - argumentation, extensions, arguments, diversity, admits, abstract, diverse, reasoning, notion, conflicts
6. Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling
- arXiv ID: 2605.13301
- Research direction: planning
- Key points:
  - olympiad, medal, ipho, reasoning, imo, gold, recipe, 2025, sft, level
SAFETY (1 paper)
1. Position: Assistive Agents Need Accessibility Alignment
- arXiv ID: 2605.13579
- Research direction: safety
- Key points:
  - assistive, accessibility, bvi, agentic, alignment, sighted, agents, design, deployment, 778
EVALUATION (1 paper)
1. RealICU: Do LLM Agents Understand Long-Context ICU Data? A Benchmark Beyond Behavior Imitation
- arXiv ID: 2605.13542
- Research direction: evaluation
- Key points:
  - realicu, icu, patient, actions, hindsight, physician, physicians, llm, chengzhi, windows
MEMORY (2 papers, both also listed under PLANNING)
1. Scaling Retrieval-Augmented Reasoning with Parallel Search and Explicit Merging
- arXiv ID: 2605.13534
- Research direction: memory, planning
- Key points:
  - reasoning, multisearch, retrieval, merging, information, snr, query, search, retrieved, parallel
2. RS-Claw: Progressive Active Tool Exploration via Hierarchical Skill Trees for Remote Sensing Agents
- arXiv ID: 2605.13391
- Research direction: memory, planning
- Key points:
  - tool, claw, skill, rag, invocation, reasoning, agent, active, agents, remote
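2️⃣ Research trend analysis
Today's hot directions
Based on today's 16 Agent-related papers:
- planning: 6 papers 🔥 hot
- other: 4 papers 🔥 hot
- multi_agent: 4 papers 🔥 hot
Shifts in technical paradigm
- Tool Calling → Tool Learning: from simple tool invocation to autonomous tool learning
Emerging architecture patterns
- No clearly new architecture pattern today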
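3️⃣ Key insights
- Memory is becoming infrastructure: more and more systems treat memory as a standard component rather than an optional feature
- Planning is moving from rules to learning: traditional symbolic planning is being replaced by learned neural approaches
- Multi-Agent collaboration is standardizing: consensus is forming around multi-agent communication protocols and coordination mechanisms
- Safety is moving from after-the-fact to up-front: safety design is being built into the system architecture rather than patched on afterwards
- Evaluation benchmarks are evolving quickly: Agent capability evaluation is expanding from single tasks to complex scenarios
- Open-source solutions are iterating quickly: open-source implementations are rapidly catching up with commercial Agent capabilities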
4️⃣ Technology evolution path
1. Prompt Engineering
Current hot paths
- RAG → Memory System → World Model: memory architectures keep deepening
- ReAct → Planning System → Goal Reasoning: reasoning capability keeps improving
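5️⃣ Relation to open-source Agent projects
Comparison with mainstream projects
| Open-source project | Related directions | Supported by today's papers |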
|---|---|---|
| LangChain | tool, planning | ✅ |
| LlamaIndex | memory, rag | ✅ |
| AutoGPT | planning, autonomous | ✅ |
| CrewAI | multi-agent | ✅ |
| Mem0 | memory | ✅ |
| OpenDevin | tool, planning | ➖ |
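Design validation and evolution
Validated designs:
- The need for a Memory System continues to be validated
- Tool Use as a core Agent capability is now the consensus
- Multi-Agent architectures perform better on complex tasks
Designs that need to evolve:
- Simple RAG is being replaced by Memory Systems
- Monolithic single-Agent architectures are limited in complex scenarios
- Static Tool Definitions need to evolve toward dynamic learning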
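6️⃣ Architecture-level conclusions
- Memory First: new Agent projects should design the Memory System up front rather than add it later
- Tool Abstraction: the tool abstraction layer should support dynamic discovery and learning rather than hard-coding
- Multi-Agent Ready: even if the system is single-Agent today, the architecture should leave room for multi-Agent extension
- Safety by Design: safety mechanisms should be considered at the architecture design stage rather than patched on afterwards
- Evaluation Driven: set up continuous evaluation rather than relying on manual testing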
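To make the Tool Abstraction item above more concrete, here is a minimal sketch of a registry where tools are registered at runtime and discovered by matching a query against their descriptions. `Tool`, `ToolRegistry`, and the word-overlap matching are hypothetical names and choices made for illustration. They do not come from any of the listed papers or projects, and a real system would likely use a stronger matching method.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Tool:
    """Hypothetical tool record: a name, a description, and a callable."""
    name: str
    description: str
    run: Callable[[str], str]


class ToolRegistry:
    """Hypothetical registry: tools are added at runtime, not hard-coded."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def discover(self, query: str) -> List[Tool]:
        # Naive discovery: keep tools whose description shares a word with the query.
        words = set(query.lower().split())
        return [
            tool for tool in self._tools.values()
            if words & set(tool.description.lower().split())
        ]


registry = ToolRegistry()
registry.register(Tool("upper", "convert text to upper case", str.upper))
registry.register(Tool("reverse", "reverse the characters of a string", lambda s: s[::-1]))

matches = registry.discover("make this text upper case")
result = matches[0].run("agent")
print([tool.name for tool in matches], result)
```

Because the agent asks the registry instead of importing tools directly, new tools can be added without touching the agent code.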
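7️⃣ Recommended next steps
Memory Schema design
- Adopt a layered memory architecture: Working Memory → Episodic → Long-term
- Design a unified Memory Interface that supports multiple backends (vector, graph, relational)
- Implement a Memory Compression mechanism to avoid unbounded growth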
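The three Memory Schema items above can be sketched together as follows. This is an illustration under stated assumptions, not a reference design: `MemoryBackend`, `ListBackend`, and `LayeredMemory` are hypothetical names, the in-memory list stands in for a vector, graph, or relational store, and "compression" here is a plain string join where a real system might summarize with a model.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Deque, List


class MemoryBackend(ABC):
    """Hypothetical unified interface that any long-term store would implement."""

    @abstractmethod
    def write(self, item: str) -> None:
        ...

    @abstractmethod
    def read(self, query: str) -> List[str]:
        ...


class ListBackend(MemoryBackend):
    """In-memory stand-in for a real backend; retrieval is a substring match."""

    def __init__(self) -> None:
        self.items: List[str] = []

    def write(self, item: str) -> None:
        self.items.append(item)

    def read(self, query: str) -> List[str]:
        return [item for item in self.items if query in item]


class LayeredMemory:
    """Working Memory -> Episodic -> Long-term, with a size-triggered compression step."""

    def __init__(self, long_term: MemoryBackend, working_size: int = 2,
                 episodic_limit: int = 2) -> None:
        self.working: Deque[str] = deque(maxlen=working_size)
        self.episodic: List[str] = []
        self.long_term = long_term
        self.episodic_limit = episodic_limit

    def remember(self, event: str) -> None:
        if len(self.working) == self.working.maxlen:
            # The oldest working item is about to be evicted; keep it as an episode.
            self.episodic.append(self.working[0])
        self.working.append(event)
        if len(self.episodic) > self.episodic_limit:
            self._compress()

    def _compress(self) -> None:
        # Placeholder compression: merge all episodes into one long-term record.
        self.long_term.write(" | ".join(self.episodic))
        self.episodic.clear()


store = ListBackend()
memory = LayeredMemory(store, working_size=2, episodic_limit=2)
for event in ["e1", "e2", "e3", "e4", "e5", "e6"]:
    memory.remember(event)
print(list(memory.working), memory.episodic, store.items)
```

After the six events, the two newest stay in working memory, one sits in the episodic layer, and the three oldest have been merged into a single long-term record, so no layer grows without bound.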
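Retrieval Policy upgrade
- Move from simple similarity retrieval to hybrid retrieval (keyword + vector + knowledge graph)
- Implement a context-aware, dynamic retrieval strategy
- Consider adding a Reranking step to improve relevance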
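One common way to combine several retrievers is reciprocal rank fusion (RRF), followed by a rerank pass over the top candidates. The sketch below assumes the two input rankings are the outputs of a keyword retriever and a vector retriever. The document IDs, the relevance scores, and the function names `rrf_fuse` and `rerank` are all made up for illustration, and the report itself does not prescribe RRF.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Reciprocal rank fusion: each ranking adds 1 / (k + rank) to a document's score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


def rerank(doc_ids: List[str], scorer: Callable[[str], float], top_n: int) -> List[str]:
    """Re-order only the top_n fused candidates with a (usually costlier) scorer."""
    return sorted(doc_ids[:top_n], key=scorer, reverse=True)


# Hypothetical retriever outputs, best hit first.
keyword_hits = ["a", "b", "c", "d"]
vector_hits = ["b", "c", "a", "d"]
fused = rrf_fuse([keyword_hits, vector_hits])

# Hypothetical relevance scores standing in for a reranking model.
relevance = {"a": 0.9, "b": 0.4, "c": 0.7, "d": 0.1}
final = rerank(fused, lambda doc_id: relevance[doc_id], top_n=3)
print(fused, final)
```

RRF only needs ranks, not comparable scores, which is why it is convenient when the keyword, vector, and graph retrievers score on different scales. A knowledge-graph retriever would simply contribute a third ranking to `rrf_fuse`.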
Agent Orchestration adjustments
- Design a standardized Agent communication protocol
- Implement dynamic task allocation
- Consider introducing an Orchestrator role
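A minimal sketch of how the three orchestration items above could fit together: a shared message envelope, an Orchestrator, and a least-loaded assignment rule. `Message`, `Worker`, `Orchestrator`, the field names, and the allocation rule are all assumptions chosen for illustration, not a protocol taken from any listed paper or project.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Message:
    """Hypothetical message envelope shared by every agent."""
    sender: str
    recipient: str
    task: str
    skill: str


@dataclass
class Worker:
    name: str
    skills: Set[str]
    inbox: List[Message] = field(default_factory=list)


class Orchestrator:
    """Hypothetical coordinator that routes each task to a capable worker."""

    def __init__(self, workers: List[Worker]) -> None:
        self.workers = workers

    def assign(self, task: str, skill: str) -> Worker:
        # Dynamic allocation: among workers with the skill, pick the least loaded.
        capable = [w for w in self.workers if skill in w.skills]
        if not capable:
            raise ValueError(f"no worker offers skill {skill!r}")
        chosen = min(capable, key=lambda w: len(w.inbox))
        chosen.inbox.append(Message("orchestrator", chosen.name, task, skill))
        return chosen


workers = [
    Worker("planner", {"plan"}),
    Worker("searcher_1", {"search"}),
    Worker("searcher_2", {"search"}),
]
orchestrator = Orchestrator(workers)
first = orchestrator.assign("find papers on MAPF", "search")
second = orchestrator.assign("find papers on POMDP", "search")
third = orchestrator.assign("draft outline", "plan")
print(first.name, second.name, third.name)
```

The two search tasks land on different workers because the second call sees that `searcher_1` already has one message queued.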
📚 Appendix
Full list of papers
- Harnessing Agentic Evolution - other
- ScioMind: Cognitively Grounded Multi-Agent Social Simulation with Anchoring-Based Belief Dynamics and Dynamic Profiles - multi_agent
- Adaptive mine planning under geological uncertainty: A POMDP framework for sequential decision-making - planning
- How to Interpret Agent Behavior - other
- Position: Assistive Agents Need Accessibility Alignment - safety
- RealICU: Do LLM Agents Understand Long-Context ICU Data? A Benchmark Beyond Behavior Imitation - evaluation
- Scaling Retrieval-Augmented Reasoning with Parallel Search and Explicit Merging - memory, planning
- MMSkills: Towards Multimodal Skills for General Visual Agents - other
- TRIAGE: Evaluating Prospective Metacognitive Control in LLMs under Resource Constraints - other
- RS-Claw: Progressive Active Tool Exploration via Hierarchical Skill Trees for Remote Sensing Agents - memory, planning
- Multi-Agent Systems in Emergency Departments: Validation Study on a ED Digital Twin - multi_agent
- Ego2World: Compiling Egocentric Cooking Videos into Executable Worlds for Belief-State Planning - planning
- Diversity of Extensions in Abstract Argumentation - planning
- IdeaForge: A Knowledge Graph-Grounded Multi-Agent Framework for Cross-Methodology Innovation Analysis and Patent Claim Generation - multi_agent
- Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling - planning
- Discrete Diffusion for Complex and Congested Multi-Agent Path Finding with Sparse Social Attention - multi_agent
This report was generated automatically by OpenClaw
Intended for Agent architects as a decision reference