arXiv:2303.11366v1 [cs.AI] 20 Mar 2023

Reflexion: an autonomous agent with dynamic memory and self-reflection

Noah Shinn, Northeastern University, Boston, MA (shinn.n@northeastern.edu)
Beck Labash, Northeastern University, Boston, MA (labash.b@northeastern.edu)
Ashwin Gopinath, Massachusetts Institute of Technology, Cambridge, MA (agopi@mit.edu)

Abstract

Recent advancements in decision-making large language model (LLM) agents have demonstrated impressive performance across various benchmarks. However, these state-of-the-art approaches typically necessitate internal model fine-tuning, external model fine-tuning, or policy optimization over a defined state space. Implementing these methods can prove challenging due to the scarcity of high-quality training data or the lack of a well-defined state space. Moreover, these agents do not possess certain qualities inherent to human decision-making processes, specifically the ability to learn from mistakes. Self-reflection allows humans to efficiently solve novel problems through a process of trial and error. Building on recent research, we propose Reflexion, an approach that endows an agent with dynamic memory and self-reflection capabilities to enhance its existing reasoning trace and task-specific action choice abilities. To achieve full automation, we introduce a straightforward yet effective heuristic that enables the agent to pinpoint hallucination instances, avoid repetition in action sequences, and, in some environments, construct an internal memory map of the given environment. To assess our approach, we evaluate the agent's ability to complete decision-making tasks in AlfWorld environments and knowledge-intensive, search-based question-and-answer tasks in HotPotQA environments. We observe success rates of 97% and 51%, respectively, and provide a discussion on the emergent property of self-reflection.
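The paper does not spell out the repetition/hallucination heuristic in this excerpt. As a minimal sketch of the idea, assuming an agent trajectory is represented as a list of hypothetical (action, observation) string pairs and that a repeated pair within a recent window signals a degenerate loop:

```python
from collections import deque


def detect_degeneration(trajectory, max_repeats=3, window=6):
    """Flag a trajectory as degenerate when the same (action, observation)
    pair recurs within the last `window` steps, which suggests the agent is
    looping or repeatedly attempting an unavailable (hallucinated) action.

    trajectory: list of (action, observation) string pairs.
    Returns True if any pair appears `max_repeats` or more times in the
    most recent `window` steps, else False.
    """
    recent = deque(trajectory[-window:], maxlen=window)
    counts = {}
    for step in recent:
        counts[step] = counts.get(step, 0) + 1
        if counts[step] >= max_repeats:
            return True
    return False


# Example: an agent stuck retrying the same failed action is flagged,
# while a trajectory of distinct steps is not.
looping = [("go north", "You hit a wall.")] * 3
varied = [("look", "You see a desk."), ("open drawer", "You find a key.")]
```

A trigger like this can gate when the agent is forced to stop, self-reflect on its trajectory, and write the reflection into memory before retrying; the exact trigger condition and window size here are illustrative assumptions, not the authors' specification.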
1 Introduction

Mastering decision-making and knowledge-intensive search tasks in novel environments is a crucial skill set for large-scale natural language agents. LLMs such as OpenAI's GPT-3 (Brown et al., 2020), Google's PaLM (Chowdhery et al., 2022), and others have achieved impressive results on various benchmarks (Kaplan et al., 2020; Rae et al., 2021; Nakano et al., 2021; Kojima et al., 2022; Ouyang et al., 2022; Chung et al., 2022). These models exhibit human-like abilities to understand tasks in given environments, marking significant progress in the field of natural language processing. Grounding complex tasks in natural language allows agents to overcome high syntactic barriers that may result in false-negative errors. However, learning optimal policies for natural language RL agents is challenging due to vast and mostly unbound state spaces. Several decision-making approaches have been proposed to enable natural language agents to select their next action without a learned policy in text-based environments. Chain-of-thought (CoT)
