LLMs have demonstrated remarkable capabilities across a wide range of natural language tasks (see Language Models are Few-Shot Learners and the GPT-4 Technical Report). However, using LLMs as autonomous agents that can interact with environments and solve complex problems remains challenging. Recent work has explored various approaches to enhance the abilities of LLMs in areas such as reasoning, planning, and decision-making.
In Four AI Agent Strategies That Improve GPT-4 and GPT-3.5 Performance, the author highlights that agentic workflows, in which an LLM iterates over its outputs multiple times with feedback, can yield significant performance gains compared to single-pass generation. On benchmarks like HumanEval for code generation, agentic approaches using GPT-3.5 have achieved up to 95.1% accuracy, far surpassing the 67% of GPT-4 in zero-shot settings.
The author outlines four key design patterns for agentic workflows: reflection, tool use, planning, and multi-agent collaboration.
A key development has been the advent of chain-of-thought prompting, which allows LLMs to break down complex problems into steps. Building on this, several frameworks have emerged that leverage multiple LLM instances working collaboratively as agents. For example, the Reflexion framework uses verbal reinforcement learning, where an LLM agent provides feedback on its own outputs and uses that to improve in subsequent iterations. This allows for targeted improvements without requiring model fine-tuning.
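The self-feedback loop described above can be sketched in a few lines. This is a minimal illustration, not the Reflexion authors' implementation: `call_model` is a hypothetical stand-in for any LLM API, stubbed here with canned responses so the example runs without network access.

```python
# Sketch of a Reflexion-style verbal feedback loop: draft an answer,
# critique it, and feed the critique back in -- no fine-tuning involved.

def call_model(prompt: str) -> str:
    # Stub standing in for a real LLM call; returns canned responses.
    if "Critique" in prompt:
        return "too vague" if "v1" in prompt else "OK"
    return "v2 answer" if "too vague" in prompt else "v1 answer"

def reflexion_loop(task: str, max_iters: int = 3) -> str:
    answer = call_model(f"Task: {task}")
    for _ in range(max_iters):
        feedback = call_model(f"Critique this answer: {answer}")
        if feedback == "OK":
            break
        # Verbal feedback is injected into the next prompt, so the
        # model improves across iterations without weight updates.
        answer = call_model(
            f"Task: {task}\nPrevious answer was {feedback}. Revise: {answer}"
        )
    return answer
```

With the stub, the first draft ("v1 answer") is judged "too vague", revised once, and the second draft passes the critique; a real deployment would replace the stub with actual model calls and a task-specific critique prompt.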
Multi-agent debate (MAD) approaches take collaboration further by having multiple distinct LLM agents discuss and reason about problems together; such debates have been shown to improve factuality and reasoning in language models. Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate proposed a debate framework with "angel" and "devil" agents to encourage divergent thinking. The CAMEL framework explored role-playing between agents with distinct personas.
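A minimal sketch of one such debate round, loosely following the "angel"/"devil" setup: one agent defends a candidate answer, the other challenges it, and a judge settles the outcome. All three roles are hypothetical stubs standing in for separate LLM instances with distinct personas.

```python
# Sketch of a multi-agent debate (MAD) loop: devil challenges,
# angel defends, judge aggregates the transcript into a verdict.

def devil(question: str, history: list[str]) -> str:
    # Divergent-thinking role: always raises an objection.
    return "Consider edge cases that contradict answer A."

def angel(question: str, history: list[str]) -> str:
    # Defender role: responds to the most recent objection.
    last = history[-1] if history else "no objections yet"
    return f"I maintain answer A, addressing: {last}"

def judge(question: str, history: list[str]) -> str:
    # A real judge model would weigh both sides of the transcript.
    return "A"

def debate(question: str, rounds: int = 2) -> str:
    history: list[str] = []
    for _ in range(rounds):
        history.append(devil(question, history))
        history.append(angel(question, history))
    return judge(question, history)
```

The design point is that divergence is built in structurally: the devil's objections force the angel to justify or revise the answer each round, rather than relying on a single model to critique itself.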