The longer a conversation goes, the more likely it is that a large language model (LLM) will go astray. A research paper from Philippe Laban, Hiroaki Hayashi, Yingbo Zhou, and Jennifer Neville finds that in multi-turn exchanges, most models lose aptitude and their unreliability skyrockets:
> We find that LLMs often make assumptions in early turns and prematurely attempt to generate final solutions, on which they overly rely. In simpler terms, we discover that when LLMs take a wrong turn in a conversation, they get lost and do not recover.
Effectively, these models talk when they should listen. The researchers found that LLMs generate overly verbose responses, which leads them to…
- Speculate about missing details instead of asking questions
- Propose final answers too early
- Over-explain their guesses
- Build on their own incorrect past outputs (see the sketch after this list)
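That last failure mode is partly mechanical: with typical chat APIs, every turn re-sends the full conversation history, so the model's own earlier guesses become part of every later prompt. Here is a minimal sketch, assuming the OpenAI Python SDK; the model name, the prompts, and the `ask()` helper are illustrative, not taken from the paper:

```python
# Minimal multi-turn chat loop: each turn re-sends the entire history,
# so an early wrong guess by the model stays in context for every
# later turn. Model name, prompts, and ask() are illustrative.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

messages = [{"role": "system", "content": "You are a helpful assistant."}]

def ask(user_text: str) -> str:
    """Append the user turn, call the model, and keep its reply in the history."""
    messages.append({"role": "user", "content": user_text})
    resp = client.chat.completions.create(
        model="gpt-4o",        # illustrative model name
        messages=messages,     # full history, including the model's past guesses
    )
    reply = resp.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    return reply

# An underspecified first turn invites the model to guess at the missing
# details; that guess is then part of the prompt for every turn after it.
print(ask("Write a function that parses the log file."))
print(ask("Actually, the logs are JSON lines, not CSV."))
```

Nothing in this loop prunes or corrects the history: once a premature guess lands in `messages`, every later reply is conditioned on it, which is the dynamic the list above describes.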
The takeaway: these aren’t answer machines or reasoning engines; they’re conversation engines. They are great at interpreting a request and at generating stylistically appropriate responses. What happens in between can get messy. And sometimes, the more they talk, the worse it gets.