Agreed; that’s why I was very careful to say “one approach.” I suspect that technique exploits a feature of the LLM’s sampler that penalizes repetition. This simple rule is effective at stopping the model from going into linguistic loops, but appears to go wrong in the edge case where the only “correct” output is a loop.
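For the curious, here is a minimal sketch of the kind of penalty I mean (the function names and exact rule are my own illustration; real samplers vary, but this is close to the commonly used CTRL-style repetition penalty):

```python
import numpy as np

def apply_repetition_penalty(logits, generated_token_ids, penalty=1.2):
    """Make tokens that have already been generated less likely.

    Positive logits are divided by the penalty and negative logits are
    multiplied by it, so a previously seen token's probability always drops.
    """
    logits = logits.copy()
    for token_id in set(generated_token_ids):
        if logits[token_id] > 0:
            logits[token_id] /= penalty
        else:
            logits[token_id] *= penalty
    return logits

def sample_next_token(logits, generated_token_ids, penalty=1.2, temperature=1.0):
    # Penalize repeats, then sample from the softmax of the adjusted logits.
    # Note the failure mode: if the genuinely correct continuation repeats a
    # token many times, the penalty keeps pushing the model away from it.
    penalized = apply_repetition_penalty(np.asarray(logits, dtype=float),
                                         generated_token_ids, penalty)
    probs = np.exp(penalized / temperature)
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))
```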
There are certainly other approaches that work on an LLM but wouldn’t work on a human, similar to how you might be able to get an autonomous car’s vision network to detect a “stop sign” in what looks to us like a field of random noise. This can be exploited for productive ends too; I’ve seen LLM prompts that look like densely packed nonsense to me but produce very helpful results.