One of the tasks that we fine-tuned the model on is ConvAI2 (or "Persona-chat"), which specifically aims to improve the model's consistency by conditioning its responses on a given persona. See here: https://arxiv.org/abs/1801.07243. In this research we found that conditioning on a persona improves consistency, but the models still aren't always perfectly consistent. In particular, the model can only see a few turns back in the dialogue history, so if the conversation runs long enough, it may contradict something it said earlier that has fallen out of its visible context.
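
For anyone curious what "conditioning on a persona" looks like in practice: roughly, the persona sentences are prepended to the (truncated) dialogue history before the whole thing is fed to the model. Here is a minimal sketch of that idea in Python; the separator token and the max_history parameter are my own illustrative choices, not the actual ConvAI2/ParlAI preprocessing:

    def build_model_input(persona, history, max_history=4, sep=" <sep> "):
        """Prepend persona sentences to the last `max_history` dialogue turns.

        The model never sees turns older than `max_history`, which is why a
        long conversation can drift into self-contradiction.
        """
        recent_turns = history[-max_history:]  # older turns are simply dropped
        persona_block = " ".join(persona)      # e.g. "i have two dogs. i am a teacher."
        return persona_block + sep + sep.join(recent_turns)

    # Usage: turn 2 ("yes, two dogs!") falls outside the truncated history,
    # so nothing except the persona stops the model from contradicting it later.
    persona = ["i have two dogs.", "i work as a teacher."]
    history = ["hi, do you have pets?", "yes, two dogs!",
               "nice. what do you do?", "i teach high school.",
               "cool. so, any pets?"]
    print(build_model_input(persona, history))
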

As far as facts go, we also fine-tuned the model on the Wizard of Wikipedia task (https://arxiv.org/abs/1811.01241), which helps improve its knowledgeability. Again, it still isn't perfect.
