Thanks very much for linking that. This makes me think that legal support is actually one of the worst possible uses of LLM-based AI (at least as implemented here), primarily because so much of the source material is directly contradictory, e.g. a legislature passes a law which is subsequently overturned by the courts, or decisions in lower courts are reversed by higher courts, or higher courts reverse themselves over time. It feels like you'd absolutely have to annotate all the source material in some way to say whether it was still controlling law/precedent.
Your instinct is correct. The major legal research providers (Thomson Reuters and LexisNexis) both provide “citators”, which are human annotations of which cases and statutes have been overruled, upheld, criticized, etc. One of the issues the paper describes is the fairly ham-handed way this gets integrated into these systems, causing even more trouble.
Pretty much the same experience when I try to get an LLM to output correct code targeting a particular library: its training data contains various conflicting versions of the library, and the result is an incompatible mix of code written for different versions thrown together.
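A minimal sketch of the kind of version drift I mean, using pandas purely as an illustration (the commenter didn't name a library): `DataFrame.append()` was removed in pandas 2.0 in favour of `pd.concat()`, so a model trained on both eras will happily emit either form, and mixing them breaks on one version or the other.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2]})
row = pd.DataFrame({"a": [3]})

# A model trained across library eras may emit either idiom;
# only one of them works on any given installed version.
if int(pd.__version__.split(".")[0]) < 2:
    df = df.append(row, ignore_index=True)        # pandas 1.x only; removed in 2.0
else:
    df = pd.concat([df, row], ignore_index=True)  # current API

print(df)
```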