Optical character recognition and audio transcription are improving at such a pace that I don't think this will be a significant barrier for future historians. Even now, on a computer with modest resources (e.g. a laptop without a dedicated GPU), whisper.cpp makes it practical to transcribe hours of podcast audio or other speech. And the transcription only needs to be done once.
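For anyone curious what "practical on a laptop" looks like, here's a minimal sketch against whisper.cpp's C API. Caveats: the exact function names have shifted between versions (this follows relatively recent headers), and the file names and the ffmpeg preprocessing step are illustrative assumptions, not a recipe from the repo.

```c
// Rough sketch of batch transcription with whisper.cpp's C API.
// Assumes the library is built, a ggml model has been downloaded
// (models/download-ggml-model.sh base.en), and the audio was already
// converted to raw 16 kHz mono float32 PCM, e.g.:
//   ffmpeg -i episode.mp3 -ar 16000 -ac 1 -f f32le episode.f32
#include <stdio.h>
#include <stdlib.h>
#include "whisper.h"

int main(int argc, char **argv) {
    if (argc != 3) {
        fprintf(stderr, "usage: %s model.bin audio.f32\n", argv[0]);
        return 1;
    }

    // Load the model once; the same context can be reused across files.
    struct whisper_context *ctx = whisper_init_from_file_with_params(
            argv[1], whisper_context_default_params());
    if (!ctx) return 1;

    // Slurp the raw float32 samples into memory.
    FILE *f = fopen(argv[2], "rb");
    if (!f) { whisper_free(ctx); return 1; }
    fseek(f, 0, SEEK_END);
    long bytes = ftell(f);
    fseek(f, 0, SEEK_SET);
    int n_samples = (int)(bytes / sizeof(float));
    float *pcm = malloc((size_t)bytes);
    if (!pcm || fread(pcm, sizeof(float), (size_t)n_samples, f) != (size_t)n_samples) {
        free(pcm);  // free(NULL) is a no-op
        fclose(f);
        whisper_free(ctx);
        return 1;
    }
    fclose(f);

    // Greedy decoding is the cheapest option on a CPU-only laptop.
    struct whisper_full_params params =
            whisper_full_default_params(WHISPER_SAMPLING_GREEDY);

    // Run the full transcription pipeline and print each decoded segment.
    if (whisper_full(ctx, params, pcm, n_samples) == 0) {
        for (int i = 0; i < whisper_full_n_segments(ctx); ++i) {
            printf("%s\n", whisper_full_get_segment_text(ctx, i));
        }
    }

    free(pcm);
    whisper_free(ctx);
    return 0;
}
```

The model load dominates startup, so transcribing an archive in one batch amortizes it, and as noted above, the output only has to be produced once.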
You would probably run into the bigger issue of how to find what is truly relevant. Will the researchers of the year 3000 be able to tell the difference between AI- or content-farm-generated clickbait and decent primary and secondary sources?
Even today we have trouble determining whether what sources from antiquity tell us is true.
This is not a novel problem; historians already deal with it. Written history is rife with lies, inaccuracies, huge holes (often artificially created), and propaganda. The scale may be larger, but the fundamental issue is the same.
The problem with AI-generated content is that it extends into areas where there wasn't much reason to lie in the first place.
As a general example, there was historically little propaganda value in lying about food, so for the most part we can trust those descriptions to be true, particularly in texts written as cookbooks. In 2024, legions of accounts churn out fake recipes for the sole purpose of getting clicks for ad dollars.
https://github.com/ggerganov/whisper.cpp