I would assume Hong Kong records would be a source, since Cantonese is the official language there (at least for English<->Cantonese). I would also assume Guangdong province would be a source of material.
For most people in Guangdong, Cantonese is at most a spoken language. They learn Standard Written Chinese with Mandarin pronunciation at school, and if they want to write down something said in Cantonese, they might substitute characters of equivalent meaning (e.g. 是 instead of 係) or of similar pronunciation for the "official" characters used in Hong Kong.
The Hong Kong government is not much different. Their actual language policy is "Chinese and English are the official languages of Hong Kong. Committed to openness and accountability, the Government produces important documents in both English and Chinese. Correspondence with individual members of the public is always in the language appropriate to the recipients. Simultaneous interpretation in English / Cantonese / Putonghua is made available to meetings of the Legislative Council and Government boards and committees as needed." https://www.csb.gov.hk/english/aboutus/org/scsd/1470.html
So they recognize Cantonese and Putonghua as different spoken forms, but only one written language. I've never seen a Hong Kong government website offer translation into both Cantonese and Mandarin; it's always just Standard Written Chinese with a choice of Traditional or Simplified characters.
Most written Cantonese content on the internet is probably produced by Hong Kongers in informal contexts such as forums, but then it's not clearly marked as such and might be mixed with Standard Written Chinese and English.
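To illustrate why picking that content out is hard, here is a minimal sketch of a naive filter that scores text by the share of Cantonese-specific characters. The marker set and the function name are my own assumptions for illustration, not taken from any existing corpus tool, and the approach would miss text where writers substitute Standard Written Chinese characters (e.g. 是 for 係), as described above.

```python
# Hypothetical marker set: characters common in written Cantonese
# but rare in Standard Written Chinese. Illustrative, not exhaustive.
CANTONESE_MARKERS = set("係嘅咗唔喺嘢哋啲冇佢睇")


def cantonese_marker_ratio(text: str) -> float:
    """Return the fraction of CJK characters in `text` that are markers."""
    # Only count characters in the basic CJK Unified Ideographs block,
    # so mixed-in English and punctuation do not dilute the score.
    cjk = [ch for ch in text if "\u4e00" <= ch <= "\u9fff"]
    if not cjk:
        return 0.0
    hits = sum(1 for ch in cjk if ch in CANTONESE_MARKERS)
    return hits / len(cjk)
```

A forum post would then be kept if its ratio exceeds some threshold chosen by inspecting real data. Any such threshold is a judgment call, precisely because the text is often mixed.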
It's interesting and sad to see the forced assimilative process of erasing written Cantonese. I remember HK in the late 90s still had newspapers that published in written Cantonese. Just across the border in Shenzhen, without the British influence and prior to the explosion of industry and tech in the early 2000s, you could still see nonstandard signage in Cantonese in store windows. I think getting rid of spoken Cantonese is likely a generational effort, not something that can be done in a decade or so, but I've both experienced and done field work on how the Wu dialects were more or less systematically erased from official, and now even private, realms. The Shanghai variety, itself developed only in the early 1800s from a pidgin of the Suzhou and Nanjing varieties mixed with northern influences, is actually quite well documented in writing by foreign sources. A pidgin that developed from it together with English and Portuguese also survives in English sources, and has been studied in great detail by Chinese authors writing in English, but not in Chinese to anything close to the same degree. Starting with the millennial generation, speaking the dialects in schools, even outside of class, became subject to punishment. With public education enforcing the rule from the pre-kindergarten level, across two or three generations even those whose first language is one of the dialects were more or less forced to become Mandarin speakers and lost their fluency. I have little reason to doubt that something similar will simply be forced upon Hong Kong as well. Luckily sci-hub is your friend, and from a cursory search written Cantonese seems to be better represented than written Wu.
In typical academic fashion, it's behind a login wall and doesn't offer an easy way to download the whole corpus. (Understandable, given that it's based on transcribing movies that are probably still copyright-protected, but annoying.) Also, no translations.
That corpus is CC-BY licensed (yay!) and puts the download page front-and-center, so I like it. There are no translations either, but recordings are included, so it might still be useful for a project of mine.