The html tag has a "lang" attribute, and the server can also send a Content-Language HTTP header. Most CMSes these days set one or both once multilingual support is enabled.
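As a rough sketch of how a client could read those two declared signals (not how any particular browser does it): the helper name `declared_language` and the plain `headers` dict are assumptions made for illustration. The lang attribute is checked first, with the header as a fallback.

```python
from html.parser import HTMLParser


class LangAttrParser(HTMLParser):
    """Records the lang attribute of the first <html> start tag seen."""

    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "html" and self.lang is None:
            self.lang = dict(attrs).get("lang")


def declared_language(html_text, headers):
    """Return the declared language, or None if the page declares nothing.

    `headers` is a hypothetical dict of response headers.
    """
    parser = LangAttrParser()
    parser.feed(html_text)
    if parser.lang:
        return parser.lang
    # Header names are case-insensitive; the value may list several languages.
    for name, value in headers.items():
        if name.lower() == "content-language":
            return value.split(",")[0].strip() or None
    return None
```

For example, `declared_language('<html lang="vi">', {})` gives `"vi"`, and a page with no lang attribute but a `Content-Language: de, en` header gives `"de"`.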
Additionally, the browser can use the OS's or its own spellcheck word database: check every word against every dictionary, and the dictionary with the most matches is likely to be the relevant one.
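A minimal sketch of that counting idea, assuming the dictionaries are available as plain Python sets of lowercase words (the `dictionaries` mapping and the function name `guess_language` are hypothetical, and real spellcheck databases are not exposed this way):

```python
import re


def guess_language(text, dictionaries):
    """Return the language whose word set matches the most words in text.

    `dictionaries` maps a language code to a set of lowercase words.
    Returns None when no word matches any dictionary.
    """
    # Runs of Unicode letters only; digits and underscores are excluded.
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(1 for word in words if word in vocab)
        for lang, vocab in dictionaries.items()
    }
    if not scores or max(scores.values()) == 0:
        return None
    return max(scores, key=scores.get)
```

The cost grows with the number of words times the number of dictionaries, although each set lookup is cheap.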
> Additionally, the browser can use the OS's or its own spellcheck word database: check every word against every dictionary, and the dictionary with the most matches is likely to be the relevant one.
Checking every word seems excessive, especially if a page has a large amount of text on it.
I have noticed that 'certain' sites that obfuscate titles with homoglyphs are recognized as Vietnamese by Chrome. That suggests the detection is based on the actual content of the page.
If it is based on analysis done by the local machine, no problem. However, if it is based on analysis done by Google's servers, big problem!