It's not a completely useless idea. LLMs are pretty good at relating parallel concepts to each other. If we could annotate the whale speak with behavioral data, we might catch something we'd otherwise have missed. Since whale calves need to start (almost) from scratch, it sounds worthwhile to tap into that process and teach an LLM alongside a real whale infant.
It's an important step though. With a whale LLM and chatbot we would have a tool to study whale language and communication actively, rather than only being able to listen to their interactions passively. I can think of all sorts of cool experiments with an algorithm that can generate whale click sounds and elicit predictable replies from actual whales.
> Training an MT model without access to any translation resources at training time (known as unsupervised translation) was the necessary next step. Research we are presenting at EMNLP 2018 outlines our recent accomplishments with that task. Our new approach provides a dramatic improvement over previous state-of-the-art unsupervised approaches and is equivalent to supervised approaches trained with nearly 100,000 reference translations. To give some idea of the level of advancement, an improvement of 1 BLEU point (a common metric for judging the accuracy of MT) is considered a remarkable achievement in this field; our methods showed an improvement of more than 10 BLEU points.
That said, this specific method does require the relative conceptual spacing of words to be similar between languages; I don't see why that would be the case for Human <-> Whale languages.
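To make the "relative conceptual spacing" point concrete, here is a minimal, self-contained sketch on synthetic data (not the method from the EMNLP paper): if two languages' word-embedding spaces really are near-rotations of each other, a plain orthogonal Procrustes fit recovers the word-to-word mapping. If whale "concept space" isn't shaped like ours, this is exactly the step that has nothing to grab onto.

```python
# Toy illustration: two "languages" whose embedding spaces share the same
# relative geometry can be aligned with an orthogonal rotation. All data
# here is synthetic; dimensions and noise level are arbitrary.
import numpy as np

rng = np.random.default_rng(0)

# Language A: 1000 toy word vectors in 50 dimensions.
X = rng.normal(size=(1000, 50))

# Language B: the same points under an unknown rotation plus small noise,
# i.e. identical "relative conceptual spacing".
Q_true, _ = np.linalg.qr(rng.normal(size=(50, 50)))
Y = X @ Q_true + 0.01 * rng.normal(size=X.shape)

# Orthogonal Procrustes: the rotation W minimising ||XW - Y|| is U @ Vt,
# where U, S, Vt = svd(X.T @ Y). Real unsupervised MT has to bootstrap this
# alignment without any known word pairs (e.g. adversarially); here the row
# correspondence is known, which keeps the sketch short.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

def unit_rows(M):
    """Normalise rows so dot products become cosine similarities."""
    return M / np.linalg.norm(M, axis=1, keepdims=True)

# Map language A into language B's space and look up nearest neighbours.
nearest = np.argmax(unit_rows(X @ W) @ unit_rows(Y).T, axis=1)
accuracy = np.mean(nearest == np.arange(len(X)))
print(f"toy word-translation accuracy: {accuracy:.2%}")  # ~100% when geometries match
```

When the second space is generated as a rotation of the first, the recovered mapping is essentially perfect; the open question upthread is whether anything like that shared geometry exists between human languages and whale codas at all.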
No, translation from one language to another doesn't happen in a vacuum: there are millions of examples of translated text produced by humans, and without them an LLM wouldn't learn anything.
From the article:
> “I’ve no doubt that you could produce a language model that could learn to produce sperm-whale-like sequences,” Dr. Rendell said. “But that’s all you get.”