Hacker News
VÉgA – Vocabulary of Ancient Egyptian (vega-vocabulaire-egyptien-ancien.fr)
92 points by downvotetruth on Sept 15, 2023 | 23 comments



For similar large-scale efforts to make ancient-language texts available online with translations, see:

* CDLI, https://cdli.mpiwg-berlin.mpg.de/, for cuneiform (tablets, seals, and other objects)

* Perseus Digital Library, https://www.loc.gov/item/2004564290/, for ancient and classical literature (started with Ancient Greek and expanded to Latin)

If a small number of rich people donated a few million each, we’d have all tablets and other unread documents read in a short time (academic restrictions on access may be a problem), and they would look knowledgeable.


I feel like the magnificent Frank Krueger’s recent effort to use ML to translate cuneiform is relevant in this context.

https://praeclarum.org/2023/06/09/cuneiform.html


> If a small number of rich people donated a few million each, we’d have all tablets and other unread documents read in a short time

There is a really large number of unread tablets.


100k, 10 million? Please give us a guess. How many are in known languages, how many unknown languages?


The ones in known languages are more than enough to absorb hundreds of millions of dollars.

People seem to have the impression that ancient writing is rare. It isn't. Two things are rare:

- Paper that can survive for thousands of years without rotting. This is why classical documents are rare.

- The desire to get some more ancient tablets translated. This is why translations of cuneiform tablets are rare. There is far more source material available than anyone is willing to pay for.


Also, the fact that most ancient writing is boring. Yes, like Irving Finkel, you might come across a new version of the Flood legend, or the rules to an ancient form of Backgammon, but 99% of the stuff is just receipts and tax reports, maybe useful to translate as part of a student's dissertation, but of no more interest than that.



Did you know? You can use hieroglyphs right here, right now and pretty much everywhere else: 𓂺

https://en.m.wikipedia.org/wiki/List_of_Egyptian_hieroglyphs
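To make the "use them anywhere" point concrete, here is a minimal Python sketch. It assumes the sign posted above is U+130BA (written as an escape so the code does not depend on your font), and the helper names `is_egyptian_hieroglyph` and `describe` are made up for illustration. Hieroglyphs are ordinary Unicode characters in the Egyptian Hieroglyphs block, so any software that handles code points beyond the Basic Multilingual Plane can store and copy them; whether they display depends on having a font with the glyphs.

```python
import unicodedata

# Egyptian Hieroglyphs block in Unicode: U+13000..U+1342F
BLOCK_START, BLOCK_END = 0x13000, 0x1342F


def is_egyptian_hieroglyph(ch: str) -> bool:
    """True if the single character ch lies in the Egyptian Hieroglyphs block."""
    return BLOCK_START <= ord(ch) <= BLOCK_END


def describe(ch: str) -> str:
    """Return the code point and the Unicode character name of ch."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


# Assumption: U+130BA is the sign in the comment above.
glyph = "\U000130BA"

print(describe(glyph))               # code point plus its Unicode name
print(is_egyptian_hieroglyph(glyph))
print(len(glyph.encode("utf-8")))    # bytes needed to store it in UTF-8
```

Because the block sits outside the Basic Multilingual Plane, each sign takes four bytes in UTF-8 and a surrogate pair in UTF-16, which is one reason older software sometimes mangles them.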


the unicode implementation is garbage...

𓂋𓐍

see?

also wikihiero is crappy.

meanwhile, i don't suck at rudimentary code and banged this out quickly after self-teaching to fix it after the fact: https://github.com/semiessessi/beautiful-hieroglyphs/blob/ma...

also most other tools suck.


> most other tools suck.

You think they suck because they're not coloured? But black-and-white images are standard; that's how you'll see them in dictionaries, textbooks and transcriptions.

Also, I think some hieratic texts use different inks, like red and black, and the transcription into hieroglyphs needs to convey this difference. If you use a coloured hieroglyphic font, it would be hard to transcribe these hieratic texts.


Please add images to the README!


added one. thanks for the suggestion


Also, despite there being a (well, multiple) hieroglyph with a penis, it seems that to say "penis" you need to use (at least) three hieroglyphs, of which only the last one is a penis: https://app.vega-lexique.fr/?entries=w9978

That sounds... vaguely inefficient to me? Then again, it's the same for "dog" ( https://app.vega-lexique.fr/?entries=w1699 ), so maybe I should start studying ancient Egyptian to find out what's going on here... maybe it's determiners, or symbols to signify the start of a noun (but then, why are they different between "penis" and "dog"?)?

Explanations are welcome!


Hieroglyphic systems like Ancient Egyptian work on a combination of phonetic and semantic (meaning-based) principles. To write a word like "penis" or "dog" you will write several symbols, some indicating the sound, and some indicating the meaning. This is still how it works today in languages that use Chinese characters. A Chinese character like 液 yè "liquid" has two parts: the first part to the left (the three dots) indicates that the meaning has something to do with water. The part to the right is phonetic and means that the word is pronounced similarly to the other word 夜 yè "night" (which character can itself be broken down into semantic and phonetic parts). In the Egyptian examples you linked, the phonetic parts are followed by a semantic part.

Furthermore, in ancient systems, there's some nondeterminism in how you can write: you can write a word using only the meaning element, or only the sound elements, or a combination of both, depending on how you feel, how much space you have, etc. You see the same kind of thing in Mayan hieroglyphics and in Chinese writing before it was standardized by Li Si in the Qin dynasty.

This kind of system is useful because, by complementing the phonetic writing with semantic symbols, you can disambiguate homophones and maybe make reading faster. But I don't think this functional reason is why systems like this emerged: usually writing systems start off fully semantic and then acquire phonetic parts as time goes on. Semantic writing seems to be the "default" somehow, if you have writing at all, and phonetic writing is something that people have to figure out over hundreds of years of cultural evolution.
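As a rough illustration of the "sound signs plus meaning sign" idea, here is a small Python sketch. The `Sign` class, its role labels, and `transliterate` are invented for this example, and the two spellings are the usual textbook ways of writing rꜥ "sun" (signs given by their Gardiner codes), not data taken from VÉgA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sign:
    gardiner: str  # Gardiner sign-list code, e.g. "D21" (mouth)
    sound: str     # consonants the sign contributes; "" if it is silent
    role: str      # "phonogram", "logogram", "determinative" or "stroke"


def transliterate(spelling):
    """Join the sounds of the signs that are read aloud.

    Determinatives and the vertical stroke carry meaning only, so they
    contribute nothing to the transliteration.
    """
    return "".join(s.sound for s in spelling if s.role in ("phonogram", "logogram"))


# Spelling 1: phonetic signs followed by a semantic sign.
full = [
    Sign("D21", "r", "phonogram"),     # mouth
    Sign("D36", "ꜥ", "phonogram"),     # forearm
    Sign("N5", "", "determinative"),   # sun disk, marks the meaning
]

# Spelling 2: the picture itself stands for the whole word.
short = [
    Sign("N5", "rꜥ", "logogram"),      # sun disk read as the word
    Sign("Z1", "", "stroke"),          # vertical stroke
]

print(transliterate(full))
print(transliterate(short))
```

Both spellings give the same reading; the same sun-disk sign is silent in the first and carries the whole word in the second, which is the flexibility described above.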


> That sounds... vaguely inefficient to me?

All languages and writing systems have built-in redundancy.

In Egyptian, you can probably get away with writing the determiner in question + vertical stroke below it, to mean the determined thing. I guess Egyptians would understand you (that's how they wrote some common words like 'sun', with sun disk + | stroke). The vertical stroke is somewhat like putting spaces around the symbol in our writing system.

But usually that's not desirable, because then you need to be extra careful when drawing your dog or penis. And most writing wasn't carved in stone; it was in a simplified hieratic style where the length of a single stroke could be very important.

So, some redundancy is useful. In Egyptian it usually means adding phonetic symbols before the image (imagine drawing a pen before a penis to show how the word sounds; then, you could also have drawn a duck before the penis image to mean a different word that sounds like "duck").


determinatives are for disambiguating words and clarity. everything is phonetically spelled minus some vowels (usually assumed to be 'e').

sounds can be overspecified for clarity. some of the symbols have multiple consonants.

here is something short i made whilst working toward automatic transcription, transliteration and translation

https://imgur.com/a/KjkYKZS

4 of the glyphs here are silent, despite having phonetic values (or you could look at it as different glyphs being silent...)

interesting point about the penis glyphs is that Google Noto font project has purposefully decided to leave it as tofu, so much for "no more tofu". Microsoft have done the same with their Segoe font...


Isn't this (𓂸) penis and just one symbol?



𓁏


A million?


fascinating. the software landscape of egyptology is mostly a barren wasteland with a few superstars dotted around (Serge Rosmorduc stands out!)

good to see such fine work!


Modern Egyptian vocabulary: A7a


A7a fe3lan



