Alar: The making of an open-source dictionary (zerodha.tech)
126 points by ronakjain90 on Sept 28, 2020 | 17 comments



I read the whole article; it was well written and fascinating.

But those who don't read to the end will miss out on a quite cool link that OP posted: https://github.com/knadh/dictmaker (He wrote an OS project that runs all the infrastructure for your very own dictionary website.)


For the confused reader: s/OS/OSS/


Thank you for creating a dictionary for my first language, Kannada. Really fascinated to see the work of V. Krishna.

Seeing this made me very happy, thank you.


> So, that is the story of Alar and V. Krishna, the beauty of open data, and the incredible and infinite ways in which tiny, random events such as an overheard conversation, changes timelines, the Butterfly effect.

Awesome article, thank you for sharing!


This is great. Do we know of other open source dictionaries?

I know about https://www.dicts.info/ (which is itself a compilation of multiple sources, some open sourced by universities, some more shady)


To name a few (each has data that can be downloaded):

- Jibiki.fr (Japanese-French)

- CFDict (Chinese-French) https://chine.in/mandarin/dictionnaire/CFDICT/

- 教育部臺灣閩南語常用詞辭典 (Taiwanese Hokkien) https://twblg.dict.edu.tw/holodict_new/ https://github.com/g0v/moedict-data-twblg

- 台日大辭典 (Taiwanese Hokkien-Japanese) https://github.com/fhl-net/Lim-Chun-iok_2008_Tai-jip-Tua-su-...

- Littré XML (French) https://www.littre.org/faq

- 重編國語辭典修訂本 (Chinese)

Note that they all have different, sometimes incompatible licenses. In particular, dictionaries from Taiwan's Ministry of Education usually don't allow derivatives.

There are also a lot of digitized dictionaries on Archive.org that have fallen into the public domain but would require transformation into text (the Jibiki project actually did that with the Cesselin).


Well, WordNet is often described as a combination of a dictionary and a thesaurus, structured so a computer can understand it: https://wordnet.princeton.edu/


In my experience, WordNet as an English dictionary is much worse than the dictionary (the New Oxford American Dictionary) included with macOS and iOS.


I run a Russian-English/Russian-German Open Data dictionary, https://openrussian.org


I'm sure you know it, but there is https://www.wiktionary.org/ too


My dad's German-Persian dictionary is freely available on https://farhang.im with source code on https://git.hmt.im/hmt/farhang-3. I haven't yet found the ideal way to share the SQLite database file in a current state.


One of the older and well-known ones is Jim Breen's WWWJDIC (Japanese–English):

http://nihongo.monash.edu/cgi-bin/wwwjdic

(Creative Commons Attribution-ShareAlike Licence)

The dictionary files are used in a lot of websites and apps that provide a Japanese–English dictionary.


CC-Canto[1] is an open source Cantonese-English dictionary.

[1] - https://cantonese.org/about.html


Wonderful effort, folks. You are definitely creating something that's going to last a long time and help a lot of people. Zerodha is a company I will adore from now on.


That's beautiful and moving! This is what computers are really for.


ಧನ್ಯವಾದ


(Translation: thanks)



