I went to an interesting talk once at the Boston Python meetup, where a guy figu...

projectorlochsa · on May 30, 2017

There's a very good book for Latin that uses that trick.

Goes from zero to extremely complex Latin. Whole book is in Latin, no translations.

https://www.amazon.com/Lingua-Latina-Illustrata-Pars-Familia...

The only requirement is knowledge of orthographic alphabet and how each sound is produced. Latin, fortunately, has very simple sounds compared to English or Swedish.

It took me about 2 years to go through both parts and I was amazed at how easy the journey was. Could speak and write Latin fluently without issues.

camillomiller · on May 30, 2017

Is any book like that available for contemporary languages?

mci · on May 31, 2017

Check out https://vivariumnovum.it/risorse-didattiche/propria-formazio... :)

dpfu · on May 31, 2017

This is a awesome resource, thanks a lot! Some of these books are of course quite old. For example I would not recommend trying to learning contemporary German using this [1] book. Reading 'fraktur' alone can be rather challenging …

[1] [Worman, J. H.: Erstes Deutsches Buch, nach der Natürlichen Methode, für Schule und Haus, American Book Company, New York 1880](https://vivariumnovum.it/edizioni/libri/dominio-pubblico/Wor...)

camillomiller · on May 31, 2017

Holy moly, you made a languages nerd's day

wofo · on May 31, 2017

I learned Dutch using a similar approach (https://www.amazon.com/Delftse-Methode-Nederlands-voor-buite...)

SilasX · on May 31, 2017

Don't you have to have the translations at some point, or at least some side channel (i.e. an illustration that the phrase refers to), in order to "ground the symbols"?

schoen · on May 31, 2017

I have this book and I can tell you that it does use illustrations quite a bit (although most vocabulary is ultimately probably defined using other words). The very first chapter presents a labeled map of the Mediterranean region and begins:

"Rōma in Italiā est. Italia in Eurōpā est. Graecia in Eurōpā est. Italia et Graecia in Eurōpā sunt. Hispānia quoque in Eurōpā est. Hispānia et Italia et Graecia in Eurōpā sunt. Aegyptus in Eurōpā nōn est, Aegyptus in Āfricā est. Gallia nōn in Āfricā est, Gallia est in Eurōpā. Syria nōn est in Eurōpā, sed in Asiā. Arabia quoque in Asiā est. Syria et Arabia in Asiā sunt. Germaia nōn in Asiā, sed in Eurōpā est. Britannia quoque in Eurōpā est. Germānia et Britannia sunt in Eurōpā."

There are marginal notes highlighting things that the author wants you to notice or learn from the examples. Especially at the beginning, the marginal notes often do not discuss things in complete sentences but simply highlight particular grammatical features; for example the notes to the part I just quoted say "-a -ā: Italia...; in Italiā", which is supposed to make you realize that somehow the ending -a changes to -ā when something is "in" something (which later will be revealed to be the Latin ablative case), and "est sunt: Italia in Eurōpā est; Italia et Graecia in Eurōpā sunt", which is supposed to make you realize that sunt 'are' is the plural of est 'is'.

SilasX · on May 31, 2017

I want that, for every language I want to learn!

Edit: Before anyone says it, in my original comment, that should have been an e.g. instead of an i.e.

mijoharas · on May 31, 2017

Thanks to you I just learnt the difference between those two. Thanks! (Also writing this comment reminded me of this https://xkcd.com/1053/ )

stefanwlb · on May 31, 2017

Are you aware of something similar for Biblical Greek?

peterburkimsher · on May 31, 2017

stefan - I'm building http://pingtype.github.io for studying Chinese, and reading the Bible every day to practice.

I already forked the code to make a version to help Chinese speakers learn English.

I thought about Biblical Greek & Hebrew & Aramaic & Latin, but I wasn't sure if there's a market. Evidently there is! The biggest challenge is making a good dictionary. If I write the code, could you help to fix the dictionary?

nandemo · on May 31, 2017

How about this?

1. Get a frequency list. The most common word's rank is 1, the second is 2, etc. [0]

2. Then use your favorite Spaced Repetition Software (such as anki) to learn the words in that order.

3. Define a sentence's difficulty as the maximum rank over all its words. You could refine it by adding tie-breakers but I think it doesn't matter. Then sort the sentences in order of difficulty.

[0] See https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists

cmmike · on May 31, 2017

Clozemaster already offers this for over 100 language pairings via the Fluency Fast Track feature, https://www.clozemaster.com.

Clozemaster's a game to learn and practice a language in context. The objective is to fill in the missing word in a given sentence for thousands of sentences. The missing word is the most difficult word in the sentence according to a frequency list for the language, and the Fluency Fast Track feature allows you to play a sentence for each unique missing word in order of difficulty like you described.

ramblerman · on May 31, 2017

I just spent an hour playing the fast track.

As a Spanish learner with a strong foundation but struggling to get over the next hurdle this is just perfect. Thanks for this link!

cmmike · on May 31, 2017

Awesome - glad to hear!

csa · on May 31, 2017

Great idea but...

The word "run" is a relatively high frequency word. How many different meanings for "run" does a learner need to know? Is that in isolation? With collocations? As phrasal verbs?

In many languages, the most frequently appearing words also have the most varied meanings. Interestingly, many highly vernacular languages also use relatively few words, but those words have a lot of meanings that are clearly known by the speech communities.

FWIW, some theoretical linguists consider this a non-issue. People in the field (i.e., people might die if I get this wrong) know otherwise.

While your idea sounds nice in principle, I hope you will accept the idea that reality may be slightly more complex.

* Collins Cobuild has 50+ meanings for "run" if phrasal verbs are included. Most non-native speakers are not even aware of the potential breadth of meanings it offers.

nandemo · on May 31, 2017

It's not just an idea, I've used it to learn about 2000 Hebrew words, albeit in a somewhat different form and along with other methods/materials.

I'm a somewhat experienced language learner. I'm fully aware that words have multiple meanings but that's not as problematic as you think. A lot of distinct meanings of "run" are related, so it's not like you need to memorize every one individually. Besides, they aren't all equally important. In many cases, those different meanings are paralled in my native language (or another language I know), e.g you can translate "run" as "correr" in "he runs", "the river runs..." and "they ran the risk...". More to the point, I always study words in context. In this method the context is provided by the rest of the sentence.

microcolonel · on May 31, 2017

The blessing of language is that word frequency follows a power law. The first couple hundred words cover much of the language anyway.

_pvxk · on May 31, 2017

This is https://en.wikipedia.org/wiki/Zipf%27s_law

The other side of the coin is that the long tail is long:

the frequency of any word in a corpus is inversely proportional to its rank in the frequency table. For large corpora, about 40% to 60% of the words are hapax legomena [words appearing exactly once], and another 10% to 15% are dis legomena[words appearing exactly twice]. Thus, in the Brown Corpus of American English, about half of the 50,000 words are hapax legomena within that corpus. https://en.wikipedia.org/wiki/Hapax_legomenon

So the chance of seeing a hapax in any given sentence is really high.

What's worse, the more rare words in a sentence are often the important content words. When I try to decipher sentences in some language I don't know much of, what I end up understanding is often something like "And then he XXXXX-ed the YYYYY just like that hahaha" (I understood 8/10 words! And I even know that word 4 is a past tense verb! But not at all the meaning).

(Not that you shouldn't study the most frequent first, that's still a good rule.)

nandemo · on June 2, 2017

> (I understood 8/10 words! And I even know that word 4 is a past tense verb! But not at all the meaning).

That's an important point. Knowing 80% of the words in a text doesn't mean you understand 80% of its meaning, that is, you probably wouldn't get high marks on a basic comprehension test.

I've seen some studies that indicate you need to know at least (around) 95% of the words in a text in order to understand it "enough" of it. (I don't have the links right now but could look it up at home if you're interested).

microcolonel · on May 31, 2017

Right, though I think the long tail is beyond the line between language and culture. There comes a point where additional words are not a matter of understanding utterances, but of following culture.

Effectively none of the English-speaking world would bother to say "Sochi" without the olympics, but they knew the English language and had enough culture to understand from context that it is a place in Russia which hosted the 2014 winter olympics.

If you know enough of the language that you can ask "what's that?" at a non-disruptive rate in conversation (or look it up quickly in a dictionary or encyclopedia), I think that counts as good enough.

biesnecker · on May 31, 2017

I built something like this for Mandarin Chinese at a company I worked for in Shanghai, but the company was acquired and sort of put out to pasture, and it never launched. :-(

Essentially, we took already word-segmented dialog (splitting Chinese sentences into individual words is non-trivial, so having it already segmented was super useful), matched it to words that you knew, and suggested the next lesson you should learn by the percentage of vocabulary that would be new or challenging for you. It was pretty awesome, would love to have another shot at it someday.

LeifCarrotson · on May 31, 2017

I would love to have a chance to try it! A couple hundred hours into learning Chinese, and that sounds useful. Any chance you can release it?

....if not, what would you consider your best competition?

biesnecker · on May 31, 2017

Sadly this was about six years ago. The code and the platform it was built on are long gone. I don't know of anyone else who is doing exactly the same thing, though Allset Learning in Shanghai (run by a friend and fellow ex-employee of the company I built this thing at) is making graded readers built around similar ideas. Not really the same as an adaptive system, though.

peterburkimsher · on May 31, 2017

I'm working on http://pingtype.github.io which also does word spacing, and literal translations.

Faaak · on May 31, 2017

I would be really interested in a German version of this.

I wanted to do more a less the same: a "translator" that translates a German text content into a german text but by replacing words you don't know (e.g. extracted from memrise) into words you know. That way you can start reading texts of your foreign language without looking at your dictionary every sentence.

tdeck · on May 31, 2017

Years ago I found a great example of this on the letter level for learning Cyrillic - takes only a couple minutes to run through and it's really satisfying:

http://www.alphadictionary.com/rusgrammar/alphabet.html

avereveard · on May 31, 2017

don't Assimil courses work like that? not word by word but definitely builds up on common words to reach ever complex sentences