Don't you have to have the translations at some point, or at least some side channel (e.g. an illustration of what the phrase refers to), in order to "ground the symbols"?
I have this book and I can tell you that it does use illustrations quite a bit (although most vocabulary is ultimately probably defined using other words). The very first chapter presents a labeled map of the Mediterranean region and begins:
"Rōma in Italiā est. Italia in Eurōpā est. Graecia in Eurōpā est. Italia et Graecia in Eurōpā sunt. Hispānia quoque in Eurōpā est. Hispānia et Italia et Graecia in Eurōpā sunt. Aegyptus in Eurōpā nōn est, Aegyptus in Āfricā est. Gallia nōn in Āfricā est, Gallia est in Eurōpā. Syria nōn est in Eurōpā, sed in Asiā. Arabia quoque in Asiā est. Syria et Arabia in Asiā sunt. Germānia nōn in Asiā, sed in Eurōpā est. Britannia quoque in Eurōpā est. Germānia et Britannia sunt in Eurōpā."
There are marginal notes highlighting things that the author wants you to notice or learn from the examples. Especially at the beginning, the marginal notes are often not complete sentences but simply highlight particular grammatical features. For example, the notes to the part I just quoted say "-a -ā: Italia...; in Italiā", which is supposed to make you realize that the ending -a somehow changes to -ā when something is "in" something (the -ā form will later be revealed to be the Latin ablative case). They also say "est sunt: Italia in Eurōpā est; Italia et Graecia in Eurōpā sunt", which is supposed to make you realize that sunt 'are' is the plural of est 'is'.