Semantics at this point: "data-generating process" is another term for the same idea. A model seeks to mimic or match the real process to a reasonable approximation; hence a "true model" would be the scoring engine or data relationship that represents reality completely.
Unless you have data, that is. The existence of the data is itself the basis for assuming that some process generated it. The only alternative is that the data has existed for all eternity and therefore could never have been collected.
Suppose I take the function y = log(x) and add random white noise, so that y = \log(x) + \epsilon. The function \log() together with the parameters of the white-noise process is the data-generating process. We could then fit a model y = \beta X + \epsilon and compare the "true" (first) model to our second model. When the natural world generates our data, the idea is the same: there is a process that generates the data, and the data reveals information about that process to an approximate degree.
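As a minimal sketch of that comparison (the noise level, sample size, and range of x here are arbitrary illustrative choices, not anything prescribed above), one could simulate data from y = \log(x) + \epsilon and fit the linear model to it:

```python
import numpy as np

rng = np.random.default_rng(0)

# "True" data-generating process: y = log(x) + white noise
n = 200
x = rng.uniform(1, 10, size=n)
y = np.log(x) + rng.normal(scale=0.2, size=n)

# Candidate model: y = beta0 + beta1 * x + epsilon, fit by least squares
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Compare the residual error of the fitted linear model
# against the error of the true process itself
rss_linear = np.sum((y - X @ beta) ** 2)
rss_true = np.sum((y - np.log(x)) ** 2)
print(f"linear fit RSS: {rss_linear:.2f}, true process RSS: {rss_true:.2f}")
```

With choices like these, the linear fit's residual error typically sits above that of the true process, because a straight line can only approximate the curvature of log(x); that gap is the sense in which the data reveals the generating process only approximately.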
Sure! Most of the equations you find in your nearest physics or chemistry book are validated experimentally/empirically, and that validation amounts to approximating the data-generating process better and better.
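In the same spirit, here is a hypothetical sketch (the experiment, numbers, and noise level are invented for illustration): treat t = \sqrt{2h/g} as the data-generating process behind a free-fall experiment and recover g from noisy timing measurements.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical free-fall experiment: t = sqrt(2h / g) plus measurement noise
g_true = 9.81
heights = np.linspace(1.0, 20.0, 40)  # drop heights in metres
times = np.sqrt(2 * heights / g_true) + rng.normal(scale=0.02, size=heights.size)

# Estimate g by least squares through the origin on t^2 = (2/g) * h
slope = np.sum(heights * times**2) / np.sum(heights**2)
g_hat = 2 / slope
print(f"estimated g: {g_hat:.2f} m/s^2")
```

The estimate lands close to 9.81 because the assumed equation approximates the (simulated) generating process well; a badly mis-specified equation would leave systematic residuals no matter how much data were collected.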