There’s really no harm in doing that, and it’s still a pretty good idea.
I generally try to get my data sources as far as possible within the database, then leave framework/language-specific things to the last step. That means, if nothing else, someone picking up your dataset in a different language or framework toolset doesn't need to pick up yours as a dependency, and you're not spending time re-implementing what a database can already do (and can do more portably).
The only downside to letting the database do some of the pre-processing is that I don't have the full raw data set to work with in either R or Python. If I decide I need an existing measure aggregated to a different level, or a new measure entirely, I have to go back to the database and bring in an additional query, so I have less flexibility within the R or Python environment. But you make a good point: there are trade-offs either way, and keeping the dataset as something like a materialized view on the database makes it a little more open to others' usage (a rough sketch of that below).
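To make the materialized-view idea concrete, here's a minimal sketch assuming a PostgreSQL warehouse with a hypothetical "orders" table, using SQLAlchemy and pandas on the Python side; the connection string, table, and column names are all placeholders, not anything from the discussion above.

    # A minimal sketch, assuming a PostgreSQL warehouse with a hypothetical
    # "orders" table; connection string and column names are placeholders.
    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine("postgresql://user:password@localhost:5432/analytics")

    # Let the database do the aggregation once and expose it as a
    # materialized view that any language/toolset can read.
    with engine.begin() as conn:
        conn.exec_driver_sql("""
            CREATE MATERIALIZED VIEW IF NOT EXISTS monthly_sales AS
            SELECT date_trunc('month', order_date) AS month,
                   region,
                   sum(amount) AS total_amount
            FROM orders
            GROUP BY 1, 2;
        """)

    # The last, framework-specific step: pull the pre-aggregated view into
    # a DataFrame (or into R via DBI) for modelling and plotting.
    monthly = pd.read_sql("SELECT * FROM monthly_sales", engine)

The nice part is that anyone else (R via DBI, a BI tool, another pipeline) can read monthly_sales directly without caring what language the view's consumer happens to use.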