Do pandas and NumPy count as DSLs? This always confuses me. I think they're just libraries, with the same language and the same semantics, but different data structures.
This is a good question: at what point does an extensive abstraction, a library built on top of a language to make it accessible for a specialized use case, turn into a "DSL"?
I also never saw NumPy/pandas as a DSL, but rather as an extensive layer on top of Python whose complexity is largely a result of the use case, rather than of an attempt to be a full DSL on top of the language, à la MATLAB for Python.
This is likely one of those scenarios where DSLs are one of many options a language offers for solving a particular set of problems, but they are hard to identify in practice. Not to mention that it often doesn't make sense to develop a full DSL layer, yet the ease of creating them in some languages makes it a commonly abused trope (much as OO concepts get applied to everything, even where older solutions are far superior).
It's difficult to tell the functional utility of an abstraction apart from purely aesthetic optimization, so I wouldn't be quick to blame negligence; the harder problem is communicating the best tool for the job on a language-by-language basis.
I'd even go as far as arguing that the fact that pandas/NumPy isn't a DSL causes some of its awkwardness, e.g. the fact that you have to use `&` for `and` in pandas, and that you have to parenthesize expressions like `df[(df.a == 7) & (df.b == 2)]` instead of writing `df[df.a == 7 & df.b == 2]`, because in Python `&` binds more tightly than `==`, so `7 & df.b` would be evaluated first. We could even have special dataframe scoping rules like `df[a == 7 and b == 2]`, but we have to write `df.a` instead, exactly because pandas is NOT a DSL.
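To see how Python actually groups the unparenthesized form, here is a small sketch using only the standard-library `ast` module. It only parses the expressions, so `df` is just a name inside a string and pandas isn't needed.

```python
import ast

# Parse only; `df` is never evaluated, so pandas is not required here.
unparenthesized = ast.parse("df.a == 7 & df.b == 2", mode="eval").body
parenthesized = ast.parse("(df.a == 7) & (df.b == 2)", mode="eval").body

# `&` binds more tightly than `==`, so the first form is one chained
# comparison: df.a == (7 & df.b) == 2
chained_parts = [ast.unparse(unparenthesized.left)] + [
    ast.unparse(node) for node in unparenthesized.comparators
]
print(type(unparenthesized).__name__, chained_parts)

# With parentheses, the top-level node is the `&` joining two comparisons.
print(type(parenthesized).__name__, type(parenthesized.op).__name__)
```

With real pandas Series, that chained comparison implies an `and` between two Series, which is why the unparenthesized form fails with a "truth value of a Series is ambiguous" `ValueError` instead of quietly returning the wrong rows.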
Take NumPy's `array1 / array2` where the second array has 0s and NaNs in it. NumPy has overridden division so that dividing by 0 or NaN (a floating-point value NumPy exposes as `np.nan`) yields `inf` or `nan` with a warning instead of raising an exception, and that's in addition to vectorization.
Moreover, you're encouraged not to iterate (which is generally a lot slower) if you can help it when using these libraries.
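A rough sketch of that difference, assuming NumPy is installed (the array values are made up for illustration): the same division done with plain Python floats in a loop, and then as one vectorized NumPy expression.

```python
import numpy as np

numerators = np.array([1.0, 2.0, 0.0, 4.0])
denominators = np.array([2.0, 0.0, 0.0, np.nan])

# Plain Python floats in an explicit loop: dividing by zero raises.
try:
    plain = [n / d for n, d in zip(numerators.tolist(), denominators.tolist())]
except ZeroDivisionError:
    plain = None

# NumPy: one vectorized expression, no explicit loop. Dividing by zero gives
# inf or nan and, by default, only emits a RuntimeWarning; errstate is used
# here just to silence that warning.
with np.errstate(divide="ignore", invalid="ignore"):
    quotient = numerators / denominators

print(plain)
print(quotient)
```

So the `/` operator looks the same but behaves differently depending on the operand types, which is done through ordinary operator overloading and not through any new syntax.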
Embedded DSLs are just libraries. What makes something an embedded DSL is that it attempts to be a literate, fluent configuration language in the host language's native syntax. If it doesn't use the host language's syntax, it's not an embedded DSL; it's an external one.
You don't have to introduce new syntax to create an embedded DSL. That's the whole point of an embedded DSL: it uses the language's existing syntax. Smalltalk and Lisp are full of DSLs, as is Ruby. Of the three, only Lisp has the ability for syntactic abstraction; every Smalltalk DSL uses native syntax. See Seaside's DSL for HTML generation or Glorp's for database mappings.
I don't think you can introduce new syntax in Python and have it run as part of the language, so magic methods, decorators and metaclasses are as good as it gets. You'd have to write a parser to handle new syntax, and that makes it external, right?
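For what it's worth, here is a minimal sketch of how far magic methods alone can get you. `Field`, `Predicate` and `where` are hypothetical names made up for this example, not part of pandas or any real library; the point is only that the "query language" is ordinary Python syntax with overloaded operators.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass(frozen=True)
class Predicate:
    """A row filter that can be combined with & and | (hypothetical)."""
    test: Callable[[Row], bool]

    def __and__(self, other: "Predicate") -> "Predicate":
        return Predicate(lambda row: self.test(row) and other.test(row))

    def __or__(self, other: "Predicate") -> "Predicate":
        return Predicate(lambda row: self.test(row) or other.test(row))

    def __call__(self, row: Row) -> bool:
        return self.test(row)


class Field:
    """Names a column; comparisons build Predicates instead of booleans."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, value: object) -> Predicate:  # type: ignore[override]
        return Predicate(lambda row: row[self.name] == value)

    def __gt__(self, value: Any) -> Predicate:
        return Predicate(lambda row: row[self.name] > value)


def where(rows: List[Row], predicate: Predicate) -> List[Row]:
    """Keep the rows for which the predicate holds."""
    return [row for row in rows if predicate(row)]


rows = [{"a": 7, "b": 2}, {"a": 7, "b": 3}, {"a": 1, "b": 2}]
a, b = Field("a"), Field("b")
matches = where(rows, (a == 7) & (b == 2))
print(matches)
```

Note that the parentheses around each comparison are still required, because the host language's precedence rules can't be changed from inside a library. Dropping them, or using `and` instead of `&`, would need a separate parser.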