I used to be a fervent embedded DSL fan many years ago (particularly in college with Scheme/Racket), but since then I have grown to hate them, particularly Scala, Groovy and Ruby DSLs.
External DSLs (that is a DSL independent of the host language)... I love them. An external DSL is like a protocol or the ultimate formal API (an example would be HTTP).
Why do I dislike embedded DSLs? I guess the crux is that you are essentially mixing two languages when in reality it is just one, so you end up doing non-canonical things in the host to make the guest language look pretty (e.g. going to town with implicits, extension functions, and operator overloading in Scala). Basically a whole bunch of stuff that confuses normal developers just to make the DSL look pretty.
With Scheme/Lisp/Clojure it is OK because embedded DSLs using sexp look like lisp.
In Haskell it is sort of OK because of monads and various other things Haskell provides (laziness).
With Scala it's the absolute worst, with operator overloading, implicits, and various other gotchas.
In my experience it also gets really confusing trying to figure out what is going on when an embedded DSL breaks, and it is extremely difficult to secure an embedded DSL, i.e. `System.exit(1)` is just one call away.
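To make the security point concrete, here is a minimal Python sketch (Python is used purely for illustration; `run_config` and `exit_code_of` are hypothetical names, not part of any library). It assumes a "config DSL" whose scripts are really host-language code executed in-process, so whatever the host can do, the script can do too.

```python
def run_config(source):
    """Run a 'config script' that is really ordinary host-language code."""
    settings = {}
    # The script sees `settings`, but also every builtin, including import.
    exec(source, {"settings": settings})
    return settings


def exit_code_of(source):
    """Return the exit code a config script tried to terminate with, if any."""
    try:
        run_config(source)
    except SystemExit as exc:
        return exc.code
    return None


# A well-behaved script only touches the settings it was given.
well_behaved = run_config("settings['port'] = 8080")

# A hostile (or careless) script can try to kill the whole process.
hostile = exit_code_of("import sys; sys.exit(1)")
```

Nothing in the "DSL" stops the second script; the only reason the process survives here is that the sketch catches `SystemExit` itself.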
There are exceptions of course: DSLs that mimic external DSLs one-to-one. One example is jOOQ, as it aims to be one-to-one with an external DSL ... SQL.
Embedded DSLs often seem like the typical "bike shed coding styles" taken to the extreme. In my opinion they often value form over function, creating an entirely new (subset of the) language just because.
I can obsess about code formatting as much as anybody else but creating my own subset of the language just for a specific domain just seems excessive. It puts additional burden on anyone else trying to work with it because they now not only have to worry about the "host" language but also the idiosyncrasies of the embedded DSL.
Counter argument for statically enforced dsl (e.g. Kotlin):
I have an idea for a DSL - parse the templates for a site to generate type information about what buttons, inputs, interpolated values, etc. can be found on each page. Combine that with parsed route information to generate DSL bindings for all the pages / routes.
If we're doing that with Kotlin, then afterwards we can get a really nice, statically enforced testing api (for Selenium, etc), with IDE completion. And you don't even have to run them to make sure that the element selectors work (barring of course any bugs in your parsing setup).
I had this idea several months ago but haven't had the opportunity to use it. I'm itching to, because it seems like a feasible, if slightly insane, way to make automated end-to-end tests easier to write and maintain.
tl;dr if you're going to do some code gen for code generation, why not make the interface a DSL?
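A rough sketch of the template-parsing idea, written in Python rather than Kotlin (so the check happens at attribute-access time, not at compile time). Everything here is an assumption for illustration: `page_object` is a hypothetical helper, only `button` and `input` tags with an `id` are collected, and ids are assumed to become valid identifiers once hyphens are replaced.

```python
from collections import namedtuple
from html.parser import HTMLParser


class _IdCollector(HTMLParser):
    """Collect the ids of interactive elements found in a template."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        if tag in ("button", "input"):
            element_id = dict(attrs).get("id")
            if element_id:
                self.ids.append(element_id)


def page_object(name, template):
    """Build an object with one CSS selector attribute per element id."""
    collector = _IdCollector()
    collector.feed(template)
    fields = [element_id.replace("-", "_") for element_id in collector.ids]
    page_type = namedtuple(name, fields)
    return page_type(*("#" + element_id for element_id in collector.ids))


LOGIN_TEMPLATE = """
<form>
  <input id="username"> <input id="password">
  <button id="submit">Log in</button>
</form>
"""

login = page_object("LoginPage", LOGIN_TEMPLATE)
```

A test would then refer to `login.username` instead of a hand-written `"#username"` string, and a renamed element surfaces as a missing attribute rather than a selector that silently matches nothing.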
I'd go the other way - rather than generate code from templates, declare the dynamic parts of the page in code, and generate both the implementation and the test accessors from that (I'd use a Wicket-like approach where the HTML fragments only included ids and all the logic was in code). I haven't seen this done for the actual HTML side of things, but https://github.com/http4s/rho gives you a declarative way to, uh, declare your HTTP routes, and then you can generate swagger or a client from the exact same declarative routing construct value that you're using to run your actual routes.
Would you say this is the battle between syntax & semantics?
If you are inventing new syntax to support semantics you don't fully understand along the way, it usually ends up as some franken language.
Lisp makes this a feature by limiting what most people end up doing into one syntax, and with things like macroexpand-1, you can see what "real code" is being generated.
I failed to explain why with Lisp and Haskell it feels ok but I think you sort of nailed it with the "expansion" part. With Haskell and Lisp you think in expressions being expanded.
The DSLs in Ruby, Groovy, Scala (often but not always) are not macro expansion. Often times these DSLs are very mutable and are like a state machine (and in fact have a state machine underneath which causes a whole bunch of concurrency issues).
Would add that Haskell DSL's can be formed from functionality derived from different kinds of monads and other categorical structures[1]. That sounds highfalutin, but different types of monads and other structures (functors, applicatives, etc) give you insane amounts of power to define your DSL in a very precise way. It even gives you a formal basis to reason about your DSL should you want to go that far with it.
I think another difference, at least for Common Lisp, is that DSLs are embedded into the language itself (look at loop), so people are comfortable with using them and have some idea of how to write them.
This doesn't mean there aren't people who hate stuff like loop being in CL, but it does go some way to showing why it's okay. I think the design of Lisp encourages them as well (as the previous poster said), because the entirety of Lisp is built up from an extremely limited number of instructions, so the entire language can be argued to be a DSL on top of a DSL on top of a DSL etc.
I find them very useful in Scala, because they make for a way to do "config" that you can refactor safely under the normal rules of the language. E.g. in akka-http/spray if there's a bunch of directives that I keep repeating in my HTTP routing config then I can just pull that out into a variable that's just a plain old Scala variable, and use that variable like a normal variable. Whereas if I want to pull out a repeated stanza from a routing config file I have to find out the rules for that config file and maybe it's not even possible.
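A minimal sketch of that refactoring argument, in Python instead of Scala. The combinators below (`complete`, `path_prefix`, `get`, `concat`) are hypothetical stand-ins written for this example, not the akka-http/spray API; the point is only that a repeated routing stanza can be pulled out with ordinary language tools.

```python
def complete(body):
    """Route that answers with body once the whole path has been consumed."""
    return lambda method, path: body if path == "" else None


def path_prefix(prefix, inner):
    """Route that strips prefix from the path and delegates to inner."""
    def route(method, path):
        if path.startswith(prefix):
            return inner(method, path[len(prefix):])
        return None
    return route


def get(inner):
    """Route that only lets GET requests through to inner."""
    return lambda method, path: inner(method, path) if method == "GET" else None


def concat(*routes):
    """Try each route in order and return the first non-None response."""
    def route(method, path):
        for candidate in routes:
            response = candidate(method, path)
            if response is not None:
                return response
        return None
    return route


# The repeated stanza ("/api", GET only) is extracted like any other code.
def api_get(name, body):
    return path_prefix("/api", get(path_prefix("/" + name, complete(body))))


routes = concat(api_get("users", "user list"), api_get("orders", "order list"))
```

Renaming `api_get` or changing its parameters is a normal refactoring that any IDE or linter understands, which is the property being claimed for config written as host-language values.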
The stuff Scala does can be confusing to some people, but I find a lot of it actually simplifies things. Certainly I never want to go back to a language where operators are treated as a special case differently from normal functions (at the same time there are certain Scala libraries I avoid because those libraries give their functions stupid names - but that's a problem that exists in any language). Implicits make for a nice middle-ground if you use them the right way, to represent things that would have to be either painfully verbose or completely invisible in another language but really want to be almost-but-not-quite invisible. If you use them for something that should be more explicit, or for something that's not worth tracking at all, then that can become a problem (though again a problem you'd have in any language).
A lot of the stuff certain libraries do with "embedded DSLs" in Scala is bad because it just shouldn't be a DSL at all - IMO there's no reason unit test code shouldn't just be plain old code. But the cases where you really do need a DSL more than make up for it. The wonderful thing I've found in Scala is that you never need an external config file or magic annotation or anything like that - everything is just Scala, and once you understand Scala (which, granted, probably takes a little longer than other languages) you never need to worry about understanding what some framework or external DSL is doing.
> I find them very useful in Scala, because they make for a way to do "config" that you can refactor safely under the normal rules of the language
If you see the choices as either DSL or external config file, then I can understand why DSLs are attractive.
But I found that Spray suffers because it tries too hard to be a DSL rather than just being Scala. When our team tried to do something that wasn't 100% obvious, or didn't fit with the documentation, we ended up spending a lot of time trying to understand how the DSL worked under the covers so that we could work out what we needed to do, and then work out how to turn that back into the DSL.
It's been a while since I worked with it so (a) it might have changed, (b) I'm probably remembering details incorrectly, but I recall several conversations with team-mates who were just trying to do something simple like extract a value from a request in a slightly non-standard way, where I would have to explain "no, this part is installing a spray directive, it runs at class initialisation, this is the part that runs at request time". Once you understand how spray is implemented, you can work those bits out, but the DSL doesn't make it clear at all, and if getting something to work requires that you understand the spray design, and you need to know Scala, then the DSL is just one more thing to learn, when an idiomatic standard Scala API would be clearer all round.
My experience was that Spray (and Slick) fell right into the trap of: you need to know how the tool is implemented to be productive in it, but once you know that, the DSL is no longer adding enough value to justify its awkwardness.
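The "runs at initialisation" versus "runs at request time" split can be sketched in a few lines of Python. This is not Spray's implementation; `extract_header` is a hypothetical helper, and the `events` list exists only to record when each part executes.

```python
events = []


def extract_header(name):
    # This line runs once, when the route value is being constructed.
    events.append(f"built extractor for {name}")

    def extractor(request):
        # This line runs every time a request is handled.
        events.append(f"extracted {name}")
        return request.get(name)

    return extractor


user_agent = extract_header("User-Agent")        # construction time
first = user_agent({"User-Agent": "curl"})       # request time
second = user_agent({"User-Agent": "firefox"})   # request time
```

In a DSL that hides the inner function behind block syntax, the two phases look like one piece of code, which is the confusion described above.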
> if getting something to work requires that you understand the spray design, and you need to know Scala, then the DSL is just one more thing to learn, when a idiomatic standard scala API would be clearer all round.
I found that at least it was all ordinary Scala code, so I could always click through to the definitions in my IDE - I actually did that quite a lot early on, basing my custom directives off of copy/pasting directives from the standard library. Contrast that with e.g. Jersey, where when I wanted to add a custom marshaller I couldn't even tell where to start - the existing marshallers had annotations, but the annotation was just an inert annotation, I had no idea how to get from there to the implementation code or how to hook up my own annotation that would do the same thing.
I'd be interested in an "idiomatic standard Scala API" that had no pretensions to be anything other than code, but I suspect it would end up looking very similar to Spray. It seems like every web framework in every language does some kind of routing config outside the language - even e.g. Django, you'd think Python is dynamic enough that you could just use Python, but actually the way you do the routing is registering a bunch of (string, function) pairs where the string is some kind of underdocumented expression language for HTTP routes. So from my perspective the options really are DSL or a config that's to a certain extent external, because those are the only things I've ever seen web frameworks offer - I'd be very interested to see any alternative.
Pandas & Numpy are extremely useful DSLs in Python. Being able to add vectorization to the language and use structures like DataFrames in an R-like way is very powerful. There are some good use case scenarios.
Language features like operator overloading do exist for a reason, and that's because the language designer was aware of situations where you do want to use those features.
Do pandas and numpy count as DSLs? This always confuses me. I think it's just a library, with the same language and the same semantics, but different data structures.
This is a good question, at what point does an extensive abstraction, a library on top of a language to extend the language to make it accessible for a specialized use-case, turn into a "DSL"?
I also never saw Numpy/Pandas as a DSL but rather as an extensive layer on top of Python whose complexity is largely a result of the use case rather than the result of attempting to be a full DSL on top of the language, à la Matlab for Python.
This is likely one of those scenarios where DSL's are one of many options available to particular languages to solve a particular problem set but are hard to identify in practice. Not to mention the many times it doesn't make sense to develop a full DSL layer but regardless the ease of creating them in some languages makes it a commonly abused trope (as many OO-related concepts are applied to everything where other old solutions are far superior).
It's difficult to differentiate between the functional utility vs purely aesthetic optimizations of various abstractions, so I wouldn't be quick to blame negligence as much as communicating the best tools for the job on a language-by-language basis.
I'd even go as far as arguing that the fact that pandas / numpy isn't a DSL causes some of its awkwardnesses, e.g. the fact that you have to use & for `and` in pandas, and the fact that you have to parenthesize expressions like `df[(df.a == 7) & (df.b == 2)]` instead of `df[df.a == 7 & df.b == 2]`, or Python's wonky operator precedence will try to execute `7 & df.b` first. Also we could even have special dataframe scoping rules like `df[a == 7 and b == 2]`, but we have to do `df.a` instead, exactly because pandas is NOT a DSL.
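The precedence claim can be checked without pandas by asking Python's own parser how it groups the two spellings. This sketch uses only the standard `ast` module; `parse_expr` is a small hypothetical helper, and `df` is never evaluated, only parsed.

```python
import ast


def parse_expr(source):
    """Parse a single expression and return its top-level AST node."""
    return ast.parse(source, mode="eval").body


bare = parse_expr("df.a == 7 & df.b == 2")
grouped = parse_expr("(df.a == 7) & (df.b == 2)")

# Without parentheses the whole thing is one chained comparison,
# df.a == (7 & df.b) == 2, because & binds tighter than ==.
bare_is_chained_comparison = isinstance(bare, ast.Compare) and len(bare.ops) == 2
bare_middle_is_bitand = (
    isinstance(bare.comparators[0], ast.BinOp)
    and isinstance(bare.comparators[0].op, ast.BitAnd)
)

# With parentheses the top-level node is the & of two comparisons.
grouped_is_bitand = isinstance(grouped, ast.BinOp) and isinstance(grouped.op, ast.BitAnd)
```

Since a library cannot change the host grammar, the parentheses are the price of staying plain Python.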
Numpy array 1 / array 2 where the second array has 0s and NaNs in it. Numpy has overridden division to allow division by 0 and NaN (Numpy added data type) in addition to vectorization.
Moreover, you're encouraged to not iterate (generally a lot slower) if you can help it when using these libraries.
Embedded DSL's are just libraries; what makes something an embedded DSL is that it attempts to be a literate, fluent configuration language in the host language's native syntax. If it doesn't use the host language's syntax, it's not an embedded DSL, it's an external one.
You don't have to introduce new syntax to create an embedded DSL; that's the whole point of an embedded DSL, it uses the language's existing syntax. Smalltalk and Lisp are full of DSL's, as is Ruby. Of the three, only Lisp has the ability for syntactic abstraction; every Smalltalk DSL uses native syntax. See Seaside's DSL for HTML generation or Glorp's for database mappings.
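As an illustration of "existing syntax only", here is a tiny HTML-generation sketch in Python built from nothing but function calls and keyword arguments. It is loosely inspired by the idea of an HTML-generation DSL and does not mirror Seaside's actual API; `tag` and `Markup` are hypothetical names, and the trailing underscore in `class_` is just a workaround for Python's reserved word.

```python
from html import escape


class Markup(str):
    """Marks a string as already-rendered HTML so it is not escaped again."""


def tag(name):
    """Return a function that renders an element called name."""
    def build(*children, **attrs):
        attr_text = "".join(
            ' {}="{}"'.format(key.rstrip("_"), escape(str(value)))
            for key, value in attrs.items()
        )
        body = "".join(
            child if isinstance(child, Markup) else escape(str(child))
            for child in children
        )
        return Markup("<{0}{1}>{2}</{0}>".format(name, attr_text, body))
    return build


div, h1, p = tag("div"), tag("h1"), tag("p")

# Reads like a description of the page, yet is nothing but nested calls.
page = div(h1("Orders"), p("Total < 10", class_="note"))
```

The nesting of the calls follows the nesting of the document, which is about as far as a language without syntactic abstraction can go.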
I don't think you can introduce new syntax in Python and have it run as part of the language, so magic methods, decorators and metaclasses are as good as it gets. You'd have to write a parser to handle new syntax, and that makes it external, right?
An internal DSL would have to be part of the native language. Either Python doesn't (directly) support this, or magic methods partially allow the creation of DSLs by extending the operators.
I get what you're trying to say, I think, but you should use a different term. Pandas and NumPy aren't DSLs unless you interpret the 'L' to mean Library.
It is unusual but perfectly cromulent in Python to overload the magic methods on a class to provide whatever semantics you like through operators. So to me it doesn't seem like a DSL.
There was a recipe on ActiveState's site for essentially creating new operators by defining classes that overrode default operator semantics in "both directions" if you will. So you could write:
foo <<my_op>> bar
And my_op could do whatever it wanted to with foo and bar, by overloading the left- and right-shift magic methods in the my_op class. Neat, eh? (But still not a DSL! Heh.)
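A sketch of how such a trick can work, reconstructed from the description above rather than copied from the original recipe. The `Infix` class is a hypothetical name; it relies on the fact that `foo <<my_op>> bar` is parsed as `(foo << my_op) >> bar`, and that Python falls back to the right operand's `__rlshift__` when the left operand does not know how to shift by it.

```python
class Infix:
    """Wrap a two-argument function so it can be written as x <<op>> y."""

    def __init__(self, func):
        self.func = func

    def __rlshift__(self, left):
        # Handles `left << op`: remember the left operand.
        return Infix(lambda right: self.func(left, right))

    def __rshift__(self, right):
        # Handles `partial >> right`: supply the right operand and run.
        return self.func(right)


@Infix
def my_op(foo, bar):
    return (foo, bar)


pair = 3 <<my_op>> 4
joined = [1, 2] <<Infix(lambda left, right: left + right)>> [3]
```

Both lines are ordinary shift expressions as far as the parser is concerned, which is why this still counts as operator overloading rather than new syntax.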
By definition, internal (or embedded) DSLs (a term with well established use) are valid host-language code, relying on whatever host language features exist that allow defining code that reads fluently for the application domain. That is what distinguishes them from external DSLs.
I have a theory on why people maybe overvalue embedded DSLs: many problems become trivial when the terms of a language are a natural fit for the problem domain, where 'terms' are just any abstractions (new types, instantiations of old types).
So, maybe people are assuming that this benefit derives from having a full language which is a natural fit for a problem domain? (My contention being that the terms are what's really significant and the rest can be nice, but diminishing returns + tradeoffs.)
edit: fixed phrasing. I'll also add: I think certain sub-problems in an application can be specialized enough to need something really different (e.g. SQL, Prolog), but it seems like a relatively uncommon thing.
I used to like embedded DSLs a lot too, but then I realized they (mostly) were just the builder pattern + a lot of "extra" syntax (never written anything large enough in a Lisp or Haskell). The builder pattern feels more readable and has fewer context switches when writing code.
A "nice syntax to the builder pattern" describes every single applicative based DSL I've seen on Haskell. (I think this generalizes for all theoretically possible ones, but I need more coffee for being sure.)
But you get the advantage that applicative is a standard syntax, which you don't get with most builders.
I have started intuiting it like this. I want you to imagine the concept of a "producer of values", which is a fairly broad notion. This can be
1. a list, which produces its elements one at a time;
2. a computation that might fail to produce a value, which either produces one value or none at all;
3. an I/O computation, which may ask the user for a value to produce;
4. a stateful computation, which produces a value that may depend on some hidden state;
5. a parser, which produces a value that may depend on the source data being parsed;
6. a source of pseudorandomness, which produces a value that is unpredictable;
or any other of a nearly infinite set of things. For such a producer of values, we can imagine that
A) The Functor instance says, "Hey, give me a function and I'll create a new producer that produces the values you get if you apply that function to the values I produce."
-- Example: If the producer is the infinite stream [1..] and the function is (^2), the Functor instance lets us create a new producer that produces the infinite stream of positive perfect squares.
B) The Applicative instance says, "Okay, that's cool, but you know what I can do beyond that? Give me two different producers of values, and I promise you I'll create a producer that combines values from both of the two producers you gave me. So in a sense, I am a combiner of producers."
-- Example: If the first producer is a database of previous actions the player of a game has taken, and the second producer is a source of randomness, the Applicative instance lets us create a new producer that produces an AI decision based on player action history but with some randomness thrown in to look more human.
C) The Monad instance says, "Pah, and you thought that was neat? Look what I can do! If you give me a producer and several different possible producers, I can create a new producer that chooses which of the different possible producers to run next based on the values produced by the first producer. So in a sense, I am the opposite of Applicative: I am a splitter of producers."
-- Example: if the first producer is a stateful computation that extracts the player health from the state in a game, we may have two different producers lined up to follow: one produces a commiserative message mentioning their score, and the other produces a message telling them round number n is starting. The Monad instance lets us create a new producer that delegates to either of the two depending on whether the player is dead (health <= 0) or alive (health > 0).
... I should really get around to writing this blog post ... I feel like I have rewritten it a thousand times in various comments at various points...
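The three roles described above can be sketched in Python using the simplest producer from the list, a sequence that yields its elements one at a time. This is an illustration of the intuition only, not Haskell's type classes: `fmap`, `combine` and `bind` are hypothetical helper names, and the game data is made up.

```python
from itertools import count, islice


def fmap(func, producer):
    """Functor: apply func to every value the producer yields."""
    return (func(value) for value in producer)


def combine(func, left, right):
    """Applicative: merge values from two independent producers."""
    return [func(a, b) for a in left for b in right]


def bind(producer, choose_next):
    """Monad: let each produced value decide which producer runs next."""
    return [result for value in producer for result in choose_next(value)]


# A) Squares of the infinite stream 1, 2, 3, ... (only the first five taken).
squares = list(islice(fmap(lambda n: n ** 2, count(1)), 5))

# B) Every pairing of a past player action with a variation.
decisions = combine(
    lambda action, tweak: f"{action}+{tweak}",
    ["attack", "defend"],
    ["fast", "slow"],
)

# C) The player's health decides which message producer runs next.
def message_for(health):
    return ["game over"] if health <= 0 else ["next round starting"]


messages = bind([0, 3], message_for)
```

Note that `combine` never looks at a value before deciding what to run, while `bind` does; that is the difference between the "combiner" and the "splitter" in the description.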
+1. And you have to learn the whole damn DSL, which is always badly documented and has zero tooling or support, instead of just using the regular language API, for which you are equipped.
(For those who were wondering, the article is about DSLs in Kotlin).
DSLs can be very elegant ways to improve code readability (as long as the assumptions and language they use are meaningful to the team).
This is an area that I wish node.js had invested in. Perl6 has "slangs" which are extremely powerful, see [1] for some examples that just flow elegantly. Combine that with custom operators to write code like "4.7kΩ ± 5%" that would make sense to electronicians [2] and you can get some really user-friendly syntaxes.
(Again, it bears repeating that a DSL improves readability by sacrificing generality, so a DSL is for audience that's usually a subset of that language's users. That electronics code would be useless to the average web developer.)
Wow. The Perl6 slang feature is awesome. I can't believe I missed it.
I haven't had a chance to use Perl 6 but I would really like an excuse. I love Lua's LPeg module. Perl 6 has deeply embedded PEGs (i.e. Grammars, aka parsing combinators) into its design. Knowing first-hand how powerful PEGs can be if the engine and language integration is implemented well, I've never doubted the awesomeness and elegance of Perl 6. It's unclear to me if Perl 6 Grammars' Action feature is as powerful as LPeg's capturing and inline transformation primitives[1], but the fact that you can plug a grammar into Perl 6's parser is crazy awesome.
I'm not surprised Slang exists--I've known it was possible; just surprised that it's a module and idiomatic pattern and I hadn't read about it before.
[1] See http://www.inf.puc-rio.br/~roberto/lpeg/#captures Most PEG engines just return your raw parse tree as a complete data structure and require you to fix it up manually. LPeg has really well thought-out extensions that allow you to express capture and tree transformations much more naturally; specifically, inline (both syntactically and runtime execution) with the pattern definition(s).
Having read a lot of Python code that uses scientific libraries that override operators and key indexing I'll have to disagree.
The result is easier to read if you're familiar with it but exponentially more complicated if you're not and still learning. It creates marginally shorter code but I feel like the sacrifice of the explicitness / verbosity of a normal method / function call isn't worth it.
Those libraries also make it easier to write that code. The majority of people using scientific libraries will be used to mathematical and scientific notation, so the closer the programming is to that, the better for them.
sure, that should come with a huge warning sign in the documentation. Edit: and as code should be self documenting, that warning sign could be an extra function, sure.
And now ask somebody to type those unicode chars off the top of his/her head...
Plus how do you look up what it does on Google? In the documentation?
How do you generate documentation for those?
Cause you have no method name, no class name, nothing.
And then what's the exact implementation behind it? Does it do something funny? So I have to juggle the dozens of funny details of all the code from the guys who thought that, for the 20 times they do this super-important operation, they needed to recreate a whole new syntax?
Programming languages characteristically haven’t adopted Unicode syntax because Unicode input methods aren’t “there yet”. But if we start accepting Unicode in programming languages, at least as an option with ASCII as a fallback, then input methods, searching, and so on will be forced to improve.
I currently use Emacs’ TeX input method, so I can type “\forall\alpha. \alpha \to \alpha” and get “∀α. α → α” or “4.7k\Omega \pm 5%” for “4.7kΩ ± 5%”, which isn’t too bad. And in Haskell at least there’s Hoogle, which lets you search for unfamiliar operators, even Unicode ones. For instance, searching for “∈” gives me base-unicode-symbols:Data.Foldable.Unicode.∈ :: (Foldable t, Eq a) => a -> t a -> Bool and tells me it’s equal to the “elem” function which tests for membership in a container.
Using ASCII art instead of standard symbols introduces some cognitive load for beginners as well. When I was tutoring computer science in college, I had students get confused by many little things, like “->” instead of “→” (“Minus greater than? What could that possibly mean?”) or writing “=>” instead of “>=” because “≤” is written “<=”.
I've been using Fira Code[0] which supports ligatures and (in addition to being a nice looking monospace font) it makes reading code with those symbols so much nicer. I mainly write Python and that generally uses a pretty limited set of special symbols, but it really does make a difference when you see ≤ rather than <= and 'long equals' instead of == etc. I definitely recommend it.
I'd love to see ligature/unicode support rolled out more widely, symbols allow for more easy differentiation and they're so nice to look at when you do have them.
For now, I've been using Emacs configured in a way to replace e.g. the word "lambda" with a symbol "λ" for display only - i.e. the original text stays in the text file, but on the screen, it gets rendered as a pretty version. I recall Haskell folks developed a lot of such replacements for mathematical symbols, too, but I can't get a proper Unicode font that would make them readable at the sizes I use (I prefer rather small text for code, so I can see more of it at the same time).
I’ve tried Fira Code before, and I think it goes a bit too far honestly. It’s nice that it preserves character widths, but ideally I’d like to get away from monospaced fonts as well—they’re pretty much just a holdover from the technical limitations of typewriters and text displays. The main reason I don’t use proportional-width typefaces in programming is that I like to work in a text-only terminal (to avoid distraction) and that the ASCII-art approximations of symbols tend not to look very good.
> And now ask somebody to type those unicode chars on the top of his/her head...
That's about the only problematic thing with this, though your editor should let you do it relatively easily; if it doesn't, find one that doesn't suck ;).
> Plus how do you look what it does in google ? In a a documentation ? How do you generate documentation for those ?
Like with any other piece of API code. There's nothing special here.
> And then what's the exact implementation behind it ? Does it do something funny ? So I have to mix the dozen of funny details of all the code from the guys who though the 20 times they do this super-important-operation they need to recreate a whole new syntax ?
Check the associated documentation, or source code if available.
I get the impression programmers nowadays are afraid of reading the source of stuff they use.
> > And now ask somebody to type those unicode chars on the top of his/her head...
> That's about the only problematic thing with this, though your editor should let you do it relatively easily; if it doesn't, find a one that doesn't suck ;).
On macOS, and on a qwerty keyboard:
• Ω is Option Z
• ± is Option Shift = (or Option + if you look at it another way)
Many moons ago, HyperTalk (and I think AppleScript still does) supported ≥, ≤ and ≠ as shorthands for >=, <= and !=; and they were typed as Option >, Option <, and Option = respectively (fewer keystrokes).
I don't ignore it. I refuse to accept that a developer can be allowed to forever use only the little knowledge they got on their bootcamp / university, and never learn anything on the job.
It's like when we were transitioning to Java 8 on a project at work, and someone asked me if I don't think using lambda expressions will be confusing to some people in the company. My answer was: this is standard Java now, they're Java developers - they should sit down and learn their fucking language. We should not sacrifice the quality of the codebase just because a few people can't be bothered to spend a few hours learning.
(Oh, and over time, it turned out nobody was confused for long. Some people saw me use lambdas, others saw their favourite IDEs suggesting them using lambdas instead of anonymous class boilerplate - all of that motivated them to learn. Now they all know how to use new Java 8 features and happily apply this knowledge.)
Programming is a profession. One should be expected to learn on the job. And refusing to learn is, frankly, self-handicapping.
Out of curiosity (because this is what I was hoping for from the article), has anyone found a DSL in the wild that actually achieved its objectives?
That is, it either simplified the code necessary to write a program enough that there was a net savings for the project compared with developers just writing it in a language they were familiar with, or it simplified the business logic aspect to such a degree that even non-devs were able to be productive with it?
I admit this is partly determined by what we decide a DSL is, since it is, ultimately, at some level just an abstraction. But almost every instance I'd squarely say "that was an attempt at a DSL", it became just as complex as the underlying language, with additional gotchas, if it even 'worked' at all (i.e., allowed you to solve the domain specific problems it was intended to solve).
I can give Anvil as an example (https://github.com/zserge/anvil). It's a DSL for writing Android view layouts in Java and Kotlin. The DSL looks nicer than linear code of creating views and binding their properties, its look follows the hierarchy of the views, and the way it works results in a very fast rendering mechanism.
Character escapes and string formatting (printf, String.Format in .NET) fall into this category as well. They’re weird little domain-specific sublanguages that are entirely unlike their host languages, which people don’t even think of as DSLs because they’re so pervasive and (somewhat) consistent across languages.
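A few Python one-liners show how much "language inside a string" is hiding in plain sight; each value below is computed by a mini-language (printf-style codes, the format-spec mini-language, or escape sequences) that has nothing to do with Python's own expression syntax.

```python
# printf-style: "%05d" means zero-pad an integer to width 5.
padded = "%05d" % 42

# Format-spec mini-language: ".2f" means fixed-point with 2 decimals.
rounded = format(3.14159, ".2f")

# "#x" means hexadecimal with the 0x prefix.
hexed = f"{255:#x}"

# ">6" means right-align in a field of width 6.
aligned = "{:>6}".format("ab")

# Escape sequences: backslash-t is a single tab character.
tabbed = "a\tb"
```

None of these specifiers can be refactored, type-checked or stepped through like ordinary code, yet few people think of them as DSLs.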
Likewise, Common Lisp has a powerful matching DSL library too - Optima[0]. I also found a pretty new alternative, Trivia[1], which claims to be a drop-in replacement for Optima, but more extensible.
--
It's rapidly displacing proprietary DSL's like AMPL and GAMS in operations research. Neither of those happen to be embedded in a host language, they're special-purpose. Though there are comparable tools out there embedded in Python or Matlab, they don't perform as well as JuMP or the special-purpose tools.
One in particular that my team and I have used extensively is Spock, a unit testing Groovy DSL (see http://spockframework.org/spock/docs/1.0/index.html). In particular, we make heavy use of data tables and data pipes to generate sophisticated test data that would be a huge pain otherwise. And the accessible mocking syntax makes mocking so easy that devs end up writing more tests and testing more edge cases. The systems we have that use Spock the most have incredible test coverage and are very robust, because Spock makes it very easy and quick to test all your edge cases.
I second this. Spock is the best testing framework I've ever used. Same results on my team: much better test coverage just because writing tests is so much easier and fun.
You are two users each talking about your team to promote a framework. I'll take some actual time to look at some code from the sole link here, which is the website:
    class DataDriven extends Specification {
        def "maximum of two numbers"() {
            expect:
            Math.max(a, b) == c

            where:
            a | b || c
            3 | 5 || 5
            7 | 0 || 7
            0 | 0 || 0
        }
    }
I see here a DSL that
* defines blocks without curlies by overloading the C-style label syntax normally used for break targets
* overloads the | and || operators to build tables of data without quotes around it
* uses strings as function names instead of camelCase
The Apache Groovy DSL enables that by having a complex grammar and intercepting the AST during the compile. Clojure, also for the JVM, has a simple grammar and provides macros, which are convenient for eliminating syntax in repetitive tests. I switched from Groovy 1.x to Clojure years ago for testing on the JVM.
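For comparison, the same data-driven check can be written as plain code with no labels, operator overloading or AST interception. This is a Python sketch, not Groovy or Clojure; `MAX_CASES` and `failing_cases` are hypothetical names chosen for the example.

```python
# Each row is (a, b, expected maximum), mirroring the table above.
MAX_CASES = [
    (3, 5, 5),
    (7, 0, 7),
    (0, 0, 0),
]


def failing_cases(cases):
    """Return the rows for which max(a, b) does not equal the expected value."""
    return [(a, b, expected) for a, b, expected in cases if max(a, b) != expected]


failures = failing_cases(MAX_CASES)
```

The table loses its visual `|` separators, but every row is an ordinary tuple that any tool in the language can inspect.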
I wrote one once, a bastardized xpath that was used in a web crawler. I think it worked because it was extremely limited in what it could do and because normal code was a fallback in several places; it was never intended for non-developers. With it we could condense several hundred lines into < 10, with much lower error rates. Come to think of it, I'm not sure the term "DSL" had even been coined at the time; in any case I was unaware of it, and "creating a language" was not something I was attempting to do.
Since then I've encountered several in the wild that have been maintenance nightmares. Some were so simple that they were useless. Others were too complicated, so we ended up having to maintain a language and the code in that language.
There's one lying around here that was intended for support people to program in via an expression tree editor with a clunky UI. That was a disaster. We've got another one which is a terribly bastardized regex that was built because "support people don't understand regex". They don't understand this one any better; they would have been better off learning a life skill like regex.
This may partly be Stockholm syndrome, but I'm actually somewhat fond of Gradle (a Groovy DSL for configuring builds/managing dependencies for Java projects). It's certainly not perfect and I think it would be overkill for simple builds, but I find the DSL well suited to writing custom build tasks.
As an example, I wrote a Gradle plugin for a project I work on that lets you write:
    apply plugin: 'mycompany.service'
    mainClassName = 'mycompany.Main'
    service {
        id = 'serviceid'
        port = 1234
    }
This results in a `buildImage` task being added which builds a Docker image that runs `mycompany.Main` and exposes the specified port (and does some Docker tagging with the service id).
Implementing this plugin was straightforward, less than 50 lines of Groovy, most of which is invoking another DSL[1] for creating a Dockerfile.
Gradle is not a Groovy DSL. Gradle enables you to write its build files in either Kotlin or Apache Groovy, and perhaps there'll be more choices later on.
> Implementing this plugin was straightforward, less than 50 lines of Groovy
Did you know Gradleware now recommend using Kotlin for writing plugins for Gradle versions 3.0 and later?
Rebol / Red is an excellent language for creating DSLs (dialects), and there are plenty of examples in the wild (I've seen Rebol dialects for creating PDFs, Excel spreadsheets, scheduling tasks, etc.).
> it simplified the business logic aspect to such a degree that even non-devs were able to be productive with it
* bash (or any shell) -- Might seem odd to call it a DSL, but it certainly embodies the abbreviation.
* Matlab/R/etc.
* Excel (and even VB depending on usage) -- This even plays into "business logic"
You're probably more talking about something along the lines of Gherkin (Cucumber) though. While I've certainly seen it make devs unproductive, I don't know that I've even heard of it achieving the converse unless you loosen the criteria to meaninglessness.
Yeah, I actually was thinking Excel, and then was like "but that kind of depends on your definition of DSL", hence in part the question. A shell is an interesting thought of one. I don't know that I'd consider Matlab/R/etc given they're full on programming languages, just ones with a math-y slant.
Spray is the best approach to HTTP routing I've ever used, though I suppose you could argue it's just a normal library and you just write normal Scala. I'm not sure how you'd really draw a line between that and a "DSL" - to my mind any good library lets you write what you're doing in a simple, declarative, domain-appropriate way.
Plenty of such project-specific microDSLs can be seen in Lisp world. They do a great job at reducing the boilerplate one would otherwise have to write, and they're relatively easy to inspect, if you need to know what's being done underneath.
Right, but that hinges on the definition of a DSL. It's not a complete language for solving a domain specific need; it's just a set of abstractions intended to reduce boilerplate. To me that's not that different than, say, a library in any programming language; it's a set of nouns and verbs to make certain problems easier, but it still requires the full knowledge of the underlying language to make useful.
DSLs can be great abstractions and interfaces, in that they can provide a way to state program knowledge in a form that non-programmers may be able to work with more easily.
One example that stuck with me early was the Quake (and Doom) shader files, as well as the other game resource constructs from those old WAD-based games. The shader syntax wasn't much more than a rough wrapper on the program's `struct`s, but it allowed the graphic artists to twiddle with the resources manually (and later made a great program interface for the level editors).
XML (specific to document interchange), HTML (specific to hypertext interchange), and CSS (specific to DOM settings) were all DSLs that came about over the course of browser evolution. They offered ways to define content and configuration in a way that was both easy to understand and somewhat isolated from the underlying source code.
I've developed several DSLs in various systems I've worked in. Mostly you're pushing stuff out of the source code that doesn't belong there: configuration, repeated definitions, and things that may change from installation to installation. One of the abstractions we built was a series of custom state machine scripting languages that mapped to serial and TCP/IP protocols in a way that reduced boilerplate, and made the guts of implementing specific communications scenarios easier. The amount of time spent in developing a DSL was generally many times smaller than the gain in flexibility, transparency, and exposure to who could interact with customizing the system.
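To make the state machine idea a bit more concrete, here is a minimal sketch of a table-driven machine in Python. The states, events, and the `run` helper are all hypothetical and invented for illustration; this is not the scripting language described above, only the general shape of pushing protocol knowledge out of code and into data.

```python
# Hypothetical protocol table: (current state, event) -> next state.
TRANSITIONS = {
    ("idle", "connect"): "awaiting_ack",
    ("awaiting_ack", "ack"): "ready",
    ("awaiting_ack", "timeout"): "idle",
    ("ready", "close"): "idle",
}


def run(events, state="idle"):
    """Feed events through the table; unknown events leave the state unchanged."""
    for event in events:
        state = TRANSITIONS.get((state, event), state)
    return state


print(run(["connect", "ack"]))      # → ready
print(run(["connect", "timeout"]))  # → idle
```

The table is the part that could be edited per installation; the interpreter loop stays the same.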
In CS, there are no good and bad things, there are only trade offs. DSLs are only good when the effort to learn to be proficient in the DSL repays the time it took you to learn it. Otherwise, the cognitive load of learning the semantics and the challenges often with debugging are not worth it.
Examples of great DSLs are LINQ in C#, list comprehensions in Clojure, and jQuery in JavaScript.
Mutability has its place, and simply hiding it behind abstractions tacked on to the language (vars, refs, agents, and atoms in Clojure's case) isn't a productive way to deal with it. It has a lot of benefits, but the downsides are significant. Sometimes I just want to create a map and mutate its structure, and the language saying "no" is constraining me in an unpleasant way.
It's the old debate of whether the programmer should be constrained by the language, or the language should serve the programmer. Maybe it's a matter of opinion.
Also, typed function parameters are painful. Declaring a parameter as a `f: () -> int` instead of `f` requires that you think about what the signature needs to be. What if you don't know? What if you want to change it? The cost goes up significantly, and I'm not sure the purported ability to sidestep a few bugs is worth it. If safety is an issue, good test coverage can be a sufficient substitute for types.
> Anyone else dislike the immutability fad? [...] Sometimes I just want to create a map and mutate its structure, and the language saying "no" is constraining me in an unpleasant way.
Immutability does not prevent you from incrementally creating new versions of a map! At this point, you are conflating no less than three different and orthogonal concepts:
A) A convenient way to write code that incrementally creates new versions of a value. This is trivial in most languages using immutable values, but it often requires unfamiliar idioms, giving people the false impression it cannot be done.
B) A performance optimisation that performs such local incremental updates in place. This is a real concern. Any immutable-by-default language should support a safe (!!) way for the programmer to ensure updates happen in place.
C) A mechanism whereby new versions of values can be distributed to other parts of a program. This one is tricky as all hell to do safely, and most good solutions work just as well with immutable values.
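As a rough illustration of (A) only, here is a sketch in Python using read-only mapping views from the standard library. The helper name `assoc` is my own (borrowed loosely from Clojure's function of the same name). It copies the whole map on every step, so it says nothing about the in-place optimisation in (B).

```python
from types import MappingProxyType


def assoc(m, key, value):
    """Return a new read-only mapping with key set; the input is left untouched."""
    return MappingProxyType({**m, key: value})


v1 = MappingProxyType({"a": 1})
v2 = assoc(v1, "b", 2)
v3 = assoc(v2, "a", 10)

print(dict(v1))  # → {'a': 1}
print(dict(v3))  # → {'a': 10, 'b': 2}
```

Every version stays valid after the next one is created, which is the "incrementally creating new versions" idiom.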
"But kqr, why did you take something so simple and made it so complicated?"
It was complicated from the start. Traditional languages have just ignored this complexity and solved it in an "undefined behaviour" sort of way: "let's just do what the metal happens to do, however unsafe!" You'll note that such languages have no way to do safe in-place updates, and value propagation is never synchronised or otherwise guaranteed not to cause partial views. It's a low-level hack that has far outstayed its welcome in the age of huge systems with all sorts of interaction effects.
No it wasn't. It's only "complicated" from a certain very specific and constrained POV, which you seem to have adopted wholesale and now assume as the only valid reality.
Mutability is not just how "the metal" works, it's also how the rest of the world works. If I eat a banana, the contents of my stomach (and by extension I) change. You don't get a new me that's no longer hungry + the old me that hasn't changed. If I put gas in my car's tank, the car changes. I don't get a new car. Etc.
You are also conflating immutability with atomic updates, as far as I can tell.
> Mutability is not just how "the metal" works, it's also how the rest of the world works.
This is actually false. A bit hand-wavy here, but the world does not truly consist of discrete objects changing state. That's how we choose to model it, and it's useful, but it's not the "truth" of the world / universe.
A few simple examples: viewed at the quantum level, protons, neutrons, and electrons aren't physical objects that "exist" in a unique place at a point in time. Now I'm not saying an imperative model is wrong because of quantum mechanics, but am merely illustrating that we chose to model the world as objects with changing states because it's useful, not because it's accurate or the truth.
The ship of Theseus is another example. There really is no paradox here, as there was never a "ship" to begin with. We just chose to model that particular assembly of particles as an entity/object known as a ship, and gave it "state" that persists over time. But the only true answer is that it isn't a paradox: the universe/physics never defined a "ship" to be there. We did. And the only "paradox" is that someone found a flaw in our modeling algorithm. But again, there was never a single object there changing state.
So the long story is: don't judge a method of modeling on how effective it is, because no model we use in everyday life is actually the underlying "truth".
https://en.wikipedia.org/wiki/Ship_of_Theseus
> but the world does not truly consist of discrete objects changing state.
Really? So elementary particles don't change position, energy, velocity? Everything is static and immutable? And gets destroyed/recreated for everything that looks like a state-change to us? While I could imagine modeling the world that way, it is a description of the world that gets shredded to tiny little pieces by Occam's razor.
On a more basic level, just the fact that you can imagine a different interpretation doesn't mean that an interpretation I give is false. There is a difference between "false" and "there are other possible interpretations".
And the fact that you don't have a simple materialistic/reductive definition of what a "river" is, or any other macroscopic object, does not mean that rivers don't exist. If you can't see it, more power to those of us who can, like me, because we are going to have a much better time navigating the "real" world than you are, including taking bridges to cross those rivers rather than drowning, because after all it's just a few drops of water, right?
Or you are just pretending that rivers don't exist, and actually implicitly use this knowledge all the time to safely navigate the real world.
Which brings us back to immutability in programming: it's largely a pretense. It can be a useful pretense. I certainly love to take advantage of it where appropriate, for example wrote event-posters long before that became fashionable[1]. But in the same system I also used/came-up-with in-process REST [2], which is very much based on mutability.
So immutability is useful where applicable, but it is also very limited, as it fits neither most of the world above nor the hardware below, and trying to force this modeling onto either quickly becomes extremely expensive. Of course, many technical people love that sort of challenge, mapping two things onto each other that don't fit naturally at all [3]
With an object-oriented model of reality, then yes, you can say there is a single banana object instance whose coordinates mutate to remain inside your body, and gradually your body decrements the banana nutrient counters while incrementing its own.
With a value-based model of reality, it doesn't happen that way. In that model, both you and the banana are modeled as (time, place, nutrients) triples of immutable values. Those triples never "change" or "become old" or "get duplicated". They just are.
At 4 o'clock, my nutrient counters were low and the banana had coordinates similar to mine but not quite. At 5 o'clock, my nutrient counters were higher and the banana had coordinates equal to me. Both of these statements are valid at the same time with no conflict.
Similarly, I don't get where this "if I open the boot of my car do I now have two cars" is coming from. <Chevrolet, 8:41 AM, Boot closed> and <Chevrolet, 8:42 AM, Boot opened> coexist in peace. They're not two different car objects, they're two different immutable, factual descriptions of states. As long as they were correct in the first place, they will never become incorrect or change in any way.
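If it helps, here is how I would sketch that car example in Python with a frozen dataclass. The class name `CarState` and its fields are invented for illustration. Each value is a description of the car at a moment in time, not the car itself.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CarState:
    car: str
    time: str
    boot_open: bool


before = CarState("Chevrolet", "8:41 AM", False)
# replace() builds a new description; it does not touch the earlier one.
after = replace(before, time="8:42 AM", boot_open=True)

print(before.boot_open)  # → False
print(after.boot_open)   # → True
```

Both descriptions remain true of their own moment, and attempting to assign to a field of either raises an error.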
You are perceiving yourself as having mutated after eating the banana, merely because you are in temporal lockstep with your stomach.
Most of what you've just stated as "fact" and "how the world works" is a matter of perception. Time, after all, is (we believe) simply another dimension of spacetime.
"Reality is that which doesn't go away when you stop believing in it"
I gave very specific examples. When I open the trunk of my car, are there now two cars, one with the trunk not opened, and one with the trunk opened?
Now there may be two universes (many worlds interpretation), but so far there is no evidence that this interpretation has validity, and we have no way of accessing these two universes. As such, it is not a very useful description of the way the world works, as it doesn't actually visibly work that way, and very specifically visibly works differently.
> Time, after all, is (we believe) simply another dimension of spacetime
And objects move through space-time, thus changing (mutating) in their attributes. Objects aren't immutable in space-time and have to be destroyed/recreated continuously.
Not sure what reality you're inhabiting, but in mine, state is coterminous with identity, and mutation of state therefore creates a new entity.
That is, we are all value objects, all matter is information, all information is functional, and perception is therefore the lazy evaluation of the universe.
Note: this is especially important when you are the banana.
> Also, typed function parameters are painful. Declaring a parameter as a `f: () -> int` instead of `f` requires that you think about what the signature needs to be. What if you don't know? What if you want to change it?
This is actually my favorite thing about languages with strong type systems where type inference is less emphasized -- if you're looking at a function definition, even if you're not terribly familiar with the language, you know what the thing is supposed to do. Sure, it's nice to let the compiler infer your types at compile or runtime, but it's really nice to just be able to read the code and be able to see what a function is at least meant to do, based on what it returns.
This sounds really fishy to me. Do you have more details?
Source: have programmed professionally in major static and non-static languages and as the years pass I appreciate static languages more and more for each time I waste time debugging what I have learned to think of as stupid, completely avoidable errors.
> Also, typed function parameters are painful. Declaring a parameter as a `f: () -> int` instead of `f` requires that you think about what the signature needs to be. What if you don't know?
Why are you writing the code if you don't know yet what your input will be? A function transforms input into output, how can it do that correctly when you don't know what its input is?
If you don't know the input type yet, you should first be writing the things that will deliver the input, or leave the function as an unimplemented placeholder and define only the output type.
> What if you want to change it
This is one of the best reasons to use types. Directly after the change you'll be told by the compiler which other code needs changes. Compare this to un-typed languages where you'll need a lot of tests to ensure the same or risk runtime errors.
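A small sketch of what that looks like, using Python type hints as a stand-in for the `f: () -> int` notation above. Python does not enforce these at runtime, so this assumes you run a separate static checker such as mypy; the function name `call_twice` is made up.

```python
from typing import Callable


def call_twice(f: Callable[[], int]) -> int:
    """Sum two calls of a zero-argument function that returns an int."""
    return f() + f()


print(call_twice(lambda: 21))  # → 42
```

If the parameter later became `Callable[[int], int]`, a static checker would be expected to flag the call site above, which is the "told by the compiler" step.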
Immutability is useful as a default, but languages should absolutely provide mutable alternatives for situations the programmer deems appropriate.
The reason immutability has become popular so recently, and the reason why I say it's the best default assumption is the rise in multi-threaded programming.
There are a growing number of strategies for dealing with mutability in a multi-threaded environment in a way that is both safe and intuitive to reason about. The two that spring to mind are the rediscovery of actors (e.g. Akka) and Go's goroutines.
Mutability certainly has its place, but so does immutability. Previously, immutability has been largely ignored in favour of the convenience of mutability. I wouldn't say immutability is being heavily favoured now; more that it's back where it belongs, as part of a binary choice.
Even in single-threaded programs immutability can make things easier to reason about. I've had to look at some old Java projects, and the easiest time I've had reasoning about them was a project that used mostly immutable objects. Ever since then I've used immutable as the default for my Java objects and it's been better for me. Your mileage may vary, obviously.
But even actors benefit from immutability. Imagine if the message you received and matched on was not the message you processed (because it was mutated in between). You either want immutability at least for message handling, or messages to be full copies, and there are tradeoffs to either choice.
Actors don't necessarily encapsulate mutable state; they may in fact encapsulate immutable state. An actor can be treated (as in Erlang) as essentially just a(n optionally) recursive function with a mailbox, where mutation can only occur across the function boundary (i.e., when you recurse, you pass in the new value).
At this point the only benefit to allowing mutability is to allow you convenience things like for loops, and for performance reasons. From the perspective outside of the actor, the state behaves the same way whether it's mutable or immutable; you do not gain new behaviors by making it mutable (i.e., for sharing data or similar), you just make a more complex coding model (in that you're mixing mutable and immutable and have to know which is which).
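A minimal single-threaded sketch of that "function with a mailbox" shape, in Python. The names `counter_actor` and `mailbox` are hypothetical, and since Python has no tail-call elimination the recursion is written as a loop that rebinds `state` to a new value on each message.

```python
from queue import Queue


def counter_actor(mailbox, state=0):
    """Process messages until "stop"; each step computes a new state value."""
    while True:
        msg = mailbox.get()
        if msg == "stop":
            return state
        # Stands in for recursing with the new value: the old one is never mutated.
        state = state + msg


mailbox = Queue()
for msg in (1, 2, 3, "stop"):
    mailbox.put(msg)

print(counter_actor(mailbox))  # → 6
```

From outside, callers only see messages going in and a final value coming out, so they cannot tell whether the inside used mutation.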
Of course it is; my point was that if your communication mechanism between actors is immutable, there's hardly any way to differentiate mutable vs immutable within the actor...and, honestly, it doesn't really affect much, so why mix the two (since that then creates weirdnesses in how your data can interoperate and be handled; some pieces can be used as new messages, others can't, etc).
Immutability is absolutely fantastic as a default (especially for typical business logic). I'm not sure I want to go back to mutability except in constrained environments or time-sensitive software. Mutability should be harder to reach for because it forces you to think hard about the implications to other parts of your program. I can think of a number of times we've had to parallelize a routine or share resources between workers, and it has been almost trivial _because_ of immutability. In contrast to that, I inherited a hairball of C code using pthreads, mutexes, global variables, etc which had to be completely gutted and rewritten as it was impossible to understand the dependencies in the software (we were witnessing segfaults, deadlock, etc).
Declaring a parameter as a `f: () -> int` instead of `f` requires that you think about what the signature needs to be. What if you don't know?
(Java programmer here.) Then return a sufficiently "broad" type up to, if necessary, Object.
That said I almost never fall that far back, I'd see that as a sign that I should stop and think.
Oh, and the advantage of a typed language like Java: you can have extremely good tool support. NetBeans (or IntelliJ or Eclipse) will happily help you to change types if it later turns out you were wrong.
The abstraction that hides mutable data structures is transient[1], not vars, refs, agents and atoms.
Those abstractions are all to do with mutable references, not mutable data. They're there to make sure that you have safe ways to coordinate state in concurrent environments.
I agree that immutability can be annoying in languages that weren't designed with it in mind, but the languages that were almost always make it easier, safer and more performant to create copies rather than mutate data.
If you're writing Clojure or Haskell and you think you're going to save yourself time by "just creating a map and mutating its structure" then you're misunderstanding the purpose of the language. These constraints are what enable the guarantees and contracts those languages make.
But it does save time. JS is proof of that. There are no immutable data structures and the world rolls on. People will hate on JS, but it's effective.
You can learn to get into the "immutability mindset" if you train yourself to, but are you certain it's worth the time investment? It seems like there's at least a chance that it's not.
Sure, no arguments there. It does save time in JavaScript and a large part of that is because the language has been designed around mutability.
Part of that trade-off is that JavaScript can't make the same guarantees about what happens when you pass an object into a function. It's harder to be confident that a given program is correct.
Immutability is just a part of the "simple made easy"[1] ethos of Clojure and I think most Clojure programmers will argue that taking the time to understand that philosophy _is_ worth the investment.
Any bad implementation of something is bad. Sounds like immutability in JS is just badly designed and implemented, in a way that makes it difficult and slow to code with.
Don't generalise your experience of a thing if you've only tried its bad implementations. Like don't judge Monads until you try them in Haskell. Don't judge immutability or DSLs until you try them in Clojure, etc.
Agreed that immutability seems overrated right now. The pendulum will swing back in full force, eventually.
But of course, the art is in putting mutability and immutability in the right places. There are no fast rules, but in general mutability of local variables is harmless, and you probably need some mutability at the top level as well (as per: "functional core, imperative shell"). Mutability in other places can still be useful though.
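A quick sketch of the "local mutation is harmless" case in Python; the function name `word_counts` is made up. The dictionary is mutated freely inside the function, but callers only ever see a fresh result and their input is left alone.

```python
def word_counts(words):
    """Count words using a locally mutated dict that never escapes half-built."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


words = ("a", "b", "a")
print(word_counts(words))  # → {'a': 2, 'b': 1}
print(words)               # → ('a', 'b', 'a')
```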
I don't agree or really follow your comment on function parameters. It seems an argument against typing in general.
When you pass an empty list, `f` is called with no arguments. That way, `(reduce + [])` can return 0, whereas `(reduce * [])` can return 1.
It's very easy to write such a function in Clojure, thanks to a lack of types. What would that look like in Kotlin? A function that can either take zero arguments or takes two arguments of types specified by the input list?
The usual implementation of fold/reduce takes a separate seed argument (as does clojure, optionally!), which IMO is far more sensible than having the same function do two completely separate things depending on the number of arguments.
How would you write a function parameter that takes two arguments, whose type is determined by the seed argument?
Also, with reduce, the return value of the function can determine what the type of the input arguments should be. For example if you pass in a `(fn (x y) (list x y))`, you end up with:
    > (reduce (fn (x y)
                (list x y))
              (list "a" "b" "c" "d"))
    ("a" ("b" ("c" "d")))
Let's throw in a print statement to print out the `y` parameter:
So `y` starts out as a string, then a list of strings, then a list whose first element is a string and the second element is a list of strings, and so on.
I'm not sure what the type of that function would even look like. And if there's no way to express something as simple as `reduce` without resorting to `f: (x: Any, y: Any) -> Any`, are we certain it's good design?
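One possible answer, sketched with Python type hints rather than Kotlin: give the fold a seed, and let two type variables tie the accumulator to the seed and the elements to the list. The names `fold`, `nest`, and `Nested` are mine, and the recursive alias is an assumption about what your type checker supports; at runtime the hints are not enforced.

```python
from functools import reduce
from typing import Callable, Iterable, Tuple, TypeVar, Union

A = TypeVar("A")  # element type of the input
B = TypeVar("B")  # accumulator type, fixed by the seed


def fold(combine: Callable[[B, A], B], seed: B, xs: Iterable[A]) -> B:
    """Left fold with an explicit seed, so the empty-list case needs no special arity."""
    return reduce(combine, xs, seed)


# Recursive alias for "a string, or a pair of a string and more nesting".
Nested = Union[str, Tuple[str, "Nested"]]


def nest(acc: Nested, x: str) -> Nested:
    return (x, acc)


print(fold(lambda acc, x: acc + x, 0, []))  # → 0
print(fold(lambda acc, x: acc * x, 1, []))  # → 1
print(fold(nest, "d", ["c", "b", "a"]))     # → ('a', ('b', ('c', 'd')))
```

So the nesting example does not force `Any`; it needs one recursive type for the accumulator, which is the tricky part in languages without them.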
And this is the other thing I like about the haskell/ML collection of languages. The description you gave above is extremely concise and direct. If you're familiar with the language being used it's a very efficient form of communication.
I've noticed that working in languages of that family gives you a vocabulary to talk about things that previously would have been very wordy to discuss.
Languages with these kinds of type features provide easy abstractions that illuminate structure and patterns, and let you begin discussing new ideas that previously would've been considered one-off code.
FWIW, this can be inferred fully automatically. Type
    let pair = fun x -> fun y -> { _0 = x; _1 = y }
    let rec reduce = fun init -> fun combine -> fun xs ->
      match xs with
        [] -> init
      | x :: rest -> reduce (combine x init) combine rest
    let test = reduce 0 pair [1; 2; 3; 4]
`O` would be whatever the type of the reducer is, so for `list` in the lisp you're using (is it Clojure? The function params look wrong) has a type signature `(A, B?, ...Z?) -> A | List(A, B, ...Z)`.
So the transducer in this case would have the type `List(A, B?, ...Z?) -> A | List(A, B | List(B, ...etc))`
It's not three different types, but it is necessarily recursive, which seems tricky.
> It's the old debate of whether the programmer should be constrained by the language, or the language should serve the programmer. Maybe it's a matter of opinion.
I like to think of it as the language saving the programmer from herself.
If you have sufficiently good test coverage to replace a type system you have to change tests whenever you'd have to change the type. No costs saved. Thinking more before coding is beneficial in my experience.
You pay for not specifying those parameters and so on in other ways later on.
By leaving those things out you are broadening the problem model and denying the compiler information that it needs from you as the programmer so that it can do a good job of solving the problem.
You have to remember that the computer can't synthesise knowledge about the problem domain that you deny it in the first place.
So sure tightening things down by specifying them may be tedious but it is the reality of the problem that you are solving.
Convenience of leaving it out will translate to instability such as run-time exceptions not to mention a massively reduced ability in tooling support (like IDE auto-complete).
Kotlin lives on top of a statically typed architecture, so I think you do have to make that function typed. Some languages get around this requirement by inferring types, but type inference itself can then become Turing complete...
So it's not like there's a "healthy option". It's more that you're choosing the poison you prefer.
> Mutability has its place, and simply hiding it behind abstractions tacked on to the language (vars, refs, agents, and atoms in Clojure's case) isn't a productive way to deal with it.
In Clojure, if you want mutability, you don't use a PersistentHashMap, i.e. `{}`, inside an atom `(atom {})`. You import and instantiate a mutable Java class such as `java.util.concurrent.ConcurrentHashMap` or `java.util.HashMap`, e.g. `(let [m (doto (java.util.HashMap.) (.put "foo" "bar") (.put "spam" "eggs"))] .... )`, and bang on it just like you would in Java. Clojure doesn't attempt to solve mutability because the host platform has already done a good job at that. Clojure provides semantics around state management of persistent data if you need that sort of thing, but that's not necessarily a good replacement for mutability if you actually need a mutable thing.
You can have both mutability and immutability in the same language. Example from Ruby:
$ irb
2.4.1 :001 > a = [1, 2, 3]
=> [1, 2, 3]
2.4.1 :002 > a.map {|x| x + 1}
=> [2, 3, 4]
2.4.1 :003 > a
=> [1, 2, 3]
2.4.1 :004 > a.map! {|x| x + 1}
=> [2, 3, 4]
2.4.1 :005 > a
=> [2, 3, 4]
2.4.1 :006 > a.freeze # make a immutable
=> [2, 3, 4]
2.4.1 :007 > a.map! {|x| x + 1}
RuntimeError: can't modify frozen Array
Ruby is fundamentally a language based on mutability, but it shows that it should be possible to have both ways. Questions:
1) Is there really a language which is fully agnostic about mutability and immutability?
2) If not, it's maybe because designers have strong opinions about this matter? Actually IMHO designing a fully agnostic language would demonstrate strong opinions too.