What is Symbolic Computation? (stevelosh.com)
113 points by stevelosh on June 29, 2016 | 41 comments



The difference between adding symbols to a language with product types (i.e. consing) and simply having algebraic data types in your language is that symbols are untyped, and so you're allowed to compare apples and oranges (they're unequal).

This isn't terribly useful, as it's almost always a mistake to do it. If you do have a context in which you want to compare apples and oranges, algebraic datatypes force your hand: you have to declare which kinds of things you'd like to compare across (e.g., specify that you're comparing fruits, and that apples and oranges are fruits). Being explicit here helps you form a clear mental model of how your software works and avoid bugs.

Lisp is the "hold my beer, I know what I'm doing" of functional programming languages. It, like other languages that take this stance, should be laid to rest. Time has demonstrated that programmers do not know what we're doing, and need all the machine-provided help we can get.


This is... unproductive. Time has demonstrated that language is just one part of the tooling story when it comes to developing software. It is easily one of the most talked about, but ultimately it is just another tool.

That Lisp-family languages have not died, despite many people insisting over the years that they should have, is almost evidence on its own that this argument is a non-starter.

For more relevant disputes, consider https://docs.racket-lang.org/ts-guide/. Specifically, you can use tooling to get types into a LISP.


I have no dogma about where you implement your typechecker; all I assert is that it's better to have one. The LISP philosophy is (in my impression from the literature) that with powerful enough abstractions, one writes so little code that there is hardly any room for mistakes to sneak in, so typecheckers could provide little benefit. In one sense this thesis is plausible: at some level you either stop encoding the specification of your intended functionality into your types, or you end up programming in a theorem-proving language like Coq or Agda, which everyone seems to agree is bad for productivity. But I think we should be trying to move toward more specifications, not fewer, and solve the ergonomics problems as we encounter them.

Unrelatedly, I have worked in Typed Racket and the performance is so poor that I can't advocate its use. Of course you can profile and optimize anything, and avoid writing code which typechecks slowly, but such static analyses seem the sort of thing where incremental computation (caching, and amortization across executions of the typechecker) should make even very powerful analyses finish quickly. And Racket doesn't take advantage of the results of its typing analyses to optimize runtime representations of datatypes, so even after typechecking your code runs very slowly and uses a lot more memory than it needs.


> I have no dogma about where you implement your typechecker; all I assert is that it's better to have one.

You seem to be suggesting that types are absent from lisp; that's not the case.

Common Lisp, Scheme, and Emacs Lisp all make use of types. They are, to a first approximation [0], all strongly typed and dynamically typed languages.

Since you're not dogmatic about the where of it, try computing:

  (+ 2 'atom)
in your favorite Common Lisp. It will result in an error, specifically a `SIMPLE-TYPE-ERROR`.

You can also specify types in function declarations in Common Lisp:

  (defun double-integer (x)
    (declare (type integer x))
    (* 2 x))
This will only accept integers now; other types passed to `double-integer` will result in an error.

You can apply this at any level once you start building up your own types as well (structs, classes, etc.). The benefit here is that you can leave your code wide-open to allow for many cases at some stages, while optimizing or constraining it in others. This malleability is fantastic in lisps.
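For instance (a minimal sketch; whether slot types are enforced at runtime is implementation-dependent, though SBCL does check them at default safety):

  (defstruct point
    (x 0 :type integer)
    (y 0 :type integer))

  (make-point :x 3 :y 4)    ; fine
  (make-point :x 3.5 :y 4)  ; signals a type error in SBCL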

While I'm certainly a great fan and proponent of static type checking, the benefits of lisp's concepts (the malleability of its semantics and syntax) make it a great force-multiplier for competent users. A great number of concepts become practically trivial when given the level of abstraction that lisps permit.

A good type system can give you something akin to a calculus of types.

A good functional language (here I'll include lisps, even though many these days seem to discard them as such) gives you a sort of calculus over computation.

But the structure of lisp (particularly its homoiconic nature) gives you a calculus over your program's syntax and semantics.

[0] The places where they may be considered weakly typed are places where it's somewhat "natural". Such as in Common Lisp where there's the numeric tower. Technically your number does stay a number, but it moves from an integer to, say, a rational when you perform a division by a non-divisor integer. It maintains the highest level of precision/accuracy possible. So `(/ 1 3)` doesn't become some 32-bit float version like `0.3333334`, but rather `1/3`, nor does it do the C thing and turn it into 0.
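A quick illustration of the tower at the REPL (a sketch; output from a typical implementation such as SBCL):

  (/ 1 3)      ; => 1/3, an exact RATIO
  (+ 1/3 2/3)  ; => 1, collapsing back down to an integer
  (float 1/3)  ; => 0.33333334, only if you explicitly ask for it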


I don't see how "typed/untyped" is a meaningful distinction if you consider "dynamic types" to be types. If the only observable effect of the typechecker in a dynamically-typed language is a runtime error, how do we distinguish an arbitrary source of runtime errors (e.g. "throw" or "error" statements) from a dynamic typechecker? Obviously it isn't the presence of the word "type" in error messages, but what is it? In the presence of the metaprogramming you describe, we can't say it's down to whether checks are inserted automatically either, since non-type dynamic checks may also be inserted pervasively and automatically with metaprogramming.

If Lisp is typed, what's an "untyped" language? One in which the user hasn't done metaprogramming to insert runtime checks? Or is the distinction just about whether it's the end programmer or the language implementor who does the metaprogramming? None of these seem like very useful distinctions.

If it's lispy to actually do a separate typechecking pass, please forgive my ignorance.


Imagine a language where you work with blobs of data and have no restrictions on the operations between them. Like assembly languages. You can add a char* and float and the compiler won't complain. I would call that untyped. A typed language restricts the operations you can perform on blobs of data by classifying them.

A user could build a more restrictive type system on top of an already typed language. The distinction between a type system and an arbitrary source of runtime errors is that a type system has the properties I described earlier, and also the cohesion you would expect from it being a "system". A smattering of if statements wouldn't qualify.

That's one way to look at it, I guess. Everyone has their own working definitions, so I think it's still reasonable if you consider dynamically typed languages to be untyped.


I'd say an untyped (well, weakly typed; every language has some notion of types) language is one like C, where you can construct a value of some type and then treat the block of memory in which the physical representation of that value is stored as some other type.

That said, Eric Lippert has a good blog post on [defining our terms more precisely before even asking the question](https://ericlippert.com/2012/10/15/is-c-a-strongly-typed-or-...)


There are in fact typeless languages that are neither statically nor dynamically typed, although they aren't particularly relevant nowadays -- BCPL, Forth, and assembly, among others. These languages have only words that hold integer values, with different operators treating the words as an integer, pointer, etc. Those who claim that only static types are worthy of the name "type" would probably consider both dynamically typed and typeless languages to be equally untyped -- dynamically typed languages just support richer semantics on their values, such as storing a "tag" (i.e., dynamic type) in each value.


> Common Lisp, Scheme, and Emacs Lisp all make use of types. They are, to a first approximation [0], all strongly typed and dynamically typed languages.

"Dynamically typed" is a contradiction in terms, at least if one is using the standard technical definition of the word "type". These languages offer runtime checks that have some things in common with a type system (in particular they have some of the same use cases), but they are not types and it is confusing to call them types (even if everyone in computer science agreed to talk about "dynamic types" and "static types" that would still cause confusion, because "static types" are types in the mathematical sense and "dynamic types" are not).

> But the structure of lisp (particularly its homoiconic nature) gives you a calculus over your programs syntax and semantics.

It's more a lack of structure, and a lack of syntax. In many lisps you have less control over syntax than in other languages - any embedded DSL has to be written in Polish notation. (You have reader macros but that's basically just having a program that operates on text strings, which you can do in any language - you can write a C program that preprocesses C source and it's effectively the same thing). I mean sure lisp lets you manipulate a tree of symbols arbitrarily, and then execute that tree with certain semantics. But there are a lot of languages you can do that in. The unique thing about lisp is that your program has to be expressed as a tree of symbols - which yes means that you have a calculus over it, but only by forcing its structure to be trivial. I'd rather have my program expressed as a richer structure, and a language that's powerful enough to manipulate rich structures in a structure-aware way.


> "Dynamically typed" is a contradiction in terms, at least if one is using the standard technical definition of the word "type". These languages offer runtime checks that have some things in common with a type system (in particular they have some of the same use cases), but they are not types and it is confusing to call them types (even if everyone in computer science agreed to talk about "dynamic types" and "static types" that would still cause confusion, because "static types" are types in the mathematical sense and "dynamic types" are not).

That is complete nonsense. Dynamically typed languages have types (many even treat them as first-class objects) and primitive expressions with types assigned to those expressions. What dynamically typed languages let you do is construct and evaluate terms that cannot be typed.
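A trivial sketch of such a term (hypothetical example): a simple static type system without unions has no single type to assign to the result of `pick`, yet every value it produces at runtime carries a perfectly ordinary type.

  (defun pick (p)
    (if p 42 "two"))    ; no single static type, but runs fine

  (type-of (pick t))    ; => an integer type, e.g. FIXNUM
  (type-of (pick nil))  ; => a string type, e.g. (SIMPLE-ARRAY CHARACTER (3))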

> You have reader macros but that's basically just having a program that operates on text strings, which you can do in any language - you can write a C program that preprocesses C source and it's effectively the same thing

This argument boils down to "you can write a huge compiler in any language you want! it's easy!" There is actually a Lisp meme about it: https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule

> I'd rather have my program expressed as a richer structure, and a language that's powerful enough to manipulate rich structures in a structure-aware way.

You are confusing program syntax with parse trees. All parse trees look the same ("here is a graph of your program"). You can keep things simple and have a syntax that is context-free, has an obvious mapping to the parse tree, has an obvious interpretation, and has tons of tooling support in hundreds of different tools and editors. Or you can decide that you really really like your whitespace to be very emo and sensitive - just write your own context sensitive parser, roll your own tooling, force everyone to learn it! It's easy!


Re embedded DSL as always Polish notation:

That's not accurate and it doesn't require a reader macro to change it. You do need to state the context in the first position, but any string of symbols after that can be manipulated by a macro. The order at that point is up to the macro designer. See the loop macro as an example.
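For example, once past the `loop` symbol in the first position, the body is a keyword-driven DSL with nothing Polish about it:

  (loop for i from 1 to 10
        when (evenp i)
          collect (* i i))  ; => (4 16 36 64 100)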


Well sure, but at that point you're effectively parsing it by hand - which I guess is better than implementing lexing by hand as you would with a reader macro, but not by much. You can take a string of symbols and then do something arbitrary with them, sure, but at that point your language isn't really doing anything for you (and again, you could do that in any language (if the point needs proving: apart from anything else, it's trivial to write a lisp interpreter in most languages) - the difference is that lisp doesn't allow you to do the more structured thing).


In what cases is a "richer" structure better? "Richer" to my ears sounds like "more complicated". The basic function form of (fun arg1 arg2 ...) or fun(arg1, arg2) is adequate for most things, and in lisp, when you want to express something that doesn't work well in that form, you write a macro to grok some structure (like cond). That all seems very structured to me, and I see no issue with the base structure being trivial. In fact I'd argue that's necessary to achieve a calculus over the code.


Some things just make more sense with infix or postfix notation. Not least of which being mathematical expressions.

I think Haskell handles it really well, where everything is just function application, but identifiers made of symbols apply infix (and there's syntax to switch between them). It's just enough syntax that you can usually structure things how you want, but the rules are still simple enough that it's easy to see the tree structure.

Sometimes people go a bit crazy with adding their own operators, but for the most part I think the community is getting over that.


> Some things just make more sense with infix or postfix notation. Not least of which being mathematical expressions.

There are pretty much only four binary operators in modern mathematical notation that make sense infix (addition, subtraction, multiplication, and variable assignment) and only in very short, unambiguous expressions. Pretty much everything else is either prefix (function application, derivatives, summation, etc) or uses special layout formatting rules (division, exponentiation) or both (matrix multiplication).

Even with all of this, modern mathematical notation is full of ambiguities. A great short discussion of this is in the preface to Sussman and Wisdom's _Structure and Interpretation of Classical Mechanics_: https://mitpress.mit.edu/sites/default/files/titles/content/... which, despite being all about "mathematical expressions", somehow manages to have all of its code in Lisp...

I also recommend reading Munroe's _The Language of Mathematics_ for a better understanding of modern mathematical notation and its weaknesses.

Note also that this is only modern mathematical notation. There is nothing "natural" about infix operators and for most of mathematical history they did not exist (see for example Bashmakova and Smirnova's _The Beginnings and Evolution of Algebra_ or any other book on the history of algebraic notation).


Common Lisp declarations are hints to the compiler that it doesn't have to do as much type checking, so it can speed things up by omitting type checks, because you the user told the compiler it was safe to do so.

http://stackoverflow.com/a/12677731

...SBCL with a certain level of safety declarations does mostly do what you really want, though.


> In your favorite common lisp. It will result in an error, specifically a `SIMPLE-TYPE-ERROR`.

Just because it calls itself a type error doesn't mean it is. If you don't catch the error until your function actually runs, it's a runtime error, not a type error. Type errors are all caught before the program runs, typically in a compilation step.


You're talking about static types versus dynamic types, not typed vs untyped. A type system that is verified at runtime is still a type system.


Dynamic 'types' are not types, they're tags [1]. Their function, to determine what to do with a piece of data passed to a function at runtime, is common to both so-called statically and dynamically typed languages. The actual function of types, on the other hand, is to prove simple propositions about a program without running it. This feature is unique to statically typed languages.

This is the problem with this entire debate. People act like there are two sides arguing from positions of equal merit. That is not the case. One (static) is a strict superset of the other (dynamic).

[1] https://en.m.wikipedia.org/wiki/Tag_(programming)


Now you're confusing the implementation with the abstraction. Types can be modeled by tags. They can be modeled by trees. Or any other data structure that you can analyze with a program.

>The actual function of types, on the other hand, is to prove simple propositions about a program without running it.

The value of a tag proves something about the program. You're just proving it at run-time rather than compile time.

Which is an artificial distinction anyway, since your static types theorems are being proved during the run-time of your static type checker. All theorems are proven during some run-time.


I am afraid this is true.

The `declare` thing in Common Lisp is a bit poorly understood in general:

`declare` is not a promise the compiler makes to you to only accept the declared type.

`declare` is actually a promise YOU make to the compiler to only pass the declared type as a parameter. Knowing this, the compiler can activate some optimizations (it's an integer? I guess I can skip type detection then).

But let's get real:

    * (defun double-integer (x)
        (declare (type integer x))
        (* 2 x))
    
    DOUBLE-INTEGER
    * (double-integer 2)
    
    4
    * (double-integer "qualcosa")
    ;; error here, stack trace omitted

    
    * (disassemble 'double-integer)
    
    ; disassembly for DOUBLE-INTEGER
    ; Size: 37 bytes. Origin: #x10044E4242
    ; 42:       498B4C2460       MOV RCX, [R12+96]                ; thread.binding-stack-pointer
                                                                  ; no-arg-parsing entry point
    ; 47:       48894DF8         MOV [RBP-8], RCX
    ; 4B:       488BD6           MOV RDX, RSI
    ; 4E:       BF02000000       MOV EDI, 2
    ; 53:       488B0576FFFFFF   MOV RAX, [RIP-138]               ; #<FDEFINITION for ASH>
    ; 5A:       B904000000       MOV ECX, 4
    ; 5F:       FF7508           PUSH QWORD PTR [RBP+8]
    ; 62:       FF6009           JMP QWORD PTR [RAX+9]
    ; 65:       CC10             BREAK 16                         ; Invalid argument count trap
    NIL
    * (defun double-integer (x)
        (* 2 x))
    WARNING: redefining COMMON-LISP-USER::DOUBLE-INTEGER in DEFUN
    
    DOUBLE-INTEGER
    * (disassemble 'double-integer)
    
    ; disassembly for DOUBLE-INTEGER
    ; Size: 38 bytes. Origin: #x1004B6CBCC
    ; CC:       498B4C2460       MOV RCX, [R12+96]                ; thread.binding-stack-pointer
                                                                  ; no-arg-parsing entry point
    ; D1:       48894DF8         MOV [RBP-8], RCX
    ; D5:       BF04000000       MOV EDI, 4
    ; DA:       488BD3           MOV RDX, RBX
    ; DD:       41BBA0020020     MOV R11D, 536871584              ; GENERIC-*
    ; E3:       41FFD3           CALL R11
    ; E6:       488B5DF0         MOV RBX, [RBP-16]
    ; EA:       488BE5           MOV RSP, RBP
    ; ED:       F8               CLC
    ; EE:       5D               POP RBP
    ; EF:       C3               RET
    ; F0:       CC10             BREAK 16                         ; Invalid argument count trap
    NIL
    * (double-integer 2)
    
    4
    * (double-integer "qualcosa")
    ;; error here, stack trace omitted


The difference is clear: instead of just performing the multiplication, the generated code must CALL a generic function. With all of the cost of function calling (memory access time and stuff).


For others reading, it's making a call in both cases because "integer" is not a native machine type. Integers are unbounded and include all big integers as well.

The various float types and fixnum (machine register width integer) would compile down to machine code multiplication instructions without function calls, if declared or inferred that the inputs (and output in the fixnum case) are of those types, and the safety option is reduced. (In SBCL)
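A sketch of that fixnum case (exact disassembly varies by SBCL version and platform):

    ;; with the input declared fixnum, the result asserted fixnum, and
    ;; safety reduced, SBCL can emit inline arithmetic instead of
    ;; calling GENERIC-*
    (defun double-fixnum (x)
      (declare (type fixnum x)
               (optimize (speed 3) (safety 0)))
      (the fixnum (* 2 x)))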


I am not aware of anyone who is against having a typechecking pass on the code that a LISP author writes. I fully acknowledge that my knowledge of who would be against it is pretty limited, though. Do you have specific criticisms in mind?

Most of the criticisms I see of LISP come down to gripes about the parentheses. Usually it is said jokingly, but it is also the only criticism.

Worse are the claims that there are no IDEs that can help with LISP. I don't even know what to say to that one. The birth of the modern IDE can easily be traced to predecessors of emacs. (Indeed, for most purposes today, the only difference between emacs and other IDEs is the language of extension. And... emacs being able to edit itself largely as it runs.)

None of this is to say that a LISP is the panacea of development. I just don't think there is a panacea anymore.


> The difference between adding symbols to a language with product types (i.e. consing) and simply having algebraic data types in your language is that symbols are untyped, and so you're allowed to compare apples and oranges (they're inequal). This isn't terribly useful, as it's almost always a mistake to do it.

You are taking one trivial use case of Lisp-style symbols (enums) and completely ignoring their value as first-class identifiers that come as part of the language.

> Lisp is the "hold my beer, I know what I'm doing" of functional programming languages.

Lisp is not a functional programming language.


> Lisp is not a functional programming language.

?


It's a multi-paradigm language. You can use it for functional programming, but it's not strictly a functional programming language.


Yes. In fact, the point of symbols and code-as-data is that you can implement different paradigms in the language itself. A Prolog implementation, for instance, appears as an example program in Lisp books.


I generally agree with you, at least insofar as I want all the machine-provided help I can get.

"Programmers do not know what we're doing" is a little harsh. People can get a lot done in python, for example. As things grow, it gets tougher and tougher to do the right thing all the time. But there's certainly room for that quick, fluid interaction of a less constraining language.


While sometimes I don't know what I'm doing, the times I do, I prefer a language that can hold my beer and allow me to do what I want.

I want the machine to help me as much as I need, but no more. I also don't want to do its job.


Has anyone here read Programming in the 1990s or A Logical Approach to Discrete Mathematics?

Symbolic computing is a powerful concept. How does one model something simple that we take for granted, such as assignment? Here's a hint: look to Leibniz and Tony Hoare. Symbols and syntactic substitution give you powerful abstractions over expressions. Everything in Lisp is an expression. Symbols give you the ability to develop concepts built up from complex expressions. They let you reason about substitution, equivalence, etc. That's the real power: they're a tool that lets you think above the code.
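Common Lisp even ships Leibniz-style substitution as a built-in: `subst` performs syntactic substitution over an expression tree, which is essentially the move behind Hoare's assignment axiom. A minimal illustration:

  (subst 'x 'y '(+ y (* y 2)))        ; => (+ X (* X 2))
  (subst 3 'y '(+ y (* y 2)))         ; => (+ 3 (* 3 2))
  (eval (subst 3 'y '(+ y (* y 2))))  ; => 9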

And as others have pointed out, Common Lisp is typed. Type declarations, though optional, are amenable to static analysis. An interesting thing about Lisp that I haven't explored much yet is the ability to write a specification in a higher-level language, perhaps PVS, and generate the type annotations from it. The implementation would then have to satisfy the annotations in order to validate the program.
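A sketch of that static side (SBCL, for example, will warn at compile time when a call visibly violates a declaimed signature; other implementations may ignore the declaration):

  (declaim (ftype (function (integer) integer) double-integer))

  (defun double-integer (x)
    (* 2 x))

  ;; (double-integer "oops") ; SBCL flags this at compile time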

Nice article. A Gentle Introduction is a neat book.


Symbolic computing means the computation of formulas which contain not only numbers and their operators, but also names which stand for something (a variable in some calculus, a function in some calculus, a plan operator, a note, ...). Thus McCarthy developed a language for computing with symbolic expressions: Lisp. One of the things he/they found out was that Lisp itself can be described as symbolic computation, shown by the eval procedure and by representing Lisp programs as Lisp data. But generally he had all kinds of applications in mind. Among the first applications were logic formalisms and algebra. In this case one can not only evaluate algebraic expressions, but also do computation with them...
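In Lisp terms (a minimal sketch): a formula is just a list you can inspect, transform, and only then evaluate.

    (defvar *formula* '(+ (* 2 x) 1))  ; a formula containing the name X
    (second *formula*)                 ; => (* 2 X), a subexpression
    (eval (subst 3 'x *formula*))      ; => 7, after substituting X = 3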

What follows is from an early computer algebra program (Macsyma), which was written in Lisp and whose development has roots reaching back into the 60s. Though Macsyma is written in Lisp and provides a read-eval-print loop, it does not use s-expression-based syntax. Instead it uses a syntax which should be familiar to most people doing algebraic calculations.

expr1 is a variable and we set it to a formula:

    (%i5) expr1:x^28 + 1;
                                         28
    (%o5)                               x   + 1
we can sum two expressions:

    (%i6) expr1 + expr1;
                                          28
    (%o6)                              2 x   + 2
exponent:

    (%i7) expr1^expr1;
                                             28
                                     28     x   + 1
    (%o7)                          (x   + 1)
Let's factor an expression:

    (%i8) factor(expr1*3);
                       4        24    20    16    12    8    4
    (%o8)          3 (x  + 1) (x   - x   + x   - x   + x  - x  + 1)
Subtract 1 and y from the expression:

    (%i9) % - 1 - y;
                        4        24    20    16    12    8    4
    (%o9)     - y + 3 (x  + 1) (x   - x   + x   - x   + x  - x  + 1) - 1
Factor the expression:

    (%i10) factor(expr1);
                      4        24    20    16    12    8    4
    (%o10)          (x  + 1) (x   - x   + x   - x   + x  - x  + 1)
Subtract 1:

    (%i11) % - 1;
                    4        24    20    16    12    8    4
    (%o11)        (x  + 1) (x   - x   + x   - x   + x  - x  + 1) - 1
Simplify it:

    (%i12) ratsimp(%);
                                           28
    (%o12)                                x
Same idea: interactive computing with symbolic expressions, this time expressions from the domain of Algebra. The language also allows us to write programs:

    (%i14) myfact(n) := if n = 0 then 1 else n * myfact(n - 1)$

    (%i15) myfact(10);
    (%o15)                              3628800
It allows us to write functions for symbolic expressions, for example to implement integration, etc.


If you like that, it lives on in the open-source "maxima" (just an apt-get install away).


I think it's a bit of a disservice to always focus symbolic computation on mathematical equations alone. The strength of Lisp is that you can easily apply all of that to program statements, as source code is just a data structure.


What an awesome resource to introduce someone to lisp. I would add "On Lisp" by Paul Graham to the final recommendations. If you don't mind quite a bit of opinionated writing, "Let Over Lambda" can be a fun read too.


In the simple sense, I understand symbolic computation as the transformation of text. In a more general sense, all information is based on symbols ("letters"), so transforming one piece of information into another is a symbolic computation.


Amateur perspective here...

My understanding is that it is the transformation of lists of symbols. This is one step above parsing characters/runes into more meaningful primitive data. Lisp has facilities for manipulating trees of data. The book Land of Lisp makes this distinction clear in an interesting way: they implement a text adventure, and they separate the notion of words or concepts (model) from their final string representation (view). For example, you can express a phrase with something like

  '(you see exits leading (exit east) , (exit west) , and (exit north))
or

  '(you see exits leading (series (exit east) (exit west) (exit north)))
Run this through your parser for that list of lists of symbols, and you may end up with

  You see exits leading [east], [west], and [north].
where east, west, and north are highlighted if expressed on the console or converted into appropriate URLs if expressed in HTML. You can also imagine a potential here for language conversions.

Along similar lines -- with due credit to the PAIP book -- you could separate the notion of some mathematical formula from its inputs. Do manipulations, simplifications, whatever, and THEN provide inputs. This suggests that with symbolic computation and some macros, you can assist the Lisp compiler in coming up with the speediest version of some formula. Or expression. Or template output.
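A toy sketch in that PAIP spirit (hypothetical helpers, far simpler than the book's simplifier): rewrite away multiplicative and additive identities before any inputs arrive.

  ;; simplify (* x 1) => x and (+ x 0) => x, recursively
  (defun simplify (expr)
    (if (atom expr)
        expr
        (let ((op (first expr))
              (args (mapcar #'simplify (rest expr))))
          (cond ((eq op '*) (rebuild '* (remove 1 args) 1))
                ((eq op '+) (rebuild '+ (remove 0 args) 0))
                (t (cons op args))))))

  ;; collapse a trivial argument list back to a sensible expression
  (defun rebuild (op args identity)
    (cond ((null args) identity)            ; e.g. (* 1 1) => 1
          ((null (rest args)) (first args)) ; e.g. (* x 1) => x
          (t (cons op args))))

  (simplify '(+ (* x 1) 0))  ; => X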

Not sure how to get an exact link to this comment, but look at sklogic's back-and-forth with me here for more uses of this: https://news.ycombinator.com/item?id=11589614


Perhaps a better name would be "meta computation" or "abstract computation".


Peter Kogge's 'The Architecture of Symbolic Computers' is a great book on Lisp and Lisp machines and much more. It explains a lot of the questions this post has elicited.


I think there's an extra "the" in the sentence:

If you already know the some Lisp…


Symbolic computation is when you evaluate a binary executable as a math expression. You might try to solve for inputs that cause a crash. It's difficult due to well-known issues with the famous halting problem, but in practice you can get useful things done.


Are you thinking of symbolic execution?



