Numpy Clone in Common Lisp

gumby · on May 13, 2019

Nice work: not a skin over the python one but an actual, optimizable implementation.

varjag · on May 14, 2019

I concur, first rate work and good use of the language.

pmiller2 · on May 14, 2019

I think this is great and interesting, but do CL compilers optimize numerical code anywhere near as well as Fortran compilers?

gumby · on May 14, 2019

I would be surprised if so for vector arithmetic given the opportunities for modern FORTRAN to tell the compiler what to do. OTOH the whole point of metasyntactic macrology and the function object abstractions is to permit such things. But modern FORTRAN compilers are very good at what FORTRAN is good for.

The MACLISP compiler was as good and in some cases better than its contemporary FORTRAN competition. But FORTRAN tech has moved on since then.

jxy · on May 14, 2019

IBM doing Common Lisp. I wonder what this is actually used for inside IBM.

guicho271828 · on May 14, 2019

Author here. IBM is currently focusing on the combination of machine learning and reasoning, also called neural-symbolic AI [1], to combine the best of both worlds (noise robustness and explainability, etc.). I am from Tokyo and will join MIT-IBM lab soon to perform this line of research.

I have been working on Latplan [AAAI-2018, 2,3,4,5,6], a neural-symbolic classical planning system that learns symbolic representations of noisy images and perform efficient symbolic planning in the latent space. Later, several other groups have started working in this field, like [7].

The problem with python is the lack of sequential execution speed, which is necessary for solving symbolic tasks, e.g. SAT solver, classical planning, theorem prover. At the same time, traditional languages like C++, typically used for writing solvers, are too inflexible to accommodate the workflow in machine learning. For a neural-symbolic task, personally, Lisp is best in this space (I don't try to repeat why, because there are plenty of articles on it.), with Julia being a contender.

I am also interested in adding the compile time type/shape checking on the deep learning code, since python debugging is painful (just a personal opinion -- not intending to start a war here).

By the way, I made the first prototype of this lib in 3 days, and am working on improving it for 3 month (for other ML projects).

[1] https://medium.com/@MITIBMLab/the-shared-history-and-vision-... [2] https://github.com/guicho271828/latplan [3] AAAI18 presentation at : https://asai-research-presentation.github.io/2018-2-5-aaai/ [4] AAAI18 paper : Classical Planning in Deep Latent Space: Bridging the Subsymbolic-Symbolic Boundary, https://arxiv.org/abs/1705.00154 [5] Follow-up papers (ICAPS2019) : https://arxiv.org/abs/1903.11277 [6] Follow-up papers (ICAPS2019), extension to first order logic : https://arxiv.org/abs/1902.08093 [7] Learning Plannable Representations with Causal InfoGAN, https://arxiv.org/abs/1807.09341v1

kazinator · on May 14, 2019

Hi. Are there some openings over there? My entire career has been low-level development in C and C++ (embedded Linux in the last 15 years), with a passion for Lisp on the side (20 years of experience now). Since 2009 I've been developing the TXR language (http://www.nongnu.org/txr), which contains a Lisp dialect called TXR Lisp. My Japanese is okay-ish. I'm not a numerical researcher and have zero background in machine learning. I live in Vancouver, Canada.

mark_l_watson · on May 14, 2019

I hope you enjoy your new job! I am also interested in hybrid AI systems. I have been doing symbolic AI in Lisp (and sometimes Prolog) and also neural networks since the 1980s but I have never professionally combined disciplines (but some side of desk hacking at home). Glad to see the MIT-IBM lab peruse this with resources.

_ph_ · on May 14, 2019

Is there a specific reason you choose LGPL for the library? Are you aware of the issues to use that license with Common Lisp libraries? Common Lisp doesn't use shared library linking as C does, so some terms of the plain LGPL are not directly applicable to Common Lisp, Franz tries to compensate for this with the LLGPL, but it still more difficult to use correctly compared to a more permissive license.

kazinator · on May 14, 2019

Loading a Lisp .fasl is very much like loading a shared library. (Shared library loading can be regarded as a Greenspunned version of what goes on in Lisp.) There is code in there, referenced by symbols, which resolve. dlsym is like a hacky, low-level version of symbol-function or symbol-value.

C header files also have a ready analogy in Lisp: they are like Lisp files that provide macros and constants.

Where the LGPL says "The object code form of an Application may incorporate material from a header file that is part of the Library." we can understand this "header file" to refer to a Lisp file that provides macros that generate code within the Application's file.

The LGPL doesn't define what is a "header file"; it doesn't say that it's a C thing with typedefs and #defines. In that regard it has a flaw whether we apply it to a C shared library to a collection of Lisp object files.

_ph_ · on May 14, 2019

I know how Lisp loads files (disclaimer, I am professionally using Common Lisp) and consequently are aware of the LGPL issues. First of all, does LGPL only allow using fasl files or also building applications? Can you create a fasl file of a modified library and load it instead? Especially, if the user does not have the source code to the rest of the application?

The answers depend on the Lisp implementation used. There is a good reason that Franz decided to augment the LGPL (at their time version 2) with a license preamble to clear up the ambiguities in the context of Common Lisp.

kazinator · on May 14, 2019

Building an application executable may or may not violate the LGPL. It likely depends on whether the application allows new versions of the LGPL-covered modules to be loaded into it. The simple question is: can users exercise their LGPL-granted rights to that library: to modify the library code, and still use it with the proprietary application?

(FWIW, I've written a Lisp implementation that has compile-file and load, similar to Common Lisp.)

A Lisp-ified LGPL might seem perfect, but on the other hand there are some good reasons not to introduce derivative versions of de facto standard licenses.

_ph_ · on May 14, 2019

That is, why Franz wrote the LLGPL as a preamble to the LGPL. So the LGPL text is copied, only clarified in the Lisp context by the preamble.

armitron · on May 14, 2019

There are no ambiguities in LGPL. "Franz did LLGPL" is not by itself a good enough reason to suspect there are or to use LLGPL.

_ph_ · on May 14, 2019

There are no ambiguities in LGPL.

You are a lawyer? You are a professional Lisp programmer? How does the LGPL deal with Common Lisp macros?

"Franz did LLGPL" is not by itself a good enough reason to suspect there are or to use LLGPL.

As Franz is the most prominent Lisp vendor, employing some of the greatest Lisp programmers and undoubtfully having a good legal department, I see no reason why I shouldn't take their legal opinion seriously. Especially as it is not part of any commercial offerings of theirs, but just provided as a clearer license for Lisp libraries. It also coincides with everything I heard from my lawyers at the company I work for.

kazinator · on May 14, 2019

> How does the LGPL deal with Common Lisp macros?

It says the Library provides "header files" which deposit executable code into Application.

"Header file" is C jargon, but it's not given any definition in the license.

Lisp files that provide macros can be construed as "header files". They can be compiled (but C headers can be "compiled" also).

Nothing in the license seems to be sensitive to the technological/qualitative difference between C and Lisp macros.

jackdaniel · on May 14, 2019

You don't have to be a lawyer to read legal documents and use licenses. There is no reasonable interpretation of LGPL in Lisp context which would lead to a need of additional clarifications.

Here is my stance on this: https://common-lisp.net/project/ecl/posts/ECL-license.html.

TL;DR if someone wanted to have Lisp code to be licensed under GPL terms they would license it that way. I'm a professional Lisp programmer and I've spend a fair share of hours on reading licenses and analyzing them.

_ph_ · on May 14, 2019

This might work for ECL which is build as a shared library. But what if e.g. an SBCL user wants to use a LGPL Lisp library? How can the developer provide the ability to "link" against a modified version of the LGPL library? What about macros exported by the library? The LGPL talks about the szenario of linking shared libraries. This is a clearly defined process in the C world, which has no exact counterpart in the Lisp world. You claim that if the author of a library had other intentions, the author would have chosen the GPL. That might be true, but certainly isn't sufficient for legal certainty. Perhaps the author wanted allow only shared libraries where it is possible. The LLGPL is an easy amendment to make those intentions clear. Even easier, and better suited for Lisp environments (even following the LLGPL to the letter can be technically difficult) are licenses like BSD, which in this case would be the obvious one in my eyes, as Numpy itself is under the BSD license.

jackdaniel · on May 14, 2019

> This might work for ECL which is build as a shared library. But what if e.g. an SBCL user wants to use a LGPL Lisp library?

If you had read the post you'd have my answer to that question.

There is no need to spread fear and uncertainty about LGPL and other copyleft licenses, I've seen too much of that over last few years (and I have a strong intuition that it is not an accident).

P.S. "are you a lawyer?" argument is very cheap (and often happening on threads about copyleft licenses) given it is HackerNews not LawyerNews -- I know of lawyers working for corporations who say that they wouldn't touch anything with GPL in the license name with a five-meter pole (not knowing a difference between LGPL, GPL and AGPL) -- there goes your "in-depth legal analysis". Same lawyers doesn't bat an eye when they give a green light for eula-crippled code from their "technical partners". Cargo cult is present everywhere, not only in software development.

_ph_ · on May 14, 2019

If you had read the post you'd have my answer to that question.

I have read your post, and just reread it, and I don't see where you answer that question. How does an SBCL user deploy a program with an LGPL library so that the reciever of the program can relink it as required by the license.

P.S. "are you a lawyer?" argument is very cheap

I used this question on a post where the poster just makes a claim, not supporting it by any reasoning or linking to supporting documents. The question must be allowed how he justifies his statement. So for legal questions, this would be: are you a lawyer? as I would ask a random poster making definitely medical statements: are you a doctor?

And yes, I know, beeing a lawyer per se doesn't guarantee a correct evaluation, I have worked with far too many corporate lawyers, which are usually not experts on open source licenses. But I also had no doubts on the competence of those, who looked into the details of the licenses in question. I have been able to get their OK on LLGPL software for example, though GPL is of course limited to very isolated usages.

kazinator · on May 14, 2019

> This is a clearly defined process in the C world

Rather, shared-library mechanisms are platform-specific. ELF shared libraries are different from Windows DLL's, for instance. The LGPL covers both, because it's mum on implementation details.

The abstraction is basically the same. The application has some symbolic references satisfied dynamically by the library, and possibly vice versa.

> The LLGPL is an easy amendment to make those intentions clear.

I wouldn't use it because it's a rare license, and I'm not a lawyer. It's best to use one of the top five or so popular licenses. Licenses aren't programming languages; it's okay to use the popular thing.

Furthermore, bosses and customers don't want to hear that Lisp is so weird that it needs a modified version of a license that everyone else uses in stock form.

The LLGPL contains dubious, superfluous clauses like: "It is permitted to add proprietary source code to the Library, but it must be done in a way such that the Library will still run without that proprietary code present."

w...hy? The LGPL doesn't allow that, and there is no reason to do that in a Lisp library.

Code is not added to libraries in Lisp; it is added to the image. We can add the proprietary application into the image and we can load the library into the image. That's all the mixture of proprietary and LGPL that is needed; the proprietary sources don't have to appear to be mixed into the library in the project tree. Keep it in a separate directory, which contains a copy of the LGPL which applies to every source file in it. No brainer.

_ph_ · on May 14, 2019

> This is a clearly defined process in the C world
Rather, shared-library mechanisms are platform-specific. ELF shared libraries are different from Windows DLL's, for instance. The LGPL covers both, because it's mum on implementation details.

The abstraction is basically the same. The application has some symbolic references satisfied dynamically by the library, and possibly vice versa.

Shared libraries are not limited to C, but in the context it should have been clear what I was talking about. Most importantly, most Common Lisp systems are not using shared libraries in the way C and languages using C style bindings do.

> The LLGPL is an easy amendment to make those intentions clear.

I wouldn't use it because it's a rare license, and I'm not a lawyer. It's best to use one of the top five or so popular licenses. Licenses aren't programming languages; it's okay to use the popular thing.

I also am strongly advocating not using rare licenses. But they are still preferrable to licenses which do not work as intended for technical reasons. LLGPL is one thing, I would rather recommend a BSD style license, as it is very well known, has no technical problems involved and is actually the license of the original Numpy code. Why choose a different license for a reimplementation of Numpy?

armitron · on May 15, 2019

Because it's not a derived work (not based on numpy code) thus the library author is free to choose a license that he feels best protects his code and aligns with his interests.

LGPL offers significant protections that can not be found in BSD.

_ph_ · on May 15, 2019

Because it's not a derived work (not based on numpy code) thus the library author is free to choose a license that he feels best protects his code and aligns with his interests.

In this case, the author could choose LGPL even if it were derived on the numpy code. So yes, technically the author is free to choose about the license. However, unless specific and significant reasons require this, I would find it just appropriate to choose the same or a similar license.

LGPL offers significant protections that can not be found in BSD.

Well, I asked the author - without ever getting an answer - whether there would be specific reasons, why the LGPL would be needed. For example, because of requirements by his employer or other specific reasons. Other than that, yes, people always claim that the GPL style license offers especial "protections". But one has to look at the environment. We are talking about a port (derived or not) of the BSD licensed Numpy, intended to run e.g. on the BSD licensed SBCL. So if the library would require special "protections", one could name them.

lostmyoldone · on May 14, 2019

All legal text, and the interpretation of it is ambiguous to some extent. Sometimes by design, but always through the very nature of not being an unambigous formal language, and hence not trivially machine interpretable.

guicho271828 · on May 14, 2019

I was convinced by this paper by lawyers analysing LLGPL that LGPL is just enough.

https://ja.reddit.com/r/lisp/comments/9gjd0m/lisping_copylef...

_ph_ · on May 14, 2019

The link at the top of the page does no longer seem to lead to an article unfortunately. Do you have a direct link to the article?

guicho271828 · on May 14, 2019

the article title is the name of the paper. "Lisping Copyleft: A Close Reading of the Lisp LGPL"

_ph_ · on May 14, 2019

Thank you, Google found the pdf for me :). I am going to read the pdf in detail, but skipping to the conclusion the author only states:

Even so,this article has shown that the clarifications made by the LLGPL to the original GNU license are largely unnecessary, and that the LGPL would probably be interpreted in a similar fashion withoutthe clarifications proposed by the LLGPL.

The useage of the words "largely" and "probably" would make me less confident of the verdict, that the LGPL is enough.

On a different angle: I checked Numpy and it is using a BSD-license. What was your motivation to license your library differently? Is your library derived from Numpy or just implementing the same api? A BSD-style license would make any technical issues - both legal and actual distribution of the product - just go away.

guicho271828 · on May 14, 2019

The "largely" and "probably" are the mentions to the fact that there aren't yet legal cases concerning LLGPL. I am not an expert in law, but given that the journal is peer-reviewed, I will give some credit. https://www.ifosslr.org/index.php/ifosslr/about/editorialPol...

_ph_ · on May 14, 2019

If you allow me to repeat my question, is your library derived from Numpy? If so, what would be the reason to switch it to a different license type?

guicho271828 · on May 14, 2019

armitron · on May 14, 2019

A BSD-style license does not protect contributions/changes to the library so it follows that it is not a drop-in replacement for LGPL.

Multiple people have told you that there are no technical issues with LGPL applied to Lisp programs but of course you may persist in thinking so. Do not expect others to share your anxieties though, particularly when it comes to deciding which license to use for their code.

_ph_ · on May 14, 2019

A BSD-style license does not protect contributions/changes to the library so it follows that it is not a drop-in replacement for LGPL.

I suggested a BSD-style license, as it would resolve any possible legal problems and is actually the license of the library that was cloned. I asked the author, whether he could name a concrete reason why he switched his work to a different license than the original library, which I would find an appropriate default.

Multiple people have told you that there are no technical issues with LGPL applied to Lisp programs but of course you may persist in thinking so. Do not expect others to share your anxieties though, particularly when it comes to deciding which license to use for their code.

How many of those people are lawyers? Law isn't decided by a majority vote. I am a professional programmer working with far too many lawyers to do my work. Calling my concerns about the legal technicalities of a licence "anxieties" is quite inappropriate.

jxy · on May 14, 2019

It is a good question. You would normally think a corporation would go with MIT or BSD instead.

jdc · on May 14, 2019

What are your thoughts on Clasp so far?

Nrsolis · on May 14, 2019

This might be a clue:

https://developer.ibm.com/recipes/tutorials/watson-iot-with-...

agambrahma · on May 14, 2019

Good to see someone fill in this missing link! Could become a viable competitor to Julia (uniform language stack, as opposed to the Python+C combination currently being widely used).

ddragon · on May 14, 2019

I think it's more than a uniform language stack, after all C++ could be considered one as well but most people still use python to interface it as opposed to just using it (at least for exploratory data science). It's a combination of interactivity to experiment, conciseness to stay close to the pseudo-code/math, easy hackability since you need to stay beyond the cutting edge and performance so you're not restricted in what you can do.

Common Lisp (as does Clojure and Julia to a lesser extent) has unparalleled interactivity (besides maybe smalltalk), you can make it as concise as you want, ultimate hackability and really good performance. Research (not deployment) is the one area CL could probably surpass every other, but unfortunately it lacked a concentrated effort (by a company or community) in this current AI resurgence (maybe because of the Lisp Curse). Though there will always be other opportunities.

longemen3000 · on May 14, 2019

A funny situation, the Julia parser is femtolisp, so in a funny way, Julia is already a kind of numpy in lisp

pmiller2 · on May 14, 2019

From the point of view of a numpy user, the stack is 100% Python. You just import it and don’t worry about it.

_0ffh · on May 14, 2019

It's a bit sad that Lush seems to be kind of dead since YLC has moved on...

eggy · on May 14, 2019

I agree. Lush 2 was very promising. On the bright side, now Lisp is back in what seems a good fit of research and practicality.

fartcannon · on May 14, 2019

This is an excellent project. Is there a developer blog? I'd like to try reimplementing numpy I'm various other languages (or part of it, since it seems like it's a huge task).

enriquto · on May 14, 2019

You can re-implement a much better interface that numpy if your purpose is doing math comfortably. Numpy has a wonky interface to ease the transition from matlab/octave (while amazingly producing a much less comfortable system).

nurettin · on May 14, 2019

I've used numpy without much trouble, but matplotlib really shows how creative and non-intuitive interfaces can get when developers are left to experiment with function and method parameters in a dynamic language.

nemoniac · on May 14, 2019

Could you point to a "much better interface" to numerics?

enriquto · on May 14, 2019

Well, for one, octave or scilab are perfectly accetable solutions. Or even "julia" if you pressure me.

Or fortran, where there is no stupid overhead in "for" loops. Or C. Hell, or even luajit if you do not need fancy stuff.

Honestly, python/numpy is possibly the slowest and less convenient way to do math (assuming that you only want to do math, and no need any strings/dictionaries and other useless data structures).

mkl · on May 14, 2019

With the possible exception of Julia (which I don't know), all of those languages have much clunkier and more primitive numerics interfaces. Even just considering array indexing Numpy is more capable, convenient, and comfortable, and Numpy's array-wise maths operations are fast, as they're implemented in C and Fortran.

newen · on May 14, 2019

Coming from Matlab, it took me a couple of hours with the documentation to learn about the indexing mechanisms in numpy. It's highly unintuitive. Numpy is very awkward to use all around.

mkl · on May 15, 2019

That's surprising to me. I came from Matlab too, and I found Numpy's indexing immediately simple and (to me) clearly superior. It seems largely a superset of Matlab's, but nicer (e.g. a[-3:] instead of a(end-3:end)). The biggest differences are the 0 base, and that slices are half open. What specifically did you find difficult?

Edit: I'm asking as I teach this stuff to undergrads and need to understand potential obstacles.

newen · on May 15, 2019

This was a few years ago but here's what I somewhat remember. First was obviously the 0-based indexing, non-inclusive ranges, and the row-major array ordering.

Then, there are a few issues when you get past non-trivial indexing of 1D arrays like a[3:]. For matrices, the semantics of numpy says that a 2D array is a 1D array of 1D arrays. So if A is 2D array, then A[3] is the 4th row of the array. Not great.

Another is that given 2D array A, and indexing vectors r and s, A[r,s] returns a 1D array while Matlab returns a 2D array with completely different semantics.

Another is negative indices.

Then you have things like A[(1,2)] being different from A[[1,2]]] and the whole concept of slices objects like Ellipsis, np.newaxis, and their combinations like (Ellipsis, 1, np.newaxis).

Another is that indexing returns a view and not a copy.

This is just indexing. There are tons of other issues with numpy like the multiply operator, the seemingly random way that they split methods (as in A.sum()) and applicable functions (as in np.sum(A)), and other nonsense.

Numpy is basically a library stuck on top of Python with Python not being away of numpy or multi-dimensional arrays at all. It's ridiculous that numerical and scientific programming has devolved into working with numpy.

enriquto · on May 16, 2019

> Numpy is basically a library stuck on top of Python with Python not being aware of numpy or multi-dimensional arrays at all. It's ridiculous that numerical and scientific programming has devolved into working with numpy.

This! So much this!

Many people have lived through different phases of numpy appreciation. First, when it appeared, it was amazing to be able to access huge arrays from within a script. Then when it evolved, there was a slight suspicion that things were going a bit out of hand. Later it became a caricature of itself when it began to replace matlab. Today, we are in the tragic state where many young people think that "the only possible way to multiply two matrices on a computer is by using numpy.dot".

ddragon · on May 14, 2019

Numpy is mostly not much different from the matlab syntax (and therefore Julia and Scilab which it inspired, and octave which copies it) [1] outside of not being imported in the current namespace, a little more awkward multidimensional array creation and a few edge cases. It's also very fast.

Of course, if you want to do stuff that is not implemented, all those others will do better since you're using the native types and everything works with it (which is the same with this CL Numpy Clone). And I can't see how C is a good interface for numerics.

[1] https://cheatsheets.quantecon.org/

taffer · on May 14, 2019

Maybe something closer to APL: https://curtisautery.appspot.com/5776042744610816

chunsj · on May 14, 2019

Maybe this will need C based highly optimized backend as well, however, the code looks very clean and beautiful to me. I hope someday differentiable programming becomes possible in common lisp.

jlarocco · on May 14, 2019

SBCL is actually really good at optimizing this type of code, and I think it can even do some vectorization.

If it's not not good enough, there are a few CL bindings and wrappers around BLAS and LAPACK.

A good one is Matlisp, which is available in QuickLisp.

http://quickdocs.org/matlisp/

fiddlerwoaroof · on May 14, 2019

Sbcl has all the pieces in place for vectorization, but I don’t think anyone has actually hooked them up. However the great thing about lisps is that the compiler and optimizer are written in the language they compile, so an end user can do this sort of thing themselves: https://www.pvk.ca/Blog/2013/06/05/fresh-in-sbcl-1-dot-1-8-s...

pjmlp · on May 14, 2019

Optimizing Lisp compilers are quite good.

Current generations tends to be unaware that once upon a time C compilers only generated turtle code, it was the investment into optimizing compilers and abuse of UB that made current compilers able to generate fast code.

jstimpfle · on May 14, 2019

Another C hate post? It seems way overblown. When you compare trivial compilation of C code with optimized compilation, you would see maybe a factor of 5 in difference in code size.

On today's machines, I think you can expect to get more than 50% of performance from trivial compilation, compared to optimized code. Depending on workload you it can be 80-90%. (I have a small amount of experience, based on my own very shitty compiler, to back up that claim).

Even on machines from "once upon a time", where performance was more directly correlated with code size, it couldn't have been that bad, assuming the factor of 5. And also seeing the fact that C was used for systems development from the start.

And that's still ignoring (or at least contradicting) the common lore that "once upon a time", LISP was comparatively more inefficient than C than it is today!

And I don't believe that abuse of UB is important to generate fast code. After all, fast code doesn't necessarily contain UB.

pjmlp · on May 14, 2019

I have written plenty of them since C vs C++ on comp.lang.c were a thing.

Once upon a time Common Lisp compilers were already good enough versus Fortran.

https://people.eecs.berkeley.edu/~fateman/papers/lispfloat.p...

As for C's performance on the old days,

"Oh, it was quite a while ago. I kind of stopped when C came out. That was a big blow. We were making so much good progress on optimizations and transformations. We were getting rid of just one nice problem after another. When C came out, at one of the SIGPLAN compiler conferences, there was a debate between Steve Johnson from Bell Labs, who was supporting C, and one of our people, Bill Harrison, who was working on a project that I had at that time supporting automatic optimization...The nubbin of the debate was Steve's defense of not having to build optimizers anymore because the programmer would take care of it. That it was really a programmer's issue.... Seibel: Do you think C is a reasonable language if they had restricted its use to operating-system kernels? Allen: Oh, yeah. That would have been fine. And, in fact, you need to have something like that, something where experts can really fine-tune without big bottlenecks because those are key problems to solve. By 1960, we had a long list of amazing languages: Lisp, APL, Fortran, COBOL, Algol 60. These are higher-level than C. We have seriously regressed, since C developed. C has destroyed our ability to advance the state of the art in automatic optimization, automatic parallelization, automatic mapping of a high-level language to the machine. This is one of the reasons compilers are ... basically not taught much anymore in the colleges and universities."

-- Fran Allen interview, Excerpted from: Peter Seibel. Coders at Work: Reflections on the Craft of Programming

jstimpfle · on May 14, 2019

Cherry picking and pulling the sufficiently smart compiler card. "floating point processing in lisp"... "automatic parallelization"... yeah

Nobody doubts that you can efficiently compile high-level LISP programs for certain specific problem domains. In some cases, you can even get an edge over the equivalent, ugly, C program.

But empirically, for general systems programming, C is second only to assembly in performance, the only problem being pointer aliasing (which you can easily avoid in many cases by using global data instead of doing overabstracted OOP).

That's not because C is perfect, but because it gives you the right mindset, and gives you control in a very accessible way. (And yes, of course you can write C-like systems code in LISP, too... If you're masochistic enough).

The benchmarks game should be enough of a proof here (and it's not even for complicated problems). Why don't you try improving SBCL times there and then come back?

https://benchmarksgame-team.pages.debian.net/benchmarksgame/... https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

(Not that I care for performance factors of 4 that much by default. These are simple programs. And at least they refute the ludicrous claim that C is slow, right?).

jsjolen · on May 14, 2019

>But empirically, for general systems programming, C is second only to assembly in performance,[...] and gives you control in a very accessible way.

Alright, but to what degree is the latter the cause of the former? The OP has already noted that C used to not be very fast, and that it became fast through the rise of sufficiently smart compilers. Do you dispute this claim?

>Why don't you try improving SBCL times there and then come back?

What good would that do? You have already seceded that you can write a C-like program in Common Lisp. This is especially true in SBCL, where you could write a Lisp program to generate the correct assembly code for the program.

jstimpfle · on May 14, 2019

> Do you dispute this claim?

I wasn't alive at that time... But as described I'd wager to dispute it. You need much more sufficient-smartness to make LISP fast. Isn't it obvious? Or is that somehow counter to any history you've read, except when cherry-picked and reinterpreted by some HN commenter?

> The OP has already noted that C used to not be very fast

Not very fast, compared to what? It's just not extremely fast when compared to numerical Fortran code (not: systems programming) or when compared to Assembly. Right? In my book this is just the worst kind of cherry picking.

If it was a reasonable claim, then would someone name the production OS from the early 70s written in LISP, and explain how it's better?

jsjolen · on May 14, 2019

>If it was a reasonable claim, then would someone name the production OS from the early 70s written in LISP, and explain how it's better?

Better or more performant for some measure of performance? I mean, you can always read the UNIX hater's handbook if you're interested in qualities that UNIX did not possess decades ago.

>You need much more sufficient-smartness to make LISP fast. Isn't it obvious? Or is that somehow counter to any history you've read, except when cherry-picked and reinterpreted by some HN commenter?

If you want to make a pedantic argument, then no, it's not difficult to make Common Lisp potentially as fast as C. This is because I can very easily drop down to a level of safety which is on the same level as C. This is, of course, unsafe and I would never recommend it.

I don't understand why numerics performance is cherry-picking, but "systems programming" (what does that mean? OS dev. only?) is not.

This would be a much simpler discussion if you concisely describe in a semi-formal manner why C has a larger potential to be fast than CL. If you did so, it probably would be a lot more difficult for me or anyone else to refute your claims.

jstimpfle · on May 14, 2019

> If you want to make a pedantic argument, then no, it's not difficult to make Common Lisp potentially as fast as C. This is because I can very easily drop down to a level of safety which is on the same level as C. This is, of course, unsafe and I would never recommend it.

That's stating the obvious, which I had already stated in advance if you care to read it. (Don't miss the extra "If you're masochistic enough"). What you're presenting is a Turing tarpit kind of argument - totally irrelevant to anybody with a sense for practicality.

> Better or more performant for some measure of performance? I mean, you can always read the UNIX hater's handbook if you're interested in qualities that UNIX did not possess decades ago.

From what I remember, the Unix hater's handbook is half satire (based on good understanding), and half wrong. Yes, UNIX is the worst useable OS, except all the alternatives.

> I don't understand why numerics performance is cherry-picking, but "systems programming" (what does that mean? OS dev. only?) is not.

Because most programs aren't numerics programs, but every software system needs "systems programming". And coincidentally, C was made for systems programming. Unlike Fortran, it was not made specifically for numerics / scientific programming.

> This would be a much simpler discussion if you concisely describe in a semi-formal manner why C has a larger potential to be fast than CL. If you did so, it probably would be a lot more difficult for me or anyone else to refute your claims.

I suggest actually reading my comments, because I've written all that I have to say. In one sentence, the other side's arguments are all either of the sufficiently-smart-compiler kind, or the performance equivalent of the Turing Tarpit fallacy.

jsjolen · on May 14, 2019

Hi!

Just a last comment, we're kind of off the rails here. I'll read your reply but I won't answer, I hope you're OK with that!

> What you're presenting is a Turing tarpit kind of argument - totally irrelevant to anybody with a sense for practicality.

Not really, it requires very little to do so. Here's how to make fixnum arithmetic go wroom-wroom: https://plaster.tymoon.eu/view/1380#1380

Does that look like a Turing tar-pit to you? Scroll down to the assembly.

I mean, I do think that C-level safety is unacceptable though.

> [...] and half wrong. Yes, UNIX is the worst useable OS, except all the alternatives.

You happen to be misinformed regarding this. It's true today regarding Linux perhaps, but not back then.

>I suggest actually reading my comments, because I've written all that I have to say. In one sentence,

Didn't you just say that empirical evidence points towards C being fast? That's not enough.

pjmlp · on May 14, 2019

C wasn't fast compared to many other high level compilers on mainframes, some of them still being sold by IBM and Unisys, which kept the original system languages instead of throwing it away and rewrite everything in C.

And on 8 and 16 bit systems, any junior Assembly programmer could easily outperform code generated by C compilers.

igouy · on May 20, 2019

> What good would that do?

It would surprise and nudge expectations in a positive way.

> You have already seceded that…

?

conceded ?

darkpuma · on May 14, 2019

BLAS/LAPACK is fortran, not C.

stephencanon · on May 14, 2019

BLAS is a set of subroutine interfaces that originated in Fortran, but can be implemented in any language; most mainstream implementations use one or more of Fortran, C, C++, assembly, and DSLs.

gnufx · on May 14, 2019

Right, though a LAPACK implementation is probably going to be mostly Fortran from the reference version, typically with some optimized routines, e.g. what OpenBLAS imports. Something to note is that unfortunately, the interface is the obsolete Fortran77-style, not modern Fortran, though that does make it more likely you can interchange implementations at run time as appropriate.

darkpuma · on May 14, 2019

It's my impression that common implementations of BLAS have fortran at the core because fortran (moreso than C) lends itself to automatic vectorization.

gh02t · on May 14, 2019

This is partially true (or at least once was, the performance gap is pretty slim nowadays), but most of the high-performance BLAS implementations like OpenBLAS or MKL have their hot path code written in hand-tuned assembler anyway.

stephencanon · on May 14, 2019

BLAS inner loops are usually explicitly hand-vectorized, so Fortran’s autovectorization advantages don’t help. I’ve written many BLAS kernels in assembly over the years.

gnufx · on May 14, 2019

Not arguing with that, but I think the jury is out on whether they need to be hand vectorized. With recent GCC, generic C for BLIS' DGEMM gives about 2/3 the performance of the hand-coded version on Haswell, and it may be somewhat pessimized by hand-unrolling rather than letting the compiler do it. The remaining difference is thought to be mainly from prefetching, but I haven't verified that. (Details are scattered in the BLIS issue tracker earlier this year.)

For information of anyone who doesn't know about performance of level 3 BLAS: It doesn't come just from vectorization, but also cache use with levels of blocking (and prefetch). See the material under https://github.com/flame/blis. Other levels -- not matrix-matrix -- are less amenable to fancy optimization, with lower arithmetic density to pit against memory bandwidth, and BLIS mostly hasn't bothered with them, though OpenBLAS has.

stephencanon · on May 14, 2019

Haswell is now a 6 year-old core, so compiler cost models and vectorization support for AVX2 + FMA have mostly caught up. The state of autovectorization targeting AVX-512 or even NEON on new cores is quite a bit less satisfactory for now.

It's long been the case that compilers generate adequate vector code for five-year old cores. A large part of the wins for hand-vectorization (and assembly) come from targeting cores that haven't even been released yet, or were just made available.

Also, 2/3 is ... significantly worse than I would expect, actually. My experience writing GEMM (I did it professionally for a decade) is that getting to 80% of peak is mechanical, and the remaining 10-15% is where all the real work is.

It's also a mistake to ignore L1 and L2 BLAS; for data that fits in the cache hierarchy, these become important, and you can absolutely squeeze out significant gains from hand optimization. For the HPC community, such small problems are uninteresting, but for everyday consumer computing, you can reap huge benefits here.

gnufx · on May 15, 2019

I wasn't in a position to make realistic comparisons on SKX at the time, but it seems that Zen is similar to Haswell (and Haswell/Broadwell is still pretty relevant in HPC). I'll eventually get back to it and try on SKX. I expect you can do better with the generic code, as suggested. The point is that received wisdom is that you can't just rely on compiler vectorization but need careful hand-(un?!)rolled code. The story around BLIS was that GCC wouldn't vectorize loops that at least GCC6 actually does (better than the SSE3 code in the how-to-optimize-dgemm tutorial). Then the suggestion was to force it with the OpenMP simd pragma, which is worse on Haswell because it doesn't use fma, but I don't know if that's relevant for, say, POWER using the generic kernels. I'm not actually ignoring level 1 and 2, although my interest is HPC. BLIS' reduction-type loops in generic C at least get vectorized now via -funsafe-math-optimizations. Small dgemm is important in some HPC applications too, for which BLIS doesn't do too well, but there's libxsmm.

gnufx · on May 14, 2019

You'll typically need "restrict"ed arguments for vectorization in C, but then GCC usually does vectorize the loops OK in BLAS-like code. There may be advantages from Fortran arrays; I forget the details.

stephencanon · on May 14, 2019

Fortran gets you both restrict and re-association allowed by default, plus the nice syntax for vector ops. That's about it, though.

yarrel · on May 14, 2019

It's unlikely to need an optimized C backend :-) -

http://www.iaeng.org/IJCS/issues_v32/issue_4/IJCS_32_4_19.pd...

zmmmmm · on May 14, 2019

Wonder how it compares to using something like ND4J with Clojure? That would (as I understand it) have the advantage of not sacrificing as much speed.

noobermin · on May 14, 2019

Sympathetic, but as long as lisp requires one to write math as function composition it won't catch on with computational types. Of course, I imagine that sort of thing could be a set of macros away...then that would be awesome, familiar infix mathematical expressions with common lisp in the rest of the code.

dreamcompiler · on May 14, 2019

I've been doing serious math with common lisp for 30 years. For me, the convenience of having automatic bignums and complex numbers, n-ary functions, and true rationals means I can think about math rather than about the limits of computers. The prefix notation became a non-issue in about a day.

dwheeler · on May 14, 2019

"Prefix notation is a non-issue" is true for a tiny minority of software developers and mathematicians. The world has already spoken: infix is a minimum requirement for the vast majority of people doing math.

Bignums, true rationals, and many other constructs are already available in Python. Numpy is too. Most importantly, infix is available in Python.

The funny thing is that it's already possible to do infix in Common Lisp. My "readable Lisp" approach has been supporting infix in standard Common Lisp: https://readable.sourceforge.io/ It's already available in the QuickLisp libary as "readable". You can then use {1 + 2} with macros, quasiquoting, and so on.

dtornabene · on May 14, 2019

the idea that prefix notation is some kind of serious hold up for anyone thats mastered, or is even semi-competent at the kind of mathematics you would use this for is so strange to me. Programming and math are (almost) nothing if not the ability to learn new rules of syntactical manipulation in all kinds of different domains. I say this as some one who is both a lisper and has severe dyslexia. It just seems like a totally made up hangup for a programmer to have. Its also weird to say "the world has spoken" when 60 years ago there were all of three or four non-assembly programming languages, and only 30 years ago prefix was still highly represented in serious industrial and research computing. Things change. Even five years ago doing "functional" programming in javascript was the fringe of the fringe of front end web development.

fiddlerwoaroof · on May 14, 2019

Yeah, function notation and much other math notation is already prefix: f(x), etc. the complications arise because for historical reasons we have the notion of infix operators, that complicate everything by forcing people to think about associativity and precedence.

vnorilo · on May 14, 2019

I think the reason infix feels "nice" is that you can sometimes write dataflows from leafs to roots, whereas with prefix you work from the root up. But then you use a lisp threading macro which lets you do that consistently, with any set of functions/operators.

antepodius · on May 14, 2019

I find I can still go leaf-to root, at least eith my editor (vim with the slimv plugin)- You can press ,W to 'wrap' a lisp expression in another set of parentheses, which means for something like (a + b) * (c + d) You can write the '(+ a b)' first, then wrap it in a (* [...])

vnorilo · on May 14, 2019

Thanks for bringing up another benefit of s-expressions - paredit and friends!

lgessler · on May 14, 2019

Let me tell you my perspective as a Lisp enthusiast and ex-software engineer who now works in academia. I think the problem is that most people who write software in the world (even many professional software engineers) just want to solve their problem and will resist if at all possible having to learn anything difficult (such as S-expression syntax), even if it is promised that their hard work will pay off.

The academics I work with, for instance, prefer Python, tolerate Java, do OK with JavaScript, but run for the hills when they see Lisp. When I tell them that ClojureScript not only has semantics with consistency that will elicit sobs of joy, but is also historically the most stable and concise way of targeting JS in the browser, maybe a few of them will be interested, but as soon as I show them a code snippet it's all over. They turn their heads sideways, they frown so that their lips are like the parentheses that splay before them, they mutter something about parentheses, and they go back to tripping over themselves in JavaScript trying to remember which Array functions are mutating or pure.

Learning a new syntax makes you feel dumb, even if you know how to program. And if you're feeling dumb while you're working on something that isn't even strictly related to the problem you're supposed to be solving before your deadline, it starts to get really hard to justify it. Most people only know something with a Python-like syntax, so most people will only ever work with languages which look like Python or Java since it's what they learned in their programming curriculum. It's ALGOL's world, and Lisp's just living in it.

Who does learn Lisp, then? I'd say, from what I've learned about the people in the Lisp world: mathematicians and computer scientists with intrinsic interest in PL's or the mathematical underpinnings of Lisp, old school symbolic AI researchers, and grizzled software engineering veterans who believe Lisp-family languages are engineering weapons of choice. All of these people have motivations (interest in mathematics and PL's per se, an academic interest with an historical connection to Lisp, and an insatiable itch for an ever more comfortable hammock even if it means getting out of the one you're in right now) that do not apply to the larger class of programmers that I just described.

I wish it weren't so. I wish more people would take the time to learn Lisp or an editor like Vim or EMACS: we, ahem, enlightened ones, know what a pittance the upfront investment will come to seem when the benefits of a superior tool are reaped over a lifetime, but people do not listen or do not believe when we tell them so. (Maybe they're right.) Perhaps we should ask ourselves what could make the promise of our salvation more credible.

dtornabene · on May 14, 2019

I ..just... learning s-expressions isn't hard. At all. Its a made up thing and I think people cling to it because it gives them an out, they don't want to feel dumb or uninteresting in their work (understandable). As far as who learns lisp these days, while Clojure may not be doing Python numbers in terms of adherents, it is absolutely used in increasing numbers in business software, has multiple decent sized conferences and forms the back bone for a number of large open source projects. Its also used as a foundation for a lot of devops and build-tools type work. Racket scheme is used by at least a dozen universities for everything from intro to programming classes, to advanced semantics and compilers classes. On top of that there are a suite of new tools written recently (last few years) in Guile which are slowly gaining ground (Guix being the main one), Chez was recently opensourced (and is being integrated into Racket), Common lisp is still used all over and, wait for it, emacs isn't going anywhere. Lisp isn't the specialist tool or old ("grizzled") standby its purported to be. I think this is some message board logic, honestly. Not trying to be overly hostile or antagonistic, I just don't see the scene/market/environment in those terms. For example, I just discovered an extremely interesting but kind of weird embedded lisp in the wild several months ago! Its a scripting extension to a binary analysis framework that doesn't have first class functions, but does utilize meta-programming to drive a term-rewriting engine that executes single instructions on an emulator. Awesome, and strange, and, as far as I've been able to tell, not noted anywhere people draw up lists of lisp implementations (its been out, open sourced for almost 7 years that I know of). Lisp is kind of everywhere, it just doesn't have an organized messaging clique pushing awareness constantly everywhere (thankfully).

vnorilo · on May 14, 2019

Right. Who are these people who do (numpy stuff) linear algebra, svd decomposition, pseudoinverses, linear regression, machine learning, but see a S-expression and immediately give up?

smallnamespace · on May 14, 2019

Mathematicians aren't averse to learning syntax, but they're lazy.

They learned that stuff because they are either basic methods in the field that have demonstrated value, or what they're actually doing research on.

The syntax buys them power that they wouldn't otherwise have, and they can smell it.

S-expressions? Not so much - the notation is more or less and arbitrary choice unless you're going to leverage the metalanguage capabilities to the hilt, and there are many other programming approaches available.

A mathematician who really wanted to jump into the deep end with programming syntax/structure would much more likely pick up Haskell (categories, yay!) or AVL rather than Lisp.

mkl · on May 14, 2019

Understanding s-expressions isn't hard, but reading s-expression code quickly when you're not used to it is very slow and frustrating. Fluency is essential for efficient use of a language, and breaking people's already-existing infix fluency is quite a hurdle (or certainly appears to be from the outside).

lgessler · on May 14, 2019

The claim that learning how to read and write s-expressions is hard might well be a "made up thing". (I agree with you, I think anyone, if they sat down for a day and really applied themselves to it, would get it.) On the other hand, I think you'd agree, the fact that people have strong, negative, emotional reactions to s-expressions when they first encounter them is absolutely not made up.

We could sit here and grumble about how silly it is that people cannot overcome such an irrational response (indeed I often do), but that won't change the fact that people have these responses. If we did that, we'd be doing the same thing that GUI designers did in the 90's. "Why, the widget for $x is right here. Any reasonable user will RTFM, know that $x can be checked there, and behave perfectly rationally." (And it happened just so, as decreed from developer's rolling chair, right?) Of course, we know now how misguided this is as a method for UX design: all users are in a sense irrational, and systems need redundant strategies for saving the user from themselves.

The similarity here is that we cannot will away intellectual blemishes in users, we can only accommodate them. That is Lisp's challenge if it is ever to get big. For GUI's, the solution was to take seriously users' need for visually attractive interfaces, redundant display of information, and guardrails/handholding for actions commensurate with their risk, to name a few strategies. For Lisp, it's hard to know what might be the right approach. As klibertp writes in a sibling comment, maybe the solution needs to come from outside the language itself.

klibertp · on May 14, 2019

> Perhaps we should ask ourselves what could make the promise of our salvation more credible.

Consistent, long-term, highly visible propaganda put everywhere. Paid ads, people proselytizing in the streets, billboards with posters repeating the same message over and over.

In general, to get popular, you either need to luck out and be at the right place at the right time, or you can force it with propaganda. I don't see any other way. Rational arguments are useless if no one is aware they exist. Plus, rational arguments tend to require thinking on the receiving end, which is hard for people who just glance at something in passing. Simple message hammered into people's heads over and over is the only thing that seems to work.

If I ever become a billionaire, I will fund such a campaign. It's so disheartening to see a great technology being left in the dust due to historical accidents and resistance to any change that's intrinsic in most people. We can't do anything about the AI Winter or Lisp-machines demise, but - as many successful ad campaigns show - we can convince people to change their ways. It's just incredibly expensive, time-consuming, and hard to do right - as many failed ad campaigns show.

klipt · on May 14, 2019

The flip side is: network effects. What is popular may not be perfect, but for any project that requires more than 1 coder, you can often get a lot more done in a popular language because you can find other coders to work with.

And even for solo programmers, it's easier to find an answer on stackoverflow for a popular language than an unpopular one.

rich-tea · on May 14, 2019

You're actually getting at something deeper which I've been noticing more and more lately. Many people seem really reluctant to learn how to use things. It only seems to happen with software. They see a program or a new language and seem to think "well, I'm already a coder, I should be able to use this". When it's even slightly different to what they expect they complain that it's "hard".

It's really strange to me. Imagine if people did that outside of software. I can already walk so I must be able to ride a bike. I can speak English, so I must be able to speak French. The same goes for software. I spent years trying to understand infix notation and learning the precedence rules. Why would it not take time and effort to learn prefix notation?

oblio · on May 14, 2019

> It only seems to happen with software.

Nope, it happens with EVERYTHING. If it doesn't work like something they've used before, they have to really, really need it to use (as in, be forced to use it), otherwise, into the trash it goes.

rich-tea · on May 14, 2019

You may well be right. I probably only notice it with software because that's what I do every day.

dreamcompiler · on May 14, 2019

> Bignums, true rationals, and many other constructs are already available in Python. Numpy is too.

The difference is that these things all run dramatically faster in CL than in plain Python, because CL is compiled AOT. In CL, a C library for math is optional. In Python, Numpy is mandatory.

> Most importantly, infix is available in Python.

Infix is certainly available in CL too, as other posters in this thread attest. I just don't feel any need for it.

yarrel · on May 14, 2019

I'm an art student. Prefix notation is a non-issue. If people want an infix Lisp, Dylan is awesome and doesn't add yet another flow-breaking set of different parantheses to the syntax.

linguae · on May 14, 2019

I'm not 100% sure about this. A counterexample is the popularity of RPN. HP's RPN calculators with postfix notation are well-loved among some engineers, scientists, and accountants, precisely because of the benefits of RPN over infix notation. I have a HP-48SX calculator, and I regularly use the `dc` command on Unix machines and in WSL whenever I need access to a calculator while I'm on a computer. Of course, I am proficient in prefix, infix, and postfix notation, but I like postfix notation the most.

Of course, though, the marketshare of TI's infix calculators is much higher than HP's RPN calculators, and I haven't heard of much recent developments regarding HP's calculator line (the last I heard was HP had a limited-edition re-release of the famous HP 15c model sometime back in 2011, and also earlier this decade HP released a graphing calculator with a color display). But nevertheless there are people who prefer RPN to infix.

mcabbott · on May 14, 2019

The thing about RPN is that it worked well as a sort-of "spoken not written" language. It's a clear and usable way to give instructions one-at-a-time to a calculator with a 1-line display.

But written out on paper, it's very hard to see what's going on at a glance, which is why nobody does this on paper.

Almost anyone programming in mathematical expressions of reasonable complexity today will spend much more time reading (and triple-checking!) these expressions than typing them. That's a strong reason to like infix notation, and in general notation close to what you would use on paper.

darkpuma · on May 14, 2019

Those TI calculators are real cool, I cut my teeth as a programmer on a TI-83+ learning z80 assembly, but the truth is they're considered by many to be devices for teenagers. HP calculators seem far more popular in industry.

ken · on May 14, 2019

Have you seen numeric code (like SIMD intrinsics, BLAS, or LAPACK) in C or Fortran? It's all prefix function calls, anyway, even in those languages.

kyllo · on May 14, 2019

Right on, now we're only arguing about whether the open paren goes before or after the function name :)

lgessler · on May 14, 2019

What exactly do you mean by function composition? If you mean you can't have long chunks of purely imperative code in CL, I'm pretty sure that's not true: CL is multi-paradigm and supports imperative style. I think you might be referring to its S-expression syntax, which is prefix?

vnorilo · on May 14, 2019

I'm not saying you are wrong, but the oft-cited challenge of prefix is a meme. It's not really a big deal. It's trivially easy to add infix syntax to lisps via macros, but almost no one bothers. By the time you know enough to do it, you're also already used to prefix.

reikonomusha · on May 14, 2019

You can use CMU-INFIX: https://github.com/rigetti/cmu-infix