50 Years of APL Datatypes (2016) [pdf]

dang on March 3, 2017 | [–]

Speaking of APL, the HN user behind the following posts has agreed to do an AMA to explain his fascinating work on an APL compiler+runtime for GPUs, all in 750 lines of code:

https://news.ycombinator.com/item?id=13565743

https://news.ycombinator.com/item?id=13638086

We'll probably do that this weekend. (The intention was for 13638086 to be that discussion, but we dropped the ball, so we're going to do a retake.)

Edit: planned for Sunday at 11am PST!

rebootthesystem on March 3, 2017 | | [–]

As someone who used APL professionally for over ten years, I find it interesting to see it come up with some frequency on HN.

Not sure what to make of it other than thinking that perhaps it is different enough that it still generates interest from time to time but few are using it professionally.

There are still places where APL is actively used. This is mostly in the financial services arena. I think it's sort of like companies that use massive Excel files for internal "secret sauce" and they just can't pull away from it.

Not saying APL has run its course, but after using it for so long (and I still do, just to play around), it was easy to see that the language needed yet another revision to remain useful and viable for widespread real-world projects.

If you can believe it, I still have a 1980s-era original IBM PC in the garage with a real IBM APL keyboard (they used to sell keyboards with the APL character set on the front of the keycap) and both IBM and STSC interpreters installed and running. One of these days I need to see about creating a virtual machine out of those drives (20 MB hard disks!).

gregfjohnson on March 3, 2017 | | [–]

Here is an interesting bit of APL history: Brian Kernighan wrote an entire APL interpreter in the early days of Unix. It was further developed by a number of others, and remained in use for decades. I have used it on Linux and Unix boxes on many projects large and small since grad school days. Recently I decided to see about trying to contribute to this cherished bit of community legacy. Working with the code is like pair programming across time with giants and geniuses! I am engaging in "TDP" programming, a facetious phrase we use with management at work: Temporally Decoupled Pair Programming...

co_dh on March 3, 2017 | | [–]

Can you provide a link to the source code of Brian Kernighan's APL? I can't find it on Google. Thanks.

gregfjohnson on March 4, 2017 | | | [–]

Yikes. I got a mental wire-crossing, and misattributed the origin of APL\11. It was Ken Thompson who wrote the version I was thinking of. Here is a link to the early history of the project:

http://sigapl.org/Archives/waterloo_archive/apl/apl-11/QL_Si...

A recent version that has gone through many iterations can be found at

https://github.com/PlanetAPL/openAPL

If you google "Ken Thompson APL\11" you will find several links, including ancient archived versions of the code.

lokedhs on March 3, 2017 | | [–]

If anyone is interested in playing with APL, there is a free implementation in GNU APL[1] that works pretty well.

I'm doing a bit of shameless advertising here, since I'm the author of gnu-apl-mode[2] which adds support for GNU APL to Emacs.

If anyone has questions about this, I'll be happy to answer them here.

[1] https://www.gnu.org/software/apl/

[2] Available on MELPA, but you can also find the source here: https://github.com/lokedhs/gnu-apl-mode

summarity on March 3, 2017 | | [–]

You can play with APL (Dyalog or ngn) online here: https://tio.run/nexus

throwaway7645 on March 3, 2017 | | | [–]

Awesome and thanks! Do you actively use GNU APL? Is that implementation still maintained?

lokedhs on March 3, 2017 | | | [–]

It's being actively developed. I think the most recent official release was a few months ago. There isn't much special about official releases though, so I'd recommend you always use the most recent one from source.

In fact, I do believe GNU APL is the youngest APL implementation out there, having had its initial release just a few years ago.

I personally use it for working with data sets, such as performance data from, say, vmstat. I also use it for casual calculations. It's very good for this: since the syntax is so terse, I can do a lot of computation with very little code.

And of course, I use it because it gives me some degree of satisfaction. For example I did one of the tasks in last year's Google Code Jam in APL.

throwaway7645 on March 3, 2017 | | | [–]

Awesome and thanks! I'm trying to determine if I have a use case. Not sure if the fairly large CSV files I typically work with are too big as input vectors for APL.

lokedhs on March 3, 2017 | | | [–]

How big are they? GNU APL isn't particularly tuned for this use case, but arrays of a few hundred thousand rows are not a problem.

I have had performance problems at several million rows, but it all depends on what you do.

throwaway7645 on March 4, 2017 | | | [–]

Usually not in the millions, but occasionally they are. I've seen a few videos for Dyalog-APL & J, but nothing on GNU-APL. Even a blog post would be welcome on setup with emacs/vim...etc.

lokedhs on March 4, 2017 | | | [–]

Some time ago I created a video showing some of the features of the Emacs mode. I really should make a new one, since there has been some more development since then, and some features could have been shown better.

Hopefully it can still be helpful: https://www.youtube.com/watch?v=yP4A5CKITnM

throwaway7645 on March 4, 2017 | | | [–]

Thank you! I'll watch it in full tonight. I'm not sure if you mention it or not, but are you using a special APL keyboard? Curious how you enter the characters.

lokedhs on March 5, 2017 | | | [–]

The Emacs mode provides two different methods to input APL symbols. It allows you to set a prefix key (like Super, for example) that will map to the symbols, or you can use the APL input method, which uses "." (period) as a prefix when typing them.

When using it, you can type C-c C-k, which will open a quick help that shows the keymap.

gtani on March 3, 2017 | | [–]

If anyone really wants to learn, I think one of the best resources is still the Polivka/Pakin "APL Language and Its Usage" (Prentice-Hall), but in the 2nd edition, the paperback with a red cover, which I haven't ever seen for sale on Amazon or Half Price Books. They have the grey 1st ed for pretty cheap now.

2nd ed covered the differences between STSC, APL2 and Sharp, which were the main implementations that people used to encounter. Later they issued an edition with a 3rd author, Brown, that used APL2 exclusively.

muraiki on March 4, 2017 | | [–]

The creator of APL also made an open source language called J that uses ASCII characters (although there are some other important differences between APL and J). J has a pretty nice IDE which can do things like break down code into its constituent parts, which is also used for stepping through code as a debugger. That has really helped me to understand J code better.

For learning the language, I've found the freely available "J Tutorial and Statistical Package"[0] to be a great approach. You learn J by building up a package of basic stats functions. The author also has a variety of other interesting J papers.[1]

[0] https://webdocs.cs.ualberta.ca/~smillie/Jpage/jtsp.pdf

[1] https://webdocs.cs.ualberta.ca/~smillie/Jpage/Jpage.html

dang on March 3, 2017 | [–]

Can anybody explain what this means?

Nested Arrays weren’t without controversy as there were two competing designs called Floating and Grounded which differed in many ways, but fundamentally on whether the enclose of a simple scalar produced a new array or the same array.

What's 'enclosing' (is it something like boxing?) and what does it mean to produce a new or the same array?

ktRolster on March 3, 2017 | | [–]

In APL, everything is an array, and everything is designed to operate on an array. So

    ⎕ ← 1 + 2

will yield 3, and

    ⎕ ← 1 9 5 + 2

will yield 3 11 7. Of course, you can have complex arrays as well: (2 3) (5 4 2) in APL will be an array with two elements: the first element having two sub-elements, and the second having three sub-elements. The question then, is what about (2)? Should that be an array inside an array? Or just an array? That is my understanding of that paragraph.

I don't think the question of "which is best" here can be answered without a lot of practical experience trying both methods. If you like APL, I think this is a good tutorial: http://www.zerobugsandprogramfaster.net/essays/5b.html

lokedhs on March 3, 2017 | | | [–]

In the original APL, there were only arrays, and inside each cell in an array you could put scalars. A scalar was simply a number or a character. A string is a single-dimensional array of characters, which is quite logical.

Now, with this design, how do you manage a collection of strings? Turns out you can't, so what old software used to do was to have a two-dimensional array of characters. This works reasonably well, except that it means all strings have to be the same length. You can work with this, but it can be very annoying.

What APL2 introduced was the idea that you can not only put scalars inside an array, but also other arrays. This is of course how pretty much every other programming language works, but at the time (as far as I know, this is way before my time) this was seen as very controversial since it made the language less clean, or "elegant".

To go into some details, you can consider a scalar as an array of dimension zero:

        ⍴⍴5
    0

(⍴⍴ returns the dimension of the argument to the right[1])

Once you have nested arrays, you can also wrap an array inside a scalar. This allows you to more easily treat arrays as single objects:

        s ← 'hello'    ⍝ Assign the string "hello" to s
        ⍴⍴s            ⍝ Confirm that s is a one-dimensional array
    1
        s2 ← ⊂s        ⍝ The ⊂ wraps its argument as a scalar, assign the result to s2
        ⍴⍴s2           ⍝ The result now has 0 dimensions
    0

[1] To go really deep into the technical details, it returns the dimensions of the dimensions of the argument. A scalar has no dimensions, so ⍴5 returns an empty array. The second ⍴ returns zero because an empty array has zero elements.
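The "dimensions of the dimensions" idea can be modelled outside APL. The following is only a rough Python sketch, not how any APL interpreter works: it assumes arrays are represented as rectangular nested lists and scalars as bare numbers, and the names `shape` and `rank` are made up for the illustration.

```python
# Toy model (an assumption for illustration): an APL array is a rectangular
# nested Python list, and a scalar is a bare number.

def shape(a):
    """Return the list of dimension lengths, loosely like APL's monadic ⍴."""
    dims = []
    while isinstance(a, list):
        dims.append(len(a))
        # Descend into the first element; stop if the list is empty.
        a = a[0] if a else None
    return dims

def rank(a):
    """Number of dimensions, i.e. the length of the shape, loosely like ⍴⍴."""
    return len(shape(a))

scalar_shape = shape(5)               # a scalar has no dimensions: empty list
scalar_rank = rank(5)                 # so its rank is 0
string_rank = rank(list("hello"))     # a string modelled as a list of characters
matrix_shape = shape([[1, 2, 3], [4, 5, 6]])
```

In this model the scalar's shape is an empty list, which is why taking the shape a second time gives zero.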

beagle3 on March 3, 2017 | | | [–]

The K programming language (as implemented in kparc, kdb+, kona and oK), a cousin of APL, does away with explicit matrices, and has 3 basic types: atom=scalar, list=vector and dict=mapping. A 2d matrix is implemented as a vector of (vectors of all the same dimensions); a 3d matrix is implemented as a vector of (matrices all of the same dimensions).

This greatly simplified the entire language - both semantics and implementation.

gtani on March 3, 2017 | | | | [–]

Thanks. My APL2 used to be littered with a lot of "eaches" which generated a lot of complaints in code review (raised, "sideways colon"), but seemed a natural way to express things.

Is this a divergence between APL2 and Sharp implementation? I used to work (a long time ago) with a lot of ex-Sharp people from Toronto who were always harping about the superiority of Sharp.

lokedhs on March 3, 2017 | | | [–]

I'm not sure. I don't actually know that much about APL history. I've really only used it for a few years. If you want a lot more information about that, I recommend you join the GNU APL mailing list, as some of the members there know a lot about the history: https://lists.gnu.org/mailman/listinfo/bug-apl

As far as I know, Sharp APL was "old style" in the sense that it did not support nested arrays. Some people may like that style, but I have a hard time seeing that it is "better", since the nested arrays are backward compatible with the non-nested style.

Note that you can also use the ⊂ and ⊃ functions to convert between nested and non-nested (rectangular) arrays. For example, assuming you have an array of strings:

          a ← 'here' 'are' 'some' 'strings'

          ⍝ a is now a single-dimensional array of 4 elements
          ⍴a
    4

          ⍝ The ⊃ function converts the nested array into a two-dimensional array
          ⊃a
    here   
    are    
    some   
    strings

          ⍝ The array is two-dimensional with 4 rows and 7 columns
          ⍴ ⊃a
    4 7
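For readers without an APL at hand, the padding that turns the nested array into a 4 by 7 character matrix can be approximated in Python. This is a sketch under assumptions: the helper name `pad_to_matrix` is hypothetical, rows are modelled as plain strings, and it only covers the case of a flat list of strings.

```python
def pad_to_matrix(strings, fill=" "):
    """Pad every string on the right so that all rows have the same width."""
    # The widest string decides the number of columns.
    width = max((len(s) for s in strings), default=0)
    return [s.ljust(width, fill) for s in strings]

rows = pad_to_matrix(["here", "are", "some", "strings"])
n_rows = len(rows)         # number of rows in the padded matrix
n_cols = len(rows[0])      # number of columns (width of the longest string)
```

The trailing spaces on the shorter rows are the same padding that pre-nested-array APL code had to manage by hand.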

throwaway7645 on March 3, 2017 | | | | [–]

There is also an interactive try APL website by the folks at Dyalog I believe. At least they link to it. Dyalog also has a free Mastering Dyalog APL book that although implementation specific is general enough. J is a similar free & open source APL-like language with several tutorials.

ktRolster on March 3, 2017 | | | [–]

J is cool but it's really fun to use all the unusual characters. GNU APL is free and open source: https://www.gnu.org/software/apl/

pinewurst on March 3, 2017 | | | | [–]

http://tryapl.org

Way cool to play with...

Pamar on March 3, 2017 | | | [–]

Far from an expert on APL, I googled for APL Grounded array and found this: http://webcache.googleusercontent.com/search?q=cache:h7FiYcA... - hope it helps.

(From what I could understand with floating you are actually adding a scalar element to the array, with grounded you are creating a 1-element array containing the scalar, and adding this to the original array, which is an array of arrays - but I might have completely misunderstood)

dang on March 3, 2017 | | | [–]

That link does clear things up—thanks!

In the early 1980’s, a number of APL vendors almost simultaneously introduced a new concept into APL - the nested array. Each element of a nested array can itself be any other (nested) array. A new monadic primitive function, called enclose, was introduced that took any array as its argument and returned a scalar enclosed array. This scalar (rank zero) array could then be inserted in place of any scalar element of any APL array.

Unfortunately there were two inequivalent approaches; grounded arrays as proposed by Ken Iverson [...] and floating arrays as proposed by Jim Brown [...] In the grounded system, enclosing a simple scalar produces an enclosed scalar, whereas in the floating system, enclosing a simple scalar leaves it unchanged.

So enclose(x) is always a scalar, but the two schemes disagree about whether enclose(s) == s for a scalar s.
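To make the disagreement concrete, here is a small Python model of the two schemes. It is an illustrative sketch only: the `Enclosed` class and both function names are invented for this example, arrays are modelled as Python lists, and simple scalars as numbers or single characters.

```python
class Enclosed:
    """A rank-0 box holding one value (hypothetical stand-in for an enclosed scalar)."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        # A box is only equal to another box with an equal payload.
        return isinstance(other, Enclosed) and self.value == other.value

    def __repr__(self):
        return f"Enclosed({self.value!r})"

def is_simple_scalar(x):
    """Numbers and single characters count as simple scalars in this model."""
    if isinstance(x, str):
        return len(x) == 1
    return isinstance(x, (int, float))

def enclose_grounded(x):
    """Grounded scheme: always box, even a simple scalar."""
    return Enclosed(x)

def enclose_floating(x):
    """Floating scheme: a simple scalar is returned unchanged, anything else is boxed."""
    return x if is_simple_scalar(x) else Enclosed(x)
```

In this model `enclose_floating(5) == 5` holds while `enclose_grounded(5) == 5` does not, and the two functions agree on any list argument, which matches the description quoted above.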

leephillips on March 3, 2017 | | | [–]

This is amusing to read, because I learned APL in 1976 when I entered college. It was my first programming language and my only one for a long time. I remember that the most frustrating thing about it was the lack of nested arrays, especially when dealing with arrays of strings. You had to put them into a rectangular array, so they all had to be the same length, which meant you had to pad them.

I was horrified when I learned FORTRAN: I have to write a do-loop to add two arrays together? Why? (No array fortran back then.)

pklausler on March 3, 2017 | | | [–]

Boxing is not a bad analogy. Both "enclose" schemes would do the same thing with arguments of rank > 0, but differed for arguments of rank == 0; one would box nevertheless, the other wouldn't, and I can't remember which was which.

dang on March 3, 2017 | | | [–]

Thanks, that's clear. From the link Pamar found, the floating scheme is the one that doesn't box the scalar. Sort of makes sense because then the scalars just 'float' around.

mrkgnao on March 3, 2017 | | [–]

"Does [[1]] have type [Int] or type [[Int]]?", I'm guessing.