Self-modifying code was fairly common at the time COBOL was developed, but incorporating it in a high-level business language was, IMO, a very weird decision.
Your mentioning "ALTER" made me think about a dialect of BASIC that I once used (STARDOS BASIC on a "Micro V" computer, back in the late 1980s). This BASIC dialect had an "XEQ" function that would pass a string to the tokenizer and execute it as an immediate-mode command. If you started the string with a line number you could create self-modifying BASIC code. It was dizzying to my young mind.
You mention COBOL: I was taught COBOL with the JSP (Jackson) method and wrote lovely code. At my first job they used GOTO, and I was like argh, COBOL has a GOTO verb. Performance and the like: much less clean code, but efficient :\ And with that, the first steps from the academic world into the real world open your eyes to many weird programming ways and features.
But I always did like COBOL's OCCURS DEPENDING ON ( http://publib.boulder.ibm.com/infocenter/comphelp/v7v91/inde... ) and how you could use it to write variable-length records to files, saving space on tape or disk, which made things faster and cheaper.
It is, after all, efficiency and squeezing out that extra drop of performance that drive the weird and wonderful quirks, usages, and practices this topic aspires to highlight.
You can also use "unbecome" and effectively treat an actor's behavior as a stack, popping and pushing receive functions. One cool trick is to bind a value to your receive function:
def receive(foo: Foo): Receive = { ... }
Then you can bind immutable data to a given receive function, yet treat it as mutable by swapping in a new receive function with become().
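For readers outside the Akka world, the behavior-stack idea can be sketched in C++ as a hypothetical analogue (not Akka's API; all names here are invented): a stack of handler functions where become pushes and unbecome pops, with lambda capture playing the role of binding immutable data to a receive function.

```cpp
#include <cassert>
#include <functional>
#include <stack>
#include <string>

// Hypothetical C++ analogue of Akka's become/unbecome: the actor's
// behavior is a stack of receive functions; the top one handles messages.
struct Actor {
    using Receive = std::function<std::string(Actor&, const std::string&)>;
    std::stack<Receive> behaviors;

    void become(Receive r) { behaviors.push(std::move(r)); }
    void unbecome()        { if (behaviors.size() > 1) behaviors.pop(); }
    std::string receive(const std::string& msg) {
        return behaviors.top()(*this, msg);
    }
};

// "Binding a value to your receive function": capture it in the closure.
// The captured count is immutable; "mutation" is swapping behaviors.
Actor::Receive counting(int count) {
    return [count](Actor& self, const std::string& msg) -> std::string {
        if (msg == "inc")
            self.become(counting(count + 1));  // replace, don't mutate
        return std::to_string(count);
    };
}
```

Each "increment" installs a fresh closure over a fresh immutable value, which is exactly the trick described above.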
Smalltalk is also rather famous for giving developers enough freedom to more or less destroy the environment, with the terrifying-looking command "Smalltalk := nil"
This line in MUMPS is not only valid, it also actually does something potentially useful:
s:foo'="" foo(foo)=foo
It's actually a bit tricky to explain what it does. Everything in MUMPS is effectively a tree. Each tree has a value in the "root node" and you can set it like this:
set foo="hello"
You can also put data deeper into the tree:
set foo("fizz")="buzz"
So if something is passed to you and you want to know if you can treat it as a string, you test to see if it's a null string:
if foo'="" do [something]
('= means !=)
Two more features: 1) Almost every directive can be reduced to one letter. 2) If I want to quit based on a condition, I can do either
if condition quit
or
quit:condition
f p=2,3:2 s q=1 x "f f=3:2 q:f*f>p!'q s q=p#f" w:q p,?$x\8+18
(Not sure if this is going to mess up the formatting, so apologies in advance). This (shamelessly stolen from a usenet sig) will print out a table of primes with formatting. Adding a quit based on max prime size is left as exercise to the reader.
Explanation:
f = for, but basically used as a while loop since we don't have a terminate condition ('p' is used as the counter variable and we would normally have an upper limit, but since we don't it will keep iterating until something breaks)
We're setting the variable 'p' to 2 on first iteration of the loop, 3 on the next, and then going up by 2 on subsequent.
Then, we're setting the variable 'q' to 1.
'x' is short for XECUTE. It looks at the string following it and executes it as code.
"f f=3:2 q:f(asterisk)f>p!'q s q=p#f"
This is another loop, with the variable 'f' used as the counter, starting at 3 and incrementing by 2.
Each loop iteration, it checks if f^2 is greater than p or if q == 0. If not, it proceeds to set q to p modulo f.
To sum things up:
We quit the loop if either we've checked all candidates (excluding evens) up to ~sqrt(p) or if one of these evenly divides p (that is, modulo 0).
The tricky bit:
w:q p,?$x\8+18
w:q == WRITE, using the value of 'q' as a post-conditional, only to be executed for non-zero values of q.
If we write, we will write 'p', setting the position of the cursor with '?' (syntax particular to write) using the '$x' variable (built-in for the current cursor position).
Note: $x\8+18 evaluates to (($x INTEGER DIV 8) + 1) * 8, since MUMPS evaluates strictly left to right: the next column that is a multiple of 8.
First line:
GTM>f p=2,3:2 s q=1 x "f f=3:2 q:f*f>p!'q s q=p#f" w:q p,?$x\8+18
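For anyone who would rather not decode MUMPS, here is a rough C++ transcription of the same logic (with an upper bound added, since the original loops forever; the column arithmetic mirrors ?$x\8+18):

```cpp
#include <cassert>
#include <string>

// Rough C++ transcription of the MUMPS prime one-liner, with an upper
// bound added (the original runs forever). Returns the formatted table.
std::string prime_table(int limit) {
    std::string out;
    int x = 0;                               // cursor column, like MUMPS $x
    for (int p = 2; p <= limit; p = (p == 2) ? 3 : p + 2) {
        int q = 1;
        for (int f = 3; f * f <= p && q != 0; f += 2)
            q = p % f;                       // q hits 0 once a divisor is found
        if (q != 0) {                        // no divisor: p is prime, write it
            std::string s = std::to_string(p);
            out += s;
            x += (int)s.size();
            int target = (x / 8 + 1) * 8;    // ?$x\8+18: next multiple of 8
            while (x < target) { out += ' '; ++x; }
        }
    }
    return out;
}
```

Candidates run 2, 3, 5, 7, ...; the inner loop trial-divides by odd numbers up to sqrt(p), just as the explanation above describes.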
That's actually more readable than APL, and once you said "f = for" I automatically made the mental association "s = set" and "x = execute". It wasn't actually that hard to figure out the rest from that...
and something about that code reminds me of vi commands.
Someone's already mentioned COBOL's ALTER X TO PROCEED TO Y, so I'll mention a programming language called COMAL. A friend with a C64 showed it to me, and it was very respectable for its time and the available resources; reminded me of BASIC09.
Apparently it was written by someone from Denmark, and there were commands to switch back and forth between English and Danish. I don't remember whether it was just for error messages or for those and for the language keywords as well; I want to say the latter.
having "==" be semantically quite different to "<=" doesn't seem a particularly nice choice of notation.
arguably coffeescript has made things worse in this case compared to plain javascript, where you're probably not going to expect === to behave similarly to <=.
not obvious how you'd improve this to make it consistent, without say redefining how "==" and "<=" work to make them raise type errors or evaluate to something undefined if the types of the arguments don't match.
Certain ancient FORTRAN compilers would let you do that as well. If the target CPU didn't have a "move immediate" instruction, the compiler would just stash the number into memory. The quickest way to implement it is to just add the number into the symbol table... that way an often-used constant will only need one spot in core.
Of course when your whole compiler has to run in 8 Kwords or whatever, syntax checking isn't a high priority. So if you did "1234 = 0" it would be allowed (since '1234' is just an entry in the symbol table), and it would actually change the meaning of 1234 everywhere else in the program!
I haven't heard of that, and it would be great if you could remember where you found it. It sounds to me like it might be an apocryphal corruption of the following:
Fortran passes arguments by reference. If the compiler uses a constant pool, and (lazily or efficiently, depending on your point of view) passes the constant pool location as an argument, a subroutine could inadvertently modify a constant.
FORTRAN 66 (the first ANSI standard) and FORTRAN 77 forbid passing a constant or expression as an argument that will be modified†‡, thereby blessing this implementation. (Fortran 90 and later are tl;dr.)
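The mechanism can be mimicked in C++ (purely illustrative; the names are made up): if the "constant" lives in a writable pool slot and that slot's address is passed by reference, a callee can silently change the constant for every later use.

```cpp
#include <cassert>

// Illustrative sketch (not real Fortran): the compiler keeps one writable
// pool slot per literal and passes its address for by-reference calls.
static int pool_2 = 2;          // the compiler's single copy of the literal 2

void bump(int* x) { *x += 1; }  // subroutine that modifies its argument

int demo() {
    bump(&pool_2);              // like CALL BUMP(2): the pool slot is passed
    return pool_2 + pool_2;     // every later "2" now reads 3, so this is 3+3
}
```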
It's a war story from my father, so I have to admit that the details could be corrupted. He worked with a variety of machines in the '60s, so it was believable that one had a permissive enough FORTRAN to allow that. It's also possible that your explanation, reference-argument corruption, is what actually happened.
In Forth, the following are fairly common definitions in the core:
0 CONSTANT 0
1 CONSTANT 1
-1 CONSTANT -1
In an Algolian language like C++ or Java, if the syntax were legal, these would be:
const 0 = 0;
const 1 = 1;
const -1 = -1;
The reason for this is to do with the interpreter. How the Forth interpreter works is very simple:
1. Read one word, where "word" is literally defined as "a sequence of non-space characters delimited by spaces". Some standard words include DUP 2DROP 1+ - . " and :
2. Attempt to find the word in the dictionary. If found, execute it.
3. If it's not found, attempt to interpret it as a number. If that works, push the number on the stack.
4. If neither 2 nor 3 succeed, emit an error message.
So the reason for defining a constant named "0" is to save time: it's quicker to interpret the word "0" if it's in the dictionary; otherwise you have to search the whole dictionary and then do a text-to-number conversion, which takes too long.
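Those four steps are short enough to sketch in C++ (a toy, not a real Forth; the names and details are invented):

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Toy sketch of the Forth outer interpreter described above.
struct Forth {
    std::vector<long> stack;
    std::map<std::string, std::function<void(Forth&)>> dict;

    void interpret(const std::string& line) {
        std::istringstream in(line);
        std::string word;
        while (in >> word) {                      // 1. read one space-delimited word
            auto it = dict.find(word);
            if (it != dict.end()) {               // 2. in the dictionary: execute it
                it->second(*this);
                continue;
            }
            char* end = nullptr;
            long n = std::strtol(word.c_str(), &end, 10);
            if (*end == '\0')                     // 3. parses as a number: push it
                stack.push_back(n);
            else                                  // 4. neither: error
                throw std::runtime_error(word + " ?");
        }
    }
};

Forth make_forth() {             // a couple of primitives for demonstration
    Forth f;
    f.dict["+"]   = [](Forth& s) { long b = s.stack.back(); s.stack.pop_back();
                                   s.stack.back() += b; };
    f.dict["dup"] = [](Forth& s) { s.stack.push_back(s.stack.back()); };
    return f;
}
```

Defining a word named "0" moves that token from the slow path (step 3, after an exhaustive dictionary search) to the fast path (step 2), which is exactly the optimization described above.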
This also depends on the flavor of FORTH you're using. The one I wrote in high school (oh so many years ago) compiled directly to executable code, with constants compiled to direct pushes, so "redefining" a constant this way would run more slowly.
While I absolutely hate FORTH as a production language (I've seen millions of dollars flushed down the tubes by FORTH aficionados who were unwilling to admit that their code was unmaintainable, slow, and didn't work), it's fun to play around with. Everybody should write at least one FORTH in their career.
It is about space, not time. What you describe happens interactively, and at that point performance does not matter much; after compiling, it does.
At compilation time (the word : is one of the ways to switch the system to compilation mode), Forth doesn't execute the words and numbers it finds; it adds their addresses (in some form, depending on the implementation) to the generated code. When running the code later, the runtime simply fetches each of these function addresses and 'calls' them. That's what makes Forth systems fast without any convoluted compiler technology. Depending on the implementation, that 'call' may be an actual call in the CPU, but typically it is just another 'grab the addresses found there in sequence and execute them'. And yes, it can't be turtles all the way down, but Forth gets awfully close.
Now, the question is: how do you compile "push this constant on the stack"? Forth does it by compiling the word called LITERAL, followed by the constant.
So, compiling a function called 0 or 1 adds one function address to the compiler output, but compiling a literal constant adds the address of the function called LITERAL plus the constant itself. For constants used more than a few times (where the exact limit depends on the particular Forth implementation), the extra space needed for the defining function is more than compensated by the gain. In typical Forth systems, -1, 0, 1, and 2 are already space savers before the user types his first character; that's why they are predefined. If you use another constant often, you can easily define it. Many Forth systems even have a function called CONSTANT for this, but you can define it yourself if it is absent.
For those wondering how the system knows that the constant it compiled isn't the address of a function to call: it doesn't. Instead, the function called LITERAL, when called, hooks into the runtime: it uses the 'current instruction pointer' to read the value to push on the stack, and then increases that pointer to point past the constant. When LITERAL returns, the runtime just reads what it thinks is the next pointer and calls it.
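A minimal sketch of that mechanism in C++ (hand-rolled threaded code; it leans on casting function pointers into cells, which is implementation-defined but works on common 64-bit platforms):

```cpp
#include <cassert>
#include <vector>

// Minimal threaded-code sketch: compiled output is an array of cells,
// each holding a primitive's address -- except the cell right after LIT,
// which is inline data that LIT itself consumes.
using Cell = unsigned long long;

struct VM {
    std::vector<long long> stack;
    const Cell* ip = nullptr;    // the "current instruction pointer"
};
using Prim = void (*)(VM&);

void lit(VM& vm) {               // push the next cell, then skip past it
    vm.stack.push_back((long long)*vm.ip++);
}
void add(VM& vm) {
    long long b = vm.stack.back(); vm.stack.pop_back();
    vm.stack.back() += b;
}

void run(VM& vm, const std::vector<Cell>& code) {
    vm.ip = code.data();
    const Cell* end = code.data() + code.size();
    while (vm.ip < end) {        // the runtime just calls cell after cell
        Prim p = (Prim)*vm.ip++;
        p(vm);                   // lit advances ip further, skipping its data
    }
}
```

Compiling "2 3 +" would emit {lit, 2, lit, 3, add}; the runtime never has to distinguish data cells from code cells because lit steps the instruction pointer past its own operand, exactly as described above.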
Talking about constants: the OpenSCAD format (for 3D design) allows defining and redefining constants, but they always have the last-defined values...
That is, unless you import a module amid the redeclarations, in which case the module inherits the values of the last declarations above the line it's imported on (not the last ones in the file). And yep, that'll replace any declarations within that module.
INRAC, the language RACTER (http://en.wikipedia.org/wiki/Racter) was written in, has the most bizarre flow control I've ever seen in any computer language, ever. It can best be described as "a random, pattern-matched GOTO/GOSUB", which is the most succinct description I can come up with.
I have a blog entry about it (http://boston.conman.org/2008/06/18.2), but in short: each line of an INRAC program is a subroutine with a label (unless it ends with a '#' mark, in which case execution continues to the next line). The label does not need to be unique; when you "call" a subroutine, INRAC just picks one line with that label at random to execute. The pattern matching comes in because you can select the label with wildcard characters (which just picks, at random, a line whose label matches the pattern).
There isn't much about the language on the Internet. In fact, the only other page aside from my blog entry (which I wrote as I went through the existing source code I found for Racter) is the Racter FAQ (https://groups.google.com/forum/#!topic/rec.arts.int-fiction...) which has a few inaccuracies (or perhaps was looking at a version of the code before processing).
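The dispatch rule can be sketched in C++ (an invented representation, with only a trailing-'*' wildcard to keep it short): collect every line whose label matches the pattern, then pick one at random.

```cpp
#include <cassert>
#include <random>
#include <string>
#include <vector>

// Invented sketch of INRAC-style dispatch: labels may repeat, the call
// pattern may contain a wildcard, and the callee is chosen at random
// among all matching lines.
struct Line { std::string label, body; };

// Crude wildcard: a trailing '*' matches any suffix (illustration only).
bool matches(const std::string& pat, const std::string& label) {
    auto star = pat.find('*');
    if (star == std::string::npos) return pat == label;
    return label.size() >= star && label.compare(0, star, pat, 0, star) == 0;
}

const Line* call(const std::vector<Line>& prog, const std::string& pat,
                 std::mt19937& rng) {
    std::vector<const Line*> candidates;
    for (const auto& l : prog)
        if (matches(pat, l.label)) candidates.push_back(&l);
    if (candidates.empty()) return nullptr;
    std::uniform_int_distribution<std::size_t> d(0, candidates.size() - 1);
    return candidates[d(rng)];
}
```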
C++ has all sorts of fun stuff. One example is std::vector<bool>, which is not implemented as a simple array of booleans like every other std::vector, but instead it is a bitmap. Afaik this behavior is required by standard. One fun side-effect is that you can't take the address of individual elements of the vector.
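A quick demonstration (the commented-out line is the one that fails to compile, because operator[] returns a proxy object rather than a bool&):

```cpp
#include <cassert>
#include <vector>

// std::vector<bool> is a bit-packed specialization: operator[] returns
// a proxy object, not a bool&, so you cannot take an element's address.
int demo() {
    std::vector<bool> bits(100, false);
    // bool* p = &bits[0];        // does not compile: &proxy is not a bool*
    bits[3] = true;               // writes through the proxy
    int count = 0;
    for (bool b : bits)           // reads convert the proxy back to bool
        if (b) ++count;
    assert(bits[3] == true);
    return count;
}
```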
Snobol... where patterns are first class constructs, and every line can end with a GOTO. That definitely lets you write some very, er, compact code. But it was superb to program in.
And how about the assigned GOTO in older versions of Fortran? "GOTO N" where N is an integer variable whose value will be known at runtime.
No, it's the other way around; i, j, ... have a long history in mathematics as indices for matrices, summations, etc., with m, n likewise being traditional for the dimensions of a matrix.
In early FORTRAN, integers were present primarily to be used as array subscripts. The INteger mnemonic doesn't appear in any of the early papers or manuals.
As a side note, on the topic of features that would currently seem ‘strange’, some early languages that were intended to be programmed using teletypewriters rather than FORTRAN's Hollerith cards used half-line motions to write array subscripts as actual subscripts.
The COBOL MOVE CORRESPONDING verb. I'm not going to try to explain exactly how it works here, but imagine, if you will, copying the contents of one table to another table with a different column structure, but sort of the same names.
This is one of the COBOL features that makes a lot more sense if you think of COBOL as SQL for tape drives than as something that is trying to be, say, Pascal or Ada and failing. (Along with, say, pretty much the entire DATA DIVISION...)
C switch fallthrough. I know it's old and everybody's used to it and I get why it's there, but it's freaking weird. I love c#'s response that you must use goto <case> instead to make fallthrough explicit.
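The classic demonstration (my own toy example, in C-style C++): without a break, control simply falls into the next case's code.

```cpp
#include <cassert>

// Classic fallthrough: case 5 has no break, so its code runs and then
// continues into case 6's code.
int count_weekend_ish(int day) {      // 0 = Mon ... 5 = Sat, 6 = Sun
    int score = 0;
    switch (day) {
        case 5:                       // Saturday...
            score += 1;               // ...falls through into Sunday's case
        case 6:
            score += 1;
            break;
        default:
            break;
    }
    return score;
}
```

C#'s rule would force the first case to end in an explicit `goto case 6;` (or a break), making the fallthrough visible in the source.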
in vb.net: off-by-one array sizes so people who do one-based arrays don't get hurt. Also, two sets of boolean operators, the simpler ones don't short-circuit (or vs orelse, and vs andalso).
It kinda made sense, because there wasn't anything fancy like Notepad. If you wanted to read some section, you had to specify the range. If you wanted to change a line, you overwrote it by using the same line number. If you wanted to insert a line, you had to pick a line number which lies between those other two line numbers. That's why you used an increment of 10 (or whatever) instead of 1.
It logically follows from how arrays in C work, so I don't really know if it qualifies as weird/surprising, but the old array[i] / i[array] thing is fun to show to people who haven't seen it before.
By definition, the subscript operator [] is interpreted in such a way that ‘‘E1[E2]’’ is identical to ‘‘*((E1) + (E2))’’. Because of the conversion rules which apply to +, if E1 is an array and E2 an integer, then E1[E2] refers to the E2th member of E1. Therefore, despite its asymmetric appearance, subscripting is a commutative operation.¹
Now obviously, C arrays are just strips of memory allocated to the array, so you can access an element using the address of the start of the array plus an offset (the number of array items to skip). You add the address and the offset, and you have the address of the item.
And now, a[b] is just shorthand for the pointer arithmetic going on. (You can't use two plain ints, though: the compiler requires one operand to be a pointer or array and the other an integer, so arbitrary int[int] accesses won't compile.)
I think the trick is that array[i] and i[array] will always be the same value in C.
It's confusing because array[i] makes sense but i[array] doesn't (how the hell can you use an array as an index on an integer?)
It works because array is a pointer to a memory address and i (or index) is an offset. array[i] adds the index to the memory address and returns the value there, whereas i[array] adds the memory address to the index. Since array + index == index + array, they point to the same memory and return the same value.
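Seeing it run makes it click (plain C semantics, equally valid in C++ for raw arrays):

```cpp
#include <cassert>

// E1[E2] is defined as *((E1) + (E2)), and + commutes, so array[i]
// and i[array] name the exact same element.
int demo() {
    int array[] = {10, 20, 30, 40};
    int i = 2;
    assert(array[i] == i[array]);        // both are *(array + 2)
    assert(&array[i] == &i[array]);      // literally the same address
    return 3[array];                     // same as array[3]
}
```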
I also miss call by name (http://en.wikipedia.org/wiki/Man_or_boy_test) on the first page.
In general, one should read the Intercal (http://en.wikipedia.org/wiki/INTERCAL) paper, and figure out from which language each of its features came.