Self-modifying code was fairly common at the time COBOL was developed, but incorporating it in a high-level business language was, IMO, a very weird decision.
Your mentioning "ALTER" made me think about a dialect of BASIC that I once used (STARDOS BASIC on a "Micro V" computer, back in the late 1980s). This BASIC dialect had an "XEQ" function that would pass a string to the tokenizer and execute it as an immediate-mode command. If you started the string with a line number you could create self-modifying BASIC code. It was dizzying to my young mind.
You mention COBOL: I was taught COBOL with the JSP (Jackson) method and wrote lovely code. At my first job they used GOTO, and I was like argh, COBOL has a GOTO verb. Performance and the like: much less clean code, but efficient :\ And with that, the first steps from the academic world into the real world open your eyes to many weird programming ways and features.
But I always did like COBOL's OCCURS DEPENDING ON ( http://publib.boulder.ibm.com/infocenter/comphelp/v7v91/inde... ) and how you could use it to write variable-length records to files, saving space on tape or disk, which made things faster and cheaper.
It is, after all, efficiency and squeezing out that extra drop of performance that drive the weird and wonderful quirks, usages, and practices this topic aspires to highlight.
You can also use "unbecome" and effectively treat an actor's behavior as a stack, popping and pushing receive functions. One cool trick is to bind a value to your receive function:
def receive(foo: Foo): Receive = { ... }
Then you can bind immutable data to a given receive function, yet treat it as mutable by swapping in a new receive function with become().
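For readers outside the Akka world, the behavior-stack idea can be sketched in C++ as a hypothetical analogue (not Akka's API; all names here are invented): a stack of handler functions where become pushes and unbecome pops, with lambda capture playing the role of binding immutable data to a receive function.

```cpp
#include <cassert>
#include <functional>
#include <stack>
#include <string>

// Hypothetical C++ analogue of Akka's become/unbecome: the actor's
// behavior is a stack of receive functions; the top one handles messages.
struct Actor {
    using Receive = std::function<std::string(Actor&, const std::string&)>;
    std::stack<Receive> behaviors;

    void become(Receive r) { behaviors.push(std::move(r)); }
    void unbecome()        { if (behaviors.size() > 1) behaviors.pop(); }
    std::string receive(const std::string& msg) {
        return behaviors.top()(*this, msg);
    }
};

// "Binding a value to your receive function": capture it in the closure.
// The captured count is immutable; "mutation" is swapping behaviors.
Actor::Receive counting(int count) {
    return [count](Actor& self, const std::string& msg) -> std::string {
        if (msg == "inc")
            self.become(counting(count + 1));  // replace, don't mutate
        return std::to_string(count);
    };
}
```

Each "increment" installs a fresh closure over a fresh immutable value, which is exactly the trick described above.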
Smalltalk is also rather famous for giving developers enough freedom to more or less destroy the environment, with the terrifying-looking command "Smalltalk := nil"
This line in MUMPS is not only valid, it also actually does something potentially useful:
s:foo'="" foo(foo)=foo
It's actually a bit tricky to explain what it does. Everything in MUMPS is effectively a tree. Each tree has a value in the "root node" and you can set it like this:
set foo="hello"
You can also put data deeper into the tree:
set foo("fizz")="buzz"
So if something is passed to you and you want to know if you can treat it as a string, you test to see if it's a null string:
if foo'="" do [something]
('= means !=)
Two more features: 1) Almost every directive can be reduced to one letter. 2) If I want to quit based on a condition, I can do either
if condition quit
or
quit:condition
f p=2,3:2 s q=1 x "f f=3:2 q:f*f>p!'q s q=p#f" w:q p,?$x\8+18
(Not sure if this is going to mess up the formatting, so apologies in advance). This (shamelessly stolen from a usenet sig) will print out a table of primes with formatting. Adding a quit based on max prime size is left as exercise to the reader.
Explanation:
f = for, but basically used as a while loop since we don't have a terminate condition ('p' is used as the counter variable and we would normally have an upper limit, but since we don't it will keep iterating until something breaks)
We're setting the variable 'p' to 2 on first iteration of the loop, 3 on the next, and then going up by 2 on subsequent.
Then, we're setting the variable 'q' to 1.
'x' is short for XECUTE. It looks at the string following it and executes it as code.
"f f=3:2 q:f(asterisk)f>p!'q s q=p#f"
This is another loop, with the variable 'f' used as the counter, starting at 3 and incrementing by 2.
Each loop iteration, it checks if f^2 is greater than p or if q == 0. If not, it proceeds to set q to p modulo f.
To sum things up:
We quit the loop if either we've checked all candidates (excluding evens) up to ~sqrt(p) or if one of these evenly divides p (that is, modulo 0).
The tricky bit:
w:q p,?$x\8+18
w:q == WRITE, using the value of 'q' as a post-conditional, only to be executed for non-zero values of q.
If we write, we will write 'p', setting the position of the cursor with '?' (syntax particular to write) using the '$x' variable (built-in for the current cursor position).
Note: $x\8+18 evaluates to (($x INTEGER DIV 8) + 1) * 8, since MUMPS evaluates strictly left to right: the next column that is a multiple of 8.
First line:
GTM>f p=2,3:2 s q=1 x "f f=3:2 q:f*f>p!'q s q=p#f" w:q p,?$x\8+18
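For anyone who would rather not decode MUMPS, here is a rough C++ transcription of the same logic (with an upper bound added, since the original loops forever; the column arithmetic mirrors ?$x\8+18):

```cpp
#include <cassert>
#include <string>

// Rough C++ transcription of the MUMPS prime one-liner, with an upper
// bound added (the original runs forever). Returns the formatted table.
std::string prime_table(int limit) {
    std::string out;
    int x = 0;                               // cursor column, like MUMPS $x
    for (int p = 2; p <= limit; p = (p == 2) ? 3 : p + 2) {
        int q = 1;
        for (int f = 3; f * f <= p && q != 0; f += 2)
            q = p % f;                       // q hits 0 once a divisor is found
        if (q != 0) {                        // no divisor: p is prime, write it
            std::string s = std::to_string(p);
            out += s;
            x += (int)s.size();
            int target = (x / 8 + 1) * 8;    // ?$x\8+18: next multiple of 8
            while (x < target) { out += ' '; ++x; }
        }
    }
    return out;
}
```

Candidates run 2, 3, 5, 7, ...; the inner loop trial-divides by odd numbers up to sqrt(p), just as the explanation above describes.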
That's actually more readable than APL, and once you said "f = for" I automatically made the mental association "s = set" and "x = execute". It wasn't actually that hard to figure out the rest from that...
and something about that code reminds me of vi commands.
Someone's already mentioned COBOL's ALTER X TO PROCEED TO Y, so I'll mention a programming language called COMAL. A friend with a C64 showed it to me, and it was very respectable for its time and the available resources; reminded me of BASIC09.
Apparently it was written by someone from Denmark, and there were commands to switch back and forth between English and Danish. I don't remember whether it was just for error messages or for those and for the language keywords as well; I want to say the latter.
having "==" be semantically quite different to "<=" doesn't seem a particularly nice choice of notation.
arguably coffeescript has made things worse in this case compared to plain javascript, where you're probably not going to expect === to behave similarly to <=.
not obvious how you'd improve this to make it consistent, without say redefining how "==" and "<=" work to make them raise type errors or evaluate to something undefined if the types of the arguments don't match.
Certain ancient FORTRAN compilers would let you do that as well. If the target CPU didn't have a "move immediate" instruction, the compiler would just stash the number into memory. The quickest way to implement it is to just add the number into the symbol table... that way an often-used constant will only need one spot in core.
Of course when your whole compiler has to run in 8 Kwords or whatever, syntax checking isn't a high priority. So if you did "1234 = 0" it would be allowed (since '1234' is just an entry in the symbol table), and it would actually change the meaning of 1234 everywhere else in the program!
I haven't heard of that, and it would be great if you could remember where you found it. It sounds to me like it might be an apocryphal corruption of the following:
Fortran passes arguments by reference. If the compiler uses a constant pool, and (lazily or efficiently, depending on your point of view) passes the constant pool location as an argument, a subroutine could inadvertently modify a constant.
FORTRAN 66 (the first ANSI standard) and FORTRAN 77 forbid passing a constant or expression as an argument that will be modified†‡, thereby blessing this implementation. (Fortran 90 and later are tl;dr.)
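The mechanism can be mimicked in C++ (purely illustrative; the names are made up): if the "constant" lives in a writable pool slot and that slot's address is passed by reference, a callee can silently change the constant for every later use.

```cpp
#include <cassert>

// Illustrative sketch (not real Fortran): the compiler keeps one writable
// pool slot per literal and passes its address for by-reference calls.
static int pool_2 = 2;          // the compiler's single copy of the literal 2

void bump(int* x) { *x += 1; }  // subroutine that modifies its argument

int demo() {
    bump(&pool_2);              // like CALL BUMP(2): the pool slot is passed
    return pool_2 + pool_2;     // every later "2" now reads 3, so this is 3+3
}
```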
It's a war story from my father, so I have to admit that the details could be corrupted. He worked with a variety of machines in the '60s, so it was believable that one had a permissive enough FORTRAN to allow that. It's also possible that your explanation, reference-argument corruption, is what actually happened.
In Forth, the following are fairly common definitions in the core:
0 CONSTANT 0
1 CONSTANT 1
-1 CONSTANT -1
In an Algolian language like C++ or Java, if the syntax were legal, these would be:
const 0 = 0;
const 1 = 1;
const -1 = -1;
The reason for this is to do with the interpreter. How the Forth interpreter works is very simple:
1. Read one word, where "word" is literally defined as "a sequence of non-space characters delimited by spaces". Some standard words include DUP 2DROP 1+ - . " and :
2. Attempt to find the word in the dictionary. If found, execute it.
3. If it's not found, attempt to interpret it as a number. If that works, push the number on the stack.
4. If neither 2 nor 3 succeed, emit an error message.
So the reason for defining a constant named "0" is to save time: it's quicker to interpret the word "0" if it's in the dictionary; otherwise you have to search the whole dictionary and then do a text-to-number conversion, which takes too long.
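Those four steps are short enough to sketch in C++ (a toy, not a real Forth; the names and details are invented):

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Toy sketch of the Forth outer interpreter described above.
struct Forth {
    std::vector<long> stack;
    std::map<std::string, std::function<void(Forth&)>> dict;

    void interpret(const std::string& line) {
        std::istringstream in(line);
        std::string word;
        while (in >> word) {                      // 1. read one space-delimited word
            auto it = dict.find(word);
            if (it != dict.end()) {               // 2. in the dictionary: execute it
                it->second(*this);
                continue;
            }
            char* end = nullptr;
            long n = std::strtol(word.c_str(), &end, 10);
            if (*end == '\0')                     // 3. parses as a number: push it
                stack.push_back(n);
            else                                  // 4. neither: error
                throw std::runtime_error(word + " ?");
        }
    }
};

Forth make_forth() {             // a couple of primitives for demonstration
    Forth f;
    f.dict["+"]   = [](Forth& s) { long b = s.stack.back(); s.stack.pop_back();
                                   s.stack.back() += b; };
    f.dict["dup"] = [](Forth& s) { s.stack.push_back(s.stack.back()); };
    return f;
}
```

Defining a word named "0" moves that token from the slow path (step 3, after an exhaustive dictionary search) to the fast path (step 2), which is exactly the optimization described above.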
This also depends on the flavor of FORTH you're using. The one I wrote in high school (oh so many years ago) compiled directly to executable code, with constants compiled to direct pushes, so "redefining" a constant this way would run more slowly.
While I absolutely hate FORTH as a production language (I've seen millions of dollars flushed down the tubes by FORTH aficionados who were unwilling to admit that their code was unmaintainable, slow, and didn't work), it's fun to play around with. Everybody should write at least one FORTH in their career.
It is about space, not time. What you describe happens interactively, and at that point performance does not matter much; after compiling, it does.
At compilation time (the word : is one of the ways to switch the system to compilation mode), Forth doesn't execute the words and numbers it finds; it adds their addresses (in some form, depending on the implementation) to the generated code. When running the code later, the runtime simply fetches each of these function addresses and 'calls' them. That's what makes Forth systems fast without any convoluted compiler technology. Depending on the implementation, that 'call' may be an actual call in the CPU, but typically it is just another 'grab the addresses found there in sequence and execute them'. And yes, it can't be turtles all the way down, but Forth gets awfully close.
Now, the question is: how do you compile "push this constant on the stack"? Forth does it by compiling the word called LITERAL, followed by the constant.
So, compiling a function called 0 or 1 adds one function address to the compiler output, but compiling a literal constant adds the address of the function called LITERAL plus the constant itself. For constants used more than a few times (where the exact limit depends on the particular Forth implementation), the extra space needed for the defining function is more than compensated by the gain. In typical Forth systems, -1, 0, 1, and 2 are already space savers before the user types his first character; that's why they are predefined. If you use another constant often, you can easily define it. Many Forth systems even have a function called CONSTANT for this, but you can define it yourself if it is absent.
For those wondering how the system knows that the constant it compiled isn't the address of a function to call: it doesn't. Instead, the function called LITERAL, when called, hooks into the runtime: it uses the 'current instruction pointer' to read the value to push on the stack, and then increases that pointer to point past the constant. When LITERAL returns, the runtime just reads what it thinks is the next pointer and calls it.
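A minimal sketch of that mechanism in C++ (hand-rolled threaded code; it leans on casting function pointers into cells, which is implementation-defined but works on common 64-bit platforms):

```cpp
#include <cassert>
#include <vector>

// Minimal threaded-code sketch: compiled output is an array of cells,
// each holding a primitive's address -- except the cell right after LIT,
// which is inline data that LIT itself consumes.
using Cell = unsigned long long;

struct VM {
    std::vector<long long> stack;
    const Cell* ip = nullptr;    // the "current instruction pointer"
};
using Prim = void (*)(VM&);

void lit(VM& vm) {               // push the next cell, then skip past it
    vm.stack.push_back((long long)*vm.ip++);
}
void add(VM& vm) {
    long long b = vm.stack.back(); vm.stack.pop_back();
    vm.stack.back() += b;
}

void run(VM& vm, const std::vector<Cell>& code) {
    vm.ip = code.data();
    const Cell* end = code.data() + code.size();
    while (vm.ip < end) {        // the runtime just calls cell after cell
        Prim p = (Prim)*vm.ip++;
        p(vm);                   // lit advances ip further, skipping its data
    }
}
```

Compiling "2 3 +" would emit {lit, 2, lit, 3, add}; the runtime never has to distinguish data cells from code cells because lit steps the instruction pointer past its own operand, exactly as described above.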
Talking about constants: the OpenSCAD format (for 3D design) allows defining and redefining constants, but they always have the last-defined values...
That is, unless you import a module amid the redeclarations, in which case the module inherits the values of the last declarations above the line it's imported on (not the last ones in the file). And yep, that'll replace any declarations within that module.
INRAC, the language RACTER (http://en.wikipedia.org/wiki/Racter) was written in, has the most bizarre flow control I've ever seen in any computer language, ever. It can best be described as "a random, pattern-matched GOTO/GOSUB", which is the most succinct description I can come up with.
I have a blog entry about it (http://boston.conman.org/2008/06/18.2), but in short: each line of an INRAC program is a subroutine with a label (unless it ends with a '#' mark, in which case execution continues to the next line). The label does not need to be unique; when you "call" a subroutine, INRAC just picks one line with that label at random to execute. The pattern matching comes in because you can select the label with wildcard characters (which just picks, at random, a line whose label matches the pattern).
There isn't much about the language on the Internet. In fact, the only other page aside from my blog entry (which I wrote as I went through the existing source code I found for Racter) is the Racter FAQ (https://groups.google.com/forum/#!topic/rec.arts.int-fiction...) which has a few inaccuracies (or perhaps was looking at a version of the code before processing).
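The dispatch rule can be sketched in C++ (an invented representation, with only a trailing-'*' wildcard to keep it short): collect every line whose label matches the pattern, then pick one at random.

```cpp
#include <cassert>
#include <random>
#include <string>
#include <vector>

// Invented sketch of INRAC-style dispatch: labels may repeat, the call
// pattern may contain a wildcard, and the callee is chosen at random
// among all matching lines.
struct Line { std::string label, body; };

// Crude wildcard: a trailing '*' matches any suffix (illustration only).
bool matches(const std::string& pat, const std::string& label) {
    auto star = pat.find('*');
    if (star == std::string::npos) return pat == label;
    return label.size() >= star && label.compare(0, star, pat, 0, star) == 0;
}

const Line* call(const std::vector<Line>& prog, const std::string& pat,
                 std::mt19937& rng) {
    std::vector<const Line*> candidates;
    for (const auto& l : prog)
        if (matches(pat, l.label)) candidates.push_back(&l);
    if (candidates.empty()) return nullptr;
    std::uniform_int_distribution<std::size_t> d(0, candidates.size() - 1);
    return candidates[d(rng)];
}
```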
C++ has all sorts of fun stuff. One example is std::vector<bool>, which is not implemented as a simple array of booleans like every other std::vector, but instead it is a bitmap. Afaik this behavior is required by standard. One fun side-effect is that you can't take the address of individual elements of the vector.
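A quick demonstration (the commented-out line is the one that fails to compile, because operator[] returns a proxy object rather than a bool&):

```cpp
#include <cassert>
#include <vector>

// std::vector<bool> is a bit-packed specialization: operator[] returns
// a proxy object, not a bool&, so you cannot take an element's address.
int demo() {
    std::vector<bool> bits(100, false);
    // bool* p = &bits[0];        // does not compile: &proxy is not a bool*
    bits[3] = true;               // writes through the proxy
    int count = 0;
    for (bool b : bits)           // reads convert the proxy back to bool
        if (b) ++count;
    assert(bits[3] == true);
    return count;
}
```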
Snobol... where patterns are first class constructs, and every line can end with a GOTO. That definitely lets you write some very, er, compact code. But it was superb to program in.
And how about the assigned GOTO in older versions of Fortran? "GOTO N" where N is an integer variable whose value will be known at runtime.
No, it's the other way around; i, j, ... have a long history in mathematics as indices for matrices, summations, etc., with m, n likewise being traditional for the dimensions of a matrix.
In early FORTRAN, integers were present primarily to be used as array subscripts. The INteger mnemonic doesn't appear in any of the early papers or manuals.
As a side note, on the topic of features that would currently seem ‘strange’, some early languages that were intended to be programmed using teletypewriters rather than FORTRAN's Hollerith cards used half-line motions to write array subscripts as actual subscripts.
The COBOL MOVE CORRESPONDING verb. I'm not going to try to explain exactly how it works here, but imagine, if you will, copying the contents of one table to another table with a different column structure, but sort of the same names.
This is one of the COBOL features that makes a lot more sense if you think of COBOL as SQL for tape drives than as something that is trying to be, say, Pascal or Ada and failing. (Along with, say, pretty much the entire DATA DIVISION...)
C switch fallthrough. I know it's old and everybody's used to it and I get why it's there, but it's freaking weird. I love c#'s response that you must use goto <case> instead to make fallthrough explicit.
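The classic demonstration (my own toy example, in C-style C++): without a break, control simply falls into the next case's code.

```cpp
#include <cassert>

// Classic fallthrough: case 5 has no break, so its code runs and then
// continues into case 6's code.
int count_weekend_ish(int day) {      // 0 = Mon ... 5 = Sat, 6 = Sun
    int score = 0;
    switch (day) {
        case 5:                       // Saturday...
            score += 1;               // ...falls through into Sunday's case
        case 6:
            score += 1;
            break;
        default:
            break;
    }
    return score;
}
```

C#'s rule would force the first case to end in an explicit `goto case 6;` (or a break), making the fallthrough visible in the source.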
in vb.net: off-by-one array sizes so people who do one-based arrays don't get hurt. Also, two sets of boolean operators, the simpler ones don't short-circuit (or vs orelse, and vs andalso).
It kinda made sense, because there wasn't anything fancy like Notepad. If you wanted to read some section, you had to specify the range. If you wanted to change a line, you overwrote it by using the same line number. If you wanted to insert a line, you had to pick a line number which lies between those other two line numbers. That's why you used an increment of 10 (or whatever) instead of 1.
It logically follows from how arrays in C work, so I don't really know if it qualifies as weird/surprising, but the old array[i] / i[array] thing is fun to show to people who haven't seen it before.
By definition, the subscript operator [] is interpreted in such a way that ‘‘E1[E2]’’ is identical to ‘‘*((E1) + (E2))’’. Because of the conversion rules which apply to +, if E1 is an array and E2 an integer, then E1[E2] refers to the E2th member of E1. Therefore, despite its asymmetric appearance, subscripting is a commutative operation.¹
Now obviously, C arrays are just strips of memory allocated to the array, so you can access an element using the address of the start of the array plus an offset (the number of array items to skip). You add the address and the offset, and you have the address of the item.
And now, a[b] is just shorthand for the pointer arithmetic going on. (You can't use two plain ints, though: the compiler requires one operand to be a pointer or array and the other an integer, so arbitrary int[int] accesses won't compile.)
I think the trick is that array[i] and i[array] will always be the same value in C.
It's confusing because array[i] makes sense but i[array] doesn't (how the hell can you use an array as an index on an integer?)
It works because array is a pointer to a memory address and i (or index) is an offset. array[i] adds the index to the memory address and returns the value there, whereas i[array] adds the memory address to the index. Since array + index == index + array, they point to the same memory and return the same value.
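Seeing it run makes it click (plain C semantics, equally valid in C++ for raw arrays):

```cpp
#include <cassert>

// E1[E2] is defined as *((E1) + (E2)), and + commutes, so array[i]
// and i[array] name the exact same element.
int demo() {
    int array[] = {10, 20, 30, 40};
    int i = 2;
    assert(array[i] == i[array]);        // both are *(array + 2)
    assert(&array[i] == &i[array]);      // literally the same address
    return 3[array];                     // same as array[3]
}
```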
I also miss call by name (http://en.wikipedia.org/wiki/Man_or_boy_test) on the first page.
In general, one should read the Intercal (http://en.wikipedia.org/wiki/INTERCAL) paper, and figure out from which language each of its features came.