The best part of this article is in the comments. An original programmer on the game explains why the bug occurred and also why the code was so unreadable (hint: it was written in assembly).
Ah, the quantum tunneling pinball!
We ran into this while writing the original code at Cinematronics in 1994. Since the ball motion, physics, and coordinates were all in floating point, and the ball is constantly being pushed "down" the sloped table by the gravity vector in every frame, we found that floating point error would gradually accumulate until the ball's position was suddenly on the other side of the barrier!
(To simplify collision detection, the ball was reduced to a single point/vector and all barriers were inflated by the ball radius. So, if the mathematical point got too "close" to the mathematical line barrier, a tiny amount of floating point rounding or truncation error could push the point to the other side of the line)
To mitigate that, we added a tiny amount of extra bounce to push the ball away from the barrier when it was nearly at rest, to keep the floating point error accumulation at bay. This became known as the Brownian Motion solution.
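(Purely as an illustration of what that mitigation amounts to -- the names and thresholds below are invented, not taken from the Cinematronics code -- the nudge was conceptually something like this:)

#include <math.h>

typedef struct { float x, y; } Vec2;

// If the ball is nearly at rest and nearly touching a barrier, give it a
// tiny push along the barrier's outward normal so accumulated floating
// point error can't drift the point across the line.
static void brownian_nudge(Vec2 *vel, Vec2 barrier_normal, float dist_to_barrier)
{
    const float REST_SPEED   = 0.001f;   // "nearly at rest" threshold
    const float CONTACT_DIST = 0.010f;   // "nearly touching" threshold
    const float NUDGE        = 0.005f;   // the tiny extra bounce

    float speed = sqrtf(vel->x * vel->x + vel->y * vel->y);
    if (speed < REST_SPEED && dist_to_barrier < CONTACT_DIST) {
        vel->x += barrier_normal.x * NUDGE;
        vel->y += barrier_normal.y * NUDGE;
    }
}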
Since much of the original code was written in x86 asm to hand tailor Pentium U/V pipelines and interleave FPU instructions, but wiki says Microsoft ported the code to C for non-Intel platforms, I'm sure the code had passed through many hands by the time it got to you. The Brownian Motion solution may have been refactored into oblivion.
I had a similar problem in one of my first games (a scrolling shooter with asteroid-style physics, a hobby project in TP7 and mode 13h), and I solved it in a similar way: when objects bounced off an obstacle I added a constant to the length of the response vector, so the ship always bounced away with a velocity large enough that, even with gravity added, and even when the player rotated towards the obstacle and engaged the engine, it would still move out of the obstacle on the next frame.
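Roughly, in C rather than the original Pascal, and with made-up names and numbers (a sketch of the idea, not my actual code), the collision response was something like:

typedef struct { float x, y; } Vec2;

// Reflect the velocity about the (unit) surface normal, then add a fixed
// amount of speed along the normal, so that even with gravity pulling the
// ship back and the player thrusting into the obstacle, the ship still
// ends up outside the obstacle on the next frame.
static void bounce_off(Vec2 *vel, Vec2 surface_normal)
{
    const float EXTRA_BOUNCE = 0.5f;   // constant added to the response length

    float dot = vel->x * surface_normal.x + vel->y * surface_normal.y;
    vel->x -= 2.0f * dot * surface_normal.x;   // v' = v - 2(v.n)n
    vel->y -= 2.0f * dot * surface_normal.y;

    vel->x += surface_normal.x * EXTRA_BOUNCE; // lengthen the response
    vel->y += surface_normal.y * EXTRA_BOUNCE;
}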
It looked a little funny, because when you left the ship to itself, it bounced in place to half its height and back, forever.
To mitigate this I added two kinds of obstacles: most of the terrain was filled with spiked obstacles that did damage on each collision, so players wouldn't test my collision response too much :) There were some obstacles without spikes, and you could bounce on them forever, but they were rare, and I marked them as "landing places". I think that added to the gameplay.
Then I added buttons that trigger doors to open/close, and in a few levels I just placed a ship with no AI attached over the button. Effect: the ship bounced forever, so the doors connected to the button opened and closed repeatedly ;) The player could also influence that by pushing the triggering ship away :)
It's funny how the constraints of a game engine can guide you towards gameplay elements you wouldn't have considered without them.
Maybe the reason the Brownian motion stopped working was a shift from 80-bit x87 intermediaries to 64-bit SSE in x64, and the removal of long double support from MSVC.
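A toy C program (not the Pinball code, just an illustration of the effect) shows how the width of the accumulator changes how far a long run of tiny gravity-sized steps drifts. Note that long double is 80-bit extended precision on typical x87 builds but is just a 64-bit double under MSVC, which is the change being hypothesized:

#include <stdio.h>

int main(void)
{
    float       pos_f = 1000.0f;   // 32-bit accumulator
    double      pos_d = 1000.0;    // 64-bit accumulator
    long double pos_l = 1000.0L;   // 80-bit on x87 builds, 64-bit under MSVC

    // Each step is below half an ulp of 1000 in float, so the float
    // accumulator never moves at all, while the wider ones accumulate.
    for (int i = 0; i < 10000000; i++) {
        pos_f += 1e-7f;
        pos_d += 1e-7;
        pos_l += 1e-7L;
    }

    printf("float: %.9f  double: %.9f  long double: %.9Lf\n",
           pos_f, pos_d, pos_l);
    return 0;
}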
As a matter of fact, there is still a quantum tunneling bug. If you push and release the flippers too fast, the ball goes through them. Not a big deal unless you want to control the ball with the same fine-tuned touch as on real machines.
BTW, it works OK with Windows 7 (64), just copying it from an old installation. I like the game... 170M :-)
This is an excellent example of why comments in code are so very useful, helpful and in my opinion, necessary.
... nobody at Microsoft ever understood how the code worked (much less still understood it), and that most of the code was completely uncommented, we simply couldn't figure out why the collision detector was not working. Heck, we couldn't even find the collision detector!
To all those folks out there who keep finding reasons to try and be clever with their language, or to omit comments from their code: now you can remember Pinball. It's very possible that with some minimal commenting about what functions did, or what the intentions of blocks of code were, Pinball could have lived on. Heck, we could have even seen a Modern UI version of Pinball.
It's an argument for clear code that's readily understandable by other people. It's not an argument for any particular technique for achieving that goal. I've never seen any programmer seriously suggest that code should be deliberately made difficult to understand.
I can't agree with you emphatically enough about this. Cargo cult commenting is one of the easiest ways to damage the readability of a codebase.
Comments should be an edge-case solution for explaining unusual quirks, not the go-to approach for explaining the entire codebase. Using comments to explain what the code is up to is like using exceptions for flow of control. A large proportion of the comments I see should actually be function or method names for a more well factored version of the block of code that follows.
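To make that concrete, a contrived C example (not from any real codebase): a block that needs a comment as a label usually wants to be a function whose name is that label.

#include <ctype.h>
#include <stddef.h>

// Before: the comment does the job a name should do.
//
//     // strip trailing whitespace
//     while (len > 0 && isspace((unsigned char)s[len - 1]))
//         s[--len] = '\0';
//
// After: the label becomes the function name, and the call site reads
// like the comment did.
static size_t strip_trailing_whitespace(char *s, size_t len)
{
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';
    return len;
}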
This point gets brought up a lot, but I'm not going to complain at all. In fact, just 5-6 months ago I used to be one of those programmers who would do
# gets the xml
def get_xml():
It took comments like this to help me realize just how silly the whole thing was. Now I write as much self-documenting code as I can, but comment as needed.
Comments should be "why I did this", and code should be "how I did this"; some people just write "what did I do" and walk away thinking they've commented.
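A throwaway C illustration of the difference (the gateway behaviour described here is invented for the example):

// "What" comment: restates the code and adds nothing a reader can't see.
// "Why" comment: records the reason, which the code cannot express.
static int effective_timeout_ms(int timeout_ms)
{
    // Why: the gateway silently retries a failed request once, so callers
    // have to allow for two round trips, not one.  A "what" comment here
    // would just say "double the timeout".
    return timeout_ms * 2;
}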
I generally become aware that I should leave a comment whenever I write something that is a bit more than trivial (but only a bit) to write or understand. A longish list comprehension? Summarize what it does. Nested function calls? Summarize why. Etc.
"A large proportion of the comments I see should actually be function or method names for a more well factored version of the block of code that follows."
This may be a valid argument for modern languages like Ruby or Python. By 90s coding standards, however, comments are even MORE important than the code itself. For assembly and C programming, and for low-level graphics and physics algorithms, proper naming of variables and functions is never enough for readable code.
I agree that comments should not be the "go-to approach for explaining the entire codebase." I hope you didn't think I was insinuating that in my comment. I was only trying to illustrate that in the absence of clear code, which many developers are guilty of sometimes, comments can be very helpful.
Sadly, I fear that comments won't help in that situation. They could, except that the solution depends critically on the developer realizing they're writing garbled code during the time they're writing it. I don't know about anyone else, but my code's never garbled when I write it. It only becomes garbled by magic, and only when I don't look at it for a few weeks. Or, strangely enough, the moment someone else looks at it.
Still haven't figured out a reliable enough garble detection mechanism that my employer's willing to pay for.
Nah I didn't think that at all, it was clear from the way you said "minimal commenting". I just have an axe to grind when it comes to comments of the form
// assign the integer value 10 to the variable x
x = 10
In our embedded systems class we were required to define all of our constants, just to add another level of description to operations. It definitely helped to navigate spaghetti code.
"I've never seen any programmer seriously suggest that code should be deliberately made difficult to understand."
Then you've not worked with some of the people I've worked with. The clever "it was hard to write, it should be hard to read" mantra has been thrown out by a few ex-colleagues trying to be clever, and a couple of them really meant it. Whether it was a hatred for others, or a misguided attempt at "job security", I'll never know, but some of these people do exist.
Often, however, the same effect is had under the guise of "clever hacks" - look how clever and awesome I am! What? Of course everyone else can read and understand this (they just have to understand that I compiled this with my own custom-built compiler I rolled myself on gentoo so I could squeeze out an extra .0001% improvement!)
"Often, however, the same effect is had under the guise of "clever hacks" - look how clever and awesome I am!"
Sometimes I write code like that. Then I delete it and write it the proper way.
Clever code is code you will not be able to understand in the morning.
There is Kernighan's adage
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."
Indeed I fear I wrote code like that yesterday (a particular problem that, after talking to 3 other engineers, everyone agreed did not have a simple solution) that I am going to have to debug today. I am not looking forward to it!
Useful thoughts. It reinforces my view that it's really the intent that bugs me. I've dealt with crap code littered with comments like "I'm sorry, didn't know what I was doing", etc. The person/people knew they were in a tough spot. In other cases I've dealt with crap code with random comments like "Java sucks. Sun sucks. Everyone using Java sucks and is stupid - Ruby is the only true language, and it sucks that this is in Java" and "if you're too stupid to understand this, quit right now and tell so-and-so to go hire a real programmer".
Just... garbage attitude oozing through the code at every line - difficult to work with.
Maybe some awful programmers do advocate for it, but that doesn't really change my overall point. There is no serious argument in the community, at least, over whether code should be readable. The argument is merely about how to make it readable, and this story does not argue for any particular approach there.
One guy I worked with would do it to protect his bailiwick. He was intensely territorial, and knew that writing code so ugly that nobody wanted to even look at it, much less take the time to understand it, was an effective way to make sure that nobody else on the team would ever touch anything he wrote.
Sounds like "mortgage code"... code that is intentionally so complex that no one in the company other than yourself can maintain it, so you end up with a job for life that pays your mortgage.
Ha. Unfortunately for him, it became clear that most of his code was all sound and fury, calculating nothing. Well, that and he took forever to fix bugs because he couldn't really understand it either.
He was soon out on the street, and the rest of the team's policy of just rewriting bits instead of debugging them (it never took very long, and frequently resulted in 1/10 as many lines of code doing twice as much) had soon swept away most of his footprints.
yeah, +1. In this religious war, I side with "if anyone who has a reasonable level of competence with my code can't grasp it at first reach, I have failed in writing it."
Or, famously, "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." --Brian Kernighan
What other technique do you have at your disposal in code originally written in assembly, then translated to C, and then refactored several times without a rewrite?
I personally didn't hear of one that would be usable in this case.
Also, I'm starting to think I should write a script to cut out every comment here that has "code" and "comments" in it - along with several other phrases that signify neverending debates.
I haven't heard the argument about making code deliberately hard to understand either.
I agree that clearer code and/or commenting would have been suitable.
I only latched on to the example because they specifically mentioned that the code wasn't commented. In this case, a few lines of comments might have helped them find what they were looking for. Even if the original programmer was one of those people who just can't stop writing code like:
$a = ($b)?($ab):$c;
d($a);
Maybe some amount of convincing could lead them to leave a comment every once in a while so we can at least get:
// Detect collision
$a = ($b)?($ab):$c;
d($a);
Since it seems like both clear code and comments were absent, either one would have been helpful.
I don't think comments are any kind of substitute for clear, well-structured code. In your example, "Detect collision" still doesn't give me any kind of epiphany about the code. At best, it gives me a slight hint about where to go to get more information. Chances are I'm trying to fix a bug or change the behavior, so I still need to figure out what $a, $b, $ab, $c, and d are and what their purpose is.
In other words, I have nothing against good comments, especially on obscure code, but I think they're only marginally useful even in those cases.
Definitely. I wasn't trying to say that they were a substitute. I specifically said both could have worked. In the example they cited, they couldn't even find where to begin. A small comment could have done something about it.
When you can't have both clear code and commenting, you could at least try to do one every once in a while. How can that be a bad thing?
Respectfully, commenting code and making it easily readable don't usually add to the bottom line. So while we all miss playing Pinball, it would be a waste of money to the shareholders.
This game was another casualty of maximizing value towards shareholders and away from all other stakeholders.
I disagree. Sure, in the short term, taking on that technical debt may have made sense, but in the long term, this is a perfect example of how technical debt added significant maintenance costs to a project.
What if dropping Pinball was not an option? The cost to the shareholder would have been much higher than it needed to be.
There's a commonly repeated claim (and one likely even true to some extent) that ~60-70% of the cost of a software system is in maintenance. Which, to me, sounds like if you're the contractor providing the system, 60-70% of the value is in the maintenance. It does mean taking something of a hit upfront to make things maintainable, but with the right billing structure you can be adding to the profits by Doing It Right as you go. It's certainly still a gamble though; you might not get the maintenance contract. You might be billing hourly and thus not see any extra cash for your time savings (and indeed, might see less).
There are some readability problems that stem from a lack of comments, and there are other readability problems that stem from exceedingly poorly-written code.
Heck, we couldn't even find the collision detector!
This sentence tells me that the big problem here comes from the latter, much worse, source of readability problems. The fact that they couldn't find the collision detector implies that the code had no class named CollisionDetector, no procedure named DetectCollisions, nothing of the sort. Which implies (if not conclusively) that there was a whole lot of spaghetti code floating around in there. Or at least a whole lot of very poorly-factored code.
I do think that comments are an important part of making code readable and maintainable. But comments are there to handle less drastic readability concerns. Documenting what units a parameter should be in, for example, or saying, "The following code implements X algorithm with Y and Z optimizations applied." Comments should never just be a substitute for using language-level code organizational and structuring constructs to organize and structure one's code.
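For instance, a comment like this (illustrative only; the function and its parameters are made up) earns its keep without trying to replace the code:

// Advances the simulation by one frame.
//
//   dt       -- timestep in seconds (not milliseconds)
//   gravity  -- acceleration in table units per second squared
//
// Uses semi-implicit Euler integration: velocity is updated first, then
// position, which keeps slow-moving objects stable.
void step_simulation(double dt, double gravity);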
A programmer from the original game commented on the article and said much of it was done in asm and later ported to C by Microsoft, so it probably really was that messy.
I'm pretty sure I saw a library somewhere that could be purchased in both "raw code" and "commented, documented code" forms.
Which I can only assume meant they had the comments/documents and were intentionally stripping them out for the cheaper version. The difference in price was also actually quite significant as well, as I recall.
That, along with the fact that the person doing the purchasing is not necessarily the person who would be using the product, makes this IMO a particularly nasty sales tactic.
I've also heard war stories from Big Company folks about contracting out for a large software project, and per contract receiving the full source for the app, but none of the (particularly convoluted) build scripts or Makefiles.
Lessons Learned are why BigCo has a 900 page specification document including glossary of what the word "means" means.
This is also a great example of what happens during acquisitions.
Space Cadet was developed by Cinematronics (apparently no connection to the classic arcade game manufacturer) and published by Maxis. Maxis eventually bought Cinematronics. Then (according to Wikipedia) "The studio was closed and the employees were laid off when Maxis was acquired by Electronic Arts in 1997."
So someone out there knows how to fix it. I bet Microsoft called EA looking for a quick/free fix, EA called the employee (or lost track of them, more likely), and the employee told them to stick it.
Comments under the article seem to imply that most of the relevant code was hand-converted from (probably well-commented) assembly to C, which I can only assume was done by someone who had no interest in how or why the code actually works. That would neatly explain the lack of comments (or meaningful structure, for that matter).
You don't understand. The kind of person who writes an undetectable collision detector will never write a comment like this. He might write some other comment, but it will be as obscure as his code.
I usually argue that version control history is often as useful as code comments. In this case, though, I suppose comments would have been better than some arcane 15 year old VCS :)
Open source wasn't exactly a huge priority for Microsoft in 1995 when the application came into existence. In fact I think it's a testament to their backwards compatibility that pinball survived for as long as it did with little or no maintenance.
Indeed, and it was still on their agenda in 1998. Microsoft saw Linux as their worst enemy and developed business strategies to eradicate it (by means of 'de-commoditizing'). Then, the 'Halloween Documents' leaked. ESR writes:
"In the last week of October 1998, a confidential Microsoft memorandum on Redmond's strategy against Linux and Open Source software was leaked to me by a source who shall remain nameless." This document can be read at http://www.catb.org/esr/halloween/halloween1.html
Elaborate on what? It seems rather obvious that a popular game with a minor bug rendering it unplayable would have that bug fixed by someone for free if it was open source.
Or do you want me to elaborate on why it's too bad? Well, pinball is the best game that was ever included with Windows, so it's unfortunate that it's not still there.
First, it was not a minor bug; it was a major bug that made the whole product unusable.
Second, the whole code base was one big, nasty, uncommented mud-ball. From the article: "... we simply couldn't figure out why the collision detector was not working. Heck, we couldn't even find the collision detector!"
I was genuinely curious to hear how you think sprinkling fairy dust open source on top of the mess is going to magically fix the issue. I am not talking about reimplementing the whole thing from scratch (which nobody is prohibiting you by the way). I am asking about the value added to the community to fix this particular piece of crap.
But, based on your response, I am going to risk even more karma and chalk it up to trolling. You did not address my question, and don't even look like having read the article or understood the issue. So... have fun!
People not in the middle of working on the program don't understand the magnitude of the problem, and it's way easier to be an internet warrior shooting off their mouth. GP's comments made it appear to be trolling.
In fact the problem was way worse: the program used floating point for movement computation and could accumulate floating point errors along the way until the ball moved too far. The floating point precision changed on 64-bit, which exacerbated the problem. That's probably why the collision detection didn't work.
It's better to rewrite it from the ground up than to fix it.
Windows has millions of users. If Microsoft went out and said: "Hey, this game you all like has a bug in 64-bit versions, here's the source, any ideas?", I bet tons of people would have invested a couple of days. As you see in the blog post, it was not even possible to invest that amount of time into the problem within Microsoft.
Open source is not magic in the least. It just enables other people to try and help out if your project means anything to them.
If the game had been open source from the beginning, it probably would not be an undocumented mess right now. Sure, open source is no guarantee of quality, but open source plus popular means that many eyes will look at the code, and it will naturally develop documentation.
Why are you so angry? Is it so offensive to you that I think that millions of programmers throughout the world with nearly unlimited time could do something that a couple Microsoft programmers couldn't do in a short period of time?
I can't try this right now, since I don't currently have access to a Windows machine, but can anyone confirm/deny this? Do you get the floating-point gravity bug?
Heh, it's nice to read about that, and as a pinball enthusiast I'm a bit saddened to see it gone, even if that's more for retro reasons than because it was special in some way...
Maybe MS should team up with someone who understands making (digital) pinballs, like the guys that created Zen Pinball?
Look for Danny Thorpe's comments, he mentions that it was created by a company called "Cinematronics", which was later acquired by Maxis, which was then acquired by EA.
To me, the lesson is: do not duplicate code! And keep it simple, stupid.
When implementing Unicode, MS said "hey let's copy all of our API functions to new names". Now they have two problems.
When implementing 64-bit, MS said "hey let's have a completely different OS for 64-bit." Now they have four problems.
Compare this to Unix/Linux/MacOS where Unicode is just implemented as a standard way (UTF-8) of encoding characters into the existing API. And there need only be one shipping OS which supports both 32-bit and 64-bit process types equally since there are only a handful of kernel APIs compared to thousands for Windows.
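To see how little that asks of existing code, here is an illustrative C fragment for a Unix-like system. UTF-8 is just a sequence of bytes, so it flows through the old char* interfaces untouched, and every ASCII string is already valid UTF-8:

#include <stdio.h>

int main(void)
{
    const char *path = "r\xC3\xA9sum\xC3\xA9.txt";   // "résumé.txt" in UTF-8

    FILE *f = fopen(path, "w");      // same call as for a plain ASCII name
    if (f) {
        fputs("caf\xC3\xA9\n", f);   // "café" in UTF-8
        fclose(f);
    }
    return 0;
}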
There is a great deal of both misunderstanding and ignorant Microsoft bashing in this comment.
First of all, you are mixing up two completely different concepts.
For character encoding on Windows:
For many functions in the Windows API there are two versions of a function: one with an A (for ANSI) at the end and one with a W (for wide). This was added to make it easier to support Win32 on both Windows 95, which used 8-bit characters and codepages, and Windows NT, which was natively UTF-16 Unicode. At the time UTF-16 was considered the best and most standard choice for supporting Unicode. In most cases it is implemented as a W function with an A function that is little more than a wrapper.
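To make that concrete, here is a simplified sketch of how such an A wrapper can forward to the W function (error handling omitted, buffer sizes arbitrary; the real wrappers in the system DLLs are more careful than this):

#include <windows.h>

// Sketch of an ANSI->wide wrapper in the spirit of the real A/W pairs.
int MyMessageBoxA(HWND hwnd, const char *text, const char *caption, UINT type)
{
    WCHAR wide_text[1024];
    WCHAR wide_caption[256];

    // Convert from the current ANSI codepage to UTF-16...
    MultiByteToWideChar(CP_ACP, 0, text, -1,
                        wide_text, sizeof wide_text / sizeof wide_text[0]);
    MultiByteToWideChar(CP_ACP, 0, caption, -1,
                        wide_caption, sizeof wide_caption / sizeof wide_caption[0]);

    // ...and forward to the wide ("real") implementation.
    return MessageBoxW(hwnd, wide_text, wide_caption, type);
}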
This has nothing to do with what Raymond is describing.
For the 64/32-bit stuff, they ensured that all code would compile and work correctly as both 32-bit and 64-bit, and built two versions, one for ia32 and one for amd64. The kernel had to be modified to support the amd64 architecture. This is exactly what Linux, OS X and other operating systems that support multiple architectures do. On top of this, because amd64 supports backwards compatibility, they also included an ia32 environment with it as well, but this is optional, so anything that ships with the OS cannot depend on it. I assume this is what OS X does too; the only difference is that with Windows the two versions ship as different SKUs, while Mac OS X ships with both versions and installs the one that the computer originally shipped with.
Second, the number of system calls has nothing to do with any of this at all.
Windows unicode support predates the existence of UTF-8 -- so that's great API design for Windows if you possess a time machine.
The ANSI functions merely map to the unicode functions.
In addition, the Windows Kernel is probably similar in API size to the Linux kernel. Of course, that's not nearly enough API for a complete Windowing operating system in either case.
The point is both Unix and Windows faced the exact same problem: needing to support larger characters. MS thought little and unleashed their army of coders to do something foolish. The Unix guys thought hard and did something elegant and sensible.
NT was started around 1988 and released in mid-1993. UTF-8 wasn't ready until early 1993. It's too bad they couldn't go back in time to retrofit everything.
Programming is a series of tradeoffs. Unless you are in the middle of doing it, you don't understand the pressures the programmers were facing and the tradeoffs that needed to be made. The Windows and Unix guys didn't face the same problem, since they were different problems with different tools available in different time periods. Hindsight is 20/20, and it is easy to be dickish and laugh at their mistakes afterward.
A huge amount of Unix software uses UTF-16 just like Windows (including Java). You're just being deliberately ignorant of the history. UTF-8 didn't exist and UCS-2/UTF-16 was the standard. One could argue that the Unicode Consortium screwed up assuming that Basic Multilingual Plane would be enough characters for everyone.
I am not ignorant of the history. Please try to follow this reasoning:
1. Faced with a new character set that was larger than eight bits (16-bit Unicode), Microsoft said "hey, let's make an all-new API" and set to work rewriting everything.
2. Faced with a new character set that was larger than eight bits (32-bit Unicode), the Unix guys said "hey, let's create a standard way to encode these characters" and rewrote nothing.
You seem to be fixated on the difference between the new character sizes. Ignore the precise number of bits! The point is when making a change to adapt to a new system, do you rewrite everything and risk causing bugs everywhere, or do you do something clever which has far less risk and uses the same API?
#2 massively ignores the fact that they didn't bother to solve the problem until much, much later. In fact, everyone else solved it the same way as Microsoft before then, even on Unix. Actually, a bunch of Unix guys were involved in the design of UCS-2 and UTF-16, so I'm not sure why it's Microsoft's fault.
But yes, some Unix guys eventually faced with a bigger problem, significantly more time, and a design already started by the Unicode Consortium eventually solved it better. But that's not really much of an argument.
Also, arguing that there is no risk in going to UTF-8 is ridiculous. Anything that treats UTF-8 as ASCII, as you suggest, is going to get it wrong in some way. At least making a new API forced developers to think about the issue.
They didn't exactly face this problem. The Linux kernel actually has almost no idea of any kind of Unicode or encoding, except in two places: the character console code and Windows-originated Unicode-based filesystems. It's interesting to note that NTFS in the Windows kernel implements its own case folding mechanism for Unicode, and that this is also probably the only significant place where the Windows kernel has to care about Unicode.
What are the consequences of your preferred approach for legacy code? While there are certainly disadvantages of duplicating APIs in terms of cruft, I think that many customers appreciate that old code continues to run (and compile) against newer versions of the OS.
I believe you'll find it to be Windows <insert version here> 64-bit Edition.
A quick googling suggests that Windows 7 retail copies have both 32-bit and 64-bit editions, but I'm guessing you still have to pick the right edition when you go to install it. And if you don't have a retail disc, then you probably only got one of the two architectures.
Compare this with Mac OS X, where there is no "32-bit edition" or "64-bit edition", there is just "Mac OS X". Everybody installs the same OS, and it transparently works for both 32-bit and 64-bit machines. The only time when you really have to care is if you're installing a 3rd-party kext (kernel extension), as 32-bit kexts won't work on a 64-bit kernel, but that's pretty rare and any kext that's still being maintained today will support both architectures.
That's a nice description of what's going on, but it really boils down to:
* Running 32-bit kernel on 32-bit hardware
* Running 32-bit kernel (mostly; enough that kexts are 32-bit) on 64-bit hardware
* Running 64-bit kernel on 64-bit hardware
Interestingly, for a while only the Xserve defaulted the kernel to 64-bit mode, whereas all consumer machines used the 32-bit kernel mode, though this could be toggled using a certain keyboard chord at startup (I think it was holding down 6 and 4). Eventually this shifted so the 64-bit kernel became the default, but only after giving kext authors enough warning so they could update their kexts.
In any case, as diverting as this is, it doesn't really matter to the consumer. There was only one version of OS X to install, and it ran in whatever mode was appropriate for the machine in question. The only reason consumers ever had to care what mode their kernel was running in was if they wanted to use a kext that did not have both 32-bit and 64-bit support, and by the time the kernel switched to 64-bit mode by default, this was pretty rare.
> but I'm guessing you still have to pick the right edition when you go to install it. And if you don't have a retail disc, then you probably only got one of the two architectures.
Good guess. They are separate versions, and only retail includes both.
And, whats more, you can install Leopard on a 64 bit x86 machine, then dd the HD over to a ppc machine, and it will boot, as long as you use the right partition format.
I do believe that Mountain Lion dropped 32-bit kernel support. But Apple hasn't produced a 32-bit machine in years, and every new major version of the OS does tend to drop support for old hardware. The fact that the motivating factor here was dropping the 32-bit kernel is fairly irrelevant.
Even when you look at the syscalls themselves there is a difference: the Linux kernel deals mostly in opaque strings of bytes (which today are, by user-space convention, mostly UTF-8), while the NT kernel deals mostly in UCS-2 code point strings (which are sometimes UTF-16, and this way madness ensues).
Again, you are confusing the user mode API (Win32/64) with the kernel mode service API. Windows does have a kernel mode service API; it's just not well known, since most people don't need to deal with it.
Apparently, someone has actually FIXED it and put it online. Although it's hard to know whether the linked EXE isn't just a trojan. Anyone have a clean room they can test it in?
[That would have been even more work, because there was at the time no infrastructure in Setup for having 32-bit-only components. (And then automatically uninstalling it when WOW64 was disabled.) And besides, all the people who criticized Windows 96 as "not really a 32-bit operating system because it has some parts in 16-bit" would use the same logic to say that 64-bit Windows is "not really a 64-bit operating system." -Raymond]
Man, if that's the case, does using a BIOS mean I'm still running a 16-bit operating system?
If so, then I guess I've got to marvel at how popular "16-bit" Linux is - I can run 64-bit only applications, watch Flash, Netflix... whoever knew a 16-bit OS could be so powerful!
Yup. All you've got to do is copy it over from a Windows XP machine, if you have one laying around.
Aside from the explanation others have quoted, I'm guessing some of it was also an aesthetic decision. They could probably still go and bring it back using WOW64, but it still looks like a Win95-era program, whereas all the other games have been replaced with better-looking rewrites. That and I think it only supports a resolution of 640x480.
> nobody at Microsoft ever understood how the code worked (much less still understood it), and that most of the code was completely uncommented, we simply couldn't figure out why the collision detector was not working. Heck, we couldn't even find the collision detector!
This continues to be one of my pet peeves, particularly with code samples and a lot of what is posted on GitHub, even major libraries. Almost no comments, background or guidance on the intent and structure of the code.
No, I am not suggesting that something like this is necessary:
// Iterate through all elements
for(int i=0; i < count; i++)
{
// Check that velocity isn't above threshold
if(velocity[i] > THRESHOLD)
{
// Limit velocity
velocity[i] = THRESHOLD;
...
That's ridiculous. However, something like the header comments in the examples further down is genuinely useful:
I've heard some say "I just write self-documenting code". That's a myth except for the simplest of structures. Any non-trivial piece of work is far from being self-documenting. Code is self-documenting for the guy who wrote it. I guarantee you that anyone else reading it has to reconstruct a stack in their head to understand what the hell is going on. That's not self-documentation. I shouldn't have to think to understand what a chunk-o-code is doing. The same goes for functions and/or methods.
The myth of self-documenting code is easy to demonstrate if I show you a piece of code in a language you don't know. I am going to assume that most programmers these days don't know assembler, Forth or Lisp.
I wrote this twenty years ago. Even if you understand Lisp, you'd have to think it through if it were presented bare, with no comments. But that's not how I wrote it. This is what I actually wrote:
;=====================================================================================================
; GetPolylineEndEntities
;
; Argument: Polyline entity name
; Return: Two element list containing the first (if any) entity found at the end of the polyline.
; The polyline itself is excluded.
; If nothing is found at a particular end, that element in the list is set to nil.
;
(defun GetPolylineEndEntities ( plename / plends cvcenter cvsize oldcmdecho pt1 pt2 endss outlist)
(setq plends (GetPolylineEnds plename)) ;Get the endpoints
(setq cvcenter (getvar "viewctr")
cvsize (getvar "viewsize")
oldcmdecho (getvar "CMDECHO")
)
(setvar "CMDECHO" 0)
(foreach point plends
(progn
;Examine what connects at each end
(setq pt1 (add2d point '(-0.0125 -0.0125)))
(setq pt2 (add2d pt1 '(0.025 0.025)))
;Zoom to the end being analyzed to have better selection accuracy
; **** Have to figure out a way to do this without zooming ****
(command "zoom" "c" point 2)
;Eliminate the original cable from the resulting selection set
(setq endss (ssdel plename (ssget "C" pt2 pt1)))
;Add the first entity found to the output list
(setq outlist (append outlist (list (ssname endss 0))))
)
)
(command "zoom" "c" cvcenter cvsize)
(setvar "CMDECHO" oldcmdecho)
outlist
)
Even if you don't know Lisp you now have an idea of what this code is doing. My style has changed over the years. This isn't my best example, but it is here to drive a point home.
The use of an unfamiliar language serves to illustrate the point that the idea of self-documenting code is, again, a myth. I wrote that code myself and without the comments I'd have to mentally reconstruct every step to even begin to understand what's going on and what the intent was. I haven't touched Lisp in quite some time.
I've looked through so much code in Github without a single comment that it makes me wonder if this is what is being taught in schools these days. Accurate in-code documentation is, as far as I am concerned, part and parcel of becoming a professional programmer. It recognizes that the work represents a huge investment in time, money and intellectual effort and it ensures that this effort and expense doesn't have to be duplicated in order to maintain, evolve or migrate the product as the Microsoft example clearly demonstrates.
MSG_LOOP_PostMessage:
mov a, MSG_Head
cjne a, #MSG_BUFFER_END - 2, mlpm0
mov a, #MSG_BUFFER
sjmp mlpm1
mlpm0:
inc a
inc a
mlpm1:
cjne a, MSG_Tail, mlpm2
clr a
ret
mlpm2:
mov r0, MSG_Head
mov @r0, MSG_Code
inc r0
mov @r0, MSG_Parameter
mov MSG_Head, a
mov a, #1
ret
I wrote that about fifteen years ago. Of course, the routine's label tells you something: "MSG_LOOP_PostMessage". It must post a message to a message buffer. The rest is gibberish. What's the intent behind each and every block of code? Well, of course, that's not how I wrote it. This is what I wrote:
; POST MESSAGE --------------------------------------------------------------------------
; This routine posts a new message to the message loop buffer.
;
; The message code and parameter are written to the buffer only if buffer space is
; available. This is determined by first looking at the difference between the
; head and tail pointers. By design, we sacrifice one set of locations (two
; bytes) in the buffer in order to create a gap between an advancing head pointer and
; a lagging tail pointer. If a lot of messages are issued and not processed immediately
; the head pointer will quickly wrap around and threaten to collide with the tail
; pointer. By sacrificing a set of locations we avoid having to keep a counter of
; unprocessed messages. This, because there would be ambiguity when both head and
; tail pointers point to the same location: it could mean that there are no messages
; to process or that there's a buffer full of messages to process. The two byte
; gap we are imposing between head and tail removes this ambiguity and makes it easy
; to determine the buffer empty and full conditions.
;
; If there's space for a new message it is stored and the head pointer advanced to
; the next available location. However, if no space remains, the message is discarded
; and the head pointer will remain one message (two bytes) away from the tail pointer.
;
; Arguments
; ----------------------------
; MSG_Head Message buffer head pointer
; MSG_Tail Message buffer tail pointer
; MSG_Code Message code
; MSG_Parameter Message parameter
;
; Return
; ----------------------------
; ACC = 0 -> Message did not post
; ACC <> 0 -> Message posted
;
; Modified
; ----------------------------
; MSG_Head
; ACC, R0
;
; Preserved
; ----------------------------
; MSG_Tail
; MSG_Code
; MSG_Parameter
;
MSG_LOOP_PostMessage:
mov a, MSG_Head ;Increment the head pointer by one message
cjne a, #MSG_BUFFER_END - 2, mlpm0 ;and parameter.
mov a, #MSG_BUFFER ;Need to wrap around.
sjmp mlpm1 ;
mlpm0: ;
inc a ;No need to wrap around, just increment.
inc a ;
mlpm1:
cjne a, MSG_Tail, mlpm2 ;Check for a buffer full condition.
clr a ;Flag that we did not post the message.
ret ;Exit if it is.
mlpm2:
;The buffer isn't full, we can store the new message
mov r0, MSG_Head ;Store message
mov @r0, MSG_Code
inc r0
mov @r0, MSG_Parameter ;Store parameter
;Now set the head pointer to the next available message location.
mov MSG_Head, a ;The accumulator already has the next "head" location
;there's no need to execute the same logic again.
mov a, #1 ;Flag that the message was posted
ret ;and exit
Now you don't have to know assembler to understand this code. A couple of years later I had to re-write this in C. It was incredibly easy to do because of the exhaustive in-code documentation. This is what came out:
// POST MESSAGE --------------------------------------------------------------------------
// This routine posts a new message to the message loop buffer.
//
// The message code and parameter are written to the buffer only if buffer space is
// available. This is determined by first looking at the difference between the
// head and tail pointers. By design, we sacrifice one set of locations (two
// bytes) in the buffer in order to create a gap between an advancing head pointer and
// a lagging tail pointer. If a lot of messages are issued and not processed immediately
// the head pointer will quickly wrap around and threaten to collide with the tail
// pointer. By sacrificing a set of locations we avoid having to keep a counter of
// unprocessed messages. This, because there would be ambiguity when both head and
// tail pointers point to the same location: it could mean that there are no messages
// to process or that there's a buffer full of messages to process. The two byte
// gap we are imposing between head and tail removes this ambiguity and makes it easy
// to determine the buffer empty and full conditions.
//
// If there's space for a new message it is stored and the head pointer advanced to
// the next available location. However, if no space remains, the message is discarded
// and the head pointer will remain one message (two bytes) away from the tail pointer.
//
//
// Return
// ----------------------------
// 0 = Message did not post
// 1 = Message posted
//
//
U8 msg_loop_post_message(U8 message, U16 parameter)
{
if(msg_loop.count == MESSAGE_LOOP_SIZE)
{
return(0); // Can't post because buffer is full
}
else
{
msg_loop.item[msg_loop.head].message = message; // Post message
msg_loop.item[msg_loop.head].parameter = parameter; //
msg_loop.count++; // Update message count
if( msg_loop.head == (MESSAGE_LOOP_SIZE - 1)) // Wrap around?
{
msg_loop.head = 0; // Yes.
}
else
{
msg_loop.head++; // No
}
return(1);
}
}
This code is definitely readable, but I feel like the port from assembler to C might have been too direct. Originally head was a pointer and it made sense to have conditional jumps, but in the C code head is an offset starting at 0. Why keep the entire if structure when you can simply have
msg_loop.head = (msg_loop.head + 1) % MESSAGE_LOOP_SIZE;  // Advance head and wrap
I remember I had to convert this to C in a hurry because new features had to be added. Doing it in assembler was just about unthinkable. The entire project consists of nearly 100 assembler files and resulted in about 90 files once translated into C.
This was a marathon run and once the entire project was translated and working the focus was narrowly shifted in the direction of adding functionality rather than optimizing the code.
This code was running on an 8051-derivative. Performance was important. You are always counting clock cycles on these systems. I think that if you look at the resulting machine code with the relevant variables mapped to the data memory space the modulo statement doesn't really do any better than a spelled-out conditional statement. In fact, it could be a couple of clock cycles slower (just guessing).
Again, it's been a long time. I can totally see looking at the generated assembly and saying something like "That's as fast as it is going to get. Move on!".
We all make less-than-elegant code at one point or another. Hopefully not too many times. :-)
> the modulo statement doesn't really do any better than a spelled-out conditional statement
Actually, on modern processors the modulo would probably be more efficient than the conditional, due to cool stuff they do like pipelining. Pipelining basically means executing instructions ahead of time (sort of in pseudo-parallel), so when there's a branch, there's a fair chance the results of that speculative work have to be thrown out.
One of the techniques used to get around this is branch prediction[1], where the processor initially guesses whether a branch will be taken, then updates its guess depending on whether the branch actually was taken. This works particularly well for for-loops and the like.
Another development on the ARM side (and MIPS too, I think) are instructions that are embedded with conditionals in them.[2]
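For what it's worth, on the ring buffer above a compiler can often get the best of both worlds. Just a sketch, reusing msg_loop.head and MESSAGE_LOOP_SIZE from the C version quoted earlier: most compilers lower the ternary below to a conditional move, so the wrap needs neither an unpredictable branch nor a division.

unsigned next = msg_loop.head + 1;
msg_loop.head = (next == MESSAGE_LOOP_SIZE) ? 0 : next;   // wrap without branching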
If you throw methods and functions at everything, all you are doing is adding the processing overhead of entering and exiting the method or function. These are not magical entities. There's a time, a place and a cost to using them.
Having come up from assembly and, in general, low level coding, one becomes very aware of what is being created behind the scenes. Unless something like this contrived bounds-check test will be used multiple times across a module or modules there's no reason whatsoever to add the overhead of entering and exiting a function for a simple couple of if/else-if/else statements.
I see this all the time. Everything has to be an object and everything has to be a class with a pile of properties and methods. No it doesn't. Massive projects --critical projects-- have been done over the years without any of that. Be careful not to engage in creating a monument to a coding paradigm rather than a tight, fast, practical and sensible solution to a problem.
Then tell your compiler to inline it - but don't bother doing so until you have measured it and found an actual, measurable impact. Readability and rewritability dominate the kind of unguided micro-optimizations you seem to like.
That doesn't mean abstracting for the sake of abstraction; it means choosing the right abstraction. Writing code so that it is easy to reach the right abstraction helps a lot.
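Concretely, and only as a sketch (the velocity/THRESHOLD names echo the earlier example in this thread): a static inline helper in C keeps the named abstraction while letting the compiler erase the call entirely.

// A named helper that costs nothing once the compiler inlines it.
static inline float clamp_to_threshold(float v, float threshold)
{
    return (v > threshold) ? threshold : v;
}

// After inlining, this loop compiles to the same code as the hand-written
// if/assign version.
static void limit_velocities(float *velocity, int count, float threshold)
{
    for (int i = 0; i < count; i++)
        velocity[i] = clamp_to_threshold(velocity[i], threshold);
}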
I used to play Pinball at school whenever I was in the Computer Lab. I also spent the bulk of my time in the computer lab at recess. Brings back old memories.
http://blogs.msdn.com/b/oldnewthing/archive/2012/12/18/10378...