
A few years ago I needed an emulator for version N of a popular microcontroller. I found an emulator (written in Java) for version N-1 of the same microcontroller. It was written in the style you describe: one derived class per machine instruction type; the program to run under emulation is a collection of these, referenced via their abstract base class. It was quite pleasant to extend, actually: I simply made classes for the new instructions available in the new version of the microcontroller, added instances of them to the giant table mapping codes to instruction objects, and It Just Worked.
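For concreteness, a rough sketch of that shape (all names and opcodes invented here, not the actual emulator's code): an abstract base class, one subclass per instruction, and a table from opcode to instruction object.

    import java.util.HashMap;
    import java.util.Map;

    // Toy CPU state; a real emulator tracks many more registers and flags.
    class CpuState {
        int a;   // accumulator
        int pc;  // program counter
    }

    abstract class Instruction {
        abstract void execute(CpuState state);
    }

    class Nop extends Instruction {
        @Override void execute(CpuState state) { /* do nothing */ }
    }

    class IncA extends Instruction {
        @Override void execute(CpuState state) { state.a = (state.a + 1) & 0xFF; }
    }

    class Emulator {
        private final Map<Integer, Instruction> table = new HashMap<>();

        Emulator() {
            table.put(0x00, new Nop());
            table.put(0x3C, new IncA());
            // Supporting a newer chip revision = new subclasses + new table entries.
        }

        void step(CpuState state, int[] memory) {
            table.get(memory[state.pc++]).execute(state);
        }
    }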

I do note that this is an atypical experience compared to extending programs in general, and I don't have an alternate, functional version to compare it to (the emulator was written in a very old version of Java + Swing).




In functional style, you'd probably have a table mapping machine instruction type to a function. These days, even Java (since version 8) supports that, and it roughly compiles down to the same thing (due to how Java implements lambdas).

This case sounds simple, but I still expect the functional approach to save you from a lot of line noise coming from Java class bureaucracy. In more complex cases, it may be the difference between a couple dozen lines and a couple dozen files.
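A hedged sketch of what I mean (same toy machine and invented opcodes as above): the per-instruction classes collapse into lambdas stored in a map.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.function.Consumer;

    class FunctionalEmulator {
        static class CpuState { int a; int pc; }

        private final Map<Integer, Consumer<CpuState>> table = new HashMap<>();

        FunctionalEmulator() {
            table.put(0x00, state -> { /* NOP */ });
            table.put(0x3C, state -> state.a = (state.a + 1) & 0xFF);  // INC A
            // Each new instruction is one line here instead of one class (or file).
        }

        void step(CpuState state, int[] memory) {
            table.get(memory[state.pc++]).accept(state);
        }
    }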


On the other hand, the OOP approach allows you to add printing the function name (for disassembly views), metadata about which flags are affected, clock cycles, etc. You could argue that each of those could be separate tables. Funny how it comes back to the array-of-structs vs struct-of-arrays design question, or maybe I'm just seeing that everywhere because of thinking about Mike Acton's Data-Oriented Design talk.
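A toy sketch of the two shapes (invented names, nothing from a real emulator): an array-of-structs table where each entry bundles the behavior with its mnemonic and cycle count, versus parallel struct-of-arrays tables indexed by the same opcode.

    import java.util.function.IntUnaryOperator;

    class InstructionTables {
        // Array of structs: everything about an instruction in one entry.
        static class InstructionInfo {
            final String mnemonic;          // for disassembly views
            final int cycles;               // clock cycles
            final IntUnaryOperator effect;  // toy behavior on an accumulator

            InstructionInfo(String mnemonic, int cycles, IntUnaryOperator effect) {
                this.mnemonic = mnemonic;
                this.cycles = cycles;
                this.effect = effect;
            }
        }

        static final InstructionInfo[] AOS = {
            new InstructionInfo("NOP", 1, a -> a),
            new InstructionInfo("INC", 1, a -> (a + 1) & 0xFF),
        };

        // Struct of arrays: the same information split into parallel tables.
        static final String[] MNEMONICS = { "NOP", "INC" };
        static final int[] CYCLES = { 1, 1 };
        static final IntUnaryOperator[] EFFECTS = { a -> a, a -> (a + 1) & 0xFF };
    }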


You get that with Java 8 and later: you can just define a FunctionalInterface and then use a mapping table from key to the function to use, and it gets really compact. If you want to decorate arbitrary functions, that should be fairly trivial too.

Wrote something small like that for a random data generator for an in-house property-based tester in Java a short while back. In my opinion, it's easier when all the code is visible on a single screen rather than spread out.
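Not that actual code, but the idea was roughly this (invented names, a minimal sketch): a small @FunctionalInterface for "generator of T" plus a map from type to generator, all of which fits on one screen.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Random;

    @FunctionalInterface
    interface Gen<T> {
        T next(Random rng);
    }

    class Generators {
        private final Map<Class<?>, Gen<?>> table = new HashMap<>();
        private final Random rng = new Random();

        Generators() {
            register(Integer.class, r -> r.nextInt());
            register(Boolean.class, r -> r.nextBoolean());
            register(String.class, r -> Long.toHexString(r.nextLong()));
        }

        private <T> void register(Class<T> type, Gen<T> gen) {
            table.put(type, gen);
        }

        @SuppressWarnings("unchecked")
        <T> T generate(Class<T> type) {
            return ((Gen<T>) table.get(type)).next(rng);
        }
    }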


Clojure. Clojure is extremely data oriented.


I think most of the data-oriented-programming stuff referenced nowadays is not about conceptual data (how humans understand it), but about data in the hardware (how machines understand it). In that regard, in Lisp you can't control memory allocation/alignment as flexibly as in C/C++ (due to its conceptual assumptions about data), so I don't know how you would do DOP well in Lisp.

(although it might sometimes serve well as a configuration language alongside high-performance C++ code like in gamedev...)


Those are two different kinds of programming, which unfortunately get similar labels.

The C++/gamedev "data-oriented programming" is about structuring your data in a way that's friendly to the CPU cache. It turns the programmer-friendly abstractions inside out in order to make execution as efficient as possible.

The Clojure/Lisp/functional "data-oriented programming" is about building your (programmer-level) abstractions around basic data structures like lists and maps. It's meant to simplify your code and make it more robust.

The two "data-oriented programming" paradigms are pretty much opposites in every way except the name. They both focus on data, but the resulting data structures are vastly different.


Actual Lisp implementations need to interface to the C/C++ side, thus they have non-standard / quasi-standard ways to deal with raw/foreign memory and with C types.


I'm... not sure how we went from replacing a function-like object with a function to adding disassembly and clock cycles. Am I missing some context here?


Can't edit or delete anymore so replying to myself. I confused the subthreads and thought this was still the "interview question" part. Obviously the assembly/clock cycles are relevant to phaedrus's example.


IMHO the simplest and most straightforward solution for a CPU emulator is a single (or perhaps nested) switch statement. Finding the right case and adding code there is much less annoying than the overhead of creating and updating classes and likely also separate files.
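Something along these lines (a toy machine with invented opcodes, not any real instruction set):

    class SwitchEmulator {
        int a;   // accumulator
        int pc;  // program counter

        void step(int[] memory) {
            int opcode = memory[pc++] & 0xFF;
            switch (opcode) {
                case 0x00:                      // NOP
                    break;
                case 0x3C:                      // INC A
                    a = (a + 1) & 0xFF;
                    break;
                case 0x3E:                      // LD A, immediate
                    a = memory[pc++] & 0xFF;
                    break;
                default:
                    throw new IllegalStateException("Unknown opcode: " + opcode);
            }
        }
    }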

On a related note, this is also why I don't like at all the current C#/Java "modern" style of dozens of files with perhaps one or two actual useful lines of code (the bulk of those files being boilerplate fluff and stupid comments like "// @return the return value".) Debugging is especially irritating since stepping through the code becomes substantially nonlinear with deep call stacks.


Over the years I’ve heard some good advice on this sort of thing attributed to Bertrand Meyer. I keep meaning to track down citations.

There are conversational styles you can apply to code that make it simple to test (or in his case, make coherent assertions about) that still keep the methods small but cleave it in a way that you can still track in your working memory.

It’s a style I aspire to, and put a great deal of time and energy into. But reading my old code I can say it’s harder than it sounds and it doesn't sound that easy to begin with.


IMHO a CPU emulator is so simple and well specified that most any programming style will render a nice solution.

(A CPU almost by definition must be simple to implement.)



Interestingly, you managed to illustrate the O(n) debate above pretty well.

O(n) is shorthand for "here is what happens when n gets large, but I'm telling you nothing about when n is small". For example, some sorts are O(n log(n)) and some are O(n^2). Naturally for big sorts O(n log(n)) is preferable - but when n is, say, 5, the O(n^2) sort will often be faster. People use O(n) precisely because it carries all this unsaid nuance.

In your example you probably have upwards of 30 instructions, possibly hundreds, which sits comfortably on the big side of things. If there were a way to reduce that to an O(1) solution (which would imply you can add many new instructions without adding lines of code) and you didn't take it, then you've coded it badly - but we all know that's not possible for a microcontroller emulator.

In his example he used O(n), but implied an O(1) solution is possible. The same logic applies. His use of Big-O means that for large n, it's possible to add many new cases without adding code. If that is true, his solution is far better than the one wanted by the interviewer. However, it's likely n was small because this code had to be written for a job interview. It's entirely possible he managed to find enough commonality in that small number of cases to reduce it to 6 lines, while the real-world problem is indeed O(n). In that case the correct solution depends on the instructions he was given - whether he was asked to solve this particular problem in an optimal fashion, or to demonstrate how he would solve the problem in general using those n cases as an illustration.

He doesn't say, so we don't know. His use of O(n) does hint at it - but the grey beards among us would like to see proof that the problem really can be solved in O(1) and that the interviewer demanded an O(n) solution despite that. It does seem like a really weird thing to demand.



