Don't laugh: this is the next big step in cloud computing infrastructure. Imagine Linux compiled to the BrainFuck ISA running a Docker container running a BrainFuck web application. Of course, we will clearly need a solid BrainFuck-to-JS transpiler to make things web scale.
The joke is gonna be on all of you when I deploy a 256-core machine in production. :P
Then you'll modify it to be, "Have you heard of anyone running BrainFuck in production other than Nick P's BrainFuck-as-a-Service platform of questionable longevity?"
Nice haha. I was just going to call it the BrainF interpreter with references to prototypes A-E. I'll say I put 256 "brains" in each FPGA or ASIC. I'll reference the performance of various machine learning algorithms. I'll have comparisons showing speedup over a Core 2 Duo with fewer watts.
Fast-forward some time to see it become a post-mortem, a legacy system, or acquired by Novell as their entry into AI. They promise it will be the success of NetWare all over again. The audience at the conference pauses at the ambiguity of that statement, unsure whether they should cheer or charge out the door.
Oh, the funny part is I just slapped BrainFuck and a little creativity on top of the standard M.O. of the hardware accelerator industry. All the ones getting tons of VC money or revenue. Including D-Wave, which showed that a D-Wave-specific algorithm got a million-times speedup over implementing the same D-Wave-specific algorithm on a general-purpose, barely-parallel CPU. Was that an innovation going a million times faster or a shoddy component going a million times slower? We don't know, but they have reputable customers shelling out big cash. Lmao...
I wasn't aware of how flimsy the tech is. Guess I need to do more reading on the topic.
But it looks like there is a legitimate and interesting shift towards running stuff on FPGAs. Google, FB, and MS are all doing something with ML hardware acceleration.
I actually attended a talk a few weeks back by Microsoft's Doug Burger[1]. He has been leading a team that has created a low-latency FPGA network to accelerate stuff within MS. The eventual goal is to allow customers to take advantage of this distributed FPGA fabric to run custom firmware.
He said that FPGAs now run several of Bing's core search algorithms and Azure has some stuff running on FPGAs too. I forgot the exact performance gains, but it was somewhere around 2x for Bing with extremely stable response time even at insane server loads.
One interesting factoid is that they were able to translate Wikipedia in its entirety from English to Russian using 90% of the currently deployed FPGAs in around 100 ms. Insane stuff.
In the second one, this jumps out at me: "The first issue is that the problem instances where the comparison is being done are basically for the problem of simulating the D-Wave machine itself. There were $150 million dollars that went into designing this special-purpose hardware for this D-Wave machine and making it as fast as possible. So in some sense, it’s no surprise that this special-purpose hardware could get a constant-factor speedup over a classical computer for the problem of simulating itself."
Actually gives me an idea. Instead of comparisons to BF competitors, I could just compare a massively-parallel BF CPU to 256 interpreters communicating with each other through IPC on a general-purpose computer. I'd show the CPU performed many times better. It's the closest thing I can think of to how D-Wave is doing benchmarking. The difference is that $150 million is not in my bank account or anywhere in my transaction history.
"One interesting factoid is that they were able to translate Wikipedia in its entirety from English to Russian using 90% of the currently deployed FPGAs in around 100 ms. Insane stuff."
Didn't know about that project. Pretty cool. Yeah, the FPGA projects have been doing all kinds of stuff like that going back to at least the 90's from my reading. The speedups could be over fifty-fold; some claimed three digits. Other programs that are harder to parallelize & reduce... which is basically what they do on FPGAs... might see under a 100% speedup, a tiny speedup, or even a loss if it was a sequential algorithm up against an ultra-optimized, sequential CPU like Intel's.

The latest work, which I believe started with 90's projects, was to create software that automatically synthesizes FPGA logic from the fast path of an application in a high-level language, then glues it into the regular application on a regular CPU. You can't get the speedup of an actual hardware design, but it makes boosts easier if the problem supports good synthesis. Tensilica is another example: its Xtensa CPU is one that's customized... from the CPU to the compilation toolchain... to fit your application. Container people compile and deliver containers; Tensilica compiles and delivers apps with a custom CPU.
...Well, we can compile TIS-100 programs to BF, and then run them on your 256-core machine. Then the question will become, "Have you heard of anyone running BrainFuck in production, other than Nick P's BrainFuck-as-a-Service platform of questionable longevity, or Qwerty's TIS-100-as-a-service running atop the former?"
Interesting expansion that requires you to buy the new 128-core model where each core has an I/O MMU for the added security of running the "corrupted" code without permanent damage. Includes actual on-chip ROM for trusted boot and accuracy of simulation. To keep customers happy, replacements for anyone with a support contract are sold at cost [with R&D included].
Yes, we would require a new model, as new I/O primitives would be required. Might I suggest ^v/.! as the instructions for writing to the up, down, left, right and last I/O ports, respectively? For reading, we could use &V\,*$ for up, down, left, right, any and last, respectively.
We'll take whatever you suggest. So long as we have a DSL for writing our demo programs that buyers can understand and approve of before programmers see the actual primitives. ;)
Well, yes. With that, we can implement most of the TIS-100 instruction set fairly easily. We may have to exclude JRO and the NIL pseudoport and restrict the use of labels, but it would be fairly complete. Better yet, these restrictions are totally undocumented, so programmers will have to either work with the primitives or cross their fingers that their code will compile each day.
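Purely for illustration, here's a rough toy sketch in C of how those port symbols might bolt onto an ordinary BF dispatch loop. Nothing here is from an actual product or the paper: the enum and the port_* helpers are hypothetical placeholders for whatever message passing the 256-core machine would really expose, and [ ] . , are left out to keep it short.

    /* Toy sketch: BF dispatch loop extended with hypothetical port primitives. */
    #include <stddef.h>
    #include <stdint.h>

    enum port { PORT_UP, PORT_DOWN, PORT_LEFT, PORT_RIGHT, PORT_LAST, PORT_ANY };

    /* Stub port I/O so the sketch compiles; a real build would route these
     * through the actual interconnect (IPC, FPGA fabric, whatever). */
    static uint8_t last_value;
    static void    port_write(enum port p, uint8_t v) { (void)p; last_value = v; }
    static uint8_t port_read(enum port p)             { (void)p; return last_value; }

    static void run(const char *prog, uint8_t *tape, size_t tape_len) {
        size_t dp = 0;                                     /* data pointer */
        for (const char *ip = prog; *ip; ip++) {
            switch (*ip) {
            case '>': dp = (dp + 1) % tape_len;            break;
            case '<': dp = (dp + tape_len - 1) % tape_len; break;
            case '+': tape[dp]++;                          break;
            case '-': tape[dp]--;                          break;
            /* a few of the proposed write primitives */
            case '^': port_write(PORT_UP,   tape[dp]);     break;
            case 'v': port_write(PORT_DOWN, tape[dp]);     break;
            case '!': port_write(PORT_LAST, tape[dp]);     break;
            /* a few of the proposed read primitives */
            case '&': tape[dp] = port_read(PORT_UP);       break;
            case 'V': tape[dp] = port_read(PORT_DOWN);     break;
            case '*': tape[dp] = port_read(PORT_ANY);      break;
            default: break;               /* [ ] . , omitted for brevity */
            }
        }
    }

    int main(void) {
        uint8_t tape[4] = {0};
        run("+++^", tape, sizeof tape);   /* bump cell 0 to 3, send it to the up port */
        return 0;
    }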
"these restrictions are totally undocumented, so programmers will have to either work with the primitives or cross their fingers that their code will compile each day."
That's a great expansion, it being undocumented. Worked wonders for Microsoft's strategy of lock-in. We're going to have to make sure the CPU and platform's APIs are considered copyrighted on top of that, so they might not be... legally allowed... to clone or port it without paying us. We could collect royalties on a BrainFuck CPU for our lifetime plus an arbitrary number of years decided by Congress reps collecting bribes.
My pitch will target the enterprise DB, embedded, and military markets. I'll package it as something for highly-concurrent, real-time programming. Give them a language like ParaSail or Chapel that compiles to BrainFuck. Keep getting subcontracts to use it in critical, long-term projects. Eventually the sham will come out, but BrainFuck will be Too Big To Fail (TM). Also leverage a patent on a key innovation, forcing uptake of BrainFuck for at least 20 years.
I'm not sure it's worthy of a Bond villain but it's a start. ;)
Whatever, man. When all that falls on its face, CTOs will be googling for peeps with low level brainfuck debugging skills. But they can't hack BF themselves, so I'll ace the interview with no prep, and scoop a six fig salary. Remote, from Amsterdam. Part time, because, priorities.
Nah, they'll ask when they last had this problem. They'll learn that there were these migration tools developed to turn COBOL into C or Java with bug-for-bug compatibility. They'll pay some company like Semantic Designs to automate a conversion from BrainFuck to Rust since it will be popular on embedded systems by the time they're finished. The remoters will then spend all their time enhancing and debugging RustyBrainFuck while writing as much code in Rust as possible to hide the BrainFuck behind a neat interface. Building that was another enterprise project.
Unfortunately, you can't outrun the hidden assumptions behind software's correct functioning forever. There will come a time, much like Y2K for COBOL, when their critical database will just lose everything if they don't make an internal change that requires understanding all the Rust, BrainFuck, Verilog, and analog components I used "because I was learning analog at the time."
I can't predict what they will do facing such a situation. I can tell you to start a distillery of fine Scotch that sends flyers to them around the time of the feasibility study. You'll make a killing. :)
It could be worse. No matter how bad the BF->Rust code that's generated is (it'd literally just be BF in a less compact syntax, most likely: it's pretty hard to convert idiom-wise), it's probably better than the Troff sources (as written by Joseph Ossanna).
Honestly, it'd probably be worth the effort to just write the Rust up front. The kind of technical debt you'd land in otherwise... euurgh.
That's been a tough question to answer recently. The greedy and the idiots are always at the top of my list. There's significant overlap between those two in the market segments that might buy this CPU. They strangely also have a large supply of cash that's been flowing for a long time. I wouldn't have guessed that based on what they taught me back in high school about economics.
". A large amount of research is focusing on what kind of wimpy machines are best fit for important workloads. In this paper, we present a rather extreme example of a wimpy processor, using the BrainFuck [22] esoteric programming language as its ISA."
Seems legitimate to me. Seriously, why not explore the extremes? That's where we often learn the most.
I suspect many here think this is a joke, but although it does have its funny side, this definitely warrants the research IMHO.
Why is the author consistently mixing up "instructions per cycle" with "instructions per second"?
Surely, phrases such as "We observed the following Instructions per Second (IPC)" and "assuming a perfect 1 instruction per second on the general purpose processor" should have activated some neurons in some brain, even assuming it got distorted by doing this research?
That's my guess. It's one of the few projects that I thought actually would damage my brain. I avoided it. Unlambda language is another one like that. Although I have considered trying to implement that one, esp in hardware.
This greatly annoyed me too. It's IPC not IPS after all. I'm surprised this didn't get caught by literally anyone who reviewed this paper before hitting submit.
Ah. MIT, never change. Mind, this kind of stuff could come out of other institutions as well. They shouldn't change either.
Anyways, can we have a version of TIS-100 where you write brainfuck with added message sending primitives instead of the TIS-100 asm? Of course, each BF interpreter would be working with only, say, 2 or 3 cells of RAM.
I want to see either that, or a TIS-100 asm compiler for GreenArray chips. Or both.
This is some bullshit. They didn't even port the standard benchmarks to BrainFuck to compare performance against reference implementations. I want to see ZIP, raytracing, web servers, map reduce... actual apps with associated BrainFuck performance. Scratch the whole Future Work section in favor of that, given you might accidentally find evidence for using it in production somewhere.
Moore and your GreenArray chips: be afraid! Something even more incomprehensible is coming after your market share!
I wasn't. I'm simply too sane to follow BrainFuck developments unless it's something wild and accidentally useful like a 256-core CPU on the HN front page. ;) If they've been implemented, the authors should've run some of them on the processor to include benchmarks in the paper. It's a standard thing to do in CompSci papers on CPUs, compilers, optimizations, interpreters, anything. They usually include something like that.
Right, okay. It stood out to me too, because there are compilers and larger applications targeting Brainfuck, so it seems strange to only choose “hello world”-style programs.
The competition rules have added constraints to make the interpreter a usual command-line tool. However, with memory-mapped input and output ports it becomes a lot easier, which would be akin to 8-bit microcontroller programming. Parsing would take more than one CPU instruction per BF instruction, so the program would need to be assembled. I don't know x86 well enough to write that, though. I imagined there would be a single-instruction equivalent of 'x=*(p++)', but couldn't find anything like that.
In some makeshift syntax and with one instruction too much, I got:
>, < = inc, dec al
+, - = inc, dec [al]
. = out output_port, [al]
, = in [al], input_port
[ = Label: cmp [al], #0
jz (Labelend + sizeof (jmp instruction))
] = Labelend: jmp Label
The labels are generated by the assembler. The ports would have to be directly soldered to some I/O peripheral, but alternatively the input and output could be memory-mapped and fed by interrupts as well.
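For what it's worth, here's a minimal sketch in C of the kind of translator that mapping implies: one line of makeshift assembly per BF character, with a label stack pairing up [ and ]. It's my own illustration rather than the parent's code, it assumes balanced brackets with at most 256 nesting levels, and it keeps the [al] memory-operand notation from above rather than real x86 syntax.

    /* Minimal BF-to-makeshift-assembly translator following the mapping above. */
    #include <stdio.h>

    int main(void) {
        int stack[256], sp = 0, next_label = 0, c;
        while ((c = getchar()) != EOF) {
            switch (c) {
            case '>': puts("    inc al");   break;  /* move data pointer right */
            case '<': puts("    dec al");   break;  /* move data pointer left  */
            case '+': puts("    inc [al]"); break;  /* increment current cell  */
            case '-': puts("    dec [al]"); break;  /* decrement current cell  */
            case '.': puts("    out output_port, [al]"); break;
            case ',': puts("    in  [al], input_port");  break;
            case '[':                               /* skip the loop when the cell is zero */
                stack[sp++] = next_label;
                printf("L%d:\n    cmp [al], #0\n    jz  L%d_end\n",
                       next_label, next_label);
                next_label++;
                break;
            case ']':                               /* jump back and re-test */
                sp--;
                printf("    jmp L%d\nL%d_end:\n", stack[sp], stack[sp]);
                break;
            default: break;                         /* anything else is a comment */
            }
        }
        return 0;
    }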
LOL. I work in the same building as Urban. I'll be sure to point it out to him on Monday, I'm sure he'll be amused (or annoyed at my position in the queue of people that have already informed him of this). :D
The instruction encoding seems inefficient -- they don't use the full 4 bits of information (the first bit is always zero except for `.`). Why not use a 3-bit encoding, since there are 8 instructions?
They have 9 states including a stop instruction, 0b1111, which would preclude a 3-bit encoding. In any case, I suspect it's much easier to handle 4-bit words.
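To make the counting concrete, here's a tiny sketch of a 4-bit packing. The numeric values are made up for illustration; only the 0b1111 stop code comes from the comment above, so this almost certainly doesn't match the paper's actual encoding.

    /* Hypothetical 4-bit encoding: eight BF ops plus a stop code.
     * Nine states won't fit in 3 bits (8 values), hence 4-bit words. */
    #include <stdint.h>

    enum bf_op {
        OP_RIGHT = 0x0, OP_LEFT = 0x1, OP_INC = 0x2, OP_DEC = 0x3,
        OP_OUT   = 0x4, OP_IN   = 0x5, OP_JZ  = 0x6, OP_JNZ = 0x7,
        OP_STOP  = 0xF  /* the ninth state that pushes the encoding past 3 bits */
    };

    /* Two 4-bit opcodes per byte, high nibble first. */
    static uint8_t pack(enum bf_op hi, enum bf_op lo) {
        return (uint8_t)(((unsigned)hi << 4) | ((unsigned)lo & 0xFu));
    }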
But it's still an entirely pointless 336K. I could use that to store:
- 197 copies of nethack
- 10% of DarkPlaces
- 36% of GCC
- 44% of ZSH
- 30 copies of CPython (just the executable, of course: that's all you counted)
- 10% of ZSNES
- 1.36 copies of XScreenSaver
- 42% of Teeworlds
All of these are more valuable uses of my precious disk space, because all of them actually do something: They actually give me capabilities I previously didn't have.
GCC and CPython are great illustrations of what I'm talking about. The others are good too. Also, the Oberon System base with kernel, files, editor, viewer, etc. is around 116KB per Wirth's paper on the FPGA implementation.
So, you get a whole OS with some utilities in between 1/3 and 1/2 of a SystemD binary. Also worth noting that Oberon includes safety checks, too. ;)
Honestly, I found the tradeoffs of it to be the most interesting part. It hits an interesting balance of readable, safe, fast compile, and fast runtime. Includes some OOP and concurrency foundations while barely changing the language. Whole platform can fit in one book. You could theoretically understand the whole thing and reimplement it in a language of your choosing. Was esp good when hardware was weaker and more diverse. Also a better starting language than C in education, as it can be easy at first with difficulty gradually increased via the intro of unsafe constructs. Finally, it's seeing some uptake in embedded with the Astrobe IDE.
So, that's a quick review. The main one you'll find is the older system, with A2 Bluebottle being the latest. Runs fast, too.
"Yeah, it does look really cool, just verbose. And you say it's easy to implement...."
It is a compiler and OS. Easy means a different thing in this context versus average usage in software. I'd say vastly easier than trying to understand GCC or Linux. How about that? Also, the original version was done by 2 people and some change. Each port of the compiler was done by 1-2 people in a relatively short time, mostly students with basic knowledge of CompSci. It helps that it's well-documented.
So, it's not as easy as throwing together a web app, but it can't be ridiculously hard if you take it a piece at a time. The use I had for it, other than learning or pleasure, would be for subversion-resistant, verified-to-assembly builds. It's super easy to learn Oberon, with the OS itself straightforward. People could code it up in a local language, the compiler too, compile those (or hand-do them in ASM), and bootstrap into a trusted environment. That can be used to produce the rest, with compilers built on top in a memory-safe language that handles foreign code more safely. Better, no patent suits or anything on Wirth-based tech like .NET or Java might get you.
Other than Oberon system, Modula-3 (not Wirth) and Component Pascal (Wirth et al) are most worthwhile to check out in terms of practical languages. BlackBox Component Builder is still in active use with Component Pascal, esp in Russia and Europe. They love it over there since it's got OOP & GUI with Oberon simplicity & safety.
Oberon is mind-boggling, in all respects. You can download the emulator (<5MB total). It is SDL2-based so it runs like a stand-alone app, no VirtualBox or QEMU needed. Then you can get and compile Lola-2, Prof. Wirth's hardware description language that compiles to Verilog. Now you can literally compile the RISC5 processor "chip" itself, which runs on the FPGA!! Just use the output .v files from Lola-2 and compile with the Xilinx toolchain. Next you can compile the "Oberon OS" quite simply by hand in a few minutes. With a compile script the complete OS compiles in 10 seconds.
Then why not finish off with PICL, the language/compiler for the PIC16? Includes the uploader. All in <700 lines of Oberon code. The best part is the amazing tutorials/documentation. Some great finds at Prof. Wirth's personal site: https://www.inf.ethz.ch/personal/wirth/
Of course. I'm foolish, but not foolish enough to think "easy" compiler/OS equals "easy" webapp. However, I am foolish enough to start any project, although usually not mad enough to finish it once I get an idea of the work involved.
Once again, Oberon looks neat. I am usually not a fan of the Wirthian languages, so I might not enjoy it in the same way I enjoy, say, Python, or Scheme, but it looks interesting.
"Of course. I'm foolish, but not foolish enough to think "easy" compiler/OS equals "easy" webapp. However, I am foolish enough to start any project, although usually not mad enough to finish it once I get an idea of the work involved."
You and I appear to be more similar than I thought on these kinds of things, haha.
"Oberon looks neat. I am usually not a fan of the Wirthian languages, so I might not enjoy it in the same way I enjoy, say, Python, or Scheme, but it looks interesting."
In your case, rebooting PreScheme to do a small OS like Oberon, or a clone of it, might be a better take. There are already books on quickly putting together a Scheme compiler. The PreScheme and VLISP papers are pretty detailed. Include some safety features from Clay (C-like) and Carp (Lisp) with Scheme's macros and simplicity. Mock up the assembly language in it so you can code & test that in Scheme too, with an extraction process to the real thing.
That combo seems like it would work better for you, plus result in at least one deliverable: a PreScheme implementation for systems programming whose code is compiled with GCC or LLVM. You might find that useful over time, especially if you made the syntax compatible with one of your typical Scheme implementations so you could port those libraries over easily. Split it between GC'd code and non-GC'd code like Modula-3 did.
Seems neat. The original PreScheme is part of S48, which seems to be dead.
Speaking of which, I should probably consider doing that project I was thinking of doing for a while: port SCSH to some other scheme implementations which are actually alive.
So yeah, the trick with doing a prescheme project is that I'd first have to build a PreScheme compiler (with added safety, etc.) and then I'd have to build one that can run on bare metal. The former ought to be possible, especially if I'm targeting C or LLVM (some bits might get a bit rough, but most of Scheme is pretty easy to implement). The latter would perhaps be possible, but making use of it would likely require a more in-depth knowledge of the hardware than I currently possess.
You know, I was joking about porting Oberon to the gameboy, but the GBA is a really well-defined piece of hardware with readily available tooling (which I actually have, because you need it for LSDJ), and a good deal less complexity than a modern x86 machine...
Man, why can't I just write videogames, like the normal people?
Well, my current side project is writing a version of the Tafl-inspired board game Thud that you can play over a network, so I guess I'm already doing that...
"The former ought to be possible, especially if I'm targeting C or LLVM (some bits might get a bit rough, but most of Scheme is pretty easy to implement)."
That's what they did originally. Should work.
" but the GBA is a really well-defined piece of hardware"
Then do a PreScheme wrapper on it supporting inline assembly or calls to assembly for performance-critical routines. See if you can make the primitives at the bottom close to how the hardware works, for efficiency. The AI research of the 80's indicates it might be able to handle a board game.
To my knowledge, CPython is the only piece of software on that list that has its own shared libraries.
And yes, disk usage is a concern. Because those kilobytes add up fast, and I've only got so much space.
And finally, the point isn't merely that it takes up disk: the point is that it's worthless. The old solution to that problem, having init set locale on boot from a config file, worked fine. I don't mind disk use that much. But I do mind pointless software.
All code has bugs: to minimize bugs, write less code.
In comparison, the QNX demo disc had a whole graphical OS in 1.44MB that was more robust than that one file. Seems to be some efficiency or architectural issues in there. ;)
Granted, QNX never really caught on in the wider field of computing, AFAIK. This may lend credence to the Godot team's theory that public visibility is based upon how big you are (https://godotengine.org/article/godot-aims-mainstream).
I was talking complexity vs. size. Gabriel's Worse is Better already tells you that growth has more to do with network effects and marketing than technical correctness. QNX lacked them for desktop or server use but did well in embedded. It almost had a mobile shot, but Blackberry totally blew it with the marketing plan.
EDIT: Read Godot. That was hilarious but I kept seeing too much reality in it.
Ahh, gotcha. As far as the other thing, have you seen the Web and enterprise BS getting lots of adoption since big names tried them out? And the drift to incomprehensible frameworks with huge dependencies that one can barely maintain, with similar security problems and a tiny fraction of the efficiency of C++? And now more of them as SaaS? Just thinking there's lots of the bullshit in the article going on in real life. It's that or legacy SAP, Oracle, COBOL, MFC BS. Competing BS exists but sanity gets rarer and rarer.
Outsourcing was done with great care by a few big-name corporations. And after their apparent success, every two-bit publicly traded company went on a "me too!" outsourcing rush to try to goose their share value.
Oooh. That. I miss out on a lot of it, because I'm not in industry, and I avoid Java (the natural focal point of incomprehensible frameworks), and frameworks in general.
On a daily basis, the worst I have to deal with is SystemD, which is creatively incompetent at best, and downright insane at worst. But it's got a pretty face, and the draw of easy-to-understand unit files (over, say, SysVinit), so people don't realize the depths of the madness that lies beneath.
Had less attack surface (microkernel), isolated failures, easier to upgrade, and could be self-healing within one node. Important to some people. They kept paying for it until numbers justified RIM buying it for its potential. ;)
Indeed. It's one of the microkernel architecture's (semi-rare) success stories. It goes to show that yes, microkernels are complicated and hard, but if you pull it off there are real benefits.
The [RSSB] Single Instruction Architecture is interesting for a simple processor implementation as well. [Calfee] mentions a relation to a "biological mechanical computer", i.e. neural-like.
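For anyone who hasn't seen a one-instruction machine before, here's a minimal sketch of the execute loop, using subleq as the example since its semantics are widely documented (a cousin of the RSSB design, not RSSB itself):

    /* Minimal subleq interpreter sketch: the single instruction is
     * "mem[b] -= mem[a]; if (mem[b] <= 0) jump to c", three operands per
     * instruction. Halts when the program counter goes negative or runs
     * off the end. No bounds checking on a/b, for brevity. */
    #define MEM_SIZE 4096

    static long mem[MEM_SIZE];

    static void run(long pc) {
        while (pc >= 0 && pc + 2 < MEM_SIZE) {
            long a = mem[pc], b = mem[pc + 1], c = mem[pc + 2];
            mem[b] -= mem[a];
            pc = (mem[b] <= 0) ? c : pc + 3;   /* branch or fall through */
        }
    }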
Not sure whether Iota, Jot, and lambda calculus can be executed instruction-by-instruction, but generally speaking, wouldn’t a Turing-complete language with just two instructions allow for even greater throughput than BrainFuck does?
As someone who lists Befunge on their resume: either they get it or they don't.
(I followed up an interview with a link to a brainfuck interpreter I did in Haskell https://github.com/serprex/bfhs I got a 2nd interview, but no offer)
Love it.