RECC, The Robert Elder Compiler Collection (robertelder.org)
85 points by ScottWRobinson on July 2, 2015 | 38 comments



Hi, I'm the author, and I would be interested in hearing feedback on what you'd like to see next.

Directions I can go in:

- Fully client side C compiler and IDE

- Full access to parse tree, static analysis with rich html annotations for errors

- For example, if a function call has arguments that don't match the prototype, show a detailed graphic demonstrating that the arguments don't fit into the function

- Write a proper shell for the kernel (right now it responds to single letter commands 'q' 's' 'p'...)

- CPU Emulators for more languages

- Make the compiler produce faster code (it is currently anti-optimized for code simplicity)

- Implement currently unsupported C features (float, double, union, goto, K&R C style functions, ...)

- Implement more of the standard library

- Work toward having better static analysis than GCC or CLANG (but only for C89)

- Write blog posts (about what?)

- Work toward compiling the Linux Kernel

- Create learning tools to help visualize the OS and program stack as the program executes (In addition to showing stack values, provide type annotations via background colours/graphics eg. (0x10000 <= unsigned int))

- Static analysis for MISRA C (at least the rules that aren't equivalent to the halting problem)

- Different CPU word sizes

- Advanced in browser profiling

- A visual tool/animation that shows you every single step of how the C preprocessor works (for complicated cases like recursive variadic macro functions)

- Thousands of other ideas...


Awesome project! The simple CPU target for a complex language excites me. I am looking for a more interesting target for this: https://github.com/tomlarkworthy/firefrack

So my vote would be on getting as much 3rd party stuff compiling as possible with the architecture you have. Linux would be amazing.

This guy did some cool work on compiling linux for 8 bit micros. Might be inspiring if you have not seen it: http://dmitry.gr/index.php?proj=07.+Linux+on+8bit&r=05.Proje...


Yeah, I actually read that guy's blog post to see how practical it would be to run something like Linux on a CPU with a smaller word size. Since the CPU emulation code is only 325 lines of C, it wouldn't be a lot of work to port it to emulate the 32-bit instructions on a very small word size.
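To illustrate the kind of work involved in emulating 32-bit instructions on a smaller word size, here is a minimal sketch (not taken from RECC or from the linked project) of a 32-bit add done with only 8-bit-wide pieces and a carry. The function name and the little-endian byte layout are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: add two 32-bit values that are stored as four
 * little-endian bytes, using only byte-sized additions plus a carry,
 * the way an 8-bit host would have to do it. Overflow out of the top
 * byte is discarded, so the result wraps like a 32-bit unsigned add. */
void add32_on_8bit(const uint8_t a[4], const uint8_t b[4], uint8_t out[4])
{
    unsigned carry = 0;
    int i;

    assert(a != 0 && b != 0 && out != 0);

    for (i = 0; i < 4; i++) {
        unsigned sum = (unsigned)a[i] + (unsigned)b[i] + carry;
        out[i] = (uint8_t)(sum & 0xFFu); /* low 8 bits are the result byte */
        carry = sum >> 8;                /* 0 or 1, carried into next byte */
    }
}
```

Every 32-bit operation in the emulator would need a similar multi-step treatment, which is where the porting effort would likely go.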


I haven't been able to play with this much yet, but it feels like an accessible way to play with low-level computer ideas.

I feel like this could hit a sweet spot between "toy computer that doesn't do anything fun", and "fully modern computer that is too complicated to understand the entire stack on" for code camps and teenagers.

I'm definitely following with interest, and will be seeing if I can make some fun activities using this for our upcoming code camp.

I would love to know what contributions you would like from the community.


I'm glad to hear that because, although I would like to take this project somewhere where it would be used for real engineering applications, I think it currently only has practical applications in teaching. I would love to work on this project full time, but unless I could derive income from a project like this, it just isn't practical.

In terms of contributions from the community, I think the two most valuable things any of you could do at this point would be:

1) Tell other people, and share and like it on Facebook, Twitter, etc.

2) Reach out to me and provide feedback on why this project interests you, and where you think it could go. You can reach me at 'robert at robertelder.org'


It all looks great. Could you perhaps license your test suite and maybe your libc stuff MIT? I understand if you want to keep your kernel and compiler under some license that still allows for commercial control (like GPL or some dual license scheme) but there's not much commercial value in the test suite and I think it could be of value to others :)


What do you think about Apache 2.0?


I myself like to keep the world neatly divided in GPL and MIT, but Apache 2.0 looks permissive enough :) What makes you prefer Apache 2.0 over MIT?


The Apache license explicitly mentions patents. I think that for a project to be successful it should be painless for big companies to use it. Lawyers and IP people at large companies generally have a much more serious tone than programmers when it comes to considering which licenses they can include in their stack.


512 registers! Did you do this so you could punt on register allocation?


In the previous version of the spec, I had a large number of bits that were unused, but once I thought about it I realized that any bits that go unused tautologically guarantee less than optimal performance. I played with the instructions a bit, and it worked out well to allow addressing of registers with 9 bits. Right now, the compiler mainly works as a stack machine and doesn't use more than 10 registers at a time (usually just one or two). In the future, I think it would make sense to make the number of registers big enough that it would fit inside the lowest level CPU cache, since this CPU is mainly designed to be emulated. Storing temporary data in registers instead of the stack has the advantage of not needing to push and pop all the time.
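As a rough illustration of why 9-bit register fields fit neatly into a 32-bit instruction word, here is a sketch assuming a layout of a 5-bit opcode followed by three 9-bit register fields (5 + 3 × 9 = 32). That layout, and the names `encode_insn`, `decode_insn` and `Decoded`, are assumptions for the example and are not taken from the actual spec.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (hypothetical, not from the RECC spec):
 *   bits 31..27  opcode      (5 bits)
 *   bits 26..18  register a  (9 bits)
 *   bits 17..9   register b  (9 bits)
 *   bits  8..0   register c  (9 bits)
 */
enum { REG_BITS = 9, REG_MASK = (1 << REG_BITS) - 1 }; /* 0x1FF: 512 registers */

typedef struct {
    unsigned opcode, ra, rb, rc;
} Decoded;

/* Pack an opcode and three register numbers into one 32-bit word. */
uint32_t encode_insn(unsigned opcode, unsigned ra, unsigned rb, unsigned rc)
{
    assert(opcode < 32 && ra <= REG_MASK && rb <= REG_MASK && rc <= REG_MASK);
    return ((uint32_t)opcode << 27) | ((uint32_t)ra << 18)
         | ((uint32_t)rb << 9) | (uint32_t)rc;
}

/* Unpack the four fields again; no bit of the word goes unused. */
Decoded decode_insn(uint32_t word)
{
    Decoded d;
    d.opcode = (unsigned)(word >> 27);
    d.ra = (unsigned)((word >> 18) & REG_MASK);
    d.rb = (unsigned)((word >> 9) & REG_MASK);
    d.rc = (unsigned)(word & REG_MASK);
    return d;
}
```

Under this assumed layout, every bit of the word carries information, which matches the point about unused bits above.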


I like the idea of visualizing the parse tree, and another fun place to go would be to do a real IR, bring things into SSA form, show phi function placement, and then do register allocation and optimization.


I have in the past used xdot files to good effect for visualizing phi function placement and interference graphs. What would be great is a way to make the result interactive (e.g. drill down from basic blocks to instructions, selectively get more or less detail on voluminous data like live ranges). Graphviz unfortunately isn't suitable for that.


It's really easy to do that with a force-directed layout in JavaScript, but annoying to do it with a tree (or, better, orthogonal) layout.

(Hoping someone corrects me on this.)


Would you pay for such a tool?


I'd pay $500 for a 2015 version of: https://www.gnu.org/software/ddd. Would pay a lot more if I made my living coding.


oh man i remember loving ddd


You have a register dedicated to zero?


Yes, I also have a register dedicated to the word size (4). Since these values are used so commonly, I felt that it didn't make sense to constantly have to load them manually.

You'll also notice that there is no move instruction. With the existence of the zero register, move becomes unnecessary because you can just do

add r1 ZR r2; Moves the contents of register r2 into r1.

There are a number of other similar situations where it makes sense to have zero handy all the time.
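For anyone who wants to see the trick spelled out, here is a minimal emulator-style sketch in C. The register numbering (`ZR`, `WR`, `R1`, `R2`) and the function names are made up for the example; only the idea that `add dst ZR src` behaves like a move comes from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register numbering; the real numbering may differ. */
enum { NUM_REGS = 512, ZR = 0, WR = 1, R1 = 2, R2 = 3 };

typedef struct {
    uint32_t regs[NUM_REGS];
} Cpu;

/* Clear all registers, then set the two constant registers. */
void cpu_init(Cpu *cpu)
{
    unsigned i;
    for (i = 0; i < NUM_REGS; i++) {
        cpu->regs[i] = 0;
    }
    cpu->regs[ZR] = 0; /* zero register */
    cpu->regs[WR] = 4; /* word size register */
}

/* add dst a b: dst = a + b, wrapping on overflow. */
void op_add(Cpu *cpu, unsigned dst, unsigned a, unsigned b)
{
    assert(dst < NUM_REGS && a < NUM_REGS && b < NUM_REGS);
    cpu->regs[dst] = cpu->regs[a] + cpu->regs[b];
}

/* "Move" is not a real instruction here: it is add with ZR as one operand. */
void op_mov(Cpu *cpu, unsigned dst, unsigned src)
{
    op_add(cpu, dst, ZR, src);
}
```

Nothing in this sketch stops a program from writing to `ZR`, and once that happens `op_mov` silently stops behaving like a move.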


"Changing the value in the zero register is not recommended." Hee hee.


:p


Not uncommon in real architectures, either by convention (ARM r0) or design (the MSP430 CG, which, like this architecture, also generates other useful constants).


Related to your project: At my university, we learned the basics of assembler programming in an IDE made by one of the professors. It was written with the Xaw/Athena widget set. Now I see somebody has ported it to JavaScript: http://ivanzuzak.info/FRISCjs/webapp/


Great project Rob! I'm writing a C compiler in Haskell at the moment, and I'm totally going to borrow your test suite and libc, it looks very neat :)

edit: Oh jeez, I just saw your license file. I'll have to wait :)


I figured that the license doesn't really matter for now, since I don't expect anyone to actually use this for production code (there are still quite a few bugs, and it is not very user friendly). A couple people have brought up the licensing, so I may give that some attention. I was considering Apache 2.0. Any opinions?

If you want, you can subscribe your email on

http://recc.robertelder.org/

and I'll update you when the license has changed.


What about the ISC license? It is slightly more liberal than Apache 2.0, and really short.

https://en.wikipedia.org/wiki/ISC_license


I'll give it consideration, although I believe more descriptive licenses are actually better since they say what happens in every given situation and don't leave room for 'undefined behaviour'. As you've probably read before, when a program contains 'undefined behaviour', the compiler has the green light to do anything it wants. Similarly, undefined behaviour in a contract can give lawyers the green light to claim whatever ridiculous thing they want to.


Right now the license forbids anyone from looking at the code, which is publicly available on github, so everyone who has ever looked at your github page --- I'm guessing, every commenter here --- is violating it. That's not good.

I'd strongly recommend that you change it ASAP. The main two options are:

- The 'I just don't care' license: BSD 2-clause. This allows anyone to do anything with it, other than claim they wrote it. It's the most open of the licenses. http://opensource.org/licenses/BSD-2-Clause

- The 'One true way' license: GPLv3. This is the classic copyleft license. It requires any modifications to the program to also be GPLv3'd. This requires quite a lot more thought, because if your compiler runtime is also GPLv3'd it means your compiler can't be used to produce distributable non GPLv3'd binaries, which is bad; so you'll have to license your runtime differently. GPLv3 also makes it really hard to use your program commercially (you may consider this a positive or negative quality). http://opensource.org/licenses/gpl-3.0.html

Most other licenses are basically just variations on the theme. Also, writing your own license, particularly a more restrictive one ('this software can only be used for education', say) will basically kill any use of your program. And it'll probably be invalid, too. Writing licenses is hard.

...

A cautionary tale follows: there is a C compiler called vbcc (http://www.compilers.de/vbcc.html). It's incredibly small, fast, and produces great code. It's easy to port. But it's released under a look-but-don't-touch license which forbids distribution of modified archives. This means that if I, say, modify it to include a compiler backend for Infocom's Z-machine, which I did (http://cowlark.com/vbcc-z-compiler), then I cannot distribute a version of vbcc which contains that backend! Instead I have to distribute the unmodified archive and a patch. This makes it way too hard to develop for, and additionally means that no software distributor will touch it.

When vbcc's author finally gets hit by a bus, vbcc will die. Which is a shame; it's really nice.

Please don't do that!


An "All rights reserved" license statement doesn't make it unlawful for readers to view code. Why would this license statement be different?

I'm going to go out on a limb here and say that it's just false that his current license statement prohibits reading the code on GitHub. I don't think that's a "right" controlled by copyright.


You can submit patches to the vasm/vlink/vbcc authors, and they will apply them to their toolchain. They are very open to any improvement. That way, we got a full macro assembler and linker for the TR3200 CPU. We also have a SmallerC back end for the TR3200. We only need to add some support for relocatable code to the vasm TR3200 back end, and write or adapt a libc, to have a full C toolchain working.

Also, I think it is time to post our virtual computer here...


Oh sure --- I'm not saying there's anything wrong with it technically. But I can't use any kind of open development process. So, no github or Bitbucket. And the distributions still won't touch it. And there's still a single point of failure revolving around Volker Barthelmann; I can't rely on vbcc continuing to exist.

A similar thing happened to lcc, which was a perfectly good if quite simple compiler that simply stopped being relevant because distributing was too hard.


Why don't we ask him about changing the license model? VASM and VLINK are modular. I think that if there is a problem with changing the license model, it would be with these modules, not with the core of VASM and VLINK.

I would be happier if vasm/vlink/vbcc lived on GitHub, or at least in a public Subversion repository.


I have, on several occasions. It's probably worth trying again, but TBH it would probably sound better coming from someone else.


I don't think the license as stated prohibits people from looking at the code, and if it does I would clarify to anyone who is interested that I'm totally ok with you taking a look at what I've been working on.

I appreciate your feedback, although my experience has taught me that those of us in the HN crowd have a completely different and disconnected understanding of software licensing compared to what lawyers think. I worked at a couple IP heavy companies and my impression is that any license that leans toward the Richard Stallman way is a huge red flag for businesses. I recall that GPL v3 is much better than v2. I'm also of the understanding that both BSD and MIT are discouraged now because the license is so thin, and they don't say anything about patents, while Apache 2.0 does.

I invite anyone to contribute more to this discussion, because it happens to be one of those political things that can be more important than the code itself.


The 'looking at' part is explicitly covered by the Github ToS you've agreed to so you're all set for that. Although if you're concerned about un- or poorly defined behaviour, it also gives others the right to fork your repo on Github, whatever that implies.

This is their ToS blurb:

However, by setting your pages to be viewed publicly, you agree to allow others to view your Content. By setting your repositories to be viewed publicly, you agree to allow others to view and fork your repositories.


Well, you _do_ say: "This software is not currently available under any license." I know that's not what you _mean_, but it's what you _say_. My personal and uninformed opinion is that you should probably change this to something along the lines of: "This software may be studied but not modified, and copies may not be redistributed", which is more explicit.

I am not, personally, a fan of the GPL (it makes things too complicated) and tend to release all my software as MIT or BSD-2 because I like the simplicity. But Apache-2 is a perfectly decent license. Nobody will complain about that.

Ah, here's a good bullet-point comparison: http://choosealicense.com/licenses/


This is wonderful.


mmm... I'm thinking about implementing it (or writing a wrapper) on the Trillek virtual computer.

The virtual computer architecture is like a 32-bit S100 bus computer. Any 32-bit CPU could be plugged into it, and the One Page CPU is far simpler than our TR3200 CPU.

To add it, I only need to implement a class that follows this interface: https://github.com/trillek-team/trillek-vcomputer-module/blo... and access ROM/RAM and memory-mapped devices using the vcomp class read/write functions.



