I'm not so worried about duplication by academics -- that does not happen often -- but rather about academic research that's just wrong: makes bad assumptions, uses a flawed methodology, fails to address the general case.
If industry could provide better platforms (up to date, not hand-me-downs or neutered versions of their real product), then I think you might see better forms of this research.
For example, I'd love to improve search relevance, but w/o having access to Google's search engine to build on, it's pretty hard. That's my suggestion. :-)
While I agree in general that opening up opportunities for academic-industry collaboration is good, I don't think it's practical for academics to work on problems at true industry scale. Academics don't have access to the resources, personnel, or funding required to do that kind of work. An academic lab can do many things of relevance to industry -- but not everything.
Google recently open-sourced its TensorFlow platform specifically to enable researchers (and others) to build upon and improve it -- trying to avoid the problem with MapReduce (where a bunch of clones came out that were, at least initially, inferior to the original).
It would be really nice if they would go ahead and release the Google version of MapReduce now that they've learned their lesson. It's not too late for everyone to learn from the original, and it's no longer a competitive advantage now that anyone can run a Hadoop job on AWS on demand.
You and samth seem to be missing the point: academic research is often built using methods that ensure no real-world success, even while aiming for real-world success. Factoring in real-world constraints, best practices, and existing workarounds would let academics achieve better results on average. The baseline of practicality goes up.
And again, these are all academic projects that aim at being practical. The point is supported by a number of academics who incorporate real-world information and constraints into their work to produce deliverables that advance the state of the art and are useful. Examples that come to mind: the Haskell/OCaml/Racket languages, the CompCert C compiler, Microsoft's SLAM for drivers, SWIFT auto-partitioning for web apps, the Sector/Sphere filesystem, the old SASD storage project (whose principles became EMC), old Beowulf clusters, ABC HW synthesis, the RISC-V work, and so on. There are so many examples of academics keeping their heads in the real world instead of the clouds, making names for themselves with awesome stuff that has immediate and long-lasting benefits. I'm not Matt, but I'm guessing he'd rather see more examples like these than, say, a TCP/IP improvement that breaks compatibility with all existing Tier 1-3 stacks and whose goal is to improve the overall Web experience. Yes, there are people working on those. ;)
I don't think the RISC-V work is a good example. It suffers from some of the problems that mdwelsh is worried about.
It's aimed at a real-world problem, but their solution is not good.
A couple of days ago, someone asked where the verification infrastructure was on https://news.ycombinator.com/item?id=10831601 .
So I took another look around and found it was pretty much unchanged from when I looked last time. There is almost nothing there. It is not up to industry standards, to put it lightly.
It's not just the verification aspect that is weak, either. On the design side, they only have docs on the ISA; for SoC work, you are essentially given no docs. Then, in another slap in the face, the alternative is to read the code, but the code is in Scala -- basically only helping those who went to Berkeley or something.
It seems relevant, but most engineers would have a pretty hard time if they actually tried to use it.
As I recall, the RISC-V instruction set was created by looking at existing RISC instructions, industry demands, and so on. The result was a pretty good baseline unencumbered by patents or I.P. restrictions. From there, simulators and reference hardware emerged. Unlike many toys, the Rocket CPU was designed and prototyped with a reasonable flow on 45nm and 28nm. Many others followed with variants for embedded and server applications, with prior MIPS and SPARC work suggesting security mods will be next.
Them not having every industrial tool available doesn't change the fact that the research, from ISA design to the tools developed, was quite practical, with high potential for adoption in industry -- an industry that rejects almost everything out of academia if we're talking about replacing x86 or ARM. Some support for my hypothesis comes from the fact that all kinds of academics are building on it and major industry players have just committed support.
Is it ideal? No. I usually recommend Gaisler's SPARC work, Oracle/Fujitsu/IBM for high-end, Cavium's Octeons for RISC + accelerators, and some others as more ideal. Yet, it was a smart start that could easily become those and with some components made already. Also progressing faster on that than anything else.
Apparently some of this can be done via a torture tester, https://github.com/ucb-bar/riscv-torture , but from a quick look I don't think it handles loops, interrupts, floating-point instructions, etc.
There didn't seem to be a lot in there, but I don't know Scala. I wish it were scripted in Lua or something, with the Scala doing execution and analysis, to make it easier for others to follow.
Doesn't seem nearly as thorough as what I've read in ASIC papers on verification: they did co-simulation (?), equivalence checking, gate-level testing, all kinds of stuff. Plus, you did it for a living, so I take your word there. I do hope they have some other stuff somewhere if they're doing tapeouts at 28nm. Hard to imagine unless they just really trust the synthesis and formal verification tools.
Are those tools and techniques good enough to get first pass if the Chisel output was good enough to start with? Would it work in normal cases until it hits corner cases or has physical failures?
Interesting paper. It sounds good until you look for the actual work. With a possibly limited amount of testing, you can't be sure of anything. In verification, you can never just trust the tools. With no code coverage numbers, how do I know how thorough the existing tests are? The tests themselves have no docs.
The torture-test page said it still needed support for floating-point instructions. That kinda says they did no torture tests of floating-point instructions. I wouldn't be happy with that. Same goes for loops, etc.
You have to think about physical failures as well: the paper mentions various RAMs in the 45nm processor. You should have BIST for those, plus Design-for-Test modules. Otherwise you have no way to test for manufacturing defects.
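For anyone unfamiliar with memory BIST: the BIST engine typically walks the RAM with a march algorithm such as March C-, which catches stuck-at and many coupling faults. Here's a minimal software sketch of that idea, assuming a simulated RAM and an injected stuck-at-0 fault for illustration -- the names and fault model are mine, not anything from the RISC-V codebase:

```c
#include <stdbool.h>
#include <stddef.h>

#define W0 0x00000000u  /* all-zeros data background */
#define W1 0xFFFFFFFFu  /* all-ones data background  */

/* Model a stuck-at-0 fault: bits in stuck_mask always read as 0
   at fault_addr. Pass an out-of-range fault_addr for a healthy RAM. */
static unsigned read_ram(const unsigned *ram, size_t i,
                         size_t fault_addr, unsigned stuck_mask) {
    unsigned v = ram[i];
    if (i == fault_addr)
        v &= ~stuck_mask;
    return v;
}

/* March C- over an n-word RAM:
   up(w0); up(r0,w1); up(r1,w0); down(r0,w1); down(r1,w0); up(r0).
   Returns true if every read matches the expected value. */
bool march_cminus(unsigned *ram, size_t n,
                  size_t fault_addr, unsigned stuck_mask) {
    size_t i;
    for (i = 0; i < n; i++)                       /* up(w0) */
        ram[i] = W0;
    for (i = 0; i < n; i++) {                     /* up(r0,w1) */
        if (read_ram(ram, i, fault_addr, stuck_mask) != W0) return false;
        ram[i] = W1;
    }
    for (i = 0; i < n; i++) {                     /* up(r1,w0) */
        if (read_ram(ram, i, fault_addr, stuck_mask) != W1) return false;
        ram[i] = W0;
    }
    for (i = n; i-- > 0; ) {                      /* down(r0,w1) */
        if (read_ram(ram, i, fault_addr, stuck_mask) != W0) return false;
        ram[i] = W1;
    }
    for (i = n; i-- > 0; ) {                      /* down(r1,w0) */
        if (read_ram(ram, i, fault_addr, stuck_mask) != W1) return false;
        ram[i] = W0;
    }
    for (i = 0; i < n; i++)                       /* up(r0) */
        if (read_ram(ram, i, fault_addr, stuck_mask) != W0) return false;
    return true;
}
```

In silicon this runs as a small state machine next to each RAM macro; the point is that without it (or scan/DFT hooks), a die-level defect in a memory array is invisible to functional testing.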
Yeah, that all sounds familiar from my research. Especially floating point given some famous recalls. Disturbing if it's missing. I'll try to remember to get in contact with them. Overdue on doing that anyway.
Nobody said you did. He suggested several possibilities. One was working in industry to understand real-world development, deployment, or support needs. Another was considering real-world issues. That's the main one, as the other merely supports it.
An example would be putting Racket to use on industrial-scale projects with groups of programmers from different backgrounds. These would surface any pain points in the language and tooling, plus opportunities for improvement. It doesn't have to be Google: just practical, diverse, and outside Racket's normal sphere.
The reason I used Racket as an example is that they already do some of that at least in their forums. Maybe some commercial sector but I lack data on that. They've evolved to be constantly more useful for both academic and practical stuff through such feedback.
If you doubt that, or think they're purely academics in a bubble, then feel free to share why. You may have data I don't have, but I've seen them adapt based on external feedback and usefulness in specific cases. Not a Racket user or insider, though.
I'm one of the core Racket developers, so I also think we're doing pretty well. :)
But what you're suggesting requires persuading a large group of developers to adopt a new language -- if you have a recipe for doing that, lots of people including me would love to learn it.
"But what you're suggesting requires persuading a large group of developers to adopt a new language -- if you have a recipe for doing that, lots of people including me would love to learn it."
What I'm suggesting is that a group of people interested in trying something and reporting the results do exactly that: try something and report the results. You don't have to convince anyone of anything, as you're responsible for you, not them. :)
All you'd have to do is make sure the tutorials/guides, tooling, core libraries, and distribution are in order, plus optionally a little evangelism for awareness. Random person on a forum or HN: "Hey, I'd like to try writing some business code or a service in a new language. What should I use?" Drop a Racket link, a guide, and something on macros & live updates (if Racket has those). I remember loving them when I played with LISP back in the day. I wouldn't expect any more from the Racket team.
Now, I'll quit dropping Racket references into these tangents if you're saying you all sat around in Plato's Cave with no exposure to real programming past its shadows in academic papers, and came up with everything in Racket on your own. It just seems like there are feedback loops in there, from useful projects, that caused improvements that make it more useful in practice. If you say no, I'll concede I'm wrong, given you're part of the core team -- and then be mystified at its evolution.
We certainly put effort into documentation, distribution, libraries, tooling, etc, and there are many Racket users who will bring up Racket unprompted. It turns out language adoption is hard, though.
And far be it from me to encourage you to stop mentioning Racket! But I think fewer of the academic projects you mentioned than you think were developed by people based on industry needs. Instead, we Racketeers are all software developers, and we make Racket the language we want to program in. The most significant Racket application at the beginning (and maybe still) is DrRacket, the IDE. Developing that has led to everything from FFI improvements to contract systems, just as an example. I expect the same to be true for many other real working systems developed by academics.
" But I think fewer of the academic projects you mentioned than you think were developed by people based on industry needs."
So, I retract my prior claim that Racket's features, libraries, and tooling were developed by academics or the Racket community with feedback or justification from real-world projects (e.g. web servers) or use in industry. Instead, it was purely the Racket community working within the Racket community on day-to-day, academic, or leisurely needs.
I'll revisit the others on the list to see which of them might be the same.
"And far be it from me to encourage you to stop mentioning Racket!"
I wouldn't anyway. You all have earned mention with the right mix of attributes in the project. :)
"Instead, we Racketeers are all software developers, and we make Racket the language we want to program in. "
That makes sense. Scratching an itch, as the old FOSS motto goes. I did the same thing with a 4GL a long time ago, so I understand the motivation. Staying with such a project for many years is an angle I apparently haven't caught up to. Regrettably. ;)
"The most significant Racket application at the beginning (and maybe still) is DrRacket, the IDE. Developing that has led to everything from FFI improvements to contract systems,"
That makes sense. It looks like a very complex program. It would stretch the language into programming-in-the-large and robustness territory by itself.
"I expect the same to be true for many other real working systems developed by academics."
I'll keep it in mind. Thanks for your time and information on the Racket project.
Third time I've seen his name recently. First, a definitive guide on FP compilers; then another piece, I think touring them; then this. Is this guy supposed to be one of the grandmasters of FP or something? ;)
See my reply above - I'm all for industry opening up where possible, and a lot of stuff gets open sourced these days (hell, didn't Facebook even open source its data center designs?). Opening up technology doesn't necessarily mean academics will focus on the right problems, though.
Google patents a lot of cool stuff, actually. But the ideas tend to not get noticed in academia; when was the last time you saw a patent in a bibliography?
The point is that patents don't appear to contribute to the sum total knowledge that academics build on top of. They're by and large intellectual dead ends, if profitable ones.
Oh, I mostly agree. Some organizations will only fund you if they get IP out of it, and a lot of significant tech in that area came about that way. But it's mostly an after-the-fact, make-money thing. Or just bullshit altogether.
Of course, there's tons of academic research that does all of those things. And I of course agree that too much academic software is "works on my grad student's machine" -- an enormous amount of my time on Racket is spent on real-world works-in-practice issues. But this doesn't seem like a particular issue of industry-relevant systems work; it's just Sturgeon's Law.
Also, failure to address the general case is not so bad--it just means that the next part of the general case has to be addressed by the next researcher.
Finally, I think the real issue is academics who have an idea and cloak it in pseudo-relevance to industry to sell it. A program analysis framework isn't suddenly industry-relevant now that it's been applied to JavaScript, and we should just be OK with not chasing the latest industry fad.