Hacker News
Best Papers from 27 Top-Tier Computer Science Conferences (jeffhuang.com)
192 points by jholdenm on July 18, 2012 | 57 comments



This is an awesome compilation. Thanks!

Few comments:

- Take the institution ranks (except the top 10) with a grain of salt. In theory conferences (e.g. FOCS, STOC, SODA), the order of authors is alphabetical.

- Best student paper has a different meaning for theory vs systems conferences. In theory conferences, it is given to a paper whose authors are all full-time students. In systems conferences, it is given to a paper whose first author is a full-time student. So best student papers in theory conferences are particularly good reflectors of an institution's merit and should be included.

- A thought/idea: Most top conferences in CS tend to have inner cliques. So if you rank institutions merely by their program committee membership/chairship in these conferences, the resulting ranking might look very similar to this list. The differences between the two lists might be one way to bring out "rising star" institutions.


> In theory conferences, it is given to a paper whose authors are all full-time students.

But the theory community does not place a student's advisor's name on the paper as a matter of course, correct? While in the systems community, the advisor's name will usually be on the paper, listed last. With that in mind, the two classifications of "student paper" represent close to the same thing.


Unless the advisor's name goes first, due to alphabetical order... I guess having a surname starting with 'M' can harm my academic career :/


As dude_abides says, alphabetical order is common in theory. But in systems, author ordering typically indicates contribution and/or status.


Well, my team publishes in both, and the team-wide rule is to always use alphabetical order.


I was watching tptacek's talk on crypto/pen testing, and he mentioned that a flaw published 15 years ago is still being put into the wild by general web developers at big companies who should know better.

And I was also reminded of The Checklist Manifesto, where the Boeing (?) flight safety team analyses each crash report and produces checklists for pilots that are pushed out around the world and actually read.

So here is a silly idea:

A kick-starter that funds a group to guide writing of real actual best practise in software engineering - broken down into silos like realtime or web or os. It can be updated and informed wiki style but is aimed at spreading actionable, immediate choices, with the background reading to educate later on.

Maybe I need more (or less) coffee


> A kick-starter that funds a group to guide writing of real actual best practise in software engineering - broken down into silos like realtime or web or os.

There are enough reasonable guides to writing better code (security and otherwise). How would this effort be different? And more importantly, why would Random J. Programmer go there if he didn't go to any of the previous ones?

A good answer for that will cause me to chip in (even though I'm not a fan of the "kickstart everything" craze)


I think your comment (why would J. Random go there if he did not go to the others?) is the killer comment. Ultimately, programming either has a professional body to enforce this, or everyone learns to program a few years after they learn to read.


What's easier and more plausible to happen - Random J. reading a long processing-cycles-demanding guide or just following a checklist?

Random J. seeing a link to/start of a guide: "that seems long and intricate... I don't have time to grok this, and it seems like the kind of thing you need to grok and understand before you can apply it."

Random J. seeing a checklist: "ah, just following the steps. I can do this."

Maybe it's an inconvenient truth, but such checklists would be an overall boost to software security. Having a straightforward checklist to follow mechanically > having no checklist.

You and I would want to go beyond that. Random J. Programmer wouldn't.


> or just following a checklist?

Well, OP didn't suggest a checklist (he did refer to the Boeing one as inspiration, so he might have thought about that).

And while they might be a hundred times more likely to follow a checklist, a hundred times (almost zero) is still (almost zero).

Let me re-iterate the question/problem:

> Random J. seeing a checklist: "ah, just following the steps. I can do this."

The question is: why would Random J. see the checklist? This is the crux that needs to be addressed. The content and existence of the list, while important, is a much easier problem to solve.


It's true that there are people who aren't aware of the information now, and there'll be people who aren't aware of it in the future. But I think[1] there's a large group of people who are aware that the information exists, but are just ignoring it because they deem it too complex/time-consuming to learn. Enabling better security practices for this group would be an overall net gain.

[1] Based on my past experiences, YMMV :-)


These "best paper" lists are almost comical and a very poor way to find important papers. Have a look at citation statistics for the older "winners": they are only marginally better than the average for the venue. Essentially nobody still reads them today, and they've had hardly any impact at all.

To find _important_ papers you want at least 5-10 years of hindsight: look for those that are still being cited a lot, correcting for citation rings, dubious journals/conferences, etc. As a side benefit, these can almost always be found online on some course website without requiring IEEE / ACM subscriptions.


What's happening to hackernews? While there's still a handful of insightful comments and submissions, the culture feels like it's starting to shift with everyone having their noses in the air. I'm starting to use this site less and less, and it's a shame. Oh well.


Your rant is relevant to the insightful parent --which also provides arguments for what it says--, how exactly?


How does one get a list of the most cited papers of the last 5 years? Or, see trends on how a paper has been cited? Is there a site that does this?


Find a semi-recent advanced textbook (preferably one that is used in at least one of the better schools) and use Google Scholar from there (both papers cited in the text and those citing it).

The formal bibliometric tools such as Scopus, Thomson Reuters, etc. are hugely misleading, to say the least, given the ever-growing avalanche of publications over the last decade (an increasing number of people are being paid bonuses for each publication in an "international" venue). See this character who, according to Thomson Reuters, is a "rising star" of computer science [1] and also happens to be a collaborator of El-Naschie [2].

[1] http://sciencewatch.com/inter/aut/2008/08-apr/08aprHe/

[2] http://www.timeshighereducation.co.uk/story.asp?storycode=42...


Bell Labs, Xerox PARC, Microsoft Research. Is being backed by companies with strong monopolies the only way to do world class research outside of academia?

Supposing it is true that, in the future, higher education collapses to a currently unrecognizable form and large companies become untenable, what will drive innovation? Who will fund, say, building a quantum computer?


Jon Gertner's recent book about Bell Labs talks about this. Having a monopoly was essential for the Labs' success. A few choice passages:

Still, the contrasts between these organizations and Bell Labs are crucial. “This was a company that literally dumped technology on our country,” the physics historian Michael Riordan has said of Bell Labs. “I don’t think we’ll see an organization with that kind of record ever again.” The expectation that, say, Google or Apple could behave like Bell Labs—that such companies could invest heavily in basic or applied research and then sprinkle the results freely around California—seems misplaced, if not naive. Such companies don’t exist as part of a highly regulated national public trust. They exist as part of our international capital markets. They are superb at producing a specific and limited range of technology products.

...

“I’ve often said to my old friends that we were very lucky we got to work there, in an environment that I don’t think will ever exist again,” remarks Dick Frenkiel, who worked on the first generation of cellular technology. “It’s hard to say something will never happen again. But with the monopoly gone, with the whole concept of monopoly essentially discredited, how could there ever be a place like that again?”

...

That perceived natural monopoly wasn’t only justified by the phone system’s technological complexity and interdependence; it was also—in an argument that telephone executives made over and over again—a matter of economics. With one company in effect serving the country’s phone customers, some parts of the phone business that were highly profitable, such as long-distance service, could subsidize other aspects that were less profitable, such as local calling. Profits from high-paying corporate customers, moreover, could subsidize service to residential customers. Profits from dense urban areas could subsidize expansion into sparse rural areas. All in all, this kind of “averaging,” as it was sometimes called, helped make telephone service available and affordable for most Americans.

...

In testimony before the U.S. Senate’s Subcommittee on Antitrust and Monopoly, Bill Baker asserted that “the notion that Bell Laboratories could endure and function away from AT&T, Western Electric, and the operating integrated Bell System would be laughable were it not so sinister and so ominous.” It was an argument like the one a gifted child might make in favor of preserving his parents’ marriage.


I think the claim that Bell Labs could not have existed without being part of a monopoly is unscientific at best. Sample size 1.

My understanding of economics would beg to differ with that conclusion, but I don't think we'll get anywhere arguing economics. However, here is something that seems faulty in the text you quoted.

> The expectation that, say, Google or Apple could behave like Bell Labs—that such companies could invest heavily in basic or applied research and then sprinkle the results freely around California—seems misplaced, if not naive

I think Google has, indeed, been investing in applied research and sprinkling the results freely around. Off the top of my head, without thinking, there is Go (the language), and the fact that they employ Guido van Rossum. And their developers can (at least sometimes) use 10% of their time on whatever (including open source) - so if 1% of developers' time is actually spent on open source, Google is investing 1% of its payroll money (i.e., hugely) in "sprinkling technology around."

I bet I could come up with a bunch more stuff they do "for free," if I thought about it more.

And I bet Google has plenty of money to throw around - a certain amount of research is just "pocket change" to them. I suspect that's pretty much the way things worked at Bell Labs, at least in terms of "free technology" that was being dispersed.

Of course, you can increase the level of innovative research through patents (and also kill it, if you do it wrong).


They might be examples of research (cutting edge too) but from Bell came:

Karnaugh Maps, Transistor + MOSFET, Unix, C (this spans at least 1 to 2 years of any EE/CS curriculum). They won (at least) 2 Nobel Prizes in physics.

From IBM research: LASIK, Networking, DRAM, Leo Esaki + 2 other physics Nobel Prizes, fractal geometry, ATM, relational databases, vacuum tubes etc.

I don't think a comparison is even possible. (Not to take a cheap shot at anyone - Google has been the single biggest contributor to the delta between my parents' and my generation - they didn't have internet + IR to answer their questions and I have used their services for a large part of my life and they are doing some sweet-sweet things all the time)

Overall, the formula seems to be the same - collect super-smart people, give them freedom and see the magic happen. From what I see, the PR boost each of those contributions brought might have resulted in a net positive (not to mention that being a potential destination for people with immense knowledge of their domains has to be an HR win).


> I think Google has, indeed, been investing in applied research and sprinkling the results freely around. Off the top of my head, without thinking, there is Go (the language), and the fact that they employ Guido van Rossum.

Really bad examples. Neither of those is research, basic or applied.


Google's driverless car project has been around for more than 7 years and there's still no immediate plan for commercialization. I'd consider that research (at the very least it's applied AI research).


A much better example!


No, because it has absolutely massive commercialization potential.

Whereas my "really bad examples" are much more in line with the "only monopolies have so much money they just give stuff away" premise of the whole discussion.


By research they mean nothing like what Guido can produce, for Google or in general.


>Who will fund say building a quantum computer?

404 No Such Agency


Bell Labs had a monopoly, as part of Bell Telephone, which was (AFAIK) granted exclusive right to provide telephone service. Microsoft does not. AFAIK, Xerox never has had a monopoly.

Having a large amount of marketshare != a monopoly.

Lots of people insist on using the term for both things. I don't know why. But even if you are one of those people, surely you will see that there is a fundamental difference between those two things.

For the sake of this discussion, it would be necessary to differentiate between them.


I don't think it's necessary to make a distinction between a de jure monopoly, say, and a de facto monopoly, or a cartel.

The point is a market that allows lots of profit for an extended time period. This doesn't have to produce a Bell Labs, but it sure seems helpful.


Microsoft is a convicted monopolist.


And that conviction was immoral and improper. Microsoft did not have a monopoly in the proper sense of the term, only in the sense used by anti-business people who want to twist the term to suit their own purpose.


It will be interesting to see how those papers compare with the Most Influential awards which, at least for SIGPLAN conferences, are given 10 years later.

http://en.wikipedia.org/wiki/SIGPLAN


None of these were about garbage collection, which surprises me. Is GC considered a solved problem?


Not solved by any means, but there is typically only one GC-related paper per top-tier PL conference such as PLDI or ICFP these days. Many more GC results, particularly for more actively developed systems where the changes tend to be incremental, are published at either ISMM (International Symposium on Memory Management) or MSPC (Memory Systems Performance and Correctness).


That's more a reflection of the fact that, aside from PLDI, none of those conferences really touch GC. It's also a tricky domain to analyze wrt theory, and it requires huge engineering effort to evaluate in practice. But there are also lots of real opportunities for innovation in the theory & practice of GCs.

:)


www.cs.umass.edu/~emery/pubs/gcvsmalloc.pdf

"In particular, when garbage collection has five times as much memory as required, its runtime performance matches or slightly exceeds that of explicit memory management."

Yep. Like most problems in life, it's 'solved' if you have the money.


That's talking about throughput. Getting good pause times and good throughput is still hard. Even if you have 100x as much memory, you still get the same pause times (albeit less frequently).


I think that for a conventional GC it's mostly dependent on how much time you have and what properties you want; that's mostly an engineering problem. Remember that academia worked on GCs way before mainstream languages did. From what I understand, the "experimental" languages are playing with region-based allocation for now.


I wish there was some way to express metadata like 'best paper' from trusted sources into the web crawler space. This is an excellent compilation, and I've added a half dozen papers to my tablet for later reading, but it's a human compilation. My thought is whether there might be an opportunity to flag something such that a web crawler could automatically compile this sort of list.

Challenges I see to that would be spam injection and author spoofing.


Actually, when you "search" for papers on a particular topic, you need more information than which paper won best paper; the "best" paper for a topic is probably not distinguished at a conference that covers multiple topics!

Citation rank is one way to evaluate a bunch of papers on a particular topic, and it's definitely one of the best ways to find related papers in the first place. However, citations are fairly easy to game (cite yourself, get your colleagues to cite your paper) and therefore are not a great indicator of quality/influence/impact. Next, the venue of the conference is important; an OSDI paper will probably be pretty good given their low acceptance rate. But you can find lots of noise even at the best first-tier venue; it's not that hard to get published (in CS), and many rising academics will flood the system with papers to make their tenure case stronger (where the tenure committee is not composed of peers, sheer numbers + conference rank are very important).

Sometimes you'll also find a gem in the dissertation of someone who, 10 years back, didn't really focus on publishing: if you don't pursue an academic career, you don't have too much motivation to publish broadly. Or a good paper that was never cited at all, on some topic that was then just emerging or way before its time. The quality of my own papers (as judged personally and subjectively) is inversely related to their citation counts.


Interesting! But each article costs 30+. How could a poor startup afford that?


More than a few are freely available. But, FWIW, the IEEE and ACM are notorious for charging ridiculous fees to access their papers, if you're not a member. However, individual membership in both organizations is relatively inexpensive (it's journal subscriptions that quickly drive up the price), and can be worthwhile if you find yourself accessing a lot of academic papers of this sort.

My take is that ACM, IEEE and AAAI are all worth maintaining memberships with, especially if you want to stay up to date on what's going on at an academic level. Maybe the biggest advantage of being a member of these orgs, other than access to archived papers, is that subscriptions to their journals are much cheaper if you're a member. YMMV, of course.

On a related note, at least one of IEEE, AAAI or ACM has a discount program for journals published by other outfits, like Springer. Some deal like "join ACM, and your subscription to $FOO from Springer is heavily discounted." If you like to subscribe to journals and what-not, check out all these offers, as they can bring the prices down to where an individual can actually afford them, whereas normally you'd find the price to be so high that only an institution could really justify it.


This is a nice service the ACM offers, though authors have to take advantage of it for it to work: http://www.acm.org/publications/acm-author-izer-service


I see it more as an appeal from the ACM to authors to keep the ACM in the loop. Personally, I see it as a hassle, so no thanks.

If the ACM removed the paywall, I would link directly to the proper page for each of my papers - there's excellent meta information on those pages, like who we cite and who cites us. As well as author links which link to other papers written by those authors.

I really want to somehow lobby the ACM and IEEE to remove their paywalls (I am a member of both), but I have neither the time nor the knowledge of how to be most effective.


If you look to the right of most of the Google Scholar links, there is another link that says "PDF at <site>" and you can download it from there. Very useful.


Try punching the paper's title/author/etc into Google Scholar and CiteSeer to find other freely available copies. This works very well in CS, math, and physics, but is hit-or-miss in the "soft" sciences.


Hire someone part-time who's still enrolled in uni and use their subscription :P


It's a hassle, but if you google the author's name, chances are high that you'll find their webpage that links to the PDF of the paper. They're likely to also have other resources like videos and related reading.


- Published draft versions are often just called drafts for legal reasons and are content-wise identical to the final versions.

- http://citeseerx.ist.psu.edu/ has a public "cached" version of many papers that aren't publicly available on other sites.

Also, if there is some paper you just can't find, you could try posting its name (or a link to the paywalled site) here. There are lots of academic people on HN. Someone here may be able to help you out.

I think all research should be made freely available. The current situation is just sad.


If you can't find it on the web, email the authors. It's highly likely that they'll send you a copy.


I clicked on a few of the paper titles and they led me to a direct download link off google scholar. Maybe not every paper is free this way but it was for the papers I clicked on.


Check on http://citeseerx.ist.psu.edu/index and xxx.lanl.gov.


Thank you. That works!


You can sometimes find the papers in open access directly by googling, or by looking at the author's webpage. If it doesn't work, try to email the authors. If you know someone who has an account on a university network with the relevant subscriptions, ask them.


Next time he should just add a torrent with all the papers included, that way we won't need to hunt down all the free versions of each paper.


Too bad it doesn't include posters... I would have made the list :(


HCI guy here, surprised UIST is in the lot but not SIGCHI/TEI.


[dead]


Absolutely not true. The pose estimator linked to in CVPR (Shotton et al.) shipped with the Kinect. I don't know more examples from that list in particular but there are two other examples of academia + research labs producing solid world-class products:

- SLAM: a model checker for drivers written for Windows.
- Cilk: out of MIT, acquired by Intel.
- UT Austin + CMU were responsible for HW model checking (see the FDIV bug).

I find it very hard to believe that none of these papers will be used by anyone ever in a decade or two from now.



