Hacker News
Einstein award going to Paul Ginsparg for creating arXiv.org (idw-online.de)
1128 points by endymi0n on Nov 26, 2021 | 97 comments



Around 6 or 7 years ago, I went to a quantum computing conference and presented about a then newly developed quantum programming language that could run on a real quantum computer, showing a 1/2 decent simulation of dihydrogen energy in terms of bond length.

I authored a paper about this quantum programming language but had no way to post it to the arXiv. I'm not a traditional academic, but I had posted on the arXiv before, long ago, at a previous job. Problem was that my arXiv credentials were associated with that job's long lost email.

I gave the presentation in a giant ballroom, and afterward, sat down at my assigned spot at one of these large circular tables. Next to me, during a brief coffee break, an old man I didn't recognize told me my talk was very interesting, and asked me if my paper was published yet. "Not yet; we haven't chosen a journal. And posting to the arXiv is delayed, because my account is locked, so a colleague is planning to post on my behalf."

The man responds, "Oh, it should be possible to fix that." I said that I figured as much and have just procrastinated contacting the admins. "They're at Cornell right?"

He said, "No, I mean, I can fix that. You said your name was Reikon Musha right?"

He opens his clunky laptop and continues, "I'm not supposed to do this. But I'm certain that if you gave an invited talk here, you're definitely not spoofing your name. Maybe you can show me your ID? No, no, just kidding."

He continues clicking around. "Was xyz@example.com your old address? What's your new one?" I answered yes, and gave him my new address. He typed it in and said, "straight into the database it goes; just go and reset your password now."

I was absolutely puzzled. I said, "Thanks?? Who are you?"

He says, as a matter of fact, "I'm Paul, I invented the arXiv."


He not only fixed it, but was careful in his conversation to give you the best possible anecdote.


Haha, exactly. I know a lot of these kinds of anecdotes are hammed up for entertainment, but the conversation was quite literally as written (modulo the inaccuracies of my memory).


You can now add, “And that man’s award’s name? Albert Einstein”


I laughed at your comment. On point.


What an interesting story! I was wondering if you were also working in the quantum computing field back then, or if you just worked on this as a hobby. 6 years ago the entire quantum computing field was not hyped as much yet (this was just about when the IBM Quantum Experience started?).


Yes, I was working in the field, not as a hobby. The programming language and compiler research certainly preceded IBM's foray into publicizing "cloud quantum", but I think they had just come out with a primitive version of their Quantum Experience by the time we gave that talk. This was definitely before the public hype set full sail—but it was clearly beginning to simmer. Everything was really research-oriented then, but it's, as you observe, around the time companies began to open up a bit more. (This is completely ignoring DWave, a whole different story. This also ignores physics-academia, which on the other hand had all sorts of hype, all of which eventually bled into the public arena, from Majorana qubits to chemical simulation.)

As a personal aside, I miss those days when things were just so much quieter, heads-down, and collegial. It's difficult to describe how the research and commercial environment has changed for its participants over the last decade.


It was definitely very hyped within academia long before that, even if you weren't hearing much about it in mainstream news yet.


Now this thread is first Google result for your name


This is why I love HN


and everybody clapped


How does he (or people in general) pronounce arxiv?


Like "archive". The X represents the Greek letter χ. Similar to TeX or LaTeX, which are pronounced like "tech" and "lay-tech".


With arXiv I mind less for some reason, but I pronounce TeX and LaTeX as teks and lay-teks because:

- That’s how I pronounced them when I first read them.

- That’s how everyone else pronounces them when they first read them until someone who has heard this factoid corrects them.

- They look exactly like commonly-used English words.

- It feels like a dumb prank to take the name of a writing system, whose sole purpose is written word and which will be most discussed in written media, and give it a deliberately odd pronunciation that people don’t usually discover until they talk to someone about it.

I’m grateful to the legendary folks who built these things, but I still wish they would’ve honored the Least Surprise Principle and picked better names.


Not to say I disagree with you at all, but everything you say here applies to GNU.... (Well, maybe it's not so common a word, but it's still a written word for something that you're supposed to pronounce unlike the English word it resembles.)


I pronounce it gah-noo or G.N.U. arbitrarily, but I assumed it was just like “sequel” and S.Q.L. where everyone just pronounces it both ways.


SEQUEL is actually the precursor of SQL :)


This is a great story.


So what you're saying is you can pwn literally any arXiv account by giving a convincing enough presentation on a highly specialised scientific field at a conference that Paul is attending?

Pah, what sort of security do you call that?!


If you can both get invited under a fake name and show interesting enough research that Paul seeks you out I suppose.

I think the first part would be rather difficult, probably more so than pwning a server.


Parent is joking


Well, that'll teach me not to HN right after waking up.

Or won't. But it should :P


HN pre-coffee: read only. HN post-coffee: writes enabled. :)


Also, you can comment as a caffeinated user by prefixing with cudo.


I've always said that the neurobiology of sarcasm is woefully understudied


I’m glad you were the one to break the news


Reminds me of the xkcd: https://xkcd.com/810/


Nice anecdote, thanks for sharing :)


Many years ago (think: 1994) scientists (mainly CS folks) published papers on the web as postscript files. Actually, they were usually on FTP sites, not the web, as almost nobody used the web at the time. I developed a huge aversion to postscript (it was clunky and the renderers weren't great) but the idea that every bit of published science would be freely available on the web seemed completely and totally obvious to me.

When I later became an academic, I learned that when you write a paper and submit it, conditions for publication typically include signing away the copyright to the journal, possibly with a license to distribute a few preprints offline.

My advisor at the time pointed out that he actually modified the contract (changing the terms to retain copyright), signed it and sent it back (no journal ever complained, and all his papers are available as PDFs online).

arxiv is the closest thing to what I dreamt of some decades ago and I'm thrilled that Paul is receiving recognition. Personally, I think arxiv is a better path forward than scihub, entirely due to its "legitimacy". In the future, I will always work to "publish" on arxiv and not put my work in journals (fortunately, I am not in a publish or perish situation).

My only complaint is that I don't particularly like PDF and wish there was an HTML-zip format that could be sent around and the browser did all the rendering work, while the underlying data tables are stored in well-defined formats so they can be programmatically extracted.


I've said this here before, but based on having dinner with him probably ten years ago now, the arXiv replaced the system that existed in the early 90's where people at top research universities would mail preprints to each other, as publishing was slow and waiting for preprints to be published would put you behind the curve by quite a long time. Obviously this was inefficient and somewhat exclusive (if you weren't at a university in the preprint exchange club, you'd be hopelessly behind in your field).


> My only complaint is that I don't particularly like PDF and wish there was an HTML-zip format that could be sent around and the browser did all the rendering work, while the underlying data tables are stored in well-defined formats so they can be programmatically extracted.

Back when I was doing Math research it was common to submit LaTeX files to arXiv. A quick look at some of the Physics papers, though, suggests it may not be the case for other fields.


It depends. arXiv won't normally accept a LaTeX-generated PDF, but it will accept a PDF written in Word.

Particle physicists, cosmologists and astrophysicists almost exclusively use LaTeX (anybody using Word would get laughed at). In other fields of physics, Word is more common and also not everyone submits everything to the arXiv. It's an interesting cultural divide...


I wrote my thesis in LaTeX and converted it to HTML in 1995. It’s still online, 27 years later, and totally readable! Only a few external image links are broken.

I don’t particularly like latex though.


> It depends. arXiv won't normally accept a LaTeX-generated PDF, but it will accept a PDF written in Word.

What do you mean? I predict the majority of PDFs on arxiv are generated from latex just based on the fields who mainly use it. In fact many papers contain the tex source files.


What I mean is, if you try to upload the PDF from LaTeX, it will complain and tell you to upload the sources (which it then compiles to a PDF).


MHTML is not going anywhere, but presumably you could do a single HTML file with all media as embedded data URIs?
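
To make the data-URI idea concrete, here is a minimal Python sketch. It is only an illustration: the helper names (`to_data_uri`, `inline_images`) and the `src="..."` matching are assumptions made up for this example, not part of any existing tool, and a real converter would want a proper HTML parser.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_images(html: str, assets: dict[str, tuple[bytes, str]]) -> str:
    """Replace src="name" references with embedded data URIs.

    `assets` maps a file name as it appears in the HTML to a
    (raw bytes, MIME type) pair. Hypothetical helper: it relies on
    plain string replacement rather than HTML parsing.
    """
    for name, (data, mime_type) in assets.items():
        html = html.replace(f'src="{name}"', f'src="{to_data_uri(data, mime_type)}"')
    return html
```

The result is one self-contained HTML file, at the cost of roughly a third more bytes for every embedded asset because of the base64 encoding.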


I'm in CS research, I submit all my papers as latex sources. Submitting pdfs that were obviously created in latex is supposedly against the rules.


> changing the terms to retain copyright), signed it and sent it back (no journal ever complained, and all his papers are available as PDFs online).

Assuming the journals signed it first, this is most likely illegal since they didn’t know they were signing under the different terms. If they sign it after the professor signs it, they technically have the duty to confirm the language is the same, but it still might be seen as a deceptive practice on the professor’s part.


I had this discussion with the professor in question. He had already consulted with his IP lawyers. As you say, they technically have the duty to confirm the language is the same, but we know they didn't because the paper was published with their copyright attached. I think he was trying to push the issue, gently, and hoped some publisher would make a stink.


Contracts are amended all the time in exactly this method. That’s how contracts work (IANAL so maybe I’ve missed something, but I was taught this by lawyer left)


Surely you are supposed to tell the other person if you are changing the contract though? Not a lawyer, but it's hard for me to believe that tricking someone into signing the wrong thing is an effective legal strategy.


If you just signed it, that would be indicating you accept it with no changes.

There is no reason to send it back aside from redlining something. So sending a new contract is itself the indication that it has been redlined.


The Nobel Prize in Physics often goes to "tooling" experimental work. Think: blue LEDs, CCDs, fiber optics, optical tweezers, things like that. Not that anybody asks me, but I would advocate a similar tooling award split between Berners-Lee (www), Knuth (TeX), and Ginsparg (arXiv). I can think of no people who had a more profound change in how physics is actually practiced.


Maybe that's a sign that theory research in physics has become irredeemably suspect?

The Swedish bank (not an OG Nobel prize, but effectively one of them) prize for economics pretty much alternates (A) sharing the prize between econometrics (ie. tooling) and empirical research and (B) awarding deep theory.


I would push back on this. Irredeemably suspect? The Nobel committee doesn't make theory awards until they are supported by experiment. There was basically no new particle physics data for 30 years. Then they found the Higgs and nothing else. Theory beyond this is speculative, nobody knows if it's important or not.


I would love if there was a "Nobel Prize in Math, Computer Science, and Information Processing".


There’s the Gödel Prize for theoretical computer science, the ACM Turing Award for more general contributions to computing, and of course the Fields Medal (and nowadays the Abel Prize) for math.


Iirc, Nobel specifically didn’t want there to be a category for math itself?

But I could be misremembering or have misinterpreted what I saw, which was in a non-authoritative source.


That's a very good point. Surely people on Hacker News collectively know enough people to get a nomination done? :-)

PS: the above list is good; are we missing anyone important who ought to be on that list?


Maybe the HN community could set up an award?


I second this. Let's do one.


How would we go about doing so?


Just in case it hasn't clicked yet, the X is the Greek letter chi, so the URL is "archive" phonetically.


I've been using the Arxiv for 10+ years. I always thought it was just a "cool" way of spelling it, like "ArXiV" or "eXtReme". Mind blown, thanks.


Now the game "arxiv vs snarxiv"[0] will also make more sense to you!

[0] http://snarxiv.org/vs-arxiv/


What about vixra[0]?

[0]https://vixra.org/


ArXiv is great. I encourage anyone to submit their work there in addition to submitting to a peer-reviewed venue. Thankfully, they're compatible with overleaf nowadays. There was a time when arxiv didn't accept stuff from overleaf but it also didn't accept pdfs made from latex, so I had to pdf-print my pdf to submit it to arxiv.


Yes, arXiv generally doesn't accept PDFs, preferring instead the tex source (which is amusing when people don't realize their comments show up). It uses something called autotex which has a few quirks (e.g. all images have to be in the same dir, etc.).

Here is the makefile I use that also generates a .tar.gz for the arxiv (obviously won't help with Overleaf without cloning first, but)

    # Note: in a real Makefile, recipe lines must start with a tab.
    FIGURES=$(wildcard figs/*)
    TEXFILES=main.tex included.tex included2.tex ...

    # Normal build of the paper
    main.pdf: $(TEXFILES) $(FIGURES) main.bib
        latexmk -pdf -g $<

    .PHONY: clean show

    clean:
        latexmk -C
        rm -rf forarxiv*

    # Tarball to upload; building forarxiv.pdf first checks the flattened source compiles
    forarxiv.tar.gz: forarxiv/main.tex forarxiv.pdf
        rm -f forarxiv.tar.gz
        cd forarxiv && tar --exclude='*.bib' -cvzf ../forarxiv.tar.gz *

    forarxiv:
        mkdir -p $@

    # Flatten \input files, empty out comments, and drop the figs/ prefix from paths
    forarxiv/main.tex: main.tex main.bib | forarxiv
        latexpand --empty-comments $< | sed -e 's#figs/##g' > $@

    forarxiv/main.bib: main.bib | forarxiv
        cp $< $@

    # Hard-link figures and style files next to main.tex, then test-compile there
    forarxiv.pdf: forarxiv/main.tex $(FIGURES) forarxiv/main.bib
        ln -f $(FIGURES) forarxiv/
        ln -f foo.sty forarxiv/
        ln -f foo.bst forarxiv/
        latexmk -cd -pdf forarxiv/main.tex
        latexmk -cd -c forarxiv/main.tex
        mv forarxiv/main.pdf $@

    show: main.pdf
        xdg-open main.pdf
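
For anyone without latexpand, the comment-emptying step can be roughly approximated in a few lines of Python. This is a sketch under assumptions, not a replacement for latexpand: `strip_tex_comments` is a made-up name, it does not flatten `\input` files, and it does not special-case verbatim environments or `\url{...}` arguments that contain a literal `%`.

```python
import re

# A '%' starts a comment unless it is escaped, i.e. preceded by an odd
# number of backslashes. The group captures any run of backslash pairs
# (each pair is a literal "\\") sitting directly before the '%'.
_COMMENT = re.compile(r"(?<!\\)((?:\\\\)*)%.*")


def strip_tex_comments(source: str) -> str:
    """Remove comment text from each line but keep the '%' itself.

    Keeping the bare '%' preserves TeX's end-of-line behaviour (no
    stray space is introduced where a comment used to be).
    """
    cleaned = [_COMMENT.sub(r"\1%", line) for line in source.splitlines()]
    return "\n".join(cleaned)
```

Running this over `main.tex` before uploading turns a line like `Result % TODO: fudge this number` into `Result %`, while an escaped `50\%` is left alone.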


Do you think it would be worthwhile to upload something like a master's thesis that ultimately did not get published through a traditional journal? That happens to be my case. I am alright with my supervisors' decision to not pursue publication, however I still wish to have it out there in some form, as it took a sizeable amount of effort, and it might be useful as a way to advertise myself.

On a tangent, is publishing the LaTeX source of a paper or thesis on GitHub something that people do...? I was also toying with the idea of writing my thesis with org-mode while including the code snippets I used for numerical calculations and graphs.


>worthwhile to upload something like a master's thesis

That'd be great. Seems dumb to let talented, hard-working student papers moulder on some basement shelf. (Or do they file 13 'em?) I've run across several onlined, very readable and informative masters' works over the years. Tend to be less hidebound about fresh theories/tooling as well.


The tex for my thesis is on github but I don't expect anyone to ever look at it there (anyone who might read it will find it on arxiv), I just used github for backup & versioning (I'm not quite dumb enough to have a document representing 4 years work in one place).


"arXiv moderators expect submissions to be of scholarly archival interest to the communities they represent. A submission may be declined if the moderators determine it lacks originality, novelty, or significance.

Submissions that do not contain original or substantive research, including undergraduate research, course projects, and research proposals, news, or information about political causes (even those with potential special interest to the academic community) may be declined." as explained in https://arxiv.org/help/moderation

If your supervisors decided not to pursue publication then it seems likely that it does not match these criteria either. Note, this isn't a qualitative judgement about your thesis - as it says as much about academia as it does your thesis.

Additionally, you might run into the endorsement system if you're new to arXiv. https://arxiv.org/help/endorsement


> Submissions that do not contain original or substantive research

Most, if not all, master's theses contain original research, and that is definitely my case as well.

> undergraduate research, course projects, and research proposals, news, or information about political causes

My thesis can't be classified under any of these categories, either.

It does sound like I would be able to publish it there, but I'm going to have to reach out to my supervisors for an endorsement in arXiv, thanks for the tip.

I believe they weren't inclined towards publishing it because I did not quite manage to fulfill their expectations with respect to the scope of the thesis. I came across quite a few bumps along the way which hindered my progress. It was still graded well enough, it has a good amount of original content, and I am relatively satisfied with its quality, though, so it's not like it was a complete disaster.


You should do it, but contact your advisor first. I know I'd like to upload my old master's thesis online, but the university retains a copyright on it. It's possible yours does too. I'm sure the information on that would be available on your school's website in the thesis format/guidelines sections.


Didn't mean to imply it was a disaster; but without being familiar with your work the main signal in your comment was that your supervisors didn't push for publishing it, which might be a hint it's not a good fit for arXiv either. But if your supervisors will endorse you then by all means go for it!

Another option might be to put it on Zenodo.


I can't attest to ArXiv publication, I didn't consider it actually (I may now, though to be honest mine is not groundbreaking research by any stretch of imagination), but publishing the LaTeX source on GitHub is definitely relatively common. I and a few acquaintances I knew from my Master's program did that.


I imagine it depends on the field. Having perused your GitHub page and taken a look at what your program was about, I see why you might publish your thesis' source there. People working in theoretical physics aren't too inclined towards free software, so maybe that's why I haven't seen many related papers.


Ah yes, I assumed computer science, apologies.


There are some theses on the arXiv but it's relatively rare. See e.g. https://arxiv.org/search/?query=dissertation&searchtype=all&...

Typically these are uploaded in some other archive that may be less open (e.g. often outsourced to proquest or something), but many universities have their own open archives. My PhD thesis is available on MIT dspace. Hilariously, it's a scanned copy of my printed thesis, since I had to print it out for submission, even in 2015.

As for theses on GitHub, those happen but often people keep it private since they don't necessarily want the revision history exposed :).
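
If you'd rather script the dissertation search than click through the site, arXiv also exposes a public query API that returns Atom XML. The sketch below is written from memory of that API, so treat the endpoint and parameter names (`search_query`, `start`, `max_results`) as assumptions to check against the official API manual; the function names are made up for this example.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Assumed endpoint of the public arXiv API (verify against the API manual).
API_URL = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"


def build_query_url(terms: str, max_results: int = 5) -> str:
    """Build a query URL that searches all fields for `terms`."""
    params = {"search_query": f"all:{terms}", "start": 0, "max_results": max_results}
    return f"{API_URL}?{urllib.parse.urlencode(params)}"


def parse_titles(atom_xml: str) -> list[tuple[str, str]]:
    """Extract (id, title) pairs from an Atom feed string."""
    root = ET.fromstring(atom_xml)
    results = []
    for entry in root.iter(f"{ATOM}entry"):
        entry_id = entry.findtext(f"{ATOM}id", default="").strip()
        # Titles in the feed may be wrapped across lines; collapse whitespace.
        title = " ".join(entry.findtext(f"{ATOM}title", default="").split())
        results.append((entry_id, title))
    return results
```

Fetching is then a matter of passing `build_query_url("dissertation")` to `urllib.request.urlopen` and handing the decoded response to `parse_titles`; the network call is left out here so the sketch stays side-effect free.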


>On a tangent, is publishing the LaTeX source of a paper or thesis on GitHub something that people do...?

I've definitely seen it done, although I suspect the prime concern is versioning and backup.


Well deserved. I only recently learned about the origin of arxiv and the story surrounding Joanne Cohn. Highly recommended read

https://physicstoday.scitation.org/do/10.1063/PT.6.4.2021110...


If you've ever wondered about the unusual arXiv.org website favicon: The original logo was a skull with bones as reference to the piratey nature of distributing preprints without the publisher's consent. The smiley face was added later in order to make it less offensive. Due to the conversion to a non-transparent format, the background was filled with the green color.

https://www.quora.com/Whats-the-story-behind-the-arXiv-org-f...


In a world of leeches like Elsevier, his work is much appreciated.


Neat, next award should go to Alexandra Elbakyan for creating Sci-Hub :)


The least controversial post of the year.

Well deserved.


Why do so many online articles refrain from adding a simple link to the product in question?

https://arxiv.org/


What is the point of arxiv other than a file host but where people forcibly constrain themselves to the bureaucratic process of creating a UN*X braindamaged tex / PDF with whatever academic publishing conventions? Someone in here even has posted a makefile. For analogy to the problem I am alluding to, imagine needing a makefile to write an ASCII text document. Skimming through these comments, it seems like arxiv supports some kind of search or other metadata traversal. All this stuff could be done without a centralized server. No I am not advocating for your half working p2p network, cryptocoin, or startup.


Is arXiv openly accessible in China? I tried a quick search on the topic and saw that there are quite a few mirror sites but no definitive answer.


You can check whether a site is blocked by GFW with the following websites:

1. http://www.chinafirewalltest.com/

2. https://www.comparitech.com/privacy-security-tools/blockedin...


That’s good. Sometimes I think stuff like this (supporting plumbing) moves science forward more than an individual discovery could.


Been out of research for a while, didn't know about this. Will get my papers up there.


A worthy choice.


Completely and utterly deserved. Arxiv, like Wikipedia and arguably sci-hub, is an absolute boon to humanity. What I find interesting is that its endorsement system does largely work at making it very easy to submit papers whilst reducing the amount of spam. Sure, there are some papers that are not completely brilliant -- or indeed, actually any good at all -- but they have the same "right" to be read as their huge CERN brethren. In some fields of physics, paid journal publishing is really a niche in comparison to stuff on the ArXiv. In others, and other areas of science, not quite so much yet. I also very much like the fact that it is open, mostly LaTeX based (machine-readable maths!) and free.


> and arguably sci-hub

Inarguably, not arguably. In my mind there is a clear separation between pre and post sci-hub. The stark increase in experienced friction when accessing scientific knowledge since sci-hub's pausing has highlighted its profound value.

There's also libgen, the closest thing to a bastion of all written human knowledge.

It'd be fun to read a short story about an AI trained on libgen data that makes significant contributions to fusion, longevity and cancer but whose creators are ineligible for any prizes and are jailed for life for multiple counts of copyright violation.


In one of Hugging Face's explanatory blurbs for a training set, they half-jokingly speculate that GPT was trained on libgen. Apparently Google won't reveal what book data it used...


Google Books probably contains as much information as Libgen. It simply doesn't allow public access.


Yep. the next Einstein award should go to Alexandra Elbakyan.


and Aaron Swartz


Wikipedia is one of my refuges from where the rest of the web seems to be going. It’s one of the few that don’t cause me to hit reader view (mobile Safari’s de facto ad blocker) immediately.


Unfortunately Wikipedia has also been heading in a disappointing direction. I now expect everything on there to be filtered through a specific political perspective. Deleting information is much easier than adding, so pesky facts are unlikely to persist.


The Wikipedia process is still pretty good last time I checked?

There was a notability challenge in en.wikipedia on Olavo de Carvalho, a Steve Bannon type who wrote some philosophy books (like -- on Aristotle and Epicurus) as a younger man (and keeps getting his "philosopher" self-nomination challenged because he didn't go and get a college major in philosophy). I was for keeping the page, but on balance I think the process, arguments and result were fair -- even if almost certainly carrying some underlying political motivation.

We have to focus on the process. It's the only way to general axiology.


I don't have faith in the process anymore. Wiki-lawyering to remove content is too effective. To what end? Saving disk space?

I'm no longer motivated to fight deletionism in my domain. I don't want to spend time that is not appreciated.

I suspect that as soon as a page is tagged for deletion by someone, others that enjoy being the arbiters of what is and is not WP, flock there. If they lack interest in the topic, they will vote to remove with many technical reasons. Statements that others do find it relevant or interesting, are simply ignored.


Somewhat true of the main pages; if you want to know whether something controversial is being left out, check the talk page.


It does seem that many pages get claimed by one editor or another who sees it as their domain, but I haven't seen any study on the politics of these wiki fiefdoms. Most I see are conservative, sometimes extreme.


I second this opinion about Wikipedia.


Wikipedia has some major issues for real research and there are good reasons why universities don't allow students to include wikipedia links as citations in their papers. The information there simply isn't curated properly.

A main gripe is that more often than not, the supporting links for claims made in wikipedia articles are broken or of poor quality. Another is the tendency of ideologues to remove anything they disagree with in their particular domain.

It's useful for finding trivial information (flag of Botswana, say) but otherwise I usually block wikipedia from search results.


> there are good reasons why universities don't allow students to include wikipedia links as citations in their papers

That reason is because Wikipedia is a tertiary source. It's too far removed from a source of information to be appropriate to cite. It has nothing to do with the reliability of the site; it's basically never appropriate to cite any kind of encyclopedia.

If there's a specific piece of information you found on Wikipedia, you chase down Wikipedia's source for that information and cite that. If you're using Wikipedia as a general reference, you don't need to cite that at all.



