
Wikipedia doesn’t want to host journalism. It wants to host editorial content. This is an important distinction for technical reasons of editorial process: the content on Wikipedia should always be able to be recreated from the article’s sources. If the article is the canonical place where the assertion can be found, then editors have to tip-toe around editing that piece of content much more carefully, being careful to preserve it as an artifact—which makes some of Wikipedia’s regular editorial approaches (e.g. deleting “niche” articles) untenable. The proper way to contribute primary-source knowledge to Wikipedia is to create a secondary source (like an article on a third-party website) containing quotes of your primary-source statements. Then edit Wikipedia and cite that third-party source.

If you think this won’t work: how do you think the Wikipedia articles for companies get updated to reflect news about them? Almost always, the company’s PR team writes an article, puts it on the PR news-wire, and then goes and edits the relevant Wiki article to reflect the new information, citing the PR news-wire article.

The key reason this is allowed is that anyone could have made those same edits (and probably would have—the PR team isn’t changing the outcome, just expediting it.) The edits aren’t accepted by an argument to authority of the editor, but by an argument to consensus-acceptance of the secondary source as reflective of reality.

By analogy: it’s untenable to put code (e.g. a usage example) in a Git commit message. The commit messages are on a layer above the code; they refer to the code, but they can’t refer to themselves. Nobody can edit the code in somebody else’s commit message to fix it if it’s wrong. It’s much better to have the code committed as code (or docs, tests, etc.) in the repo, because then you can fix it, change it, talk about it, whatever.

The code is the primary source; the commits are the secondary sources, citing the primary sources; and the commit messages are an editorial reflection of the secondary sources. (This is especially true in LKML-style commits where submitters squash their commits into patches, dropping their messages; and then maintainers—essentially editors—write the final commit message.)

If you want to make a new commit message in the commit log, it has to serve as a description of a new commit, which in turn has to package a change in the code. If you want to make a new paragraph in a Wikipedia article, it has to serve as a description of a citation, which has to package new primary-source information. But solving both problems is easy: you write the code/third-party article, and then commit/cite it.




> the content on Wikipedia should always be able to be recreated from the article’s sources

If Wikipedia only wants content from published sources, then it's going to miss the 99% of history that's not reported on a newspaper web site.


Nobody said anything about "published" sources. Anything that is not Wikipedia (or another purely-editorial tertiary source like a rival encyclopedia, or a survey-level textbook) qualifies as a secondary source.

Assertions about history before the Internet—before writing, even—can be cited just like anything else. The "99% of history" that we currently know about, we know about because some historian or paleontologist or anthropologist dredged up primary-source data, put it together, made sense of it, and wrote down their findings in a journal paper (i.e. a secondary source.) Wikipedia, like all encyclopedias, cites those secondary sources.

You have primary-source knowledge of your own? Write a blog post about it. Just like a historian writing a journal paper, that blog post is now a secondary source that quotes your primary-source knowledge (ETA: or maybe the blog post is even a primary source itself, depending on how fresh your knowledge was and how unbiased your reporting of it was.) Wikipedia can now cite your blog post. Wikipedia cites plenty of blog posts.

Alternatively, if you're doing journalism—going out and talking to primary sources—then the Wikimedia group of sites has a place that will host any secondary-source artifact you create from that: Wikinews. You can write a Wikinews article, and then cite it on Wikipedia. Wikipedia cites plenty of Wikinews articles.

Let me make another analogy: if you compare the Wikimedia Foundation as a whole to, say, a newspaper publisher, then Wikipedia is specifically the editorials section of that newspaper. Journalism is most of a newspaper, but the one place it doesn't belong is the editorials section. Journalism belongs in the "news" part of the paper.

Wikipedia editors know that people can wear many hats, and there's nothing wrong with being both an editor and a journalist. The whole distinction they're making, is that Wikipedia doesn't want people editing articles with their primary-source hats on, or with their journalist hats on. Wikipedia wants people editing articles purely with their editor hats on. If you're a primary source, or a journalist, you do that stuff outside of Wikipedia proper. Then you turn around, take off those hats, put on your editor hat, and edit Wikipedia to refer to what primary-source-you or journalist-you just created. (Or, if you're not an editor by nature but just want your content cited, then you should just be a primary source and/or journalist—someone who writes stuff down somewhere it can be cited—and leave editing Wikipedia to people who like editing. Maybe make friends with some Wikipedia editors, and send them links to your new stuff when you produce it. If it's worth including, they'll cite it and write about it! This is literally exactly the same as having a relationship with the editors of a newspaper/magazine/regular encyclopedia. Just because you/anyone can be a Wikipedia editor, doesn't mean that you have to; and it doesn't mean that being a (Wikipedia) editor is not a specialized job that is some people's comparative advantage, while your comparative advantage might lie elsewhere—like in being a personal historian.)

I want to call out a specific example of blog posts being cited as secondary sources, to make this clear. Microsoft's Raymond Chen is a blogger (https://devblogs.microsoft.com/oldnewthing/). He writes a lot about the path-dependencies in Microsoft products that have led to them being the way they are today. Wikipedia cites these posts all the time. Even though he's writing about things he did himself. You are allowed to be a primary-source man-on-the-spot recollecting history; and the secondary-source historian "interviewing" the primary source; and even the tertiary-source editor citing the secondary-source blog-post "interview" artifact. (I don't think Chen bothers to edit Wikipedia to cite his posts, but there's no reason he couldn't.)


> You have primary-source knowledge of your own? Write a blog post about it. Just like a historian writing a journal paper, that blog post is now a secondary source that quotes your primary-source knowledge. Wikipedia can now cite your blog post. Wikipedia cites plenty of blog posts.

Per Wikipedia editorial guidelines, that is still a primary source:

https://en.wikipedia.org/wiki/Wikipedia:No_original_research...

> An account of a traffic incident written by a witness is a primary source of information about the event; similarly, a scientific paper documenting a new experiment conducted by the author is a primary source on the outcome of that experiment. Historical documents such as diaries are primary sources.


Yeah, sure, Wikipedia has one (useful, practical) definition of "primary source" and "secondary source", and they're allowed to define those terms however they like.

Usage of the terms outside Wikipedia, in the greater scope of historiography, makes finer distinctions. One example definition (from https://www.lib.uci.edu/what-are-primary-sources):

> Primary sources are documents, images or artifacts that provide firsthand testimony or direct evidence concerning an historical topic under research investigation. Primary sources are original documents created or experienced contemporaneously with the event being researched. Primary sources enable researchers to get as close as possible to what actually happened during an historical event or time period. A secondary source is a work that interprets or analyzes an historical event or period after the event has occurred and, generally speaking, with the use of primary sources. The same document, or other piece of evidence, may be a primary source in one investigation and secondary in another. The search for primary sources does not, therefore, automatically include or exclude any format of research materials or type of records, documents, or publications.

Anthropology and paleontology have very clear "primary sources", because they deal in hard artifacts. Those artifacts are primary sources. The things you write down about those artifacts are secondary sources. This distinction is important because different researchers might interpret the same artifact in different ways. But, if you know that there's a primary-source artifact preserved somewhere, you can always go back to it and study it yourself, rather than taking any particular researcher's word for it on what it is, or what it means.

History, on the other hand, deals in documents, received oral traditions, etc. In these cases, the "primary sources" serving as inputs to a historian's work are often things that would have been considered "secondary sources" or even "tertiary sources" at their time of creation. For example, a centuries-old medicinal textbook. A historian can cite this document as a "primary source" for what kinds of medicine people at the time believed in. But, of course, despite being a "primary source" in the sense of being a real document from the period, it's not a "primary source" in the sense of reliably giving you hard data about what people at the time actually did. Every word written in the document was, at the time, an interpretation that went through an editor. They might have introduced all manner of bias.

Likewise, in modern writing, if you are a sane adult human being, you are usually considered to be creating "primary source" documents if you write down/are interviewed about your experiences of things as they happen to you. But—despite being the person that did these things!—if you are recounting your experiences long after the fact, your recollection would usually be considered a "secondary source." Per the definition above:

> Primary sources are original documents created or experienced contemporaneously with the event being researched.

> A secondary source is a work that interprets or analyzes an historical event or period after the event has occurred...

That second assertion still holds, even if the same person that is doing the "interpretation or analysis" took part in the event.

Wikipedia might consider e.g. someone's written reflection on what their childhood was like, or a veteran's recounting of a battle long after the war has ended, to be a "primary source", and it's Wikipedia's prerogative to use the term however best suits it. But most of academia would disagree.

And, practically, if you have your own primary-source hat or journalist hat on, you should use the greater academic definitions—because Wikipedia might not always draw these particular distinctions; because you might be submitting your work to more editorial teams than just Wikipedia's; and because it's best to be pessimistic in how authoritative a given editor will judge a particular work of yours to be. If you obey all the rules required to get your work cited as a secondary source, you won't need to worry about whether it qualifies as a primary source.


Usage of the term outside Wikipedia is irrelevant to editing articles on Wikipedia. The person you responded to was talking specifically about having Wikipedia edits reverted for using primary sources.

Though on rereading the post I wonder if the issue is more of original research than primary source. The line between them is a bit blurry.


Great, and even if you are the secondary source as recommended they will just delete your content (as you noted) for niche articles.

I just don't understand why anyone would want to contribute to Wikipedia given how much annoyance you have to deal with to help.


You're still coming at it with the mindset of a contributing editor. Most people who edit Wikipedia are not contributing editors; they're just plain editors. They fix spelling mistakes and re-arrange paragraphs to make the text flow better, for fun. They could be (and many are!) professional editors for publications. Editing Wikipedia is good practice for being a professional editor.

People who come to Wikipedia wanting to make some text be in the article are going to come away upset. People who come to Wikipedia wanting to make the article the best article it can be—and in the process, discover some citable sources and decide that the article would be improved by a new sentence containing a gloss of what one of those sources says—are going to enjoy their time. You don't edit Wikipedia to get new information into the encyclopedia. You edit Wikipedia to improve the quality of the encyclopedia-as-encyclopedia. (Maybe, sometimes, by improving articles into existence. Maybe, other times, by improving articles out of existence. Maybe by making the article longer; maybe by making it shorter. One of these states† is optimal for the people looking for an encyclopedia article on X. The editors want the article to be in that state.)

If you look at it less as a content-aggregation activity (like submitting and titling Reddit posts) and more as the activity of a group of cloistered monks excited to work together to make the best darn illuminated manuscript they can, I feel like the Wikipedia community and its foibles make perfect sense.

† And yes, that means that sometimes readers will come away empty-handed for their query. Often that's for the best, especially if an existing Wikipedia article on X would just bump down a much better Google search-result for X (say, that of a niche-content encyclopedia) to #2, such that fewer people are using that excellent resource. There is a reason Wikipedia doesn't have articles for every existing Pokemon, and that reason is that Bulbapedia exists; Wikipedia doesn't want to try to pretend it can beat Bulbapedia at being Bulbapedia. People compare (an offline copy of) Wikipedia to the eponymous Hitchhiker's Guide to the Galaxy, but it's not; a true HHGTTG would look like a copy of Wikipedia stapled together with copies of all the niche encyclopedias it intentionally defers to.


>You don't edit Wikipedia to get new information into the encyclopedia.

Uh, how does new information get in the encyclopedia then?


You don't edit Wikipedia with the intent of putting new information in there. The culture of Wikipedia will notice you as a foreign presence and push you out. You edit Wikipedia with the intent of improving Wikipedia. When you do that, some of your edits will add new information to Wikipedia as a means of improving Wikipedia. But that addition will have been a tactic for satisfying your terminal goal of improving the encyclopedia, not a direct attempt at satisfying your terminal goal of having this information be in the encyclopedia. The community can tell the difference.

By analogy: a small farming village has a commons. Someone who grazes their cows on the commons, then sells the milk and meat to the townsfolk, is just participating in the economy, and the townsfolk are fine with that. Someone who herds their cows in from the next town over, grazes the cows on the commons, and then herds the cows back home—and never otherwise interacts with the community? Not okay. That's sociopathic behavior. They're not a member of the community; they're just taking advantage of it for their personal benefit.

Wikipedia's editors are smart enough to recognize what someone looks like when they're just trying to take advantage of Wikipedia for their own (PR) benefit. The edits might be the same, either way (just as both farmers above graze their cows in the commons in the same way, either way); it's the context that determines whether the edits are acceptable.


WP:V specifically cites press releases as self-published sources, which aren't verifiable.


True. However, there's a duality to the citation of press releases.

There's the skeezy usage of them—citing the marketing claims as truth. (I totally forgot about the possibility of this usage, honestly.) Yes, nobody will let you get away with that.

But there's also the more literal way to cite a press release: as a primary-source claim by the company about what the company is doing or planning to do, like the group equivalent of a diary.

Obviously unacceptable: "Apple's new iPhone is the best ever!![citation to PR fluff-piece]"

Acceptable, I think: "Apple issued[https://www.apple.com/ca/newsroom/2017/06/imac-pro-most-powe...] an interim press release on June 5th, 2017, claiming that they were working toward the release of a new "Pro" iMac model, as a stopgap for professional users who are waiting for the next generation of the "Mac Pro" product line, which they mentioned as having been delayed."

It's a use/mention distinction thing. It's not okay to report the press release's claims at face-value, but it's okay to treat the press release as a primary source of information on the corporation's intentions, beliefs, claims, and assertions. It'd be similar to, say, citing the primary source of letters patent as a source of information on a head-of-state's intentions, beliefs, claims, and assertions.

Or, to put that another way: any claims a PR piece might make are unverifiable, but the mention of the claim is itself verifiable—you can verify that the claim is right there in the PR piece, and you can verify that the PR department of the relevant company really did publish it, and thereby you can verify that the company is in fact making that claim... which can be an important thing to have in an article about them all on its own.



