In case you were as misled by this headline as I was: this is about making the archive of "Editor & Publisher" specifically available, not about some larger archive of the contents of American newspapers.
It's an amazing resource, but it's hidden away, and researchers within Google have trouble accessing it even if they know it exists. I spent months with lawyers to get access to the original files for research and at one point they told me they were going to delete it! Whaaaaa!?!
It was something like 6 PB, which seriously, to Google isn't much, but the team that "owned" the data wasn't using it and to them it was just an expense. Ugh. People don't care about history.
Google doesn't want to make the web and the world searchable any longer, as their original values set out. Now they just want to put up a just-good-enough service to get all your private information and then aim ads at you. Gone are those heady days of some streak of altruism in their mission as a corporation.
Which is infuriating because people would happily pay even a fairly premium price I suspect for good access to that archive, but of course Google can't just sell a good product for a reasonable price, it has to be FREE******.
(Each * here representing some unknown third party getting access to your email address, phone number, blood type and sexual preferences.)
New York State funded an effort to scan, but not digitize historical newspapers, and while the microfilm is stored in the state archives, the online versions are hosted by an eccentric guy who digitizes the microfilm as a hobby. The guy puts everything online, but makes it difficult to work with in a variety of ways.
That's fantastic, though this too appears to have quite limited scope, going by the about page. Certainly a great resource though, and since I'm currently doing some genealogy research on two separate branches of my family that emigrated to the US I'll definitely use it.
etrabroline did say "in the public domain." I consider Isaac Asimov's inclusion major, for example. I've even heard of a lot of the periodicals he's listed in.
It's a century's worth of NYTimes front pages. An amazing long term dataset that is fun to flip through to answer infinite questions you might have about both how the last 100 years really went down and also about the evolution of presenting information to readers.
I highly recommend everyone go back and read the headlines (and articles) from a year ago from multiple news sources frequently. It gives great perspective on the narratives at the time with hindsight as to how those narratives changed and if the predictions made were substantiated.
Almost anything in politics is viewed completely differently after a year. Just look at any news story you followed last year. Around the election. Impeachment. EU/Brexit, etc.
Personally I had not seen anything from him stating he believed the security holes were fixed on a broad scale, or on any scale. But his recent video in November 2020 suggests he no longer had any substantial concerns regarding voting machine security: https://www.youtube.com/watch?v=cMz_sTgoydQ&t=521s
Two easy ones: COVID coverage from January to March, and Kamala Harris coverage from Democratic Debates to VP nomination.
For COVID, back then, it was "tech bros are afraid of the flu (and maybe also racist)" and then "go hug a Chinese person", and then "closing flights from Wuhan is racist (even though China itself was doing it domestically)". It was funny to see the escalating tsunami of wrongness coming at the media, but of course, they would never admit wrong-doing.
For Kamala Harris, well she was (is?) hugely unpopular. She has terrible political baggage, has been personally responsible for terrible systemic racism, and the news coverage reflected that quite accurately up to a very specific point. Tulsi torpedoed her early on in the debates by bringing all this up.
However, when Biden said his VP choice would be a woman of color, there was an immediate 180 in coverage and retroactive editing to make it look like she was an exceptional candidate who has always championed racial issues and is a regular everyday person just like all of us.
> closing flights from Wuhan is racist (even though China itself was doing it domestically)
This is a myth. Wuhan Tianhe International Airport was completely shut down. Domestic and international flights stopped at the same time.
Niall Ferguson of the Hoover Institution falsely claimed that China kept allowing international flights to take off after it stopped domestic flights, and his claim has since gone viral. Even Trump has repeated it several times on national television.
The irony is that the airport was shut down so quickly that foreign governments didn't have time to get their citizens out of Wuhan. They had to negotiate with the Chinese government to allow specially chartered evacuation flights to take off from the city. But Niall Ferguson looked at a flight tracker that showed scheduled (not actual) flights, and concluded that international flights were still taking off from Wuhan, and then wrote an Op-Ed about it. The myth has never died, despite repeated debunking.
It's fun to browse but I'm sad that the project seemed to just sputter out. At one time I thought it would grow to make historical newspapers searchable with coverage comparable to the books searchable through Google Books.
There was a recent news story about a 150-year-old town hall in New Hampshire burning down. As of the writing of the article they weren't sure if any of the old town records were destroyed, but it seemed likely. Towns should probably be digitizing all of these things now, and archive.org seems like a great place for them.
History isn't what it used to be. Recent events suggest that Internet Archive too will cease to be as soon as its content becomes politically or ideologically unpalatable. Enjoy it while you have it, don't count on it being around forever (in any useful state anyway).
Interesting question. I wonder if history will view the years prior to 2000 as mostly American since we have such a passion for digitizing everything and got on the internet early.
Wikipedia and Archive.org are doing what the internet was made for at least on the history side of things, but also culture, information and education. If you can, donate.
- 43% Direct support to websites.
Keeping the Wikimedia websites online is about more than just servers. It also includes ongoing engineering improvements, product development, design and research, and legal support.
- 32% Direct support to communities.
The Wikimedia projects exist thanks to the communities that create and maintain them. We strengthen these communities through grants, projects, trainings, tools to augment contributor capacity, and support for the legal defense of editors.
- 13% Administration and governance.
We manage funds and resources responsibly to recruit and support skilled, passionate staff who advance our communities and values.
- 12% Fundraising.
Wikimedia is sustained by donations. Millions of remarkable individuals and institutions ensure that we have the necessary resources to continue our global mission.
As for archive.org they offer basically no transparency as to which sites are excluded from their archiving (and as far as I know they will remove all the content of a site once the owner asks them to).
Those threads don't show anything interesting. Some just bash the org for having spent some of the money they receive, years and years ago, in a way someone doesn't like. This is silly.
They show something interesting, that "Hosting wikipedia accounts for roughly 2% of their total expenses", something that someone donating might not know.
I do think it’s relevant to know how funding is spent, but I’m not sure why hosting being only 2% of expenses would have you advocate AGAINST donating to Wikipedia.
Running one of the biggest technical and community concerns on the planet takes a lot more than serving content from a computer somewhere. Salaries, community management, improvements to the wiki software—these things cost money too and are no less important than responding to HTTP requests.
Community is arguably a bigger part of why Wikipedia works the way it does, vs hosting, which is a commoditized concern and only covers the bare minimum to keep the lights on.
But lighting up an empty building doesn't turn into Wikipedia.
The community rarely benefits from the actions of the organization. In fact there is kind of a hostile relationship between them. Most people contributing to wikipedia have never been to one of the parties for example.
I don't get why people are so upset that Wikipedia spends more money on personnel than on hosting.
Most internet organizations do that (free, charity or commercial). Good people are expensive. You need sysadmins, programmers, some graphics artist, probably more than a few lawyers etc.
And I wouldn't want them to use the cheapest possible people that they can find for those roles. And asking good people to work for free or cheap is just as shitty.
And a lot of free software and open-source foundations spend most of their money organizing conferences, so that people can meet in person and have presentations and working groups etc. But when Wikipedia does the same it's somehow wrong?
You cited one of the few small ways the organisation is able to give something back to Wikipedians as the problem.
Are you now pivoting to saying Wikipedia doesn't spend enough rewarding contributors, from having previously complained about having a celebration with contributors in Accra?
I strongly encourage you to go to one of the events run by the kind of organisation you're complaining about. I'm not exaggerating when I say Mozfest is life changing for many people. Open Knowledge Foundation events have been career-defining for me and many people I've mentored in the UK, Kenya and South Africa. I've never been to a Wikipedia event but I know about them and I think you're imagining something vastly different than what happens. These are events that are about building communities, networks, and opportunities for open collaboration. If you don't like how these organisations do it, sorry, because they are the ones doing it successfully.
They are volunteers. They do as much work as they want or don't want to do. Once you are the size of Wikipedia there are tasks that need to be done, and you can't really rely on volunteers to do them in their free time.
> I would avoid donating to any such organisation myself.
It's your money.
I am just saying that it's normal and expected in free and open-source communities to do that sort of thing. Organize conferences, meetups, public awareness, handle the legal stuff and hire the core staff that makes sure things are running 24/7.
This is not helpful. Wikipedia is a community project, and managing that community takes a lot of the resources, obviously.
The Internet Archive has proved itself so many times over, and is so underfunded, that it deserves all the money it can possibly raise - which will allow them to solve more of the problems they face.
I believe that it is helpful, because if I had known about that a few years back I would never have bothered donating or contributing to them.
> and managing that community takes a lot of the resources, obviously
I do not think that parties in Accra using donation funds is an integral part of "managing that community".
The Internet Archive has proved that they are not to be trusted multiple times. Such as when they decided to not be transparent, to retroactively remove content if asked by the site owner or if robots.txt started banning crawlers, when they started "lending" e-books with DRM, or when they decided to do the whole "National Emergency Library" thing and publishers sued them over it (the lawyers and possibly said publishers if they win the ruling - which is likely - will be paid via the donation money).
> I do not think that parties in Accra using donation funds is an integral part of "managing that community".
Well, that's why the opinion is unhelpful.
Wikipedia runs the largest decentralised and communal knowledge curation project in history. Making it and keeping it great takes a lot of free labour, all over the world. They have to organise communities at many many different levels, and often that involves getting a bunch of people in a room together - people who aren't getting paid - for a few days to make a ton of decisions and design and implement things. It involves recruiting new people to help and teaching them the organisational processes and skills, and nurturing new people up through the organisation, and resolving conflicts. All of that is absolutely necessary to make something like wikipedia work and barely scratches the surface.
A party for the people who worked their asses off for free is really an incredibly cheap way to reward people.
You think that it is unhelpful and this is fine, you are free to ignore my comment and donate to wikipedia if you want. My post is meant to inform people so that they can make an informed decision and not feel like they have been scammed.
As for whether parties are important for wikipedia, anyone interested in this topic can read the links that I posted earlier. There was a debate whether they are helpful or not if I remember correctly.
Wikipedia is good at running wikipedia. They produce something incredible. If you want to exercise moral control over people who do jobs you don't understand, by all means keep your money. But it's not helpful for you to encourage others to do that.
Even if they aren't, I don't see why the people doing good things in the world can't have something enjoyable.
If everyone who works for a charity has to work for at most minimum wage, then I don't think sites like Wikipedia or the Internet Archive would be as high quality as they are.
I think it's because a lot of people with limited means apparently donate to Wikimedia thinking it's a small non-profit on the verge of bankruptcy. And in no small part because they imply this every year during their fund drives.
I doubt they'd do as well fundraising if your average donor knew that Wikimedia was pulling in $100m a year, is paying its directors and above $200-400k a year, and the vast majority of their dollars are not spent on "servers and power" like their ads imply.
Personally, I'm fine with their budget, but I do wonder if their fundraisers are as ethical as they could be.
This is my main issue. The minimum wage in Greece, where I am from, is less than 10k/year. Some people I know (including me) donated to them a few years back in hopes of helping, because they make themselves seem like they are on the brink of collapse every time they fund-raise, only to learn that they give huge wages to their higher-ups and waste the donations on things irrelevant to Wikipedia.
But Wikipedia doesn't, and couldn't, run from Greece. It runs from San Francisco, because that's where they can be the most efficient at getting the most billionaires to help. In San Fran employees would probably be worrying about money at $120k, and would be homeless way before they got to $10k. Also, WMF aren't hiring people at average wage - you're comparing the average job in Greece to running one of the most important organisations in the world from San Francisco. That's not a reasonable comparison. Of course WMF has to have brilliant, motivated, connected, experienced people at the top. They pay them far below average wage for people in that market.
Honestly you should compare what WMF pays with other similar sized global organisations.
> because that's where they can be the most efficient at getting the most billionaires to help
They should choose between the donors and the billionaires then. (regardless, they are a tech company, they could employ people remotely)
This still does not address the issue of the dishonest fundraising though (pretending that they need your help despite making 1/10th of their wage and not saying up-front on what they spend the money on). I am sure that you would be upset if you gave money to a beggar that you later see getting in his Mercedes, or if you donated money to an organization for the homeless but then realized that 90% of the money goes for researching new pie recipes. It honestly feels like that.
> They should choose between the donors and the billionaires then
Why? They have absolutely no reason to do that. They have a mission, they are raising money for that mission, and doing so very successfully.
> (regardless, they are a tech company, they could employ people remotely)
Could, and they know that, and they make hiring decisions based on their expertise and knowledge, and have decided the organisation is better off hiring some people in SF.
> This still does not address the issue of the dishonest fundraising though (pretending that they need your help despite making 1/10th of their wage and not saying up-front on what they spend the money on)
I understand that you had a misunderstanding about what wikipedia is, how their community management works, and that you didn't make use of the public, publicised resources that make their spending completely transparent. That doesn't make them dishonest, it makes you lazy.
That is totally fine! But if people don’t want to donate for them, they should know so they can make that choice, and not be swindled by Wikipedia’s donation request banners that make it sound like the public good is perpetually on the verge of insolvency.
Not only that, but the execs and many others pay themselves a nice fat salary, with all the benefits and perks of course, that most small to medium enterprises would be appalled at, given the "struggling non-profit" image we have of them.
From the banners one might conclude that they are facing imminent shutdown and getting by in unheated offices.
No way I'm giving them money, especially after reading the threads here and seeing people attack h_anna for exposing the truth.
What actually happens is that the organisation needs certain skills and networks and hires the right people to bring them in. The salaries are small compared to what the people getting them could get paid at other organisations. I know good people who turned down much bigger salaries to work at WMF.
Nobody has exposed anything, these are all public facts published and actively publicised by Wikipedia itself. They literally actively try to expand participation in local groups, and small parties for the people who gave their time is a gesture of thanks to those people, not some ostentatious overspend.
Not to me - as you can see from the sentence you quoted - but for the people being hired and in context of the other organisations that are offering them jobs.
For the people running an organisation the size and importance of wikipedia? 6 figures is a laughable minimum. The legal and social responsibility, the vast and varied expertise required, the personal social networks you put on the line. We need the best people in those jobs to keep that organisation existing at all, let alone functioning at the level it has and does. Recruiting those people is competing against the rest of the world to hire them, and they are already taking a huge cut in remuneration because they want to do good.
Depends on the market rate for their skills. In this case, yes, depending on where they will be expected to live a barely six digit salary could be low.
It's interesting that one of your criticisms is that the Internet Archive is too lawsuit-averse - that it takes things down, after requests from apparent rightsholders, too quickly & opaquely. But your other criticism is that the Archive hasn't been lawsuit-averse enough - that during a once-in-a-century emergency with the nation's libraries closed, they took too much of a risk in offering extra digitized book loans, against the wishes of rightsholders.
As a former Internet Archive employee, but not speaking for them here:
That's inconsistent. If the Archive were as meek about deferring to traditional-rightsholder supremacy as you want with regard to the "Emergency Library" of digital books, the web archive might not exist at all, or could only include material with explicit prior permissions – shrinking it to a tiny fraction of its size.
When the Archive started crawling & storing websites, there was no clear legal right to do so. (The 1998 DMCA, which, if read a certain way, immunizes some such activities as "caching", wasn't even law when the IA started in 1996 - but its immunity also requires the prompt retroactive removals you find objectionable!)
There was, from the start, a colorable argument, based on 'fair use' & the historical role & respect given libraries by our law & culture, that this should be legal, and could be legal if the facts were interpreted a certain way.
But on the letter-of-the-law, there was an immense risk rightsholders could sue - like the publishers have now with regard to pandemic book-lending.
It was only by demonstrating the immense value of such an archive, & repeatedly making the case for its legitimacy via such demonstration and reasoning in courts & legislatures & culture, that the right of such an archive to exist has now been firmly established. The fact that lawyers & courts themselves have found it so essential has been part of the success. And, now that it is a familiar activity, it continues to gain reinforcement by being assumed-as-legitimate when new laws/policies are drafted, because those now avoid language that could be inadvertently interpreted to prohibit something obviously good & existing.
Still, having an automated exclusion procedure (via 'robots.txt'), and generally respecting credible rightsholder takedown requests, is essential to capping the legal risks of such a large archive.
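For anyone unfamiliar with how a robots.txt-based exclusion check works mechanically, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not the Archive's actual code: the `may_archive` helper, the sample rules, the example.com URLs and the use of `ia_archiver` as the crawler name are all assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an example site (illustration only).
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_archive(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url.

    Hypothetical helper: a real archive would also need a policy for
    snapshots captured before the rule appeared.
    """
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ia_archiver", "otherbot"):
        for path in ("/page", "/private/notes"):
            url = "https://example.com" + path
            print(agent, path, may_archive(ROBOTS_TXT, agent, url))
```

A check like this only answers "may I crawl this now?". Whether already-captured copies get hidden when a rule shows up later is a separate policy decision, which is the retroactive part being argued about in this thread.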
And the Archive's approach on book-lending, including during the "Emergency Library" program, has been broadly similar. An urgent need & technological opportunity arose before laws & explicit "ask first" processes could accommodate the situation. Culture & precedent suggested extraordinary, but temporary, adaptation could plausibly be legal & would be net-beneficial for society. (Which is better: no access to library-loaned physical books while in 'lockdown', or tracked time-limited access to temporarily-created digital copies, smaller-in-number than the count locked in closed libraries? Does technological format-shifting in response to an emergency, with minimal impact on rightsholders' revenues, fit 'fair use'?)
Still, the legal risks were capped by setting the program to be of limited duration, & having a policy of respecting any explicit book-exclusion requests from rightsholders.
If you're really so afraid of rightsholder damages judgements, sure, the Archive isn't your best donation target. (I think the risks to the Internet Archive from the latest lawsuit are limited to having a bad precedent established, not bankruptcy.) But know that every historical website you can access is there because the Archive was willing to take some risks establishing new rights/precedents, & your preferred policy of fewer-exclusions would mean yet more legal risks. And lots of people donate to good causes specifically so that they can defend themselves, legally, or push new cases, legally, for broader benefits.
Re-reading my older post I realize that I was unclear. I am bothered by the combination of lack of transparency, retroactive removal, and exclusion. I would not be bothered if they had a clear transparency policy, such as a list of sites that were retroactively removed, a list of sites that are excluded from future archival, and a warning that data has been removed (along with the reason) when you try to visit an excluded site via archive.org. Sadly for some reason this seems to not be a thing.
> Still, having an automated exclusion procedure (via 'robots.txt')
I am talking about retroactive removal, not exclusion. That being said I heard that they recently stopped doing that.
> If the Archive were as meek about deferring to traditional-rightsholder supremacy as you want with regard to the "Emergency Library" of digital books
Another organization could be created for this. Having IA do this puts the rest of the archive at risk (as well as the donations given to it).
Personally I donate to keep wikipedia independent so they don't get bought up by an entity. I don't care what agendas they have and there are no better alternatives (and never will be).
Thank you h_anna_h, I know you are getting downvoted but HN just isn't the same as it used to be. People get easily offended now and will downvote you if you don't agree with them. It's pathetic and sad.
Thanks for bringing up that thread. I had a lot of reservations about giving money away to such "non-profit" organizations before, but now I have an idea of what's going on behind the scenes.
I certainly will no longer be donating to the EFF and Wiki organizations, knowing that non-profit organizations are possibly funding parties and luxurious corporate lifestyles. I want the money I give them to stretch as much as possible. I'm not against parties either, but when you spend millions on conferences and travel (even some large for-profits don't spend this type of money) while citing your non-profit status, you are mismanaging my money and I simply won't stand for it.
My trust in local non-profit organizations was already pretty shaky, but I somehow trusted these large organizations because I thought they would be more frugal and we would have more transparency. I was clearly wrong.
You simply have no idea how the money is being spent, and it's very difficult to dictate how they should better allocate resources either.
Apparently most people think that frugality would be the default in these organizations, but once you give them a credit card, they won't think twice about overspending on stuff that has marginal benefits like luxury company cars, chartered planes, expensive dinners etc.
I will still donate to causes I believe in but I am going to now ask for receipts and will enforce strict frugality. Parties should be limited to the office with a dozen Dominos pizzas and everybody should bring their own soft drinks & cups from now on.
It's unpopular here, but that stunt really made me wonder about their management.
They opened themselves up to millions of dollars (if not billions of dollars) of liability for no good reason. (Covid doesn't give you a pass to give away someone else's property without permission.)
Similar to you, I'm not going to donate to an org that is so reckless with donations. If they would admit it was an error in judgement and come to a quick settlement I might become a donor again.
Even still, the damage has been done -- it'll take a long time for publishers to trust the Internet Archive again.