I'd say that before Google, "AltaVista knew". In my experience, it was AltaVista that transformed the web/Internet from a collection of disparate sites to a single place that you went for answers to questions.
Before AltaVista, I used to write the URL for any useful web and ftp servers that I found in a logbook, so I could find them again. Getting an answer was a matter of looking up the logbook, deciding which website might have the required information, then typing the URL into my browser. After AltaVista the logbook fell into disuse, as locating information was a matter of typing a few well chosen keywords into AltaVista and visiting a few of its top hits.
Google took over from AltaVista, but at the time it felt like Google was more of a "better AltaVista" than something completely new.
Maybe I didn't know about AltaVista at the time. I can't remember which search engines I was using back then, but what I remember is that every single search returned thousands of porn results.
Google was a revelation: it actually found things!
And, AltaVista actually showed you a clustermap of results keywords that you could refine.
So, if you typed in "python" you would see a clustermap of results that included a bunch of herpetology in that cluster and a bunch of computer science in that cluster, etc.
One thing that this article fails to point out is how often librarians were wrong before Google. In 1986 there was a study[0] that showed that across the board reference librarians were only correct about 55% of the time.
I think most people today would consider a query answering system that had an accuracy of 55% to be an interesting curiosity, but certainly not ready for real-world application.
It's funny how frequently we measure machine learning performance of an "easy for humans" task and fail to compare it to human accuracy on the same data. We just assume humans would do perfectly on it. I'm sure there are a few MNIST digits that I would get wrong.
[0] P. Hernon, C. McClure, "Unobtrusive Reference Testing: The 55 Percent Rule," Library Journal 111, April 15, 1986
It's the fault of the pre-internet sources and government employees, not of the Reference Librarian as a profession. And this is from the US library system only.
Your "across the board" = "Government documents and central reference areas of 26 U.S. libraries", not the academic or local libraries that most of us think of when we think of librarians.
I was a Systems Librarian and contend that librarians are needed now more than ever due to the Internet, but that is a different tale. (One-third of the librarians were let go in my state during the economic downturn.)
Reference Librarians should be quoting sources and I bet you a million dollars that a Reference Librarian gives you the source and it is the source that is wrong.
But why were people asking reference librarians questions?
When I spoke to librarians in the mid-90s, they'd just point me towards sources that could have an answer. It was up to me, the reader, to ultimately find the answer.
Depends on where you were and who you were asking.
At several large research university libraries, I found the reference librarians could be downright obsessive in tracking down some questions. It would depend on workloads and other factors, I suppose.
If a search for something like "crane hook design principles" returns a page with "buy crane hook design principles for less" at the top, it is wrong. And wronger than the wrong of a librarian because it's not even trying to be correct, just thrown out there for the edge case. We just happen to write it off as "that's just Google".
Is that what it shows you? I just searched for that exact thing. The first two were ISO standards, the third wikipedia, and the fourth a paper on crane hook stress analysis.
Not as good as a directed search by a professional, sure (and I haven't actually looked at any of those papers, so who knows how relevant they actually are), but my experience seems a lot better than what you're implying.
If you followed either of the links to ISO, you would have found that they are offerings to sell the standards. The technical information is behind a paywall. That's ISO's business model. The link to Wikipedia is about cranes in general.
If a librarian said "the answer will cost you $181 (CHF 178,00)" or "here is a book about cranes" we would probably agree these are not high quality answers, even if we might disagree over their being wrong. Never mind that ISO standards are not necessarily a good set of design principles, because minimum requirements are not the same as design tradeoffs.
I don't think that's entirely fair. Surely the most useful measure of "wrong" is whether the user comes away with the correct answer?
Adverts on a webpage might definitely make it less likely that a user comes away with that answer, but I don't think they'd invalidate a user's successful result any more than gossiping with the librarian would invalidate their help if they successfully find an answer.
The advertisements are noise. SEO also produces noise. At some point the signal is lost. Google's business model is built on maximizing the amount of noise a user will tolerate, in the hope that the user stops searching and starts shopping. If you search for a book, Amazon will appear before the Wikipedia page.
Nobody contested your argument; specific knowledge of niches is still fairly scarce on the net, at least if you're looking for gratis information. OTOH a well-sorted library supplies books that cost a fortune, so Google couldn't replace that, at least not yet.
I'm a bit confused by this article because Google can't actually answer the complex questions that the librarians have received. Google's "knowledge graph" can answer simple questions, based on scraped Wikipedia data, and the core part of the site (the search engine) presents users with a semi-random set of links, basically telling them to find the answer on someone else's site. It's like a librarian who knows a little bit of info (but not quite enough) and then tells you which part of the library to go to for a book that may or may not have the answer to your question. A site like Quora is better for answering complex questions.
IMHO, the "knowledge graph" actually shows that Google is very poor at determining the implied questions people are asking, and exposes the failures of algorithmic search in general: http://newslines.org/blog/googles-black-hole/
The article doesn't even mention the Knowledge Graph. It's just referring to the fact that Google is still the first place you go to ask those questions. Usually, the answer resides somewhere in the first page of results.
I asked Google "will a poisonous snake die if it bites itself?" Results include Quora, Yahoo Answers, a Youtube video of a snake actually doing it, and HowStuffWorks. So no "Google" doesn't know, but I'm going to ask Google when I want to answer a question.
I am skeptical about the entire premise of software like Google answering questions more complicated than research for school reports. Any sufficiently interesting question can be answered better by humans. Even questions with simple answers often require more nuance to properly understand the answer. For example, if you ask for the population of a city: when was the census, and how exactly was the edge of the city defined? To make information usable you need to understand the source. Complex questions have multiple sources that may be incompatible in forming an answer to a question. Without strict standards for representing knowledge I doubt Google has sufficient data to do anything groundbreaking outside of a few small areas.
I guess the point of the article is that now you find answers to these questions _using_ Google, and not necessarily get answers from Google itself. Also, while Google Knowledge Graph cannot answer complex questions given as examples in the article, until quite recently it couldn't even answer simple questions it is able to answer now. I wonder how long will it take until it is able to answer the questions from the article.
I think Google can't answer complex queries because it cannot understand the question, because the answers to the questions are generally created by humans, and are not easily scraped from other sites. So it seems likely that Google will buy Quora, or make an equivalent.
I was channel surfing on TV yesterday and flicked onto one of the Harry Potter films, to hear Hermione saying she researched something in the library and couldn't find a single reference - my immediate thought before moving on a channel was that (assuming I don't have her magical skills), I would have no idea how to research something in a library without either the internet, or asking other people who might know where I could find the answers.
(I say no idea - obviously on a basic level it wouldn't be hard to start by finding all books on the rough topic, reading them, etc.... but to do it with any sort of efficiency as opposed to so slowly that I'd end up beating myself to death with a hardcover.)
When I was growing up in the 70s and 80s, we were taught how to use a card catalog and later, microfiche for finding stuff in the library. While I was still in university up until the earliest 90s, I was still using microfiche for research.
I was born in 1990 - I spent my childhood in libraries for fiction, and I guess occasionally for extremely basic "research" for school projects, but at an age when "research" was enough to look up a book on the topic I needed to know about, that probably was at least partially a picture book, then borrow that and go home with my mum...
Have fond memories of both card catalogs and microfiche, but only really for finding novels I wanted.
> .... but to do it with any sort of efficiency as opposed to so slowly that I'd end up beating myself to death with a hardcover.
I'd say you have the process down pat. It really was a matter of spending half a day locating a candidate set of books in the subject catalogue, writing down their details, then walking all over campus to locate the books in the various branch libraries and reading the relevant section of each to see if it was what you wanted.
The only reason people didn't beat themselves to death with a hardcover was because they didn't know any better, because there was no better. (Mind you, from a Body Mass Index perspective, the old way was better!)
⚫ A subject search in the card catalog. Find both books on the subject and related subjects. Cross-reference the books to find other possible subjects to search. Later, online catalogs (computers accessible typically only at the library) could make this far faster.
⚫ For news items, title searches in news indices. Later, Lexis-Nexis. If it was a major story and you knew the dates, looking through issues of major news publications (e.g., TIME, Newsweek, etc.) could often turn up hits.
⚫ Bibliographies. Let someone else do your research for you. Cross-comparing bibliographies often turned up the key / seminal works in an area, where you could then use ...
⚫ Citation indices. These show where a given work was referenced. So if you find a seminal work, you can now find the works (generally journal articles) which reference it.
⚫ You're also probably starting to recognize key authors at this point -- in most fields there are a limited number who really know their stuff. So you can start running author searches on their other writings.
⚫ If the library has open stacks, stack surfing can be quite interesting and serendipitous. Find the area containing books of interest and pull interesting titles. This is limited to the books which are on the shelves, so it's biased away from the more popular material. Depending on just how popular the material is, it can still be quite useful.
Though those are the "old school" methods (1960s - 1990s for the most part), they're still generally applicable today. I like to research a field for a while until I start turning up the same names repeatedly. At that point, you've got a general feel for the size of the area. It's possible to dive in and read in depth.
I also have a personal bias for older works -- I like to dig in deep. Much more recent material doesn't change the understanding all that much, and you can often find that there's been a significant degree of bias or skew in how original works are interpreted. Reading them with your own eyes can often reveal what you wouldn't otherwise know, or can give you an unbiased view (or at least your own biased view) of the original work's meaning and focus.
Many fields are highly skewed to a present / current bias. While it doesn't hurt to note where recent research has actually advanced the state of the art or changed understanding (and curiously: it's in the study of some of the oldest events, particularly of Earth's geological history and that of the Universe that we've seen stunning progress over the past 40 years), in human, political, and philosophical fields I find that at the very least having a familiarity with works going back 200-2000 years (and occasionally further) can be quite useful. There are some very persistent aspects to human behavior which remain unchanged. And yes, there are ancient beliefs which have proven themselves quite wrong.
But that itself might serve as a caution -- the same human nature that generated those ancient errors persists today.
Can you tell me the thickness of a U.S. Postage stamp with the glue on it? Answer: We couldn't tell you that answer quickly. Why don't you try the Post Office? Response: This is the Post Office. (1963)
This has a strong odor of trolling to it - one wonders how many questions directed at librarians were genuine vs. mischievous. I'm also struck by the failure to provide an answer rather than suggest a method, e.g. 'use calipers to make precise measurements of physical objects. If it is too difficult to measure a single stamp, measure a block of multiple sheets and divide your result by the number of sheets.'
The other question that jumped out as being in a trolling context like that one, though it was probably real in that timeframe, was: "And there was this typewritten note found on a cataloguing card:"
Telephone call mid-afternoon New Year's Day, 1967: Somewhat uncertain female voice: "I have two questions. The first is sort of an etiquette one. I went to a New Year's Eve party and unexpectedly stayed over. I don't really know the hosts. Ought I to send a thank-you note? Second. When you meet a fellow and you know he's worth twenty-seven million dollars — because that's what they told me, twenty-seven million, and you know his nationality, how do you find out his name?"
I do not feel even Google today could handle that.
As a librarian-programmer, I continue to find it interesting how interested HN is in libraries/librarians.
I wonder what that's about? Seriously, I'm not entirely sure and wonder. Some kind of nostalgia for the 'information technology' of pre-computing society?
> "Can you tell me the thickness of a U.S. Postage stamp with the glue on it? Answer: We couldn't tell you that answer quickly. Why don't you try the Post Office? Response: This is the Post Office. (1963)"
Haha!
An inexpensive caliper gauge, available in any specialty tool outlet, could yield an instant empirical answer to within about 10 microns.
That's how you can tell someone was trolling the library, decades later. Because the stamps came in boxes, so a box of 10K consisting of 1K sheets was 8 inches thick so 8/1000th of an inch (all made up numbers). My wife is the daughter of a postmaster and by osmosis she knows these things, life in a rural post office isn't always as exciting as a Bukowski novel so this kind of thing has been discussed. There were spools of stamps at various times so that might be a puzzling math problem.
On the table in front of me is 1000 sheets of laserprinter paper in two packages about 4 inches thick so 4/1000ths of an inch for letter size 20 pound paper.
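For what it's worth, the stack-and-divide arithmetic from the comments above can be sketched in a few lines of Python. This is only an illustration: the function name `sheet_thickness_in` is made up here, the 4-inch / 1000-sheet figures are the laser paper measurement above, and the 8-inch / 1000-sheet stamp figures are the parent's admittedly made-up numbers, not real postal data.

```python
MM_PER_INCH = 25.4  # exact by definition of the inch


def sheet_thickness_in(stack_height_in: float, sheet_count: int) -> float:
    """Estimate the thickness of one sheet by measuring a whole stack.

    Measuring a thick stack and dividing spreads the ruler's error
    across every sheet, which is why it beats measuring a single sheet.
    """
    if sheet_count <= 0:
        raise ValueError("sheet_count must be positive")
    return stack_height_in / sheet_count


# Laser printer paper from the comment above: 1000 sheets, about 4 inches.
paper_in = sheet_thickness_in(4.0, 1000)

# Hypothetical stamp box (the parent comment's made-up numbers).
stamp_in = sheet_thickness_in(8.0, 1000)

for label, inches in (("paper", paper_in), ("stamps", stamp_in)):
    print(f"{label}: {inches:.4f} in = {inches * MM_PER_INCH:.3f} mm per sheet")
```

So the paper works out to roughly a tenth of a millimetre per sheet, which is about the scale a cheap caliper can resolve directly anyway.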
You would think so, but it's surprising how consistent paper is! For instance, I've measured various papers with a caliper (pages of a book, samples of laser printer paper, ...) and it's amazing how consistent it is.
I am one of those programmers who doesn't have a computer science background. Instead I have an MLS (and an undergrad anthropology degree), and thus, though I am not a librarian myself, I have some idea of the value librarians provide.
For one thing librarians try to satisfy the information need of a patron. This is different than answering the question asked, which even now search engines can't really do well yet. Google search is powerful but you're searching an unfathomably large index and your query is a key. What librarians do is try to understand the exact thing you want to know, which you may not even know when you asked the question. Often a librarian will respond to a question with a question.
A librarian will also continue to help until the problem has been solved for the patron. They'll encourage a patron who might feel discouraged. Librarians treat people as people, not as queries.
Librarians are also staunch defenders of privacy and information freedom. In America, librarians are professionals with a code of ethics, which is codified by the American Library Association. Librarians are taught the complex issues involving privacy, the chill effect. Librarians are not in the business of collecting your personal information or selling you advertising. Librarians fight censorship, and they fight to protect your privacy.
Librarians are supremely excellent at curating. Go to a Barnes & Noble and look at the selection of children's books. Read some of them. Then go to your local library's children's section and read some of those books. It's night and day. Libraries are filled with amazingly valuable books that, especially in the case of children's books, address important issues and are beautiful and beautifully written. They're not stories about Barbie rescuing a fashion show; they're about relatable and enjoyable characters who discover the world around them and learn to understand themselves.
Having said all this there are considerable problems in the American librarian profession which turned me off to it.
Librarians were slow to react to digital content, which put them at a disadvantage on digital rights management issues. Walking into a public library can for some be like a time warp to 20 years ago, with volumes of books filling shelves everywhere. There's a reason for that, books can be lent out. By sticking to books librarians haven't been the force they should have been in fighting DRM and copyright laws that would enable access to digital information to more people. Libraries spend considerable amounts of money for highly protected digital content like JSTOR.
Librarianship as a profession is also one dominated by time in service type of seniority. I wanted to be a programmer at a library. I was a programmer at a library as an undergraduate student and there were librarians who programmed to manage the library's website and library. But coming out of MLS the unfortunate reality was that programming jobs are jobs that go to librarians with seniority. I would have had to man a reference desk for a decade. And that kind of advancement, where it isn't skill, but time spent warming a seat that gives you political power is just not for me.
Finally, it's not clear what the future of libraries will be. Fewer and fewer people are reading books. Libraries are very popular places, especially for children, but in many cases libraries are just places that have a computer or a meeting room. Public librarians don't seem to have an answer for what's to come.
I'm also a programmer with an ML(I)S, and while I do work in a library, I tend to share your negative analysis of the 'industry' and its future.
But I'm curious what year you had trouble finding a job due to seniority things. In my experience from the late 2000's to now, programming jobs in libraries are not very hard to get for interested and qualified people. If you can make a good impression, are a programmer, and actually have an interest in working in libraries (with or without an MLS), and are willing to move (there are only so many jobs and might not be one in your city), I think you could find a job in a library. (University more than public; public libraries still don't really put much resources into in-house development, which is sad, I think).
Whether you'd want to or not is another question. The pay probably won't be competitive with the private sector, and, as you know, the work environment may or may not be entirely dysfunctional.
I love the idea of libraries. I don't think actual present day U.S. libraries are succeeding at achieving their potential in a post internet world. Before the internet, libraries and librarians were pretty much the experts at 'information retrieval', and I love the idea of "civil society" non-profit agencies that are experts at collecting, finding, and preserving information -- something society could use now more than ever. I don't see actual libraries meeting the challenge though, sadly. Perhaps it's unfair to expect them to.
It's nice to see DuckDuckGo mentioned at the end as on-par with Google, Siri, and OnStar. I'm a little surprised they mentioned DDG in the first place, as it doesn't have nearly the name recognition as Bing or Yahoo search.
So wouldn't the world be a better place for everyone if all the librarians in the world posted everything to a searchable quora/stackoverflow shared knowledge base? Instead of having knowledge fragmented and buried deep in places where people would have trouble finding.