Before Google, Who Knew?

femto · on Dec 22, 2014

I'd say that before Google, "AltaVista knew". In my experience, it was AltaVista that transformed the web/Internet from a collection of disparate sites to a single place that you went for answers to questions.

Before AltaVista, I used to write the URL for any useful web and ftp servers that I found in a logbook, so I could find them again. Getting an answer was a matter of looking up the logbook, deciding which website might have the required information, then typing the URL into my browser. After AltaVista the logbook fell into disuse, as locating information was a matter of typing a few well chosen keywords into AltaVista and visiting a few of its top hits.

Google took over from AltaVista, but at the time it felt like Google was a "better AltaVista" rather than something completely new.

jmnicolas · on Dec 22, 2014

Maybe I didn't know about Alta Vista at the time, I can't remember the search engines I was using back then, but what I remember is that every single search was returning thousands of porn results.

Google was a revelation, it actually found things!

bsder · on Dec 22, 2014

So did AltaVista.

And, AltaVista actually showed you a clustermap of results keywords that you could refine.

So, if you typed in "python" you would see a clustermap of results that included a bunch of herpetology in that cluster and a bunch of computer science in that cluster, etc.

I still miss that.

antman · on Dec 22, 2014

Here you are: http://search.carrotsearch.com/carrot2-webapp/search?source=...

bsder · on Dec 22, 2014

You might not want to spread that one around yet. 3 links about Burmese pythons hardly qualifies as a cloud around herpetology.

I think they should avoid using this as their tech demo.

Not being as capable as a 15 year old search engine running on significantly less powerful hardware is kind of embarrassing.

musername · on Dec 23, 2014

The technical capability can't be inferred from the used capacity. The comparison is unjust.

saraid216 · on Dec 22, 2014

I used Dogpile, which searched about 5 engines and gave you the results for all of them.

Homunculiheaded · on Dec 21, 2014

One thing that this article fails to point out is how often librarians were wrong before Google. In 1986 there was a study[0] that showed that across the board reference librarians were only correct about 55% of the time.

I think most people today would consider a query answering system that had an accuracy of 55% to be an interesting curiosity, but certainly not ready for real-world application.

It's funny how frequently we measure machine learning performance of an "easy for humans" task and fail to compare it to human accuracy on the same data. We just assume humans would do perfectly on it. I'm sure there are a few MNIST digits that I would get wrong.

[0] P.Hernon, C McClure "Unobtrusive Reference Testing: The 55 Percent Rule," Library Journal, 111 April 15,1986

baldfat · on Dec 22, 2014

It's the pre-internet sources' and Government employees' fault and not the Reference Librarian as a profession. This is out of the US Library system only.

Your "Across the Board" = "Government documents and central reference areas of 26 U.S. libraries" Not your academic or local libraries that most of us think of as Librarians.

I was a Systems Librarian and contend that Librarians are needed more now than ever due to the Internet, but that is a different tale. (One-third of the Librarians were let go in my state during the economic downturn.)

Reference Librarians should be quoting sources and I bet you a million dollars that a Reference Librarian gives you the source and it is the source that is wrong.

saraid216 · on Dec 22, 2014

Suggesting that reference librarianship is "easy for humans" is sort of... ridiculously wrong. There's a reason it requires a master's degree.

greggarious · on Dec 22, 2014

But why were people asking reference librarians questions?

When I spoke to librarians in the mid-90s, they'd just point me towards sources that could have an answer. It was up to me, the reader, to ultimately find the answer.

dredmorbius · on Dec 23, 2014

Depends on where you were and who you were asking.

At several large research university libraries, I found the reference librarians could be downright obsessive in tracking down some questions. It would depend on workloads and other factors, I suppose.

greggarious · on Dec 25, 2014

This was mostly for pleasure or K-12 assignments. The sort of questions that were very fact based.

"Do snakes poop?"

"How far away is Mars?"

That sort of thing.

dredmorbius · on Dec 26, 2014

I was specifically thinking of uni-level questions.

greggarious · on Jan 3, 2015

At the uni level, aren't they more supposed to help you find the appropriate database rather than find the answer for you?

I might ask a librarian how to access back copies of a journal, but I wouldn't expect him or her to do my lit review for me...

crististm · on Dec 22, 2014

You can get pretty far with being right only 51% of the time. With 55% you have an edge.

brudgers · on Dec 22, 2014

How often are less than 55% of the results of a Google search relevant? Count ads.

swores · on Dec 22, 2014

That's not comparable at all. 55% of answers were correct, not 50% of the possible material in the library was relevant.

brudgers · on Dec 22, 2014

If a search for something like "crane hook design principles" returns a page with "buy crane hook design principles for less" at the top, it is wrong. And wronger than the wrong of a librarian because it's not even trying to be correct, just thrown out there for the edge case. We just happen to write it off as "that's just Google".

ricree · on Dec 22, 2014

Is that what it shows you? I just searched for that exact thing. The first two were ISO standards, the third wikipedia, and the fourth a paper on crane hook stress analysis.

Not as good as a directed search by a professional, sure (and I haven't actually looked at any of those papers, so who knows how relevant they actually are), but my experience seems a lot better than what you're implying.

brudgers · on Dec 22, 2014

If you followed either of the links to ISO, you would have found that they are offerings to sell the standards. The technical information is behind a paywall. That's ISO's business model. The link to Wikipedia is about cranes in general.

If a librarian said "the answer will cost you $181 (CHF 178,00)" or "here is a book about cranes" we would probably agree these are not high quality answers, even if we might disagree over their being wrong. Never mind that ISO standards are not necessarily a good set of design principles because minimum requirements are not the same as design tradeoffs.

estel · on Dec 22, 2014

I don't think that's entirely fair. Surely the most useful measure of "wrong" is whether the user comes away with the correct answer?

Adverts on a webpage might definitely make it less likely that a user comes away with that answer, but I don't think they'd invalidate a user's successful result any more than gossiping with the librarian would invalidate their help if they successfully find an answer.

brudgers · on Dec 22, 2014

The advertisements are noise. SEO also produces noise. At some point the signal is lost. Google's business model is built on maximizing the amount of noise a user will tolerate, in the hope that the user stops searching and starts shopping. If you search for a book, Amazon will appear before the Wikipedia page.

musername · on Dec 23, 2014

Nobody contested your argument; specific knowledge of niches is still fairly scarce on the net, at least if looking for gratis information. OTOH a well sorted library supplies books that cost a fortune, so Google couldn't replace that, at least not yet.

sparkzilla · on Dec 21, 2014

I'm a bit confused by this article because Google can't actually answer the complex questions that the librarians have received. Google's "knowledge graph" can answer simple questions, based on scraped Wikipedia data, and the core part of the site (the search engine) presents users with a semi-random set of links, basically telling them to find the answer on someone else's site. It's like a librarian that knows a little bit of info (but not quite enough) who then tells you which part of the library to go to find a book that may or may not have the answer to your question. A site like Quora is better for answering complex questions.

IMHO, the "knowledge graph" actually shows that Google is very poor at determining the implied questions people are asking, and exposes the failures of algorithmic search in general: http://newslines.org/blog/googles-black-hole/

jyrkesh · on Dec 21, 2014

The article doesn't even mention the Knowledge Graph. It's just referring to the fact that Google is still the first place you go to ask those questions. Usually, the answer resides somewhere in the first page of results.

I asked Google "will a poisonous snake die if it bites itself?" Results include Quora, Yahoo Answers, a YouTube video of a snake actually doing it, and HowStuffWorks. So no, "Google" doesn't know, but I'm going to ask Google when I want to answer a question.

sparkzilla · on Dec 21, 2014

>Usually, the answer resides somewhere in the first page of results.

And that's the business opportunity.

7952 · on Dec 22, 2014

I am skeptical about the entire premise of software like Google answering questions more complicated than research for school reports. Any sufficiently interesting question can be answered better by humans. Questions with simple answers often require more nuance to properly understand the answer. For example, suppose you ask for the population of a city. When was the census, and how exactly was the edge of the city defined? To make information usable you need to understand the source. Complex questions have multiple sources that may be incompatible in forming an answer to a question. Without strict standards for representing knowledge I doubt Google has sufficient data to do anything groundbreaking outside of a few small areas.

xyzzyz · on Dec 21, 2014

I guess the point of the article is that now you find answers to these questions _using_ Google, and not necessarily get answers from Google itself. Also, while Google Knowledge Graph cannot answer complex questions given as examples in the article, until quite recently it couldn't even answer simple questions it is able to answer now. I wonder how long will it take until it is able to answer the questions from the article.

sparkzilla · on Dec 21, 2014

I think Google can't answer complex queries because it cannot understand the question, because the answers to the questions are generally created by humans, and are not easily scraped from other sites. So it seems likely that Google will buy Quora, or make an equivalent.

bhaumik · on Dec 21, 2014

I'd attempt to acquire/"make" a Wolfram clone before looking into Quora.

http://www.wolframalpha.com/examples/

minthd · on Dec 21, 2014

While quora sometimes gives good answers, often you get no answers at all.

And don't forget the time issue - quora is quite slow.

swores · on Dec 21, 2014

I was channel surfing on TV yesterday and flicked onto one of the Harry Potter films, to hear Hermione saying she researched something in the library and couldn't find a single reference - my immediate thought before moving on a channel was that (assuming I don't have her magical skills), I would have no idea how to research something in a library without either the internet, or asking other people who might know where I could find the answers.

(I say no idea - obviously on a basic level it wouldn't be hard to start by finding all books on the rough topic, reading them, etc.... but to do it with any sort of efficiency as opposed to so slowly that I'd end up beating myself to death with a hardcover.)

slantyyz · on Dec 21, 2014

That's a fascinating perspective.

When I was growing up in the 70s and 80s, we were taught how to use a card catalog and later, microfiche for finding stuff in the library. While I was still in university up until the earliest 90s, I was still using microfiche for research.

swores · on Dec 21, 2014

I was born in 1990 - I spent my childhood in libraries for fiction, and I guess occasionally for extremely basic "research" for school projects, but at an age when "research" was enough to look up a book on the topic I needed to know about, that probably was at least partially a picture book, then borrow that and go home with my mum...

Have fond memories of both card catalogs and microfiche, but only really for finding novels I wanted.

femto · on Dec 22, 2014

> .... but to do it with any sort of efficiency as opposed to so slowly that I'd end up beating myself to death with a hardcover.

I'd say you have the process down pat. It really was a matter of spending half a day locating a candidate set of books in the subject catalogue, writing down their details, then walking all over campus to locate the books in the various branch libraries and reading the relevant section of each to see if it was what you wanted.

The only reason people didn't beat themselves to death with a hardcover was because they didn't know any better, because there was no better. (Mind you, from a Body Mass Index perspective, the old way was better!)

digi_owl · on Dec 21, 2014

Funny, that's pretty much how i do it online as well. Read article after article until i have a reasonable grasp of the topic i was looking into.

dredmorbius · on Dec 23, 2014

Methods depend on the topic, but frequently:

⚫ A subject search in the card catalog. Find both books on the subject and related subjects. Cross-reference the books to find other possible subjects to search. Later, online catalogs (computers accessible typically only at the library) could make this far faster.

⚫ For news items, title searches in news indices. Later, Lexis-Nexis. If it was a major story and you knew the dates, looking through issues of major news publications (e.g., TIME, Newsweek, etc.) could often turn up hits.

⚫ Bibliographies. Let someone else do your research for you. Cross-comparing bibliographies often turned up the key / seminal works in an area, where you could then use ...

⚫ Citation indices. These show where a given work was referenced. So if you find a seminal work, you now find the works (generally journal articles) which reference it.

⚫ You're also probably starting to recognize key authors at this point -- in most fields there are a limited number who really know their stuff. So you can start running author searches on their other writings.

⚫ If the library has open stacks, stack surfing can be quite interesting and serendipitous. Find the area containing books of interest and pull interesting titles. This is limited to the books which are on the shelves, so it's biased away from the more popular material. Depending on just how popular the material is, it can still be quite useful.
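The bibliography and citation-index steps above can be sketched in a few lines of code. This is only a rough illustration, not how any real index was built: the titles are made up, and `build_citation_index` and `seminal_candidates` are hypothetical helper names. It assumes you have already copied out each book's bibliography by hand.

```python
from collections import Counter, defaultdict


def build_citation_index(bibliographies):
    """Invert {work: [works it cites]} into {cited work: [works citing it]}."""
    index = defaultdict(set)
    for work, cited_works in bibliographies.items():
        for cited in cited_works:
            index[cited].add(work)
    # Sort the citing works so the output is stable and easy to read.
    return {cited: sorted(citing) for cited, citing in index.items()}


def seminal_candidates(bibliographies, min_citations=2):
    """Cross-compare bibliographies: list works cited by several of them."""
    counts = Counter(
        cited
        for cited_works in bibliographies.values()
        for cited in set(cited_works)  # count each bibliography only once
    )
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [title for title, n in ranked if n >= min_citations]


# Made-up example data: three books and the works each one cites.
bibliographies = {
    "Book A": ["Classic X", "Paper Y"],
    "Book B": ["Classic X", "Paper Z"],
    "Book C": ["Classic X", "Paper Y"],
}

citation_index = build_citation_index(bibliographies)
candidates = seminal_candidates(bibliographies)
```

Here `candidates` ranks the works that keep turning up across bibliographies, and `citation_index` answers the reverse question of who cites a given work.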

Though those are the "old school" methods (1960s - 1990s for the most part), they're still generally applicable today. I like to research a field for a while until I start turning up the same names repeatedly. At that point, you've got a general feel for the size of the area. It's possible to dive in and read in depth.

I also have a personal bias for older works -- I like to dig in deep. Much more recent material doesn't change the understanding all that much, and you can often find that there's been a significant degree of bias or skew in how original works are interpreted. Reading them with your own eyes can often reveal what you wouldn't otherwise know, or can give you an unbiased (or at least your own biased) view of the original work's meaning and focus.

Many fields are highly skewed to a present / current bias. While it doesn't hurt to note where recent research has actually advanced the state of the art or changed understanding (and curiously: it's in the study of some of the oldest events, particularly of Earth's geological history and that of the Universe that we've seen stunning progress over the past 40 years), in human, political, and philosophical fields I find that at the very least having a familiarity with works going back 200-2000 years (and occasionally further) can be quite useful. There are some very persistent aspects to human behavior which remain unchanged. And yes, there are ancient beliefs which have proven themselves quite wrong.

But that itself might serve as a caution -- the same human nature that generated those ancient errors persists today.

anigbrowl · on Dec 22, 2014

Can you tell me the thickness of a U.S. Postage stamp with the glue on it? Answer: We couldn't tell you that answer quickly. Why don't you try the Post Office? Response: This is the Post Office. (1963)

This has a strong odor of trolling to it - one wonders how many questions directed at librarians were genuine vs. mischievous. I'm also struck by the failure to provide an answer rather than suggest a method, e.g. 'use calipers to make precise measurements of physical objects. If it is too difficult to measure a single stamp, measure a block of multiple sheets and divide your result by the number of sheets.'

Zenst · on Dec 22, 2014

The other question that jumped out as possibly trolling, like that one, though it was probably real in that timeframe, was "And there was this typewritten note found on a cataloguing card:"

Telephone call mid-afternoon New Year's Day, 1967: Somewhat uncertain female voice: "I have two questions. The first is sort of an etiquette one. I went to a New Year's Eve party and unexpectedly stayed over. I don't really know the hosts. Ought I to send a thank-you note? Second. When you meet a fellow and you know he's worth twenty-seven million dollars — because that's what they told me, twenty-seven million, and you know his nationality, how do you find out his name?"

I do not feel even Google today could handle that.

greggarious · on Dec 22, 2014

Sounds like a question for Dan Savage.

jrochkind1 · on Dec 22, 2014

As a librarian-programmer, I continue to find it interesting how interested HN is in libraries/librarians.

I wonder what that's about? Seriously, I'm not entirely sure and wonder. Some kind of nostalgia for the 'information technology' of pre-computing society?

(Personally, while I still work in libraries, I share btipling's [https://news.ycombinator.com/item?id=8781169] general lack of optimism about the future of libraries, alas.)

kazinator · on Dec 21, 2014

> "Can you tell me the thickness of a U.S. Postage stamp with the glue on it? Answer: We couldn't tell you that answer quickly. Why don't you try the Post Office? Response: This is the Post Office. (1963)"

Haha!

An inexpensive caliper gauge, available in any specialty tool outlet, could yield an instant empirical answer to within about 10 microns.

irremediable · on Dec 21, 2014

I suspect the inter-stamp variability might be a little bigger than that.

VLM · on Dec 21, 2014

That's how you can tell someone was trolling the library, decades later. Because the stamps came in boxes, so a box of 10K consisting of 1K sheets was 8 inches thick so 8/1000th of an inch (all made up numbers). My wife is the daughter of a postmaster and by osmosis she knows these things, life in a rural post office isn't always as exciting as a Bukowski novel so this kind of thing has been discussed. There were spools of stamps at various times so that might be a puzzling math problem.

On the table in front of me is 1000 sheets of laserprinter paper in two packages about 4 inches thick so 4/1000ths of an inch for letter size 20 pound paper.
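The arithmetic here is just stack height divided by sheet count. A minimal sketch, using the figures quoted in this comment (the stamp numbers are, as stated, made up); the function names are my own:

```python
def per_sheet_thickness(stack_thickness_in, sheet_count):
    """Thickness of one sheet, in inches, from a measured stack."""
    if sheet_count <= 0:
        raise ValueError("sheet_count must be positive")
    return stack_thickness_in / sheet_count


def inches_to_microns(inches):
    """Convert inches to microns (1 inch = 25,400 microns exactly)."""
    return inches * 25_400


# 1000 sheets of 20 pound letter paper, about 4 inches thick.
paper_in = per_sheet_thickness(4.0, 1000)
# The made-up stamp box: 1000 sheets, 8 inches thick.
stamp_in = per_sheet_thickness(8.0, 1000)

print(f"paper: {paper_in:.4f} in, about {inches_to_microns(paper_in):.0f} microns")
print(f"stamps: {stamp_in:.4f} in, about {inches_to_microns(stamp_in):.0f} microns")
```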

anigbrowl · on Dec 22, 2014

Ha, I came to the same conclusion about it being a prank before spotting your comment.

kazinator · on Dec 22, 2014

You would think so, but it's surprising how consistent paper is! For instance, I've measured various papers with a caliper (pages of a book, samples of laser printer paper, ...) and it's amazing how consistent it is.

irremediable · on Dec 22, 2014

Fair enough! I stand (partially?) corrected.

cmiller1 · on Dec 22, 2014

So measure a stack of 20 and divide for an average. You could even measure them individually then to estimate the variability.
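A quick sketch of both ideas, assuming a caliper that reads in millimetres. The readings below are invented for illustration, not real stamp measurements, and the function names are hypothetical:

```python
import statistics


def average_from_stack(stack_thickness_mm, count):
    """Average thickness per stamp from one measurement of a stack."""
    if count <= 0:
        raise ValueError("count must be positive")
    return stack_thickness_mm / count


def summarize_individual(readings_mm):
    """Mean and sample standard deviation of individual measurements."""
    mean = statistics.mean(readings_mm)
    spread = statistics.stdev(readings_mm)  # needs at least two readings
    return mean, spread


# Invented numbers: a stack of 20 stamps measuring 2.0 mm in total,
# plus five stamps measured one at a time.
stack_average = average_from_stack(2.0, 20)
mean, spread = summarize_individual([0.10, 0.11, 0.09, 0.10, 0.10])

print(f"stack average: {stack_average:.3f} mm")
print(f"individual mean: {mean:.3f} mm, spread: {spread:.4f} mm")
```

The stack measurement averages out caliper error; the individual readings are what tell you how much the stamps differ from one another.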

grogenaut · on Dec 22, 2014

Haha, back in 1963, a caliper was NOT inexpensive like you can get at harbor freight these days.

grogenaut · on Dec 22, 2014

Though a lot more people likely had them just as a lot more people had many tools. (all anecdotal)

btipling · on Dec 21, 2014

I am one of those programmers who doesn't have a computer science background. Instead I have an MLS (and an undergrad anthropology degree) and thus, though I am not a librarian myself, have some idea of the value librarians provide.

For one thing librarians try to satisfy the information need of a patron. This is different than answering the question asked, which even now search engines can't really do well yet. Google search is powerful but you're searching an unfathomably large index and your query is a key. What librarians do is try to understand the exact thing you want to know, which you may not even know when you asked the question. Often a librarian will respond to a question with a question.

A librarian will also continue to help until the problem has been solved for the patron. They'll encourage a patron who might feel discouraged. Librarians treat people as people, not as queries.

Librarians are also staunch defenders of privacy and information freedom. In America, librarians are professionals with a code of ethics, which is codified by the American Library Association. Librarians are taught the complex issues involving privacy, the chill effect. Librarians are not in the business of collecting your personal information or selling you advertising. Librarians fight censorship, and they fight to protect your privacy.

Librarians are supremely excellent at curating. Go to a Barnes & Noble and look at the selection of children's books. Read some of them. Then go to your local library's children's section and read some of those books. It's night and day. Libraries are filled with amazingly valuable books that, especially in the case of children's books, address important issues, are beautiful and beautifully written. They're not stories about Barbie rescuing a fashion show, they're about relatable and enjoyable characters that discover the world around them, learning to understand themselves.

Having said all this there are considerable problems in the American librarian profession which turned me off to it.

Librarians were slow to react to digital content, which put them at a disadvantage on digital rights management issues. Walking into a public library can for some be like a time warp to 20 years ago, with volumes of books filling shelves everywhere. There's a reason for that, books can be lent out. By sticking to books librarians haven't been the force they should have been in fighting DRM and copyright laws that would enable access to digital information to more people. Libraries spend considerable amounts of money for highly protected digital content like JSTOR.

Librarianship as a profession is also one dominated by time in service type of seniority. I wanted to be a programmer at a library. I was a programmer at a library as an undergraduate student and there were librarians who programmed to manage the library's website and library. But coming out of MLS the unfortunate reality was that programming jobs are jobs that go to librarians with seniority. I would have had to man a reference desk for a decade. And that kind of advancement, where it isn't skill, but time spent warming a seat that gives you political power is just not for me.

Finally, it's not clear what the future of libraries will be. Fewer and fewer people are reading books. Libraries are very popular places, especially for children, but in many cases libraries are just places that have a computer or a meeting room. Public librarians don't seem to have an answer for what's to come.

jrochkind1 · on Dec 22, 2014

I'm also a programmer with an ML(I)S, and while I do work in a library, I tend to share your negative analysis of the 'industry' and its future.

But I'm curious what year you had trouble finding a job due to seniority things. In my experience from the late 2000's to now, programming jobs in libraries are not very hard to get for interested and qualified people. If you can make a good impression, are a programmer, and actually have an interest in working in libraries (with or without an MLS), and are willing to move (there are only so many jobs and might not be one in your city), I think you could find a job in a library. (University more than public; public libraries still don't really put much resources into in-house development, which is sad, I think).

Whether you'd want to or not is another question. The pay probably won't be competitive with the private sector, and, as you know, the work environment may or may not be entirely dysfunctional.

I love the idea of libraries. I don't think actual present day U.S. libraries are succeeding at achieving their potential in a post internet world. Before the internet, libraries and librarians were pretty much the experts at 'information retrieval', and I love the idea of "civil society" non-profit agencies that are experts at collecting, finding, and preserving information -- something society could use now more than ever. I don't see actual libraries meeting the challenge though, sadly. Perhaps it's unfair to expect them to.

jstanek · on Dec 21, 2014

It's nice to see DuckDuckGo mentioned at the end as on-par with Google, Siri, and OnStar. I'm a little surprised they mentioned DDG in the first place, as it doesn't have nearly the name recognition as Bing or Yahoo search.

brandonheato · on Dec 22, 2014

So wouldn't the world be a better place for everyone if all the librarians in the world posted everything to a searchable quora/stackoverflow shared knowledge base? Instead of having knowledge fragmented and buried deep in places where people would have trouble finding it.

walterbell · on Dec 22, 2014

The better to obsolete you with, my dear.

The challenge of every "knowledge management" system ever invented, including Google Answers, which paid (!) for answers.