All grand plans indeed ... but a higher priority should be making search even work at all.
Google search returns approximate results, related results, results with your words split (or joined) and other nonsense to drive ad views.
Even using the so-called "power tools" like allinsite: and the (barely functional) quotation marks, you still get very shoddy results.
Anyone who ever searches for very specific groups of terms knows exactly what I am talking about: enter search terms, click on result, search in page for term, doesn't exist.
As annoying as the approximation can be, I'd like a citation for "to drive ad views". It seems much more likely that it was added for the obvious reason - to improve search results, on average, for most people, because relevant results don't always contain the exact search term you put in. In my case, as many times as it's burned me, there have also been a few where I remember it presenting exactly the results I wanted, and it's probably done so many more times without my noticing. YMMV, and it would certainly be nice if Google were better able to distinguish queries that should not be rewritten from ones that probably should, but that's no reason to assign nefarious motives.
As has been mentioned, you should probably set your default search to verbatim mode.
Whatever the reason. It has changed recently and is annoying.
Search for "vertx play framework" and the last two search results don't have the term vertx despite pages and pages of results with the term. I don't understand how dropping search terms can make the results more relevant.
Also you can't set the default search to verbatim. You can only hack the query parameters in specific browsers.
> Whatever the reason. It has changed recently and is annoying.
Fair enough. Like I said, it can certainly be improved, and perhaps recent changes have gone too far.
> Also you can't set the default search to verbatim. You can only hack the query parameters in specific browsers.
Don't the kind of people that frequent this site near universally use their browsers to search, rather than wasting keypresses navigating to google.com?
Ridiculously, Safari, which I use, requires an extension to use an arbitrary search engine, but I can't think of another browser that doesn't support it.
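For anyone who wants to try the query-parameter route mentioned above, here is a minimal sketch of building a search URL by hand. It assumes that `tbs=li:1` is the parameter that turns on Verbatim mode. That is what is commonly reported, but it is not a documented API and may change, so treat it as an assumption. The helper name `verbatim_search_url` is made up for illustration.

```python
from urllib.parse import urlencode

# Assumption: "tbs=li:1" is the commonly reported switch for Verbatim mode.
# It is not an officially documented parameter and may stop working.
VERBATIM_PARAM = ("tbs", "li:1")


def verbatim_search_url(query: str) -> str:
    """Build a Google search URL with the (assumed) Verbatim parameter set."""
    params = [("q", query), VERBATIM_PARAM]
    # urlencode percent-encodes the values and joins them with "&".
    return "https://www.google.com/search?" + urlencode(params)


if __name__ == "__main__":
    print(verbatim_search_url("vertx play framework"))
```

A browser's custom search engine setting would use the same URL with the browser's own placeholder (often `%s`) in place of the encoded query.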
>> As annoying as the approximation can be, I'd like a citation for "to drive ad views".
Not 100% proof but ad views and clicks seem to be increasing by 20% to 30% each quarter. Google tests everything to death and they please wall st with those ad click increases.
If you look at the quarterly earnings reports, it's typically 20-30% year over year, not quarter over quarter. Keep in mind that the population of internet users is estimated to be growing about 10% year over year alone (though not always in places where Google is dominant), Google's usage rate is still increasing in many places outside of the US, and then there's the fact that we all use the internet more every year, etc.
Besides, without any other data or even a clear mechanism of action, you could just as easily say they're just better achieving their stated goal of only showing ads that the people who see them are interested in seeing. That's supposed to be one of the benefits of the auction-based approach of adwords. 20-30% is a lot for that, but who knows?
Really, I think you're seeing a pattern there that you want to see, and I'll second comex's call for real evidence.
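To put numbers on the quarter-versus-year distinction above, here is a quick back-of-the-envelope sketch. The 20-30% and 10% figures are the ones quoted in this thread, not checked against any earnings report, and the function names are made up for illustration.

```python
def annualized(quarterly_rate: float) -> float:
    """Compound a quarter-over-quarter growth rate over four quarters."""
    return (1 + quarterly_rate) ** 4 - 1


def per_user_growth(total_growth: float, user_growth: float) -> float:
    """Growth left over once growth in the number of users is factored out."""
    return (1 + total_growth) / (1 + user_growth) - 1


if __name__ == "__main__":
    # If 20-30% really were per quarter, the yearly figure would be enormous.
    for q in (0.20, 0.30):
        print(f"{q:.0%} per quarter -> {annualized(q):.0%} per year")

    # 25% yearly click growth with ~10% more internet users per year
    # (both numbers taken from the comments above).
    print(f"per-user growth: {per_user_growth(0.25, 0.10):.1%}")
```

Compounded, 20% per quarter is roughly a doubling every year, which is a very different claim from 20% per year.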
A lot can be attributed to more ads on pages, true. But do people go there to see a page full of ads or "most relevant" and "unbiased results"? How many know ads from content?
But growth in places where Google makes its real money (US, EU, Canada etc) has plateaued for a while, IIRC. Growth in ad clicks mirrors growth in revenue and we know a click from Guatemala is not the same as one from NYC.
>>Besides, without any other data or even a clear mechanism of action, you could just as easily say they're just better achieving their stated goal of only showing ads that the people who see them are interested in seeing.
Maybe, but if you read the comments, people are accusing Google of making results worse to boost the ad click rate. That's wrong on so many levels.
> But growth in places where Google makes its real money (US, EU, Canada etc) has plateaued for a while, IIRC. Growth in ad clicks mirrors growth in revenue and we know a click from Guatemala is not the same as one from NYC.
Actually, average cost per click has been going down for a while now. I don't see any breakdown by geography, and Wall Street has been all worried about what that means for an ever-more-mobile world, but, again, the data isn't there to back up your supposition. A large part of it could be growth in regions that bring in lower ad revenue.
> Maybe, but if you read the comments, people are accusing Google of making results worse to boost the ad click rate. That's wrong on so many levels.
Er, what is wrong on so many levels? I can't tell what you mean from that sentence construction. You say, "maybe", so you don't disagree with me, but you backed up the only other "accuser" in your post above, so I don't think you disagree with him...
In any case, the considerably more obvious and likely explanation is the one I think most people have tended to assume: google disambiguates terms automatically because it's what works in the 90% case. It's annoying for power users, yes (even back when it was less of an issue, at least we could +terms, so I miss +ing terms like crazy with today's google), but it makes no sense to attempt to optimize for ad revenue that way. Extraordinary claims, extraordinary evidence...or in this case, any evidence.
Google's user base or searches are increasing by what % a year?
Is there a difference and what makes that difference?
Google's recent stock growth and revenue also match Google's highly publicized Panda and Penguin updates, which supposedly fought "spam" but in reality also ruined a lot of small businesses. My educated guess is that Google is sending a lot fewer "free clicks" to other sites.
Cost per click can be brought down by a supply glut or a poor ROI for the advertiser.
>> In any case, the considerably more obvious and likely explanation is the one I think most people have tended to assume: google disambiguates terms automatically because it's what works in the 90% case.
In any case, the considerably more obvious and likely explanation is the one I think most people have tended to assume: Google makes sure that any algorithm changes (at least) don't hurt their Adwords business. That's their bread, butter and dessert.
I find Google Search to be incredibly frustrating to use because it chooses to ignore search terms even when pages exist with those terms. Google better be really careful because crippling the experience in order to sell more advertising reminds me why I switched away from Altavista.
Google search for me (and I've been using it a very long time, having been an Internet user since the late 1980s and around when Google first appeared) has generally gotten worse over time, not better. I don't think the changes are to sell more advertising, though, I think they are a combination of:
1) The web is just much bigger and noisier than it ever was. There are so many SEO-bait sites out there now, it is a wonder search still works at all. I can't really fault Google for this part.
2) Changes Google made to make search more accessible to the mainstream user. Google search now tries way too hard to be "smart" about what the user meant to ask for instead of what he or she actually asked for and this can be a huge negative if the person is looking up precise information and knows how to use a search engine.
They somewhat recently added a "Verbatim" option to search that can help you avoid some of this too-smartness, but even with that enabled Google is still inferior to what it used to be when I'm looking for very targeted technical information that I am sure exists out there.
Sadly, this sort of thing is a trend impacting not just Google. The success of Apple has created a culture of creating things for the mainstream consumer user which often comes at the expense of the power user. I get why this is done and ultimately it is the right call for any business that could potentially serve the mainstream, but I do wish more companies would leave in the highly technical expert options as settings for those who are comfortable using them. I feel that in recent times most software in general has swung way too far on the pendulum from being too hard for normal people to use to being totally gimped for experts and feeling like a toy more than a tool and I wish attempts to try to support both sets of users became a "thing" instead of constantly hearing the mantra that "options or settings are bad, no options or settings for you"!
The biggest thing for me has been localization. If you happen to use a localized version of Google, it will heavily favor results in that language, and give you mostly crap when searching for stuff in a different language using the localized search machine (so pretty much all the time for programmers).
Thankfully you can still tell it to use Google.com and in English.
Sadly, you are in a minority. Google's results are excellent for most people.
I used to use + often. We know that the + operator was rarely used, and most of those times it wasn't used correctly. (Of all searches, only 1 in 600 were correctly using the + operator.)
Being outside Google (and not able to see their data) is frustrating, but they have a lot of numbers and they do this stuff because they can show it helps most people.
>>Sadly, you are in a minority. Google's results are excellent for most people.
Sadly, you are in a minority. Bing's results are excellent for most people. Slap a Google logo and I'm willing to bet that most people wouldn't tell the difference. It's the Google brand that is golden, at least so far. But lately people are talking and questioning their honesty.
Bing's own comparison site "bing it on" shows that most users prefer Google's results[1], even though the comparison uses results modified to favor Bing[2].
Dear Googler, you made me waste a bit of time: trying to do that by using tweets isn't accurate for obvious reasons. Power user vs normal user, motivation to tweet, bias against Microsoft etc. The other post was confusing as hell, I couldn't make out his point. Absolutely no data, other than suggesting that the poll was rigged. It may very well be.
Either way, if Bing narrowed the preference to 10%-20%, it's a great feat. Personally I find Bing lacking on obscure terms but most people wouldn't care.
In my sentence "Google's results are excellent for most people." I could have said "Modern web search engine results are excellent for most people" - I didn't because this is a discussion of how Google (and only Google) is failing for a number of people.
Google needs to create a search market place around that index of theirs. They need to open it up.
Search is too big a problem for just a couple people at Mountain View to be working on. And at this stage with the amount of data being generated, no one really has the infrastructure to compete.
Hope they do it before the regulators make them. If Apple has succeeded at building a market place around their closed platform Google can too.
What would "extensions" to search look like? Google had "Subscribed Links" from roughly 2006 to 2009, which would let users opt in to receiving links from third parties for certain queries (eg. I added a Javadoc extension that would show me the official Javadoc when I searched for a Java class). Nobody used them. Search isn't a market like mobile phones: it serves an immediate, well-defined need, and there doesn't seem to be a need for third parties to jump in.
Your comment highlights the problem. When people think search they think Google. Search is bigger than that and Google is in a way through its success and utility, limiting people's imagination when they think about search.
Just look at their menu bar...images, videos, flights, blogs, shopping, books, patents, apps.
Is that it?
Not to mention random.
So we just sit around waiting for some benevolent god in Mountain View to say, you know what now let the mortals have...recipes.
If they want to expand that list to the infinite domains it should be covering, it is never going to happen with the resources they have. They need to open the index to tap into its full potential.
I doubt any "index" exists in the way the word indicates. It's likely highly custom for how it's accessed. What would an API for that look like? Do you want to just be able to do random regexes (that would be awesome...I miss code search)? Do you just want a disk sitting somewhere with all the internet on it so that you can run custom programs on it?
Identifying what a recipe looks like and then providing a search interface that can figure out which of the millions of variations of some soup recipe is what a person is looking for and is more authoritative than others (and not some blog spam with minor (but random) alterations, or written by an amateur with no business in the kitchen) is a hard problem. Crawling isn't really the hard part. It takes a lot of hardware and time, but then you have all this data...that's when the hard part starts.
I'm interested in what others think a useful "index" API would look like, though.
I want a search that works for Usenet news. Yes, I understand that Usenet news is dead, but still, effective searching would be nice and Usenet search has been broken for ages and ages.
Another example is targeted search. For example, if I'm searching about mental health stuff I do want good quality sources, I don't want tabloid gossip about celebrities going into rehab.
Or sometimes I want to break out of the SEO trash, and have a bit of serendipity. Search for a term like [spectacles cases]. You get a lot of shops selling pretty much identical cases. What you want is a nicer way to preview those shops (because often the websites are god-awful and their own site searches are much worse than anything Google provides.) WAIT: I just tried this to make sure I was right, and Google have changed the way they show results like this. You get the same ads at the top, with heavily SEOd content links below, but now at top right there's a "shopping" section with links to different cases in different shops. So, that's much better now than it used to be.
Still, serendipity is fun. I remember when you used to be able to use Google to noodle around and find cool stuff. Now? Not so much. That's not Google's fault. The modern web is very different to what it used to be, but it'd be great if there was some way to get access to those smaller sites.
This is a great point. Gabe Newell makes the same point about the gaming industry: the short-term view is that gaming is about selling units. The long-term view is that gaming is a cultural activity that can be monetized in an almost infinite number of ways.
It should be mentioned that this move towards approximate search also gives strong indication of how little demand for real "smart search" there is in the search field.
Alta Vista years ago had full logical search but Google beat them with simpler searches having greater relevance. Google's searches are now moving towards even less specificity. How much could greater language understanding help this?
How many people want to go to the trouble of writing out a full sentence to specify exactly what they want? How much is it worth accommodating them? etc
I almost never search for very specific terms, most of my searches are also approximate. In this case it's very helpful that the search results are also approximate.
Douglas Hofstadter once said of Ray Kurzweil and his grand ideas: "It's an intimate mixture of rubbish and good ideas, and it's very hard to disentangle the two, because these are smart people; they're not stupid."
I think Google (and Larry Page) is betting that they will be able to separate out this mixture.
He may have noticed this in himself first, then applied it to others. In fact, he may consider it a limitation of genius and part of the current human condition.
One thing about good ideas: it is not so much the novelty that matters. The better ideas are far enough out there to be semi-novel, but are based on an extrapolation of the current state of things, such that a path to implementing the idea is possible.
Sometimes, this means that you're able to foresee the trajectory of industries and technology that others miss or can't.
However - it's also important to have the resources available to be able to implement/execute/foster that which is needed to make those ideas become reality.
In this case, he has immense resources which can make the ratio between rubbish and great ideas shift significantly.
It's easy to simply be a daydreamer - even if a greatly informed daydreamer - when you have no resources.
I am going to insert this piece into my pitch. It touches on the basic and salient rationale that access to resources is a given to derive inertia for certain ideas.
For one, Kurzweil DOES have "grand" ideas. Extravagant visions of a future with technological immortality, singularity, etc. Hofstadter does not. He merely examines some things, like cognition, and proposes some theories about their workings. Like, you know, every scientist.
Kurzweil comes out as grandiose, obsessive and deluded, Hofstadter like a normal writer, no more or less strange than, say, Marvin Minsky.
How could anyone hope to accomplish anything halfway great without being grandiose, obsessive and deluded?
Kurzweil has already done enough to prove he's not simply deluded. It seems like every week there's an article that comes across HN about how all great innovators have a capacity for self-delusion. Kurzweil has consistently shown himself to be a great innovator, and yet people dismiss him as a crank because he has the gall to continue to reach for "impossible" goals.
Genius or hard work will get you a solution to the problem you set out to solve. You need "grandiose, obsessive and deluded" in order to set out against a sufficiently ambitious problem.
And you're listing people who have made discoveries. Kurzweil isn't trying to discover an equation or a law of nature, he's trying to build strong AI. That's a different category of endeavor. It's the work of Henry Ford or Steve Jobs rather than Einstein or Feynman.
Wow. Henry Ford or Steve Jobs. So, the discovery of the nature of cognition, a scientific endeavor that has stumped tens of thousands of the most brilliant people who ever lived for thousands of years, is on the order of setting up assembly lines and setting up shiny packaging for other people's engineering work. I learn new things here everyday.
Heh, not only that, but it implies that discovering the fundamental secrets of the universe isn't "sufficiently ambitious" compared to making things more efficiently or making easier-to-use gadgets.
It depends. If that discovery is a logical next step in the continuation of our current understanding, then yes, it may be less ambitious than positing something a few steps removed from what is currently provable, and attempting to fill in those gaps, whether it be a discovery in the scientific sense, or just achieving something heretofore thought impossible.
It's easy to forget that sometimes a discovery will languish for a long time (or even be forgotten and rediscovered) before someone invests time and effort in determining how it can usefully be exploited.
Who deserves credit for "discovering" the assembly line? Adam Smith, who wrote about division of labor in 1776, or Eli Whitney, who implemented it to manufacture muskets, or Henry Ford for using it to such great effect that he changed how industry operated? Or maybe the first Chinese Emperor, whose creation of a terracotta army is said to have used techniques reminiscent of assembly lines? Or perhaps the Venetian Arsenal? How many times in history was it discovered again, but never put into practice?
Another example, I really don't care who discovered it was possible to go to the moon, I do care about the people that acted on that discovery and actually did it.
Except that Hofstadter, other than writing about his ideas and theories, has accomplished jack shit.
Pragmatically speaking, Kurzweil is miles ahead of Hofstadter in terms of putting his theories and ideas to practice.
It's very easy to criticize visionaries, until they achieve something. Then nobody holds the critics accountable for their negative attitudes towards the visions, and the critics probably would claim that the progress was obvious (in hindsight everything is obvious).
>Except that Hofstadter, other than writing about his ideas and theories, has accomplished jack shit.
Why would he have to accomplish anything else? He is not an inventor, he is a writer.
>Pragmatically speaking, Kurzweil is miles ahead of Hofstadter in terms of putting his theories and ideas to practice.
I don't think so. He merely invented some low hanging fruit in early computer science, like OCR and text recognition. Things on which other people worked and had results too.
And things that, even now, 3 and 4 decades after his inventions, are miles BEHIND his expectations of them, and somewhat of a disappointment still.
>It's very easy to criticize visionaries, until they achieve something.
And it's equally easy to be a "visionary", if you don't have to also achieve those visions. Visionaries are a dime a dozen, especially in California.
>>I don't think so. He merely invented some low hanging fruit in early computer science, like OCR and text recognition. Things on which other people worked and had results too.
It only looks like low hanging fruit after it is done.
>>And it's equally easy to be a "visionary", if you don't have to also achieve those visions. Visionaries are a dime a dozen, especially in California.
What exactly is your overall point? Most visionaries will fail because that is just how things are. Leonardo da Vinci was a great visionary who could not fulfill any of his visions. Visions that were fulfilled hundreds of years later. Eventually somebody will fulfill those visions. Again, I don't understand what you are bitching about.
All this irrational hate against the guy for daring to dream what would be one of the greatest achievements of human history is unnerving. Eventually we WILL have strong AI, we are living proof that it is possible just as birds were living proof that things could fly.
You are correct that Hofstadter doesn't have grand assertions at the level of The Singularity but there is still something like a "Hofstadter" mix - say all the ideas in Gödel, Escher, Bach with the implication that these are related and important. One might reasonably say this is a somewhat grandiose admixture.
All that said, I think Hofstadter's "mixture of good ideas and rubbish" description isn't a good characterization of the thinking behind The Singularity. Rather, Singularity thinking is an extrapolation of growth trends which might easily be false, but since it is made on a broad level, it is not easy to formulate why it is false. And Kurzweil, in particular, is a popularizer of the singularity with a particular version of it.
One could argue there are limitations to the expansions that Singularitarians have been extrapolating from, but characterizing these limitations is itself quite a tricky problem.
I think the best refutation to the Singularity idea is the argument of Paul Allen, that software and the understanding of intelligence simply haven't been amenable to increased computing power.
It's a good argument but I feel the tone isn't quite fair, in the sense that you could more fairly say the singularity looks plausible until you really focus on the software barrier.
As I recall, Hofstadter said this about "the singularity" in particular. I don't think Kurzweil is claiming he will bring this about at Google, at least not before his rough projected arrival date of 2030-40.
I'm working on some open source projects that pertain to recognizing entailment in plain text. Entailment is the relation that holds when one text "follows from" another.
I attended a talk he gave on campus when I was in college, and afterward I lurked as some really smart people (grad students, maybe one or two profs) asked him for details. He came across as a con artist, selling ideas that he can't defend. For instance, one that stuck out was that he had nothing to back up his assertion that within x years there would be the ability to non-destructively copy all the information in a brain. Several questions I thought were fairly reasonable were met with the response: "read my book[s]." (in context, as I recall he was promoting _The Age of Spiritual Machines_.)
His basic argument of the inevitable advance of technology to true AI, or ability to upload consciousness or integrate computers with brains, rests on priors that are not clearly relevant.
With his latest title, "How to Create a Mind: The Secret of Human Thought Revealed", how can anyone take him seriously? Scientists (practicing scientists) have been writing books like that for decades, and the lesson should be that unless you've demonstrated that you can build an AI (real AI, not soft AI like machine learning), you have no business using a title like that. If real AI is simply a combination of the right machine learning tricks, then the trick is in the combination, because nobody has figured it out (publicly) yet.
Kurzweil is awkwardly reminiscent of the connectionists of yore. I'm surprised nobody's taken a similarly hostile attitude towards his (as yet unsubstantiable) predictions.
I've been to a couple Kurzweil talks. He's got a good shtick, but it doesn't seem to change much. I tweeted "I saw this talk 10 years ago. Shouldn't it be 5 minutes long and 100x more interesting now?"
>Haters gonna hate. Larry Page obviously didn't hire him and let him take up Google's resources for his "rigorous hand-waving".
No, he did it to get some PR of Google as a mad-genius storage. He also did it with tons of other semi-retired Comp Sci legends. He might also like the guy, or believe the Singularity mumbo-jumbo himself.
There's a line going around about Google being where "old computer scientists go to retire".
>How does this one-liner contribute to the conversation at all?
By giving a summary of the whole thing? A very succinct one at that? If it's also accurate (which you can argue about), then it's a perfect contribution to the conversation.
Is it just me, or does Ray Kurzweil seem to be getting younger? I remember seeing a speech of his a few years ago, he had a nervous tic where he was blinking his eyes constantly. He also looked a lot older. Now, he doesn't have that tic, seems to have more hair, etc.
People can look older and younger at different times, depending on weight gain or loss, stress or depression, choice of clothing, plastic surgery (duh!), more or less workout, etc.
Him taking 100 pills every day: I don't think that is very relevant. Mostly delusional. If that worked, pharmaceutical companies would have made it into one mega-pill, and sold it for a fortune to rich people years ago.
>As for your argument against 100 pills, that only holds if it's well-known or well-established that it works and works safely.
Or semi-safely. The rich are also "early adopters". Also see tragically failed plastic surgeries and BS like cryonics.
But still, that it's not "well-established that it works and works safely", is exactly my point about his use of the pills. Mostly wishful-thinking on his part.
Wolfram and Kurzweil are betting on opposite horses here. I work for Wolfram, but I'd be delighted if the ML-approach works. Personally, I think we need a diversity of approaches to really understand how to crack the nut of 'hard AI' (if that is even a sensible thing to pursue).
I wonder what would be the best strategy for Google to monetize an asset such as IBM Watson? Surely an ads-based model is not the best way (since they've already got a large part of the internet ad market, and given that only Google, IBM and maybe Microsoft could build something similar).
If it's truly a competitive advantage, wouldn't offering it to startups and other businesses in return for equity offer bigger profits?
Thanks. I could find some references using those keywords!
Do you then also know about the cost per query as compared to a Google search? Or alternatively, how much concurrent traffic can they handle with that server?
I don't know, but I don't think that really matters. Google mostly deals with the "cheap" stuff, while Watson deals with really valuable stuff, so compute cost is not so important.
One application of Watson is in medical diagnostics. They are working on other applications in finance, helping to develop drugs, predicting when industrial machines need maintenance and even coming up with novel recipes for tasty foods.
Basically, by reading and understanding large amounts of text, it can help you validate a hypothesis or even lead you to new hypotheses.
I think the Google Books project provides an interesting twist on this. Now their AI not only has the whole content of the publicly accessible Internet to study but also quite a nice chunk of the material ever put on paper.
>Kurzweil eventually wants to help create a “cybernetic friend” that knows what you want before you do
I am not sure I want that. At least not as much as Google does. Kurzweil is going to help me click on more ads. Shreds any sense of him being a visionary.
I'm 100% sure I don't want this. Exploration and discovery as a whole is one of the most rewarding experiences we have as humans. I don't want that outsourced to some machine/algorithm. In addition to that, we deal with so much cognitive dissonance within our brains on a day to day, minute to minute, millisecond to millisecond basis, I'm not convinced Google (even with their massive cache of data on us) could even predict what we "want" anyhow. Think of YouTube and its "recommended for you" video suggestion. Just because I listen to Lil Boosie every now and then doesn't mean I want to hear all the shitty southern rap songs it's choosing for me.
I'm not in love with the personally tailored search experience as it stands.
I'm totally OK with searching "#{my_city} taxi" instead of "taxi" if it means when I search "abortion" I get a sampling of search results that has the greatest co-citation... instead of something that myopically focuses on Google's perception of my preferences.
At the least, I'd like to be able to switch off the personalization. But anyway... Beating a dead horse here in this community, I'm sure.
If only there were a way to predict that users searching for "taxi" tend to want local results, and users searching for "abortion" don't. You know, like that other search engine does, called, um, Google:
Or if you were willing to type more than 2 words into the search box before blaming Google for failing to read your mind, while you simultaneously tell Google to stop trying to read your mind.
I believe he's trying to say that he doesn't want results tailored to him at all. For example, if I were republican, I'd tend to get anti-abortion material when searching.
He seems to be saying that he'd be okay with typing in his city name to get local results in exchange for not having any personalization. In other words, it's beyond local vs non-local.
Clippy told people how to do their work more efficiently, and because people don't like to remember new ways of doing work if their old habits are sufficient, people hated him. Clippy was like a computerized supervisor, a slave driver, who looked over their shoulders and told them "You selected one menu item 5 times, better use the keyboard shortcut to mine more uranium for your Master."
Google wants to tell people what to do. It is like a missionary: every layman would respect it for making life easier. People would love Google for this.
Every time I read about machine learning work that is being done at Google, that's available if you're a Real Googler, and then compare that to the closed-allocation nightmare the other 90% face, it makes me want to fucking rage out and fly a plane into Mountain View...
... land, get off that plane, take a cab, and have a polite but blunt conversation with the founders about how to fix their company. (What did you think I meant?)
Artificial intelligence is nice, brahs, but natural stupidity in the form of closed allocation and Enron-style performance reviews is putting that company at 10% speed. Clear out the latter and you'll have plenty more muscle for the former.
I'm sure Ray Kurzweil will do amazing things, but he'd do even more amazing things if the company still had the machinery (e.g. open allocation, a culture of human decency) to bring talent to him properly. The whole reason he is there is to work with great people that the company is supposed to be better equipped to find than he is... but how will that work, given that the company sold off its ability to reward and recognize talent just to appease McKinsey?
I'm a little curious - have you done machine learning work that's made it into a real product?
I tried doing it a bit with my last project. My results were basically terrible. It turns out that getting useful results out of heterogeneous, vague, fuzzily-specified real world data is really hard.
I'm tight with a few folks in Search Quality whose job is that sort of data-scientist machine-learning work, and they're really good at it. Y'know what 90% of their daily time is spent on? Compiling golden sets. Labeling training data. Running MapReduces to collect basic statistics about their data set. Running MapReduces to identify representative members of their data set, and outliers that should be excluded. Shoving data into R to visualize it. Futzing with numeric coefficients. Building webpages and tools so they can visualize the data and results, futz with the numbers online, and get feedback in real time. Collecting test sets and running your algorithm against them, and then trying to figure out why your losses are losing.
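For a rough idea of what that kind of grunt work looks like in miniature, here is a Python sketch of three of those chores: basic statistics on a signal, setting outliers aside, and running a classifier against a golden set to list its losses. Everything in it (the function names, the z-score cutoff, the toy query classifier) is made up for illustration and is not a description of anything Google actually runs.

```python
from statistics import mean, median, stdev


def summarize(values):
    """Basic statistics for one numeric signal."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),
    }


def split_outliers(values, z_cutoff=2.0):
    """Split values into (kept, outliers) by distance from the mean.

    A value counts as an outlier when it sits more than z_cutoff sample
    standard deviations from the mean. The cutoff is an arbitrary choice.
    """
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return list(values), []
    kept = [v for v in values if abs(v - mu) / sigma <= z_cutoff]
    outliers = [v for v in values if abs(v - mu) / sigma > z_cutoff]
    return kept, outliers


def golden_set_losses(classify, golden_set):
    """Return (example, expected, got) for every golden example we get wrong."""
    losses = []
    for example, expected in golden_set:
        got = classify(example)
        if got != expected:
            losses.append((example, expected, got))
    return losses


# Hypothetical signal with one obviously bad measurement.
signal = [10, 11, 9, 10, 12, 10, 11, 100]
stats = summarize(signal)
kept, outliers = split_outliers(signal)


# Hypothetical hand-labeled golden set: should a query get local results?
def toy_classifier(query):
    return "local" if "taxi" in query else "global"


golden = [
    ("taxi", "local"),
    ("abortion", "global"),
    ("pizza delivery", "local"),
]
losses = golden_set_losses(toy_classifier, golden)
```

The list of losses is the interesting output: each entry is a labeled example the classifier got wrong, which is where the "figure out why your losses are losing" part starts.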
Machine-learning from a practitioner's POV is not at all like the textbooks and theoretical papers suggest. I'd estimate that less than 10% of one's time goes into the "fun" part of machine learning - brainstorming new signals and writing the code to extract them and feed them into your classifier - and 90% is on the kind of grunt work that hard science grad students do all the time. You get paid well for it, but that's because a lot of the work is really boring and time-consuming. I suspect that I get a far more frequent rush of accomplishment as a mostly-UI guy than the data scientists in my department get.
It's a tool. It works well in some cases, but it can take a lot of effort to get it to work well.
I can only speak for myself here, but I find a lot of what you described as "grunt work" to be fun. Yes, it's true that when you work with real world data, you don't only work on the mathematical modeling, and you spend a lot of time just setting yourself up to be able to do what is typically called the "fun" part. For me, personally, I find all of the work involved to be fun-- even setting up a data pipeline and figuring out how to distribute a computation. Automating away grunt work and setup process is fun, too.
Part of what makes data science fun is the machine learning itself. And an equally interesting part of it is that it involves so many other parts of computer science that a typical corporate job would call "too hard" and isolate you from.
At least based on what I've seen, data scientists are respected and this means they get dibs on the most interesting projects. However, the interesting projects themselves involve a lot of detailed work (that's the nature of technology) and if you're that interested, you're going to want to do it yourself, at least until you really understand the problem (at which point, you'll automate the dull stuff).
Kurzweil has grand ideas and thinks far enough outside the box for people to call him crazy. I would guess some of those are the kind of grand / crazy ideas Larry and company want to pursue too. So I think it is a good fit. For more information on the man and what motivates him, see the documentary Transcendent Man: http://www.amazon.com/Transcendent-Man/dp/B0051Y6NQA