Attention models? Attention existed before those papers. What they did was show that it was enough to predict next-word sequences in a certain context. I'm certain they didn't realize what they found. We used this framework in 2018 and it gave us wildly unusual behavior (but really fun), and we tried to solve it (really looking for HF capability more than RL), but we didn't see what another group found: that scale in compute with simple algorithms was just better. To argue that one group discovered and changed AI while ignoring all the other groups is really annoying. I'm glad for these researchers. They deserve the accolades, but they didn't invent modern AI. They advanced it. In an interesting way. But even now we want to return to a more deterministic approach. World models. Memory. Graph. Energy minimization. Generative is fun and it taught us something, but I'm not sure we can just keep adding more and more chips to solve AGI/SGI through compute. Or maybe we can. But that paper is not written yet.
This is an uncharitable and oddly dismissive take (i.e. perfect for HN, I suppose).
Today's incredible state-of-the-art does not exist without the transformer architecture. Transformers aren't merely some lucky passengers riding the coattails of compute scale. If they were, then the ChatGPT app which set the world ablaze would've instead been called ChatMLP, or ChatCNN. But it's not. And in 2024 we still have no competing NLP architecture. Because the transformer is a genuinely profound, remarkable idea with remarkable properties (e.g. training parallelism). It's easy to downplay GPTs as a mostly derivative idea with the benefit of hindsight. I'm sure we'll perform the same revisionist history with state-space models, or whatever architecture eventually supplants transformers. Do GPTs build on prior work? Do other approaches and ideas deserve recognition? Yeah, obviously. Like...welcome to science. But the transformer's architects earned their praise -- including via this article -- which isn't some slight against everyone else, as if accolades were a zero-sum game. These 8 people changed our world and genuinely deserve the love!
Question for you, as someone relatively new to the world of AI (well, not exactly new - I took many courses in AI, including neural networks, but in the late 90s... the world is just a tad different now!)
Is there any good summary of the history of AI/deep learning from, say, late 00s/2010 to the present? I think learning some of this history would really help me better understand how we ended up at the current state of the art.
I've been following a data science course called "The Data Science Course 2023: Complete Data Science Bootcamp" at Udemy.
The course starts all the way back from basic statistics and goes through things like linear regression and supposedly will arrive at neural networks and machine learning at some point.
So I don't know if something like this is exactly what you're looking for, but I think that, in general, if one wants to learn about (the history of) AI, then it might be a good idea to start from statistics and learn how we got from statistics to where we are now.
> And in 2024 we still have no competing NLP architecture.
No, we do. State space models are both faster and scale just as well. E.g., RWKV and Mamba.
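For context on why these can be faster, here is a minimal sketch of the generic linear state-space recurrence such models build on (the matrices, sizes, and values below are illustrative placeholders, not RWKV's or Mamba's actual parameterization): each new token updates a fixed-size state, so generation costs constant work per token instead of attending over the whole history.

    # Generic (discretized) linear state-space recurrence:
    #   h_t = A @ h_{t-1} + B @ x_t,   y_t = C @ h_t
    # The state h has fixed size, so per-token cost does not grow with context length.
    # A, B, C are random toy placeholders, not any published model's parameters.
    import numpy as np

    rng = np.random.default_rng(0)
    d_state, d_in = 16, 8
    A = 0.9 * np.eye(d_state)             # toy stable state transition
    B = rng.normal(size=(d_state, d_in))
    C = rng.normal(size=(d_in, d_state))

    def step(h, x):
        """Consume one token embedding x, return (new_state, output)."""
        h = A @ h + B @ x
        return h, C @ h

    h = np.zeros(d_state)
    for x in rng.normal(size=(100, d_in)):  # a stream of 100 token embeddings
        h, y = step(h, x)                   # constant work per token

Real SSM layers add learned discretization, gating, and parallel training tricks on top of this, but the constant-size state is the core of the speed argument.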
> Transformers aren't merely some lucky passengers riding the coattails of compute scale.
Err... they are, though. They were just the type of model the right researchers were already using at the time, probably for translating between natural languages.
I kind of wonder if the reason this seems to be true is that emergent systems are just able to go a lot farther into much more complex design spaces than any system a human mind is capable of constructing.
I think this is overly general. A more accurate statement is that, on tasks where we don't actually understand how something works in precise detail, it's more effective to just throw compute at it until a system with "innate" understanding emerges. But if you do actually know how something works (rather than vague models with no clear supporting evidence), it's still more effective to engineer the system specifically based on that knowledge.
I'm studying neuroscience but am very interested in how AI works. I've read up on the old school, but phrases like memory graphs and energy minimization are new to me. What modern papers/articles would you recommend for folks who want to learn more?
For phrases, Google's TF glossary [0] is a good resource, but it does not cover certain subsets of AI (and more specifically, is mostly focused on TensorFlow).
If you are in neuroscience I would recommend looking into neural radiance fields rendering as well. I find it fascinating since it's essentially an over-fitted neural network.
This is a classic... This isn't the first time this piece has been posted, either. Here's one from FT last year[0] or Bloomberg[1]. You can find more. Google certainly played a major role, but it goes too far to say they invented it or that they are the ones who created modern AI. Like Newton said, shoulders of giants. And realistically, those giants are just a bunch of people in trench coats: millions of researchers going unrecognized. I don't want to undermine the work of these researchers, but that doesn't mean we should undermine the work of the many others (and thank god for the mathematicians, who get no recognition and lay all the foundations for us).
And of course, a triggered Yann[2] (who is absolutely right).
But it is odd since it is actually a highly discussed topic, the history of attention. It's been discussed on HN many times before. And of course there's Lilian Weng's very famous blog post[3] that covers this in detail.
The word attention goes back well over a decade, even before Schmidhuber's usage. He has a reasonable claim, but these things are always fuzzy and not exactly clear-cut.
At least the article is more correct in specifying the Transformer rather than Attention, but even this is vague at best. FFormer (FFT-Transformer) was an early iteration and there were many variants. Do we call a transformer a residual attention mechanism with a residual feed-forward? Can it be a convolution? There is no definitive definition, but generally people mean DPMHA w/ skip layer + a processing network w/ skip layer. But this can describe many architectures, since every network can be decomposed into subnetworks. This even includes a 3-layer FFN (1 hidden layer).
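To make that "generally people mean" definition concrete, here is a minimal sketch in PyTorch: multi-head scaled dot-product attention with a skip connection, followed by a position-wise feed-forward network with its own skip connection. The layer-norm placement, dimensions, and the omission of masking/dropout are simplifying assumptions of mine, not a claim about any particular paper's variant.

    # Minimal "generic" transformer block: attention sub-layer + skip,
    # then feed-forward sub-layer + skip. Post-norm, no masking, no dropout.
    import torch
    import torch.nn as nn

    class TransformerBlock(nn.Module):
        def __init__(self, d_model: int = 512, n_heads: int = 8, d_ff: int = 2048):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.ff = nn.Sequential(
                nn.Linear(d_model, d_ff),
                nn.ReLU(),
                nn.Linear(d_ff, d_model),
            )
            self.norm1 = nn.LayerNorm(d_model)
            self.norm2 = nn.LayerNorm(d_model)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Attention sub-layer with skip connection.
            attn_out, _ = self.attn(x, x, x)
            x = self.norm1(x + attn_out)
            # Feed-forward sub-layer with skip connection.
            x = self.norm2(x + self.ff(x))
            return x

    block = TransformerBlock()
    tokens = torch.randn(2, 16, 512)  # (batch, sequence, embedding)
    print(block(tokens).shape)        # torch.Size([2, 16, 512])

Stack a few dozen of these plus embeddings and you have the skeleton most people mean by "transformer", which is exactly why the boundaries of the definition get fuzzy.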
Stories are nice, but I think it is bad to forget all the people who contribute in less obvious ways. If a butterfly can cause a typhoon, then even a poor paper can contribute to a revolution.
A little after that. I'd put the heyday between 2003-2010, starting with the GMail launch and ending with the Chrome & Android launches. That period includes GMail, Maps, Scholar, Orkut, Reader, the acquisitions of Blogger/YouTube/Docs/Sheets/Slides/Analytics/Android, Summer of Code, OAuth, Translate, Voice Search, search suggestions, universal search, Picasa, etc. I can look at my phone or computer and basically everything I routinely use dates from that period.
GMail, the service that tells every user that they will be indexed and profiled personally for ad leads? That is an achievement?
GMail was one of the red flags for many that "don't be evil" was not going to be what it appeared. History says that this kind of mass profiling never ends well.
Agreed. I remember the town hall meeting where they announced the transition to being Alphabet. My manager was flying home from the US at the time. He left a Google employee and landed an Alphabet employee.
I know it was probably meaningless in any real sense, but when they dropped the "Don't Be Evil" motto, it was a sign that the fun times were drawing to an end.
The quota system can kick in at whatever time the limits are reached.
And GPUs are scattered across Borg cells, limiting the ceiling. That's why XBorg was created: to give researchers a global search across all Borg cells.
And data center Capex is around 5 billion each year.
Google makes hundreds of billions in revenue each year.
You are asking what people would do in an impossible situation. It's like asking "What would you do after you are dead?" Literally, I could do nothing after I am dead.
I cannot even understand what "I do" stands for in the context of your question. The above is my direct reaction, in line with his assumption that he had an unlimited budget.
> I cannot even understand what "I do" stands for in the context of your question
That he had a higher budget than he knew what to do with. When I worked at Google I could bring up thousands of workers doing big tasks for hours, without issue, whenever I wanted. For me that was the same as being infinite, since I never needed more, and that team didn't even have a particularly large budget. I can see a top ML team having enough compute budget to run a task over the entire Google scrape index dataset every day to test things; you don't need that much to do that, and I wasn't far from it.
At that point the issue is no longer budget but the time for these projects to run and return a result. Of course, that was before LLMs; the models back then weren't that expensive.
I know a Google operations guy who has occasionally complained that the developers act like computing/network resources are infinite, so this made me chuckle.
Those were fun times! (& great to see you again after all these years). It's astonishing to me how far the tech has come given what we were working on at the time.
> "Realistically, we could have had GPT-3 or even 3.5 probably in 2019, maybe 2020. The big question isn’t, did they see it? The question is, why didn’t we do anything with the fact that we had seen it? The answer is tricky.”
The answer is that monopolies stifle technological innovation because one well-established part of their business (advertising-centric search) would be negatively impacted by an upstart branch (chatbots) that would cut into search ad revenue.
This is comparable to an investor-owned consortium of electric utilities, gas-fired power plants, and fracked natural gas producers. Would they want the electric utility component to install thousands of solar panels and cut off the revenue from natural gas sales to the utility? Of course not.
It's a good argument for giving Alphabet the Ma Bell anti-trust treatment, certainly.
A better example of that behaviour would be Kodak, which invented the first digital camera as early as 1975 and then killed the project because it was a threat to their chemical business.
Digital photography works not just because of the camera but because of the surrounding digital ecosystem. What would people do with digital photos in 1975?
It does not matter. In the 80s they owned the whole photography market; now they exist only as a shell of their former self.
By not pursuing this tech, they basically committed corporate suicide over the long run, and they knew it. They knew very well, especially going into the '90s and early 2000s, that their days making bank selling film were numbered.
But as long as the money was there, the chemical branch of the company was all-powerful and likely prevented the creation of a competing product that would threaten its core business. They did so right up until the money flow stopped, and then they went basically bankrupt overnight, since the cash cow was now obsolete.
Kodak did plenty of great things with digital cameras in the early 2000s. Their CCD sensors from then are still famous and coveted in some older cameras. Go look at the price of a Leica M8 (from 2006) on eBay.
The problem Kodak had is what the person you're replying to is alluding to. They got outcompeted because they were a photography company, not a digital hardware manufacturer. Companies like Sony or Canon did better because they were in the business of consumer electronics / hardware already. Building an amazing digital sensor and some good optics is great, but if you can't write the firmware or make the PCB yourself, you're going to have a hard time competing.
It's not chemical-wing vs digital-wing. It's that Kodak wasn't (and rightly wasn't) a computing hardware manufacturer, which makes it pretty damned hard to compete in making computing devices.
(Granted companies like Nikon or Leica etc did better, but it's all pyrrhic now, because the whole category of consumer digital cameras is disappearing as people are content to take pictures on their phones.)
>They got outcompeted because they were a photography company, not a digital hardware manufacturer. Companies like Sony or Canon did better because they were in the business of consumer electronics / hardware already.
Huh? This makes no sense. Sony was indeed a consumer electronics company at that time, but Canon was not: Canon was a camera manufacturer. They didn't get into electronics until later as cameras became digital: their earlier cameras were the all-mechanical kind. Sony was an electronics company that had to learn how to make cameras, but Canon was a camera company that had to learn how to make electronics. Kodak could have done the same.
Kodak, while they incidentally made some cameras, were a film and film processing company that wasn’t great at cameras and wasn’t anything in electronics.
They were much worse positioned than either a camera company, or a consumer electronics company, for a pivot to the post-film photography world.
Canon was only a camera company when it started, but by the 1960s was moving into other markets like lenses, magnetic heads, photocopiers, fax machines, etc.
Kodak could have diversified like that too, but they didn't. They were positioned badly because they concentrated almost all their efforts on film and nothing else. Of course, part of this is probably due to American business culture compared to Japanese; Japanese businesses tend to be much more diverse and long-term in thinking, but regardless, Kodak did this to themselves.
Agreed but my additional point is that Kodak actually did branch out beyond film, and did so pretty well from a technical POV.
The problem was competing in hardware manufacturing, which is a whole different ballgame from concentrating on just the imaging aspects of it. So they were reduced, near the end, to just being really a (decent) component supplier to other companies. But that's the wrong part of the food chain to be in.
> In the 80s, they [Kodak] owned the whole photography market,
No, they shared it with Fujifilm (1/3rd each in 1990, according to https://pdgciv.files.wordpress.com/2007/05/tfinnerty2.pdf). And they held it essentially via film and film processing, with no significant camera market share, which is likely far more relevant to the transition to digital.
How is that not immediate grounds for his termination? The board should be canned too for allowing such an obvious charlatan to continue ruining the company.
1) Google enjoyed significant market status at the time, and a leap forward like seemingly semi-conscious AI in 2019 would have been seen as terrifying. Consumer sentiment would go from positive to "Google is winning too hard and making Frankenstein's monster."
2) It didn't weave well into Google's current product offering and in fact disrupted it in ways that would confuse the user. It was not yet productized, but would already make the Google Assistant (which at the time was rapidly expanding) look stupid.
A fictional character companion was not clearly a good path.
All this being said, I integrated the early tech into my product division and even I couldn’t fully grasp what it would become. Then was canned right at the moment it became clear what we could do.
Eric Schmidt was the only leader at Google who recognized nascent tech at Google and effectively fast-tracked it. When my team made breakthroughs in word2vec which created the suggestion chip in chat/email, he immediately took me to the board to demo it and said this was the future. (This, ironically, wound its way down a long path to later contribute to transformer tech.)
Sundar often ignored such things and would put on a McKinsey face, projecting that everyone was just far dumber than him and his important problems.
The first time I met with Eric Schmidt he asked what I did (harvest idle cycles to do scientific calculations), and then we talked about how Google was slowly getting into cloud. I remember him saying "I've tried to convince the board that the cloud is the future of growth for google, but they aren't interested" (this was before GCS and GCE launched).
The second time I met him I presented my project (Google Cloud Genomics) to the board of the Broad Institute, of which he was a member (IIRC the chair) and he was excited that Google was using cloud to help biology.
The irony, of course, is that Google/Alphabet then let Microsoft eat its lunch in the space by poorly supporting the Broad and allowing Verily to sign a $150m deal with Azure. Quite a few of the folks who were working on HCLS products in Cloud (were you one of them?) subsequently departed for Verily.
I worked on the predecessor of HCLS and was pushed out before any Azure deal (and would never have moved to verily). I had set up a relationship between Google Research, the Broad, and Cloud (to apply large-scale Google ML technology in genomics) and all that went away when other folks took it in different strategic direction.
The Broad is still a Google Cloud customer, and Verily seems to support all three clouds now, but I'm not aware of the details. The whole thing seemed really strange and unbusinesslike.
On the other hand, Alphabet's inability to deploy GPT-3 or GPT-3.5 has led to the possibility of its disruption, so anti-trust treatment may not be necessary.
Disrupted by whom? Microsoft? Facebook? The company formerly known as Twitter? Even if one of these takes over, we'd just be trading masters.
And that's ignoring how Alphabet's core business, Search, has little to fear from GPT-3 or GPT-3.5. These models are decent for a chatbot, but for anything where you want reliably correct answers they are lacking.
Honestly, this is part of the reason why I don't think Google will be a dominant business in 10 years. Searching the web for information helped us service a lot of useful 'jobs to be done' ... but most of those are now all done better by ChatGPT, Claude, etc.
Sure we have Gemini but can Google take a loss in revenue in search advertising in their existing product to maybe one day make money from Gemini search? Advertising in the LLM interface hasn't been figured out yet.
Google (kind of) feels like an old school newspaper in the age of the internet. Advertising models for the web took a while to shake out.
The problem is that chatting with an LLM is extremely disruptive to their business model and it's difficult for them to productize without killing the golden goose.
I know everyone cites this as the innovator's dilemma, but so far the evidence suggests this isn't true.
ChatGPT has been around for a while now, and it hasn't led to a collapse in Google's search revenue, and in fact now Google is rushing to roll out their version instead of trying to entrench search.
A famous example is the iPhone killing the iPod, and it took around 3 and a half years for the iPod to really collapse, so chat and co-pilots might be early still. On the other hand handheld consumer electronics have much longer buying cycles than software tools.
Maybe it's too early. I almost never use Google search anymore, and a lot of my friends do the same. Kind of like how, after I'd been using Google for years, lots and lots of people were still using sites like Ask Jeeves, but the writing was on the wall.
Maybe. I can't be the only one who gets irked by the slow, typing-out response from Copilot/GPT when Google gets me instant results. A near-instant or much faster response might convince me to use some GPT-based search.
Honestly, I use the free version and the text is really fast. Let's take a look at a typical question I might send to ChatGPT:
> I have some tasks, some of them maintenance, some of them relaxation, that I do before bed. For example, maintenance:
> brush teeth, floss, water floss, put on nose strips, take Prilosec.
> For relaxation:
> stretch, massage gun, ASMR, music, tiger balm, salt bath, movie, tea, nasal clean, moisturize.
> Can you suggest some other tasks?
Google can't even handle this. If you try, the entire page is filled with "Videos: ..." showing 4 video thumbnails from YouTube on how to use a water pic.
So I scroll down (by now ChatGPT already has a response) past half a page of "People also ask" with 4 questions about water pics, keep scrolling,
and finally reach the search results, all of which are about ... water pics/flossers.
ChatGPT has been around for less than 15 months[1]. In what version of reality is that "a while now"? Your iPhone/iPod timeline is also off by about 9 years[2].
'A while now' being long enough that I might expect some notable impact on Google's business by now. I do agree it could just be too early.
My iPhone/iPod timeline is referring to the time it took for the iPod's sales to be severely impacted [1]. I'm not disputing that the iPod continued to exist for a long time, the iPod Touch wasn't discontinued until 2022, but it was a pretty meaningless part of Apple's business by that point.
Yea, this is a good example that it might just be too soon still to say.
There was a period where it was not obvious that the iPhone or Blackberry would become the dominant player in the field and it could be true with search and AI chat etc too.
No, it's the other way around. Right now web search is "disruptive" to the LLM business model.
Google is a business fundamentally oriented around loss leaders. They make money on commercial queries where someone wants to buy something. They lose money when people search for facts or knowledge. The model works because people don't want to change their search engine every five minutes, so being good at the money losing queries means people will naturally use you for the money making queries too.
Right now LLMs take all the money-losing queries and spend vast sums of investor capital on serving them, but they are useless for queries like [pizza near me] or [medical injury legal advice] or [holidays in the canary islands]. So right now I'd actually expect Google to do quite well. Their competitor is burning capital taking away all the stuff that they don't really want, whilst leaving them with the gold.
Now of course, that's today. The obvious direction for OpenAI to go in is finding ways to integrate ads with the free version of ChatGPT. But that's super hard. Building an ad network is hard. It takes a lot of time and effort, and it's really unclear what the product looks like there. Ads on web search is pretty obvious: the ads look like search results. What does an ad look like in a ChatGPT response?
Google have plenty of time to figure this out because OpenAI don't seem interested. They've apparently decided that all of ChatGPT is a loss leader for their API services. Whether that's financially sustainable or not is unclear, but it's also irrelevant. People still want to do commercial queries, they still want things like real images and maps and yes even ads (the way the ad auction works on Google is a very good ranking signal for many businesses). ChatGPT is still useless for them, so for now they will continue to leave money on the table where Google can keep picking it up.
Ironically they killed the golden goose (the web) with display ads, which incentivized low-quality content to keep users scrolling. To rub salt in their own wound, they are now indexing more genAI crap than they can handle, and search is imploding.
I think it's evidence that timing is everything. In the 2010s deep learning was still figuring out how to leverage GPUs. The scale of compute required for everything after GPT-2 would have been nearly impossible in 2017/2018; our courses at Udacity used a few hours of time on K80 GPUs. By 2020 it was becoming possible to get unbelievable amounts of compute to throw at these models to test the scale hypothesis. The rise of LLMs is stark proof of the Bitter Lesson, because it's at least as much the story of GPU advancement as of algorithms.
Sure, Google is worth more but they essentially reduced themselves to an ad business. They aren't focused on innovation as much as before. Shareholder value is the new god.
Agreed that Friedman cursed the market with "shareholder theory"; it is a totally subversive and corrupting concept because whatever increases shareholder value is accepted, since that's the doctrine.
Well at the time before Microsoft got involved it was sort of an unspoken rule among the AI community to be open and not release certain models to the public.
> Not only were the authors all Google employees, they also worked out of the same offices.
Subtle plug for return-to-office. In-person face-to-face collaboration (with periods of solo uninterrupted deep focus) probably is the best technology we have for innovation.
“Office” does not have to mean open office. Academics all have offices with doors for a reason. I can’t stand open office, but private office in a building with other people is great.
> The group is also culturally diverse. Six of the eight authors were born outside the United States; the other two are children of two green-card-carrying Germans who were temporarily in California and a first-generation American whose family had fled persecution, respectively.
As much as I think America has a lot of things it needs to fix, there is no other country on earth where this would be possible. That's just a fact.
> there is no other country on earth where this would be possible. That's just a fact
I don't think this is the case. If anything, the US makes life very hard even for high-skilled, work-based immigrants. Many countries have a higher % of foreign-born residents than the US (Singapore, Australia, Germany, Canada).
I myself used to work at Google UK and my own team was 100% foreign born engineers from every continent.
“ Six of the eight authors were born outside the United States; the other two are children of two green-card-carrying Germans who were temporarily in California and a first-generation American whose family had fled persecution, respectively.”
The more interesting thing to me is that only one of them went to an elite American undergrad (Duke). The rest went to undergrad in India, Ukraine, Germany, and Canada (UToronto has a 43% acceptance rate).
Why would that be of note, especially in America? I think it would be an interesting observation in China or Japan, or some other country which is generally less welcoming to immigrants than the US
First generation immigrants are still a tiny minority of the population. The fact that the entire team consists effectively of first generation immigrants says something, probably both about higher education and American culture.
One thing is that getting a PhD is a good way to get into the U.S. As a foreigner, many visa and job opportunities open up to you with the PhD.
For an American, it's less of a good deal. Once you have the PhD, you make somewhat more money, but you're trading that for 6 years of hard work and very low pay. The numbers aren't favorable -- you have to love the topic for it to make any sense.
As a result, U.S. PhD programs are heavily foreign born.
I think you have completely the wrong takeaway here...
The US population is around 330 million. The world population is 8.1 billion people. What is that, about 4%? If you took a small random sample of people around the world, odds are none of them would be Americans. You're going to need a lot more samples to find a trend.
Yet when you turn around and look at success stories, a huge portion of this is going to occur in the US for multiple reasons, especially for the ability to attract intelligence from other parts of the globe and make it wealthy here.
I understand, but reality has to factor in — to get representative you would have to narrow your sample to English-speaking, narrow it to legal for long-term employment in the US, narrow it to having received at least an American-level higher education…
Doesn't some of it have to do with it being a self-selecting sample? If you come to America to study, you were a good student in your home country, giving you more chances of success than the average local citizen. Their children might be smarter on average.
Alternatively, if you come fleeing persecution, you are enterprising enough to set up something for your children. That hard work inculcates a sort of work ethic in such children, which in turn sets them up for success.
Speaking from experience as an immigrant myself.
I think all those are true, but if so, the percentage of first-generation immigrants should increase as you ascend the educational pyramid. I believe it does from undergrad to PhD, but not from the general population to higher education, so clearly there are at least two very different worlds.
There is a motivation that comes with both trying to make it and being cognizant of the relative opportunity that is absent in the second-generation and beyond.
There are also many advantages given to students outside the majority. When those advantages land not on the disadvantaged but on the advantaged-but-foreign, are they accomplishing their objectives? How bad would higher education have been in Europe? What is the objective, actually?
It looks like you are roughly right, but still, a sample of 8 people from this population is not likely to come up that way (probability about 1.4x10^-7, by my calculation).
It's really not; hiring within a single firm, especially for related functions, will tend to have correlated biases, rather than being a random sample from even the pool of qualified applicants, much less the broader population.
Because there are plenty of people in the US who are neither immigrants nor the children of immigrants. In fact, they're probably a significant majority. So to have 8 out of 8 be members of the smaller set is rather unlikely.
Not when you consider that those people were pulled from a worldwide talent pool for a relatively niche topic. If you can recruit anyone you want in the world and end up with 8/8 Americans, that would be weird.
Not sure if we can claim this any more, what with Texas shipping busloads of immigrants to NY and the mayor declaring it a citywide emergency, and both major parties rushing to get a border wall built.
The phrasing was "welcoming to immigrants", not "welcoming to the ever shrinking definition of good immigrants established by a bunch of octogenarian plutocrats".
"Illegal" is a concept - it's not conflating to assume that it's not the bedrock of the way people think.
Illegal immigrant is pretty well defined: an immigrant who didn't come through legal means. The people hired by Google are probably not illegal immigrants.
Illegal is a status more than it is a concept. Immigrating illegally is not the central definition of immigration, any more than shoplifting is the central definition of customer.
America is much more welcoming of immigration, by which I mean legal immigration, than Japan or China. This is not in dispute.
It is also, in practice, quite a bit more slack about illegal immigration than either of those countries. Although I hope that changes.
>America is much more welcoming of immigration, by which I mean legal immigration, than Japan or China. This is not in dispute.
It's not? It sounds like you know little about the world outside of America. Japan is stupidly easy to immigrate to: just get a job offer here at a place that sponsors your visa and it's pretty trivial to immigrate. Even better, if you have enough points, you can apply for permanent residence after 1 or 3 years, and the cost is trivial. In America, getting a Green Card is very difficult and costly, and depending on your national origin can be almost impossible. In Japan, there's no limits at all, per year or per country of origin, for work visas or PR. Of course, Japan is somewhat selective about who it wants to immigrate, but America is no different there, which is why there's such a huge debate about illegal immigration (in America it's not that hard; in an island country it's not so easy).
Yes, this is one of the actually admirable qualities of the US and California in particular. CA has one of the world’s largest economies because it attracts and embraces people from just about every part of the world.
Google is an ad company at the end of the day. Google is still making an obscene amount of money with ads. Currently, GenAI is a massive expense in training and running, and it only looks like it may harm future ad revenue (yet to be seen). Meanwhile, OpenAI has not surpassed that critical threshold where they dominate the market with a moat and become a trillion-dollar company. It is typically very hard for a company to change what they actually do, so much so that in the vast majority of cases it's much more effective (money-wise) to switch from developing products to becoming a rent-seeking utility by means of lobbying and regulation.
Simply put, Google itself could not have succeeded with GenAI without an outside competitor as it would have become its own competition and the CEO would have been removed for such a blunder.
> It is typically very hard for a company to change what they actually do
Microsoft started out selling BASIC runtimes. Then they moved into operating systems. Then they moved into Cloud. Satya seems to be putting his money where his mouth is and working hard to transition the company to AI now.
Apple has likewise undergone multiple transformations over the decades.
However, Google has the unique problem that their main cash cow is absurdly lucrative and has been an ongoing source of extreme income for multiple decades now, whereas other companies typically can't get away with a single product line keeping them on top of the world for 20+ years. (Windows being an exception, maybe Microsoft is an exception, but say what you will about MS, they pushed technology forward throughout the 90s and accomplished their goal of a PC in every house.)
Also, though I dislike it greatly, Microsoft purchasing GitHub was a brilliant way to buy into a growing segment that was natively hostile to what they stand for. LinkedIn was maybe less clever; they only make pocket change on it, but it is profitable.
Google doesn't really do acquisitions on that level. They do buy companies, but with the purpose of incorporating their biological distinctiveness into the collective. This tends to highlight their many failures in capturing new markets. The last high-profile successful purchases by Google that I recall, at least, were YouTube and Android, nearly twenty years ago.
Waze was bought 10 years ago, and I think that one went well. They didn't try to integrate it to the whole Google Account thing, but they definitely share data between it and Maps (so Waze users get better maps, and Google Maps gets better live traffic information).
Remember that Microsoft spent 10 years fighting the open internet, and only launched the cloud much later. The Windows business was allergic to the web-model, and they fought it tooth and nail.
It's not about just ad revenue though, is it? Many thousands, if not millions of sites rely entirely on Google Search for their revenue. You cannot simply replace all of them with a chatbot - there are massive consequences to such decisions.
>It's not about just ad revenue though, is it? Many thousands, if not millions of sites rely entirely on Google Search for their revenue.
I mean, these are kind of tangled together. From what we are seeing, chatbots actively dilute real sites by feeding their fake content back into Google's index in order to capture some of the ad revenue. This leads to the "gee, Google is really sucking" meme we see more and more often. The point is, Google had an absolute cash cow, and the march towards GenAI/AGI threatens to interrupt that model just as much as it promises to make the AGI winner insanely rich.
Their wildest success at Google would have meant the company's stock price doubling or tripling and their stock grants being worth a couple million dollars at most. Meanwhile all of them had blank checks from top VCs in the valley and the freedom to do whatever they wanted with zero oversight. What exactly could Sundar have done to retain them?
It would have gone much better for Google if the Brain team had been permitted to apply their work, but they were repeatedly blocked on stated grounds of AI safety and worries about impacting existing business lines through negative PR. I think this is probably the biggest missed business opportunity of the past decade, and much of the blame for losing key talent and Google's head start in LLMs ultimately resides with Sundar, although there are managers in between with their own share of the blame.
They should have been spun out with a few billion dollars budget, no oversight, and Google having 90% (or some other very high %) ownership and rights of first purchase to buy them back.
It is a successful model that has worked again and again to escape the problem of corporate bureaucracy.
Interestingly Google itself has followed this model with Waymo. Employees got Waymo stock not Google, and there was even speculation that Google took on outside investors. It's weird to me that they didn't consider generative AI would be as game changing as self driving cars, especially considering the tech was right in front of them.
Funnily enough, the same AI safety teams that held Google back from using large transformers in products are also largely responsible for the Gemini image generation debacle.
It is tough to find the right balance though, because AI safety is not something you want to brush off.
> AI safety is not something you want to brush off
It really depends on what exactly is meant by "safety", because this word is used in several different (and largely unrelated) meanings in this context.
The actual value of the kind of "safety" that led to the Gemini debacle is very unclear to me.
I think the issue is that there is no future for a trustworthy AI that doesn't completely cannibalize their ad revenue cash cow.
Like, who wants to use an AI that says things like, "... and that's why you should wear sunscreen outside. Speaking of skin protection, you should try Banana Boat's new Ultra 95 SPF sunscreen."
Yeah, but isn't the idea to cannibalize your own products before someone else does?!
In any case, consumer chatbots aren't the only way to sell the tech. Lots of commercial use too.
I don't see why ads couldn't be integrated with chatbots too for that matter. There's no point serving them outside of a context where the user appears to be interested in a product/service, and in that case there are various ways ads could be displayed/inserted.
"Based on the research I have available wearing sunscreen with a SPF at or greater than 95 is the most likely to prevent the harmful effects of sunburn".
Go to google -> Search "sunscreen" -> (Sponsored) Banana Boat new Ultra 95 SPF sunscreen
> Like, who wants to use an AI that says things like, "... and that's why you should wear sunscreen outside. Speaking of skin protection, you should try Banana Boat's new Ultra 95 SPF sunscreen."
On the other hand, their history suggests most people would be fine with an AI which did this as long as it was accurate:
> ... and that's why you should wear sunscreen outside.
> Sponsored by: Banana Boat's new Ultra 95 SPF sunscreen…"
He was too distracted with all the other things going on under his watch. He spent too much time trying to convince rank and file that it's OK to do business with customers they don't like, trying to ensure arbitrary hiring goals were made, and appeasing the internal activists that led protests and pressured to exit people that didn't fall in line.
This is why the CEO of Coinbase sent the memo a few years back stating that these types of people (not mission focused, self absorbed, and distracting) should leave the company. The CEO of Google should be fired.
How has he been appeasing internal activists? He's been pretty clear about firing them
Edit: I'm also unclear on what makes you think he's spending much time on "arbitrary hiring goals". I remember an initiative for hiring in a wider variety, also conveniently much cheaper, locations but there wasn't indication he was personally spending time on it
It is intellectually dishonest to call the military-industrial complex merely "customers that some Google employees didn't like".
Humans are more than their companies' mission, and your freedom to exercise your political views on how companies should be run inherently relies on this principle too. So your argument is fundamentally a hypocrite's projection.
Besides, many of the Google employees who are against defense projects are against them because those projects actively target their home countries. Much harder problem to fix!
I'm not sure if you're being sarcastic or not. Obviously you don't get to ignore your morals and values just because you're currently employed by someone?
I mean if you had the pedigree these people have, the right move is to leave a big company and go start your own thing, there is simply a lot more upside there. You’d be foolish to stay with a big company, not sure what Google could have done here. Not everyone wants to be Jeff Dean.
Many inventions of the modern age were largely R&D bloat as you'd describe it - in this age R&D is done in university, spun out into a startup, and dies shortly after the IPO - but there's a flicker of it occasionally living on if it can hide under the guise of a tax deduction.
Only money spent that turns into an asset is investment. I'm not sure how it is in the US, but in Germany, for example, not even registering a patent counts as an investment and does not imply the creation of an asset; only buying the patent creates the asset.
What they meant was that the R&D was fully deductible, not amortized over a number of years. It's like saying that a business's electricity bill is "tax-free" because they can deduct it from their revenues immediately.
And as it is, Google certainly paid property taxes immediately on the office building as well as FICA on all of the employees who, of course, paid their own taxes.
But haters of the R&D system love to call it "tax-free".
There's the benefit of expensing the costs rather than amortizing them, but there's also straight-up cash in the form of R&D tax credits; it's one method tech companies use to minimize their tax bills:
I mean this is how the US operates. Other countries continually try to refine their existing industries and societal structures. In the US, you let everything rot and hope that a hail Mary innovation comes out of somewhere and creates a breath of fresh air. It creates quite a wild society, but it seems to have worked for now!
It's amazing how much industry changed CS in the last decade. Significantly more than academia. Resources are part of it, of course, but I also think that the majority of the best people in CS choose research labs in industry over academia. No surprise here, of course: if OpenAI offers you more than $500K a year, why would you go to academia, make peanuts in comparison, and worry about politics for your tenure? Then when they get bored at big tech they raise money and start a company.
Why do you think transformers won't be key in making a level 4+ self-driving AI? It seems to me that vision-capable multi-modal transformers could be the missing part: they can understand what is happening in the world in a deductive way.
The vision transformer is capable of predicting that a running child is likely to follow that rolling soccer ball. It is capable of deducing that a particular situation looks dangerous or unusual and that it should slow down, or stay away from danger, in ways that the previous crop of AI could not.
Imo, the only thing currently preventing transformers from changing everything is the large amount of compute power required to run them. It's not currently possible to imagine GPT-4V running on an embedded computer inside a car. Maybe AI ASIC-type chips will solve that issue, maybe edge computing and 5G will find their use case... Let's wait and see, but I would bet that transformers will find their way into many places and change the world in many more ways than bringing us chatbots.
I think we've found repeatedly in self-driving that it's not enough to solve the problem in the normal case. You need an AI model that has good behaviors in the edge cases. For the most part it won't matter how good the vision models get, you're going to need similar models that can make the same predictions from LIDAR signals because the technology needs to work when the vision model goes crazy because of the reflectivity of a surface or other such edge cases where it completely misunderstands where the object is.
I don't quite agree on this one. While I think that Musk's choice to go full vision when he did was foolish because he made his product worse, his main point is not wrong: humans do drive well using mostly vision. Assuming you can replicate the thought process of a human driver using AI, I don't see why you could not create a self-driving car using only vision.
That's also where I would see transformers or another AI architecture with reasoning capabilities shine: the fact that it can reason about what is about to happen would allow it to handle edge cases much better than relying on dumb sensors.
As a human, it would be very difficult to drive a car just by looking at sensor data. The only vehicle I can think of where we do that is a submarine. Sensor data is good for classical AI, but I don't think it will handle edge cases well.
To be a reasonable self-driving system, it should be able to decide to slow down and maintain a reasonable safety gap because it judges the car in front to be driving erratically (e.g., due to driver impairment). Only an AI that can reason about what is going on can do that.
Sure but humans do a lot more with vision than just convolutions. So maybe we need to wait for AI to invent new techniques equally revolutionary and equally impactful to convolutions to the point where it's believable that AI models can handle the range of exceptions humans handle. Humans are very good at learning from small data where AI tends to be pretty terrible at one-shot learning by comparison. That's going to continue being hugely relevant for edge cases. We've seen many examples now where a self-driving car crashes due to too much sunlight distorting its perception of where objects are. We can either bury our heads in the sand and pretend AI models work like humans and need the exact same inputs humans do or we can admit there are limitations to the technology and act accordingly.
I also think dumb sensors is unfair, there are Neural Network solutions for processing LIDAR data so we are talking about a similar level of intelligence applied over both sensors.
> As a human, it would be very difficult to drive a car just looking at sensor data.
What is vision if not sensor data?? Our brains have evolved to efficiently process and interpret image data. I don't see why from-scratch neural network architectures should ever be limited to the same highly specific input type.
Can’t argue with this logic, more data points certainly helps. I was arguing about vision vs lidar, vision + lidar is certainly better than vision alone.
Bandwidth alone isn’t what prevents 5G from this sort of application, at least in the USA. Coverage maps tell the story: coverage is generally spotty away from major roads. Cell coverage isn’t a fixable problem in the near term, because every solution for doing that intersects with negative political externalities (antivax, NIMBYism, etc); if you can get people vaccinated for measles consistently again, then we can talk.
It all needs to be onboard. That’s where money should be going.
If you plan on letting llava-v1.5-7b drive your car, please stay away from me.
More seriously, for safety-critical applications, LLMs have some serious limitations (most obviously hallucinations). Still, I believe they could work in automotive applications assuming high quality of the output (better than the current SoA) and a very high token count (hundreds or even thousands of tokens/s and more), allowing us to brute-force the problem and run many inferences per second.
Could combine the existing real-time driving model with influence from the LLM, as an improvement to allow understanding of unusual situations, or as a cross-check?
I wasn't intending to say it would be useful today, but pushing back against what I understood to be an argument that, once we do have a model we'd trust, it won't be possible to run it in-car. I think it absolutely would be. The massive GPU compute requirements apply to training, not inference -- especially as we discover that quantization is surprisingly effective.
Google is an ads company at the end of the day. They make about 10% of revenue from Google Cloud, but the bulk of their profits is still from ads.
If they were threatened, Google could have easily owned the cloud. They were the first to a million servers and exabyte storage scale. Their internal infra is top notch.
Under no circumstances can they damage their cash cow. They have already shown themselves to be UX-hostile by making ads look like results and by YouTube forcing more ads onto users.
The biggest breakthrough an AI researcher can have at Google is to get more ad clicks across their empire.
Google management knows this. Without ad revenue, they are fucked.
That is my biggest fear. That with AI we are able to exploit the human mind for dopamine addiction through screens.
Meta has shown that their algorithm can make the feeds more addictive (higher engagement). This results in more ads.
With Insta Reels, YouTube, TikTok, etc., "digital heroin" keeps on getting better year after year.
Same thing is happening with the Figure 01 robot. There isn't much new in the demo Figure did recently; all the work has been published in the PaLM-SayCan paper and other DeepMind papers. Google is good at laying golden eggs and not realizing that it has laid them. Then employees get fed up, leave, and start their own companies so they can productize their work without management getting in the way.
It's 2024; do we really have to point out the fact that publications take liberties with headlines? It's been that way forever.
Besides that, are you denying that transformers are the fundamental piece behind all the AI hype today? That's the point of the article. I think the fact that a mainstream publication is writing about the Transformers paper is awesome.
AI has been people's dream since the beginning of written history. That is, everyone, in their own way, would want to invent something that can do things and think for itself. That's literally the meaning of AI.
The transformer is an evolution of large-scale seq2seq training. Diffusion models are an evolution of an even older approach: training denoising autoencoders.
It has been relatively gradual but accelerating progress, with the number of people working in the field increasing exponentially since 2010 or so, when deep learning on GPUs was popularized at NIPS by Theano.
Tens of thousands of people working together on deep learning. Many more on GPUs.
No, but it is weird to use "modern" here. Modern suggests a longer timeframe. I would say machine learning with deep NNs is modern AI. It's just not true that everything that isn't a transformer is outdated, but it is "kinda" true that everything that isn't a deep NN is outdated.
But that's been the case for the last 60 years. Whatever came out in the last 10 years is the first thing deserving to be called AI, and everything else is just basic computer science algorithms that every practitioner should know. Eliza was AI in 1967; now it's just string substitution. Prolog was AI in 1972; now it's logic programming. Beam search and A* were AI in the 1970s; now they're just search algorithms. Expert systems were AI in the 1980s; now they're just rules engines. Handwriting recognition, voice recognition, and speech synthesis were AI in the 90s; now they're just multilayer perceptrons aka artificial neural nets. SVMs were AI in the 00s; now they're just support vector machines. CNNs and LSTMs were AI in the 10s; now they're just CNNs and LSTMs.
Yeah, but for a while it seemed we'd gotten over that, and in the "modern era" people were just talking about ML. Nobody in 2012, as best I can recall, was referring to AlexNet as "AI", but then (when did it start?) at some point the media started calling everything AI, and eventually the ML community capitulated and started calling it that too - maybe because the VC's wanted to invest in sexy AI, not ML.
Consider "modern" to mean NN/connectionist vs GOFAI AI attempts like CYC or SOAR.
I dunno. The earliest research into what we now call "neural networks" dates back to at least the 1950's (Frank Rosenblatt and the Perceptron) and arguably into the 1940's (Warren McCulloch and Walter Pitts and the TLU "neuron"). And depending on how generous one is with their interpretation of certain things, arguments have been made that the history of neural network research dates back to before the invention of the digital computer altogether, or even before electrical power was ubiquitous (eg, late 1800's). Regarding the latter bit, I believe it was Jurgen Schmidhuber who advanced that argument in an interview I saw a while back and as best as I can recall, he was referring to a certain line of mathematical research from that era.
In the end, defining "modern" is probably not something we're ever going to reach consensus on, but I really think your proposal misses the mark by a small touch.
Sure, the history of NNs goes back a while, but nobody was attempting to build AI out of perceptrons (single layer), which were famously criticized as not being able to even implement an XOR function.
The modern era of NNs started with being able to train multilayer neural nets using backprop, but the ability to train NNs large enough to actually be useful for complex AI research can arguably be dated to the 2012 ImageNet competition, when Geoff Hinton's team repurposed GPUs to train AlexNet.
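To make the XOR and backprop points concrete, here is a toy sketch in plain NumPy (everything in it is illustrative, not historical): XOR is not linearly separable, so no single-layer perceptron can represent it, but a tiny one-hidden-layer network trained with backprop typically picks it up within a few thousand updates.

    # XOR toy demo: one hidden layer of sigmoid units trained with backprop.
    # A single linear layer cannot separate these four points.
    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)   # hidden layer (4 units)
    W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)   # output layer
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    lr = 1.0
    for _ in range(10000):
        # Forward pass.
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        # Backward pass: gradients of squared error through the sigmoids.
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

    print(np.round(out, 2).ravel())  # approaches [0, 1, 1, 0]

The hidden layer is what buys the nonlinear decision boundary; scaling that same recipe up (with GPUs and much more data) is the AlexNet story.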
But, AlexNet was just a CNN, a classifier, which IMO is better just considered as ML, not AI, so if we're looking for the first AI in this post-GOFAI world of NN-based experimentation, then it seems we have to give the nod to transformer-based LLMs.
Well transformers still have some plausible competition in NLP but besides that, there are other fields of AI where convnets or RNNs still make a lot of sense.
Modern AI != Machine Learning. Artificial intelligence has always been about models being able to accomplish a broad swath of tasks that they weren't explicitly trained on. It seems hard to argue we had accomplished this before massive transformers came along.
The article starts off by saying they all left Google. There's also an entire section narrating how google execs passed on transformers and didn't pick up until OpenAI released GPT.
>The picture internally is more complicated. “It was pretty evident to us that transformers could do really magical things,” says Uszkoreit. “Now, you may ask the question, why wasn’t there ChatGPT by Google back in 2018? Realistically, we could have had GPT-3 or even 3.5 probably in 2019, maybe 2020. The big question isn’t, did they see it? The question is, why didn’t we do anything with the fact that we had seen it? The answer is tricky.”
>Many tech critics point to Google’s transition from an innovation-centered playground to a bottom-line-focused bureaucracy. As Gomez told the Financial Times, “They weren’t modernizing. They weren’t adopting this tech.”
> There's also an entire section narrating how google execs passed on transformers and didn't pick up until OpenAI released GPT.
Google certainly seem to have fumbled on rapidly exploiting the tech, but I'm not sure that's totally fair.
Google built BERT in 2018, around the same time OpenAI built GPT-1, but neither was released as a consumer-facing product.
OpenAI's GPT-2 was built in February 2019, but OpenAI was hesitant to release it due to potential danger, only doing so later, in November 2019. GPT-3 followed in July 2020, but ChatGPT, the first publicly accessible view of the tech, didn't appear until November 2022 (how time flies!).
Google's first transformer-based chatbot, LaMDA, was released in May 2021, well over a year before ChatGPT, but was then rapidly withdrawn due to negative feedback and behavior.
The difference between ChatGPT's acceptable behavior, and success, compared to LaMDA, was OpenAI's use of RLHF to better control it. One could argue that Google should have developed something similar themselves earlier, but surely this might have been expected to come from these same people who had been working on language models at Google...
Regardless of the publication intentions, I like to see contemporary folks who have made an impact on the world through technology being recognized and elevated.
In economics, where I come from, people are not afraid to give accolades to relevant people, whether treasury secretaries, banking economists, or professors.
I feel that in tech, due to the lack of publications with the technical capability to understand that impact, plus the lack of goodwill from the media in general, there are a lot of folks doing relevant work out there who are ostracised by this "cult of the tech CEO/entrepreneur" or who intentionally stay under the radar to avoid bringing any negative light into their lives.
What's in it is positive association between AI and Google. Google is sort of seen as a joke in the latest wave of consumer-facing AI tools, and they desperately want to change that.
I'm not sure why you're being downvoted, but you're right — a story suggesting that 'modern' AI originated at Google is precisely the kind of narrative that Google's PR firm would shop around.
As someone who periodically listens to papers via text-to-speech, I was intrigued when I saw this link. I'm a little uncertain about the "AI simplification": if I'm reading a paper, I don't want to have to second-guess everything I hear as to whether it completely misinterpreted, elided, or hallucinated a point.
Do you have any personal experience with the results?
They’re good at summarizing text convincingly if you didn’t read the original document, but GPT-4 still makes a ton of basic factual mistakes: https://arxiv.org/abs/2402.13249
I think it will be several decades - maybe longer - before you can throw an arbitrary 20-page document into a computer program and get an accurate one-page summary. People need to take seriously that transformer LLMs don’t understand human language.