>AlphaQubit, a recurrent-transformer-based neural-network architecture that learns to predict errors in the logical observable based on the syndrome inputs (Methods and Fig. 2a). This network, after two-stage training—pretraining with simulated samples and finetuning with a limited quantity of experimental samples (Fig. 2b)—decodes the Sycamore surface code experiments more accurately than any previous decoder (machine learning or otherwise)
>One error-correction round in the surface code. The X and Z stabilizer information updates the decoder’s internal state, encoded by a vector for each stabilizer. The internal state is then modified by multiple layers of a syndrome transformer neural network containing attention and convolutions.
I can't seem to find a detailed description of the architecture beyond this bit in the paper and the figure it references. Gone are the days when Google handed out ML methodologies like candy...
(note: not criticizing them for being protective of their IP, just pointing out how much things have changed since 2017)
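For anyone curious, here's roughly what I picture one round looking like from the caption alone: a per-stabilizer state vector, updated recurrently each round, refined by transformer layers that mix attention with convolutions. To be clear, this is pure speculation — every name, dimension, and wiring choice below is my guess, not the paper's actual architecture:

```python
# Purely speculative sketch of one AlphaQubit-style decoding round, inferred
# from the figure caption alone. All names, sizes, and wiring are my guesses.
import torch
import torch.nn as nn

class SyndromeTransformerBlock(nn.Module):
    """Attention (global mixing) + convolution (local mixing) over stabilizers."""
    def __init__(self, dim: int, grid: int, n_heads: int = 4):
        super().__init__()
        self.grid = grid  # stabilizers assumed laid out on a grid x grid patch
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.conv = nn.Conv2d(dim, dim, kernel_size=3, padding=1)
        self.norm = nn.LayerNorm(dim)

    def forward(self, state):  # state: (batch, n_stabilizers, dim)
        h, _ = self.attn(state, state, state)  # every stabilizer attends to every other
        state = state + h
        b, n, d = state.shape
        img = state.transpose(1, 2).reshape(b, d, self.grid, self.grid)
        local = self.conv(img).reshape(b, d, n).transpose(1, 2)
        return self.norm(state + local)        # add nearest-neighbour information

class RecurrentSyndromeDecoder(nn.Module):
    def __init__(self, grid: int = 5, dim: int = 64, n_layers: int = 3):
        super().__init__()
        self.embed = nn.Linear(1, dim)  # embed each stabilizer's syndrome bit
        self.blocks = nn.ModuleList(
            [SyndromeTransformerBlock(dim, grid) for _ in range(n_layers)])
        self.readout = nn.Linear(dim, 1)  # logit: did the logical observable flip?

    def forward(self, syndromes):  # (batch, rounds, grid*grid), floats in {0, 1}
        b, rounds, n = syndromes.shape
        state = syndromes.new_zeros(b, n, self.embed.out_features)
        for t in range(rounds):
            # each round's X/Z stabilizer readout updates the recurrent state...
            state = state + self.embed(syndromes[:, t, :, None])
            # ...which is then refined by the syndrome transformer layers
            for block in self.blocks:
                state = block(state)
        return self.readout(state.mean(dim=1)).squeeze(-1)

# toy usage: batch of 2, 8 rounds, 25 stabilizer sites on a 5x5 grid
logits = RecurrentSyndromeDecoder()(torch.randint(0, 2, (2, 8, 25)).float())
```

Per the abstract quoted above, the real thing is then pretrained on simulated samples and finetuned on a limited quantity of experimental samples — but the internals beyond the caption are anyone's guess.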
Wait. Are you saying you were a paper author who described a method in their paper that wasn't actually implemented? I.e., your methods section contained a false description?
No, I'm saying the original doc2vec paper described an approach which the ML community never seemed to actually implement. There were things that were called doc2vec, but they were not what the paper described. Folks mostly just didn't seem to notice.
I like open source and reproducible methods too, but here the code was written by Claude and then exported. Is that considered a dependency? They can find a different LLM, or pay someone, to improve/revise/extend the code later if necessary.
Agree that the use of "AI engineers" is confusing. Think this blog should use the term "engineering software with AI-integration" which is different from "AI engineering" (creating/designing AI models) and different from "engineering with AI" (using AI to assist in engineering)
The term AI engineer is now pretty well recognised in the field (https://www.latent.space/p/ai-engineer), and is very much not the same as an AI researcher (who would be involved in training and building new models). I'd expect an AI engineer to be primarily a software developer, but with an excellent understanding of how to implement, use and evaluate LLMs in a production environment, including skills like evaluation and fine-tuning. This is not some skill set you can just bundle into "software developer".
I absolutely thought that it was dubbing after watching the video and not paying close attention. This seems like 2010s tech, and my expectations in 2024 are so much higher.
At 2 trips per day for 300M Americans over 7 days, that would put the rideshare takeover at ~4.2Bn trips per week. If we extrapolated based on the referenced graph and exponential growth, that would put the takeover at 2029 :)
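Back-of-the-envelope check, for anyone who wants to poke at the assumptions (the 2029 date obviously depends on what growth rate you read off that graph):

```python
trips_per_day = 2
people = 300_000_000               # Americans, per the assumption above
days = 7
weekly_trips = trips_per_day * people * days
print(f"{weekly_trips / 1e9:.1f}Bn trips/week")  # -> 4.2Bn
```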
It's safe to assume that the limiting factors will soon become sourcing of components for the perception and control stacks.
Yeah, I wonder how much money they're pouring into lidar production. Particularly considering that they've partnered with Hyundai, Stellantis, Mercedes-Benz Group AG, Jaguar Land Rover, and Volvo.
That's not "[understanding] a complex report." It exemplifies the author's point that they're portraying these as "tools for those unwilling to put in any effort." During this meeting, how is that guy going to critique any detail from the prospectus or make any useful contribution from the 20-second summary?
I think these LLMs are tremendously useful and the ads undersell them. But I can also appreciate that the ads have to appeal to the lowest common denominator, and showing real workflows in 30sec is difficult.
Correct me if I'm wrong, but we have no recent and explicit US gov't guidance on whether these model weights are copyrightable. The Copyright Office has said AI-generated outputs are not copyrightable[1], but hasn't weighed in on weights(?). Kind of seems like that should change?
Wasn't a relevant question for AlphaFold2, as the weights for it were CC BY 4.0 license.
These model weights (and many other ML weights) are clearly very useful in commercial settings, but Google thinks it can scare people into not using them with the wording of its license. Are they right?
It's deeper than just weights because the topic is biological.
In the old days of genomics there were massive patent wars. First, the human genome project itself. Craig Venter got massive funding to sequence the human genome with the understanding he'd patent all the genes. So there was a space race of sorts where the public sector sought to beat him, led by Francis Collins, now head of the NIH. It came out a tie (or that's what they called it); Bill Clinton brought them both on a stage and said "great job! also, genes aren't patentable!"
Then a whole stink arose around Myriad Genetics, who patented a BRCA test. Now that's a big-time gene as far as cancer goes; see: Angelina Jolie. Then in 2013 the Supreme Court ruled genes cannot be patented.
So what is AlphaFold 3? Is it a ground truth of which protein interacts with what? In that case it seems not patentable. Or is it a method, or algorithm, to estimate protein interactions? That's more of a grey area. Idk. If Google wanted to monetize it properly they'd probably keep it as an internal black project and cook up pharma collabs and such. But they've made it public(ish).

Still a long way to go, or at least some more steps. If we say protein A interacts with protein B, we then have to ask whether they're expressed in the same cell, and even that is not enough! Most bio measurements are of big batches of millions of cells, and it has to be the same cell at the same time. So if our batch is a million cells with protein A and a million cells with protein B, it looks like both are "on" in our batch of 2 million cells, but the truth is more nuanced. And even then there are other considerations, such as post-translational modifications and which cellular compartment these proteins reside in.
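To make that batching point concrete, here's a toy illustration with made-up numbers: pool a million cells that only express protein A with a million that only express protein B, and in bulk both proteins look "on" even though no single cell has both.

```python
import numpy as np

# Rows are cells, columns are [protein A, protein B]: 1 = expressed, 0 = not.
cells_a = np.tile([1, 0], (1_000_000, 1))  # cells expressing only A
cells_b = np.tile([0, 1], (1_000_000, 1))  # cells expressing only B
batch = np.vstack([cells_a, cells_b])      # the 2-million-cell bulk batch

print(batch.mean(axis=0))              # bulk readout: [0.5 0.5] -> both look "on"
print((batch.min(axis=1) > 0).mean())  # fraction of cells with BOTH: 0.0
```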
That has basically zero practical effect. If Google hands out the weights to thousands of people, and even one of them leaks them to somebody who hasn't "seen a bunch of agreements", then Google's only protection against further redistribution is copyright. Which doesn't exist.
Yes, they can come after the leaker. If they can identify the leaker, which they probably can't. But even crucifying the leaker won't put the genie back into the bottle.
Yeah, Google can control initial distribution, but in the long term I believe Google's ability to enforce the Terms of Use (re: no commercial use) is entirely dependent on whether it has IP ownership of the weights.
While not directly analogous, people often think "it's just math" or "it's just numbers" and therefore claim their use is ok. I would encourage those people to read about illegal numbers: https://en.m.wikipedia.org/wiki/Illegal_number
I think you could replace "weights" with other things that you probably would want restricted (eg private keys).
My point is exactly that: just because something is the result of some fixed mathematical process plus randomness does not mean there's no nuance and it's always ok to share/publish.
It hasn't been weighed in on, but as someone with no legal credentials, I wouldn't be surprised if the ultimate answer on models being copyrightable is "No."
Ultimately, the working parts of a given model are completely unknowable to even the smartest humans once you get past the bare basics. We know the shape of the model, the number of layers, and what the inputs/outputs correlate to, but not really anything else. It's the product of a machine trying things randomly until something works, with the best model produced then selected for production.
Not altogether different, from a high-level perspective, from generating an image or piece of text using a model. You're introducing a random factor and a number of steps, and the machine uses this unknowable model to produce something a person can understand.
I do think the law should update and grant some protections to people who produce models, because losing all protection would mean the death of open model releases, and then we'd be staring down the barrel of corpos controlling the entirety of the technology even more than we are now. At least open models provide some semblance of control for end users.
> I do think the law should update and grant some protections to people who produce models
This is what every reasonable person thinks, and because legislators and lawyers generally aren't that great at or keen on designing new frameworks for IP protection, the most likely outcome would be extending the concept of copyrights to models.
They did it for photography, for software programs, and they will do it for AI models.
I haven't used bluesky, but can't you just go to different sites/forums for discussions with different people? This seems to me like complaining about a lack of oil painters on hackernews, when deviantart is just a few clicks away
No, it's kind of all one global feed right now. There are different "curated" lists of users you can follow and "super clusters" of tight-knit interest groups like "swifties" and "kpop", but they're basically still plugging in moderation, so the admins have had to ban accounts that were trolling the lefties, which has left it with a bit of a monoculture IMO.
It's technically ready to federate, but there aren't any other services that make it easy to create an account: you can self-host your identity, which is good, and then select your own moderation services to follow, but as to "who's on bluesky", it's a certain type of person.
Vigorous discussion between different viewpoints is completely different from multiple polar-opposite echo chambers.
Take reddit for example, you have /r/politics that downvotes, deletes, and shadow bans all conservative voices out of existence, and /r/conservative that does the same thing for liberal voices.
I'd like a place where thoughtful discussion of all viewpoints, left, right, and center are not just tolerated, but encouraged.
You really can't find people with different views arguing on reddit? Seems like arguing is like 99% of the comments there.
If you don't like /r/politics, there are alternative subreddits specifically built for political debate, like r/changemyview.