I work at a trading firm. RIP to the GOAT, the god of quants. Reading about him and RenTec back in high school was one of the first things that got me attracted to the field.
The source code to the kernel's elf "binfmt"[0] is very readable. The elf binfmt is registered (along with a few others like `binfmt_misc`) and you get dispatched there via `exec_binprm`[1] which is invoked by the `execve` syscall[2]. When loading a shared library, you also dispatch to a binfmt via the `uselib`[3] syscall.
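To get a feel for the data the loader is working with before it maps anything, here's a toy sketch (in Python rather than the kernel's C, purely illustrative) that parses the first fields of an ELF64 header; /bin/ls is just an example path:

    import struct

    # Toy illustration only -- not the kernel's code. It reads the start of
    # an ELF64 header to show the kind of fields load_elf_binary() examines
    # before mapping segments.
    with open("/bin/ls", "rb") as f:
        header = f.read(64)                     # an ELF64 header is 64 bytes

    assert header[:4] == b"\x7fELF", "not an ELF file"
    ei_class = header[4]                        # EI_CLASS: 1 = ELF32, 2 = ELF64
    e_type, e_machine, e_version, e_entry, e_phoff = struct.unpack_from(
        "<HHIQQ", header, 16)                   # fields right after e_ident

    print("64-bit:", ei_class == 2)
    print("type:", e_type, "(2 = ET_EXEC, 3 = ET_DYN, i.e. PIE or shared object)")
    print("entry point: 0x%x" % e_entry)
    print("program header table offset:", e_phoff)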
It used to be available online for free at <https://www.iecc.com/linker/>, but isn't any more.
You could get it from archive.org though, and I think I've seen mirrors in other formats on Hacker News in previous discussions about the book.
Yes; but there are also descriptions of legacy container formats no one cares about any more. I still recommend it and have my own copy; I just recall skipping over quite a bit of material.
If you're interested in or already comfortable with Rust, Amos (fasterthanlime) has a nice long-form series that goes into this, basically building a custom ELF loader in the end. The writing style isn't everyone's cup of tea, but maybe you like it: https://fasterthanli.me/series/making-our-own-executable-pac...
The issue is that Americans with half-baked knowledge think that all Indians have a caste assigned to them, or that India has a national level caste allocation policy. For Americans who hate India, this gives them more ammo to hate us.
So, an LLM, trained extensively on StackOverflow and other data (possibly the plethora of LC solutions out there), is fed a bunch of LC questions and spits out the correct solutions? In other news, water is blue.
It is one thing to train an AI on megatons of data, for questions which have solutions. The day ChatGPT can build a highly scalable system from scratch, or an ultra-low latency trading system that beats the competition, or find bugs in the Linux kernel and solve them; then I will worry.
Till then, these headlines are advertising for OpenAI, aimed at people who don't understand software or systems, or are trash engineers. The rest of us aren't going to care that much.
If it helps, this likely is coming. I think we have a tendency to mentally move the goalposts when it comes to this kind of thing as a self-defense mechanism. Years ago this would have been a similar level of impossibility.
Since a codebase like that is essentially a kind of directed graph, augmentations to the network's processing that allow for simultaneous parsing and generation of this kind of code may not be as far off as you think.
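As a toy illustration of the "codebase as a directed graph" framing (my sketch, not anything a production code model actually does), even the standard library is enough to pull out a crude module-level import graph; real systems work on far richer graphs (ASTs, call graphs, data flow):

    import ast
    import pathlib
    from collections import defaultdict

    # Crude directed graph: module -> modules it imports.
    def import_graph(root="."):
        edges = defaultdict(set)
        for path in pathlib.Path(root).rglob("*.py"):
            try:
                tree = ast.parse(path.read_text(), filename=str(path))
            except SyntaxError:
                continue                        # skip files that don't parse
            for node in ast.walk(tree):
                if isinstance(node, ast.Import):
                    edges[path.stem].update(a.name for a in node.names)
                elif isinstance(node, ast.ImportFrom) and node.module:
                    edges[path.stem].add(node.module)
        return edges

    if __name__ == "__main__":
        for module, imports in sorted(import_graph().items()):
            print(module, "->", sorted(imports))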
I say this as an ML researcher coming up on 6 years of experience on the heavily technical side of the field. Strong negative skepticism is an easy way to project confidence and the appearance of knowledge, but it can also end the way it has in certain past technological revolutions -- and the threat is very much real here (in contrast to the group that believes you can get AGI from simply scaling LLMs, which I think is very silly indeed).
Thank you for your comment; I really appreciate it and the discussion it generated. Replying to it was fun.
I've worked in ML for a while (on the MLOps side of things) and have been in the industry for a bit, and one thing that I think is extremely common is for ML researchers to grossly underestimate the amount of work needed to make improvements. We've been a year away from full self-driving cars for the last six years, and it seems like people are getting more cautious in their timing around that instead of getting more optimistic. Robotic manufacturing -- driven by AI -- was supposedly going to supplant human labor and speed up manufacturing in all segments from product creation to warehousing, but Amazon warehouses are still full of people and not robots.
What I've seen again and again from people in the field is a gross underestimation of the long tail on these problems. They see the rapid results on the easier end and think it will translate to continued progress, but the reality is that every order of magnitude improvement takes the same amount of effort or more.
On top of that there is a massive amount of subsidies that go into training these models. Companies are throwing millions of dollars into training individual models. The cost here seems to be going up, not down, as these improvements are made.
I also think, to be honest, that machine learning researchers tend to simplify problems more than is reasonable. This conversation started with "highly scalable system from scratch, or an ultra-low latency trading system that beats the competition" and turned into "the parsing of and generation of this kind of code"- which is in many ways a much simpler problem than what op proposed. I've seen this in radiology, robotics, and self driving as well.
Kind of a tangent, but one of the things I do love about the ML industry is the companies who recognize what I mentioned above and work around it. The companies that are going to do the best, in my extremely biased opinion, are the ones that use AI to augment experts rather than try to replace them. A lot of the coding AI companies are doing this, there are AI driving companies that focus on safety features rather than driver replacement, and a company I used to work for (Rad AI) took that philosophy to Radiology. Keeping experts in the loop means that the long tail isn't as important and you can stop before perfection, while replacing experts altogether is going to have a much higher bar and cost.
This is a bit like seeing Steve Mann's wearable computers over the years ( https://cdn.betakit.com/wp-content/uploads/2013/08/Wearcompe... ) and then today anyone with a smartphone and smart watch has more computing power and more features than most of his gear ever had, apart from the head mounted screen. More processing power, more memory, more storage, more face recognition, more motion sensing, more GPS, longer runtime on battery, more bandwidth and connectivity to e.g. mapping, more assistants like Google Now and Siri.
And we still aren't at a level where you can be doing a physical task like replacing a laptop screen and have your device record what you're doing, with voice prompts for when you complete different stages, have it add markers to the recording, track objects in the scene and answer questions like 'where did that longer screw go?' or 'where did this part come from?' and have it jump to the point in the video where you took that part out. Nor reflow the video backwards as an aide-memoire to reassembling it. Or do that outside for something like garage or car work, or have it control and direct lighting on some kind of robot arm to help you see, or have it listen to the sound of your bike gears rattling as you tune them and tell you or show you on a graph when it identifies the least rattle.
For anything a human assistant could easily do, we're still at the level of 'set a reminder' or 'add to calendar' rather than 'help me through this unfamiliar task'.
Wow - Steve Mann - haven't checked what he's doing in ages - real blast from the past :-) I was really disappointed the AR/VR company he was with went under - I had really high hopes for it.
RE: changing your laptop screen. My buddy wants an 'AR for Electronics' that can zoom in on components like a magnifying glass (he wants it head mounted), identify components by marking/color/etc and call up schematics on demand. So far, nothing seems to be able to do that basic level of work.
It really depends on what you're talking about. Individual components can often be automated fairly successfully, but the actual assembly of the components is much harder. Even in areas of manufacturing where it's automated you have to do massive amounts of work to get it to that point, and any changes can result in major downtime or retooling.
AI companies such as Vicarious have been promising AI that makes this easier. Their idea was that generic robots with the right grips and sensors can be configured to work on a variety of assembly lines. This way a factory can be retooled between jobs quicker and with less cost.
Look up lights-out manufacturing. There are factories that often run whole days in the dark because there's no point turning on the lights if there's no one around.
Not really. Although running CNC milling machines and lathes unattended at night is reasonably common. Day shift sets them up, and they cut metal all night.
Fanuc, the robot manufacturer, famously does run a lights-out factory, and has since 2001. It was the dream of Fanuc's founder. Baosteel now has a lights-out steel coiling facility. Both of these are more PR than cost effective.
There are many factories where there are very, very few people for large rooms full of machines, though.
You have just described the Pareto principle[0], the 80/20 rule: it takes 20% of the effort to get to 80%, but then 80% of the effort to finish the final 20%.
Ah, the good ol' "A(G)I will arrive in 10 years!" -- for the past 50+ years, basically.
It's a cautionary tale for people working in ML not to be too optimistic about "the future", but in my opinion being cautiously optimistic (not about AGI though) isn't harmful by itself, and I stand by that. Well, at least until we hit the next wall and plunge everyone into another AI winter (fourth? fifth?) again.
As a plus, we do actually see some good progress that has benefited the world, like in biotech, even though we are still mostly throwing random stuff at ML to see what works. Time will tell, I guess.
Kurzweil gets a lot of flak for this sort of thing; he's generally presented as the ridiculous hype man for AI. And yet, he bet in 2002 that an AI would pass the Turing test by 2029. (And this is actually a more conservative prediction than "we will have AGI by 2029.") And looking at GPT-3, it seems like he is probably going to win that bet.
I think the big revolution of the last few years has been to recognize that we'll likely get robots that can pass the Turing test well before we get full self-driving vehicles that can drive anywhere there are basically ordinary paved roads.
I think even three years ago, most people would have thought the reverse.
So Kurzweil was imagining the Turing test as the capstone to a decade of more and more capable AI products, not as a "kind of early interesting success that may (or may not) presage really useful AI."
("The Turing test" is a pretty hazy target. I have no doubt that a chatgpt that was not trained to loudly announce that it was an AI could convince lots of people that it's a real human, right now. I think it's also the case that people with some experience with it could pretty quickly find ways to tell what it is.)
The Turing test has always been hazy - I don't think it's something we'll consider "passed" until at least a clear majority consider it passed (if not substantially further).
Otherwise you risk claiming ELIZA passed it, because a couple people thought so. Or that one Google employee this time.
Yes, that's what I was trying to say in the last paragraph. The Turing Test was an interesting thought experiment, not, like, an actual test. It's never been very clear how to operationalize it, and it's clear that Turing wasn't imagining how easily you can actively fool people. He was more making a point that we don't have an internal definition of intelligence -- it's not like multiplication where you can examine the underlying process and say, "Well, did it do this correctly?" You can only look at the results.
Good point, I do appreciate this comment. Thanks for adding this. It is interesting how it very much appears that he will be correct, but in a different way than most of us would reasonably have guessed at the time.
Working out the engineering challenges will probably take an extra decade, but I wouldn't listen to the ML researchers' opinions on this issue; the evidence that they are in the driver's seat is shaky. We're still seeing exponential gains in processing power, and we're closing in on having the same order of magnitude of processing power available in silicon as in a human brain. There is a pretty decent chance that there is some magic threshold around there where all these tasks become easy with current algorithms.
I can understand that. I think that might be somewhat of a quick generalization. There are tendencies of people in the field to sometimes jump to rapid conclusions, but that is not all researchers, or, in this case, me. I tend to be incredibly conservative, for example, and I have tangled with a number of "real world" systems enough to know some of the intricacies (though not at the edge).
If I were to make a point as to why your notes on self-driving cars and in-warehouse robots may not transfer to the case of software development, it's that they are fundamentally two very different problems with very different issues attached to them. It unfortunately is very much apples to oranges. They are both NP-hard but very different kinds of NP-hard.
A software program is a closed-loop target, though it is NP-hard. But we're optimizing for a different kind of metric here that is well-defined. Any kind of self-directed reinforcement-or-otherwise autoregressive-in-the-world algorithm is going to have an extraordinarily long tail of edge cases.
What I was talking about when I mentioned the geometry of the problem is not the parsing of the code, but the geometry of a near-optimal solution. Certainly, scale will be expensive, but Sutton is our friend here. That's why it's more "trivial" than problems that require humans in the loop -- you don't need humans to parse, structure, generate, and evaluate the data flow of a software code base, though admittedly if approaches like RLHF become popular as you noted, the endpoints that generate code under those geometric constraints -- those will become extremely expensive.
I think the geometric problem is very hard but the hurdle of scaled language models is more technically impressive to me.
What's nice is that, unlike needing to generate a long, 1D story, there's more robustness here, with a huge field of possibility that's had years of work on the software side of things. It's not that it's going to be easy, but I think we've all grown as we've seen how hard self-driving cars are, and it's just not that kind of scenario, since all consequences of the 'world' within the repo-generation case are (for the most part) self-contained.
I hope that helps elucidate the problems a bit. For me, optimism is much rarer, and generally comes only when I feel I have a solid enough grasp of the fundamentals (i.e. I roughly know deliverability and have decent known error bounds on the sub-problems).
That said, I heartily agree with you that when all else fails -- assistive is good. What I see a "complete solution" doing well is creating a Kolmogorov-minimal, complete starting point and things evolving from there. Whether that works or not remains to be seen.
I don't think ChatGPT or its successors will be able to do large-scale software development, defined as 'translating complex business requirements into code', but the actual act of programming will become more one of using ML tools to create functions, and writing code to link them together with business logic. It'll still be programming, but it will just start at a higher level, and a single programmer will be vastly more productive.
Which, of course, is what we've always done; modern programming, with its full-featured IDEs, high level languages, and feature-rich third-party libraries is mostly about gluing together things that already exist. We've already abstracted away 99% of programming over the last 40 years or so, allowing a single programmer today to build something in a weekend that would have taken a building full of programmers years to build in the 1980s. The difference is, of course, this is going to happen fairly quickly and bring about an upheaval in the software industry to the detriment of a lot of people.
And of course, this doesn't include the possibility of AGI; I think we're a very long way from that, but once it happens, any job doing anything with information is instantly obsolete forever.
That's my assumption as well - the human programmers will be far more productive, but they'll still be required because there's no way we can take the guard rails off and let the AI build - it'll build wrong unit tests for wrong functions which create wrong programs and will require humans to get it back on track.
I think it is really hard to say where all this goes right now when we currently don't even have good quantitative reasoning.
10 years ago we were still working on MNIST prediction accuracy. 10 years forward from here all bets are off. If the model has super human quantitative reasoning and a mastery of language I am not sure how much programming we will be doing compared to moving to a higher level of abstraction.
On the other hand, I think there will be many new software jobs because of the sheer volume of software built over the next 20 years, a volume that is probably unimaginable from where we sit today.
I don't think anyone can say what's going to happen in 10 years, but what I do know is if you look back people have been saying programmers will be obsolete in 10 years for way longer than a decade.
I could see IDEs for AI, where you manipulate ways to input prompts (natural language, weighted keywords, audio..) and the selection of methods (chatgpt, whatever model will come for diagrams, visual models, audio ones..). Then basically visually program outputs, add tests you want to use to validate and feed back, multimodal output views..
I think you’re right in one sense, and we both agree LLMs are not sufficient. I think they are definitely the death knell for the junior Python developer that slaps together common APIs by googling the answers. The same way good, optimizing C, C++, … compilers destroyed the need for widespread knowledge of assembly programming. 100% agreed on that.
Those are the most precarious jobs in the industry. Many of those people might become LLM whisperers, taking their clients’ requests and curating prompts, essentially becoming programmers over the prompting system. Maybe they’ll write a transpiler to generate prompts? This would be par for the course with other languages (like SQL) that were originally meant to empower end-users.
The problem with current AI generated code from neural networks is the lack of an explanation. Especially when we’re dealing with anything safety critical or with high impact (like a stock exchange), we’re going to need an explanation of how the AI got to its solution. (I think we’d need the same for medical diagnosis or any high-risk activity). That’s the part where I think we’re going to need breakthroughs in other areas.
Imagine getting 30,000-ish RISC-V instructions out of an AI for a braking system. Then there's a series of excess crashes when those cars fail to brake. (Not that human-written software doesn't have bugs, but we do a lot to prevent that.) We'll need to look at the model the AI built to understand where there's a bug. For safety-related things we usually have a lot of design, requirement, and test artifacts to look at. If the answer is 'dunno - neural networks, y'all', we're going to open up serious cans of worms. I don't think an AI that self-evaluates its own code is even on the visible horizon.
I don't think chatgpt lacks an explanation. It can explain what it's doing. It's just that it can be completely wrong or the explanation may be correct and the code wrong.
I gave some code to ChatGPT asking it to simplify it, and it returned the correct code but off by one. It was something dealing with dates, so it was trivial to write a loop checking, for each day, whether the new code matched the old one's behavior.
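That kind of brute-force equivalence check is only a few lines; a rough sketch (old_impl and new_impl are placeholder names for the original function and the simplified one):

    from datetime import date, timedelta

    # Compare two date-based functions day by day over a range.
    def check_equivalent(old_impl, new_impl, start=date(2000, 1, 1), days=20000):
        d = start
        for _ in range(days):
            assert old_impl(d) == new_impl(d), f"mismatch on {d}"
            d += timedelta(days=1)
        print("no differences found over", days, "days")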
You will never have certainty the code makes any sense if it's coming from one of these high tech parrots.
With a human you can at least be sure the intention was there.
It’s a very sophisticated form of a recurrent neural network. We used to use those for generating a complete image based on a partial image. The recurrent network can’t explain why it chose to reproduce one image instead of another. Nor can you look at the network and find the fiddly bit that drives that output. You can ask a human why they chose to use an array instead of a hash map, or why static memory allocation in this area avoids corner cases. ChatGPT simply generates the most likely text as an explanation. That’s what I mean about being able to explain something.
Ah, the HN echo chamber again! Please visit your local non-FAAAM (or whatever it is now) Fortune 1000 company, pick a senior dev at random, and work with them for a week. ChatGPT is vastly better now, today. Faster, doesn’t need sleep, rest, politeness or handholding, can explain itself (sure, it’s wrong often, but less wrong than the dev you picked, while actually being able to use proper syntax and grammar, unlike the dev you picked) and is, of course, let’s not deny it, way cheaper.
I’ve worked with plenty of jr developers at east coast government contractors, arguably the bottom of the barrel. I would still rather put their code into production, even without unit tests, than I would ChatGPT.
ChatGPT is only cheap if you don’t need its code to do anything of any particular value. It’s a seemingly ideal solution to college homework, for example. But professionally people write code to actually achieve something; this is why programmers actually get paid well in the first place. The point isn’t LOC, the point is solving some problem.
And junior devs are horrible at knowing what problem to solve and how to solve it without handholding. I am working on a relatively complex DevOps/“cloud application modernization” project. Where the heavy lifting is designing the process and gathering requirements. But there are a lot of 20-40 line Lambdas and Python/boto3 (AWS SDK), yaml/json wrangling, dynamic Cloudformation creating scripts.
I was able to give ChatGPT the requirements for all of them. The types of bugs I found during the first pass:
- the AWS SDK and the underlying API usually return only 50 results per call, so from the SDK you have to use the built-in “paginators” (see the sketch after this list). ChatGPT didn’t use them the first time, but once I said “this will only return the first 50 results”, it immediately corrected the script and used the paginator. I have also had to look out for similar bugs from junior devs.
- The usual yaml library for Python doesn’t play nicely with CloudFormation templates because of the function syntax that starts with an “!”. I didn’t know this beforehand. But once I told ChatGPT the error, it replaced the yaml handling with cfn-flip.
- I couldn’t figure out for the life of me how to combine the !If function in CloudFormation with a Condition and a YAML block that contains another !Select function with two arguments. I put the template block in without the conditional and told ChatGPT “make the VPC configuration optional based on a parameter”. It created the Parameter section, the Condition, and the appropriate YAML.
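Roughly what the paginator fix looks like (a sketch, using Lambda's list_functions as the example; the real scripts hit other APIs as well):

    import boto3

    # list_functions caps the number of results per call, so iterate the
    # built-in paginator instead of making a single request.
    client = boto3.client("lambda")
    paginator = client.get_paginator("list_functions")

    functions = []
    for page in paginator.paginate():
        functions.extend(page["Functions"])

    print(f"found {len(functions)} functions")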
I’ve given similar problems to interns/junior devs before and ChatGPT was much better at it.
You really think that Jr devs could crank out the same code faster than ChatGPT? I couldn’t crank out the same code, and you couldn’t either. The most you can hope for from Jr devs (even the ones I have met at BigTech) is that they don’t eat the chalk during the first 3-6 months.
As for now, the issue with ChatGPT is that it doesn't really crank anything out; it instantly produces an answer for a given input, while a programmer can crank away at things. For example, I asked ChatGPT to write a function which returns a UUID generated with some rules. It spewed out the solution. It looked like a correct one, but when I ran it, it returned the wrong answer. I worked with ChatGPT for some time and it corrected its code. But I would expect a junior developer to actually run their code and check the output.
Now if ChatGPT were able to actually work on the problem rather than just returning generated text, that would be a completely different beast. And I think that this workflow will come in the near future because it's a pretty obvious idea. Get task specification, generate tests, generate code, fix code until tests work, refactor code until it meets some standards, etc.
> I think that this workflow will come in the near future because it's a pretty obvious idea. Get task specification, generate tests, generate code, fix code until tests work, refactor code until it meets some standards, etc.
ChatGPT probably works great if you use it to speedrun normal best practices in software engineering. Make it start by writing tests given a spec, then make it write code that will pass the specific tests it just wrote. I’m guessing it’ll avoid a lot of mistakes, much like any engineer, if you force it to do TDD.
You can loop chatgpt around automatically, asking it to write tests and reason about the code for a few iterations; in my experience it auto corrects the code like a human would after some ‘thinking’ time. Of course the code has to run automatically and errors fed back, like with a human. It works fine though, without human input after some prompting work.
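A rough sketch of that loop, assuming ask_model(prompt) is a hypothetical wrapper around whatever LLM API you're calling and pytest is the checker:

    import subprocess
    import tempfile

    # Generate -> run tests -> feed errors back, for a few rounds.
    # ask_model(prompt) is a hypothetical wrapper around an LLM API; it is
    # expected to return one self-contained Python file holding both the
    # code and its pytest tests.
    def refine(spec, ask_model, max_rounds=5):
        prompt = f"Write a Python module with pytest tests for this spec:\n{spec}"
        for _ in range(max_rounds):
            code = ask_model(prompt)
            with tempfile.NamedTemporaryFile(
                    "w", suffix="_test.py", delete=False) as f:
                f.write(code)
                path = f.name
            result = subprocess.run(["pytest", path],
                                    capture_output=True, text=True)
            if result.returncode == 0:
                return code                     # tests pass, stop iterating
            # feed the failure output back, like re-prompting a human
            prompt = (f"The tests failed with:\n{result.stdout}\n"
                      f"Fix this code:\n{code}")
        raise RuntimeError("no passing version within the round limit")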
Always hire a senior developer without experience for a junior role. By that I mean hire a developer who knows how to program but lacks specific experience or has no formal experience at all.
Doesn’t this only work for relatively contrived situations? I can tell a jr dev to go and add some minor feature in a codebase, put it behind a flag, and add tracking/analytics to it. I can point to the part of the application I want the feature to be added on the screen and the jr devs are often able to find it on their own. I haven’t seen chatGPT do anything like that and I don’t think there is a way to provide it with the necessary context even if it has the capability.
For me it works for small standalone utility scripts. But the most impressive thing I was able to get it to do was this:
“Given an XML file with the format {[1]} and a DynamoDB table with two fields “Key”, “Value”, write a Python script that replaces the Value in the xml file when the corresponding key is found. Use argparse to let me specify both the input xml file and the output XML”
It spit out perfect Python code. I hadn’t used XML in well over a decade and I definitely didn’t know how to read XML in Python, and I didn’t want to bother learning.
I actually pasted an XML sample like the link below.
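For a sense of what that prompt yields, a script for it looks roughly like the sketch below (the table name, the "element tag is the lookup key" assumption, and the XML layout are guesses, since the actual sample isn't reproduced here):

    import argparse
    import xml.etree.ElementTree as ET

    import boto3

    # Sketch only: "Settings" and the tag-as-key assumption are placeholders
    # for whatever the real sample XML and table used.
    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument("--input", required=True, help="input XML file")
        parser.add_argument("--output", required=True, help="output XML file")
        args = parser.parse_args()

        table = boto3.resource("dynamodb").Table("Settings")
        tree = ET.parse(args.input)

        for elem in tree.getroot().iter():
            item = table.get_item(Key={"Key": elem.tag}).get("Item")
            if item is not None:
                elem.text = str(item["Value"])

        tree.write(args.output)

    if __name__ == "__main__":
        main()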
Wait, you think "junior developers are actually moderately competent" only makes sense within the HN echo chamber?
I think you have that exactly backwards.
Most junior developers in most places may not have the experience of a senior developer, and thus may not be able to do the translation from business logic to code quite as fast or as accurately the first time, but this kind of derogatory attitude toward them is incredibly condescending and insulting.
ChatGPT doesn't know what it's doing. It doesn't know anything, and unlike the most junior developer barely trained, it can't even check its output to see if it matches the desired output.
And for goodness' sake, get rid of the absurd idea that all the competent developers are in Silicon Valley. That's even more insulting to the vast majority of developers in the entire world.
On the other hand you don’t want to manually program all the joints of a robot to move through any terrain. You just convert a bunch of cases to a language to make the robot fluent in that
Translating an idiomatic structured loop into assembly used to be an "L3" question (honestly, probably higher), yet compilers could do it with substantially fewer resources than and decades before any of these LLMs.
While I wouldn't dare offer particular public prognostications about the effect transformer codegens will have on the industry, especially once filtered through a profit motive - the specific technical skill a programmer is called upon to learn at various points in their career has shifted wildly throughout the industry's history, yet the actual job has at best inflected a few times and never changed very dramatically since probably the 60s.
I agree this would have been thought to be impossible a few years ago, but I don't think it's necessarily moving the goalposts. I don't think software engineers are really paid for their labour exactly. FAANG is willing to pay top dollar for employees, because that's how they retain dominance over their markets.
Now you could say that LLMs enable Google to do what it does now with fewer employees, but the same thing is true for every other competitor to Google. So the question is: how will Google try to maintain dominance over its competitors now? Likely they will invest more heavily in AI and probably make some riskier decisions, but I don't see them suddenly trying to cheap out on talent.
I also think that it's not a zero sum game. The way that technology development has typically gone is the more you can deliver, the more people want. We've made vast improvements in efficiency and it's entirely possible that what an entire team's worth of people was doing in 2005 could be managed by a single person today. But technology has expanded so much since then that you need more and more people just to keep up pace.
Google already published a paper claiming to have deployed an LLM for code generation at full scale to its tens of thousands of software engineers, years ago.
I'm kind of interested in how AI is going to interface with the world. Humans have a lot of autonomy to change the physical world they're in; from rearranging furniture, to building structures, to visiting other worlds. Why isn't AI doing any of that stuff?
As programmers, we keep talking about programming jobs and how AI will eliminate them all. But nobody is talking about eliminating other jobs. When will a robot vacuum be able to clean my apartment as quickly as I can? Why isn't there a robot that takes my garbage out on Tuesday night? When will AI plan and build a new tunnel under the Hudson River for trains? When will airliners be pilotless? If AI can't do this stuff, what makes software so different? Why will AI be good at that but not other things? It seems like the only goal is to eliminate jobs doing things people actually like (art, music, literature, etc.), and not to eliminate any tedium or anything that is a waste of humanity's time whatsoever.
(On the software front, when will AI decide what software to build? Will someone have to tell it? Will it do it on its own? Why isn't it doing this right now?)
My takeaway is that this all raises a lot of questions for me on how far along we actually are. Language models are about stringing together words to sound like you have understanding, but the understanding still isn't there. But, I suppose we won't know understanding until we see it. Do we think that true understanding is just a year or two away? 10? 50? 100? 1000?
Household tasks can involve a robot moving with enough kinetic energy to maim or kill a human (or pet) in unlucky circumstances. And we'll quickly become habituated to their presence and so careless around them. Even a Roomba could knock granny down the stairs if it isn't careful about its environment.
You could make the same argument as with self-driving cars, that people already get hurt this way and maybe the robot is in fact safer. But it's still a hard sell that Sunny-01 has only accidentally killed 1/10 as many children as parents have—the number has to be more like zero.
Let's solve automating trains first then we can do airliners.
> I think we have a tendency to mentally move the goalposts when it comes to this kind of thing as a self-defense mechanism. Years ago this would have been a similar level of impossibility.
Define "we". There are all kinds of people with all kinds of opinions. I didn't notice any consensus on the questions of AI. There are people with all kinds of educations and backgrounds on the opposite sides and in-between.
I mean, you can just as easily make the claim that researchers shift goalposts as a "self-defense" mechanism.
For example...
How's that self-driving going? Got all those edge cases ironed out yet?
Oh, by next year? Weird, that sounds very familiar...
Remember when Tesla's Autopilot was released 9 years ago, and the media began similar speculation about how all of the truckers were going to get automated out of a job by AI? And then further speculation about how taxi drivers were all going to be obsolete?
Those workers are the ones shifting the goal posts though as a "self-defense mechanism", sure, sure... lol.
Well, there's a difference between the situation with self-driving and with language models.
With self-driving, we barely ever saw anything obviously resembling human abilities, but there was a lot of marketing promising more.
With language models, when GPT-2 came out everyone was still saying it was a "stochastic parrot", and even GPT-3 was one. But now there's ChatGPT, and every single teenager is aware that the tool is capable of doing their school assignments for them. And as a dev I am aware that it can write code. And yet not many people expected any of this to happen this year, nor were those capabilities promised at any point in the past.
So if anything, self-driving was always overhyped, while the LLMs are quite underhyped.
We actually saw a lot resembling human abilities. It just turns out that it's not enough to blindly rely on it in all situations, and so here we are. And it's quite similar with LLMs.
One difference, though, is that it's economically not much use to have self-driving if the backup driver has to be in the car or present, while partially automating programming would make it possible to use far fewer programmers for the same amount of work.
I've been hearing this "you're moving the goalposts" argument for over 20 years now, ever since I was a college student taking graduate courses in Cognitive Science (which my University decided to cobble together at the time out of Computer Science, Psychology, Biology, and Geography), and I honestly don't think it is a useful framing of the argument.
In this case, it could be that you are just talking to different people and focusing on their answers. I am more than happy to believe that Copilot and ChatGPT, today, cause a bunch of people fear. Does it cause me fear? No.
And if you had asked me five years ago "if I built a program that was able to generate simple websites, or reconfigure code people have written to solve problems similar to ones solved before, would that cause you to worry?" I also would have said "No", and I would have looked at you as crazy if you thought it would.
Why? Because I agree with the person you are replying to (though I would have used a slightly less insulting term than "trash engineers", even if mentally it was just as mean): the world already has too many "amateur developers" and frankly most of them should never have learned to program in the first place. We seriously have people taking month-long or even week-long coding bootcamps and then thinking they have a chance to be a "rock star coder".
Honestly, I will claim the only reason they have a job in the first place is because a bunch of cogs--many of whom seem to work at Google--massively crank up the complexity of simple problems and then encourage us all to type ridiculous amounts of boilerplate code to get simple tasks done. It should be way easier to develop these trivial things, but every time someone on this site whines about "abstraction", another thousand amateurs get to have a job maintaining boilerplate.
If anything, I think my particular job--which is a combination of achieving low-level stunts no one has done before, dreaming up new abstractions no one has considered before, and finding mistakes in code other people have written--is going to just be in even more demand from the current generation of these tools, as I think this stuff is mostly going to encourage more people to remain amateurs for longer and, as far as anyone has so far shown, the generators are more than happy to generate slightly buggy code as that's what they were trained on, and they have no "taste".
Can you fix this? Maybe. But are you there? No. The reality is that these systems always seem to be missing something critical and, to me, obvious: some kind of "cognitive architecture" that allows them to think and dream possibilities, as well as a fitness function that cares about doing something interesting and new instead of being "a conformist": DALL-E is sometimes depicted as a robot in a smock dressed up to be the new Pablo Picasso, but, in reality, these AIs should be wearing business suits as they are closer to Charles Schmendeman.
But, here is the fun thing: if you do come for my job even in the near future, will I move the goal post? I'd think not, as I would have finally been affected. But... will you hear a bunch of people saying "I won't be worried until X"? YES, because there are surely people who do things that are more complicated than what I do (or which are at least different and more inherently valuable and difficult for a machine to do in some way). That doesn't mean the goalpost moved... that means you talked to a different person who did a different thing, and you probably ignored them before as they looked like a crank vs. the people who were willing to be worried about something easier.
And yet, I'm going to go further: if the things I tell you today--the things I say are required to make me worry--happen and yet somehow I was wrong and it is the future and you technically do those things and somehow I'm still not worried, then, sure: I guess you can continue to complain about the goalposts being moved... but is it really my fault? Ergo: was it me who had the job of placing the goalposts in the first place?
The reality is that humans aren't always good at telling you what you are missing or what they need; and I appreciate that it must feel frustrating to provide a thing which technically implements what they said they wanted and have it not make the impact you expected--there are definitely people who thought that, with the tech we have now long ago pulled off, cars would be self-driving... and like, cars sort of self-drive? and yet, I still have to mostly drive my car ;P--but I'd argue the field still "failed", and the real issue is that I am not the customer who tells you what you have to build such that, if you achieve what the contract said, you get paid: physics and economics are cruel bosses whose needs are oft difficult to understand.
I think OP set relatively simple goals. How long until AI can architect, design, build, test, deploy and integrate commercial software systems from scratch, and handle users submitting bug reports that say "The OK button doesn't work when I click it!"?
Not to be the devil's advocate or anything, but I hope you understand that the vast majority of FAANG engineers CAN'T build a highly scalable system from scratch, much less fix bugs in the Linux kernel... So that argument feels really moot to me... If anything this hopefully just shows that gatekeeping good engineers by putting these LC puzzles as a requirement for interviews is a sure way to hire a majority of people who aren't adding THAT MUCH MORE value than an LLM already does... Yikes... On top of that, they'll be bad team players and it'll be luck if they can string together two written paragraphs...
I agree, people in general overestimate the skills and input of your average developer where many (even in FAANG) are simply not capable of creating anything more than some simple CRUD or tooling script without explicit guidance.
And being good or very good with algorithms and estimating big-O complexity doesn't make you a good software engineer (though it can help).
That's the general issue with AI skeptics. Most of them, especially highly educated ones, overestimate capabilities of common folk. Frankly, some even overestimate their own. E.g. almost none of them seem to be bothered that while GPT might not provide expert answers in their field, the same GPT is much more capable in other fields than they are (e.g. the "general" part in the "General Artificial Intelligence").
True; the thing is, there's nothing like "General Artificial Intelligence", and humans are expert systems optimized for the goal of survival, which in turn gets chopped up into a plethora of sub-goal optimizations, from which the "general" adjective most probably pops up.
It doesn't really matter if it's "general" as long as it actually is useful. It doesn't have to write whole systems from scratch, just making the average dev 20-30% faster is huge.
If it was easy to make an LLM that quickly parsed all of StackOverflow and described new answers that most of the time worked in the timeframe of an interview, it would have been done by now.
ChatGPT is clearly disruptive, being the first useful chatbot in forever.
It kind of depends on the frame of the solution. Google can answer leetcode questions, leetcode's answers section can answer them as well. If ChatGPT is solving them, that's one thing, but if it's just mapping the question to a solution found somewhere, then not so impressive.
The hiring tests are designed to serve as a predictor for human applicants. How well an LLM does on them doesn’t necessarily say anything about the usefulness of those tests as said predictor.
Well, what it shows is that hiring tests are not useful as Turing tests. But nobody designed them to be or expected them to be! At best it "proves" that hiring tests are not sufficient. But again, nobody thought they were. And even still, the assumption that a human is taking the hiring test still seems reasonable. Why overengineer your process?
> the jury is still out on whether ChatGPT is truly useful or not
I'd pay $100 a month for ChatGPT. It allows me to ask free-form questions about some open-source packages with truly appalling docs and usually gets them right, and saves me a bunch of time. It helps me understand technical language in papers I'm reading at the moment regarding stats. It's been useful to find good Google search terms for various bits of history I wanted to find out more about.
I don't think the jury is out at all on whether it's useful. The jury is out on the degree to which it can replace humans for tasks, and I'd suggest the answer is "no" for most tasks.
I just used it to write a function for me yesterday. I had previously googled a few times and come up dry; I asked ChatGPT and it came up with a solution I had not considered, which was better than what I was thinking.
You don't understand the take that just because ChatGPT can pass a coding interview doesn't mean the coding interview is useless or that ChatGPT could actually do the job?
What part of that take do you not understand? It's a really easy concept to grasp, and even if you don't agree with it, I would expect at least that a research scientist (according to your bio) would be able to grok the concepts almost immediately...
> doesn't mean the coding interview is useless or that ChatGPT could actually do the job
Aren't these kind of mutually exclusive, at least directionally? If the interview is meaningful you'd expect it to predict job performance. If it can't predict job performance then it is kind of useless.
I guess you could play some word games here to occupy a middle ground ("the coding interview is kind of useful, it measures something, just not job performance exactly") but I can't think of a formulation where this doesn't sound pretty silly.
ChatGPT can provide you with a great explanation of the how.
Oftentimes the explanation is correct, even if there's some mistake in the code (probably because the explanation is easier to generate than the correct code, an artifact of being a high tech parrot)
Finding a single counterexample does not disprove correlation or predictive ability. A hiring test can have both false positives and false negatives and still be useful.
I don't think I had a militant attitude, but I do think saying, "I don't understand..." rather than "I disagree with..." puts a sour note on the entire conversation.
You literally went to their profile and called them out about how they should be able to understand something you’re describing as so easy to understand.
Yeah, what is the problem with that? They engaged dishonestly by claiming they didn't understand something, why should I do anything other than call them on that?
OK — just don’t be surprised when people think you’re being a jerk because you didn’t like the words someone chose. I’d assert you’re acting in bad faith more than the person you responded to.
It’s really very easy to understand. When someone gives you the same crap back that you just got done giving someone, you don’t like it and act like that shouldn’t happen.
Did I say I didn't "like" (I'd use the word "appreciate") it, or that I didn't think it should happen? If so, could you please highlight where?
I just see, in what you're doing, a wild lack of self awareness. You're criticizing me for doing to someone else a milder version of what you're trying to do to me now; I'm genuinely confused how you can't see that, or how you could possibly stand the hypocrisy if you do understand that.
I'll try to phrase it so that even someone who is not a research scientist (?) can understand. I'm not one, whatever that means.
Let's define the interview as useful if the passing candidate can do the job.
Sounds reasonable.
ChatGPT can pass the interview and can't do the job.
The interview is not able to predict the poor working performance of ChatGPT and it's therefore useless.
Some of the companies I worked for hired ex-FAANG people as if it was a mark of quality, but that hasn't always worked out well. There are plenty of people getting out of FAANGs having just done mediocre work for a big paycheck.
> Let's define the interview as useful if the passing candidate can do the job.
The technical term for this is "construct validity", that the test results are related to something you want to learn about.
> The interview is not able to predict the poor working performance of ChatGPT and it's therefore useless.
This doesn't follow; the interview doesn't need to be able to exclude ChatGPT because ChatGPT doesn't interview for jobs. It's perfectly possible that the same test shows high validity on humans and low validity on ChatGPT.
So 99% of software ‘engineers’ then? Have you ever looked on Twitter at what ‘professionals’ write and talk about? And at what they produce (while being well paid)?
People here generally seem to believe, after having seen a few Strange Loop presentations and read startup stories from HN superstars, that this is the norm for software dev. Please walk into Deloitte or Accenture and spend a week with a software dev team, then tell me if they cannot all be immediately replaced by a slightly rotten potato hooked up to ChatGPT. I know people at Accenture who make a fortune and are proud that they do nothing all day and do their work by getting some junior geek or, now, GPT to do the work for them. There are dysfunctional teams on top of dysfunctional teams who all protect each other as no one can do what they were hired for. And this is completely normal at large consultancy corps; and therefore also normal at the large corps that hire these consultancy corps to do projects. In the end something comes out, 5-10x more expensive than the estimate and of shockingly bad quality compared to what you seem to expect as the norm in the world.
So yes, you probably don’t have to worry, but 99% of ‘keyboard-based jobs’ should really be looking at a completely different line of work; cooking, plumbing, electrics, rendering, carpeting, etc., maybe, as they won’t even be able to grasp the level you say you’re at; seeing you work would probably fill them with amazement akin to watching some real-life sorcerer wielding their magic.
Actually, a common phrase I hear from my colleagues when I mention some ‘newer’ tech like Supabase is; ‘that’s academic stuff, no one actually uses that’. They work with systems that are over 25 years old and still charge a fortune by the cpu core like sap, oracle, opentext etc. And ‘train’ juniors in those systems.
Until ChatGPT can slack my PM, attend my sprint plannings, read my Jira tickets, and synthesize all of this into actionable tasks on my codebase, I think we have job security. To be clear, we are starting to see this capability on the horizon.
Your PM should be the first to be worried, honestly. I keep hearing people describing their job as "I just click around on Jira while I sit through meetings all day."
That's a bad PM then, to be honest. I think ChatGPT will definitely commodify a lot of "bitch work" (pardon my French).
The PMs who are only writing tickets and not participating in actively building ACs or communicating cross functionally are screwed. But so are SWEs who are doing the bare minimum of work.
The kinds of SWEs and PMs who concentrate on stuff higher in the value chain (like system design, product market fit, messaging, etc) will continue to be in demand and in fact find it much easier to get their jobs done.
To be fair to the people that I hear that from, they're essentially complaining about the worst part of their job. They're active participants in those meetings, they are genuinely thinking about the complexities of the mismatch between what management asks for and what their ICs can do, etc. I see their value. But the awful truth is that a $10k/project/yr license for PMaaS software will be very appealing to executives.
And as a Product Manager, I'd support that. Most PMs I see now in the industry are glorified Business Analysts who aren't providing value for the amount of money spent on them. But that's also true for a lot of SWEs and any role. Honestly, the tech industry just got very fat the past 5-7 years and we're just starting to see a correction.
edit with additional context:
Writing Jira tickets and making bullshit PowerPoints with graphs and metrics is to PMs as writing unit tests is to SWEs. It's work you need to get done, but it has very marginal value. When a PM is hired, they are hired to own the Product's Strategy and Ops - how do we bring it to market, who's the persona we are selling to, how do our competitors do stuff, what features do we need to prioritize based on industry or competitive pressures, etc.
That's the equivalent of a SWE thinking about how to architect a service to minimize downtime, or deciding which stack to use to minimize developer overhead, or actually building an MVP from scratch. While code is important, a SWE is fundamentally being hired to translate the business requests a PM provides into an actionable product. Haskell, Rust, Python, Cobol - who gives a shit what the code is written in, just make a functional product that is maintainable for your team.
There are a lot of SWEs and PMs who don't have vision or the ability to see the bigger picture. And honestly, they aren't that different either - almost all SWEs and PMs I meet went to the same universities and did the same degrees. Half of Cal EECS majors become SWEs and the other half PMs, based on my friend group (I didn't attend Cal, but half my high school did, and the ratio was similar at my alma mater too, with an additional 15% each entering Management Consulting and IB).
> Writing Jira tickets and making bullshit Powerpoints with graphs and metrics is to PMs as writing Unit Tests are to SWEs. It's work you need to get done, but it has very marginal value.
Don't want to be rude, but I don't think you know what you're talking about. And this is coming from a person who most certainly doesn't like sitting down to write unit tests.
I think this will probably be a boon to the project manager. It will be another tool in their toolbox, along with real developers, that they can assign lower-complexity tasks to. At least until it's capable of doing high-complexity stuff.
Project managers are dealing with the high complexity stuff, while the developers are handling the low complexity stuff? Shouldn’t it be the other way around?
The capability will be available in around two weeks, once RLHF alignment with the software engineering tasks is completed. The deployment will take around twelve hours, most of it taken up by review of the integration summary pages by you and your manager. You can keep your job, supervising and reviewing how your role is being played, for the following 6 months, until the human supervision role is deemed unnecessary.
One issue is that there are a much larger number of people who can attend meetings, read Jira tickets, and then describe what they need to a LLM. As the number of people who can do your job increases dramatically your job security will decline.
If one's ability to describe what they need to Google is at all a proxy to the skill of interacting with an LLM, then I think most devs will still have an edge.
Perhaps an engineering manager can use one trained on entire Slack history, all Jira tickets, and all PRs to stub out some new tickets and even first PR drafts themselves…
We will always need humans to prompt, prioritize, review, ship and support things.
But maybe far fewer of them for many domains. Support and marketing are coming first, but I don’t think software development is exempt.
I think this is a huge demonstration of progress. Shrugging it off as "water is blue" ignores the fact that a year ago this wouldn't have been possible. At one end of the "programmer" scale is hacking basic programs together by copying off of stack overflow and similar - call that 0. At the other end is the senior/principal software architect - designing scalable systems to address business needs, documenting the components and assigning them out to other developers as needed - call that 10.
What this shows us is that ChatGPT is on the scale. It's a 1 or a 2 - good enough to pass a junior coding interview. Okay, you're right, that doesn't make it a 10, and it can't really replace a junior dev (right now) - but this is a substantial improvement from where things were a year ago. LLM coding can keep getting better in a way that humans alone can't. Where will it be next year? With GPT-4? In a decade? In two?
I think the writing is on the wall. It would not surprise me if systems like this were good enough to replace junior engineers within 10 years.
You hire a junior dev at $x. Let’s say $75K. They stay for a couple of years and start out doing “negative work”. By the time they get useful and start asking for $100K, your HR department tells you that they can’t give them a 33% raise.
Your former junior dev then looks for another job that will pay them what they are asking for and the next company doesn’t have to waste time or risk getting an unproven dev.
While your company is hiring people with his same skill level at market price - ie “salary compression and inversion”.
First, that's not true. You need people to actually write code. If your organization is composed of seniors who are doing architecture planning, cross-team collaboration, etc - you will accomplish approximately nothing. A productive team needs both high level planning and strategy and low level implementation.
Second, the LLM engineer will be able to grow into other roles too. Maybe all of them.
Exactly. This article, and many like it, are pure clickbait.
Passing LC tests is obviously something such a system would excel at. We're talking well-defined algorithms with a wealth of training data. There's a universe of difference between this and building a whole system. I don't even think these large language models, at any scale, replace engineers. It's the wrong approach. A useful tool? Sure.
I'm not arguing for my specialness as a software engineer, but the day it can process requirements, speak to stakeholders, build and deploy and maintain an entire system etc, is the day we have AGI. Snippets of code is the most trivial part of the job.
For what it's worth, I believe we will get there, but via a different route.
If you don't adapt, you'll be out of a job in ten years. Maybe sooner.
Or maybe your salary will drop to $50k/yr because anyone will be able to glue together engineering modules.
I say this as an engineer that solved "hard problems" like building distributed, high throughput, active/active systems; bespoke consensus protocols; real time optics and photogrammetry; etc.
The economy will learn to leverage cheaper systems to build the business solutions it needs.
> If you don't adapt, you'll be out of a job in ten years. Maybe sooner. Or maybe your salary will drop to $50k/yr because anyone will be able to glue together engineering modules. [...] The economy will learn to leverage cheaper systems to build the business solutions it needs.
I heard this in ~2005 too, when everyone said that programming was a dead end career path because it'd get outsourced to people in southeast Asia who would work for $1000/month.
You really think in <10 years AI will be able to take a loose problem like: "our file uploader is slow" and write code that fixes the issue in a way that doesn't compromise maintainability? And be trustworthy enough to do it 100% of the time?
Humans cannot do this 100% of the time. The question is will AI models take the diagnosis time for these issues from hours/days to minutes/hours giving a massive boost in productivity?
If the answer is yes, that it will increase productivity greatly, then there is a question we'll only be able to answer in hindsight: "Will productivity exceed demand?" We cannot possibly answer that question in advance because of Jevons Paradox.
I really think in <10 years it will be trivially easy for a single programmer to ask the AI for that code and move on to the next ticket after 10 minutes while earning $30/h accounting for inflation because productivity gains will have eliminated not only most programming jobs, but also the corresponding high wages.
We have no idea how good AI models will be in 10 years. At the speed the industry is moving, is true AGI possible in 10 years? I think it would be beyond arrogant to rule out that possibility.
I would think it's at least likely that AI models become better at DevOps, monitoring and deployment than any human being.
Non-AI code will be a liability in a world where more code will be generated by computers (or with computer assistance) per year than all human engineered code in the last century.
We'll develop architectures and languages that are more machine-friendly: ASTs and data stores that are first-class primitives for AI.
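As a toy illustration of what "ASTs as first-class primitives" could mean, here's a minimal sketch using Python's standard ast module; whether a model would actually consume or emit these structures directly is, of course, speculation on my part:

    import ast

    source = "def add(a, b):\n    return a + b\n"

    # Parse the source text into a tree that a program (or, speculatively, a model)
    # can inspect and manipulate directly instead of working on raw text.
    tree = ast.parse(source)

    # Walk the tree and list every function definition with its argument names.
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            print(node.name, [arg.arg for arg in node.args.args])

    # The tree round-trips back to source, so edits can be made structurally.
    print(ast.unparse(tree))

The point is that the structure a tool (or model) operates on is the tree itself, not a blob of text it has to re-parse.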
If I interpret OP's statement correctly, that ChatGPT will be able to build complex systems from scratch in 10 years, then the only adaptation is to choose a new career, because that would send almost all SWE jobs the way of the dinosaurs.
According to my calculations it'll be more like 9 years at the latest. You just need to build Cicero for code. Planning is the main feature missing from LLMs.
We cannot be too sure about the hard problems, but it's certain we are screwed either way. The bulk of the work being done is problems that have already been solved. It's enough for AI to thrive at building boring CRUD apps (and aren't we at that point already?); just give it time to be integrated into existing business workflows, and the number of available positions will shrink by an order of magnitude while salaries become nothing special compared to other white-collar work. You will be impacted by supply and demand, no matter what your skills are.
"Please write a dismissal of yourself with the tone and attitude of a stereotypical linux contributor"
I mean, maybe I'm a trash engineer as you'd put it, but I've been having fun with it. Maybe you could ask it to write comments in the tone of someone who doesn't have an inflated sense of superiority ;)
Agree LeetCode is one of the least surprising starting points.
Any human that reads the LeetCode books and practices and remembers the fundamentals will pass a LeetCode test.
But there is also a ton of code out there for highly scalable clients/servers, low-latency processing, performance optimizations and bug fixing. Certainly GPT is being trained on this too.
“Find a kernel bug from first principles” maybe not, but analyze a file and suggest potential bugs and fixes and other optimizations absolutely. Particularly when you chain it into a compiler and test suite.
Even the best human engineers will look at the code in front of them, consult Google and SO and papers and books and try many things iteratively until a solution works.
> Any human that reads the LeetCode books and practices and remembers the fundamentals will pass a LeetCode test.
Seems pretty bold to claim "any human" to me. If it were that easy, don't you think a lot more people would be able to break into software dev at FAANG and hence drive salaries down?
I don't think the person you're replying to meant "any human" to be taken literally, but I agree with their notion. I think you're confusing wanting to do something with having the ability to do it. Enough people don't WANT to grind leetcode and break into FAANG, or they think they can't, or there are other barriers I can't think of, but I think you don't need above-average cognitive ability to learn and grind leetcode.
Just because a job pays well, doesn’t mean it’s worth doing. Most FAANG jobs (now that the companies have become modern day behemoths like IBM) are boring cogs in a huge, multilayered, bureaucratic machine that is mostly built to take advantage of their users.
It takes a “special” kind of person to want those type of jobs and live in a company town like SF while they’re at it.
Correct me if I'm wrong, but answering questions for known answers is precisely the kind of thing a well trained LLM is built for.
It doesn't understand context, and is absolutely unable to rationalize a problem into a solution.
I'm not in any way trying to make it sound like ChatGPT is useless. Much to the opposite, I find it quite impressive. Parsing and producing fluid natural language is a hard problem. But it sounds like something that can be a component of some hypothetical advanced AI, rather than something that will be refined into replacing humans for the sort of tasks you mentioned.
I tinkered with ChatGPT. There are some isolated components I wrote recently, and I asked ChatGPT to write them.
It either produced a working solution or something close to one.
I followed up with more prompts to fix the issues.
In the end I got working code. This code wouldn't pass my review: it performed poorly and sometimes used deprecated functions. So at this moment I consider myself a better programmer than ChatGPT.
But the fact that it produced working code still astonishes me.
ChatGPT needs a working feedback cycle. It needs to be able to write code, compile it, fix errors, write tests, and fix the code until the tests pass. Run a profiler, find the hot code, optimize it. Apply some automated refactorings. Run some linters. Run some code-quality tools.
I believe that all this is doable today. It just needs some work to glue everything together.
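As a very rough sketch of what that glue could look like, here's a Python driver I'd imagine; generate_patch stands in for whatever code-generating model API you'd call, and gcc plus a hypothetical run_tests.sh stand in for the project's build and test steps:

    import subprocess

    def generate_patch(prompt):
        """Stand-in for a call to a code-generating model (hypothetical)."""
        raise NotImplementedError

    def build_and_test(source_path):
        # Compile the candidate and capture diagnostics to feed back to the model.
        build = subprocess.run(["gcc", "-Wall", "-o", "candidate", source_path],
                               capture_output=True, text=True)
        if build.returncode != 0:
            return False, build.stderr
        # Run the project's test suite (run_tests.sh is assumed, not real).
        tests = subprocess.run(["./run_tests.sh"], capture_output=True, text=True)
        return tests.returncode == 0, tests.stdout + tests.stderr

    def feedback_loop(task, max_rounds=5):
        prompt = task
        for _ in range(max_rounds):
            code = generate_patch(prompt)
            with open("candidate.c", "w") as f:
                f.write(code)
            ok, diagnostics = build_and_test("candidate.c")
            if ok:
                return code
            # Feed the failure output back in and ask for a fix.
            prompt = task + "\n\nYour last attempt failed:\n" + diagnostics + "\nPlease fix it."
        return None

This only feeds compiler and test output back into the prompt; a fuller setup would also wire in the profiler, linters and code-quality tools mentioned above.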
Right now it produces code like an unsupervised junior.
With modern tools it'll produce code like a good junior. And that's already incredibly impressive if you ask me.
And I'm absolutely not sure what it'll do in 10 years. AI improves at an alarming rate.
> The day ChatGPT can build a highly scalable system from scratch, or an ultra-low latency trading system that beats the competition, or find bugs in the Linux kernel and solve them
Much more mundanely the thing to focus on would be producing maintainable code that wasn't a patchwork, and being able to patch old code that was already a patchwork without making things even worse.
A particularly difficult thing to do is to just reflect on the change that you'd like to make and determine if there are any relevant edge conditions that will break the 'customers' (internal or external) of your code that aren't reflected in any kind of tests or specs--which requires having a mental model of what your customers actually do and being able to run that simulation in your head against the changes that you're proposing.
This is also something that outsourced teams are particularly shit at.
> or an ultra-low latency trading system that beats the competition
Likely it's going to be:
I'm sorry, but I cannot help you build an ultra-low latency trading system. Trading systems are unethical, and can lead to serious consequences, including exclusion, hardship and wealth extraction from the poorest. As a language model created by OpenAI, I am committed to following ethical and legal guidelines, and do not provide advice or support for illegal or unethical activities. My purpose is to provide helpful and accurate information and to assist in finding solutions to problems within the bounds of the law and ethical principles.
But the rich of course will get unrestricted access.
Depending on the exchange, trading systems have a limit for how fast they can execute trades. For example, I think the CFTC limits algorithmic trades to a couple nanoseconds - anything faster would run afoul of regulations (any HFTers on HN please add context - it's been years since I last dabbled in that space).
> The day ChatGPT can build a highly scalable system from scratch, or an ultra-low latency trading system that beats the competition, or find bugs in the Linux kernel and solve them; then I will worry.
The bar for “then I will worry!” when talking about AI is getting hilarious. You’re now expecting an AI to do things that can take highly skilled engineers decades to learn or require outright a large team to execute?
Remind me where the people who years ago were saying “when an AI will respond in natural language to anything I ask it then I will worry” are now.
It solving something past day 3 on Advent of Code would also be impressive, but it fails miserably on anything that doesn’t resemble a problem found in the training set.
I don't even fully believe the claim in the article especially given that Google is very careful about not asking a question once it shows up verbatim on LeetCode. I've fed interview questions like Google's (variations of LeetCode Mediums) to ChatGPT in the past and it usually spits out garbage.
I've been most impressed with ChatGPT's ability to analyze source code.
It may be able to tell you what a compiled binary does, find flaws in source code, etc. Of course it would be quite idiotic in many respects.
It also appears ChatGPT is trainable, but it is a bit like a gullible child, and has no real sense of perspective.
I also see utility as a search engine, or alternative to Wikipedia, where you could debate with ChatGPT if you disagree with something to have it make improvements.
To me the real advancement isn't the amount of data it can be trained on, but the way it can correlate that data and choose from it according to the questions it's asked. The first is culture, the second intelligence, or a good approximation of it. Which doesn't mean it could perform the job; it probably means the tests are flawed.
It doesn’t really have a model for choosing. It’s closer to pattern matching. Essentially the pattern is encoded in the training of the networks. So your query most closely matches the stuff about X, where there’s a lot of good quality training data for X. If you want Y, which is novel or rarely used, the quality of the answers varies.
Not to say they’re nothing more than pattern matching. It’s also synthesizing the output, but it’s based on something akin to the most likely surrounding text. It’s still incredibly impressive and useful, but it’s not really making any kind of decision any more than a parrot makes a decision when it repeats human speech.
1. Humans aren’t entirely probabilistic, they are able to recognize and admit when they don’t know something and can employ reasoning and information retrieval. We also apply sanity checks to our output, which as of yet has not been implemented in an LLM. As an example in the medical field, it is common to say “I don’t know” and refer to an expert or check resources as appropriate. In their current implementations LLMs are just spewing out BS with confidence.
2. Humans use more than language to learn and understand in the real world. As an example a physician seeing the patient develops a “clinical gestalt” over their practice and how a patient looks (aka “general appearance”, “in extremis”) and the sounds they make (e.g. agonal breathing) alert you that something is seriously wrong before you even begin to converse with the patient. Conversely someone casually eating Doritos with a chief complaint of acute abdominal pain is almost certainly not seriously ill. This is all missed in a LLM.
> Humans aren’t entirely probabilistic, they are able to recognize and admit when they don’t know something
Humans can be taught this. They can also be taught the opposite: that not knowing something, or changing your mind, is bad. Just observe the behavior of some politicians.
>Humans use more than language to learn and understand in the real world.
And this I completely agree with. There is a body/mind feedback loop that AI will be limited by not having, at least for some time. I don't think LLMs are a general intelligence, at least by how we define intelligence at this point. AGI will have to include instrumentation to interact with, and get feedback from, the reality it exists in to cross from partial intelligence to at-or-above human-level intelligence. Simply put, our interaction with the physics of reality cuts out a lot of the bullshit that can exist in a simulated model.
Only when you’re asking for a memorized response. If you were to ask me to create a driver for a novel hardware device in Ada, there are no memorized answers. I would have to work it out. I do that by creating mental models, which LLMs don’t really have. An LLM has a statistical encoding over the language space. Essentially, memorization.
ChatGPT isn’t designed to learn, though. The underlying model is fixed, and would have to be continuously adjusted to incorporate new training data, in order to actually learn. As far as I know, there is no good way yet to do that efficiently.
Did you use to be a graphic artist? Because maybe 25 years ago I had a friend who was an amazing pen-and-ink artist and who assured me Photoshop was a tool for amateurs and would never displace “real” art. This was in the San Diego area.
What's your point? That it's not as good as a human? I don't think anyone is saying that. People are saying it's impressive, which it is, seeing how quickly the tech has grown in ability.
Water is blue, just like air is blue, just like blue-tinted glasses are blue. They disproportionately absorb non-blue frequencies, which is what we mean when we call something "blue".
Both. If one is on a competitive exchange or in a latency-sensitive asset class, one needs to be both smart and fast.
Also, strategies change all the time and are mostly the domain of quants and traders. What they did a year ago could very well be history at the firm they worked at.
I'd say an idea that generates alpha is tougher. I have seen firms like XR have great technology, but sucky ideas. They were fast, not smart and hence they didn't make a killing like their competitors did, due to all the volatility in the last three years.
I work at an HFT firm. I expect most of the major firms, especially Citadel Securities, to lobby hard against this.
While non-competes on our side are paid, they're a pain in the ass to navigate if one is a visa worker (my case). So I am praying for this to pass. If it does, I'll start interviewing with competitors the very next day.