> The future of collective knowledge sharing.
The future is feeding all answers into a meat grinder, forming word-like nuggets from the paste, and hoping we like it? I am sorry, but you have lost your mind. There is a very fundamental disconnect between LLM hype and reality.
Here is my latest experiment. I needed to search for a string in dangling commits and asked Bing: https://i.imgur.com/pQtpkH5.png You will notice my question was a link, because I copied the title from https://stackoverflow.com/q/21903972/308851 and apparently that made it into a link (I had the question open because every time I need to do this I just jump to this question; I outsourced storing my knowledge to SO ages ago). Every single answer given by Bing is wrong, of course, because git log -g is not appropriate for this, as the answer states. It's right there: "Commits that are (still) referenced by the reflog are considered reachable and not dangling". And here's the best part: the first "Learn more" reference, stackoverflow.com, links to this very question. The language model has seen "git log -g" in the vicinity of the string "Search for string in dangling commits in Git" and thought it was appropriate. The irony.
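(For completeness, here is a minimal sketch of the fsck-based approach that actually matches the question; this is my own paraphrase rather than a quote from the SO answer, and "your string here" is a placeholder. The --no-reflogs flag tells fsck to treat commits that are reachable only from the reflog as dangling, which is exactly the distinction quoted above.)

  # list truly dangling commits (ignoring reflog refs), then grep each one's patch text
  git fsck --no-reflogs |
    awk '/dangling commit/ {print $3}' |
    while read -r sha; do
      git show "$sha" | grep -q "your string here" && echo "found in $sha"
    done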
The answer I got with the prompt "You are a git expert. How do I search for a string in dangling commits?" was:
Dangling commits in Git are often referred to as commits that are no longer reachable from any branch, tag, or the HEAD. These commits usually occur when you create a commit and then move the branch pointer away from it without creating a new branch or tag. They can also occur during rebases or other history rewriting operations.
Git has a garbage collector (`git gc`) that will periodically clean up these dangling commits, but until that happens, they're still in your repository and can be examined.
Unfortunately, Git doesn't provide a direct command to search for a string in dangling commits. However, you can find these commits using the `git fsck` command, which checks the filesystem for errors and dangling commits.
Here is a way to do it manually:
1. List all the dangling commits using `git fsck --lost-found`. This will print out all the dangling commits and store them in `.git/lost-found/commit/`.
2. For each dangling commit, you can search for a specific string using `git log -p <commit-id>` or `git show <commit-id>` and then pipe the output to `grep`.
Here's an example of what this might look like in a script:
mkdir -p .git/lost-found
git fsck --lost-found
for commit in $(ls .git/lost-found/commit/); do
  git show $commit | grep -q "your string here" && echo "Found in commit $commit"
done
This script will print the IDs of any dangling commits that contain the string "your string here".
Remember to replace "your string here" with the string you are searching for.
*NOTE*: This operation could be time-consuming if you have a lot of dangling commits or a large repository. Also, this approach might not work if the garbage collector has already cleaned up the commits. Make sure to be careful when manipulating Git's internals like this, as it could potentially corrupt your repository if done incorrectly.
I don't like the profit motives, but it's tough for me to argue that the experience of using Stack Overflow is better than just jamming my question into ChatGPT.
Along those lines, how is that position not randomania?
You’re basically saying that the signal-to-noise ratio is zero, so its utility is zero. I guess you must think all of the comments here where someone mentioned how it helped them solve a problem also have a zero (or at least dismissible) signal-to-noise ratio?
What signal-to-noise ratio would be needed for you to say it’s not devoid of any information? What would the minimum accuracy need to be? 80%? 95%? 99%? 100%?
The thing is, there are many questions for which it is a short path to a solution that can then be verified. The most productive way to use it is to induce it to give you pointers to things you had no idea existed, then dereference those pointers via traditional means (i.e. Google, man pages). Be your own null-pointer (hallucination) segfault handler. Critical thinking a hundred percent preserved.
It’s a little bemusing how people take issue with next-token prediction trained with a log-likelihood loss. In particular, they think that a high-accuracy, extremely low-perplexity LLM is unimpressive because the autocorrect on their phone is low-accuracy and high-perplexity. No: the LLM’s emergent generalization capabilities as NLL drops really are a capital-P Phenomenon.
But I feel that I don’t need to try to change anyone’s mind anymore. Everyone has a threshold log-probability at which they’ll change their mind about this, whether it’s attained in the next week or long from now. Just state the nonzero perplexity that you would accept from a thinking person, and wait for the loss curve to meet it.
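(For concreteness: perplexity is just the exponential of the mean negative log-likelihood per token, so a threshold on log-probability and a threshold on perplexity are the same criterion. In the standard notation, over N tokens x_1..x_N:)

  \mathrm{PPL} = \exp\left( -\frac{1}{N} \sum_{i=1}^{N} \log p(x_i \mid x_{<i}) \right)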
You're ignoring all the times the Stack Overflow answers also provide zero information, so this is not a balanced assessment. Prompting LLMs correctly to get high-quality answers is a skill.
> automated plagiarism without understanding anything
I'm old enough to remember the memes campaigning to rename "software engineering" to "googling StackOverflow", so plagiarism isn't a point against the value of an LLM, and given the case of an LLM made by StackOverflow using StackOverflow data it's also not a legal issue.
How does the issue of plagiarism bear on its utility? That’s like saying it’s impossible to get value out of listening to a torrented mp3.
I’m not sure understanding is an inherent issue either. When I Google "weather Lisbon," Google spits out a number. The app doesn’t "understand" anything, but it still helps me plan my day.
I think most of the critique is coming from the people that made their mind up already or used only GPT-3.5 in the past.
Then it becomes a self-fulfilling prophecy: a) "ChatGPT is stupid! Also, plagiarism! No value!" b) "I wrote a carelessly bad prompt and it responded with a seemingly bad answer!" c) "This confirms my biases!"
I find that when I provide it with enough context and a set of requirements (as I would with a non-AI engineer), it's very good at problem solving.
> there can not be in LLM as it's just automated plagiarism without understanding anything.
Conjecture. We lack a mechanistic understanding of "understanding", so you can't make this claim. Human answers are also fallible, so how do you know that human cognition isn't also statistical pattern matching similar to what's used in LLMs? If it is, then LLMs do understand too; their understanding just has flaws.
I don’t know how to square this with the fact that I routinely ask GPT-4 for code and get back answers that are in fact correct. You’re making it sound like it’s purely random, like I’m just getting back a string of zhrueksnbd8373&;! or something.
But that is what you get; it's just dressed up to sound like the answer. But it's not the answer, it cannot be. If it accidentally is the answer, that's an accident, nothing more. See my example above about dangling commits...
> If it accidentally is the answer, that's an accident, nothing more.
Accuracy rates much, much higher than chance are not accidental. While you're pointing to a valid failure mode of LLMs, you're not using the correct words to describe it which is leading you to say nonsense.
It's really not, though. I ask it questions about TypeScript all the time: "I want to do X with types, how do I express this in TS syntax?" and it answers correctly. Granted, I don't always agree with its subjective judgment, but that is on par with subjective opinions on Stack Overflow.
I read the announcement on VentureBeat, and I am not sure what they are trying to do.
>Integrating AI with code development is something that Microsoft has been doing for several years with its Github Copilot technology. Chandrasekar said that Overflow AI is not an attempt to replace Github Copilot, but rather to just provide more information resources to developers and the organizations they work for.
>“We’re certainly not looking to replace GitHub copilot, we’re very complementary to what you’re actually going to do writing code,” he said. “You need a really solid foundation on Stack Overflow for Teams to give accurate, validated and curated information.”
I really do not understand what that means.
EDIT: at the time of writing, the link pointed to a YouTube video; now it's a text announcement, so I am gonna go read that.
They are trying to remain relevant with the latest tech buzzwords. I think what we're seeing is a lot of companies jumping on the hype curve for AI, similar to when there were a lot of ICOs during the hype cycle of crypto. I'm not comparing AI to crypto as technologies or anything, just the hype cycles.
Stack Overflow for Teams is meant to be an internal knowledge base for company specific questions.
What he's saying is that they're going to focus on delivering answers that only an internal knowledge base can supply. This would be complementary to Copilot, which can only give you generalized answers.
As a developer with what I still laughingly refer to as a "career", ChatGPT/Copilot/etc. is starting to feel very much like "we're going to need to let you go, but we'd like you to stay on for a while to train your replacement."
If anything the last few months have made me more confident that at the end of the day, we’re going nowhere fast.
With all of the money and effort put into GPT-4, and now all of the attention from 3rd-party devs, we have very little to show for it. At best we have something that improves productivity by a few percent. It's... OK. The more you use it the more you realize it's very flawed.
I will never completely discount that future models will be fundamentally different, but I’d put the chances of that happening in the next 5 years very low, despite what we’re hearing from certain industry folks.
If we are using “twice as productive” to mean that a typical developer can complete twice as much work (not just writing code, but overall work) in a fixed amount of time, I’d put that at ~5%.
If we are talking about specific workflows, I would agree with you that this is quite likely to occur.
I'm not exactly sure how one might properly define developer productivity: LOC being famously terrible precisely because it's easy to game; plus we keep saying "I know I should write documentation and tests…" and often still don't, but LLMs can at least help with both; and how much of the problem is a game-of-telephone communication failure somewhere between the user and the coder, which is (IMO) one of the strengths of these language models?
Given all the people worried about AI safety issues[1], I agree it's likely we might not see any substantial quality[2] improvements for five years.
Even without that, there's no way to tell how close we are to a "solution" to the question of intelligence; it may well be that LLMs are to AGI what all the self-driving car startups you've never heard of are to SAE's Level 5.
[1] alignment: this entire area was described as "pre-paradigmatic" by the actual researchers six months ago
privacy preservation: anything that might come under GDPR
audience suitability: e.g. no adult content for minors
[2] speed/efficiency yes, but scoring similarly on the tests.
This is a pretty common take but I don't think it holds up to scrutiny. Generating new code is such a minuscule part of what I actually do as a developer that I only ever mention it in passing, if at all, when talking about my job. Defining requirements that somehow manage to make a compromise between what every different department wants, communicating our advantages and difficulties to relevant parties, customer meetings, figuring out what legal requirement this 10-year-old server's cron job fulfils, etc. These are all things I don't see any of these systems figuring out anytime soon and that's where the real value add is. Writing up a class that can be described in a few simple sentences is not where the money is made.
This is where you separate the two classes of developers, I think. The ones that think their job is writing code, and the ones that see their job as coming up with solutions to problems.
If you think your job is just writing code, then yeah, you're about to get replaced. But if that was the case, then you would already have been replaced by an outsourced team of developers that can get things done cheaper than you.
We as developers are the conduit between what the business needs and what computers can do. We're translators, and not in the "we write code" sense. A non-developer won't be able to use GPT to write any non-trivial system because they won't have a clue what to ask for.
And if someone thinks "oh but they will one day be able to list their requirements exactly and GPT will just do it", I will ask: Well, why hasn't that taken off with offshore teams, then? The answer to that is exactly the answer to why GPT poses no risk to anyone's career: Stakeholders don't know what they need, and it's up to developers to figure it out.
> If you think your job is just writing code, then yeah, you're about to get replaced.
Honestly, even that would require major advancements, which is speculative at best. The last time I used Copilot to generate a few lines of code, there were three different mistakes. I spent much more time debugging than it would have taken me to just write the code myself.
It can do some things very well, but it fails all the time.
For the foreseeable future, I think we're firmly in the "this is just a tool" territory.
> This is where you separate the two classes of developers, I think. The ones that think their job is writing code, and the ones that see their job as coming up with solutions to problems.
The BLS data has some insights into how the government regarded different job titles over the years.
15-1251 Computer Programmers:
> Create, modify, and test the code and scripts that allow computer applications to run. Work from specifications drawn up by software and web developers or other individuals. May develop and write computer programs to store, locate, and retrieve specific documents, data, and information.
15-1256 Software Developers and Software Quality Assurance Analysts and Testers:
> Research, design, and develop computer and network software or specialized utility programs. Analyze user needs and develop software solutions, applying principles and techniques of computer science, engineering, and mathematical analysis. Update software or enhance existing software capabilities. May work with computer hardware engineers to integrate hardware and software systems, and develop specifications and performance requirements. May maintain databases within an application area, working individually or coordinating database development as part of a team.
15-1257 Web Developers and Digital Interface Designers
> This is a pretty common take but I don't think it holds up to scrutiny. Generating new code is such a minuscule part of what I actually do as a developer
Yes, generating new code is a fairly small part of my job too. But this is also a pretty common take re AI.
Maybe I'm wrong, but generative AI seems to be eating human capabilities for breakfast right now, and we're only ~8 months in. Today it can do a small part of a small part of my job, and rather more of a junior dev's job, but where are we going to be in five years' time? I'd quite like to still have a paying job over that timescale.
People used to think that creating art or music, or playing Go, or whatever, were activities that required a human meatbrain. Turns out it's all just (effectively) a bunch of parametric equations. What if the non-coding skills you list will be similarly amenable/vulnerable to automation by AI?
They all remain examples of tasks that were widely asserted to be beyond a mere computer, and yet can now be done better by a mere computer than by the median human.
(I'm not actually sure which of those three you were singling out and why; they're all different in their own ways).
At the beginning of the development of computer chess, people said it would NEVER beat a reasonable player. Then it did. Then they said it would NEVER beat a seasoned tournament player, and it did. They said it would NEVER beat an international master or grandmaster... but it did.
People keep saying never. Our society can never fall...but societies have fallen since the dawn of time. You might be safe in your job, but what about your children or grandchildren?
> Our society can never fall...but societies have fallen since the dawn of time
I guess the modern answer from a globalist is that back in the day, societies weren't as entangled as they are today, so they could go into freefall relatively easily with little impact on other societies, and the others let them fall, especially if it was beneficial for them.
But today everything is so interconnected that if one falls, it heavily impacts a large part of the other societies, so we cannot let it happen. Just look at how unimportant most people would have considered Ukraine to be before the invasion, yet it has a huge impact on large sectors, and most countries would have a similar impact. Some areas of the world, like the Eurozone, even purposefully lock themselves in with their partners, so that if Greece gets hit by something, the other member states have to help, as it will impact them greatly if they don't.
So I guess my end argument is that there aren't really "societies" anymore; we're just part of a global one where the decentralized nature is so strong that we cannot really fail (globally) anymore, unless there is something really big impacting all of us (like COVID).
Well, as technology progresses, we will become even more tightly coupled than before. Not sure if we could measure that, but if there is a coupling factor F, then it's proportional to the level of advancement of our technology, and the square F^2 is proportional to the level of hurt that will ensue if it breaks down.
I’m not sure how to put this gently, but if you think computer chess is the same thing as LLMs, you’re not qualified to have an opinion on the future of technology.
Chess was clearly just an analogy, and of a specific (old fashioned) AI doing what was widely believed (before we solved it) to be a uniquely human indicator of intelligence; it's not like anyone can accurately list the things that we will actually invent over the next 5-50 years.
Worse: thanks to psychology, even if a time traveller came back to now with such a list (and proof of being a time traveler), their examples would not feel real to most until they could be experienced directly — I knew that GPT-3 could code simple web pages just over two years[0] before ChatGPT came out, but it still didn't hit me emotionally until we were all able to play with it ourselves.
An analogy is only useful if it is meaningful. IBM's Deep Blue had a fundamentally different design than LLMs. "AI" is just a broad, abstract category that they both happen to share. The other day I heard someone argue that being skeptical about the inevitable "blockchain renaissance" is like being skeptical about smartphones, back in the day (in this case, the category was just "technology", which is even more vague). Statements like this are not grounded in reality; they are just abuses of logic, useful to influencers in fabricating trends.
> IBM's Deep Blue had a fundamentally different design than LLMs.
Yes, and?
I mean, you clearly get what you put in the next sentence:
> "AI" is just a broad, abstract category that they both happen to share.
But, there's also no requirement for future chat bots to be Transformer models like GPT.
All we know for sure is that this blob of weights seems to be pretty good at a lot of different things that used to be things that only humans could do.
Thus, I think it's fine to use chess as an analogy, because it demonstrates exactly that thing: another case of "we used to use this as a proxy for intelligence, and now we can't."
No idea what comes next, nor when — it's still possible that "solving" general intelligence is beyond human capacity.
I've read science fiction literature with more substance than this kind of reasoning. When the argument approaches "on a long enough time scale there is no limit to progress, so anything is theoretically possible", it becomes just a platitude not worth stating.
In almost all the discussions I've ever had about AI, I've only ever heard comments about the next immediate effects of AI: it can improve this thing or that, while humans do the rest. Almost no one considers the effects decades down the line: complete replacement.
There's another effect, especially with regard to information synthesis via AI: previously the vast majority of information was obtained with people's names attached to them. All Stackoverflow answers have names attached to them, and articles on the internet too.
Now, AI bypasses that, making information gathering more anonymous and less human. I already remarked in comments that information sharing has always had two functions in human history: bonding humans together and trading useful information. AI removes the bonding aspect and creates a pathological environment where information sharing is too anonymous.
Your comment is one of the first I've seen that actually talks about a longer-term effect of AI, but AI programmers and companies are too bedazzled by the promise of short-term financial gain to give any care to ethics, except for lip service.
Decades is a long time, by then the world will be much different. You could extend this argument to any advancement: "people only talk about the benefits of agriculture. Nobody talks about how once we go from hunter-gatherers to large groups, we'll discover industrialization and cause large-scale climate change."
Though we do need to prepare for AI replacement, I think we'll be much more successful pushing for things like UBI than pushing against things like AI assistants.
> Previously the vast majority of information was obtained with people's names attached to them. All Stackoverflow answers have names attached to them, and articles on the internet too.
> Information sharing has always had two functions in human history: bonding humans together and trading useful information
I'm sorry but I disagree. When I see StackOverflow answers or random articles I rarely remember the associated name, unless they're especially notable (e.g. Eric Lippert). And I'm not sure about the vast majority, lots of Stack Overflow and Reddit answers are under pseudonyms.
I'm in academia. I believe that when people contribute to the pool of knowledge, the knowledge comes first and their credit comes second; not to be forgotten, but not something most people need to know. When applying for jobs and bragging to people I can show off that I discovered or invented X, but people learning or using X don't need to know my name.
I don't think most people asking technical questions are looking for "bonding", they're just looking for an answer to the question. I talk with colleagues about things like opinions and the latest hot topic, but AI isn't going to replace these discussions.
> I'm sorry but I disagree. When I see StackOverflow answers or random articles I rarely remember the associated name, unless they're especially notable (e.g. Eric Lippert). And I'm not sure about the vast majority, lots of Stack Overflow and Reddit answers are under pseudonyms.
That is already an unfortunate side effect of our current advanced state of technology.
> I don't think most people asking technical questions are looking for "bonding", they're just looking for an answer to the question. I talk with colleagues about things like opinions and the latest hot topic, but AI isn't going to replace these discussions.
I never said they are looking for it. I just said that it is a function of information exchange. In other words, it is an evolutionary instinct that just occurs, and has occurred whether we want it to or not. AI for the first time will be poised to remove it.
AI might indeed replace your colleagues one day, since an entire academic department may not even need that many professors or researchers to run it, once it's augmented with AI.
> Almost no one considers the effects decades down the line: complete replacement. ... Your comment is one of the first I've seen that actually talks about a longer-term effect of AI
I don't know where you've been this past year, but hand-wringing about AI replacing humans in the long term has been everywhere. It's gotten less common in recent months as the limitations of the models become apparent, but it's still prevalent.
And then you will find an answer that is like 10 years old for a version of the technology that is no longer maintained.
Seriously, try to find something relevant, say, for Ionic Framework, and most of the answers on Stack Overflow are outdated garbage that was only relevant for Angular-based Ionic.
The first thing Stack Overflow should use AI for is to clean up those "hundreds of thousands of answers" so that the irrelevant things are removed. Right now Stack Overflow feels like a library's technical book aisle around the 2000s, with tons of books about "Office 97", "Visual Basic 6", "QBasic by Example", and similar obsolete titles.
I actually think that could be an amazing resource to benchmark the LLM - essentially incentivizing people with internet points to find particularly difficult or edge-case questions, that would help improve the model.
If it works, I do. I would prefer to deal with an AI that I know will be polite and objective rather than with someone trying to game the system or overly anxious to close everything.
I'm glad it's not just me. Sounds like it's been recorded with a laptop microphone by someone random they grabbed in the corridor. The chosen music is also very strange, giving the video a strong enterprisey mood.
Aside from the generative stuff, SO is finally including semantic search capabilities. Their current search is terrible. However, search itself is a huge endeavor; it's not enough to just slap ANN on some pregenerated vectors.
Applying AI to Stack Overflow to make the site's content more relevant (e.g. tagging outdated answers, applying version info) would be detrimental to their revenue model. Going through all the content on SO and tagging it with the versions each answer applies to would be immensely helpful to users, but would most likely make page views fall off a cliff.
I read something recently saying that SO revenue is dropping, and that it was showing signs of this before the recent AI push. In my opinion, this is because video is king with younger people, and SO is geared towards people with beginner questions. You can find a huge amount of software dev instructional content on YouTube.
I absolutely cannot stand video tutorials for basic things. Why write the text of a command when you can stretch it out to 10 minutes of video? Perhaps even ask people to subscribe or buy from your sponsor while you're at it.
Oh, I am in full agreement. I think it's a generational thing though, where younger folks might prefer looking on YouTube first for instructional types of things rather than a Q&A site.
For what it’s worth, I’m 41 and I often prefer video content. It keeps me on track when my attention span would get in the way were I reading a longer tutorial or manual.
So it's a ChatGPT-like interface to Stack Overflow data? What are its advantages over ChatGPT? (it cites sources, but I can't spot any others).
This is also noteworthy [1]:
> we're using AI to take existing knowledge sources (confluence, github, google drive etc) and provide an initial tagging framework and a series of questions and answers from them
What for? Presumably an LLM trained on a company's internal data, but with a higher start-up cost (time taken for initial tagging).
People asking questions will be angry: the bot okayed the question, so why did the mods close it without discussion?
Mods will be angry: why did the bot okay stupid questions?
And it doesn't even go into the meat of the problem: Stack Overflow is supposed to allow everyone, including its competitors, free access to all the questions and answers.
The questions and answers belong to the users, not to the CEO of stack overflow.
My conspiracy theory is Joel Spolsky saw the writing on the wall long ago:
> The company has been growing, too. Today we are profitable. We have almost 300 amazing employees worldwide and booked $70m in revenue last year. We have talent, advertising, and software products. The SaaS products (Stack Overflow for Teams and Enterprise) are growing at 200% a year. That speaks to the fact that we’ve recruited an incredibly talented team that has produced such fantastic results.
> In May I wrote about Stack Overflow's business, which lost $42 million over 6 months and had just laid off 10% of its employees. Since then, the company's fiscal year-end results came out. Despite growing revenue, it lost $84 million over the year ending on March 31, 2023.
Thank god Wikipedia isn’t run like Stack Overflow. To me as an end user, they have pretty much the same value proposition: user-generated answers to my questions. Wikipedia is still doing well; meanwhile, it seems SO is constantly being driven off a cliff by bimbos in management.
Not everything needs to be a damn unicorn. SO is an information repository. They need to accept that and stop trying to “enhance” it with more crap; they don’t realize their median user is a junior dev who really just needs to serialize a Java object and isn’t going to pay for, or put up with, any LLM-generated nonsense.
SO doesn’t need large language models. What they really need is a better model of what answers are good, what answers are outdated, and what answers should be expanded to include more info (and sometimes, what answers should be slimmed down a bit). Turn the top answer to popular questions into a wiki so that everyone can update it. And then add backlinks for questions which were closed for being “duplicates”. It solves so many problems SO has.
Another thing. This “comments aren’t for extended discussion” nonsense needs to go too. Any question could easily include a Reddit-style discussion tab to facilitate discussion. I’m sure much of it would be at least as valuable as the answers themselves.
Sure, but companies like Stack Overflow are set up in a way where they have to at least aim to be unicorns, since people have invested tons of money in them with the expectation (hope) that it'll pay back at least 10x/100x; otherwise they probably wouldn't have invested in the first place.
Contrast that to Wikipedia/Wikimedia which is a foundation funded by donations rather than "investments", and the difference between the two almost couldn't be larger, just because of that.
Their VS Code demo question ("Do we use tabs or spaces?") is a clever one, but then it's just pulling from an answer on their Q&A site. Not particularly impressive in an AI sense.
The Copilot Chat beta is able to answer this question by itself in VS Code, though not in Visual Studio.
Doesn't this use case just get gobbled up by Google/Bing, as many people tend to search for these problems and then click on the Stack Overflow link? The search engine's LLM response will answer the question first.
I'm 97% certain one of these days a judge is going to set a precedent that allows this by going "welp, it would cost too much to untangle this mess, so we'll just allow it".