GPT-Migrate converts repos from one lang/framework to another (github.com/0xpayne)
213 points by transitivebs on July 2, 2023 | 170 comments



Who/what is the:

1. Author

2. Copyright holder

3. Copyright license

...of the code generated by this tool?

Unless the answer to all of these is unambiguously "the original", you shouldn't be using such a tool on any code, especially not on your employer's intellectual property.

Sorry to be so negative about it but this is something that I see skipped over in all discussions related to AI. Just because it's AI does not make it immune to copyright law. You're giving away your code to a 3rd party company under their terms and conditions, and receiving some new code back, again under their terms and conditions. The fact that it uses AI under the hood is irrelevant: you're dealing with a business that produces an output for you, and you should know the terms before submitting anything to them, especially if you don't own that thing.


You’re making a few claims implicitly here:

- the copyright status of the output of LLMs is ambiguous

- this ambiguity represents a legal risk to the users of LLMs

- given that ambiguity, nobody should use LLMs for work that is intended to be copyrighted

It’s not clear to me if any of these are true individually, never mind all together.

Maybe the copyright status is ambiguous but I think the probability that the output of LLMs is owned by anyone other than API caller is very low. You can copyright works, but you can’t copyright the ideas within those works.

Possibly this represents a liability issue but I think the probability of that is vanishingly small. Just because a legal theory exists doesn't mean it's going to be enforceable - one of the reasons for the existence of Uber, YouTube, etc. If it's a fait accompli, it's not a risk.

Finally, it representing a risk of copyright liability is still just a business decision. I would guess that existing software projects are riddled with code snippets of ambiguous or incompatible licenses. All it takes is one person to copy a function that is GPL and label it otherwise and then it’s out in the world, technically running amok. It is probable that under the strictest definitions, every software company is engaging in some low level of copyright infringement. All-in, this means that while it’s a risk, it’s a risk you’re probably already taking and so you might reasonably conclude to ignore.


At least the first two are true if you are a lawyer, as every lawyer I know is waiting for AI legislation to make the first two points clear.

Any lawyer will tell you if the first two points are unclear the third point is rock solid - don't use tools that have ambiguous copyright terms until AFTER the big legal fallout/legislation unless you are willing to bet the entire farm.


From my non-lawyer "common sense" point of view, the reasoning that 1 and 2 implies 3 seems absurd in the face of the "no penalty without law" principle.

If the law is so unclear that lawyers can't determine legality and are waiting for additional guidance from the lawmakers, shouldn't it be legal by default?


The risk is not "being punished under a new law for things your business did before the law existed". The risk is "there's a new law that made your whole business model illegal, now what?"


Well, I guess that's also a risk - that a future law forces you to stop selling your software until you remove all AI-generated code.

A law like this seems pretty much impossible to enforce, though. Even if it turns out that GPT is just parroting remixes of GPL code, all examples of generated code I've seen are fully indistinguishable from code someone may have come up with on their own - with the exception of contrived cases where someone is actively trying to get it to output a particular fragment of code verbatim.


It's perfectly possible to have a new untested act judged illegal under existing laws, but only after performing the questionable act, and it being examined and judged.

My common sense says that without a tiny bit of doubt, Copilot is utterly outside the law so far, essentially stealing GPL code where the only price was credit. It probably has several other problems but that is one that anyone can see. I don't mean everyone agrees, I mean the information needed to make the judgement is all present to all observers without needing to see the code or the training process or be a lawyer. There are countless examples of regurgitated GPL code, with no credit. End of search.

It's inexcusable because it would have been almost zero burden to get everyone's consent to be included in a collective credit. They didn't even do that.

So, whether or not this assertion of mine ever amounts to anything, it's a simple example of something being illegal all along, but maybe you don't know it until after you do it, and it gets examined and judged, and the judgement goes the way that burns you.

It doesn't require any new law.


Or you start a limited liability company, win the market from all the people who are terrified of maybes and worst case your liability is limited to company assets.


This period of legal ambiguity is the period when gold (meaning significant wealth) is spun from pressing the legal margins. Remember Napster, an illegal-from-the-get-go startup? Yet one of its founders, Sean Parker, is one of the Internet golden boys with the C-suites, legal, drug and celebrity problems only crazy wealth can afford. It's not an easy path, but right now is when the first movers are starting to sprint...


Last I heard the output of an AI was uncopyrightable. Didn't some court case determine that?


I suspect the raw output of an LLM will not be considered copyrightable but the composition of those raw outputs into a coherent work is likely protected.


What a mind-bogglingly stupid decision if so, I can just train a network on the identity function and get royalty free mickey mouse now?


It's court precedent, so a loophole like that probably won't work


Your comment is very valid. I'd just add that AI tools are clearly taking the "YouTube approach": they provide a large value added, ignore copyright for the moment, and hope to resolve it peacefully at some later point in time. This worked very well for YouTube.


I wouldn't describe the Content ID regime and the myriad lawsuits and backroom deals as "peaceful".


YouTube wasn't killed and thrived as a platform throughout the process. Meanwhile YT ads funded the lawsuits and negotiations, with a surplus. It is pretty much a solved problem now. This is as peaceful as it gets when you genuinely infringe on someone's very valuable rights.


Yeah they survived but I think we're worse off in a world of Content ID, copystrikes, erosion of fair use, theft of ad revenue by game companies. The list goes on. YouTube should probably be a lesson, not a model to copy.


Yes, the lesson is not to contribute anything of value to open source or another person's platform.

Creators need to use restrictive licenses, then all of these parasitical corporations will cease to exist.


Limiting the scope to software, I'd say it's fair to distinguish MIT licenses from GPL. The latter provides way more freedom for the user (as opposed to corporations willing to profit without giving back). I am far more comfortable contributing to AGPLv3 software as opposed to MIT-licensed software.

I can't talk about licensing for content creators (like YouTube), because I do not have much experience with it.


You're right, but I don't see how we could not have those things with the copyright laws as they stand and people being what they are. Maybe it could be a little bit better, but not substantially better.


> Sorry to be so negative about it but this is something that I see skipped over in all discussions related to AI.

I don't know which forums you are discussing these things in, but this is the first comment on all discussions about AI on HN.


I don’t know what’s the actual legal status on LLMs right now, but here’s my thought: if you created artwork in Adobe Photoshop, does Adobe own your art work? If you made a world in Minecraft, does Minecraft own your world? What about using the AI tools in Adobe Photoshop? Or WorldEdit in Minecraft?

I'm sure these questions can be easily answered by looking them up, so why would LLMs be any different? You have control over what the LLM generates (prompts).


I'd say that the difference is in the ownership of the data the tool has as it does its job. In the case of Photoshop, these are the algorithms possibly owned by Adobe and the bytes you feed it. In the case of LLMs it is the model, which was built using data with disputed ownership, and the bytes you feed it.

Consider a hypothetical LLM that was trained on data having a single undisputed copyright owner. What would be the legal status of its output?


> Consider a hypothetical LLM that was trained on data having a single undisputed copyright owner. What would be the legal status of its output?

In that case the tool would almost certainly generate a derivative work, which would be a copyright violation. It's the same as if I took such strong inspiration from a certain song that I wrote a new one with the same melody and chords, which has happened a bunch of times.

But generally LLMs are most useful when they're trained on a broad enough corpus to avoid these issues.


It's not really about the size of the corpus but about its ownership.

Anyway, now consider an LLM that was trained on two corpora with two distinct undisputed owners.


What difference does that make? The question is whether the work that is produced is derivative or not surely.


That is the question indeed.


But then the presence of the LLM doesn't change anything right?


Thanks for saying this, this is my perspective as well and I've been kinda dumbfounded by all the questions about copyright from LLMs. They're just tools and computer programs right? Why would copyright be any different while using them than other tools.

Never seen it questioned that Winsor & Newton could actually be the copyright holder for half the world's art.


Because the inputs to the models are copyrighted works?

If I open up Photoshop and make a new file, I own my art. So why would me opening up an existing PSD and moving all the pieces around and then claiming it as my own be an issue?

I don’t think anyone would question this at all if the model were trained only on your own data. It’s the part where a bunch of other people’s stuff was involved that makes it fuzzy enough to be an open legal question.


I read your position as being that the copyright should belong to the holders of the copyright for the training works?

I can sort of see that but a) I'm also seeing it posited that the copyright could belong to either the model or the company that produced the model (where does that logic come from?) and b) My own creativity is trained on the works of many others but my work is still my own; why is this different for an LLM?


It’s different because rules are written and enforced by humans not robots. So it’s whatever the copyright office says at the moment.

I think a sane result though is that any work translated by the original author counts as a translation that maintains but doesn't extend copyright.


hell Pantone claims to own colors let alone art

The question is more like this: Google Books has an API that lets you retrieve one sentence from any book, and you use it to fetch whole chapters of valuable books written by actual insightful authors.

Do you really deserve to claim to have written the book you patch together from that? Especially, do you really deserve to not only benefit from selling the new book, but not even acknowledge nor pay a percentage to the original authors?


It depends how derivative my resulting book is. If I put all the sentences in the same order then I've created a derivative work, but if I create a meaningfully new work then I'd expect the copyright to be mine.


How do you know what deserves credit and what doesn't, when the tool doesn't tell you? How do I know whether you had the valuable insight to craft an artful line, or it was handed to you and you glued it to another and another?

When you write a program using a library, there is no such ambiguity or credit-washing. You wrote your app, the library authors wrote the library.

copilot is essentially stripping that. You get to not only write your app, but look like, and even, apparently, feel like, you produced everything that went into it. All that stuff that was done by someone else and gifted to you, well that's all just the tools you the real artist used. It's absurd to credit a paint brush eh?


What’s the ownership of code that’s translated into another programming language by a human, given that usually there’s not a complete 1:1 of all features, so some of it will need to be rewritten?


It’s weird all of these fringe legal theories are coming out now in response to “AI”, which isn’t much more than a fancy search engine.

No one was asking those questions when the backbone of any modern tech company was, ironically, thousands of copy/pasted lines of code from StackOverflow and any other references that come up in a Google search. The number of SDE’s I know who can write code without a web browser is vanishingly small.


Valid points, and it needs to be said. But it's a PR away from switching from an external API call to using a locally installed gippity brain to do the migration.


Surprise: you just paid a few pennies and gave all LLM users, and another company you don't know, all of your IP, while exposing yourself to future litigation.


Most companies' IP is mostly worthless. It's their brand and market position that's much more valuable.

If you disagree, how many users are on your twitter clone right now? Exactly.


People continue to parrot this point, but API use is not used to train models.[1]

Don’t trust OpenAI? Use Microsoft.

Don’t trust Microsoft? Run TII Falcon locally.

1: https://openai.com/policies/api-data-usage-policies


Great feedback.

I'm not the author of this project, but in my understanding, it's the same as if you were to write the code yourself. The project doesn't publish anything and it works entirely locally aside from LLM calls (which could in the future be 100% local as well). So you remain the author and have complete control over the license of the generated code.


How does running it locally make a difference to copyright?

If I run an OCR software 100% locally, do I get the copyright on the scanned result of Harry Potter?

Don't understand: using a giant LLM locally negates all the copyrights contained in the LLM? In which country is that a law?


I think GP means that at least you aren't dealing with a SaaS middleman that _also_ may intercept your code or claim authorship of the conversion.


No, you don't "remain the author". You have never been the author if you translate the real author's code base.

The license is the one of the original, since this is unambiguously a derivative work.


> and it works entirely locally aside from LLM calls

So not entirely locally. Yes these could eventually also run locally but OP’s point still stands.


This is great to hear, but I don't fully know how far-reaching OpenAI's claws are (since it does require an OpenAI API key).

If it runs 100% locally then yes, it would be safe to use.


totally agreed that diff projects need to be careful w/ sharing proprietary code w/ third parties.

openai's official stance is that it will never use API calls as training data, and that in my understanding it may retain API call data for up to 30 days for compliance purposes, but that it legally won't store it beyond that (whereas chatgpt convos are meant to be stored and used for training purposes).

as a next step, they could provide a swappable version of the LLM provider using something like https://github.com/imartinez/privateGPT, https://github.com/alexanderatallah/window.ai, etc. would love to have a standard develop here as the community matures around LLM usage
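
a rough sketch of what such a swappable provider interface could look like (all names here are mine, not part of GPT-Migrate; assumes the 2023-era openai<1.0 Python client):

    from abc import ABC, abstractmethod

    class LLMProvider(ABC):
        @abstractmethod
        def complete(self, prompt: str) -> str:
            """Return the model's completion for a prompt."""

    class OpenAIProvider(LLMProvider):
        def __init__(self, model: str = "gpt-4"):
            import openai  # assumes the 2023-era openai<1.0 client
            self.openai, self.model = openai, model

        def complete(self, prompt: str) -> str:
            resp = self.openai.ChatCompletion.create(
                model=self.model,
                messages=[{"role": "user", "content": prompt}],
            )
            return resp["choices"][0]["message"]["content"]

    class LocalProvider(LLMProvider):
        # placeholder for a locally hosted model (e.g. something behind privateGPT)
        def complete(self, prompt: str) -> str:
            raise NotImplementedError("wire up your local model here")

the migration logic would then depend only on complete(), so the OpenAI-backed provider could be swapped for a local one without touching anything else.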


Recommended edit: IANAL

I find this reasoning a bit flimsy. Copyright doesn't have anything to do with publishing, and is it really that clear that you are the sole author of the derived work?


You mean neither ownership nor license changes.


Hmm not sure, I'll let you know after I use it and receive a cease and desist letter.


It feels to me as though LLMs should (eventually?) really shine at these kinds of tasks where the intent is already defined in code of some sort and the challenge of the task is lots of detailed legwork that humans find hard, more because it's time consuming and not interesting so hard to focus on, rather than because it's technically challenging.

So swapping languages, yeah maybe, but I expect of more practical use would be the situation where you inherit a legacy codebase in an ancient version of a language or framework that hasn't been loved in a long time. I saw this so many times when doing dev team for hire work.

Obviously you'd want to do boat loads of testing and there may well be manual work left to do afterwards, but I think it would be the kind of manual work that felt like you were polishing something new and clean and beautiful rather than trying to apply bits of sticky tape to something unmaintainable.

I also wonder about eventually being able to say to an LLM "take this codebase and make it look like my code", or maybe one of your favourite open source developer's code. Maybe everyone could end up with their own code style vector attached to their github profile describing their style. You could find devs with styles close to yours to work on your team, or maybe find devs with styles different to yours so you could go and argue about tabs vs spaces or something.


That's it precisely. I'm working on the exact same thing GPT-migrate is doing, but I'm approaching it from the other direction first. My project is trying to generate a test suite that aims to cover the full original functionality (bug for bug as they say). That way a tool like GPT-migrate has a much better chance of generating the translation without errors and whoever uses it can have more confidence in that the output will be correct.

I'm a bit intimidated that Josh came so close in just a week of work but it's also inspiring confidence that this is the right track and it's actually going to work when all the puzzle pieces fall into place.

edit: damn, this project actually creates rudimentary tests as well. It's such a lean approach, makes me feel like I'm still coding in 2022 when Josh is firmly in 2023.


I agree! This is a pretty elegant approach and while like you I haven't fully internalized the 2023 way of building AI native product, I'm inspired and increasingly confident in how much can be accomplished in a lean way.


How do you fit all the codebase tho? GPT-4 is limited to 32k tokens which is like 500 lines?


I scan through files in chunks, with particular questions for the LLM, building up a context that is eventually used to write the test.
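
roughly this shape (a minimal sketch with my own hypothetical names, assuming the 2023-era openai<1.0 client; the real code differs):

    import openai

    def ask(question: str, chunk: str) -> str:
        resp = openai.ChatCompletion.create(
            model="gpt-4",
            messages=[{"role": "user", "content": f"{question}\n\n{chunk}"}],
        )
        return resp["choices"][0]["message"]["content"]

    def build_context(path: str, chunk_lines: int = 200) -> str:
        # walk the file in fixed-size chunks and accumulate the model's notes
        lines = open(path).read().splitlines()
        notes = []
        for i in range(0, len(lines), chunk_lines):
            chunk = "\n".join(lines[i:i + chunk_lines])
            notes.append(ask("What externally visible behaviour does this code implement?", chunk))
        return "\n".join(notes)  # later fed into the prompt that writes the test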


> take this codebase and make it look like my code

That's exactly what LLMs don't do. In my experience, there is no way to convince ChatGPT (even v4) to follow any conventions or obey any rules. It might try a bit, but it always ends up writing everything its own way, usually as verbosely as possible.


You are 100% false on this one, especially with GPT-4. I can take entire vanilla JS functions, and ask the language model to rewrite them using typescript, snake case naming conventions, and show it a block of other code that I've written to adhere to the same structure including jsdocs and it nails it nearly every time.


In your example, you are asking GPT-4 to rewrite code from one public style to another public style. I was saying that LLMs cannot reflect your personal style, because they know nothing about it. Even when instructed at length and provided with short examples (that fit in the context window), they always gravitate to the public average.


Then you're doing something wrong because it's exactly what it does for me.

> Given this schema ... some create statement for a table

> Use this as a template ... some unrelated function

> I want you to implement this ... some pseudo code


> So swapping languages, yeah maybe, but I expect of more practical use would be the situation where you inherit a legacy codebase in an ancient version of a language or framework that hasn't been loved in a long time. I saw this so many times when doing dev team for hire work.

I'm exploring this problem space - this is a widespread pain point across almost all companies that are more than about three years old. I've seen this at tech startups as well as very large companies and everything in between. Dropbox is arguably a top-tier engineering organization that probably manages tech debt as responsibly as can be expected, and they still had to make major investments to move their codebase forward from various eras of web tech: 1) https://dropbox.tech/frontend/the-great-coffeescript-to-type... 2) https://dropbox.tech/frontend/edison-webserver-a-faster-more... Everyone else is much worse off, so the investment required to move forward is usually immense. This leads to full rewrites, which are nice but error-prone and sometimes entail huge opportunity costs.

> Obviously you'd want to do boat loads of testing and there may well be manual work left to do afterwards, but I think it would be the kind of manual work that felt like you were polishing something new and clean and beautiful rather than trying to apply bits of sticky tape to something unmaintainable.

Agreed. In my opinion, now's a great time to get started with a semi automated approach like this while betting that the program synthesis and code generation capabilities will rapidly improve over the next few years. Larger context windows, solutions for hallucinations / reliability and better training data will help reduce the manual labor required.

> I also wonder about eventually being able to say to an LLM "take this codebase and make it look like my code", or maybe one of your favourite open source developer's code. Maybe everyone could end up with their own code style vector attached to their github profile describing their style. You could find devs with styles close to yours to work on your team, or maybe find devs with styles different to yours so you could go and argue about tabs vs spaces or something.

I've been thinking that personal style / training / fine tuning could become somewhat of an asset. "You are a principal software engineer at Google with particular expertise migrating codebases from {sourcelang} to {targetlang}." works fine but imagining a much richer portable input would possibly be quite valuable.
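
for example, just templating that persona prompt (a minimal sketch; the richer "portable style profile" is speculative and here is just a plain string appended to it):

    def system_prompt(sourcelang: str, targetlang: str, style_profile: str = "") -> str:
        # the persona line is the one quoted above; style_profile is the speculative
        # portable style input
        base = (f"You are a principal software engineer at Google with particular "
                f"expertise migrating codebases from {sourcelang} to {targetlang}.")
        return f"{base}\n{style_profile}".strip()

    print(system_prompt("python", "rust", "Prefer small functions and exhaustive error handling."))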


I’ve encountered lots of companies that keep paying license fees to oracle for DBs that could just as well be Postgres but for the work of updating and testing the code…


This would be really useful if it worked with legacy code. For example, you could migrate all that COBOL code into Java or Python, or all the Fortran scientific code into C++ or Python.

I tried to migrate a twenty year old Visual Basic 6.0 project to c# by doing it piecemeal with GPT4 and it failed completely. Both in the UI and the backend. I am keeping my fingers crossed for a GPT n+1 that actually can do this.

Incidentally, I found out that GPT4 (as in chatGPT) is very useful if you need to program in VB6 which is nearly absent from search results these days.


The other day I was trying to translate some old Fortran 77 code to C.

I tried asking ChatGPT to do it. It looked good on the surface, but it had subtly mangled some of the if conditions, etc, which resulted in it producing completely different numbers from what the original Fortran code did.

Then I remembered good old f2c, and I tried using that. Unlike ChatGPT, the code f2c produced was (as far as I can tell) correct, albeit a lot uglier. But it is a lot easier to refactor ugly-but-correct code into nicer-and-correct code, than incorrect code into correct code.


You could try providing GPT with the original code and the ugly C, and ask it to refactor. In my experience GPT is fairly good at code refactoring.


>> This would be really useful if it worked with legacy code. For example, you could migrate all that COBOL code into Java or Python, or all the Fortran scientific code into C++ or Python.

Having done a year-long stint at a mainframe team in one of the large financial corps (no, bigger than that) I can assure you that this is never going to happen for COBOL-to-java (or to-anything) unless there are strong guarantees of 100% correctness. See, one of the first things they tell you when you join a COBOL team is that you don't touch the code, unless you've filed a form that explains every detail of the change you want to make and why. In the team I worked for, that was a 10+ page Word form that would put a herd of elephants to sleep with its obstinacy and recalcitrance.

And that was only to change some JCL scripts- the scripts that run the COBOL jobs. Nobody dares to change now 50-years old COBOL code. Because every time they do, the corp loses millions. So I was told by those who knew better than me, and had been doing that job all their lives.

Bottom line, until someone figures out how to transform a gigantic, half-century-old COBOL codebase into Java without breaking anything at all, there aren't going to be any migrations.

I get a feeling that the requirements for scientific code are going to be much looser, and that this is going to cause a whole lot of mayhem, on the other hand.


I think at this stage there is no project GPT is going to do faultlessly; it's more like a boilerplate conversion. It will save you time, but needs a lot of rewriting afterwards. It won't be able to properly link dependencies or keep an overview of an entire project at this stage, if only because of the max-token limit. But they did a nice job in this project addressing some of the issues.


GPT would be a lot more useful if it were able to mark sections where it "isn't confident that it can do well". Another thing would be an iterative approach that includes compiling and testing the code.

Essentially, you want to build an artificial programmer ("junior dev") who can work with a better developer/manager. That seems to be the way in the short term. Doing this by just single-shot text transformation is a lot harder. Humans can't really do that either.
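
Something like this loop is what I have in mind (a minimal sketch; names are hypothetical and complete() stands in for whatever LLM call you use):

    import subprocess

    def iterate(prompt: str, test_cmd: list, complete, max_rounds: int = 5) -> str:
        code = complete(prompt)
        for _ in range(max_rounds):
            with open("generated.py", "w") as f:
                f.write(code)
            result = subprocess.run(test_cmd, capture_output=True, text=True)
            if result.returncode == 0:
                return code  # compiles/tests pass, accept the output
            # feed the failure back to the "junior dev" and ask for a fix
            code = complete(prompt + "\n\nYour previous attempt failed with:\n"
                            + result.stdout + result.stderr
                            + "\nPrevious code:\n" + code + "\nPlease fix it.")
        raise RuntimeError("model never produced passing code")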


When you say it failed completely, what do you mean?

I've been translating between c# and python and having a great deal of success at the function and class level. I even ported unit tests easily between xunit and pythons unittest library. I've got close to 100% test coverage so I'm fairly confident it's done well


translating between popular mainstream languages is not usually an issue.

translating a 20yo project written in a language that isn't used much anymore is a lot different


We should upload VB6 projects to GitHub.


Failed as in it produced non-working code that would take longer to fix than if I re-wrote it myself. Or, in the case of user interfaces, the generated UI code simply was useless and did not even resemble the original UI.


aren't the tests also written by chatgpt then?


But they are confident everything is done well!

The code looks good, and that's a great success. What else would you like?


ship it.


Wouldn't it be easier to convert VB6 to VB.NET and then convert that to C#? The latter conversion is mostly just a find-and-replace operation, as I understand it.


Sounds like you haven't tried doing it. VB6 and VB.NET are fundamentally different languages. VB.NET has much more in common with C# than it has with VB6. It is the conversion from VB6 to VB.NET that is the hard part, but it is probably no harder to convert directly to C#.

In particular, the way that object destruction works is completely different: VB6 uses reference counting and .NET languages use a garbage collector.

Systems using reference counting destroy objects and run the destructors as soon as there are no references to the object while those using garbage collectors might only dispose of the objects when memory is low, perhaps never. This means that object lifetime can be very different and that patterns such as RAII require extra work in .Net.


That's true, but were VB6 programs seriously written with memory and resource management in mind? I think it should be modernized anyway when upgrading.


Of course they were. We wrote optimizers in VB6 that used serious amounts of memory and controllers for hardware that required careful control of resources.

What exactly is more modern about a system that does not have deterministic finalization as part of object scope?


agreed. I'm guessing the limiting factor here is training data w/ code from these languages that aren't as well represented in OSS


These examples always look so interesting and promising until you try them out with anything more than a "Hello world" application. It would be very interesting if it worked beyond trivial examples, but I'm not holding my breath.


I bet it doesn't really work reliably for anything other than a toy project


Some good comments in here; I have done a fair amount of work with GPT-4 as a writer of go code, and there are two categories of difficulties for a "Tier 2" (Tier 1.5?) language with GPT-4.

The first is API hallucination, which hits as soon as you drop down into non "major" repository packages. Even GPT-4 acts like 3.5, and will cheerfully make up / use old API interfaces, pretend it knows newer versions that it does not know, and generally loop you around in very, very convincing-looking code that just does not work.

The second is style related. In particular, Go is picky with its error return semantics, and GPT-4 doesn't worry too much about this; I'm remembering a particularly subtle and annoying deadlock where it didn't defer closing a database connection inside a go routine, or alternately check for an error, and close the handle.

On balance, both of these seem super, super solvable, either by a custom LLM, or a next version with updated training. I think of GPT-4 as a reasonable mid-to-senior engineer in terms of output right now, and I think it's reasonable to start trying to port frameworks.

That said, I think I'd want it to do an excellent job at porting tests over first, and I'd inspect those heavily, and then I'd consider how to deliver a style guide for the target language in the prompts. By default, GPT-4 doesn't know exactly how you want things coded.

One last comment, Claude seems appealing to me here, with its longer context window. That said, I haven't been successful at fully using the context window -- e.g. "here's a tarball of a repo, please do x/y/z". I think word on the street is that the Claude folks use ALiBi, regardless, the 100k attention window from Claude feels more like one that can choose to alight on key areas of the input, not one that can take the entire 100k tokens into context.


> API hallucination

This term is better than anything I was able to come up with.

The ability of LLMs to make up convincing looking bullshit is remarkable.

I'll share a funny (because it's just so dead wrong) thing I had: I was asking about a problem with SnakeYAML (popular JVM YAML lib) and it suddenly started adding Jackson (JSON Object Mapper for JVM) annotations, insisting those would work. (Spoiler alert: no)


How many people even have access to the gpt-4-32k or gpt-4-32k-0613 models?

I think they give everyone access to the gpt-3.5-turbo-16k, but I have not found a way to request access for the 32k model.

There does seem to be an option through Azure's OpenAI service: https://azure.microsoft.com/en-us/products/cognitive-service...


Is Azure as simple to use as OpenAI? Their documentation is much less simple.


Supported languages are in config.py:

Python, JavaScript, Java, Ruby, PHP, C#, Go, Rust, C++, C++, C++, C, Swift, Objective-C, Kotlin, Scala, Perl, Perl, R, Lua, Groovy, TypeScript, TypeScript, JavaScript, Dart, Elm, Erlang, Elixir, F#, Haskell, Julia, Nim, PHP


| sort | uniq

C, C#, C++, Dart, Elixir, Elm, Erlang, F#, Go, Groovy, Haskell, Java, JavaScript, Julia, Kotlin, Lua, Nim, Objective-C, PHP, Perl, Python, R, Ruby, Rust, Scala, Swift, TypeScript


>> GPT-Migrate is currently in development alpha and is not yet ready for production use. For instance, on the relatively simple benchmarks, it gets through "easy" languages like python or javascript without a hitch ~50% of the time, and cannot get through more complex languages like C++ or Rust without some human assistance.

In other words, it doesn't really work.

The current wave of LLM applications still seems to me like someone just invented homeopathy and a whole bunch of people are convinced it's real and are trying to use it to create a cure for cancer. It's just people waving their hands about and intoning magick formulae, that don't work and don't produce anything useful at all.

I am curious to see where all this is going to end up. Is someone going to figure out a way to make LLMs work for real-world, er, work? Are we all waiting patiently for the next big LLM version to see if it can do the things that the current best-of-the-best can't?


I don’t understand why this talking point keeps surfacing. LLMs are in significant usage already in many production environments and are already improving peoples’ lives and productivity. I can have a dig for more examples but I’d urge you to do a bit of research and try out the gpt-4 API for yourself. Or make some more qualified statements with specific bugbears. Equating it all to homeopathy is so confusing.

- Image descriptions eg. Be My Eyes: https://www.bemyeyes.com/

- Summarisation and content distillation, question answering Etc: literally everywhere. This is now a solved problem (to the point of it being dull) thanks to LLMs.

- Customer service chat triage.

- helping students learn- eg khan academy, tutor Lily

- Helping create and debug software. If you don't think this is happening then you're either living under a rock or just choosing to believe people are lying about how they've used it.


> - Summarisation and content distillation, question answering Etc: literally everywhere. This is now a solved problem (to the point of it being dull) thanks to LLMs.

Calling this problem domain "solved" only undermines whatever point you're trying to make, unless you really believe that current LLMs which fabricate information, flip flop between answers with follow up questions, and even gaslight the user, can be called "solutions".

LLMs have proven themselves to be such untrustworthy "sources" of information and knowledge that I'm struggling to understand why anyone would even try to make this particular claim. It's trivial to refute by anyone who's used them and has been thoroughly refuted by StackOverflow Developer survey which reported that only 2.8% of developers "highly trust" the output of AI tools.

https://survey.stackoverflow.co/2023/#section-developer-tool...


Summarization, language translation and tasks in that manner (morphing tasks?) get much lower hallucinations.

Not sure what that survey has to do with anything.


> Summarization, language translation and tasks in that manner (morphing tasks?) get much lower hallucinations.

Source?


There are no benchmarks that test that. It's my experience and the experience of others who use LLMs for such tasks.

Down below there's a comment pushing back on saying Summarization has been solved. Even he/she is saying hallucination is rare.


I'd push back on summarization being solved. I have extensively used GPT 3.5 and 4 for this in production, and it works very well but it still often just ignores critical aspects of the original text, and it will rarely even completely hallucinate details. And this is with a lot of iteration on the prompt.

We still have to throw out > 50% of its output because a human can summarize the text much better.


> and are already improving peoples’ lives and productivity.

Don't forget this is coming at the cost of jobs that the AI is replacing. I would say, for those people, it is hurting, if not destroying, their lives.

You would all do well to remember this.


So does all forms of automation. By your logic we should've never invented computers because they took jobs from human computers[0]. Improved technology and automation reduces jobs for people doing things the old way and makes the job easier for everyone else. No job is necessary forever, that's just how the world works.

[0]: https://en.m.wikipedia.org/wiki/Computer_(occupation)


And we've always had a shitty answer to dealing with the fallout.

Unskilled people are running out of things they can do or retrain to. Sending everyone to college has only destroyed the meaning of a college degree and created a surplus of postgrads who can't earn anywhere near their potential for lack of demand.

If we keep overfishing this lake, at some point (I am guessing in the near future) we will hit an inflection point where the number of people vastly outnumbers the available jobs, and social services will be strained to the point of collapse.

That is when the torches and pitchforks will come out, and these folks had better hope to whatever God they think exists that AI will save them. Because my bet is it won't.

What are YOU going to be doing to help these people? I advocate for UBI wherever feasible, I simply don't see a way to put these folks back to work without laying waste to whatever industry they decide to enter.


>> - Image descriptions eg. Be My Eyes: https://www.bemyeyes.com/

Not working yet. From the website:

Starting today, you can register for the waitlist in the Be My Eyes app.

>> - Customer service chat triage.

More specifically? Which company is currently using LLMs for this purpose?

>> - helping students learn- eg khan academy, tutor Lily

Helping them learn, but what? Per wikipedia:

Statements made in certain mathematics and physics videos have been questioned for their technical accuracy.[43]

Sounds like using LLMs will just generate more garbage teaching material.

>> - Helping create and debug software. If you don't think this is happening then you're either living under a rock or just choosing to believe people are lying about how they've used it.

Oh, absolutely. People are absolutely lying to themselves. OpenAI's and DeepMind's systematic testing of their code-generator LLMs makes it very clear that those systems produce incorrect code the majority of the time. The best results reported are 28.8% for Codex (in perpetual preprint: https://arxiv.org/abs/2107.03374) and 29.6% for AlphaCode (preprint: https://arxiv.org/abs/2203.07814 Science, paywalled: https://www.science.org/doi/10.1126/science.abq1158) and the latter is with their special 10@k metric which basically means the LLM gets 10 guesses.

I have definitely observed people convince themselves online that CoPilot or ChatGPT even is helping them "improve their productivity" or some such. It is obvious that they are fooling themselves, badly. If you push them, they immediately say "oh yeah, it makes mistakes, but I can correct them" etc. So they just feel like it's useful, even when it's just making them do more work.

In fact, that's exactly like homeopathic self-delusions: people use it because it makes them feel better, not because it has any measurable benefit.

Btw:

>> ... you’re either living under a rock ...

Stop being a jerk.


>I have definitely observed people convince themselves online that CoPilot or ChatGPT even is helping them "improve their productivity" or some such. It is obvious that they are fooling themselves, badly. If you push them, they immediately say "oh yeah, it makes mistakes, but I can correct them" etc. So they just feel like it's useful, even when it's just making them do more work.

That mistakes can be made doesn't mean time isn't being saved. Everybody makes mistakes as it is, and first-try code isn't typically pushed as-is, human or not. The presence of mistakes means nothing.

Frankly, you're the one who comes off delusional here. "Most people are telling me one thing, but they can't be right because I believe so, so they must be lying to themselves" isn't normal behavior, especially when your biggest argument that they must all be wrong and you are right is that the machine makes mistakes. That's very weak.

Take a step back and really look at what you're saying.


Dude I'm literally, right now, using an LLM to fuel and *build* new features in my production website (ablf.io). I'm just some random eng. I'm sure there are many like me. And I'm not talking about GPT just recommending the odd function. It builds entire modules and writes comprehensive test suites. I'd rate its competency as similar to a junior dev. I guess I'm lying too, right? It helps me learn about and write ML and NLP stuff I'm very new to. It's basically replaced Stack Overflow for me.


"Dude", there is no doubt that LLMs can generate code, and there is also no doubt that they can't generate correct code reliably and in fact they are extremely inaccurate. We know this because OpenAI, DeepMind, Salesforce (I kid you not) and others have tested their systems and though the measures of performance they use are arbitrary and designed to make them look better than they are (like the n@k measure which just lets the LLM guess any number of times) they still score very badly. See the results I quote in the above comment.

So I have no doubt you are literally, right now, using some LLM to do stuff, I just have no doubt that it is not doing what you think it does.

You say you're an engineer? I know that means you write code, but the first thing that's drilled into engineers in training and in work is that you don't just make a thing and call it a day, you make sure to understand the properties of the thing you built and what it can do, and what it can't. Like you don't just put some planks on stilts and say "here's a bridge, come and drive your cars over it". You sit down and do the maths and decide what loads the bridge can take (and you optimally do this before building the bridge). So have you done anything like that? Do you have any way whatsoever to tell how often your system works and how often it shits itself?


I came here looking for a rust->forth pipeline; are you telling me that isn't available? /s

I've long believed that interlingual glue should occur within LLVM at the IR level. Reversing each layer of a compiler feels plausible if complex.


A prototype that works 50% of the time is not homeopathy. A new cure for some types of cancer that would succeed 50% of time would be an achievement.


Unless it straight up kills the other 50%.


Then it's certainly not homeopathy. It doesn't kill anyone.


The readme says "~50%" of the time, which seems to be a number pulled out of a hat (to be nice) rather than any serious attempt at quantifying performance (there are no metrics of any kind and the authors are asking for relevant benchmarks). It's more like the authors' feeling, than anything they have ascertained in any reliable way. That's pretty much the same way homeopaths test their "remedies" (i.e. water).


Much less impressive, though still useful: ChatGPT is an awesome movie subtitle translator. Only very unusual phrases need to be corrected, often there are no such cases. There are projects on GitHub that automate the translation. Short SRT files can be just pasted into the chat with appropriate instructions.
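
a minimal sketch of that workflow (my own function names; assumes the 2023-era openai<1.0 client, and the GitHub projects that automate this do it more carefully):

    import openai

    def translate_srt(path: str, target_language: str, batch: int = 30) -> str:
        # split the SRT into subtitle blocks and translate them in batches
        blocks = open(path, encoding="utf-8").read().strip().split("\n\n")
        out = []
        for i in range(0, len(blocks), batch):
            chunk = "\n\n".join(blocks[i:i + batch])
            resp = openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=[{"role": "user", "content":
                    f"Translate the subtitle text in this SRT fragment into {target_language}. "
                    f"Keep the numbering and timestamps exactly as they are.\n\n{chunk}"}],
            )
            out.append(resp["choices"][0]["message"]["content"])
        return "\n\n".join(out)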


How can ChatGPT translate a comedy show without knowing what's going on on screen, and various context that contribute to the humor?


Feed the scene into Whisper to transcribe the audio and then feed that into GPT-3.5/4 for context?


That's a valid point! In similar vein, I was always impressed with the translation of Asterix and Obelix into English. The puns were done quite well!


> Only very unusual phrases need to be corrected, often there are no such cases.


Very interesting concept!

I'm left wondering if you could also use this to document or clean up machine generated code. Eg, some process generates a huge wad of bytecode, or autogenerated Java; a GPT tool cleans it up so you can actually do some things with it as a regularly skilled human.


I have a community website that I built a while ago in angular that I wish I had built in react instead. Maybe then I would at least get some help maintaining the open source repo on GitHub. I'll give this a try and report back


Did it work? Link the repo.


Such brave


I really doubt this works reliably given my own attempts at doing this for extremely small use cases.


We need migration tools between frameworks: Rails -> Django

FastAPI -> Express


Unfortunately I can't trust that GPT won't hallucinate something; I tested it by having it code review my code and it hallucinated issues.

It would be great if you could give it old, ugly code and get something better back.

Maybe the IntelliJ folks can use these tools to increase their productivity, and we can get 100% correct tools that work on the AST rather than on tokens, and can do advanced transformations and reviews that you can trust without having to double-check.
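
For contrast, this is the kind of AST-level transformation that is correct by construction (a toy example using Python's ast module, nothing to do with IntelliJ's internals):

    import ast

    class RenameFunction(ast.NodeTransformer):
        # renames a function wherever it is defined or referenced
        def __init__(self, old: str, new: str):
            self.old, self.new = old, new

        def visit_FunctionDef(self, node):
            if node.name == self.old:
                node.name = self.new
            return self.generic_visit(node)

        def visit_Name(self, node):
            if node.id == self.old:
                node.id = self.new
            return node

    source = "def speed(d, t):\n    return d / t\n\nprint(speed(10, 2))\n"
    tree = RenameFunction("speed", "velocity").visit(ast.parse(source))
    print(ast.unparse(tree))  # ast.unparse needs Python 3.9+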


They will have to train their own model. ChatGPT has been trained on textual data.


Someday we'll all be writing code that injects the programming language as a dependency. This migrate magic seems bass-ackward.


One could argue we do this already via code generation when we define protobuf definitions or other IDLs. But yeah, a chat-oriented IDL which then generates C, Go, or Python code based on the required problem domain is an interesting vision.


Interesting and potentially great use case. How does it figure out the minutiae of selecting adequate dependencies?


If you think this does anything more than serve files to chatgpt with a custom prompt, you're not living in this reality


ITT most are wishfully duped


I've tried using ChatGPT on some demo code I got it to write, converting between languages and frameworks; it works to an "ok" level.

Edit: this was demo code I asked chatgpt to come up with in the first place, so the output had no problems license wise that the input didn't already have.


I expect to see you use gpt-migrate to migrate gpt-migrate itself to JS.

Then from JS to Python again.

Run the test and compare.

Once done, good job !


Is there an upper limit on the size of code base it can handle? (LLM context size)


It would be nice if it could estimate GPT usage costs with a dry-run.


Ah, finally a way to turn the output of nodejs programmers into something less hateful! All it took was the early days of the singularity ;)


I just want a small model which will ingest a language spec & be capable of understanding external repo code


I just want a model which will be capable of understanding actually anything…


useadrenaline.com


Has anybody checked whether semantics are correctly translated? E.g. [1,2]==[1,2] will behave differently in Python and JS.
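
Concretely (Python side shown; the JS behaviour noted in the comments):

    # Python list equality compares element values, so this prints True.
    print([1, 2] == [1, 2])   # True
    # The closest analogue to what JS `[1,2] == [1,2]` does (compare references)
    # is identity, which is False for two separate list objects.
    print([1, 2] is [1, 2])   # False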


Wow, I always wanted to move away from my old Perl web app to TypeScript. Will report the results


Please do. I’m curious to see what this looks like in production repos.


How well does this work?

Could it generate tests to confirm behavior in the target and source?


Looks like it does.

(Optional) If you'd like GPT-Migrate to validate the unit tests it creates against your app before it tests the migrated app with them, please have your existing app exposed and use the --sourceport flag.


Seems like this actually works by generating tests and continuing to try different things until the tests run successfully on both the source and the target codebase.


Yeah.. same.. :<


yep; it's generating unit tests and making sure that things are at least statically correct in terms of syntax. early days, but the ability to have loosely-defined acceptance criteria would go a long way here (like you would have for a well-groomed dev task issue).


I'm tempted to point this at the Linux kernel repo and tell it to convert it "to work on Windows" and see what happens...


Rewrite in Rust for guaranteed 1st spot on HN's front page.


this would be legendary.

prolly going to fall on its face for something of this complexity, but "the next big thing always starts out looking like a toy"


Will use GPT-Migrate to migrate to rust ;)


Or in Brainfuck


As far as I have tried, at least chatgpt basically cannot do brainfuck. I think it might just be too far from natural language. It basically just gives you hello worlds with randomly injected characters.


I wonder why. GPT-3 could do brainfuck just fine:

https://papers.nips.cc/paper/2021/hash/0cd6a652ed1f7811192db...

(though worse, for complex cases, than one of the compared systems - see figure 2) (oops, full disclosure: the system in question is my PhD work).


Doesn't ChatGPT excel at providing step-by-step instructions? That's kind of what Brainfuck is with only eight options for each step. I'd imagine that it could reach the end result using a combination of those eight options. Can it not?


Idk, I'd get nonfunctioning code. With explanations of what every piece of code does. Just that the code it is explaining is actually gibberish.


Has anyone tried it yet? Would be crazy if it worked.


I remember when Siri was first demoed, a dev I worked with was convinced it was just like having human personal assistant. I hadn't seen the demo, but I was confident he was wrong. Haven't tried this either, but I'm equally confident it wouldn't work .. or even kind of come close to working


Siri was objectively better in the past - I remember using it when I got my iPhone 4S and being impressed with how it returned very relevant data from WolframAlpha and other sources - then it all dried-up and all Siri is useful for now is voice-control for my bedroom lights and phone alarm...


It will be rewritten in .NET and rely on HyperV.


Funnily enough, it seems like someone already attempted to do this around 20 years ago - search for "Umlwin32"


Will it spit out WSL1 or WSL2?


Did you try it yet? Curious how far it could get.


Need all Python ML repos to be migrated to Ruby equivalents ASAP.


sure it does. good riddance kids these days


I can’t believe the author doesn’t allow conversion to JavaScript

It's the most popular language, for heaven's sake. Developers really need to stop bringing their religion into open source.


> Developers really need to stop bringing their religion into open source.

Says the person complaining "why isn't my language supported"


[flagged]


Then why is the tool not written in it?


That's a take.


I'd laugh, but LLMs actually made me begin to reconsider my dismissal of JavaScript and Python. They're still annoying, and the wider ecosystem is a disaster, but for the first time ever, I see personal value in working with the most popular tools: their popularity means that they're what LLMs work best with. So if I'm going to involve GPT-4 in some coding, I'm better off using JS or Python than asking for e.g. Lisp.


Types help LLMs, so possibly TypeScript becomes the king of the LLM era. I don't know.


Delusion and cognitive dissonance run deep for a large chunk of the JS world.


The interpreter and JIT for which are written in C++, so... Is C++ even more universal?


???

The literal example shown in the README has targetlang set to nodejs. Maybe it's a bit odd to specify the runtime instead of the language, but in practice, that's maybe more useful.


would be cool to see a loop of:

js ⇒ python ⇒ js

and then compare the output JS w/ the input JS. could get wilder too like:

js ⇒ rust ⇒ typescript ⇒ java ⇒ js
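
a tiny harness for that experiment could look like this (migrate() is hypothetical and stands in for whatever tool actually performs each conversion):

    import difflib

    def round_trip(source: str, chain: list, migrate) -> str:
        code = source
        for src, dst in zip(chain, chain[1:]):
            code = migrate(code, src, dst)  # e.g. "js" -> "python" -> "js"
        return code

    def diff_report(original: str, round_tripped: str) -> str:
        # unified diff of the original vs. the round-tripped source
        return "\n".join(difflib.unified_diff(
            original.splitlines(), round_tripped.splitlines(),
            fromfile="original.js", tofile="round_trip.js", lineterm=""))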


I've always felt bad charging so much for so few lines.



