"We are quickly reaching the limits of current AI"

So many words in this piece, but nothing to back up this primary assertion.




For what it's worth, I can actually tell the hype around AI has died down a little. 8-10 months ago you'd hear about AI from grandmas and their interest in playing with GPT. That's gone. From a programmer's perspective, it is still useless. I still can't get ChatGPT to give me a useful ffmpeg script or a basic PHP database script. I don't know who is really using AI for what, other than crappy chatbots.

They say JSON works perfectly with ChatGPT (GPT-4), yet out of 10 queries I still get 1 or 2 messed up. I can't depend on it yet. Image generation is still terrible and useless, I can't use it for programming, neural text-to-speech still sucks big time, and I'm not sure who is doing what with it. Almost feels like crypto hype.


Really? I have a lot of success with bash and ffmpeg scripts. I don't even try to write them myself these days.

I don't do JSON stuff much, but either function calling or grammar-based sampling should be able to force correct formatting if the issue is syntax.
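For illustration, forcing structured output via function calling looks roughly like this with the 0.x-era OpenAI Python client; the function name and schema below are made up for the example:

    import json
    import openai  # 0.x-era client assumed; set openai.api_key first

    # Hypothetical schema: forcing the model to "call" this function makes it
    # return its answer as JSON arguments that match the schema.
    functions = [{
        "name": "extract_person",
        "description": "Extract a person's name and age from the text.",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
            },
            "required": ["name", "age"],
        },
    }]

    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Alice turned 30 last week."}],
        functions=functions,
        function_call={"name": "extract_person"},  # force this exact call
    )

    # The arguments come back as a JSON string conforming to the schema.
    args = json.loads(response["choices"][0]["message"]["function_call"]["arguments"])
    print(args["name"], args["age"])

Grammar-based sampling (e.g. the GBNF grammars in llama.cpp) gets you a similar guarantee locally by constraining which tokens can be sampled at all.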

Image generation seemed like a solved problem to me with Stable Diffusion about 6 months ago or so, between LoRAs and the tons of models available on sites like Civitai.

SOTA text-to-speech/voice cloning is scary good, but it's only available via SaaS like 11labs (perhaps just them).

For one interesting weekend they allowed anyone to clone a voice (it required maybe only 10-60 seconds of audio) and have it say anything.

I made Bob Ross read Neuromancer using 60 seconds of fuzzy audio from YouTube, but 4chan made something like 20k fake audio recordings on Vocaroo of everything horrible and hilarious you can think of. Then they shut that off, for good reason, on Monday.

There are some cool applications there. I watched an hour-long video about the talk show wars between Leno and Conan, and I couldn't figure out whose voice the narration was, but it sounded like a smart old guy; apparently it was a weird clone of Biden. Which means anyone can now make YouTube content with any kind of voice. That's bad for impersonation, but good if, say, you don't like the sound of your own voice and want to go for a different one (maybe an old-timey radio broadcaster).

What definitely has become shittier is the quality of GPT-4. But that's not an AI issue, that's OpenAI being OpenAI.


Same. GPT-4 has completely replaced my Googling/Stack Overflow routine. It generates a customized solution for my specific needs with step-by-step instructions. I think it works in a smarter (but slower) mode if the question is non-trivial. With PDF plugins it helps me read and understand recent papers. It generates an excellent summary and answers all my questions about any advanced concepts in the paper. I believe it's already smarter than me. I don't know if it can replace me in the mid-term, but it definitely increases my productivity.


Making good tools with language models takes time. There are nuances that aren't immediately visible without working on them.

Unless there's a mismatch in the schema/messages, function calling is very reliable at providing JSON.


> Image generation is still terrible and useless

Image generation is thriving and is now an indispensable part of the creative process. The tools are fresh and still evolving, but the impact even at this early stage is ridiculous. You can easily take rough sketches and have it spit out fully fleshed-out, colored prototypes. Depending on your perspective it's either an insane productivity multiplier or scary as f-ck, or both.
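To make the sketch-to-prototype point concrete, here is a minimal img2img sketch using the Hugging Face diffusers library; the model ID, prompt, and parameters are just illustrative:

    # Minimal img2img sketch: turn a rough drawing into a finished-looking
    # color image. Requires a GPU and the diffusers package.
    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    sketch = Image.open("rough_sketch.png").convert("RGB").resize((512, 512))
    result = pipe(
        prompt="finished color illustration, clean lines, detailed shading",
        image=sketch,
        strength=0.6,        # how far the model may drift from the sketch
        guidance_scale=7.5,
    ).images[0]
    result.save("prototype.png")

Swap in a LoRA or a ControlNet conditioned on the line art and you get much tighter control over the composition.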

Look at https://civitai.com for a taste. Look around. You really believe this is terrible and useless? Most of these images came straight from SD with some minor tweaking, usually upscaling. I think it's quite something.

It's not just MidJourney, it's the entire ecosystem surrounding models like SD, ControlNet, etc. Some artists are swinging their fists, others shrug, but all are affected by it whether they like it or not, and the damn thing is just getting started. SD came out in 2022. For a young aspiring artist, I imagine it's hard not to feel like this whole thing is a massive punch in the gut.

Like an experienced and highly skilled artist, it's easy to shrug this off when you're an accomplished programmer. It can't replace me, right? Well, true. It's a gradual thing, but the impact it has on junior programmers is undeniable. What it means for our field if legions of young people lose interest in the meticulous acquisition of technical and often arcane knowledge through "manual programming" is anybody's guess. Maybe it's a good thing, maybe it's not.

> Neural text-to-speech still sucks big time

This is not my experience at all. I am actually practicing French with it and it can pick up my weird accent. It's ridiculous how good it has become in such a short time.

EDIT: Oh, sorry, I read it wrong. But in that case, the same thing applies. 11labs is insane and getting better. It fools many, many people on YouTube. The Bark project is also scary at times. This is just getting started.
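To give an idea of how low the barrier is, Bark's quick start is roughly the following (paraphrased from its README, so treat the details as approximate):

    # Roughly the Bark quick start: text in, speech out.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()  # downloads the models on first run
    audio = generate_audio("Hello, this voice doesn't belong to anyone.")
    write_wav("bark_out.wav", SAMPLE_RATE, audio)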

> I still can't get ChatGPT to give me a useful ffmpeg script or a basic PHP database script.

I believe these issues are, to a significant extent, caused by the severely castrated and surprisingly lacking UI efforts from OpenAI. Just free-form "talking" with an LLM might be a good way to get grandma to use it, but for a professional programmer (or professional anything, really) it's bordering on stupid.

The prompts and settings you use determine 90% of your success, and OpenAI's interface gives you nothing. They just recently introduced something like "profiles", but it's amateurish at best. It makes me feel like I'm programming without an IDE (or even Vi/Emacs).


This. How absurd. Do they think the future is always going to be "just add more parameters"?

It's obviously going to be about smaller, more efficient models and the like. Even GPT-4 itself is supposedly along these lines with its MoE (mixture-of-experts) architecture.


> nothing to back up this primary assertion.

Well, he's making a prediction, a speculation, a claim.

If it were as simple as `x -> y`, it would no longer be valuable.


Sure, but even then you're expected to back up your speculation. There doesn't seem to be anything within the article to support the claim that we're at the tail end of development; just assumptions that today's limitations will apply tomorrow.



