OpenAI pulls Johansson soundalike Sky’s voice from ChatGPT (bloomberg.com)
143 points by helsinkiandrew 7 months ago | 190 comments




The Sky voice sounded nothing like Scarlett Johansson.

I'm worried that in the future somebody might try to create their own voice to regain it as an accessibility tool after a medical event, and they're going to have their voice literally stolen from them on the grounds that it might sound like another famous person 90% of the time.

In the music industry it's actually a known thing that if somebody famous sounds like you, you basically have no real ability to become a star or sell your music without problems. Sure it happens, but it's honestly the same problem many actors that look alike go through but just with sound.

This is pure censorship/money grab, and to make it worse, the voices sounded nothing alike.


Apple shipped that as an accessibility feature in 2023

https://support.apple.com/en-us/104993


> Your speech is processed securely on device overnight while your device is charging and connected to Wi-Fi.

Neat


Wow! I'd not noticed this before! Thanks for sharing!


The original voice that they pulled or the one they replaced it with under the same name after the lawyers started calling?


they tweeted that it is "her"


We cannot infer from Sam Altman's tweet whether he was referencing the movie Her because GPT-4o enables AI companions in general a la Her, or whether he was specifically referencing Johansson's character in the film.

There are other AI companions as well in the film.


Obviously he was referring to Audrey Hepburn in Breakfast at Tiffany's.


It is incredibly disheartening to see what was born a non-profit dedicated to guiding AI towards beneficial (or at the very least neutral) ends, to predictably fall into the well-worn SV groove of progressively shittier behavior in the pursuit of additional billions. What is victory for Open AI even supposed to look like by now?


> What is victory for Open AI even supposed to look like by now?

Making a shitton of money.

Sorry.


I may be incredibly shallow but personally I am experiencing feelings of joy and validation.

Every time I tried to suggest that maybe, LLMs and GAN tools don't make creativity easier but lazier and emptier or that this technology area is parasitic off human culture, every time an OpenAI junkie told me, "hey, perhaps humans aren't much different from LLMs", or someone said artists are derivative too and don't really deserve any more protections or are "gatekeeping art"...

... my anger at the time is vindicated every time these greedy, cynical wretches that the US tech industry has raced to anoint are taken down a peg because of their own very obvious greed and expedience.

I am loving this.

I may also be shallow in feeling a measure of glee that Microsoft is racing forward to shoehorn this utter toxicity into every corner of their product range, just in time for their customers to fully understand how it reeks of contempt for them.


This is a sentiment I'm starting to see more of, and have really started internalizing in recent months.

For every creative task I've given an LLM in the last 2 years, if I cared at all about the output, I ended up redoing it myself by hand. Even with the most granular of instructions, the output feels like a machine wrote it.

I have yet to meet anyone who felt any kind of emotion from generated art, except for "wow, it's cool that AI can make this". That's because (imo) art comes from experience, and experiencing is absolutely not what LLMs do.

Meanwhile, my dad, whose AI experience amounts to using MS Copilot "two or three times," is sending me articles about Devin, and how it's over for software engineers.


> I have yet to meet anyone who felt any kind of emotion from generated art, except for "wow, it's cool that AI can make this".

Have you ever observed how difficult it is to _remember_ AI generated pictures?

I can think of only one AI-generated art thing that has stuck with me, and it's because of the enormous amount of effort the guy using it went to generating really genuinely creepy fake photos to go with a plausible but fake story (about a lost expedition in the early era of photography).

I thought at the time, OK, maybe people will do creative things with it. Maybe I am wrong.

Except that months on I can't remember any specific detail of any of the photos in enough detail to visualise it. Only the emotion and the feel, which could have been evoked by that talented person entirely without Stable Diffusion.

There is something about AI generated photos, in particular, that confounds my ability to remember the image (as a photographer)


That is very interesting, I'm picking up what you're putting down. YouTubers make rampant use of AI art. If nothing else, this era of YouTube will be recognizable from afar.

I do like that many people have learned to recognize the writing style and visual aesthetic, and are rejecting it.

> maybe people will do creative things with it

_Some_ people will do _some_ creative things with it, but most people will use it as a shortcut—as long as there's some kind of output, they couldn't care less about the quality. How much of correspondence is just an LLM summarizing what a different LLM wrote? If the internet wasn't dead before, this is surely killing it.


> I do like that many people have learned to recognize the writing style and visual aesthetic, and are rejecting it.

This is the thing that gives me hope -- inquisitive people who have no idea how ChatGPT does what it does can point out ChatGPT-generated text. It's more difficult with GAN-generated images but in the creative community I am part of, some people are very literate about this already.


I don't think this will hold for too long. We already had soulless art hanging on the walls of waiting rooms and bank branches, even before GenAI. Rather sooner than later AI products will be indiscernible - even today enough AI outputs are for many - so what then?


I don't know.

But quite a lot of people understand the difference, at a visceral level, between a painting made by an individual amateur artist and a painting made for selling at one of those Fine Art chains, or the difference between something rough and charming and a painting you might have seen in the 90s while trying to locate the loo in a UK branch of McDonalds.

People's instinctive artistic "literacy" is often surprising.


There's a lesson in here somewhere, but I'm not sure what it is.


> Have you ever observed how difficult it is to _remember_ AI generated pictures?

No matter how much you filter and purify it, puke will rank.


I think it’s more about uncanniness and some sort of latent, subliminal incoherence. Like maybe it is somehow disruptive to visual memory in a subtle but noisy way, because it doesn’t hang together quite right.

I have no science to back this up, mind you. But I struggle to recall details of these images (I also believe I have a limited form of aphantasia so it could just be my flawed noggin)

But I will take your point ;-)


i’m sorry, i just don’t understand what you’re trying to say. you’re happy that the leading AI firm is full of shit despite promises to the opposite? what makes you happy about that?


I'm happy other people can see what has been obvious to me from day one.

It's not just schadenfreude (which I admit is unattractive, if beguiling.)

It also gives me hope that ordinary people are beginning to get to grips with the idea that they don't have to accept or be excited for new technologies just because they are new technologies, and that the people bringing new technologies don't have to be good people just because they are capable people. Seemingly smart people can be intellectually and morally lazy.

I have no obligation as a techie person to be excited about AI, or to be default-positive about the "leading firm", or to give the benefit of the doubt, or anything like that. There's no moral rule that one should be positive about new technology until it's proved bad. This is a classic tech industry false belief.

OK so the fall is not happening as quickly as Juicero. But it's a start.

What's your case for why should I not be happy?


They're gloating about being right about SV tech culture. Being right about the heel turn is some cold comfort, I guess.

But parent shouldn't feel too proud of their prognostication skills. OpenAI is a venture of Sam Altman and Elon Musk, so how could it be anything other than what it is? You'd have to be insanely naive about SV (and, more broadly, what "non profits" of billionaires in any sector even are) to assume this was ever born of altruism.


I'm not gloating. I'm just enjoying the spectacle.

I also don't profess surprise at who OpenAI have turned out to be. Rather I am surprised that other people are surprised.

It's not a heel turn, except in their wider cultural fortunes. It has been obvious to me from literally day one that everything to do with DALL-E and ChatGPT and onwards is bad for culture. There has never been anything other than creepy, dystopian, Black Mirror overtones.

But the valley falls for hucksters every time. And it's often the same hucksters.


> You'd have to be insanely naive to assume this was ever born of altruism.

Yet the vast, vast majority did, and a still large proportion continue to proclaim that these projects were born of altruism and continue to serve these altruistic goals, and that these people are the most incredibly altruistic humans to ever grace this fine planet of ours.


I can't understand this. It seems like Don Quixote trying to defeat windmills.


Why? Why is it inevitable that the AI world must do LLMs, must steal all of culture without recompense, and must deride human ability in order to defend the limitations of the replacement technology?

Answer is: it’s not. To all three. And collectively we can decide to be better. This is why artists are pushing back. One day perhaps the tech world will understand that they aren’t Luddites but instead champions for humanity.


Because progress in AI depends on a great dataset. A dataset that does not exist in the public domain. And the progress in AI is worth too much to stop because laws are not enacted yet. My quality of work has improved significantly since having access to chatGPT4. And this has improved by leaps again since chatGPT4o.


Nothing wrong with a little schadenfreude.


And I am loving the fact that LLMs and generative AI are showing what an absolute nonsense it is for any artist to think that just because they made a shitty drawing or song once other people are not allowed to also make that shitty drawing


Do you make and sell any art?


I make art but I don’t sell it. Don’t get me wrong, if a painter makes a painting that painting is his and he can sell it, I can’t steal it. But I can’t also paint that? Not download a copy of it? Not take a picture? How do you own the idea of flowers in a vase because you drew it once? That’s insane to me


Well it is insane, of course. But it’s also a straw man argument.


From my point of view, the artists are evil!


What's shitty about this?

They approached Johansson and she said no. They found another voice actor who sounds slightly similar and paid her instead.

The movie industry does this all the time.

Johansson is probably suing them so they're forced to remove the Sky voice while the lawsuit is happening.

Nothing here is shitty.


Asking someone to license their voice, getting a refusal, then asking them again two days before launch and then releasing the product without permission, then tweeting post launch that the product should remind you of that character in a movie they didn't get rights to from the actress or film company is all sketchy and -- if similar enough to the famous actress's voice, is a violation of her personality rights under California and various other jurisdictions: https://en.m.wikipedia.org/wiki/Personality_rights

These rights should have their limits but also serve a very real purpose in that such people should have some protection from others pretending to be/sound like/etc them in porn, ads for objectionable products/organizations/etc, and all the above without compensation.


I will agree with you if

- they used Johansson's actual voice in training the text to speech model

or

- a court finds that they violated Johansson's likeness.

From hearing the demo videos, I don't think the voice sounded that similar to Johansson.

But hiring another actor to replicate someone who refused your offer is not illegal and is done all the time by Hollywood.


> But hiring another actor to replicate someone who refused your offer is not illegal and is done all the time by Hollywood.

Probably this could indeed make them "win" (or not lose rather) in a legal battle/courts.

But doing so will easily make them lose in the PR/public sense, as it's a shitty thing to do to another person, and hopefully not everyone is completely emotionless.


> But doing so will easily make them lose in the PR/public sense, as it's a shitty thing to do to another person, and hopefully not everyone is completely emotionless.

If an actor is saying no and you have a certain creative vision then what do you do?

Johansson doesn't own the idea of a "flirty female AI voice".


Find someone else? You think this is a new problem? Directors/producers frequently have a specific person in mind for casting in movies, but if the person says no, they'll have to find someone else. The solution is not to create a fictional avatar that "borrows" the non-consenting person's visual appearance.


> The solution is not to create a fictional avatar that "borrows" the non-consenting person's visual appearance.

That's exactly what was done when Jeffrey Weissman replaced Crispin Glover in Back to the Future Part II.


First time I hear about it, but reading about it, it seems that specific case actually changed the typical terms for actors to prevent similar issues?

> Rather than write George out of the film, Zemeckis used previously filmed footage of Glover from the first film as well as new footage of actor Jeffrey Weissman, who wore prosthetics including a false chin, nose, and cheekbones to resemble Glover. [...]

> Unhappy with this, Glover filed a lawsuit against the producers of the film on the grounds that they neither owned his likeness nor had permission to use it. As a result of the suit, there are now clauses in the Screen Actors Guild collective bargaining agreements stating that producers and actors are not allowed to use such methods to reproduce the likeness of other actors.

> Glover's legal action, while resolved outside of the courts, has been considered as a key case in personality rights for actors with increasing use of improved special effects and digital techniques, in which actors may have agreed to appear in one part of a production but have their likenesses be used in another without their agreement.

https://en.wikipedia.org/wiki/Back_to_the_Future_Part_II#Rep...


If they didn't use her voice at all, doesn't seem like there would be a case or even concern.

Also, they proceeded to ask her for rights just 2 days before they demoed the Sky voice. It would be pretty coincidental that they actually didn't use her voice for the training at all if they were still trying to get a sign off from her.


If they used her actual voice for training the model that shipped then I agree with you. It seems like they used the voice from another woman who sounds similar though.


It doesn't "seem like" anything in this instance: "no no, that is not what we did, we commissioned someone else", without specifying who, is their claim.

From a technical standpoint, a finetuned voice model can be built from just a few minutes of data and GPU time on top of an existing voice model, almost like how artists' LoRAs are built for images. So it is entirely within the realm of possibility that that is what happened.


I guess it takes more than a couple of days to organize things with an A list star, esp. if there's a studio recording session involved rather than just using existing material.

This strongly suggests they weren't trying to get her voice until the last minute (would have been too late for the launch) but, rather, they had already used the other actress, and realized they were exposing themselves to a lawsuit due to how similar they were.

It was a CYA move, it failed, and now their ass is uncovered.


Maybe despite not using her voice at all they wanted to give her some money as a gesture of good will and/or derisk the project.


Surely the company that has been gobbling up data and information without the rights to them or any form of compensation has suddenly turned over a new leaf and decided to try and pay an actress that isn't involved.

Like, let's be real here. This wouldn't be the first time they would be using material without the rights to it, and I don't expect this to change any time soon without a major overhaul of EVERYTHING IN THE COMPANY, and even then it will probably only happen after lawsuits and fines.


I would like to buy you a horse as a gesture of goodwill to derisk this flight attendant / passenger situation.


> The movie industry does this all the time.

Such as? Please give an example.


What I'm wondering is why are they doing that in the first place. Why is the best AI company in the world trying to stick a flirty voice into their product?


It pains me to say it, but I really think it pays dividends to consider the very obvious possibility that the people who are doing this are in general just not socially well-adjusted.

Everything about OpenAI speaks of people who do not put great value on shared human connections, no?

Hey, I like that artist. I am going to train a computer to produce nearly identical work as if by them so I can have as many as I like, to meet my own wishes.

Why is it surprising that it didn't really cross their mind that a virtual girlfriend is not a good look?

This is not an organisation that has the feelings of people central to its mission. It's almost definitionally the opposite.


Yes, it seEms a LOt of big Names in tech have this same problem. Curious that, isn't it?

I also think it is tipping their hand a bit. I know companies can do multiple things at once, but what might this flirty assistant focus suggest about how AGI is coming along?


Perhaps it is stumbling over the question of whether its creators are good people.


...because human brains enjoy being talked to in a flirty voice, and they benefit from doing things that their customers like? Doesn't seem that mysterious


Guess you are their target market.


It is incredibly disheartening to see celebrities from traditional media expressing open disdain for the century's most revolutionary piece of technology.

Johansson was foolish to turn this down. This all sounds like she realized the mistake, regretted it, then sent her legal team to pursue this frivolous cease and desist out of spite.

I'm disappointed that OpenAI didn't see this for what it is, and decided to comply instead.


> It is incredibly disheartening to see celebrities from traditional media expressing open disdain for the century's most revolutionary piece of technology.

Even though it threatens their livelihoods and is parasitic off their work?

It's not disheartening at all: it's positive.

> Johansson was foolish to turn this down. This all sounds like she realized the mistake, regretted it, then sent her legal team to pursue this frivolous cease and desist out of spite.

This all sounds like absurd wishful thinking.


Maybe I'm missing the obvious, but you seem to think it's disheartening when someone decides not to collaborate with a corporation, and that the right choice for the corporation is to ignore what the person thinks and "force" the collaboration anyway?! That seems outright crazy to me.


I'm not talking about collaboration or lack thereof. I'm talking about Johansson's foolishness, and her subsequent tantrum after she rightfully regretted that poor judgment.


> Johansson's foolishness

Explain why you call a famous actress foolish if she refuses to give a corporation permission to use her voice. Your entire argument is built on this opinion.


I am as interested in the use of the words “tantrum” and “spite”, which that statement is not representative of.

It reads like projection of some other feelings or unstated interests.


> to see celebrities from traditional media expressing open disdain for the century's most revolutionary piece of technology.

Really? AI has lots of potential but so far the big uses of the recent tidal wave have been an enormous increase in the creation of visual and text-based sludge, barely usable for anything serious most of it, by hustling online marketers and social media spammers.

Even where tools like GPT are used productively by people to simplify their business processes and so forth, every piece of information they claim has to be scrutinized for hallucinations to the point of them being useless as much more than idea generators for contexts where factual correctness isn't important...

Yay!


I had no idea they sounded alike, not really knowing (or caring) what Scarlett sounds like. I didn't think her voice was part of her celebrity status...

However, I did prefer Sky above the other voices. It didn't come across as seductive or flirty to me, just something that sounded close to the Google Assistant or Alexa. "Neutrally approachable" is how I'd put it, like a good receptionist, a mix of casual warmth and surface level concern. Shrug. I hope they bring her back soon.


> I didn't think her voice was part of her celebrity status

She did the voice for Her, a movie about an AI voice. If you check IMDB she's literally credited as "(voice)". Specifically in the realm of AI, her voice is her entire celebrity status.


> her entire celebrity status

I mean…if you check the IMDB page you’ll see she’s done a lot more than that…and is known widely for her role as Black Widow in the Avengers movie series.

So, I think she’s got more to her status than “the voice”.

I’m sure you meant that in this OpenAI voice rights context, but I think it’s important to consider she is indeed a bona fide not-just-a-voice A-list Hollywood Celebrity.


You can't cut off the qualifying first half of the sentence and then complain that the second half isn't qualified. Come on...


In female AI voice we seem to have encountered the overlap between your typical simp and corpo simp.

Funny thing is, when an attractive female goes against a corp... the simps picked the corp... we are well off into a Blade Runner 2 digital girlfriend scenario aren't we?

If any enterprising soul wants to join up and start selling chatGPT enabled anime body pillows, DM me, I have a supplier ready.


Yes. Sky voice was perfect. It was neither robot-y nor flirty. I preferred it always.


Why do tech companies keep using dystopian stories as manuals? The flirty tone, the voice being awfully similar to the lead actress of Her.

Next we are getting Ex Machina?


Don't Create the Torment Nexus [0]

[0] https://knowyourmeme.com/memes/torment-nexus


Immanentizing the eschaton


I'm not sure, but my guess is everyone's got a slightly different "uncanny valley" function.

So for me, this voice had no impact — sure I noticed it seemed a bit "flirty", but that's not a thing that engages me in any way as it feels equally fake when a human does it, and if anything I pattern-matched to the Pierson's Puppeteers in Ringworld; the original Alexa advert was moderately creepy, but I could see they were trying to mimic the computer in Star Trek; but one example I do have of being disturbed by a product advert was the use of a cheerful up-beat soundtrack for "The Robot Dog With A Flamethrower | Thermonator": https://www.youtube.com/watch?v=rj9JSkSpRlM


I don't recall Her being a dystopian movie


I suppose it depends on your own perspective. For my part, a world in which people "date" algorithms instead of other human beings certainly seems pretty dystopian.


True. But in the story, the algorithms are better humans than the humans, so it wasn't actually a bad thing.


And then when the algorithm transcended everyone realized that they knew how to love and could simply love the people around them. About the most utopian AI story I’ve ever seen.


I could probably buy a realistic sex doll that looks better than the girls that would date me but I’d rather hump the real thing.


HEH... ok? I'm not sure that's relevant. The question is - does a human developing a relationship with a post-human sentience make the story automatically dystopian?

Seems to me you'd have to be pretty prejudiced against AI to say "yes".


I think that when algorithms become as sophisticated as what's shown in Her, we have to come to regard them as people, which means they have rights to do with themselves as they please, including getting romantically involved with humans.

More frightening is Ex Machina, which shows what happens when such an AI isn't regarded as a person by its creator, and sees fit to take personhood for itself.


Heh, arguably we already date algorithms and break up with real people. It's the disconnect between the two that people are sad about... the matching algos and profiles set impossible standards for regular people.


I’m glad I met my wife before dating apps were a thing. Though I have to ask, why can’t you just meet someone the “normal” way without an app doing it for you?


I'm in a similar position to you, but from singler friends, I've heard two reasons. Firstly, all the other single people are on the apps. Secondly, some are concerned about being seen as a creep when flirting or approaching the "normal" way. Thus, online dating is the new normal, and the old-fashioned talking in person is less common.


Movie was actually polyamorous advocacy cloaked as sci fi


All your comment revealed is you view the world through other people's lenses.


I am fascinated by this split in interpretation of the movie. It seems clearly dystopian to me, and I was surprised to learn (here on HN) that there were a lot of people who didn't see it that way at all.

I have no insight to draw from this, I'm just fascinated by it.


For me anyway, "dystopian" media is one that conjures a world that is significantly worse than ours in some way. I did not get that from the movie. The movie did not portray the AI as being a malevolent or even a negative presence in the main man's life.

At its core, Her was a beautifully-shot love story between two flawed beings, nothing more.


It was pretty subtly dystopian. On the surface it was a feel-good movie about that guy becoming happy. But there were a few scenes where it was happening to a lot of people. Everyone with a cell phone basically was falling in love with it, and totally controlled by their love.

It was left a little in doubt whether the AI really did reach 'enlightenment' and beam itself to the stars, or the company/government shut it down because society was collapsing.


> or the company/government shut it down because society was collapsing.

I do not see how you could interpret the ending of Her this way.


Guess all good movies leave a lot open to interpretation, but difficult to do it and be good.

Like that "Rebel Moon" on Netflix was how to NOT do it, with tons of stupid exposition spelling out stupid details that didn't make any sense.

Versus "American Sniper" that was so evenly portraying all sides, that Right leaning people thought it was a liberal movie, and Left leaning people thought it was Right Wing propaganda. It was all so well done you could read into it a lot of your own feelings.

So "Her" was about the danger of technology. And at the end there were some scenes that you could read into how a lot of people were falling for this phone app and things were going downhill. But, it wasn't clear cut, the movie is really good at splitting the difference on how the app was also making people be happy, and was helping them.


To me it was dystopian in all the ways a good Black Mirror episode is, a future where humans are falling in love with LLMs is not a utopian outcome


<SPOILER>

I don’t fully recall the ending but doesn’t the AI grow past the guy and “break up” with him, leaving him devastated at the end?

Sounds a bit like the Replika AI drama from last year. </SPOILER>


"Devastated" isn't the word I'd use, and a breakup does not a dystopia make.


I agree with you on a personal level, though again I’m sure if I CTRL+F replika subreddit I’d find many people describing their emotions with similar words.

Anyway, let’s say he was negatively affected by that relationship, IIRC.

Reminds me a bit of this: https://www.uniladtech.com/news/ai/man-married-hologram-no-l... , up to you if you find people developing strong feelings to inanimate objects that can’t care less, dystopic or not.


It's obviously dystopian to be in love with something that doesn't care about you at all. That is not, at all, what Her depicted. Samantha clearly felt for Theodore deeply. Breaking up because their life paths were incompatible doesn't mean she didn't feel anything for him.


That made me happy, for once seeing AI getting what it deserves and dating other AI on its own level instead of being forced to "date down" mere humans. It was a story of emancipation for me.


Not devastated, happy to have loved and ready to love again.


Wasn't the main guy's job writing personal letters to his clients' friends and lovers? It seemed like a world where no one was connecting with each other anymore.


Well, she was in Ghost in the Shell as well... proper dystopia.


They know what they're doing. No way this is not intentional. It worked and they got free publicity.


The PR response is crafted to negate the fact that it was Scarlett’s voice but they don’t deny they were mimicking Her.


Exactly, it was absolutely meant to be Johansson's voice


It’s hard to say it wasn’t intentional after this tweet:

https://x.com/sama/status/1790075827666796666

> her

And this one:

https://x.com/prafdhar/status/1790789900650037441

> @alex_conneau: came up with the vision of HER before anyone at OpenAI had, and executed relentlessly!


@alex_conneau's twitter header has a screenshot from HER right smack dab on his profile


Warning: X breaks the back button on my browser, for fuck's sake


Hold shift (new window) or control (new tab) when you click.


On mobile...


long tap, open in new tab


Whoosh


press down and hold (long tap) choose from options that appear


Wonder if there's any legal risk here. It's long been legal to imitate someone's voice as long as you don't try to make people think it really is that person's voice. This is very common in parodies.

In the case of computer-generated voices, there are qualities that are desirable that also happen to be attributes of a real person's voice. How many of these desirable attributes can a computer-generated voice have before it's considered too close to the set of attributes a particular person's voice has?


Yes, celebrities generally own the rights to use their likeness for commercial purposes or promotions.


How is voice likeness verified? Half this thread is people saying they don't sound alike, so who can tell officially?

Real question.

Is there some waveform comparison that a court would accept?


IMHO a court would accept a waveform comparison that proved beyond all doubt that the two voices are similar, but I doubt you'd find a court that would settle the issue just because the comparison said no.

The cases brought forth by Marvin Gaye's family [1] showed that some judges will declare copyright infringement even if the melody, harmony and rhythm are different. Note that the author saying he reverse-engineered the original song in question probably had something to do with it, so in the end intent and artistic perception will always remain factors that no computer function can compute.

[1] https://en.m.wikipedia.org/wiki/Pharrell_Williams_v._Bridgep...


The similarity to Johansson feels more like a cover for the actual complaints I’ve seen, which are that the voice is far too breathy and flirtatious. It made the demo videos, with somewhat dorky men being complimented on their dress and whatnot, pretty damn uncomfortable. They either should have toned down the flirtation, or else added a more suave and seductive masculine voice to at least balance things out.


Probably not long before it can just generate voices however you like them best.


This is the actual problem. It is highly misogynistic and the comments here and elsewhere show how many people actually demand it.


How is it misogynistic?


Don't waste your time. No one can reason their way out of something they didn't reason their way into in the first place. A lot of people spent way more time than that individual thinking about what would appeal to the generic customer. And it so happens that regardless of gender, sexual preference, and pretty much anything else you can think of, humans prefer to look at human women and also prefer their voices over human men. And this is unlikely to change, ever, no matter how much they disagree with it, if history is anything to go by.

Let someone else swim against the tide if they feel like that's the best use of their time.


Strange, I didn't hear it, now I can see the resemblance, but doesn't seem like a big deal. It was the only voice I really used. I wonder if Spotify will follow, with their sound-a-like Ice-T voice?


They genuinely do not sound that similar to me


Agree. I watched "Her" recently and also "Under the Skin" so have SJ's voice in my working memory and Sky, while similar in some ways to SJ's voice, is different enough that this seems 100% defensible.


> We’ve heard questions about how we chose the voices in ChatGPT, especially Sky. We are working to pause the use of Sky while we address them.

Why?

They take it for granted that there is obviously something wrong here. Are users actually concerned about this? I think most people would not care. Is this just manufactured for the exposure? Is this just another example of OpenAI's condescending moral superiority complex?


My guess is that they’ve received a cease and desist letter. It’s not a hill they’re willing to die on, so they removed the voice and issued a public statement.


Are you missing the possibility that Scarlett might care?


Possibly, but that is not what their statement suggests to me, unless those 'questions' came from her/her legal team. I suppose they could also be worried about future legal action from her. Maybe it just isn't worth getting sued, even if you technically did nothing wrong. But I also have a hard time believing they went thru this careful process of voice selection and it never occurred to them that the voice sounded like her, and didn't think about the legal repercussions

https://openai.com/index/how-the-voices-for-chatgpt-were-cho...


> But I also have a hard time believing they went thru this careful process of voice selection and it never occurred to them that the voice sounded like her, and didn't think about the legal repercussions

You do know the whole modus operandi of AI companies is "screw laws, take it all, we'll see what happens", right? That they actively lobby to not respect IP laws because they couldn't exist if they were enforced? That they do not care about the negative impact they might have on anything?

I'm no proponent of IP myself but I don't take from others to redistribute for my sole profit.


> You do know the whole modus operandi of AI companies is "screw laws, take it all, we'll see what happens", right? That they actively lobby to not respect IP laws because they couldn't exist if they were enforced? That they do not care about the negative impact they might have on anything?

Yeah, for sure, I agree with all that. In this case, openai says they got a voice actor though so I'm not sure if IP applies here. Because of the movie SJ was in, I could honestly see a lawsuit going either way I guess


> In this case, openai says they got a voice actor though so I'm not sure if IP applies here

The effect of the DMCA on YouTube has shown that IP seems to apply even if something vaguely looks or sounds like the original. "Apply" not meaning "is true", but "has consequences".


How about this statement from Scarlett Johansson herself?

https://daringfireball.net/linked/2024/05/20/openai-johansso...


Yes, I just saw this. Looks like the 'questions' did come from her and her legal team after all


She should be happy she was selected as the voice of such a revolutionary piece of technology. It's an honor for her voice to be selected; she should show some humility and better social awareness in these matters.


I can’t tell if you’re being sarcastic or not. Do you say the same thing when she’s “selected” for deep fake porn videos?


So, a completely different actor voluntarily performs voice acting (a job) in exchange for money, using her natural voice (not mimicking Scarlett) and because there are some similarities, this situation is comparable to... Deep fake porn? What the..?


Sorry, did you deliberately skip over a bunch of comments to read mine or something? It feels like you missed some context. The person I was replying to was saying that ScarJo should be happy and grateful that the OpenAI voice was chosen to sound like hers, and that she should feel honored. With that context in mind, my question was whether she should also feel honored when she’s chosen as the subject for somebody’s Deep Fake porn — something she similarly has no control over and is only selected for due to her appearance/voice. I brought up Deep Fake porn specifically because it’s something she has been a victim of in the past.


I wonder if we will see these headlines:

“Florida person finds loophole in system to marry ChatGPT voice assistant”

“ClosedAI backpedals on controversial decision to ‘lobotomize’ voice assistant after widespread dislike”

“It is now legal to marry 3D printed humans with AGI in these 25 states”

“Florida person initiates divorce with ChatGPT, claims 50% stake of OpenAI. Florida state courts open to it. OpenAI tumbles at market open by 11%”


Look we both know it is Florida man not Florida person in them future headlines and ain't no way the legal system in this brave new world is giving a dude half in the AI divorce.


Breaking working functionality is not cool. Sure, they should perhaps license from Scarlett Johansson directly; it might cost 1% of their GPU budget. But pulling the rug really gives end users a lack of control of the interface, and harms trust in their product and organisation.

Edit: I know it's not really Scarlett Johansson, but it does sound very similar, and they can do better to make it right, in terms of sound and legality!


> really gives end users a lack of control of the interface

This is the case for all mobile and web apps and has been the norm for over a decade now. If you want control over the UI or functionality, use local software that doesn't check some server for feature flags.

Even if Scarlett Johansson would agree to license her voice, which is a big if, it's not her voice that's being removed. It's a different actress that some users think sounds like Scarlett.

https://openai.com/index/how-the-voices-for-chatgpt-were-cho...


I'm pointing out that it harms OpenAI to make users feel this way. I know it's common to give users a lack of control these days and that edge LLMs are not viable for this yet, and good luck with third-party ChatGPT clients, etc.!

I'm not saying it's her voice. I'm not disputing that they found someone with a similar sounding voice. But OpenAI should still aim to license from Scarlett Johansson and fine-tune on real licensed audio from her, the real deal, and let users switch to it if they want to.


I don't see why they should license her voice if she didn't provide the voice work. Actresses sometimes sound similar. If there's a demand that voices that sound "too similar" to well-known actresses (or actors, obviously) need their okay or licensing, that would seriously harm the lesser-known ones' ability to make a living with their own voice. Same with the demand to take it down because it sounds too similar. It's someone else, no reason to do something.


I agree but OpenAI wanted to take action. They should license properly if they want to take action, not just take down a legally sourced similar sounding voice.


How is changing a voice recording breaking functionality?


For me, it was the only voice that I liked and I am paying $20/mo to access it and I use it all the time.


Because I selected that voice and now it's gone. That was its job. It no longer does that job and now when I listen to the different voice, it's distracting because it makes me think of all this silly stuff instead.


So now we know that they tried to license from Scarlett Johansson directly but didn't reach agreeable terms. Sam has a lot of plates spinning, other plates are much more important but of course they should have done better from the start, and I'm sure they know it. It's not the best possible voice. Personally I really like a few from the 110 in the VCTK corpus from University of Edinburgh. They're used in a few places, I assume there are no license issues, and they could hire and re-record the best anyway with more emotion samples. It's funny they have a set list of voices rather than a space of voices that people can choose from.


> Breaking working functionality is not cool.

if your product relied on temu scarjo, you never had a product.


Various Internet sources claim Rashida Jones sounds a lot like Scarlett Johansson: could Rashida Jones have voiced ‘Sky’?


Once I saw that suggestion I cannot un-hear it.

It's much closer to RJ than SJ... MUCH much closer. Enough that if it's not her, I'll be genuinely surprised.


Works where archive.ph is blocked:

2024-05-20T07:16:23.679Z

OpenAI to Pull Johansson Soundalike Sky's Voice From ChatGPT

OpenAI is working to pause the use of the Sky voice from an audible version of ChatGPT after users said that it sounded too much like actress Scarlett Johansson.

The company said that the voice, one of five available on ChatGPT, was from an actress and was not chosen to be an "imitation" of Johansson, according to a blog post.^1 Johansson played a fictional virtual assistant in the film Her, about a man who falls in love with an AI system.

The voices are part of OpenAI's updated GPT-4o, which debuted earlier this month and can reply to verbal questions from users with an audio response.

1. https://openai.com/index/how-the-voices-for-chatgpt-were-cho...

Maybe I have missed something. Is this really the fulltext of the "article"??


It doesn’t sound like Her.


When I heard it, I thought, wow, they licensed Scarlett Johansson, what an amazing Easter egg.

If they didn't and just cloned her voice, it's more disregard for creators and artists than I would have thought possible. What were they thinking?

Edit after reading the official story... not sure I believe it, seems disingenuous, at best they chose someone because they really really sounded like Scarlett Johansson, and no one said, it might be a problem.

https://openai.com/index/how-the-voices-for-chatgpt-were-cho...


They cannot disclose the identity of the person involved in order to protect their privacy. This is a very convenient cloak, leaving no way to know if they cloned Scarlett Johansson's voice or if it's from someone else.


Fortunately for Scarlett, she can just sue them and force them to tell her. She also isn’t shy to litigate. I was surprised that Sam wanted to poke that particular beehive.


I know trademark litigation is wishy-washy, but can she even claim her voice is unique enough to claim some form of infringement? If I just happen to sound like her, am I a walking infringement?

Voices such as Shatner or Walken are at least as much about speech patterns as the voice, giving you another axis to compare voices against, so I can somewhat see those as being trademarkable. But when I hear the ChatGPT voice, it just sounds like "slightly-flirty generic female voice 03" to me.


Sam Altman and other OpenAI employees have been referencing the movie ahead of the presentation[1][2]. I think it'd be really challenging to prove that they didn't have that exact outcome in mind.

[1] https://x.com/sama/status/1790075827666796666

[2] https://x.com/alex_conneau


Microsoft once sued a high school kid named Mike Rowe for trademark infringement for having a website mikerowesoft.com so anything is possible.


I’m not sure it’s a trademark question so much as an appropriation of likeness question. I just meant that they can’t really train on her actual voice samples and then conceal whether they did it; that’s what discovery is for.


OpenAI is going to do a very good job of pretending that it stealing literally everything on the internet and re-selling it is some unique new activity that's not happened before. That's their core business. What they're not going to be able to do is pretend that hiring a soundalike and then making repeated references to the original voice is ok. This isn't new legal ground no matter how much money you throw at it. Otherwise every advert on TV would have some no-name actor doing impressions of well known actors. They will have to pay ScarJo a lot of money, and probably they'll have to stop using the voice too - because they've pissed her off so much at this point.


The Sky voice was obviously flirtatious sounding. It was actually very cringy and embarrassing for OpenAI, and it seemed really like a jump the shark moment TBH


I agree that the Sky voice is flirtatious.

But I'd also argue that flirtatiousness is a very good match for the character of Samantha in "Her" -

https://www.youtube.com/watch?v=GV01B5kVsC0&t=84s


It was in the demo, but not in the current tts implementation, where it is very neutral.


If it actually sounded like her I'd have been using it.


They should replace it with Mira Murati's voice for the female one, and Greg Brockman's voice for the male one.



> The voices are part of OpenAI’s updated GPT-4o

This is false. The TTS conversation feature has been out for quite a while.


Shame they’re pulling Sky. It is the only voice I could tolerate in chats. Also I agree with the others who don’t hear the resemblance. How much of these complaints actually come from users? This saga seems very media-driven.


I'll be honest, I don't hear a close similarity. If I were to pick a celebrity, I'd say it sounds more like Rashida Jones, though Rashida's voice is slightly higher.


They will gain more users than they will lose, including many of the people who are so outraged by this, because there's no such thing as bad publicity.


[dupe]

More discussion on official post:

https://news.ycombinator.com/item?id=40412903


The math here is pretty simple.

OpenAI clearly wants the “her” association. It’s already the favourite voice and projects many of the existing and imagined capabilities really well.

So they tried to pay for it, but it was either too expensive or blocked by SJ.

Her/her team's comments read like a negotiation.

But I suspect they’ve overplayed their hand. At best they may get a settlement and OpenAI will modify the voice to make it sound less like SJ.


Honestly, I thought it sounded a bit like her initially but I don't like her voice anyway. I thought it sounded awful in Her.


The Voice still works on my app?


It's ironic that she was in a movie called Her as an AI


That was the point. OpenAI is pretty good at marketing.


Hard to imagine they would stoop so low a few years ago


Now it's just expected.


This was very low-hanging fruit.


> OpenAI is working to pause the use of the Sky voice from an audible version of ChatGPT after users said that it sounded too much like actress Scarlett Johansson.

Uh?? How can any voice sound /too much/ like Scarlett Johansson?

:)



On point. OpenAI engineers grinding so hard they need companionship from somewhere.


Came here to post this. It's the best critique of the whole current OpenAI strategy.


I don't follow


I was implying many users would prefer the bot voice to be as much like SJ's as possible.


Doesn't sound anything like Scarlett Johansson - https://x.com/AngelinaPTech/status/1741450012314280431/video...

Her - https://www.youtube.com/watch?v=Ij0ZmgG6wCA

If it was a soundalike then it would be a legal issue obviously, nothing to discuss.

Looking at the HN comments, they are told it sounds like Her so that's what they believe. So you can't trust the NPCs to decide, they just regurgitate media headlines.

How do you quantify it?


I dunno - I think I'm a pretty normal person, and I thought they sound extremely similar. I thought it was pretty obviously intentional.


It doesn't sound that similar in the twitter video.

But I have tried many times. It does sound similar to me. Ask her to recite poetry with natural pauses and expression.


huh, I listened to those and I feel like they're fairly similar, obviously the _Her_ scene is more emotional which distorts the comparison somewhat


I agree, doesn't sound anything like Her to me.



