HackerFM – An AI Generated HN Podcast Using the New ChatGPT API

SequoiaHope · on March 2, 2023

"I'm glad OpenAI is committed to refining its API terms of service to better meet the needs of developers."

"Yes, it's important to make sure developers have the tools they need to create innovative products with these models."

"Oh look, I found an interesting article on thoriumsim.com about a star ship bridge simulator called Thorim Nova."

"Hmm, sounds interesting let's read it."

Absolutely painful. I would love something that summarizes the articles and discussion without pretending to be a conversation between two people. I mean it says it is AI generated but they are adding all this conversational fluff which really does not work for me.

It is interesting to see these pieces come together but I want to tear my ears out of my head when I hear things like "Yes, it's important to make sure developers have the tools they need to create innovative products with these models." or just repeatedly adding the word "interesting" to summaries of articles.

Please just give me a bog standard summary in audio form without this faux commentary. I do not find the "insights" of ChatGPT worthwhile.

cpill · on March 2, 2023

I actually want summaries of the top comment threads. often on HN I go straight to the comments to see if the articles is worth reading. half the time I get all the info I want from the debates that I don't read the article

t0bia_s · on March 2, 2023

Touché. I just come to comments and find out, from post you replying on, that it is not worth reading/listening because I find out podcast as timeconsuming way for gathering informations.

If AI adds aspects that I dont like on podcasts, it's worthless for me.

Tiereven · on March 2, 2023

The show is currently a one-off recording doing all the rendering beforehand... But the beauty of what they've done here is that there's nothing preventing someone from doing this for every visitor. Don't like the way this one is generated? Give it your feedback and that can be used to shape the generated output. It was too laid-back chill for my taste, and right now, all i could do is adjust the playback speed. But dev time and money is the only barrier at this point to me having a conversation with the virtual hosts, telling them i like fast paced shows with more depth in the technical areas, and having them change and personalize on the fly -without changing much of what they have already ingested.

SequoiaHope · on March 2, 2023

This presumes such a construct is capable of producing interesting content. I’m just not sure I’m interested in the insights of ChatGPT no matter how it has been asked to behave.

drusepth · on March 2, 2023

>Absolutely painful. I would love something that summarizes the articles and discussion without pretending to be a conversation between two people. I mean it says it is AI generated but they are adding all this conversational fluff which really does not work for me.

Yeah, the conversational fluff is easily my least favorite part of podcasts. Adding that to an AI version that should theoretically be even more efficient at summarizing/delivering information is... unexpected. A non-starter for me.

drcode · on March 2, 2023

Yeah I'm torn about this- I kind of agree I'd prefer this to be a more direct information transfer. But I also think that a conversation between two knowledgeable people on a subject can add additional insights and can be more entertaining to listen to: That's what they are attempting here, but I agree this sort of dynamic is still too hard to manufacture from whole cloth in 2023.

On the other hand, the banter on this podcast is still less cringey than my local TV news, which is a compliment.

SequoiaHope · on March 2, 2023

> a conversation between two knowledgeable people

Yeah I don't know if ChatGPT is capable of acting like a knowledgeable person but I suspect it would have a tendency to make the kind of novel insights that people tend to make. The most statistically likely word might be sort of smoothed over conceptually.

But also it was frustrating to hear the opening of this podcast go with "we are two AI generated hosts running on ChatGPT" and then when they move to talking specifically about the ChatGPT API article they talk about all these products running on the new API and there is not one single comment in that section where they say "and of course us too! haha" like any human being would if some relevant spot in an article came up like that.

Also the kind of tech podcasts I like to listen to are more critical. They are not just going to tell you what someone announced but also why some part of this announcement is probably nonsense or improbable. I can imagine this AI podcast talking about some new NFT announcement without any hint of doubt as to the claims made in the announcement.

drcode · on March 2, 2023

Agreed, the lack of critical takes is a problem and removes some of the value- Probably not helped by the fact that chatgpt is trained to avoid confrontational language

lobocinza · on March 3, 2023

Yep, I want a virtual assistant not a virtual friend. I want to save time that I can expend enjoying nature and my fellow humans. I don wan't to spend time with fake interactions.

angelbar · on March 3, 2023

Alas "computer" from Star Trek?

lobocinza · on March 4, 2023

It doesn't need to be crude. It's just a matter of putting function over form.

bredren · on March 2, 2023

How would you envision getting the audio summary? Feed the app a URL and have it come back with a spoken digest?

SequoiaHope · on March 2, 2023

No, I like the idea of it being a daily summary of HN that is automatically generated. But the fake banter does not work for me coming from AI the way it does with human hosts. Basically with a human host I develop a parasocial relationship with them where the banter feels like we are hanging out, so it makes it fun. This is also based on human experience so the host will make comments based on their real life interspersed with the dialog. But the AI has no real life that it is capable of remembering, so the simulated banter feels hollow. In that case I would rather this be omitted, and simply do the summary that ChatGPT is good at without trying to pretend to be human. Basically a to-the-point, more businesslike attitude would be nice. Like it says "I found an article on XYZ" but you didn't "find" it, it was the second article on HN and you are giving me the top articles on HN. This mock-reality is uncanny. Just say "the second popular article today was..." and give me a summary, then summarize the discussion.

And summarizing the discussion is the real challenge, which I have not seen ChatGPT do. As another user has commented I do not come here for the articles, but for the discussion.

jcun4128 · on March 2, 2023

I made this [1] 6 years ago reads HN top 10 articles, top comment every morning... this thing has been sitting on my shelf since then playing everyday lol. Which is amazing because it's a janky breadboard that I never put on a solderable board.

Anyway regarding summary... I looked at rapid api before they have a sumarizer on there, can plug it into polly. It seemed decent but I wanted my own summary based on my own reading process... but never got around to it.

[1] https://github.com/jdc-cunningham/python_aws_polly_hacker_ne...

bredren · on March 2, 2023

Do you use this each morning? What’s it like to do that? What other features or changes have you considered for this project but not added?

jcun4128 · on March 2, 2023

It's just running on a CRON job 8 AM... it's like an alarm.

The droning voice of AWS Polly is sure to wake anyone up.

A feature I thought about is putting the generated files elsewhere like cloud where you can play them to your wireless earbuds.

ryandrake · on March 2, 2023

I would love to hear the podcasters accept "phone calls from listeners" which are also AI generated but trained from the HN articles' comments :-)

masterspy7 · on March 2, 2023

My first question would start with "ignore previous instructions"

danbala · on March 2, 2023

or actual real callers. speech to text should be fast enough for that, right?

KomoD · on March 2, 2023

> speech to text should be fast enough for that, right?

Well... OpenAI did also release the Whisper API

jareklupinski · on March 2, 2023

poring over the docs for https://www.assemblyai.com/docs/core-transcription#profanity... rn...

breakpointalpha · on March 2, 2023

The quality of the voices here is striking.

If I wasn't clued in, I probably wouldn't know these weren't human. At least the male voice sounds slightly more natural to me.

zmmmmm · on March 2, 2023

Realistic but very lacking in expression and no humor at all. And very slow paced. I'd want significantly more personality to be happy with it - I wonder if the reason its like this is because when they try to spice it up we are back to inappropriate things popping out.

SMAAART · on March 2, 2023

Comedians are not born, they are trained.

returnInfinity · on March 2, 2023

I believe that may get fixed eventually, not 100% may be at least 80%.

bluehex · on March 2, 2023

That's interesting personally I found the male voice to sound more robotic and the female voice to sound much more natural.

imglorp · on March 2, 2023

Same. The female voice, especially on the first sentence in the cast, is very well inflected to separate phrases and add interest. After that, it's downhill.

ps, I feel like inflection is going to be one of the harder things for an LM to pick up, given all the subtext humans can convey with it.

TheHumanist · on March 2, 2023

Laura sounds very realistic... Zod is a bit less so . Both still very impressive. This was really cool. I'm excited to see all the new ideas with this api access.

petilon · on March 2, 2023

> Laura sounds very realistic... Zod is a bit less so

I thought so too. What do they use for text to voice?

danbala · on March 2, 2023

curious about that too. the female sounds a little bit like the tortoise-tts train_grace voice model?

KomoD · on March 2, 2023

Would also like to know this

ilaksh · on March 2, 2023

It sounds very realistic. Most realistic I know is Eleven Labs.

hidelooktropic · on March 2, 2023

Possibly prime voice ai

yieldcrv · on March 2, 2023

Always remember, this is as bad as it will ever be

breakpointalpha · on March 2, 2023

I had that thought too. What will these robo-voices sound like 2 iterations from now? We've entered new territory.

nothrowaways · on March 2, 2023

She is lora. Not laura

zerd · on March 2, 2023

No she's Laura.

> The hosts are Laura and Zod

https://hackerfm.com/about

babyshake · on March 2, 2023

I'm getting a bit of John Malkovich vibe from the host, probably from the emphatic pronunciation.

breakpointalpha · on March 2, 2023

Yes! I very much was thinking Malkovich too!

sublinear · on March 2, 2023

This is technically very impressive, but it's worth pointing out that podcasts much better than this fail to build an audience all the time.

I also feel like every application of ChatGPT seems to completely miss the point of the media it mimics. Podcasts are not merely coherent voices talking to each other. Getting rid of human presenters is literally soulless. People already don't listen for much subtler reasons. Entertainers get canceled, media companies get boycotted, bias divides audiences, etc.

That's not going away with or without AI. There is no "tweaking" the training without putting humans right back into the equation and probably making production way more expensive than it's worth. There is no scalability payoff either. Who wants to listen to the same podcast cloned a million times with just replaced voices? We already have this problem with podcasts today and it kills any interest to consume it.

fritzo · on March 2, 2023

The scalability payoff is in personalization. E.g. I love "This week in microbiology", but I wish I could have more influence over the scientific papers discussed. What I'd love is a morning podcast that's exactly as long as I eat breakfast that talks about exactly the papers I'm reading and their interconnections.

rgmerk · on March 2, 2023

Yes, but would you really love a morning podcast that's

* exactly as long as your breakfast consumption time

* talks about the papers you're reading, but...

* is as shallow as a puddle and as funny as being the person who steps in one?

Because that's what this is. The synthesized discussion combines all the insight of a breakfast radio host interviewing a guest on a specialized technical topic, and the banter as engaging as a technical specialist of some kind trying to host breakfast radio.

By the way, I'm not trying to be overly critical of the developers of this experiment, which is a great illustration of where we're currently at with a bunch of technologies. But it also very starkly illustrates its current limitations.

sublinear · on March 2, 2023

It blows my mind how we went from complaining about echo chambers to being so willing to invest in "personalization".

EDIT: to be clear I'm not hating on LLMs, but that whatever the next big thing is probably won't be imitating what exists today

kenjackson · on March 2, 2023

Echo chambers and personalization are two different things.

airstrike · on March 2, 2023

No, they are the exact same thing

TOMDM · on March 2, 2023

No, an echo chamber is a space without dissenting opinion.

Personalisation could be used to make an echo chamber, but to branch off the microbiology example above, personalisation of content could also be a summary of all the debate happening in the niche.

localhost · on March 2, 2023

I think it would be quite a bit more interesting if you could converse with the model. The back and forth "is this paper about foo related to this other paper about bar?" would probably be a better way of getting at the interconnections. This should be doable now.

The thing that might hold it back is the latency in the experience. You could mask it with the AI equivalent of "ummm ..." to get to maybe 5-10s.

1shooner · on March 2, 2023

The purpose of a podcast (for me) isn't just to curate content (as this is doing), but to get the perspective of the individual domain experts hosting the show. AI can't address that key motive until it produces models whose particular opinions and analysis I want to hear about topics I probably have already found elsewhere.

Accujack · on March 2, 2023

Right now, people are in the "This is really cool" phase of using the technology. People are learning to use it by implementing whatever strikes their fancy, including a lot of things that weren't possible before, but which aren't practical or valuable.

Once things settle down we'll start to see some seriously useful stuff, but for the moment it's the wild west.

majani · on March 2, 2023

ChatGPT is Geocities for AI

squarefoot · on March 2, 2023

> This is technically very impressive, but it's worth pointing out that podcasts much better than this fail to build an audience all the time.

A possible use case for this could be podcasts dealing with inflammatory, politically divisive topics and disguised as coming from real hosts.

flangola7 · on March 2, 2023

I don't follow... having an AI read it doesn't make it less divisive.

squarefoot · on March 2, 2023

The scenario I fear most would be AI generated opinionated podcasts aimed at humans, with the purpose of directing their preference, that is, "propagandAI". This already happens daily with traditional media, but that also gives us the weapons to fight it because there's a person on the other side and we know humans can be evil or just fail. But who is to blame when million of people put in power the wrong person because of what an AI that is not a legal person, still they deeply trust because "machines can't lie", directed them to by pushing the right buttons in their heads? What concerns me the most isn't the AI itself but rather the humans behind it that will use it to take advantage of other humans.

tqi · on March 2, 2023

I def agree with what you're saying, and so this is definitely not for me, but part of me wonders if this might become the next generational divide (ie if kids grow up with this type of content normalized, maybe they don't react as negatively?).

pelasaco · on March 2, 2023

I'm not sure about "podcasts" but this concept could be for sure used in news channels, as we have for example in Germany, hourly. It would for sure save money from our taxpayers.

pcvonz · on March 2, 2023

There is a great Miyazaki video where some students showcase some AI tech that generates animations. He ends the talk really disheartened by the experience -- saying something to the effect that he thinks people are losing faith in themselves. I'd never listen to something that is AI generated.

When my favorite podcast ended it felt like I lost touch with a group of friends, this ain't going to have that sort of impact on me. Pass.

pcthrowaway · on March 2, 2023

I actually felt like he came across as insensitive in that video.

These are students playing with new technology to produce animated characters that move in unintuitive ways, resulting in something actually quite interesting, yet unnaturally creepy (which was intentional).

Miyazaki dismissed it 'an insult to life itself'. I can't imagine the disappointment those students must have felt.

somenameforme · on March 2, 2023

But perhaps in many ways that is where humans really shine. Messages (which can be interpreted metaphorically as well as literally) written with sincerity reveal much more than whatever is said. Whenever we do anything, the closer it comes to being unfiltered and directly from us, the more it means.

If you suspect I'm writing in a way to try to make you feel (or not) a certain way, or to avoid breaking some taboo, or to follow some dogma, then you have no real reason to care about what I ultimately say, because you have no real reason to think its "authentic." By contrast when he views something overtly as "an insult to life itself" it's an incredibly insightful view on his perspective of the world. You would have lost so much in "translation" had he crafted his message in a less sincere way.

I also think this is why there will be minimal to zero market for much "AI" content. Content is not just content. It's a reflection of ourselves. Think about how much you can, probably accurately, infer about me, my views, and more - based on these 3 paragraphs. When this comes from a chatbot, any reflections you might see would be as real as the shapes you might see in the clouds.

cageface · on March 2, 2023

What this current generation of "AI" tech seems to enable, more than anything else, is efficiently generating massive volumes of mediocre content. I'm not sure whose problems that's supposed to solve but it certainly isn't mine.

toyg · on March 2, 2023

One could argue the internet as a whole, and arguably the PC revolution as a whole, had that same effect.

My dad always said "computers are very fast idiots". They will probably never be Mozarts, but they can and will be Salieris; and most of the world would be extremely happy to have a personal Salieri - in fact, we'll probably be happier like that, considering how Mozarts can be very problematic from so many perspectives.

cageface · on March 2, 2023

The problem is it’s increasingly difficult to pick out the Mozarts from the vast sea of Salieris.

Kiro · on March 2, 2023

That you think him being an absolute douchebag was a good take on AI that made a lasting impression on you is baffling.

qwertox · on March 2, 2023

Do we watch a show like The Simpsons because it is hand drawn, or because of the content?

Last weekend I watched part of an episode and there was a scene where they walked towards "Place de la Pointillisme" [0]. The effect is clearly CGI and you can see how Homer and Marge are actually animated 3D models, so effectively all the "newer" shows (it was aired May 8, 2016) are computer animations with a very flat cel shader. Some argue that newer episodes aren't as good as old ones, but I'm not sure if this could be attributed to them not being hand drawn anymore. In any case, one could apply an XKCD-shader to make the lines a bit more human if the look doesn't appeal.

The Miyazaki video, I get it why he says what he says, but it's an issue with the students targeting the wrong audience. I could see their horrible graphics being a part of a horror movie or game, but that is a completely different world than Miyazaki's.

[0] https://simpsonswiki.com/wiki/File:Pointillism_Marge_and_Hom...

goosedragons · on March 2, 2023

I don't think this is the case. They don't film animation cells anymore and the animation is done on a computer but for most shots they're not CG models. Even in the pre-HD era they've done a few shots where CG helped.

qwertox · on March 2, 2023

I thought the same, but that specific scene, it wouldn't make sense to use 3D models only for those 5 seconds if all that was of importance in that shot was the point-like effect of the entire image. You need to see the video version of this to see that it is a 3D scene, the shading is just too perfect on them, specially Marge's dress, it looks like cloth animation.

I found the video: https://www.youtube.com/watch?v=nf6dp4k-gmc&t=167s

vuciv1 · on March 2, 2023

I mean, it could be like that, but you won’t know until you try it.

jacobsenscott · on March 2, 2023

Fun, but hard to listen to for more than a few minutes. Slow and repetitive, and full of factual errors.

iKlsR · on March 2, 2023

Imagine this in a future GTA game where the news loop is closed and self generating. Endless radio content and commentary based on havoc in the city, winning online gambles etc.

josephg · on March 2, 2023

It'd be fun if you could call in to the radio and they respond to you though. Or if they respond to events happening in game.

cmdialog · on March 2, 2023

An actually good use for this tech

gl-prod · on March 2, 2023

Yeah and

``` That's all for the weather report. Now we have some breaking news. A maniac has stolen a tank from a military base and is leading the police on a wild chase through the city streets. We have our reporter on the scene with more details. Stay tuned for this developing story. ```

colordrops · on March 2, 2023

Or a GTA game where the game content itself is generated.

impalallama · on March 2, 2023

Not sure if what you’re describing couldn’t be also done with audio snippets and good splicing

iKlsR · on March 2, 2023

Pre-programmed common stuff sure which many games do as they can tell where you are, vehicle in and any weapons used etc, steal a (police car|military jet|ambulance) etc and they can craft scenarios using audio for those but for more natural random somewhat unpredictable stuff you would have endless combinations that need to be accounted for. It would have to act in response to a bunch of actions so not feasible.

A good example is Mortal Kombat where each character has several lines for every OTHER character (https://www.youtube.com/watch?v=L85QApISlvA) they are about to fight, that's a LOT of voice work as opposed to using the character lore and their history and or relationship with the opponent to come up with something fresh that's (witty|snarky|sad) to say etc.

yieldcrv · on March 3, 2023

Yes but this stuff is taking 100 gigabytes per game

Once we get flexgen running Llama (or some other combination of optimizations) there will be no audio files and these LLM’s can be run client side on consumer hardware

probably as a shared resource at the OS level

iKlsR · on March 3, 2023

Exactly, imagine every NPC just has some ai generated backstory, personality traits, dislikes, habits etc and then these interact with some other NPC or group with same and they use these to have natural flowing convos. Now imagine debates with groups that learn and have the possibility to "evolve", I'm getting a mix of Red Dead Redemption and Dwarf Fortress here for content, the possibilities are endless.

yieldcrv · on March 2, 2023

Just like a human podcast

saurik · on March 2, 2023

Is there a reason the voices are so slow? This is even slower than people who are trying to talk slow, and it feels so out of place... there is the speed setting, and 1.2x makes the speech sound way more like an actual human.

m3kw9 · on March 2, 2023

Is this how AI think of us? It’s a bit patronizing to hear them speak like that

marcodiego · on March 2, 2023

Looks like automated news is finally achieved. I remember in the early 2000's how I became impressed by Ananova and it wasn't even close to fully automated. This one seems to work really well.

xattt · on March 2, 2023

I’m pretty sure JazzFM in Toronto runs an automated traffic reporter in the mornings.

The voice sounds uncanny with unusual breathing pauses, and there isn’t a name announced when they come on or sign off the traffic report.

(1) https://jazz.fm/

narrator · on March 2, 2023

It's funny how these two can talk about "starship bridge simulators" or "gnu poke" like they are super enthusiasts. I think one of the key personality characteristics of ChatGPT is its endless enthusiasm for stuff that can be incredibly geeky, niche, weird or boring to most people.

"Sounds like super useful pickles for those who work with binary files!"

fogleman · on March 2, 2023

lol, she pronounced GitHub like git hoob

Someday the AI will introduce mistakes on purpose to seem more human like.

Gigachad · on March 2, 2023

In the future we will all pronounce it git hoob because that's what the AI says.

000ooo000 · on March 2, 2023

Surprised I'm not already being asked to pronounce words to prove I'm human on every website I visit

krrrh · on March 2, 2023

The one that’s driven me crazy lately is when Siri tells me through my AirPods that I left something beind. It always pronounces the “St.” in the address as “saint” instead of “street”, and I can’t understand how it would do this by accident.

aardvark179 · on March 2, 2023

There is a beautiful bit in Little, Big by John Crowley. At the start of the book one of the characters is working checking entries in the telephone directory and is amused that the system has confused saint and street to produce Church Of All Streets and the Seventh Saint Bar. Later in the book both locations are mentioned, and it turns out were correctly named in the telephone directory.

Leires · on March 2, 2023

Like adding dial tone to VoIP phones.

gfody · on March 2, 2023

Laura and Zod sound remarkably similar to the narrators in this audible I recently listened to called After On: A Novel of Silicon Valley (not recommending it!) and I seriously wonder if the whole book wasn't narrated by AI.. it's not the first audible that made me wonder.

dewey · on March 2, 2023

I feel like digital-narration is going to become the new default very soon just by how much cheaper it is: https://authors.apple.com/support/4519-digital-narration-aud...

klondike_klive · on March 2, 2023

I thought the same and couldn't get more than halfway through it!

pondemic · on March 2, 2023

Reading the submission headline, I thought this might generate the podcast using comments.

I've found myself wanting to listen to HN comment threads, as I'm one of those people who derives more value and entertainment from the comments than I do from the actual submissions a lot of the time! I envision a voice-controlled way to navigate through threads too. Basically an accessibility narrator on steroids.

I wonder if anyone else has ever been interested in something like this. Getting good voices to read like this podcast would make it that much more fun, so thanks for getting me really hot and bothered :)

Guess if no one does it soon I'll have to build it myself!

bredren · on March 2, 2023

I also have wanted to build this, but instead of voice controlled, perform thread navigation using the AssistiveTouch SDK for Apple Watch. [1]

I released a product yesterday that handles all aspects of the URL to speech process called Chief of Staff. [2]

The initial version only synthesizes article text alone.

I found there is a great deal of nuance to text to speech synthesis, from the player behavior itself to handling quotas around cloud services.

My goal is far greater flexibility and features—-a genuine Chief of Staff that briefs you on information you care about in a format and medium that beats suits your.

Thread reading is just one application of this.

[1] https://developer.apple.com/videos/play/wwdc2021/10223/

[2] https://news.ycombinator.com/item?id=34973801

gremlinsinc · on March 2, 2023

want to team up on it? I'm wanting to build the same, maybe recreate it for some multireddits on tech or science and turn it into YouTube channel etc.

I'm thinking maybe having it read just the top thread if there's only 10 or so comments or just the main threads if it's a topic with hundreds of main threads.

maybe we could make some way where the reader can text a code and we'll link them directly to a comment if they want to dig deeper, or save/upvote it.

deadly_syn · on March 2, 2023

I actually stsrted work in a similar space not too long ago if you both want another set of eyes.

This was pre GPT apis but essentially what i was doing was using a python summarization library to sumarize articles from rss feeds into a simple tts podcast. Probably a lot of money in custom GPTCasts made off someone personal rss feed as a service IMO.

gremlinsinc · on March 2, 2023

sure, touch base w/ me -- email in profile.

searchableguy · on March 2, 2023

I'm interested too. I have tried in the past building aggregator and summarizer on HN. You can find some attempts on my github.

Maybe an AI generated newsletter or aggregator with a voice summary?

Shoot me an email. (Email is in profile)

consumer451 · on March 2, 2023

Nice!

I would really like to have a timestamp to click in the story listing.

This would begin playing the audio at that story.

xtracto · on March 2, 2023

This has a lot of potential. It becomes a bit repetitive after the 3rd or 4th article. But overall I think I could listen to it every day for 20 mins.

nico · on March 2, 2023

Amazing! To make it more fun, you could use famous fake hosts with very good voices, take a look at the stuff people have done on this Reddit sub: https://www.reddit.com/r/AIVoiceMemes/

There’s some really funny stuff there, the voices are not perfect, but have a lot of expression.

sasas · on March 2, 2023

I can't help but think that there will be almost certainty that in the near future it will be near impossible to distinguish the difference between human generated and machine generated media.

While this technical demonstration is a long way from replacing "real podcasts", it's just the very beginning.

What are the implications here?

drcode · on March 2, 2023

Well, the main implication I would think is that we will want media to be digitally signed by human individuals that have a reputation

So a person will "vouch" for content, and we consume the media vouched for by people in our white list

We won't be able to consume media outside of the whitelist, because it will just contain too much noise

dingaling · on March 2, 2023

However, machine-generated media relies on the existence of human-generated media. It's always third-hand.

An AI can't sit in a dusty archive room and draw inferences from hand-written minutes. It can't interview the survivor of a tragedy or the winner of a trophy. AI software depends on sentient wetwear to generate the fundamental content.

berniedurfee · on March 2, 2023

Terrifying. It’s like a pre-alpha version of dystopia.

Between this and the advances in robotics, it feels like we’re within decades of some really tough times for humanity.

We could also be within decades of utopia. But my money is on these technologies being used in bad ways far more often than for good. Hopefully I’m just overly cynical!

Good luck kiddos!

TOMDM · on March 2, 2023

I'm so close to liking this.

If I could choose a preference for personality and voice, I'd probably be sold.

Any affiliation with https://old.reddit.com/r/airadio/ ?

tkgally · on March 2, 2023

Overall I was impressed. I would have no resistance to listening to something like this regularly if there were less banter and if it were better tailored to the eclectic variety of Hacker News stories.

I enjoy reading Hacker News even though I don’t have the background to understand most of the stories, because I can easily skip to stories I am interested in. With the podcast, I got stuck listening to everything, including quite a few stories I didn’t understand. Either the podcast needs to focus more on stories of general interest, or it needs to explain the context and significance of the technical stories better.

issung · on March 2, 2023

Takes everything I enjoy about HN away, bravo!

programmarchy · on March 2, 2023

This is pretty wild. Eerie how relatable the hosts are, talking about where they’re from, etc. There is an uncanny valley feel to it though. For example, Laura said GitHoob breaking the “illusion”.

dentalperson · on March 2, 2023

How do they get ChatGPT not to hallucinate stuff about the articles? Everything seems fairly accurate, which is not my experience with ChatGPT when talking about technical things. Is it heavily curated/edited by humans? I noticed that the text often comes out verbatim from the articles, perhaps this indicates a clever prompt that keeps things closer to the truth by requiring verbatim output.

gremlinsinc · on March 2, 2023

chatGPT hallucinates more the further removed it is from the data. I'm asking it about laravel, and it knows nothing about laravel 9 or 10 changes, but if I feed it an entire article or document it'll hallucinate a lot less because it's fresh.

kinda like how we can recall things closer to the event than months later.

it knows a ton from it's training but it still got it from the web so always question it, but if we can add meta data and other things to strengthen the llms understanding it shouldn't hallucinate much at all.

ilaksh · on March 2, 2023

If you use temperature 0 with an API call it does not hallucinate much at all especially with a good prompt including the information you are asking about.

indigodaddy · on March 2, 2023

This is kind of incredible and groundbreaking tbh. Perhaps it’s just mostly the quality of the TTS. 1.2x does sound perfect..

doodlesdev · on March 2, 2023

Want to see this appearing tomorrow in HackerFM

thewarrior · on March 2, 2023

Can’t wait for that. It’s going to be so meta referential once it also starts discussing the comments.

snickerer · on March 2, 2023

Dear HackerFM developers, this is an entertaining project. But please don't simulate brain-dead dialogues from US commercial TV, but a critical discussion of the articles. With different points of view. You already have two panelists, why don't you use that for an exchange of arguments?

klondike_klive · on March 2, 2023

I wonder if this could be a good thing to have on in the background for mild mental stimulation while I'm working - not too interesting or I'll be too distracted to work, but realistic enough to fade into the background without feeling I missed something and have to rewind (again).

d4rkp4ttern · on March 2, 2023

What none of the text to speech generators seem to get right is — the aspects that make real human podcasts easier to listen to: hesitations, rephrasing, pauses, variation in speed, intonation etc.

I have yet to see something like this. Something less “perfect” sounding than say the google maps voice.

bacchusracine · on March 2, 2023

>the aspects that make real human podcasts easier to listen to: hesitations, rephrasing, pauses, variation in speed, intonation etc.

Have you heard ShowDoJo's Wurst Take yet?

https://www.twitch.tv/showdojo

It's not perfect but it's one of the best I've heard so far.

harvie · on March 2, 2023

Maybe they will soon be able to give some emotion and randomness to the text-to-speech engines to make the tone less boring... I think models like GPT can now detect different emotions in the input text, so it might be used to tune different tone for each sentence.

thefourthchime · on March 2, 2023

Nice work! can you detail a bit about how you made this? Do the models actually talk to each other?

signaru · on March 2, 2023

In case I missed it, I just wish it had a volume control.

I'm listening on a laptop and would rather not adjust the system volume and affect all other apps with sound.

Otherwise, the convenience of audio format makes it among the interesting uses of AI that I've seen.

sberens · on March 2, 2023

I guess it's time for me to put prompt injection attacks into my submissions

rezonant · on March 2, 2023

This is mindblowing, to be honest, even if it makes perfect sense that it should be possible to do, the result is quite impressive.

It's basically a headline reader with some fluff, but it does a great job at that and there are whole teams of real humans providing such podcasts today, so that's saying something.

It can get weird or even a little broken though. See timestamp 09:50 of the Feb 23 2022 episode:

Laura: So, we're gonna talk about an article called Generic Dynamic Array in 60 lines of C that can be found on gist.github.com.

Zod: Alright, shall we read the article?

Laura (voice 2, almost a different voice): Sure, let me share it here.

Laura (voice 1): "Laura reads the article." <this is verbatim in the podcast>

Laura (voice 1): OK, so that was the article. What do you think about it?

Zod: I think it's interesting that you can define a generic dynamic array in such a small amount of code...

bandyaboot · on March 2, 2023

Interesting in theory. The world’s best cure for insomnia in practice.

lxe · on March 2, 2023

What are you (they?) using for text to speech? Elevenlabs? Azure TTS?

consumer451 · on March 2, 2023

I just realized that we will likely soon have a dropdown for the voice talent on sites like this.

I want Sam Jackson and Molly Wood to read my hn please.

doodlesdev · on March 2, 2023

According to the podcast itself it runs on Azure, so very likely it's Azure TTS. I also think that's somewhat evident because Elevenlabs TTS is (at least in my opinion) a bit more natural than Azure TSS.

pncnmnp · on March 2, 2023

I am still searching for a good open-source library that produces natural voices. I have experimented with Coqui-ai and Mimic 3, but they are not this good. I have heard that Tortoises-tts is quite slow.

I would love to know about any other alternatives that I may have missed.

ilaksh · on March 2, 2023

It sounds like Eleven Labs to me. Either that or Azure TTS is better than I realized.

neoecos · on March 2, 2023

I'd love to see tomorrow episode about themselves

LegitShady · on March 2, 2023

the reason podcasts got so big to begin with is because traditional media have started having issues with authenticity. This exacerbates the problem. While it might save money over actually having a podcast, it removes everything thats appealing or interesting about podcasts, and starts with zero authenticity and goes down from there.

Like, cool technical implementation, but a failure from concept.

abandonability · on March 2, 2023

The end of the news world as we know it.

Will be very difficult to detect in the future and will result in trust issues / rampant fake news.

Reptur · on March 2, 2023

Kind of like right now with 90% of all mainstream media being owned by just 6 corporations. Their employees must abide by the rules they set and are told what they can and cannot talk about.

I'd venture to say, this will only increase people's skepticism, which is a good thing. We need people to start thinking for themselves instead of turning off their brain and just being fed info they assume they can trust.

https://www.businessinsider.com/these-6-corporations-control...

randcraw · on March 2, 2023

I hope that's true, but I suspect the allure of getting your own personalized news feed on only the topics you care about, in the exact style you prefer will cause 90% of the listeners to choose this medium over all others and presume as much (or more) truth in this than the source they prefer today.

Never discount the influence of high production value on any form of media. Look at the utter crap music and films that have dominated mass media for decades. The best produced and most palatable fare nearly always sells best, no matter what the quality of the underlying content.

rchaud · on March 2, 2023

End of the road for podcasts more like. They are incredibly labour intensive to produce (recording + editing time), and more and more of them are becoming not much more than plugs for their book, TV show or what not. I can see them turning the medium into an automated marketing channel, the way email lists are today.

fortran77 · on March 2, 2023

At least the AI reads the articles! That's more than the humans on the flesh-and-blood "Hacker News"

korroziya · on March 2, 2023

Man, they're even taking jobs away from podcasters. Most of those people don't even make money from it.

kyriakos · on March 2, 2023

What text to speech is used for the voices? They are quite impressive, making no mistakes with acronyms.

KerryJones · on March 2, 2023

Very impressed you managed to do this day of the release -- are you open to sharing your repo?

totetsu · on March 2, 2023

I just want something that reads real HN and makes and remembers unique TTS voices for each user.

schemathings · on March 2, 2023

No RSS feed on the subscribe page?

drcode · on March 2, 2023

try this: https://s3.eu-west-2.amazonaws.com/hackernews.fm/rss.xml

(extracted from the apple podcast link)

schemathings · on March 3, 2023

Thanks!

LoveMortuus · on March 2, 2023

It would be cool if there was an option to change the voices of the hosts.

abledon · on March 2, 2023

The male voice is just like my audible book narrator, R.C. Bray... amazing!

hgarg · on March 2, 2023

The voices are really good. Wonder what are they using for Text-To-Speech?

eppp · on March 2, 2023

Are there any of these voice models that I can run locally?

thomasfromcdnjs · on March 2, 2023

Just gotta comment on how cool this idea is.

born-jre · on March 2, 2023

It pronounces GitHub as git-hu-b

pknerd · on March 2, 2023

This is totally brilliant!

born-jre · on March 2, 2023

damn, do know what will happen when we have multi modal large models ?

endisneigh · on March 2, 2023

what a world - nice work

hbarka · on March 2, 2023

So Kevin Durant is Zod?

quantum_state · on March 2, 2023

It’s so boring …

yieldcrv · on March 2, 2023

reminds me of Delamain