z991's comments

Wayne Hale is always my favorite source for these: https://waynehale.wordpress.com/2014/10/26/sts-93-we-dont-ne...


That was a great read, thank you! It has this tidbit:

  > STS-93 carried the heaviest payload the shuttle ever launched; the Chandra X-ray observatory (formerly known as the Advanced X-ray Astronomy Facility or AXAF) and its IUS booster.
Why would such a heavy payload have been launched on Columbia, which famously was the heaviest orbiter and thus never visited the ISS?


Columbia had slightly more space in the payload bay because her airlock was internal and didn't take up cargo bay space. The other shuttles had to have an external airlock fitted in the payload bay as needed, which made them unable to fit AXAF. IIRC the airlock requirement was a back-up in case there were deployment issues.

Again, if my memory serves, Columbia's internal airlock is also what made it unable to dock with the ISS; it was the only shuttle that retained that configuration. That's also part of the reason it was heavier, along with it being the initial airframe, built heavier than the subsequent ones.


>it being the initial airframe

Very minor nitpick, but the first airframe (spaceframe?) is OV-101 Enterprise. OV-102 Columbia is the second.


Why not OV-99, then? It was a test article, later retrofitted for spaceflight (Challenger), but laid down before OV-101.


OV-099 Challenger was renumbered from STA-099 (Structural Test Article); it was not originally built to be flown.

OV-099 as a number actually doesn't make sense, because the numbering scheme (OV-XYY) in full reads: Orbiter Vehicle, Series X, Vehicle YY.

Series 1 is the original (and only) line of flightworthy Space Shuttle Orbiters, including Enterprise; the vehicle number is given in sequence within a series, starting from 01.

So OV-101 (Enterprise) reads Orbiter Vehicle, Series 1, Vehicle 01. OV-102 (Columbia) likewise reads Vehicle 02, and so on.

OV-099 (Challenger) would read Orbiter Vehicle, Series 0, Vehicle 99, which makes absolutely no sense.
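For illustration only, here's a tiny Python sketch (a hypothetical helper, not from any NASA source) that splits an OV-XYY designation according to the scheme described above:

```python
def parse_ov(designation):
    """Split an OV-XYY designation into (series, vehicle):
    Orbiter Vehicle, Series X, Vehicle YY."""
    prefix, digits = designation.split("-")
    if prefix != "OV" or len(digits) != 3 or not digits.isdigit():
        raise ValueError(f"not an OV-XYY designation: {designation}")
    return int(digits[0]), int(digits[1:])

print(parse_ov("OV-101"))  # Enterprise: Series 1, Vehicle 01 -> (1, 1)
print(parse_ov("OV-102"))  # Columbia:   Series 1, Vehicle 02 -> (1, 2)
print(parse_ov("OV-099"))  # Challenger: Series 0, Vehicle 99 -> (0, 99)
```

Reading OV-099 through the same rule is what produces the nonsensical "Series 0, Vehicle 99".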


Throwing this into the chain for those who would like more specific weight information as the shuttle program progressed. https://space.stackexchange.com/questions/61842/what-outdate...


But in the context of this thread we're discussing how heavy the airframe is. Wouldn't OV-99 be lacking the airframe lightening enhancements that OV-103 and later enjoyed?


If we really want to get particular, there was OV-098, Pathfinder, though being made of wood it obviously was never meant for more than fitment testing. Oddly, though, it did get an OV designation, not an STA designation.


Huh, I wouldn't have thought that what is essentially a mock-up would merit a designator or a name.


Mockups that were particularly detailed and well-preserved (OV-098 Pathfinder) or appreciated (OV-095 SAIL) were given honorary Orbiter Vehicle designations.

Further reading: https://en.wikipedia.org/wiki/Space_Shuttle_orbiter#Orbiter_...


True! Enterprise was intended to be retrofitted into a flight-capable airframe, but changes in the design made it unfeasible. Thanks for the catch!


I am so glad that I asked. Thank you!


Thanks! We changed the top URL to that from https://www.theregister.com/2024/07/26/space_shuttle_columbi.... Readers might want to read both, of course.


I commend the authors on making this easy to try! However, it doesn't work very well for me for general voice cloning. I read the first paragraph of the Wikipedia page on books and had it generate the next sentence. It's obviously computer generated to my ear.

Audio sample: https://storage.googleapis.com/dalle-party/sample.mp3

Cloned voice (converted to mp3): https://storage.googleapis.com/dalle-party/output_en_default...

All I did was install the packages with pip and then run "demo_part1.ipynb" with my audio sample plugged in. It ran almost instantly on my laptop's 3070 Ti / 8GB. (Also, I admit to not reading the paper; I just ran the code.)


> It's obviously computer generated to my ear.

From the README

    Disclaimer

    This is an open-source implementation that approximates the performance of the internal voice clone technology of myshell.ai. The online version in myshell.ai has better 1) audio quality, 2) voice cloning similarity, 3) speech naturalness and 4) computational efficiency.


So this paper is a thinly veiled ad of myshell.ai's services?


Yes. And I used myshell.ai out of interest. It’s also absolutely terrible.


I went through downloading the open source version yesterday and tried it with my voice in the microphone, and a few other saved wav files.

It was terrible. Absolutely terrible. Like, given how much hype I saw about this, I expected something half decent. It was not. It was bad, so bad bad bad.

I was thinking maybe I did something wrong, but then I watched some of the YouTube reviews - these guys were SO excited at the start of the video, and then at the end they all literally said, "Uh, well, you be the judge."

I still can't help but feel there's some kind of trick to it - get the right input sample, done in the right intonation, and maybe you can generate anything.


I came here just for your comment. Thank you for doing this work so the rest of us don't have to.


Like 50% of arXiv. SV figured out that in 202x people read papers, not PRNewswire, and has adjusted accordingly.


Not totally unexpected, unfortunately. Any other OSS players on the market?


RVC


Thanks for the real example. Sounded quite generated to my ear as well. Wonder if it can do any better with more source material.


Looking at the website and the examples, it's pretty clearly set up to make stylized anime voices.


This is the driver for a lot of things. Anime. x264 was to enable better compression of weeb videos. This tech will allow fan dubs to better represent the animes in the videos.


Anime also drove the development of a lot of subtitling technology if I remember correctly.


My experience with other tools like xtts is you really need to have a studio-quality voice sample to get the best results.


The most obvious problem to my ears is the syllable timing and inflection of the generated speech, and, intuitively, this doesn’t seem like a recording quality issue. It’s as if it did a mostly credible job of emulating the speaker trying to talk like a robot.


The biggest trip-up is the pronunciation of "prototypically", and you had "typically" in your original. Maybe it's overfitting to a stilted proto-typically? Could try with a different, less similar sentence


That might be the next big contribution – performance in perceptually catching the features of a not-so-good recording – for example, with a webcam style microphone.


You have a custom prompt enabled (probably from viewing another one and pressing "start over") that is asking for opposites which will increase the noise a lot.


Clicking start over selects the default prompt, but it seems like you are right.

Starting over by removing the permalink parameter gives me much more consistent results! An example from before: https://dalle.party/?party=Sk8srl2F

I wonder what the default prompt is. There still seems to be a heavy bias towards futuristic cityscapes, deserts, and moonlight. It might just be the model, but it's a bit cheesy if you ask me!


Oh wow, I completely missed that, thanks!


The entire thing is frontend only (except for the share feature), so the server never sees your key. You can validate that by watching the network tab in the developer console. You can also make a new API key / revoke it afterwards to be extra sure.


Please make a new API key, folks. There are a lot of tricks to scrape a text box, and watching the network tab isn't enough for safety.


Who could scrape the text box in this scenario?


Good luck spotting it if it's attached to the window.onbeforeunload event. Chrome extensions could save it to storage. Probably even some Chrome vulnerabilities (it would just be a devtools network tab bypass, so not technically a 0-day). And that's just top of mind; I'm sure there are other methods.


Chrome extension malware.


You can try a custom prompt and see if you can get GPT-4V to stop doing that / if it matters.


You are right, it doesn't matter much. I tried the gnome prompt with an empty custom prompt for GPT-4V: https://dalle.party/?party=nvzzZXYs. Then I used a custom prompt to return short descriptions, which resulted in https://dalle.party/?party=Qcd8ljJp

Another attempt: https://dalle.party/?party=k4eeMQ6I

Realized just now that the dropdown on top of the page shows the prompt used by GPT-4V.


Wow the empty prompt does much better than I'd have guessed


Yeah, that's a bug, I'll try to fix it tonight!


Thanks for this! Basically, the default UI they provide at chat.openai is so bad that nearly anything you would do would be an improvement.

* not hide the prompt by default
* not only show 6 lines of the prompt even after the user clicks
* not be insanely buggy re: ajax, reloading past convos, etc.
* not disallow sharing of links to chats which contain images
* not artificially delay display of images with the little spinner animation when the image is already known to be ready anyway
* not lie about reasons for failure
* not hide details on what rate limit rules I broke and where to get more information

etc

Good luck, thanks!


the new fancy animation for images is SO annoying


Also, descent into Corgi insanity: https://dalle.party/?party=oxXJE9J4


Wow, that meme about everything becoming cosmic/space themed is real, isn't it?


substitute corgi with paperclip and you get another meme becoming real :p



Beautiful!


C-orgy vs papereclipse?


Love it! I forked yours with "Meerkat" and it ended up pretty psychedelic!

Got stuck on Van Gogh's "Starry Night" after a while.

https://dalle.party/?party=LOcXREfq

Also, love the simplicity of this idea, would love a "fork" option. And to be able to see the graph of where it originated.


I love how that took quite a dramatic turn in the third image, that truck is def gonna kill the corgi (my violent imagination put quite an image in my mind). But then DALL-E had a change of heart on the next image and put the truck in a different lane.


So do I understand correctly that the corgi was purely made up from GPT-4's interpretation of the picture?


No, in that case there is a custom prompt (visible in the top dropdown) telling GPT4 to replace everything with corgis when it writes a new prompt.


It was created by uploading the previous picture to GPT-4 via the vision API to generate a new prompt, using this instruction:

"Write a prompt for an AI to make this image. Just return the prompt, don't say anything else. Replace everything with corgi."

Then it takes that new prompt and feeds it to DALL-E to generate a new image. And then it repeats.
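The loop described above can be sketched roughly like this; `describe_image` and `generate_image` are hypothetical stand-ins for the GPT-4V and DALL-E API calls, not the site's actual code:

```python
INSTRUCTION = ("Write a prompt for an AI to make this image. Just return the "
               "prompt, don't say anything else. Replace everything with corgi.")

def dalle_party(first_image, describe_image, generate_image, rounds=4):
    """Alternate between a vision model (image -> prompt) and an
    image model (prompt -> image), feeding each output back in."""
    history = [first_image]
    image = first_image
    for _ in range(rounds):
        prompt = describe_image(image, INSTRUCTION)  # GPT-4V step
        image = generate_image(prompt)               # DALL-E step
        history.append(image)
    return history

# Stub models, just to show the data flow:
describe = lambda img, instruction: f"a corgi version of {img}"
generate = lambda prompt: f"image<{prompt}>"
print(dalle_party("start.png", describe, generate, rounds=2))
```

With real API calls in place of the stubs, each round drifts a little further from the original image, which is where the corgi insanity comes from.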


Absolutely wonderful. Thank you for sharing.


The half mutilated corgi/star abomination in the top left got me good lol




