
Note: unexpected Three Body Problem spoilers on this page


Those summaries are pretty lousy and also have hallucinations in them.


I agree. Below are a few errors. I also asked ChatGPT to check the summaries, and it found all the errors (and even made up a few more that weren't actual errors, just points not expressed with perfect clarity).

Spoilers ahead!

First novel: The Trisolarans did not contact Earth first. It was the other way round.

Second novel: Calling the conflict between humans and Trisolarans a "complex strategic game" is a bit of a stretch. Also, the "water drops" do not disrupt ecosystems. I am not sure whether "face-bearers" is an accurate translation. I've only read the English version.

Third novel: Luo Ji does not hold the key to the survival of the Trisolarans, and there were no "micro-black holes" racing towards Earth. The Trisolarans were also not shown colonizing other worlds.

I am also not sure whether Luo Ji faced his "personal struggle and psychological turmoil" in this novel or in an earlier one. He was certainly most sure of his role at the end. Even the Trisolarans rated him at over a 92% deterrence rate.


Yeah describing Luo Ji as having "struggles with the ethical implications of his mission" is the biggest whopper.

He's like God's perfect sociopath. He wobbles between total indifference to his mission and interplanetary murder-suicide, and the only things that seem to really get to him are a stomachache and being ghosted by his wife.


And this example does not even illustrate long-context understanding well, since smaller Qwen2.5 models can already recall parts of the Three Body Problem trilogy without pasting the three books into the context window.


And multiple summaries of each book (in multiple languages) are almost definitely in the training set. I'm more confused how it made such inaccurate, poorly structured summaries given that and the original text.

Although, I just tried with normal Qwen 2.5 72B and Coder 32B and they only did a little better.


Seems a very difficult problem to produce a response based just on the text given and not on past training. An LLM that could do that would seem to be quite a bit more advanced than what we have today.

Though I would say humans would have difficulty too -- say, having read The Three Body Problem before, then reading a slightly modified version (without being aware of the modifications), and having to recall specific details.


This problem is poorly defined; what would it mean to produce a response JUST based on the text given? Should it also forgo all logic skills and intuition gained in training because it is not in the text given? Where in the N dimensional semantic space do we draw a line (or rather, a surface) between general, universal understanding and specific knowledge about the subject at hand?

That said, once you have defined what is required, I believe you will have solved the problem.


Eczema sufferers have been doing this for a while


I wonder if we'll ever get to a point where we don't think about storage at all. Or will we always fill up whatever storage is available?


For a lot of applications we have pretty much reached that point already. 2TB NVMe SSDs are around the $100-$150 price point these days. Unless they are actively trying, the average desktop user is never going to fill that up. There are only so many holiday pictures you can take, after all.

I think the size of audio files is a great example of why storage needs won't infinitely grow. Although we have orders of magnitude more storage space these days, audio files haven't really gotten any bigger since the CD era. If anything, they have gotten smaller as better compression algorithms were invented.

The thing is, human hearing only has so much resolution. Sure, we could be sampling audio at 64 bits with a 1MHz sample rate these days if we wanted to, but there's just no reason to. Similarly with pictures: if you can't see any pixels standing a foot away from a poster-sized print, why bother increasing the resolution any more?
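
For a rough sense of the data rates involved, here's a quick sketch. The figures are uncompressed and purely illustrative (CD audio vs. a hypothetical 64-bit/1MHz stereo stream):

    // Rough back-of-envelope: data rate of CD audio vs. a hypothetical
    // 64-bit / 1 MHz stereo stream (uncompressed, illustrative numbers only).
    const bitsPerByte = 8;
    const cdRate = 44_100 * 16 * 2;          // 44.1 kHz, 16-bit, stereo -> bits/s
    const overkillRate = 1_000_000 * 64 * 2; // 1 MHz, 64-bit, stereo -> bits/s

    const mbPerMinute = (bitsPerSecond: number) =>
      (bitsPerSecond * 60) / bitsPerByte / 1e6;

    console.log(mbPerMinute(cdRate).toFixed(1));       // ~10.6 MB per minute
    console.log(mbPerMinute(overkillRate).toFixed(0)); // ~960 MB per minute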

The big consumer-range data hogs are 1) video torrents, and 2) games. Both of them have a natural upper bound due to human perception. They might still grow by an order of magnitude or two, but it won't be much more before it just becomes pointless.

Enterprise is a bit of a different story, of course - especially now that AI is rapidly increasing the value of data.


I don't think human perception holds up as a standard for video games. For audio and video it makes sense, but we are a long way off for video.

For video games there is a huge "storage waste" factor, one that I think is more important than the human perception limit. If you look at modern games, you could probably throw away a double-digit percentage of most games' size if a capable team had the time to optimize for disk space. It's simply not done because it has little advantage. I think this waste factor will scale with the complexity of video games regardless of graphical fidelity.


> 2TB NVMe SSDs are around the $100-$150 price point these days. Unless they are actively trying, the average desktop user is never going to fill that up. There are only so many holiday pictures you can take, after all.

4k family videos would like to have a word with you.


Big brother potential aside, I could imagine a future in which everyone wears a body camera everywhere. A digital record of your entire life. Given increasingly good AI extraction, you can then have searchable transcripts of your conversations, query your life for the last time you saw movie X (interesting IP questions here), re-experience moments with grandma, whatever. There was a Black Mirror episode incorporating this concept.

4K video times a lifetime is a greater storage requirement than what's available to consumers today.
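
A rough sketch of that back-of-envelope, under assumed numbers (~20 Mbps for consumer 4K video, 16 waking hours a day, 80 years):

    // Rough sketch of lifetime-recording storage, under assumed numbers.
    const mbps = 20;           // assumed average 4K bitrate
    const hoursPerDay = 16;    // assumed waking hours recorded
    const years = 80;

    const bytesPerHour = (mbps * 1e6 / 8) * 3600;           // ~9 GB per hour
    const lifetimeHours = hoursPerDay * 365 * years;        // ~467,200 hours
    const lifetimePB = bytesPerHour * lifetimeHours / 1e15; // petabytes

    console.log(lifetimePB.toFixed(1)); // ~4.2 PB -- far beyond consumer drives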


If you find such a concept interesting, I would recommend the movie The Final Cut (2004) with Robin Williams. Not a particularly good movie, but it does have this as an interesting premise.


Resolution beyond human perception is still useful: for watermarking, data tracking, EXIF, DRM schemes, and in/exfiltration in general.


> There are only so many holiday pictures you can take, after all.

You miss the point, sadly.

Yeah, storage is getting larger and cheaper, but modern cameras/phones/etc. take photos that are way larger than they used to.

My first digital camera in like 2001 or so could only take either 20 "large" (640x480) bitmap photos or 80 "small" pictures, taking a few tens of kilobytes each at most.

My 2022 iPhone SE takes high-quality pictures that are easily in the 7-8MB range (I just checked).

So yeah, disks keep getting larger, but so does the media.
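
For a rough sense of scale (assumed file sizes), here's a quick sketch of how many of each fit on a 2TB drive:

    // Rough comparison of photo sizes vs. drive capacity (assumed figures).
    const oldPhotoKB = 30;   // early-2000s camera photo, a few tens of KB
    const newPhotoMB = 8;    // modern phone photo, ~7-8 MB
    const driveTB = 2;

    const driveBytes = driveTB * 1e12;
    console.log(Math.floor(driveBytes / (oldPhotoKB * 1e3))); // ~66 million old photos
    console.log(Math.floor(driveBytes / (newPhotoMB * 1e6))); // ~250,000 new photos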


And image sizes aren't really bound by human perception in any real way. Yes, screens typically range from 2-8MP, and probably won't go much beyond 32MP. But more resolution, more dynamic range, and more color fidelity are incredibly useful for editing pictures; and extra resolution always gives you digital zoom.

Photos are, however, constrained by physics. There are only so many photons captured in a certain area in a given time, so whether we're talking about the tiny lenses and sensors of smartphones or the much larger versions on dedicated cameras, there's a very real limit (and phones are pretty close to the limits imposed by their sensor size).


I spend so much storage on just storing things that are readily available on the internet, either to speed up local access or because one day they might not be as readily available. I don't think we will "solve" storage for good unless we also "solve" bandwidth.


I think we will always fill up the storage available as we crank up the fidelity of our recordings (simulations). Video is an approximation of reality, and we can always ramp up the resolution, ramp up the dimensions, always striving to simulate the universe at the same fidelity as the universe itself, to the point that the map becomes the territory itself. In other words, until it's no longer approximating, no longer compressing. There will never be enough storage to store our simulations of our reality as long as we're forced to use that same reality to construct the storage.


I'm already at that point, effectively. I spent a few decades worrying about fitting data into hard drives... most recently with my digital photos. I've stopped worrying.

My first computer stored data on audio cassette tapes. When I started college, the new hot machine in the server room was a DEC VAX 11/780 running VMS.

For nostalgia purposes, I have a virtual VAX 11/780 running VMS on my phone; it only takes about 3 GB. I don't do much else with the phone, so I have plenty of room left over.

I can imagine ways to fill up a petabyte personally... but not much past that.


Only if we reach a peak fidelity level. Since storage will never be infinite, it will always be an arms race unless we reach a level of media quality that is good enough for everything.


I think even if we got storage that could store enough information to be indistinguishable from reality itself, we would still want to save variations, clips, duplicates, intermediates... I don't know that there is a peak fidelity.


We are already doing that, though.

The thing is, there are only 24 hours in a day. There is a hard upper limit to the amount of content you can consume. You're not going to be downloading 1000 hours' worth of content every single day, if only because it is physically impossible to use it all.


Tell that to /r/datahoarders. Always amazed by what people archive in that subreddit.


My 500 GB HDD has been way more than adequate for the past 10+ years, and that includes having a Windows 8 partition that I haven't booted into in years.


having a hard time calculating what the pricing is for this


Oddly, I don't see anything about pricing for Workers AI on the Workers pricing page[0] but their Workers AI blog post from Sept 2023[1] says the pricing is per 1k "neurons":

> Users will be able to choose from two ways to run Workers AI:

> Regular Twitch Neurons (RTN) - running wherever there's capacity at $0.01 / 1k neurons

> Fast Twitch Neurons (FTN) - running at nearest user location at $0.125 / 1k neurons

> Neurons are a way to measure AI output that always scales down to zero (if you get no usage, you will be charged for 0 neurons).

Here's the key detail:

> To give you a sense of what you can accomplish with a thousand neurons, you can: generate 130 LLM responses, 830 image classifications, or 1,250 embeddings.

[0] - https://developers.cloudflare.com/workers/platform/pricing

[1] - https://blog.cloudflare.com/workers-ai/
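
To make the "neuron" pricing concrete, here's a rough per-task cost calculation using only the figures quoted above (RTN rate and the blog's own conversion of 1k neurons):

    // Rough per-task cost from the quoted figures: $0.01 per 1k neurons (RTN),
    // where 1k neurons ~= 130 LLM responses, 830 image classifications,
    // or 1,250 embeddings. FTN would be $0.125 per 1k neurons instead.
    const usdPer1kNeurons = 0.01;
    const perThousandNeurons = {
      llmResponses: 130,
      imageClassifications: 830,
      embeddings: 1250,
    };

    for (const [task, count] of Object.entries(perThousandNeurons)) {
      console.log(task, (usdPer1kNeurons / count).toFixed(6)); // USD per task
    }
    // llmResponses         ~$0.000077
    // imageClassifications ~$0.000012
    // embeddings           ~$0.000008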


How many dollar bills does it take to make a pile worth sleeping in?


Like with most serverless functions


having a hard time calculating why anybody needs this/wants this


Productionizing AI models is a pain; this makes it easy. Say you were building a D&D app and wanted to generate character art: this would make it very easy to get started. AWS has similar offerings (e.g. SageMaker), but they're not on the edge.
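
A minimal sketch of what that character-art Worker might look like, assuming an AI binding named AI configured in wrangler.toml and a Stable Diffusion text-to-image model ID; the exact model name and response type here are assumptions, not confirmed specifics:

    // Minimal sketch: a Worker that turns a text prompt into character art.
    // The `AI` binding shape and the model ID below are assumptions; check
    // the Workers AI docs for the actual catalog and types.
    export interface Env {
      AI: { run(model: string, inputs: Record<string, unknown>): Promise<unknown> };
    }

    export default {
      async fetch(request: Request, env: Env): Promise<Response> {
        const prompt =
          new URL(request.url).searchParams.get("prompt") ??
          "portrait of a dwarven paladin, fantasy character art";

        // Hypothetical model ID -- substitute whichever text-to-image
        // model the Workers AI catalog actually lists.
        const image = await env.AI.run(
          "@cf/stabilityai/stable-diffusion-xl-base-1.0",
          { prompt },
        );

        // Assumes the model returns raw image bytes.
        return new Response(image as BodyInit, {
          headers: { "content-type": "image/png" },
        });
      },
    };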


It seems cheaper than the OpenAI API and is very easy to use from a worker.


I see it more as a convenience feature for people already using CF Workers


Didn't know tarsnap was so successful


Sebastian's videos are great fun. You should also check out his series about a geographical game he built https://www.youtube.com/watch?v=sLqXFF8mlEU&list=PLFt_AvWsXl...


> This is the most important tip: avoid side projects with user accounts.

> If you build something that people can sign into, that’s not a side-project, it’s an unpaid job. It’s a very big responsibility, avoid at all costs!

Interesting for sure. But what about those of us that hoard side products?


Interesting to see stable diffusion find more and more practical use cases.


isn’t this just stable diffusion emitting vector images? it honestly seems like a way to harvest emails more than anything.


Stable Diffusion just generates raster images, so they are doing something on top to convert to SVG. Hard to tell how good the SVG-ification of the resulting image is.


Using Fargate for on-demand provisioning remains slow for us (> 60 sec). With no Docker image caching support (1) and when working within a VPC (2), it easily takes 2+ minutes.

(1) https://github.com/aws/containers-roadmap/issues/696

(2) https://stackoverflow.com/a/67398011



