Show HN: Birdseed – A silly way to get random numbers by hashing tweets (github.com/ryanmcdermott)
68 points by ryansworks on Nov 16, 2015 | 22 comments



I know most people talk about the technical side of things, but I've got to hand it to you: that's an amazing name.


Thank you! It was better than my second choice, which was "Bird of Pair-a-Dice".


I always love seeing clever ways to be random. I hadn't considered this one, but it's a pretty good idea, and I'm sure there are even better ways to run with the basic concept (maybe analyzing color values of recent Pinterest or Instagram photos).

If you can find a private source of community-driven randomness, that'd be even better.


I wonder whether pictures would be a good source. I could see strong leanings towards blue (sky, water) and green (vegetation), and maybe also grey (concrete, asphalt). It would be interesting to do a collective histogram over a wide array of images and see if colors tend to average out or if there are clear peaks.


There's still plenty of entropy there for most use cases. Even two people trying to take identical pictures should end up with enormous differences when the images are analyzed at the pixel level.



That's a very intriguing idea. I hadn't considered the possibilities of looking at public/private photos and sampling color values across them to get some randomness. Private is certainly the key. With something like Birdseed, the randomness is totally public, just as it is when getting randomness from atmospheric data. If someone figures out which wisps of clouds you are sampling or what search term you are using on Twitter, then the jig is up!


Other sources of randomness are https://en.wikipedia.org/wiki/Special:Random or https://news.google.com. The URLs could be parameterized to request a random language, too.


Well, for the Wikipedia link you'd just be piggybacking off their random algorithm rather than consuming organic entropy. Regardless, they have some information on how the page is chosen here: https://en.wikipedia.org/wiki/Wikipedia:FAQ/Technical#random


Maybe that'll be the next in the "Twitch Plays" series!


This is great! I did an experiment a long time ago, before Twitter closed their JSONP API, to make Brownian motion visualizations with a similar concept.

http://binarymax.com/brownian_2.gif

I'll have to find some time to recode it to use this service :)


I wrote a paper about leveraging public random streams like this one. It can be downloaded here: https://drive.google.com/file/d/0B9IkyvYlZZe7TldTRGlSMnpQX0U...


Stock market data does seem like a better source of entropy than Twitter queries.


I would tend to agree with you, though my tune might be different if I were an investment advisor!


I love it! Quick idea -- though I'm not sure if this would further the goal of having fun, even as a PR -- there's a documented way to make Birdseed subclass random.Random (https://hg.python.org/cpython/file/2.7/Lib/random.py#l72) and inherit the familiar interface and fancy methods.
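For what it's worth, a rough Python 3 sketch of that subclassing idea might look like the following. The `HashRandom` name, the counter scheme, and the hard-coded seed text are all made up for illustration and are not Birdseed's actual code. The only part taken from the `random` module docs is that a subclass can supply its own `random()`, `seed()`, `getstate()`, `setstate()`, and optionally `getrandbits()`.

```python
import hashlib
import random


class HashRandom(random.Random):
    """Hypothetical generator that draws its bits from SHA-224 digests.

    Illustration only: the output is fully determined by `material`,
    so it is exactly as predictable as that input.
    """

    def __init__(self, material: bytes):
        self._material = material
        self._counter = 0
        super().__init__()  # calls self.seed(), which is a no-op below

    def seed(self, a=None, version=2):
        """No-op: the state comes from `material`, not from a seed value."""
        return None

    def _next_block(self) -> bytes:
        # Hash the material together with a running counter so that
        # each call yields a fresh 28-byte digest.
        counter_bytes = self._counter.to_bytes(8, "big")
        self._counter += 1
        return hashlib.sha224(self._material + counter_bytes).digest()

    def getrandbits(self, k: int) -> int:
        if k < 0:
            raise ValueError("number of bits must be non-negative")
        nbytes = (k + 7) // 8
        buf = b""
        while len(buf) < nbytes:
            buf += self._next_block()
        # Keep only the top k bits of the bytes we collected.
        return int.from_bytes(buf[:nbytes], "big") >> (nbytes * 8 - k)

    def random(self) -> float:
        # 53 bits is the precision of a Python float's mantissa.
        return self.getrandbits(53) / (1 << 53)

    def getstate(self):
        return (self._material, self._counter)

    def setstate(self, state):
        self._material, self._counter = state


rng = HashRandom(b"stand-in for some fetched tweet text")
print(rng.randint(1, 6), rng.choice(["heads", "tails"]))
```

With those few methods overridden, the inherited helpers such as `randint`, `choice`, and `shuffle` should work without further changes.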


To reiterate: "This is for fun. It's not secure. Don't use it in production :)"


I know this has a disclaimer saying it isn't to be used in production, but just know that hash functions are not designed to provide randomness. There is no guarantee that a hash's output will be statistically indistinguishable from random noise.


There is no guarantee that any PRNG will be indistinguishable from random noise, but a properly designed hash function will be as close as any.


Sorry - guarantee was probably too strong. I meant it in the mathematical sense (no more than a negligible chance of distinguishing from random noise) [1].

[1] https://en.wikipedia.org/wiki/Cryptographically_secure_pseud...


There are a bunch of CSPRNGs built out of hash functions. Are they all doing it wrong?


Of course, I'm sure that it is possible to construct one out of a hash function. I am also pretty confident that they do more than just serve the raw bytes of the hash.


I think you're technically correct that there's no guarantee, in that it's not part of the definition of a hash. A magical function which somehow returned an incrementing counter value for each unique chunk of data you fed to it, globally, would fit the definition of a cryptographic hash.

Real-world cryptographic hash functions, however, just try to approximate a random oracle. They attempt to achieve pre-image resistance and collision resistance by making their output look random. Certainly that's the case with SHA-224, which is what this code uses.

Some real-world CSPRNGs do just use hash functions directly. Linux's /dev/random implementation, for example, just returns a SHA-1 hash of its entropy pool contents. Yarrow (used in Mac OS X, iOS, and FreeBSD) does a final pass on its output using a block cipher, but requires that the hash function used in its earlier stages produce random-looking output. Fortuna is similar.

Of course, this code is insecure and should not be used in production, regardless of the internal details, simply because all of the inputs are known to a third party, i.e. Twitter.
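To make that last point concrete, here is a small Python sketch. It assumes the generator simply hashes public text with SHA-224; the `digest_to_int` helper and the sample strings are hypothetical and not taken from the Birdseed source. Anyone who can see the same input can recompute the same number.

```python
import hashlib


def digest_to_int(public_text: str) -> int:
    """Hash text with SHA-224 and read the 28-byte digest as an integer."""
    digest = hashlib.sha224(public_text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


# Hypothetical stand-in for tweets that both parties can fetch.
tweets = "hypothetical tweet one\nhypothetical tweet two"

service_value = digest_to_int(tweets)   # what the "random" service derives
observer_value = digest_to_int(tweets)  # what an observer derives independently

# Same public input, same output: nothing here is secret.
print(service_value == observer_value)
```

The hash makes the output look uniform, but it adds no secrecy: the unpredictability has to come from the input.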



