
Yup, someone's suggested that. I might do it going forward, if it's allowed, but for the first day I tried to prioritize getting it out there.

Also, I have another project for which I store a huge amount of data, and as it turns out, data costs a lot of money to store! Depending on the size of the list, I might be better off paying for the Domainr API.




According to a quick Google search, there are about 367 million domain names registered. The maximum length of a domain name is 253 bytes. So:

  367000000 * 253 bytes = 92.8GB
That's assuming you only need to know that each domain exists (just the name, no other data).

In reality it would be smaller than that, as the average length is much shorter. I'd expect 10-20 GB.
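
For example, assuming an average of ~30 bytes per stored name (a guess, not a measured figure), a plain newline-separated list would be on the order of:

  367000000 * 30 bytes ≈ 11 GB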

You could get clever and use a Bloom filter to save space, falling back to the API only when the filter reports a possible hit (see the sketch below).
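
A minimal sketch of that combination in Python, with the filter sized for roughly a 1% false-positive rate over ~367M names; check_via_api is a hypothetical stub for whatever availability API ends up being used (e.g. Domainr), not a real endpoint:

  import hashlib

  class BloomFilter:
      """Minimal Bloom filter: k hash positions per item over a fixed bit array."""

      def __init__(self, num_bits, num_hashes):
          self.num_bits = num_bits
          self.num_hashes = num_hashes
          self.bits = bytearray((num_bits + 7) // 8)

      def _positions(self, item):
          # Derive k independent positions by salting SHA-256 with the hash index.
          for i in range(self.num_hashes):
              digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
              yield int.from_bytes(digest[:8], "big") % self.num_bits

      def add(self, item):
          for pos in self._positions(item):
              self.bits[pos // 8] |= 1 << (pos % 8)

      def __contains__(self, item):
          return all(self.bits[pos // 8] & (1 << (pos % 8))
                     for pos in self._positions(item))

  # ~3.5e9 bits (~440 MB) with 7 hashes gives roughly a 1% false-positive
  # rate for ~367M entries. Populate it from the zone data once, offline.
  registered = BloomFilter(num_bits=3_500_000_000, num_hashes=7)

  def check_via_api(domain):
      """Placeholder for the real availability check (e.g. the Domainr API)."""
      raise NotImplementedError

  def is_probably_available(domain):
      if domain in registered:
          # Could be a false positive, so confirm with the API.
          return check_via_api(domain)
      # Bloom filters never give false negatives: a miss means the name
      # was definitely not in the zone data that was loaded.
      return True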


Just checked: the latest zone file for .com is 4.9 GB compressed; .net is only 456 MB compressed (2.2 GB decompressed). Not sure how large .com is decompressed, as CZDS is giving me download issues for .com today, but a rough extrapolation from .net would land around 24 GB.
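
For reference, that extrapolation is just .net's compression ratio applied to .com:

  2.2 GB / 456 MB ≈ 4.8x
  4.9 GB * 4.8 ≈ 24 GB decompressed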


I will try it in the next few days and report back. Most likely a combination approach would be best: check against the known data, then double-check with the API on a case-by-case basis. I might require login to get extra data, so it would be just a subset of users doing a deep dive. Thank you!


I’ve downloaded a copy in the past. It’s a large zone file full of NS records (obviously?) and compressible to a high degree.

https://czds.icann.org/home
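
For what it's worth, a rough sketch of pulling just the names out of a gzipped CZDS dump in Python; the com.txt.gz file name is hypothetical, and it assumes the usual whitespace-separated zone-file layout:

  import gzip

  names = set()
  # Zone lines look roughly like: "example.com. 86400 in ns ns1.example.net."
  with gzip.open("com.txt.gz", "rt") as f:  # hypothetical file name
      for line in f:
          parts = line.split()
          if len(parts) >= 5 and parts[3].lower() == "ns":
              names.add(parts[0].rstrip(".").lower())

  print(len(names), "unique registered .com names")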


Also, this is only .com domains being suggested, right? So you wouldn't need the entire dataset, just .com domains.



