
They claimed 2B rows for their voter database. If I understood that correctly, that means they know about 2 billion people.

Do they cover countries other than America? Because the population of America is only 300 million or so; even accounting for false positives (failing to sync two accounts as one person) or fake identities (pet dog has an account), I don't really see how you can get from 300m to 2b.




I'm a developer at Votizen.

Just our voter history (whether a voter actually voted in a particular election) is, on average, 10 rows per voter record. 200 million voter records * 10 history records per voter = ~2 billion rows.
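As a rough check of that arithmetic (a sketch only: the figures are the approximate ones quoted in this comment, and the variable names are purely illustrative):

```python
# Back-of-the-envelope check using the approximate figures quoted above.
voter_records = 200_000_000       # ~200 million voter records
history_rows_per_voter = 10       # approximate average history rows per voter record

total_history_rows = voter_records * history_rows_per_voter
print(f"{total_history_rows:,}")  # 2,000,000,000
```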


Ah, okay. That makes more sense. Thanks.


It's ~200M voters (United States only).


I'm guessing that their database isn't voter registrations but individual votes--one person can have multiple votes and registrations recorded over the years.

I still don't see how that could add up to a factor of 10, though. I'm in a similar space and typically people will have fewer than that, although the voter records we have don't go back more than a decade or so.


A ~10 year record could mean ~30 chances to vote, but hardly anyone votes in a majority of those elections. I, too, am curious how they have an average of 10 votes per voter.


I just looked at a rough count - it's 1.1B voter history records. I don't have an offline copy to do heavier analytics on it - just from analyze stats on indexes for now.
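For what it's worth, here is what that rough count implies per voter. This is a sketch only: it assumes the ~200M voter figure mentioned elsewhere in this thread, and both numbers are approximate.

```python
# Implied average history rows per voter, from the rough counts in this thread.
history_rows = 1_100_000_000   # rough count of voter history records
voter_records = 200_000_000    # ~200M voters (assumed from elsewhere in the thread)

avg_rows_per_voter = history_rows / voter_records
print(avg_rows_per_voter)      # 5.5
```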

Also, after signing up, people can correct/revise their voting records:

http://imgur.com/a/NOj4I

This is because 1) the states themselves don't keep complete records and 2) we don't have all voting history forever, as scarmig pointed out.

scarmig, you said you're "in the space"; I googled and this was the first result: http://www.strike-the-root.com/3/scarmig/scarmig1.html What do you work on?

Side note: is that related to Lessig's Root Strikers? It seems to be a different libertarian site with an unfortunate brand collision. :-/


That's not me, though I'm quite sympathetic to the libertarian critique, particularly the public choice side of things. Otherwise garden-variety left =)

As far as where in the "space" I am, it's a fairly well-established player, though I've been thinking of moving on recently. Just out of curiosity, how much of a requirement is Python over there? Was thinking of dropping off a resume...


We're fairly invested in Python at this point, but I think smart devs are professional learners. We have a great dev on staff, Emily, who came from Rails and has been immediately productive.

We're definitely not "full stack Python" - best tool for the job, etc.


I've been legal to vote for 17 years, and I've voted in every election since I became legal. That's about twice a year for 17 years (odd years sometimes only have one), so I probably have around 30ish voting records.
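A quick sanity check on that estimate, as a sketch under stated assumptions: the upper bound takes two elections in every one of the 17 years, and the lower bound assumes roughly half of those years were odd years with only one election. Neither assumption comes from actual records.

```python
# Rough bounds on the number of elections over 17 years of eligibility.
years = 17
max_elections = 2 * years                    # two elections every year
odd_years = years // 2                       # assumption: about half the years are odd
min_elections = max_elections - odd_years    # assumption: each odd year had only one

print(min_elections, max_elections)          # 26 34
```

"Around 30ish" falls between those two bounds.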


A 1.7 ratio is totally believable. I've worked with voter databases that suggested higher average ratios, but I'm not sure if that's because the voters have all those options to vote or because it's an artifact of the data model.

Example: http://www.smartvoter.org/2012/01/17/ca/la/

This shows 14 election days happening in 2011 for LA County. But many of those are limited to a particular city or congressional district. I think when the data gets recorded at the state level, some expediencies cause the "available election" count to be distorted.


I haven't used Votizen, but other voting-record databases record a "Yes" or "No" for whether each person voted (and sometimes other information, depending on the state).



