Enigma Public: Broad collection of public data (enigma.com)
116 points by merinid on June 21, 2017 | 15 comments



Sorry to be that guy, but it would be great if that left-hand nav would auto-collapse into a hamburger button on mobile. It's taking up sooooo much space, and it shrinks the content to the point where it's unreadable.


the layout is pretty bad on desktop too... (with the data that I tried to look at, probably 75% of the page is blank: http://imgur.com/a/ecIKj)


You bring up a good point, IMO: the "Open in Data Viewer" button is far too obfuscated. If I hadn't used Enigma before, and hadn't already known what was contained in a previously-visited dataset, I too would have assumed that I hit a dead end because the button was the last thing I noticed on the page.

The horizontally-flowing layout is problematic overall, but placing that "Explore Data" button much higher in the right-most column would be a decent compromise: http://imgur.com/a/lS7ks


This is really really good.

I follow the many emerging "collect lots of public data and make available" services, and I think this is one of the better ones I've seen. The data looks quite wide ranging too.


I always feel like gathering data and formatting correct data sets is the most important part of data science. Back in classes where you're taught machine learning, you're handed perfect data sets so you can learn the algorithms. And then when you're analyzing data, you'll use a library that runs these algorithms for you. Which means you'll spend the longest time making sure that your data sets are correct and easily usable.

I love working on these types of projects. Things like scraping sports stats (which may or may not be considered legal, but it's not like I'm trying to make money from them), data from posted articles, music data, data about locations of public restaurants, etc.

Finding some place that offers correct public data is fantastic to see.


Why is the 'United States' not below 'Governments' like all the other countries?


Optimizing for their audience maybe? Similar to sites putting "United States" at the top of a countries dropdown even though everything else is in alphabetical order? Just a guess.


I've interacted with Enigma folks throughout the past few years and have always been impressed with their work and methodology. I've had friends who've worked at other massive-public-data-gathering startups, and it sounds like a tough business: collecting/cleaning data is hard, but having data alone isn't a competitive edge. Don't know if Enigma will find success with public data (though their offerings apparently go beyond data into enterprise platforms), but I've been impressed at the scope of their collection and ability to wrangle data into a standardized structure.

Here's one example: Senate lobbying disclosures. Enigma has taken the original XML data sources and created several flat tables (lobbyists, issues, reports) that can be linked through foreign/primary keys: https://public.enigma.com/browse/lobbyists/09264ee1-792f-445...
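To make the foreign/primary key idea concrete, here's a rough Python sketch of flattening nested XML into linked flat tables. The XML shape, element names, and column names are all made up for illustration; they are not the Senate's actual schema or Enigma's actual tables.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML shape, invented for this sketch; the real Senate files differ.
SAMPLE = """
<Filings>
  <Filing ID="F1" Year="2017">
    <Lobbyists>
      <Lobbyist Name="DOE, JANE"/>
      <Lobbyist Name="ROE, RICHARD"/>
    </Lobbyists>
    <Issues>
      <Issue Code="TAX"/>
    </Issues>
  </Filing>
  <Filing ID="F2" Year="2017">
    <Lobbyists>
      <Lobbyist Name="DOE, JANE"/>
    </Lobbyists>
    <Issues>
      <Issue Code="ENERGY"/>
      <Issue Code="TAX"/>
    </Issues>
  </Filing>
</Filings>
"""


def flatten(xml_text):
    """Split nested filings into three flat tables (lists of dicts).

    Each child row carries the parent's report_id, which acts as a
    foreign key back to the reports table.
    """
    root = ET.fromstring(xml_text)
    reports, lobbyists, issues = [], [], []
    for filing in root.iter("Filing"):
        report_id = filing.get("ID")  # primary key of the reports table
        reports.append({"report_id": report_id, "year": int(filing.get("Year"))})
        for lob in filing.iter("Lobbyist"):
            lobbyists.append({"report_id": report_id, "name": lob.get("Name")})
        for issue in filing.iter("Issue"):
            issues.append({"report_id": report_id, "code": issue.get("Code")})
    return reports, lobbyists, issues


reports, lobbyists, issues = flatten(SAMPLE)

# "Join" the tables by key: which lobbyists appear on reports tagged TAX?
tax_reports = {row["report_id"] for row in issues if row["code"] == "TAX"}
tax_lobbyists = sorted({row["name"] for row in lobbyists if row["report_id"] in tax_reports})
```

The point is just that once every child row carries its parent's ID, the nested document turns into plain tables you can join, export to CSV, or load into SQL.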

Here's what the raw material looks like:

https://www.senate.gov/legislative/Public_Disclosure/LDA_rep...

Excerpt: https://gist.github.com/dannguyen/7588b8334f5c8954d2c2b13bc4...

I've written my own scripts to clean up and organize this shitshow, but it's nice to have Enigma to double-check against, or even to get ideas on how to structure things. What's just as impressive to me is the work put into the taxonomy of datasets, e.g. United States > U.S. Senate > Lobbying Reports.

For less data-savvy users, just having a Google-like simple search bar is great for discovery of datasets that contain a term of interest: https://public.enigma.com/search/google

Note: Enigma has offered this public data for free before; you just had to sign up for an account to even browse the data. This public interface is much nicer, especially for sending people links. Haven't tested out the export functions or the quotas, but in the previous incarnation, free accounts got a huge number of downloads a month.


Wow that's a lot of data! Is this free? I couldn't find pricing information. How is it funded?


It looks free (Creative Commons)—they have enterprise clients to keep it all afloat.

"We release all of our data under a Creative Commons License for anyone in the open-source or civic community to freely build upon and extend. We regularly collaborate with hundreds of journalists, not-for-profits, governments, and many other committed and curious people to help put data to good use. If you’d like to help out or suggest a project, please get in touch.

For commercial applications of our data services, we have solutions for enterprises of every size. Please contact us, and a member of our sales team will help you find the best solution."

https://www.enigma.com/public-data


> under a Creative Commons License

Looks like specifically CC-BY-NC 4.0


It looks like Enigma dog-foods their own data infrastructure products (Concourse / Assembly?). The public data thing inspired them to build better tools for themselves and sell them. Big lesson in that: sometimes what you learn along the way will be the most valuable.


Is there a REST interface or do we only get a (bad) browser GUI?



What has happened to the initial idea of developing a platform for distributed homomorphic encryption?

EDIT: Seems like I am confusing it with http://www.enigma.co/



