Great to see this on HN! I'm one of the openFDA core team members and would love to help people who are interested in using the public drug adverse event API. It's good to note that we've also released all of the source code behind the platform (https://github.com/fda) and are actively interested in having members of the community help us make improvements.
Please do ping me if you have any questions about the API or want to learn more! sean.herron@fda.hhs.gov
I'm sure a lot of people, myself included, feel that government projects would be better and cheaper if they were developed as open source rather than as the typical proprietary solutions developed by contractors that we see today.
What's your take on government and open source projects like this one?
Completely agree. We built openFDA from the beginning with the mindset that everything we produce will be open source. Our hope is that users of openFDA can help us make the API more efficient, return better data (we do a lot of cleanup), and independently verify our methodology.
Beyond improving our own site, it would be absolutely fantastic if someone took openFDA and spun up their own copy. That could be another government agency using it to serve up different data, an external group mirroring openFDA in case of government shutdown or other issue, or a company that uses our code to build something innovative.
I know that sentiment is shared among a lot of agencies right now. In particular, 18F (https://18f.gsa.gov) is a new digital services delivery unit that is looking to do this at a huge scale across the federal government.
The current state of federal IT contracting is so horrendous that it is worth trying something even if it has a 50% chance of failure. But how much harder is it to get contractors to work on a project if your contract mandates that it be open source or Free Software?
seanherron, do you work for a contractor or are you an in-house developer for the FDA or another federal agency?
I'm a federal employee serving as a Presidential Innovation Fellow (http://whitehouse.gov/innovationfellows) working on open data initiatives at the FDA. We worked with a contractor to build the platform and were happy to find they were incredibly excited about the prospect of open source.
We're Iodine (http://www.iodine.com), a consumer health startup based in SF. Our backgrounds are varied but many of us came from Google, including three of us from the search team. We're big believers in both open source and open data and we're excited to be part of openFDA!
The post mentions that "The FDA will continually work to identify additional public datasets to make available through openFDA" - do you guys already have an idea on what datasets are coming next?
It's a great initiative, especially for people working in drug repositioning. Thank you for working towards such a nice API that suits bioinformaticians like me. I have a question: why do you limit API calls to 60,000 per day with a key? What is stopping you from setting a higher limit?
We set it to 120req/minute and 60,000req/day to ensure that load on the system isn't too high at launch. Over the next few weeks, we'll be adjusting the limits based on the traffic patterns we see.
As noted in the documentation, if you need more than 60,000 per day, give us a ring at open@fda.hhs.gov.
Huge shout out to api.data.gov as well - all of our key authentication and analytics are powered by their open source API Umbrella platform.
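For anyone scripting against the API, here is a rough client-side sketch of staying under the two limits quoted above (120 requests per minute, 60,000 per day). It is only an illustration: the `SlidingWindowThrottle` class and the `polite_call` helper are hypothetical names, not part of openFDA or API Umbrella, and the server remains the authority on what actually counts against your key.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Allow at most `max_calls` calls in any window of `period` seconds."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._stamps = deque()   # timestamps of calls still inside the window

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self._clock()
        # Forget calls that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.period - (now - self._stamps[0])

    def acquire(self):
        """Block until a call is allowed, then record it."""
        while True:
            delay = self.wait_time()
            if delay <= 0:
                break
            self._sleep(delay)
        self._stamps.append(self._clock())


# Limits as quoted in the thread at launch; they may have changed since.
per_minute = SlidingWindowThrottle(max_calls=120, period=60)
per_day = SlidingWindowThrottle(max_calls=60_000, period=86_400)


def polite_call(fn, *args, **kwargs):
    """Hypothetical wrapper: wait for both windows, then run the request function."""
    per_day.acquire()
    per_minute.acquire()
    return fn(*args, **kwargs)
```

You would wrap whatever function performs the HTTP request in `polite_call`. A sliding window is stricter than a fixed per-minute counter, which seems like the safer choice when you don't know exactly how the server buckets requests.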
I've found myself more and more depending on raw data dumps rather than APIs, so was extremely happy to see mention of this on the openFDA page. Are you currently offering this or is that still to come?
I've been doing luigi pipeline work recently, I might see if I can get yours running and get some pull requests in :)
Hey guys, I'm an owner of DrugCite.com. We've talked to the FDA a few times over the last few years while building our site (they contacted us at various points with questions). They let us know last August they would be releasing this site, but I guess I never thought it would be so close to ours. Examples:
Our Drug Page:
http://www.drugcite.com/?q=ABILIFY
FDA Drug Page
http://open.fda.gov/drug/event/
Looks like they're including some data we didn't previously have access to or know about. Anyways, make no mistake, this is huge and will be incredibly useful for doctors and patients everywhere. This is some great data. This is the type of data that should be investigated before you take any medication, prescription or not. I can see some of this data becoming common label information shortly.
It's really encouraging to see that there's someone in the U.S. gov't who not only cares about open source and the associated effects of transparency, but has some practical experience* in it.
The openFDA website is built on Jekyll (https://github.com/FDA/open.fda.gov) and its API is powered with Python and Node.js (https://github.com/FDA/openfda)...It's not just the framework/current-tooling that is nice, but that such systems use open, readable formats (such as Markdown for the web pages).
The current administration has always paid lip-service toward open-source...they won't satisfy people who think "open source" and "government" means hand over just about everything...but they're doing a good job making inroads on the parts of the U.S. data interfaces that were well-intended, but so obfuscated by poor design that it was a job in itself to parse/scrape their sites.
(FDA has always had really exhaustive dumps of their data...strewn about their legacy site...the API isn't as interesting to me as the documentation for the API and the pipeline of data)
* I don't want to just slag on Drupal...but Drupal was what Obama's head tech officer wanted in place, and to their credit, they did open-source parts of their custom Drupal modules...which were not particularly useful, because of the particulars of Drupal's module system and its quickly changing API...never mind being only useful for other Drupal installations. But a lot of credit has to go to the U.S. gov't for pivoting off of Drupal to a mix of WordPress, Jekyll, and even Node.js sites with less coupled components. It's been only about two or so years since Data.gov open-sourced its Drupal components before promptly switching to WordPress and CKAN modules...considering how a not-insignificant number of the fed sites are built on 12+ year-old code...the turnaround in the U.S. gov't's stack is pretty amazing...(when it's not attempted on a service-critical site, such as healthcare.gov)
It's a very big semantic database of health terminology. Among other things, it has a subset called RxNorm that contains all currently prescribe-able prescription drugs.
I've been very impressed with it and I feel like not enough people have heard of it.
I have some history working with the Adverse Event Reporting data. The API is nice, but it just exposes what they already offer in flat files .... stale data. Does it seem reasonable to you that the FDA runs a year behind on this data?
Awesome initiative! As I was testing it, I noticed that the response consists of prettified JSON. I'm guessing that all that whitespace can be removed to save bandwidth?
We support gzip on the API's JSON responses for clients that support it. Given that, I'd expect the size improvement from whitespace stripping to be minimal, but let us know if you have evidence to the contrary!
Good point! I always blindly assumed that removing whitespace would lead to a decent size improvement, even after gzipping, so I ran a small test (gzip on Linux, default parameters):
The raw JSON contains the whitespace, while it was removed in the minified JSON. So there is a 47% improvement for the uncompressed version, and a 21% improvement for the compressed version.
What would be interesting to see is how the second (compressed) number scales with the filesize (I don't know enough about compression algorithms to guess that).
EDIT: I really don't know how to format a table in plaintext...
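If anyone wants to repeat this kind of comparison, here is a minimal sketch in Python. The records produced by `sample_records` are made up for illustration and do not follow the real openFDA response schema, so the percentages you get will differ from the ones quoted above. Compression level 6 is used because that is the default of the gzip command-line tool.

```python
import gzip
import json


def sample_records(n):
    """Build n fake records. Field names are hypothetical, not the openFDA schema."""
    return [
        {
            "id": i,
            "drug": {"name": f"DRUG-{i % 17}", "dose_mg": (i * 5) % 250},
            "reactions": [f"reaction-{(i * k) % 23}" for k in range(1, 4)],
        }
        for i in range(n)
    ]


def size_report(obj, compresslevel=6):
    """Return byte sizes of pretty and minified JSON, before and after gzip."""
    pretty = json.dumps(obj, indent=2).encode("utf-8")
    minified = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return {
        "raw": len(pretty),
        "minified": len(minified),
        "raw_gzip": len(gzip.compress(pretty, compresslevel=compresslevel)),
        "minified_gzip": len(gzip.compress(minified, compresslevel=compresslevel)),
    }


def saving_percent(before, after):
    """Percentage reduction going from `before` bytes to `after` bytes."""
    return 100.0 * (before - after) / before


def print_report(n):
    """Print both savings for a payload of n fake records."""
    sizes = size_report(sample_records(n))
    plain = saving_percent(sizes["raw"], sizes["minified"])
    zipped = saving_percent(sizes["raw_gzip"], sizes["minified_gzip"])
    print(f"{n:>6} records: uncompressed saving {plain:.1f}%, gzipped saving {zipped:.1f}%")
```

Calling `print_report` for a few values of `n` (say 10, 100, 1000, 10000) gives a rough feel for how the gzipped saving changes with payload size, which is the open question above. Real API responses will behave differently from this synthetic data.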
I hope the NSA will do the same and also open their datasets... maybe someone could catch someone who plans some "terroristic" actions before they happen...
Also, here's a direct link to the API documentation: https://open.fda.gov/drug/event