
It still boggles my mind what mental and legal gymnastics it takes to get 3 years of funding for what looks like a RAG that someone put together last night over a CBP data dump you can get for free from their site.

Not saying it's not useful, just fascinated by what excuses have gone into endless meetings for 3 years to do what's basically an afternoon's worth of work.

Yes, I mean that literally. You spin up an AWS RAG app or a Hugging Face container, you download the CSVs from the CBP public portal, and you add a sh*tty background color and a border glow that doesn't go all the way around a button. 10-15 more years of this and maybe we'll get the next Dropbox again, who knows.




Honestly, if you or anyone else has a really easy way to get all communications/publications from every government/government-adjacent body, I genuinely would love to hear it. Especially at the regional/local level, because those can be a real pain, and we'd love to save ourselves some time.


At least for the US, the Federal Register would be a good place to start if you're not scraping it already: https://www.federalregister.gov/developers/documentation/api...

Federal news radio: https://federalnewsnetwork.com/

Government procurement announcements and opportunities: https://sam.gov/reports/awards/standard
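For the Federal Register source listed above, here is a minimal sketch of pulling recent documents for a search term. The endpoint path, the `conditions[term]`, `per_page`, and `order` parameters, and the `results`, `title`, `publication_date`, and `html_url` fields are assumptions based on my reading of the public API, so check them against the documentation linked above before relying on them. The function names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base endpoint; verify against the Federal Register API documentation.
BASE_URL = "https://www.federalregister.gov/api/v1/documents.json"


def build_search_url(term, per_page=20):
    """Build a document-search URL for a free-text term (parameter names assumed)."""
    params = {
        "conditions[term]": term,
        "per_page": per_page,
        "order": "newest",
    }
    return f"{BASE_URL}?{urlencode(params)}"


def summarize(payload):
    """Pick a few fields out of a decoded response body (field names assumed)."""
    return [
        (doc.get("publication_date"), doc.get("title"), doc.get("html_url"))
        for doc in payload.get("results", [])
    ]


def fetch_documents(term, per_page=20):
    """Fetch one page of results. This makes a network call."""
    with urlopen(build_search_url(term, per_page), timeout=30) as resp:
        return summarize(json.load(resp))
```

Calling `fetch_documents("customs", per_page=5)` should return a list of (date, title, URL) tuples if the assumptions hold. Keeping URL building and response parsing separate from the network call makes it easy to test without hitting the API.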

In general, you might get more traction if you tilt this toward government contract capture a la Deltek GovWin and other companies. There's lots of money in contract intelligence as well as proposal support (in addition to government relations / lobbying).


Yep, we get the sources above. We've definitely thought about contract capture, we've just been hesitant because it seems like such a competitive space. But it's definitely tempting, and it wouldn't take much to convince us.


What's the order of magnitude of the number of data sources? 100s, 1000s, tens of thousands?


Definitely many thousands, possibly tens of thousands depending on what you count as a source (e.g., if you get press releases from one part of an agency’s site, and speeches from another part, one could consider those separate sources, I suppose).



