
There are projects on this. One of the best-known ones (to my knowledge) is the Stanford IP Litigation Clearinghouse Project:

http://www.lexmachina.org/




Yes, litigation and compliance tend to lead the way when it comes to extracting meaning from legal data pools. In my opinion, the single biggest obstacle to getting legal knowledge to play nice with software is the fact that it is all "silo'ed" due to: (1) being in MS Word format, (2) being confidential information, and (3) the lack of conventions/standards in legal documents.

The good news, though, is that legal documents tend to follow a fairly narrow channel of variations when isolated to particular practice areas (e.g., leases, sales of goods, service agreements, motions, etc.).

I've always wanted to run a huge number of documents through Bayesian filters or something similar to develop some interesting classification rules, but it's damn hard to get a pool of representative documents that isn't strictly confidential.
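For what it's worth, here is a minimal sketch of what a Bayesian filter over practice areas could look like, assuming a multinomial naive Bayes model with add-one smoothing and a crude lowercase word tokenizer. The class name, the labels, and the four training snippets are all made up for illustration; they are not drawn from any real document pool, and a real corpus would need far more text and better tokenization.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Hypothetical multinomial naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()                       # every word seen in training

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def classify(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior: share of training documents carrying this label.
            score = math.log(self.doc_counts[label] / total_docs)
            # Smoothed log likelihood of each token given the label.
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy snippets standing in for a real (non-confidential) corpus.
classifier = NaiveBayesClassifier()
classifier.train("The tenant shall pay rent to the landlord for the premises each month", "lease")
classifier.train("Landlord may terminate the lease if the tenant fails to pay rent", "lease")
classifier.train("The seller shall deliver the goods to the buyer upon payment of the purchase price", "sale")
classifier.train("Buyer may reject goods that fail to conform to the seller's warranty", "sale")

print(classifier.classify("The tenant must return the premises to the landlord"))
print(classifier.classify("Seller warrants the goods delivered to the buyer"))
```

The per-label word counts are the "classification rules" in this setup: words like "tenant" or "seller" end up carrying most of the weight, which is roughly what you'd expect if documents within a practice area really do follow a narrow channel of variations.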


I am doing this with patents. Google "Killing patents with Python" if you want to see the presentation.


Shoot me an email. I didn't find your presentation. I did find a Yahoo Answers question on "Can a 4 foot long ball python kill a 4 lb. kitten?"


You need to keep the quotes in the search. Doing that, I found this link:

http://topsy.com/pycon.blip.tv/file/4879824/?allow_lang=en

which then links to a video of the presentation (watching it now)

http://blip.tv/pycon-us-videos-2009-2010-2011/pycon-2011-how...


Rusty or Sol?


The only word-processed files I can think of that might be non-confidential are contracts made in the past couple of decades between companies that have both since declared bankruptcy. Either that or public EULAs.



