Hacker News
AT&T Invents Programming Language for Mass Surveillance (2007) (wired.com)
95 points by leokote on July 4, 2013 | hide | past | favorite | 13 comments



I managed to track down version 2.0.2 of Hancock, circa Nov. 2010. I haven't had a chance to even glance through it yet, but here's a GitHub mirror for those so inclined: https://github.com/mqudsi/hancock

I'm more interested in why it's vanished from the web.


> I'm more interested in why it's vanished from the web.

Looks like it might just be corporate bitrot. It was hosted under user webspace at http://www2.research.att.com/~kfisher/hancock/ rather than somewhere more durable. When kfisher left AT&T in 2011, her account & webspace probably got wiped.

She's now a DARPA program manager: http://www.darpa.mil/Our_Work/I2O/Personnel/Dr__Kathleen_Fis...


> Programs written in Hancock work by analyzing data as it flows into a data warehouse. That differentiates the language from traditional data-mining applications which tend to look for patterns in static databases.

Seems like Clojure and Storm:

http://pseudofish.com/storm-a-real-time-hadoop-like-system-i...

http://www.ibm.com/developerworks/library/os-twitterstorm/

http://www.infoq.com/presentations/Zolodeck

https://github.com/nathanmarz/storm/wiki/Rationale


Wasn't this how ThinThread was supposed to work?


From the manual, a brief explanation of how AT&T uses it (or at least, did some years ago):

Hancock’s stream construct models data that can be viewed as a sequence of values in a fixed format. Typical examples include records of telephone calls on a long-distance network, session logs from an Internet service provider, and billing records from a credit card company. Hancock constructs make it easy to filter streams to remove unwanted records, to sort streams to improve access locality and hence performance, to detect user-defined events in streams, and to execute user-specified code in response to those events.

[...]

At AT&T Labs we have a suite of Hancock programs that run daily to calculate signatures or profiles of AT&T’s long-distance customers. These signatures are used for fraud detection and marketing. The programs process roughly nine gigabytes of stream data daily. The most complex application produces a Hancock map that stores values of 120 bytes for over 300 million active keys (from a space of 10 billion possible keys). These maps require roughly seven gigabytes of space on disk.


This sounds like the sort of tool your credit card company would use to freeze your card if it shows up at points of sale in different states within too short an interval, buys gas 10 times in an hour, etc.

Real-time data analysis is useful for surveillance, but there are legitimate uses as well, credit card fraud being one of them.


It seems that the original software at http://www2.research.att.com/~kfisher/hancock/ is no longer available. Any recent mirrors?



Thanks for that link; I was able to download the 1.5MB .ps version of the manual with it. I was unable to download the 420k .pdf version, though, as it crapped out after 128k.


> ... the FBI has been asking Verizon for "community of interest" records ... Verizon, though, doesn’t create those records and couldn’t comply.

Ahem.


Isn't the programming language just C, and this a tool written in that language?


It looks like there are some syntactic extensions, particularly in iteration.


Seems funny that AT&T took this down.



