The fork lives at https://github.com/OpenIB/OpenIB.

IPs are hashed at https://github.com/OpenIB/OpenIB/blob/master/inc/functions.p.... Hashing them was part of the point of the fork.

    $identity = crypt($userIP, '$2a$07$' . $hashSalt . '$');

poster_id is probably called with the hashed IP address.



I may have missed something, but it seems that function is used for third parties, and the real IP is still stored in the database anyway, at least until the post is deleted:

    $query = prepare(sprintf("INSERT INTO ``posts_%s`` VALUES ( NULL, :thread, :subject, :email, :name, :trip, :capcode, :body, :body_nomarkup, :time, :time, :files, :num_files, :filehash, :password, :ip, :range_ip_hash, :sticky, :locked, :cycle, 0, :force_anon, :embed, NULL)", $board['uri']));


>poster_id is probably called with the hashed IP address.

But even if the database was wiped, as far as I can tell, poster_id still uses the old method posted below. The secure_trip_salt probably hasn't changed (unlike $hashSalt, which was meant to rotate). So if you could get the thread ID and post ID from a screenshot or archive, you could run the whole range of IPv4 addresses through the function. Because SHA-1 output is close to uniform, you would get approximately 256 aliases for a given poster ID, one of which is the poster's address.
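For what it's worth, the "approximately 256" figure falls out of simple counting. This is only a sketch, and it assumes the poster ID is 6 hex characters long (my guess, based on the 000000 placeholder the template shows for Tor posters; the real value is whatever `poster_id_length` is set to):

```python
# Back-of-the-envelope count of IPv4 addresses that share one poster ID.
# ASSUMPTION: poster_id_length is 6 hex characters; the thread does not confirm it.
IPV4_ADDRESSES = 2 ** 32
POSTER_ID_LENGTH = 6                     # assumed value of $config['poster_id_length']
possible_ids = 16 ** POSTER_ID_LENGTH    # each hex character carries 4 bits

# With a roughly uniform hash, addresses spread evenly over the possible IDs.
expected_aliases = IPV4_ADDRESSES / possible_ids
print(expected_aliases)                  # 256.0
```

A shorter ID would mean more aliases per ID, and a longer one fewer.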

So if I were the FBI and 8chan was not able to provide IP addresses, I would ask what the secure_trip_salt was, generate the table, and match it against all the posts in that thread, or at least the most significant ones, then trim down the matching IPs. After that I would query the ISPs or other entities who own or monitor the remaining IP addresses and, using the timestamp, get the identity of the person, unless they were using a secure, non-logging VPN that I could not deanonymize.

------------------------------

Poster ID is still used as follows:

The files:

templates/post_reply.html templates/post_thread.html

Contain:

    {% include 'post/poster_id.html' %}

And poster_id.html contains:

    {% if config.poster_ids or (mod|hasPermission(config.mod.show_ip_less, board.uri)) %}
     {% if post.ip == config.tor_ip_hash %}
       <span class="poster_id" title="This user is posting via the Tor hidden service.">000000</span>
     {% elseif post.thread %}
       <span class="poster_id">{{ poster_id(post.ip, post.thread, board.uri) }}</span>
     {% else %}
       <span class="poster_id">{{ poster_id(post.ip, post.id, board.uri) }}</span>
     {% endif %}
    {% endif %}


Both branches still call the poster_id function in https://github.com/OpenIB/OpenIB/blob/master/inc/functions.p...

which returns:

    return substr(sha1(sha1($ip . $config['secure_trip_salt'] . $thread . $board) . $config['secure_trip_salt']), 0, $config['poster_id_length']);
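To make the nesting easier to read, here is a rough Python port of that one-liner. It is a sketch rather than the project's code: the function signature, the `salt` and `length` parameters, and the example values are mine, and `length=6` is an assumed stand-in for `$config['poster_id_length']`.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Mimic PHP's sha1(): lowercase hex digest of the string's bytes."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def poster_id(ip: str, thread: int, board: str, salt: str, length: int = 6) -> str:
    # Inner hash: ip . secure_trip_salt . thread . board
    inner = sha1_hex(ip + salt + str(thread) + board)
    # Outer hash: hex digest of the inner hash with the salt appended again,
    # truncated to the configured ID length.
    return sha1_hex(inner + salt)[:length]


# Hypothetical inputs, for illustration only.
print(poster_id("203.0.113.7", 12345, "b", "example-salt"))
```

Note that everything going into the hash except the salt is visible on the page (thread number, board), so the salt is the only secret between an observer and the `ip` argument.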


The "ip" field used to hold the plain IP address, so it's still called that, even though it now holds a hash.

The post() function fills it in like so:

    $query->bindValue(':ip', isset($post['ip']) ? $post['ip'] : $identity);

It's called from https://github.com/OpenIB/OpenIB/blob/master/post.php#L988, but $post never gets an 'ip' key, so it always uses $identity (which was created using getIdentity(), which currently hashes the IP address).

I think the method you describe for checking whether a post could have come from an IP address would work, if they gave up all the relevant salts. secure_trip_salt isn't supposed to change. hashSalt is also needed, because the IDs are generated using the hashed IP addresses, but it's changed infrequently from what I remember (changing it logs out all moderators because sessions are tied to IP addresses, so it's easy to notice).
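A sketch of what that check could look like with both salts in hand, scanning a small range instead of all of IPv4. Big caveat: `identity()` below is a stand-in that uses SHA-256 so the example has no dependencies; the real code uses PHP's crypt() with a bcrypt prefix, which is far slower per guess. All names, salts and addresses here are made up.

```python
import hashlib
import ipaddress


def sha1_hex(text):
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def identity(ip, hash_salt):
    # STAND-IN ONLY: the fork hashes with crypt()/bcrypt, not SHA-256.
    return hashlib.sha256((hash_salt + ip).encode("utf-8")).hexdigest()


def poster_id(stored_ip, thread, board, trip_salt, length=6):
    inner = sha1_hex(stored_ip + trip_salt + str(thread) + board)
    return sha1_hex(inner + trip_salt)[:length]


def candidates(observed_id, thread, board, hash_salt, trip_salt, network):
    """Return every address in `network` whose poster ID matches the observed one."""
    matches = []
    for addr in ipaddress.ip_network(network):
        hashed = identity(str(addr), hash_salt)
        if poster_id(hashed, thread, board, trip_salt, len(observed_id)) == observed_id:
            matches.append(str(addr))
    return matches


# Simulate a post from a known address, then recover it from the ID alone.
true_ip = "198.51.100.23"
observed = poster_id(identity(true_ip, "hs"), 4242, "b", "ts")
hits = candidates(observed, 4242, "b", "hs", "ts", "198.51.100.0/24")
print(true_ip in hits)
```

Since the thread number is part of the hash, the same address gets a different ID in each thread. Intersecting the candidate sets from several threads where the same person is believed to have posted should shrink the list quickly.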


Thanks for locating that line. I think between the two of us we've figured out all the relevant parts of the system and assessed what is going on with poster IDs and IPs in the database.

At minimum, from my understanding, IPv4 addresses look 100% recoverable with the database and $hashSalt. Without the database, an address can still be narrowed to roughly 256 candidates (about three octets' worth of information) as long as you have hashSalt, secure_trip_salt and an archive of the thread.
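The "100% recoverable" part follows because a fixed-salt hash of a 32-bit input can simply be enumerated. Here is a toy version of the lookup table over a /24. Again, `identity()` is a SHA-256 stand-in for the fork's crypt()/bcrypt call, and the salt and addresses are invented for the example.

```python
import hashlib
import ipaddress


def identity(ip, hash_salt):
    # STAND-IN ONLY: the fork uses crypt() with a bcrypt prefix, not SHA-256.
    return hashlib.sha256((hash_salt + ip).encode("utf-8")).hexdigest()


def build_lookup(hash_salt, network):
    """Map each hashed value back to the address that produced it."""
    return {
        identity(str(addr), hash_salt): str(addr)
        for addr in ipaddress.ip_network(network)
    }


table = build_lookup("example-hash-salt", "192.0.2.0/24")

# A value as it might appear in the `ip` column of a posts table.
stored = identity("192.0.2.99", "example-hash-salt")
print(table.get(stored))   # 192.0.2.99
```

The full IPv4 space is about 4.3 billion entries, so with a deliberately slow hash the cost is real, but it is a one-time cost per salt value.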

Of course, this is just from the board-software perspective. The next layer in assessing users' privacy with regard to what the FBI can get is the server, hosting, and upstream providers, which, whether intentionally or not, may hold additional identifying information, for example caches or logs that can be correlated with the posts.



