_ose7's comments

_ose7 · on June 22, 2017

They currently have 150B posts (https://www.tumblr.com/about) so I expect that to be quite challenging. If you seriously make any progress though, it'd be cool to see.

awalton · on June 22, 2017

Maybe 1:100 of those (being incredibly generous) are actually unique posts and not someone else reposting something from someone else, so maybe it's not such a hard task...
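
For scale, here is a minimal back-of-envelope sketch of that estimate. The 150B total is the figure quoted upthread; the 1:100 ratio is the commenter's own guess, not a measured value, and the function name is purely illustrative.

```python
# Rough estimate of how many posts are original rather than reblogs.
TOTAL_POSTS = 150_000_000_000  # 150B posts, as quoted upthread
ONE_IN = 100                   # assumption: 1 in 100 posts is unique (a guess)


def unique_posts(total: int, one_in: int) -> int:
    """Estimate the number of original posts given a 1:one_in ratio."""
    return total // one_in


if __name__ == "__main__":
    print(f"{unique_posts(TOTAL_POSTS, ONE_IN):,} unique posts")
```

Under those assumptions the crawl target drops from 150 billion posts to about 1.5 billion.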

tangent128 · on June 22, 2017

It's fairly common for commentary to be added to a reblog via tags rather than after the quoted post, so even reblogs without commentary may need saving.

jandrese · on June 22, 2017

This assumes there is commentary on Tumblr worth saving.

PhasmaFelis · on June 22, 2017

I'm guessing you've bought into the meme that Tumblr is literally nothing but left-wing politics.

jandrese · on June 22, 2017

No, the comment section on an average tumblr page:

X has shared this Y has shared this Z has shared this ....

Actual commentary? Not so much.

On a side note I tried going back there to see if the situation has improved and somehow the interface is even worse now. The blog I was trying to read would only appear as a slide in on the side and would disappear at the drop of a hat. I don't even understand how you are supposed to use it now.

lmm · on June 22, 2017

Are you saying that meme is inaccurate?

breakingcups · on June 22, 2017

Are you implying it's not?

lmm · on June 22, 2017

Yes. The meme aligns with my own experience of tumblr, so I'm inclined to believe it over contrary anecdotes (of course, a more rigorous study would be a different story).

amiga-workbench · on June 22, 2017

That and porn, and Steven Universe fans.

nickpsecurity · on June 22, 2017

Step 1: Hack into a supercomputing center with at least 1Gbps line.

Step 2: Download all the sites in massively parallel fashion using clustered nodes, compress the data, and store the result in a high-performance clustered filesystem.

Step 3: Move it off of there when traffic is low or overnight if system doesn't go offline overnight.

_ose7 · on June 22, 2017

Step 0: Acquire at least ~1PB of storage to store all the data.
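
To put the 1Gbps line and the ~1PB figure side by side, here is a rough sketch of the transfer time. It assumes decimal units (1 PB = 10^15 bytes, 1 Gbps = 10^9 bits/s) and a fully saturated link with no protocol overhead or rate limiting, which is optimistic; the function names are illustrative only.

```python
# How long would ~1 PB take over a single 1 Gbps line?
PETABYTE_BYTES = 10**15    # assumption: decimal petabyte
LINK_BITS_PER_SEC = 10**9  # assumption: 1 Gbps, fully saturated, no overhead
SECONDS_PER_DAY = 86_400


def transfer_seconds(size_bytes: int, bits_per_sec: int) -> float:
    """Seconds needed to move size_bytes over a link of bits_per_sec."""
    return size_bytes * 8 / bits_per_sec


def transfer_days(size_bytes: int, bits_per_sec: int) -> float:
    """The same transfer time expressed in days."""
    return transfer_seconds(size_bytes, bits_per_sec) / SECONDS_PER_DAY


if __name__ == "__main__":
    days = transfer_days(PETABYTE_BYTES, LINK_BITS_PER_SEC)
    print(f"about {days:.1f} days")
```

Under those assumptions a single 1Gbps line needs roughly three months of continuous transfer, so the "at least" in Step 1 is doing a lot of work.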

toomuchtodo · on June 22, 2017

I have over 1PB of storage at my disposal.

_ose7 · on June 22, 2017

Business or personal? It's doable but it's a lot of money to buy all those drives.

toomuchtodo · on June 22, 2017

Personal.

boulos · on June 22, 2017

$6k for 600T fully assembled backblaze storage pod: https://www.backuppods.com (no affiliation, just pointing out that ~$12k isn't a massive amount).

noisem4ker · on June 22, 2017

Note that hard drives are not included.

1000TB / 8TB = 125 HDDs

125 * $200 = $25k
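
The same arithmetic as a small sketch, in case anyone wants to plug in other drive sizes or prices. The $200 per 8TB drive and the $6k per 600T pod are the figures quoted in this thread; adding them together into one total is an assumption made here, and it ignores redundancy, spares, and power.

```python
# Rough cost of ~1 PB of raw storage using the figures from this thread.
TARGET_TB = 1000        # ~1 PB
DRIVE_TB = 8            # capacity per drive
DRIVE_PRICE_USD = 200   # price per drive, as quoted above
POD_TB = 600            # capacity per storage pod chassis
POD_PRICE_USD = 6000    # price per pod, drives not included


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def drives_needed(target_tb: int, drive_tb: int) -> int:
    """Number of drives required to reach target_tb of raw capacity."""
    return ceil_div(target_tb, drive_tb)


def build_cost_usd(target_tb: int) -> int:
    """Drives plus pod chassis; assumes no redundancy or spare drives."""
    drives = drives_needed(target_tb, DRIVE_TB)
    pods = ceil_div(target_tb, POD_TB)
    return drives * DRIVE_PRICE_USD + pods * POD_PRICE_USD


if __name__ == "__main__":
    print(drives_needed(TARGET_TB, DRIVE_TB), "drives")
    print(f"${build_cost_usd(TARGET_TB):,} total")
```

With these numbers that is 125 drives and two pods, so about $37k all in rather than ~$12k.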

boulos · on June 23, 2017

Aww! Damn my quick posting (and too good to be true!). Thanks.

nickpsecurity · on June 22, 2017

It's in the supercomputing center. How about I modify it so that a filtering step runs, deleting everything that doesn't match the desired image features?

_ose7 · on June 22, 2017

> Step 3: Move it off of there when traffic is low or overnight if system doesn't go offline overnight.

Where are you moving it to? You think that even if you manage to hack into a "supercomputing center" that nobody's going to notice 1PB of storage filled with GIFs?

toomuchtodo · on June 22, 2017

Having worked in a supercomputing center for a detector at the LHC, yeah, you could probably get away with storing a few hundred TBs for a short period of time without anyone noticing or caring. A whole petabyte might be pushing it.

nickpsecurity · on June 23, 2017

And I learned about it from you people. All of them said security was lax. Most of them said they personally were using the supercomputer for their own stuff at some point.

toomuchtodo · on June 23, 2017

What's this "you people", bub? :) In this case, it's not a security issue, it's a resource accounting issue.

nickpsecurity · on June 23, 2017

People working in and around HPC centers. Obviously. :)

And no, it's both an accounting and a security issue. One guy I know, who does security for ASICs and stole HPC time in the past, did it by modifying the accounting system to not show his jobs. It was easy, as it wasn't designed to stop accounting fraud by hackers.