
Amazing. This comment must compare to the famous Slashdot takedown of the iPod.

"For a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. "

(Not meant as criticism; we're all horribly naive in hindsight.)

No, I think we could have dismissed that one as ridiculous even back at the time.

What power users need to always remember is how important usability by the uninitiated is when adopting solutions. This is why the market for security products is such a problem - they nearly always necessarily either put a barrier in front of common operations, or expose the user to the risk of losing their data by error, or hand all the difficult bits over to a commercial third party.

It's not amazing at all :) I still struggle to find a reason why I should use Dropbox over FTP or any other file repository service. I guess marketing and creating artificial buzz played a big role in their success.

I get that many techies can live without Dropbox and its ilk (myself included), but NOBODY should ever be advocating FTP. It's insecure (no encryption, unless you're talking about FTP(E)S, but that introduces its own issues) and it's broken by design: there's no clear client/server relationship, which can cause issues for NATing and firewalls (particularly if running with TLS); output specs depend on the host OS (e.g. directory listings); and there's no automatic way of differentiating between text and binary data, so modern FTP clients have to guess from file extensions (picking the wrong mode will break your files). FTP is outdated, from a bygone era we no longer compute in, and thus by modern standards it's become horrible in every conceivable way.
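The "picking the wrong mode will break your files" point can be shown without a server. The sketch below only simulates the line-ending rewrite that an ASCII-mode transfer may apply (LF becoming CRLF); the function names and the sample bytes are made up for illustration, and real behaviour depends on the client and server involved.

```python
# Simulate what an ASCII-mode FTP transfer may do to a payload: line endings
# are rewritten as CRLF. Harmless for text, destructive for binary data.

def ascii_mode_transfer(payload: bytes) -> bytes:
    """Mimic ASCII-mode newline translation (bare LF -> CRLF)."""
    return payload.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def binary_mode_transfer(payload: bytes) -> bytes:
    """Binary (image) mode passes the bytes through untouched."""
    return payload


# A made-up binary blob that happens to contain the byte 0x0A twice.
blob = bytes([0x89, 0x50, 0x4E, 0x47, 0x0A, 0x00, 0xFF, 0x0A])

ascii_result = ascii_mode_transfer(blob)
binary_result = binary_mode_transfer(blob)

print("original length:   ", len(blob))
print("ascii-mode length: ", len(ascii_result))
print("binary mode intact:", binary_result == blob)
print("ascii mode intact: ", ascii_result == blob)
```

Every 0x0A byte in the blob gains a 0x0D in front of it, so the file grows and no longer matches the original, which is exactly the kind of silent corruption a wrong guess from the file extension can cause.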

Thankfully we have SFTP, which natively supports chroot (not all FTP servers do), key-based logins (more secure) as well as passwords, compression, and no fuzzy callback ports like in FTP. sshfs is pretty handy too.

If one needs "anonymous FTP" then you can also throw HTTPS into the list of better solutions: TLS encryption, compression, smarter handling of MIME types, and again no stupid fuzzy callback ports.

I don't often say things this strongly, but FTP should die.

FTP needs no defending -- it was really useful in 1979, but times have changed (e.g. I suspect every machine on the Internet uses an 8-bit byte). One point you wrote surprised me though:

> no clear client/server relationship which can cause issues for NATing and filewalls (particularly if running with TLS)

Really, crocks like NAT and stateful firewalls should die. Layers 4 and below are inherently peer-peer -- the net should not treat endpoints differently (i.e. should not privilege some over others). That simply encourages a "client" or "consumer" mentality in both the technical and social senses.

The thing is, while NAT is horrible for the reasons you're describing, it probably did more to improve security than anything else, even though that wasn't its primary goal.

I remember what the internet was like when ADSL/cable modems first came along. Everyone was getting pwned non-stop. Any RCE could easily be exploited by scanning a consumer DSL/cable IP pool, and you'd be able to hit a very high percentage of the hosts in it.

NAT totally stopped this.

It was the firewalling that stopped those attacks. Granted, you could argue that firewalls only became popular in households because routers were shipped to address a need for NATing, but pragmatically we really should have been installing firewalls on our PCs in the pre-router days of the internet.

> The thing is while NAT is horrible for what you're saying, it probably did more to improve security than anything else, which wasn't it's primary goal.

Are you defending NAT? It sounds like a Vietnam era construction: you had to destroy the Internet in order to save it.

We now have a seemingly entrenched tree-structured (i.e. centralized) network again, the very 1960s architecture we tried so hard to get away from.

... and yet billions (?) of dollars are moved around every day using this technology "from a bygone era" (transferring CSV files for ACH transfers, etc.).

I'm not a big fan of FTP and hardly ever use it any more, but it does what it was designed to do and still manages to work pretty well considering how much everything else around it has changed.

Have you seen the list of requirements used for FTPing ACH transfers? It uses TLS (something that isn't part of the original FTP specification, what little of one there was) to transfer PGP-encrypted files (something that wasn't even invented when the FTP specification was written), and even with all these extra steps put in there are still a lot of ways the process can easily fall apart. I've spent enough time building systems that interact with these kinds of banking systems to know that they aren't doing themselves any favours by using FTP. In fact the whole process of working with ACH files is a complete mess, and saying ACH still uses FTP doesn't really improve the validity of FTP. It just demonstrates more technology that really should have been deprecated before now.

I'm not the sort of person who advocates new technology for the sake of new technology. I normally get annoyed at the constant reinvention of the wheel; however, some older tech is just bad and FTP is one of those. It got the job done when it was first written, but it made a bunch of mistakes along the way. We've learned from those mistakes and have since written a thousand better transfer protocols. So it's about time people laid FTP to rest.

It will when Windows File Explorer supports SFTP out of the box.

I gladly pay $100/year to never ever ever EVER have to hear my wife complaining about me losing the pictures of us and the kids. You can't put a price on that my friend. :)

Same reason I pay for iCloud storage on our family account. Peace of mind.

Could I wire up some rsync contraption? Sure, maybe. Would I sleep peacefully? No.

Funnily enough, I wrote a "one-way sync your OneDrive to external disk" tool for my wife to avoid the opposite problem: loss of data due to failures of the paid-for sync solution.
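For anyone wondering what such a tool might look like, here is a minimal sketch. It is not the tool described above: it assumes the synced folder is available as a plain local directory, and the function name `one_way_sync` is hypothetical. It copies new or changed files to the backup location and never deletes anything there, so a file that vanishes from the sync service survives on the external disk.

```python
import shutil
from pathlib import Path


def one_way_sync(src: Path, dst: Path) -> list[Path]:
    """Copy new or changed files from src to dst; never delete from dst."""
    copied = []
    for src_file in src.rglob("*"):
        if not src_file.is_file():
            continue  # directories are created on demand below
        dst_file = dst / src_file.relative_to(src)
        if dst_file.exists():
            s, d = src_file.stat(), dst_file.stat()
            # Skip files whose size matches and whose backup is at least as new.
            if s.st_size == d.st_size and s.st_mtime <= d.st_mtime:
                continue
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also copies timestamps
        copied.append(dst_file)
    return copied
```

Calling `one_way_sync(Path("OneDrive"), Path("/mnt/backup"))` (both paths are placeholders) would return the list of files it copied. A more careful version would probably keep old versions too, since this one will happily overwrite a good backup with a file the sync service has corrupted.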

I agree. Having them take care of backups is worth a lot in my book too.

Google Photos is better for that.

I don't fully understand the underlying cause, so take this with a pinch of salt... but my girlfriend's father's photo collection almost got wiped out by Google Photos.

After being locked out of his Samsung tablet (supposedly it set itself a lockscreen out of the blue), I checked whether it had backed his photos up on Google Photos, but nothing there... After resetting it (it seems Samsung removed the ability to reset the lockscreen password via your Google account) I assumed that the photos were lost. However upon opening the app we rejoiced when they started appearing. Shortly afterwards, the Google Photos app popped up a message stating that an upgrade was required, after which all of the photos had disappeared again.

The workaround was to reset the tablet, open up Google Photos, wait until the photos had synced and then disconnect from the internet as soon as possible to prevent Google Photos from trying to update itself (the message couldn't be dismissed). My hunch was that the version of Google Photos that shipped with the tablet was very old and they have long-since updated the format for storing photos, hence why they wouldn't show up online?

Methinks you could definitely get in touch with Google about this.

Product support may well actually do what it's supposed to do...

Alternatively, keep the tablet on cotton wool until you see someone mention in here they're from Google, check their profile for contact info and email them directly. Keep doing this until you get a reply back, and get them to poke your issue over to the right department. :P

> get in touch with Google about this


Why didn't you just log into his account on the web and click on photos?

I did. I got a couple of random low-res thumbnails (they were just generic enough that I wasn't sure if they were actually original photos, or Google samples), and that was it – the rest were mysteriously missing.

They were clearly there on the server somewhere, but something was causing them not to be shown on the web, or the latest version of the Google Photos app.

Sure, but I've read a story like this about pretty much every company. In our house we have our photos in 3 places: the device, remote backup, and local offline backup (external hard drive). 2 is easy, 3 takes a little bit of a routine.

Right, it's why I don't trust these services myself. To be fair, while I say that Google Photos almost wiped his photo collection out, we didn't actually realise that sync had been enabled in the first place. I guess in a funny kind of way it saved his photo collection (otherwise it would have been toast when we reset the device to get past the lock), even if it did make it painstakingly frustrating to retrieve them again.

But one has to make sure they subscribe to the paid plan, because Google offers free unlimited storage only if you let them resize/compress the photos. Which is something I refuse.

Anyway, Dropbox wins on the OS-level integration, which also doubles as a bullshit- and hassle-free way to transfer files between your devices.

But well, at this point I'm a paying customer of both Dropbox and Google because I run out of free-tier storage on both :).

My sister refuses to receive photos via Dropbox because, she says, it's less usable than WeTransfer and it looks like a virus to her. When sitting with her, I noticed:

- Dropbox aggressively pushes recipients to create a new account,

- She's not tech-savvy, so when a popup appears, she assumes it's mandatory. Hence she installed Dropbox on her phone and PC without wanting it, and couldn't understand why so many steps are required to download a few photos. Also, "why does it try to upload all my folders?!?", hence the spammy/virus impression.

- Dropbox sends an Android notification for every upload, which is both annoying and worrying to her, because she doesn't want her private life to go online.

I always assumed the masses went with Dropbox. Turns out the first example I see from that audience thinks it's a virus. Big lesson in user experience here.

I can imagine. My use patterns mostly let me avoid this, but I've seen some of the stuff you've described.

It seems that any good business, in its quest to grow, will eventually start doing annoying and user-hostile things. Dropbox definitely was cleaner and better in the past than it is now. It's a story I've seen repeated by company after company: they reach peak quality, and instead of leaving things as they are, they have to "innovate" in more and more crap.

I'm one of the earlier users of Dropbox, so I still have a "Public" folder with an ability to create direct links to uploaded files. They turned this feature off for new accounts some time ago, I guess because people started using Dropbox as a CDN. Still, it's one of its most useful features for me. I use the "Public" folder almost every time I want to transfer some files to people, as it lets the recipient avoid all those popups and captive forms bullshit.

I have 1/2 TB of kids' pictures and videos from cell phones, an SLR, etc. backed up to multiple HDDs. If everyone put 1/2 TB or more on Google, can Google's backend really handle that, and if so, for how long?

Also need to consider how long it will take me to download those pics back once they decide to shut down the "free" service.

> If everyone put 1/2 TB or more to google, can google backend really handle that, if so for how long?

Not everyone is going to do that though (anytime soon anyway), so that's not a real concern. It's like asking, when Gmail first launched, "okay but what if EVERYONE uses the full gigabyte?"

I'll answer in reverse order:

> Also need to consider how long it will take me to download back those pic once they decide to shutdown the "free" service.

This is an incredibly good point, both in terms of bandwidth considerations (particularly their rate limiting) and in terms of products randomly disappearing with limited takeout windows.

FWIW, https://get.google.com/albumarchive/<G+ UID> will net you takeout archives of your image albums. Incidentally this works with any Google account that doesn't have public photo access turned off, and is rather fun to play with (as is the site: search operator :D)


> can google backend really handle that, if so for how long?

YouTube used to officially report that 300 hours are uploaded per minute, back in 2014. http://tubularinsights.com/hours-minute-uploaded-youtube/ says we're likely at 700hr/min now.

OK. (Been wanting to do this math for a while, actually...) Let's see. This is all back-of-the-envelope and I wouldn't mind some more concrete numbers to work with!

YT reencodes all videos into several formats.

I'm looking at http://youtu.be/1tQ5XwvjPmA, which is 1:20:58 long. It was uploaded fairly recently so has the full complement of encodings. I see:

- 5 DASH audio bitrates: 51k (27.53MB), 66k (31.93MB), and 120k (58.02MB) for clients that can decode OPUS, 89k Vorbis (46.67MB), and 132k M4A (73.16MB)

- 6 DASH video sizes in both WebM/MP4 (so 12 total formats): 256x144 (43.09MB / 63.54MB); 426x240 (39.79MB / 140.34MB); 640x360 (71.80MB / 122.65MB); 854x480 (118.37MB / 266.00MB); 1280x720 (234.63MB / 548.81MB); and 1920x1080 (463.04MB / 1.05GB). (Yes, WebM is amazing compared to MP4.)

- Five legacy encodings across three formats: 176x144 3GP (39.51MB), 320x180 3GP (116.05MB), 640x360 WebM (211.30MB), 640x360 MP4 (205.97MB), and 1280x720 MP4 (621.68MB).

So, for this standard, 30fps 1080p video, YouTube is actually storing... 4.51GB of data. Huh! Nice.

If this video is 1h20m, 1-(60/80) means I should subtract 25% from 4.51, and I get 3.38GB for one hour of video.

OK. Taking that figure of 700 hours... that's 2366GB (2.31TB) per minute :)

In other words YouTube needs to find disk capacity for 39.42GB of data every second.

I'm not sure how to multiply by an increasing gradient with a back-of-the-envelope calculation, so I'll punt and pretend it was 700 hours/min all the way back to 2014, so the past 2 years. Quite inaccurate, but possibly still interesting:

(2.31 * (1024^4)) * 12 * 365 * 2 = 22249277495024025.60

Uhh.... that's... ah. 22PB. Err, 19.76PB to be precise.
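The chain of multiplications above can be re-run as a short script. This is only a sketch using the figures quoted in the comment (the variable names are mine, and units are binary, 1 TB = 1024 GB, as in the original sum). One thing it makes visible: the `12 * 365 * 2` factor counts 12 minutes per day, whereas a day has 1440 minutes, so under the same assumptions the two-year total would come out 120 times larger.

```python
# Re-run the back-of-the-envelope estimate using the figures quoted above.
GB_PER_VIDEO = 4.51          # all encodings of the 1h20m sample video
VIDEO_MINUTES = 80           # length rounded to 1h20m, as in the comment
UPLOAD_HOURS_PER_MIN = 700   # estimated upload rate

gb_per_hour = GB_PER_VIDEO * 60 / VIDEO_MINUTES      # ~3.38 GB
gb_per_minute = gb_per_hour * UPLOAD_HOURS_PER_MIN   # ~2368 GB
tb_per_minute = gb_per_minute / 1024                 # ~2.31 TB
gb_per_second = gb_per_minute / 60                   # ~39.4-39.5 GB, depending on rounding


def two_year_total_pb(minutes_per_day: int) -> float:
    """Total over two 365-day years, in binary petabytes (1 PB = 1024 TB)."""
    return tb_per_minute * minutes_per_day * 365 * 2 / 1024


as_posted = two_year_total_pb(12)      # the multiplier used in the comment
full_days = two_year_total_pb(1440)    # 60 * 24 minutes in a day

print(f"per hour of video:  {gb_per_hour:.2f} GB")
print(f"per minute of time: {tb_per_minute:.2f} TB")
print(f"two years (x12):    {as_posted:.1f} PB")
print(f"two years (x1440):  {full_days:.0f} PB")
```

With the 12 multiplier this lands just under 20 PB, matching the figure above; with 1440 minutes per day it is a bit over 2300 PB, i.e. a couple of exabytes. Both are still built on the pretend-it-was-always-700-hours assumption.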

This is for the boring 30fps-and-under 1080p videos out there. Not the 60fps, 2K/4K/8K (!), 360° and similar stuff, and there's an increasing pile of that being uploaded.

22 PB = total YouTube data need for the last two years.

1/2 TB per user (like me)

22PB = 44,000 users.

Google would need 1000 times that space in their data centers to handle 44 million users.
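Spelled out as a quick check, taking the parent's 22 PB figure as given and using decimal units (1 PB = 1000 TB), which is what the round numbers above imply. The variable names are just for illustration.

```python
TB_PER_PB = 1000              # decimal units, matching the round numbers above
youtube_two_years_pb = 22     # figure taken as-is from the parent comment
per_user_tb = 0.5             # 1/2 TB of photos and videos per user

users_per_youtube = youtube_two_years_pb * TB_PER_PB / per_user_tb
users_at_1000x = users_per_youtube * 1000
storage_at_1000x_eb = youtube_two_years_pb * 1000 / 1000   # PB -> EB

print(f"{users_per_youtube:,.0f} users fit in {youtube_two_years_pb} PB")
print(f"{users_at_1000x:,.0f} users need about {storage_at_1000x_eb:.0f} EB")
```

So 44 million users at 1/2 TB each works out to roughly 22 exabytes.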

Also, I might think that 1/2 TB of data is very valuable, but only a few of the files are interesting to a few of my friends and family members. They are probably very hard to monetize. Even I only browse them maybe a few times every few years.

If I were a PM for such a product and tried to propose that Alphabet build 1000 new YouTube-sized data centers to handle only 44 million users, I would have a hard time justifying it.

FWIW, I'm not familiar with how and where the Internet Archive gets their funding, but in 2014 they had 50PB of storage (https://archive.org/web/petabox.php). So IA can manage 50PB as a small-to-medium private company. (Incidentally they've been running since '99.)

As for BackBlaze, also a medium-large business, they're now storing... https://www.backblaze.com/blog/200-petabytes-of-customer-dat...

Both IA and BackBlaze are private/nontraded, which means they have lower operating capital. Disk space is simply not that expensive now.

There's a guy on a DC++ filesharing server (find a server list - it's one of the biggest ones) who has been sharing 400TB of data for some time. Speaking of DC++, most newer clients show the total shared data for all users connected to the server you're on, and that number on some of those larger servers is usually 1-2PB.

I also saw a guy on reddit a while back who was in exactly the right place at the right time when his workplace was upgrading, and he now has a nice $200/mo electricity bill in the form of, you guessed it, 400TB of diskspace. I'm not sure if he got it all for free, but I think he may have.

So it's not a money problem; it's a space problem and a power problem. This is why flash storage is so interesting: it generates less heat, can be packed somewhat more densely, and uses less power too. Once flash-vs-platter hits 49%/51% in terms of relative cost, things are going to get interesting.

At the moment the major retailers are just doing simple things like firmware customizations to run their disks at lower speeds (for nearline storage) or start up with the disk off and stuff like that. Facebook's cold storage datacenters also use Reed-Solomon encoding instead of RAID/ZFS to get redundancy with less space used.
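To put a number on "less used space", here is a rough sketch of the raw storage needed per byte of user data. It compares an erasure code against plain replication rather than against RAID specifically (a simplification chosen here), and the 3-copy and 10+4 parameters are illustrative assumptions, not figures from this thread.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data with plain n-way replication."""
    return float(copies)


def erasure_overhead(data_shards: int, parity_shards: int) -> float:
    """Raw bytes stored per byte of user data with k data + m parity shards."""
    return (data_shards + parity_shards) / data_shards


# Illustrative parameters (assumptions, not taken from the thread).
three_copies = replication_overhead(3)   # survives the loss of any 2 copies
rs_10_4 = erasure_overhead(10, 4)        # survives the loss of any 4 of 14 shards

print(f"3x replication:     {three_copies:.1f}x raw storage")
print(f"Reed-Solomon 10+4:  {rs_10_4:.1f}x raw storage")
```

The erasure-coded layout tolerates more simultaneous failures while storing less than half as much raw data, at the cost of extra computation when a shard has to be rebuilt.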

I do think Google have actually done the kinds of allocations you speak of, using thin provisioning; after all, literally every new Google account gets 15GB of disk space! And then there's sync profile data, whatever internal metadata is associated with the account (such as your search history), etc., that needs to be stored too.

I fully believe Google have multiple exabyte-scale datacenters. If they don't I'll be genuinely surprised.

Using thin provisioning (which is ultimately just "how much are they really using, and how can we encourage them not to use more than X") is how they manage it.

So you're right - actually provisioning enough free storage for these users would definitely be an unpleasant task. But they carefully balance what everyone uses with what they have available.
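A toy model of that balancing act, as a sketch only: the 15GB quota comes from the comment above, while the per-account usage numbers, the 20% headroom, and the function name are all hypothetical.

```python
QUOTA_GB = 15  # free quota per account, as mentioned above


def provisioning_summary(usage_gb: list[float], headroom: float = 0.2) -> dict:
    """Compare promised capacity with what a thin-provisioned pool must hold."""
    promised = QUOTA_GB * len(usage_gb)
    used = sum(usage_gb)
    provisioned = used * (1 + headroom)  # actual usage plus a safety margin
    return {
        "promised_gb": promised,
        "used_gb": used,
        "provisioned_gb": provisioned,
        "oversubscription": promised / provisioned,
    }


# Hypothetical usage for five accounts; most use a small part of their quota.
sample = provisioning_summary([0.3, 1.2, 0.0, 14.5, 2.0])
for key, value in sample.items():
    print(f"{key}: {value:.2f}")
```

In this made-up sample, 75GB has been promised but only about 22GB needs to exist on disk, so the pool is oversubscribed by a factor of roughly 3.5. The operator's job is to watch actual usage and add disks before the margin runs out.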

This kind of high quality, high effort comment is why I love this site so much. Thanks for crunching the numbers and making me drop my jaw at the amount of data.

Just upvoting you doesn't suffice today.

Or to look at it another way, not even 20% of a single AWS snowmobile: https://aws.amazon.com/snowmobile/

Of course, they presumably need to duplicate it for redundancy too, so maybe a full 2/3s of one!

So long as Google decides to keep the service operating, maybe.

Yes, but Dropbox isn't even profitable yet.

Every time this topic comes up, which is reasonably often, I like to link to this: http://www.michaelrwolfe.com/2013/10/19/why-is-dropbox-more-...

Which is funny because 'a folder, that syncs' is what the Windows 95 briefcase was for.

It was nice. And they ditched it for Offline Folders, which was "a folder that syncs, until it breaks all the time".

And in the meantime, here's Joel Spolsky in 2008:

“Imagine all your devices—PCs, and soon Macs and mobile phones—working together to give you anywhere access to the information you care about.” And what is this Windows Live Mesh? It’s a way to synchronize files.

Jeez, we’ve had that forever. When did the first sync web sites start coming out? 1999? There were a million versions. xdrive, mydrive, idrive, youdrive, wealldrive for ice cream. Nobody cared then and nobody cares now, because synchronizing files is just not a killer application. I’m sorry. It seems like it should be. But it’s not. Damn, they just finished building something called Windows Live FolderShare and I haven’t exactly noticed a stampede to that. I’ll bet you’ve never even heard of it. [..] this so called synchronization problem is just not an actual problem, it’s a fun programming exercise that you’re doing because it’s just hard enough to be interesting but not so hard that you can’t figure it out.


And now Microsoft is ditching Offline Folders which break a lot, for Work Folders - folders that sync over HTTPS, without a local hidden cache.

Also, I don't remember names, but there were similar 3rd party paid services in the late 90s, and they also had virtual disks integrated in Windows.

>ftp or any other file repository service

Dropbox is a success because millions of people don't know the meaning of the words you just wrote. This is even hinted at in the application itself: "Dropbox is kind of like taking the best elements of subversion, trac and rsync and making them "just work" for the average individual or team."

Explain FTP to your mom and see how long it takes her to figure out how to do it. Also, how can you FTP from an iPhone? How do you keep local copies on multiple devices in sync? How do you share a link to your file with a third party?

I'm talking about normal people and not experienced engineers.

I did, and dad uses this to share files with friends, privately (I eventually built a tiny Flask app that does access checks and serves the same files over HTTP). It's just a network-connected external drive to him.

Maybe you can't, and that's fine. For me, being able to work on stuff on one machine, then jump over to another one of my machines and it's just there is hugely valuable. The backup element is kinda secondary but nice to have.

This is exactly why I use Dropbox. Backing up and sharing files with others may one day be handy, sure.

Having files on all of my devices without the effort is why I pay them whatever their monthly fee is.

Comfort. The reason is comfort.

Dropbox replaced USB sticks for transporting miscellaneous files for me. The less I need to do my own admin stuff the better (it's not productive, and it's not my core competence).

The point being most users are not power users. They just want stuff that they can get to work in 5 mins after a quick download. That's what a lot of geeks that have geeky solutions don't get. No one wants to do anything, they just want their cake for free preferably and they will pay money to eat it.

I am a power user and I still want the 5 minutes after a quick download. The two aren't mutually exclusive, and there is too much (IMO) pride in the 'this is difficult to use so I'm special for knowing how to use it'.

I spend way too much time bashing heads with the rest of the software ecosystem these days.

Heck, I'm an engineer and I wouldn't want to set up an SFTP server to replace my Google Drive account.

The Slashdot takedown wasn't naivety, it was CmdrTaco's honest opinion of the iPod. It's absurd the way he's portrayed to be making some sort of market prediction.


On the other hand this comment from the same thread https://news.ycombinator.com/item?id=8917

