AWS Snowcone (amazon.com)
478 points by jeffbarr on June 17, 2020 | 217 comments



I must not interact with big enough data to understand, but what is the point of this? I have watched/read several of the Snow product descriptions from AWS and I am still not quite sure I understand the point.

It seems like a way to sync local big data to AWS cloud storage when the data is too large to realistically transfer via the internet. So is this simply a sneakernet external hard drive, because physically shipping an 8TB hard drive has better bandwidth than using the internet?

I would love it if someone could explain a couple of use cases for the Snow family of products to someone that has never had to handle 100 GB of data much less terabytes.


I can speak to part of the rationale around Snowball:

The old method of customers sending in hard disks and AWS importing them turned out to be _incredibly_ high touch.

* Customers would put the wrong labels on the wrong drives. Given they were encrypted, and the label details were critical to matching the right disk with the right encryption key, that was a big issue.

* Imports would routinely fail due to all sorts of driver bugs, both on the AWS side and on the customer side. The number of fringe NTFS quirks was phenomenal, let alone the wide range of other formats that had to be handled.

* Encryption tooling still really isn't all that user friendly. Too many opportunities for mistakes.

* Drives would get damaged during shipping.

* Stuff would go missing in shipping.

It just didn't scale as a solution, and caused unending customer pain and frustration.

Snowball solves _all_ of that, by taking out of the equation almost anything about the process that customers could get wrong. It handles all of the encryption stuff, is extremely ruggedised, has basic tracking. The label is an e-ink display that automatically contains all the right information. It becomes a "just works" solution.


It's just sneakernet with bells and whistles. As it should be.


Those are all really good points. I'm interested in hearing more about the NTFS quirks - what kind of issues were cropping up?

Now I'm really curious what filesystem Snowball wound up standardizing on, and how/if file metadata/ACLs/alternate data streams are maintained.


I helped a music studio client do a transfer to S3 via Snowball once; they had a smorgasbord of random hard drives.

The software client for doing the transfer is some sort of command-line JVM tool, from memory. You locally create S3 buckets on the Snowball and copy from the local disks into the buckets.

Once the snowball is returned, the buckets appear in your online account.
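
For the curious: besides the JVM client, there's an S3 adapter you can run that exposes an S3-compatible endpoint for the device, so ordinary S3 tooling works against it. A rough boto3 sketch — the endpoint address, credentials, and bucket/file names below are placeholders; the real values come from your job manifest and unlock code:

    import boto3

    # Sketch only: placeholder endpoint and credentials. The real values
    # come from unlocking the device with your job manifest.
    snowball = boto3.client(
        "s3",
        endpoint_url="http://127.0.0.1:8080",  # S3 adapter on the local network
        aws_access_key_id="PLACEHOLDER",
        aws_secret_access_key="PLACEHOLDER",
    )

    # Create a bucket on the device, then copy local files into it.
    # These buckets are what appear in your account once the device ships back.
    snowball.create_bucket(Bucket="studio-transfer")
    snowball.upload_file("/mnt/drive1/session-001.wav",
                         "studio-transfer", "sessions/session-001.wav")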


The main issues with NTFS would be the system either wouldn't be able to see the drive, or would fail to be able to read part of it. They were building up quite a repertoire of tricks to get things to read correctly, but it was almost impossible to safely automate.


Do those even matter if the final file destination is S3?


It's mostly a technical curiosity to me. Although maybe they would if you're using it in a system backup/archive workflow?


Great examples of why to use this over "just" a large external hard drive in a case. Thanks for the insight!


Doesn't solve:
* Drives would get damaged during shipping.
* Stuff would go missing in shipping.


It does. The Snowball was designed in a ruggedised fashion. It can take some extremely large shocks while in transit without putting customer data at risk. I forget the precise tests carried out, but I believe they even included "dropped from roof of multi-storey building".

Stuff going missing in shipping is solved by very low-powered tracking chips in them. AWS knows where those devices are and could tell shipping firms if need be.


Sure it does. The drives are specifically designed to be rugged enough to handle transportation (and are tamper-resistant and tamper-evident), and I imagine Amazon probably uses its own delivery network to transport them, which means if one is lost in shipping, it is Amazon's fault (and presumably can be corrected). It also wouldn't surprise me if at least some of the products contain GPS to make them less likely to get lost.


Pretty sure the e-ink displays on the Snowball devices are set up for the UPS tracking label format.


Lots of potential uses:

Imagine Amazon was bidding on a big Department of Defense contract ($ in billions). Say the DoD wants to send drones / UAVs to do big sensor passes out in far-off areas (imagery, wideband radio, etc.). There is NO fast internet for the TBs generated. You pop this out of the drone, put it on the next resupply mail flight out, and ingest into the Amazon cloud on the far side for analytics.

Let's say a store wants to collect 30 days of customer video covering all traffic in their 100 stores, to do some sort of AI analysis on how folks use the stores. Again, some poorly paid store staff person can unplug this after a week and ship it off.

Let's say I generate data, but instead of doing a Snowmobile every year, I want to do more granular moves. Again, I could do a weekly (or whatever) shipment, from maybe a few locations.

I think especially in places with lower high-speed internet penetration (Africa, China, etc.) there is room for something "snackable" like this.

Let's say I distribute movies to movie theatres. Again, do a weekly ship in/out.

It's $60/job. Not bad. No data IN fees.


For a sense of scale, my company used an AWS Snowmobile to transfer 100PB of archived satellite imagery to AWS.

More details here: https://blog.maxar.com/earth-intelligence/2017/digitalglobe-...


For a sense of the world: I went to Manila in 2018 to find out why several distributors could only send their data weekly.

One of the distributors showed me a 64 MB USB flash disk that travelled every week with the delivery truck to collect 4 MB of data from their branches (8 hours of truck driving from the capital).

I went to Myanmar in 2017. The nation was only just upgrading to 4G in 2016, and not every place has 4G coverage.

In 2019, in eastern Indonesia (Seram island), you're lucky if you can get a reliable 5 KB/s internet connection. It's a remote area, but we still have a branch there :(


Why not satellite internet then?


$$$


"Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway."


For real! Reminds me of this article...

"A carrier pigeon with a cargo of microSD cards can transfer large amounts of data faster, and more cheaply, than just about any other method"

Link: https://spectrum.ieee.org/tech-talk/computing/networks/pigeo...


This almost got me and then I looked at the date of publication.


If you're new to joke RFCs, I suggest reading up on RFC 3514, a suggestion that malicious IP packets should be required to set the high order bit of the IP fragment offset field, which is to be designated the "evil" bit. This significantly simplifies implementing effective network monitoring.


It's still true. A pigeon can easily carry 10 microSD cards, 1TB each. Even if you're on a gigabit connection, it will take you over 22 hours to copy 10TB.


They meant that it was posted on April fools, not that it's an old article.


Or these days, micro flash cards.


What was the premium of storing the data in S3 as opposed to keeping it in your own tape warehouse? Definitely curious about the cost benefits!


S3 Glacier Deep Archive is probably the closest to a tape warehouse; at the standard rate it would cost $99,000/month to store 100PB of data.

$0.00099 per GB per month × 100,000,000 GB = $99,000


You need to account for data transfer OUT if you ever need to get your data back from S3. _If_ you could download 100PB of data, it'd cost you $1M.


Hopefully it is not stored as one giant object, and hopefully your retrieval would be only specific parts (small parts).


You’d want to do some research on getting data out of glacier. I’ve read some nightmare stories about the bill from trying to get a small amount of data out of glacier. I can’t remember the details, and some of the large cost was people sort of doing it wrong. But some of the tooling makes it easy to do wrong.

When I was using glacier, we treated it sort of like a write only storage, with the understanding that if we ever needed data out, we’d have to think long and hard about the data’s value, and probably would have gotten an AWS rep on the line to make sure we didn’t do anything stupid.


Is your tape library stored redundantly across multiple data centers?


Unfortunately I don't have access to that info, sorry!


Just FYI your company has a cert mismatch on the following link from that blog post: https://platform.digitalglobe.com/gbdx/

The cert is limited to *.wpengine.com

And the link 404s so.. that's a shame. Was interested in learning more about the platform.


Thanks for the heads up! I'll notify the team.

You can learn more here: https://www.digitalglobe.com/products/gbdx


Like everything from DigitalGlobe -- it doesn't work.


LOL. What products have you had bad experiences with?


That was an excellent read. Thanks for sharing.


I work in healthcare, sometimes we have to go to places with no internet, and sometimes with no electricity. We gather a lot of HIPAA-protected data. This might be useful for those events.

Sometimes, even when we're in the middle of a big city, there's still no secure or reliable internet connection at the location we've been given, so this might be good for those cases, too.

I'm not on the infrastructure side of things, so I can't say for sure. But I've already forwarded the page to our mobile IT team to make sure they've heard about it.

Edit:

They might also be good for television and movie production. In the early part of this century, rushes were sent from remote filming locations to the studios loaded onto iPods. With everything being 4K and higher now, this might be useful to move those rushes to a cloud location where they can be accessed by multiple people who need to see them in multiple cities.


Lord of the Rings was almost stolen this way: https://www.youtube.com/watch?v=Ndge8WlM9q8


So yea, pretty much. I had to do this with a project from a couple years ago where we were moving a ton of images from a legacy colo'd server to S3. I can't remember how much data it was but it was enough to where it didn't make sense and would've taken for-fucking-ever to move it to S3 over public internet.

When we did it, we bought a drive, went to the colo, offloaded the files, and you have to do some stuff to kick off a job to let them know you're sending a drive, what kind, and all that sort of thing. Then you mail it to them and they run an import into a bucket.

My guess with these drives is controlling the hardware makes the process significantly more efficient for them, and for the enterprise customers they're trying to move off colo and onto AWS.


One use case I know of is in a mining operation. They collect a ton of data about their equipment and their mines as they work, but they are often out in very desolate areas with no internet.

The mining equipment runs greengrass right on the machine to do real time inference to help make mining more efficient as well as flag potential equipment failure to the operator.

So they would strap this device onto the equipment, it would collect data and do inference, and then they can swap it out for a new one and send the old one back so the data can be added to their data lake to update all of the AI models.


The speed argument is questionable in many countries where consumer internet connections already have 500 Mbit upload speeds. But in the US, I think bandwidth can be quite limited if you're not in a major city with fiber.

The other big part is pricing. AWS charges $0.09 per GB for outbound transfer, so getting 8 TB of data out of AWS via the internet costs over $700. Doing it through Snowcone costs around $300 + shipping.

Reasonable use-case would be transferring video material or high resolution sensor data.


It never ceases to amaze me that people pay that much for transfer. It's one tenth the price on DigitalOcean and it isn't like this is comparing two different types of things.


If you have data that you're unlikely to need to access (i.e. backups), not many providers can compete with Deep Archive storage pricing.

4TB on S3 Deep Archive: $3.96/mo to store, $360 to restore and download over the internet.
4TB on B2: $20/mo to store, $40 to download.
4TB on DigitalOcean: $80/mo to store, $30 to download.

Yes, it feels almost extortionary to have to pay that much to get your data out of AWS. But depending how long you store data for, it can end up being the cheaper option: you're just gambling on how likely you are to actually need to download it.
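
Back-of-envelope, using the 4TB numbers above and assuming a single full restore at the end of the retention period, the crossover vs. B2 lands around 20 months. A quick sketch:

    # Break-even sketch using the 4 TB prices quoted above (assumption:
    # one full restore/download at the very end of the storage period).
    deep_archive = lambda months: 3.96 * months + 360   # store + restore
    b2           = lambda months: 20.00 * months + 40

    for m in range(1, 61):
        if deep_archive(m) < b2(m):
            print(f"Deep Archive becomes cheaper after {m} months")  # ~20
            break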


DO is incredibly limited in terms of products compared to AWS.


Yes but this is just data storage. I get it if you need to be on network, yadda, yadda, but so many people just senselessly use S3 and rack up bills that are easily avoidable.


DigitalOcean Spaces doesn't have the same durability design as S3, or the same performance in terms of TPS. It's not apples to apples.

DO is great for small personal projects, but I wouldn't run a Fortune 500 company on it.


What does that have to do with the price of bandwidth?


Can you transfer data out of AWS with Snowcone? I thought it was all transfer in, which is free.


You can transfer out, but you pay the similarly insane egress costs. Somewhat cheaper, but still bonkers.

Snowball/snowcone, US East

$0.03 per GB → ~$300 per 10TB

If you're buying disks at scale that's almost the cost of the disk.

Network egress

$0.15 per GB

$0.08 per GB (>150TB/month)

It makes sense for the occasional transfer, so you're not paying for expensive EBS storage for stuff you don't need cloud access to.


A screenshot in the article implies that you "Export from S3" https://media.amazonwebservices.com/blog/2020/snow_job_type_...


I would assume that this would be for gathering information from remote data collection nodes or from computers that don't access networks. Instead of hooking up with a laptop and some kind of portable NAS, you could connect a snow product to slurp up the data and then mail it to Amazon to upload. I think their selling point is the reliability of the snow family, and not having to worry about the data after the initial upload which can be super critical for some business models.


I've used AWS snowballs fairly frequently and it is definitely a specialty product.

For our use case it was genome sequencing. Each individual genome was a terabyte or so, depending on what types of data we sent. We can crank out a few hundred genomes per week at full production, so when our collaborators need to receive their data, they usually have neither the bandwidth nor the storage capacity for a cohort of thousands. So it gets shipped to AWS.

We've made improvements to get more direct connections to cloud services, so we don't use them anymore, but for a while we were filling a snowball and shipping it every week or two. We also used the Google Transfer Appliance once or twice (480 TB) to transfer projects all at once instead of piecemeal.


It's sneakernet optimized to deliver direct to S3, with full end-to-end integrity checks. Depending on the amount of data you have and your internet connectivity, this can be a significant optimization — say you work with professional photography or video but live in the United States, where only a small fraction of people have an uplink speed greater than 10-30Mbps even if their download speeds are 500+Mbps. If you can generate 8TB of data in less than a month, this is faster and doesn't interfere with all of your other network usage.


We used them in a B2B setting when signing an enterprise data contract for nationwide real estate MLS data, including photos. The photos needed to be uploaded to our S3 bucket, and the only method available to download them was via FTP. It made more sense to ask the third-party provider to download the images to these disks, send them to us, and then we shipped them to AWS.


The video [1] quickly lists a number of use cases.

> Explore the new AWS Snowcone with Bill Vass and Jeff Barr (2:46)

[1] https://www.youtube.com/watch?v=8RNRssCiR_E&feature=emb_rel_...


I recently listened to an interview with the editorial team of Deadliest Catch (Art of the Cut #51). I could see this product being used by similar media productions where thousands of hours of film footage is generated in remote locations with low uplink speeds. Instead of shipping the drives all the way to the post-production house, the drives could be ingested at the nearest AWS facility.


I imagine if you're already integrated into the AWS ecosystem, there's a convenience in treating your on-premise node as "just another node". You can run EC2 instances on it, and manage it using the same tools. But with better latency and more robust local connectivity due to the proximity.

The ability to build AMIs and run them on Snowcones gives you the power to build applications that do all sorts of interesting filtering, pre-processing, and analysis at the edge.

To me the product feels like snake oil and I'm skeptical it will take off in a big way, but I think more than enough customers will use it to justify the investment.

Remember, AWS is selling a platform. The more claws they can sink into their customers, the harder it is for them to switch to a competing one.


I don't get this one, but for the larger Snow products, yeah, it's for transferring data too large to realistically send over the internet.


I'm with you. I think this is really cool but I have a hard time finding a use case for it.

The use case for the other Snow products (Snowball, Snowmobile) are clearer, as they are for transferring petabytes of data to/from AWS without having to use slow internet connections. Snowcone could be used for this indeed, but at only 8 TB, the value proposition seems a lot less. Personally, I'd probably just suffer through having to spend a week doing a slow internet upload/download rather than paying for Snowcone.

Also, the article talks multiple times about using Snowcone to upload the data for you, over your own internet connection. Why would I pay for Snowcone to do that when I could just upload directly to S3, without using Snowcone as the middle layer, and get the same speeds?

The compute aspect of it seems cool from an "I'm a tech enthusiast and this is cool" perspective, similar to how a Raspberry Pi is cool, but I don't see the real-world use case for hosting compute workloads on a rented Raspberry Pi. I would just buy a Raspberry Pi instead for cheaper.

What this does make me interested in is something like this, but purchasable outright without renting, a la AWS Outposts but the size of the Snowcone. It would be cool to have a swarm of Raspberry Pi sized devices that I controlled entirely through the AWS console with AWS services, and it would open up some niche use cases like having a tiny server cluster in places where I otherwise wouldn't have infrastructure, like a remote research camp or rural community.


Disclaimer: I work at AWS, totally different team, and had never heard of this product until this announcement. This is 100% my personal opinion and I'm not operating in any official capacity.

>Personally, I'd probably just suffer through having to spend a week doing a slow internet upload/download rather than paying for Snowcone.

Well, I think there's two things here.

1) A lot of businesses probably won't be willing to spend a week with reduced internet capacity to upload stuff. Things we as single users might be okay with might not always translate to being a good fit for a business overall.

2) My reading is that some of the use cases for this are areas where you are likely to have limited or no internet connectivity.

From https://aws.amazon.com/snowcone/

>AWS Snowcone is built for edge computing and data storage outside of a data center. It is designed to meet stringent standards for ruggedization, including free-fall shock, operational vibration, and more. When sealed, the device is both dust-tight and water-resistant, protected from water jets on all sides. Snowcone has a wide operating temperature range from freezing to desert-like conditions, and withstands even harsher temperatures in storage.

and:

>AWS Snowcone deploys virtually anywhere you need it. It features 2 CPUs, 4 GB of memory, 8 TB of usable storage, Wi-Fi or wired access, and USB-C power using a cord or optional battery. You can put it in a messenger bag, run it in an autonomous vehicle or an airplane, or even attach it to a drone.

So the ruggedization and the ability to run this totally off battery point me towards use cases where there's no existing infrastructure to take advantage of. I guess this is supported by the 'run it in an autonomous vehicle or airplane' bit I'm quoting as well.


>1) A lot of businesses probably won't be willing to spend a week with reduced internet capacity to upload stuff. Things we as single users might be okay with might not always translate to being a good fit for a business overall.

Perfect for us. We used to ship small amounts of data (in the scheme of things) on external drives to Amazon for long-term storage in Glacier. Worked great. That program was dropped and replaced by Snowball.

We tried Snowball and never could get it to work properly in our location. Amazon support couldn't get it to work, either. It was really overblown for what we wanted to do, anyway.

Sending over the wire isn't an option for us.

This is a better solution for us as long as the networking issues are resolved and the pricing works out.


> Also, the articles talks multiple times about using Snowcone to upload the data for you, over your own internet connection. Why would I pay for Snowcone to do that when I could just upload directly to S3, without using Snowcone as the middle layer, and get the same speeds?

Most, or at least many, households in the UK max out around my speed (FTTC), which puts me at about 20Mbps upload. At 2.5 MB/second that's over 4 days per TB, if I've done my maths right. 8TB is over a month.

I'm looking at a house where I'd have an upload speed about 1-2Mbps, so well over a month per TB uploaded.
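
The arithmetic, for anyone who wants to plug in their own numbers (a tiny sketch, nothing more):

    def upload_days(terabytes, mbps):
        """Days to push `terabytes` of data up an `mbps` megabit/sec link."""
        bits = terabytes * 8e12
        return bits / (mbps * 1e6) / 86400

    print(upload_days(1, 20))    # ~4.6 days per TB at 20 Mbps
    print(upload_days(8, 20))    # ~37 days for the full 8 TB
    print(upload_days(1, 1.5))   # ~62 days per TB at 1.5 Mbps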

> It would be cool to have a swarm of Raspberry Pi sized devices that I controlled entirely through the AWS console with AWS services,

You used to be able to manage things on opsworks outside of AWS if you installed a client. Maybe you still can do this kind of thing - possibly AWS Systems Manager for non-opsworks things? I get a bit lost in all the services these days and their restrictions.


Disclaimer: AWS employee, don't work on any of the Snow products

My guess is for situations where you don't have a decent internet connection. Like some remote research base way out in the middle of nowhere. The other major application I can think of is for transferring data to air-gapped networks. You could use Snowball for this also, but that would be overkill in a lot of cases. I think for air-gapped networks this is meant to fill in the niche between a USB drive and something like Snowball.

EDIT: expanding on that off the top of my head I can think of a couple other applications. If you're using long range drones with lots of sensors (think RQ-4 global hawk, but also weather monitoring and the like) you're generating way more data than you could stream over a satellite uplink. So you could put a snowcone (or maybe several) inside the drone and use them for storage of all that raw data. On landing you can remove the snowcone(s) and ship it off to Amazon and all that raw data is available on S3 the next day. There's a bunch of other defense-type applications. Embassies could transfer these in diplomatic pouches for top secret information. In fact, if you wanted to be particularly secure but wanted lower latency, you could use the snowball to transfer a bunch of random bytes to use as a one time pad, which the embassy or intelligence center could use on demand as it transfers data back.


I would guess the real use case is: you considered becoming a Snowmobile customer but that's too heavy-duty and you need something for smaller amounts of data. Or you are a Snowmobile customer but sometimes you don't need to back up the truck for a relatively modest-in-size update.

So AWS made this.

Now they have an offering for you & everyone like you, that they can pitch to CxO execs, especially those who came from Snowmobile-using companies.


AWS already had Snowball available as an intermediate level:

https://aws.amazon.com/snowball/


For upload it doesn't make much sense. For download it's less than half the cost compared to AWS outbound bandwidth to the internet.


Holy hell. I'm looking at the pricing example on this page [0], and at $460 to transfer out 15 TB, you're telling me this is less than half the price of normal internet outbound traffic?

I knew AWS bandwidth costs were absurd, but it wasn't until seeing those numbers that it really hit me just how absurd they are.

0: https://aws.amazon.com/snowcone/pricing/


Yeah, come for the cheap compute, stay for the expensive network costs... forever.


A residential cable connection realistically offers something like 25Mbps down/3Mbps up. That's 246 days to upload 8TB!


According to Speedtest.net [0], the average US internet speed is 138Mbps down/51Mbps up. That's certainly not amazing, but on average that's only ~2 weeks to upload 8 TB.

I can certainly see where this makes more sense if you're below that average, though.

0: https://www.speedtest.net/global-index/united-states#fixed


I would add the standard caution when looking at averages, that symmetric 1/1 fiber may be pulling the upload value up significantly. Or that Speedtest users have higher internet speeds.

But I really want that stat to be true. The world of American ISPs is sad and depressing.


Average is useless here, because upload is so very restricted on some connections, but not others.

The first google result for median says it was 60/5 two and a half years ago. If you can sustain 90% saturation that's 164 days. Over half a year at 80%.


Also, what type of customer runs a speed test, normally? Those on the more technical side, so those running the test are likely to have higher-tier packages from their providers, further skewing the data.


Wow! That's very surprising. It's been a while since I lived without fibre; maybe DOCSIS got a lot smarter in the last few years?


It hasn't in the US. Comcast's 1G down is still 10Mbit up


Mine is 35. Still terrible.


> It would be cool to have a swarm of Raspberry Pi sized devices that I controlled entirely through the AWS console with AWS services, and it would open up some niche use cases like having a tiny server cluster in places where I otherwise wouldn't have infrastructure, like a remote research camp or rural community.

You’re looking for Greengrass, which is hidden away in the IoT tooling AWS provide. It allows you to run Lambda functions and Docker containers on Raspberry Pi (and smaller) sized devices, controlled entirely through the AWS console.


Snowcone supports Greengrass, in fact


My employer has locations all over the world and provides services to our customers. Some of these are AWS customers as well and depend on a small edge compute device that they control and know is compliant with this and that.

For them Snowcone is perfect as it's much cheaper than Snowball and does the job of trickling data back home.


> Hardware failures on amazon.

“Someone else’s problem”

That's what you're paying for on any cloud provider. Any hardware problem isn't your problem; it is (or should be) abstracted away.


That's true for hardware that's actually "in the cloud" (aka at an AWS data center) where a hardware failure simply results in your workload automatically shifting to other hardware. But if a Snowcone breaks on you, it still certainly is your problem, because you're the one that now has a physical brick in your office and you need to find a replacement hardware for whatever you were previously using Snowcone for.


My bad. I didn’t actually read the article!


> So is this simply a sneakernet external harddrive because physically shipping an 8TB hard drive has better bandwidth than using the internet?

Yeah. When you have to migrate 100TB data from your own data store to AWS it is much easier and practical to just send the drives. They even help you with that, for a price of course.


I'm thinking defense / electronic warfare. All the data we'll be collecting in the field needs to travel back to a data center somehow. I'm not sure what the rules of the game are for on- vs. off-premise storage in that scenario, but I still think this is one of the main demand drivers.


When talking about petabytes of data, it is definitely faster to physically offload the data, ship it, and physically load it at an edge location.

Regarding the size of the data: well, yes, there are companies with petabytes of data. Banking is one example that comes to mind when thinking about petabyte-scale data.


When you realize that your data is going to take 24 months to copy elsewhere (e.g., the cloud or another DC) over the fastest link you have, you have a real need for something like this.


I had to back up several years of training data off-site, and we used Snowball to ship a few TBs of images. Not sure it was completely necessary, but it was fun.


It's physical edge storage, wasn't the use case clear? You can have the cloud right next to you! An entire node of a content distribution network, by itself in your rugged remote environment, but with you!

You can cloud compute separated from the cloud, but together, with your snowcone!


The e-ink display part triggers thoughts of an Amazon of the future that uses entirely reusable boxes to ship things to consumers. Hard plastic containers w/ e-ink labels. Boom, no more millions (billions? trillions?) of cardboard/plastic boxes used to enclose packages. Still millions/billions/trillions of boxes/plastic packaging for the items themselves, but at least the transit envelope would be reusable.


At one point, Amazon was arranging pickups of the boxes if you filled them with charitable donations (clothing?). Probably not 1 for 1, but it’s something.


They tried reusable plastic boxes for Amazon stuff in Seattle years ago. It was a pretty colossal failure. People just kept the plastic boxes and used them for stuff around the house.

"Bill the people who keep the plastic boxes," is the oft-heard refrain!

What if they're stolen off your porch due to living in a not-great neighborhood? You still owe?


Bill upfront and credit back on return. Amazon shouldn't be responsible for the stealing happening in your neighbourhood.


They often shoulder that responsibility to reduce friction, and I think it works net-positive in their favor.

Billing the customer and then penalizing them for being a victim of theft introduces so much friction and frustration.


That is true, and beside the point, which is that they are not responsible.


It'll happen eventually... just attach wheels and some AI like this company: https://www.starship.xyz/


That's a cool idea. I wonder how many re-uses a box like that would take to make it cost effective over the same amount of cardboard. Should be feasible I'd think.


If the cost to return the box is more than a cardboard box, it's not feasible.


A lot of Amazon users get daily/weekly drop offs, and increasingly via Amazon’s own logistics.

Seems fairly trivial to have those drivers pick up last week’s carriers when they drop off this one’s.


They did that with Amazon Fresh, and (at least from my perspective of several years ago) I think the asymmetry between # of packages and frequency of delivery didn't work out for them that well. We'd end up holding onto a stack of 4-5 crates for a week or so, which means Amazon would have to float probably a ton of these across their customers. That's all a guess though, but these days we get deliveries in plastic/paper bags, no reusable containers, so something must not have worked out.

That said, grocery delivery is much more "bursty" than package delivery.


Not necessarily. If the box costs 10 times as much, but they come up with a way to use the box 12 times, it was worth it.

That won't work if the recipient doesn't have a need to mail something back, but for a use case where you expected most boxes would be shipped back (maybe a phone repair company?), it could absolutely make sense.


The post you're saying "not necessarily" to has nothing to do with the cost of the box itself. If returning it costs too much, the box could be free and you'd lose money.


Just make it up in volume! ;)


This must be where all the traded-in Kindle screens are going. I was surprised Amazon was willing to pay me $25 for my 7-year-old Kindle, but I guess it's cheaper than manufacturing new eInk displays.

Maybe I'm a germophobe, but isn't it kind of gross that this thing is shipped in no container and then put on your desk?


> Maybe I'm a germophobe, but isn't it kind of gross that this thing is shipped in no container and then put on your desk?

How long do germs last on an e-ink display?

I know on paper and cardboard and such it's not very long. From what I can remember, they die while still in the mail stream.

If you're concerned, hit it with some Lysol or soap and water.


I think parent commenter isn't just worried about the e-ink display, but the general fact that the entire device itself is going to be sitting in dirty shipping containers, warehouses, trucks, etc without any kind of covering. I know I've gotten some shipments before that have stains of who-knows-what-liquid soaked into the cardboard.

I'm not a germophobe, but even so, this probably isn't a device you want to be handling while also eating a sandwich. Lysoling it is probably wise.


I’m a germaphobe but it’s probably not worse than say reading a postcard, or putting your laptop in an airport bin, or your backpack under an airliner seat where people put their feet


or handling cash money


What strikes me is that the e-ink screen could update during shipping. Seems like shippers would want to avoid that possibility.


I think in the typical use case it's going somewhere that's equally dirty or worse, but if not, I believe it is water resistant and can be washed with a soapy sponge.


Is it dishwasher safe?


The water in a dishwasher can get up to 60C, which is above the maximum storage temperature of most consumer electronics.


Just think of all the “snow” products as a modern version of sneaker net (it used to be faster to run down the hall with a disk than transfer files over the network because networks used to be really slow).

The snow products solve the same issues. Despite fiber connections and such it just doesn’t make sense to transfer massive volumes of data over the network. It’s often literally faster and cheaper to ship a box of hard drives via UPS or FedEx.

Snowball was for big sets of files and is the size of a suitcase. Snowmobile is for petabyte scale and is literally a tractor trailer full of disks.

The use case here seems to be more towards remote situations with smaller data. You have something that collects a lot of data and need to get that into your cloud. Instead of running around with a bunch of portable hard drives and then having someone transfer the data manually to S3 over the internet, you just dump your data into the snowcone and hand it to your local UPS guy and let AWS take care of the rest. Lots of remote data collection devices and such would fit into that model.

Clearly the use case is rather specific but for people in the business of collecting data on stuff and then needing to get it into the cloud this is actually a nifty little device.


My AWS Snow Family visual notes: https://www.awsgeek.com/AWS-Snow-Family/


Neat. What do you use for drawing those diagrams?



Interesting, at the very least, for retrieval of backups from S3 Glacier Deep Archive, because the Glacier retrieval fees with bulk transfer are only $0.003 (yes, 0.3 cents, not 3 cents) per GB. The thing that kills the backup use case is the $0.09 per GB bandwidth cost to the internet. The Snowcone brings that down to ~$0.037 per GB + shipping if you use the full 8TB per device ($60 job fee / 8,000 GB, plus the per-GB transfer fee).

And that includes a $0.03 per GB bandwidth fee from S3 to the Snowcone that I guess they're going to reduce over time, since it's all on their internal network.


This is nice! I want a homelab for my data, but I like AWS. It'd be cool to run my own personal database on this and only run pg_backup to S3. RDS is too expensive for personal projects, but I'm not sure free Heroku dynos and such are enough for raw data processing. So yeah, I guess if AWS had a tiny RDS with the primary node on a Snowcone, I'd buy one. Otherwise I might buy one later and install my own database on it.

Not sure where AWS is going with this, but I'd like to see AWS offer a tiny version of AWS Outposts, where you can get any kind of AWS service in a box.


My thoughts exactly. Seems like a great way to have a local DynamoDB for the team to collaborate with. I wonder how strong the sync to cloud story is.


I wonder if AWS are shooting themselves in the foot (if these things become very popular), by making the "Cloud" a physical, tangible thing. I think part of the lustre for some customers is that they don't know what they're paying for when they start a 2 vCPU EC2 instance and must think it's something crazy complex and special. Now having it on your desk in a tiny little box will make them wonder what they pay so much for.

The other thought I have is that maybe there's a market for shipping around bytes in mail boxes not just between a business and AWS, but just any people and businesses. I've seen B2 and Dropbox (I think) also have these "we'll ship you a drive" things, but maybe they'd outsource that for example to a third party who just did it really well and cheaply.


One of the main reasons that we're running our own hardware is because it's difficult and cost prohibitive for us to get our data to a cloud provider.

We're running a sort of sneakernet between our data collection agents and our data ingestion locations: each day, each location receives 2-4 encrypted SSDs (Samsung T5) with up to 1TB of data. That then gets uploaded to our central location (overnight) for processing, and the next morning they're drained and ready for the next mission.

If Amazon had launched this earlier, and our cashflow (or funding :P) a bit better then maybe we'd have opted for running a constant stream of these snowcones to Amazon. Though processing costs are also a big concern, the cloud providers are at least twice as expensive as running metal, even when looking at 1 year paying for the hardware up front, and if you're cost sensitive when buying the hardware it could be 3-4 times cheaper than the managed cloud.

What I'd be afraid of with this service is losing them in the mail. I wonder if they've got a system where you could mirror two snowcones before you send one to them.


Throw a red LED on top and it's literally "the internet" from The IT Crowd


I don't think that's a very realistic outlook. Most money spent in aws is not by noobs going over their free tier not realizing they could have bought a raspberry pi; it's by institutions that are paying for convenience and know it.


if they don't do it, someone else will


Interesting. So you can launch an EC2 instance with 8TB storage in your closet, or handbag, or under the desk at your medical facility, or whatever. I gotta say, sounds kinda fun.

Maybe would be handy for serving video content etc on a local network?

Or if you have a network of video cameras not connected to the internet, use this for daily/weekly collection of recorded content?


What you describe is NAS. Popular brands give enough resources to run VMs, Docker containers, even perform 4K transcoding etc.


Take a look at Synology devices.


I'm in the process of starting an ISP right now, so I've just been getting prices for dedicated fibre lines. So let's do some quick math.

A 1Gbps dedicated (uncontended) fibre line in Europe seems to be in the region of $/€/£600 per month. How much data can one push on that per day (assuming 1 Gbit/s ≈ 100 MB/s real world)?

In one day: 100 MB/s × 60 × 60 × 24 = 8.64 TB

So call it one snowcone per day. In theory you could send 30 snowcones a month, a total snowcone value of $60 × 30 = $1,800.

However, at $60/snowcone, you would have to send 10 snowcones per month before your own dedicated line breaks even.

So yeah, if you're sending 10 or more per month, every month, consider getting a dedicated line. But that seems like an unusual use case to me.

There's also a bunch of organisational reasons to use something like snowcone, I'm sure.
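
Putting that break-even in a few lines (all figures are the rough estimates above, not real quotes):

    line_per_month = 600.0   # dedicated 1 Gbps fibre, estimate from above
    per_snowcone = 60.0      # per-device job fee, ~8 TB usable each

    break_even_devices = line_per_month / per_snowcone
    print(break_even_devices)       # 10.0 devices per month
    print(break_even_devices * 8)   # ~80 TB/month crossover

So below roughly 80 TB/month of steady transfer, the devices win on price alone; above that, the dedicated line starts paying for itself.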


This is pure (wild) speculation given my experience with data needs that seem to fit the product. I bet one (the) customer profile for this was military or intelligence (before it was available to the public).

I say this given AWS's prior gov/CIA work. Securely capturing, transferring, and maintaining chain of custody for intel from various media and devices captured during raids or other activities was (several years ago) a total shit show. We had a closet full of trash bags from various raids. The analysts weren't in the field, and the op tempo and slow data rates prevented us from making good use of most of the data, some of which was time-sensitive (we would find out weeks later). Also, some of the data captured was to be used as evidence in the host nation's legal system (I don't know if that ever happened) or (presumably) GITMO. I was just a grunt at the time, and this was a long time ago, but I bet the problem still exists.


I'd love to see an "expedited import" option for this, where the device gets shipped to an AWS facility right near the carrier overnight hub and imported into S3 that night, landing in S3 by ~2am and able to be processed and delivered to customers before they get to work the next morning.

For many remote data collection activities this could remove the need for expensive and long lead fiber installation.


This is really cool. Is there documentation somewhere of what exactly the compute hardware looks like inside? 2 CPUs and 4GB of RAM, but is it x86 or ARM (I presume x86, but since it has to be on battery power..)? What size of EC2 instance can fit on there?


From the link, instance types are:

- snc1.micro, CPUs: 1, memory: 1 GiB

- snc1.small, CPUs: 1, memory: 2 GiB

- snc1.medium, CPUs: 2, memory: 4 GiB


The table says CPUs, not vCPUs. Unless they're using single-threaded, single-core processors, I think snc1.medium would equal at least 4 vCPUs. I don't think they use the term vCPU for Snowcone, so in reality it has zero vCPUs. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance...


Oops, you're right. I'll fix my comment.


Whoops, missed that entirely. Thanks!


I have no opinion on the necessity of this product (sucks that 100Gbps networks are still confined to datacenters), but did they reuse a backlit Kindle display for the screen that becomes the shipping label? That's neat, I love it when people take advantage of things other parts of their company did.


That's the part that I thought was truly creative.

What happens when the eink display is damaged or stops working during shipping though? I'm presuming if dynamic shipping labels like this ever become common, shipping companies will need their own independent identifier or a standardized physical identifier they can use to pull up shipping information from should such a case occur.


I'm guessing that shipping companies have some procedures in place today for dealing with packages with damaged labels. Even paper labels can be ripped, scarred, or otherwise damaged beyond a readable state.


Oh, fun! When a good Chinese manufacturer duplicates the form factor, someone can ship clones of these laden with malware for spear-phishing attacks on corporate, industrial, and classified networks.

> I connect the Snowcone to the power supply and to my network, and power up! After a few seconds of initialization, the device shows its IP address and invites me to connect:

With a convincing e-Ink display, and an unusually long delay, you've probably got a good 10 minutes on this local network to 0-day your way into routers and client machines.

> Next, I download AWS OpsHub for Snow Family, install it, and then configure it to access the device. I select Snowcone and click Next:

By this point you make the device self-brick and print out an error about how it needs to be sent back to Amazon for repairs, due to being banged up during shipping. Extra scuffs or dents in the case will sell this. While the replacement is ordered, use your now client-resident malware to exfiltrate data as you like, since you know there's data worth them copying offline. Or trojan up every data format you see so that after the data is moved out via a working Snowball you can eventually find an internet-connected device to exfiltrate with.


This seems like a legitimate concern. I didn't see any indication that the snowcone presented a code that would be verified through the aws opshub application, or a similar operation. Is this happening behind the scenes, maybe in the manifest file you upload?

If this was swapped in shipping it could potentially just work as expected while exfiltrating data whenever it could connect to the internet, potentially through builtin lte, etc.


Isn't that the same with any hardware you get? Someone could intercept your server from Dell and put malware on it.


I think that local storage of personal data ("data most important to you") will be a huge trend in the next few years for both homes and offices, especially if they can back up data to other trusted devices (in other homes and offices of trusted people). As always the problem is going to be usability - the AWS ecosystem is not friendly to non-techies.


I don't mean to sound snarky but isn't "local storage of personal data" what off the shelf NAS devices (QNAP, Synology, WD..) have been doing for almost two decades?

Even the backup thing, last time I tried they all had some pretty simple UI built around a rsync fork


I was thinking more having a "personal cloud" between trusted connected devices at different sites.


Being able to buy/rent an AWS outpost of this size for home use would actually be pretty neat.


> the AWS ecosystem is not friendly to non-techies.

nor is it supposed to be. AWS is tools to build end-consumer products, not an actual end-consumer product itself.


I actually think it's not so unfriendly. I have a stock trader friend who keeps all of his transaction records in S3 buckets, and he's not a 'techie'.

It's akin to having an understanding of file systems, just that it's in the cloud this time. I'm sure a lot of non-techies understand enough about file systems to make what they need work, and the same goes for the cloud.


I wish people would stop thinking of S3 as a filesystem because it isn't, and in ways that can cause serious issues. I can't tell you how many questions I get from devs asking "how do I rename a file in S3?" You can't. They're not files. It's not a filesystem. The folders you see on the web interface are just sugar, but if you have only a superficial knowledge of S3, you might be tricked into believing it behaves like a filesystem.

S3 is an eventually-consistent object store. We need to treat it as such.
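
For the devs asking about renames: the closest you get is a server-side copy to the new key followed by a delete of the old one, e.g. with boto3 (bucket and key names here are made up; note that copy_object caps out at 5 GB per call, above which you need a multipart copy):

    import boto3

    s3 = boto3.client("s3")

    # There is no rename. Copy the object to the new key server-side,
    # then delete the old key. Not atomic: both keys exist in between.
    s3.copy_object(
        Bucket="my-bucket",
        Key="reports/2020-06.csv",
        CopySource={"Bucket": "my-bucket", "Key": "report-june.csv"},
    )
    s3.delete_object(Bucket="my-bucket", Key="report-june.csv")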


You're exactly right, but non-techies don't distinguish nearly as much: file systems or S3 buckets, it's just a place to drop stuff and get it later (or not). In this case, coincidentally, non-tech people have much less of a barrier to the concept, because they never really understood file systems in the first place. They understood them enough to make their own stuff work, and honestly that's good enough for them.


I tried to use Snowball as part of a customer facing solution, but the lack of predictability around turnaround times made it impossible for me to justify the expense and overhead.

It is an awesome box and an awesome solution to a real life problem. I really wanted to love snowball.

I was not filling them all the way up by any means, so the turnaround had to be fast enough for me to justify it over just uploading to s3 over slowish connections.

The interface in the console was very opaque and gave no information about when the boxes would get shipped out.

I had weeks long delays with zero contact when the box types I wanted were out of stock and only found out that was the cause when I cried to support.

I also had boxes stall at the import stage after they had already been shipped back.

The software to transfer was also just ok.

I think with more love this can be a great tool, but there are some things that could make it better.


AWS Snowcone pricing here: https://aws.amazon.com/snowcone/pricing/


Hmm. A $2,000 loss fee.

I sure hope they don't use Amazon delivery. They'll deliver stuff to other houses, complete with a "delivery proof" picture of the wrong house and claim that you must have lost it.


Ironically, you'll notice all the pictures of the Snow-family devices have UPS labels.

E.g. https://media.amazonwebservices.com/blog/2020/snow_luna_1.jp... and https://www.slideshare.net/AmazonWebServices/new-launch-intr...


$60 to get 8TB into AWS is pretty good. Unless you have incredible upload speeds it's massively faster than uploading that much data.


It's definitely not cheap. As always with AWS.


Is this the beginning of mini self-hosted clouds where the orchestration happens in a web UI but the compute/storage is local? Like a cheap version of Azure Stack? I get that you can’t keep the snowcone, but if you could, it would be a little VM host with local storage on your LAN that could let you run everything internal.


I can't imagine that becoming more popular than just a Drobo-style device that runs its own software. Seems ridiculous to make your local hardware dependent on the cloud. (Obviously this is common for IoT devices, but it seems that anything like this for enterprise or prosumer doesn't usually fly.)


It looks like you can keep the Snowcone. One of the screenshots offers this option:

> "Perform local compute and storage workloads, without transferring data. You can order multiple devices in a cluster for increased durability and storage capacity."


At $6/day it's fairly expensive for non-temporary workloads.


They have a pricing example for this use case which includes the following caveat:

> First, you should contact AWS Sales to discuss private pricing for the long-term AWS Snowcone deployment


Or you could pay the $2,000 lost device fee.


I'm guessing the devices marked as lost don't continue to function (at least not once they connect to the internet again)


Has no one heard of NAS devices that can run docker containers? Synology Play?


I'm not sure whether I'm more annoyed by AWS' product naming scheme or the JS/Ruby/etc. community's cutesy project names.


I sometimes wonder if AWS product naming is driven by more than just cuteness or habit. Obscure names and acronyms could be an intentional strategy to tie users to AWS.


Wow, this could be awesome for running operation teams at Burning Man!

(Seriously, at the moment some teams literally offline their server and physically take it to the playa to run there instead of figuring out connectivity there. Very little of it uses newer tech.)


It's very tangential but what is the device shown next to the drive in the first picture? https://media.amazonwebservices.com/blog/2020/snowcone_jb_st...

I assumed it was there to give a sense of the relative size of the drive, but after staring at this picture for two minutes I genuinely have no idea what it is.

Actually at first I even wondered if that was the drive, since it's vaguely more in the shape of a snowcone than the drive itself.


I believe that's a water bottle


I thought the same thing, to my embarrassment.


Thermos bottle for keeping drinks cold or warm.


Water bottle or thermos?


It's a water bottle


Here is what I don't get about this product line. The docs say the encryption is done on the attached workstation, which means your network transfer speed is going to be limited if you don't have a pretty high-end machine with at least AES-NI. What is the point of having the processor and all that in the device itself if you end up needing a high-end workstation anyway to get the maximum performance out of the thing?


It’s a trade off between speed and security, and my assumption is that they’ve chosen to do it this way so that the encryption key isn’t sitting on the device during transit. By doing it this way even if someone can intercept data while it’s being shipped between the customer and AWS all they’re going to get is a disk full of random bytes.


The device doesn't need to store the key to do the encryption. It can be given the key when you boot it and then clear it before shipment.
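
To make the trade-off concrete, here's a toy sketch of workstation-side encryption (using Python's cryptography package, not AWS's actual tooling; file names are made up). This is the work that lands on the workstation's CPU, which is why AES-NI matters upthread, and the device only ever sees ciphertext:

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    # The key lives only on the workstation (AWS actually manages device
    # keys via KMS); the box in transit holds nothing but ciphertext.
    key = AESGCM.generate_key(bit_length=256)
    aesgcm = AESGCM(key)

    with open("sensor-dump.bin", "rb") as f:
        plaintext = f.read()          # toy example: whole file in RAM

    nonce = os.urandom(12)
    ciphertext = aesgcm.encrypt(nonce, plaintext, None)

    with open("sensor-dump.bin.enc", "wb") as f:
        f.write(nonce + ciphertext)   # ship this; keep the key at home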


Something related from my previous job, for data storage there's Serverpack 35 from Acromove. 120 TB per box, suitcase form factor, includes tracking and remote control. You seal it and you load it in courier car.

https://acromove.com/products/serverpack-35/

They also have Serverpack Edge which is supposed to do the same.


Didn't I see this on "Silicon Valley"?


This could fit into the SV show very well:

"we invented on premise cloud"

"you mean It's just a small server?"

"no no it's like having a bit of cloud in your home"


Bezos signature edition coming soon


> The AMIs must be made from an instance launched from a CentOS or Ubuntu product in AWS Marketplace

I wonder what's special about these images.



Anyone know what the FCC ID is? I would love to see the FCC application teardown but can't find the FCC ID anywhere.


I doubt it has its own FCC ID. It probably uses an off-the-shelf wifi card to avoid having to do FCC testing.


The 21st century's pneumatic tube!


So you need to have Windows or Mac OS in order to use this? No love for Linux?

https://aws.amazon.com/en/snowcone/resources/


If you're on Linux, you're probably perfectly happy to use the `aws` command-line tools. No need for fancy graphics. :)


Didn't know they supported it as well. Thanks!


Seems relatively risk-averse. Glad they're providing more variety in their data transfer services, but this really does seem like an extremely niche system that not many companies will fully utilise.


Hey Jeff, curious what the battery life of this device is?

I'm on a team that has a need for remote compute power, but getting to our devices on anything more frequent than a monthly basis is sometimes a challenge.


It sounds like the device does not have an internal battery so it depends on the capacity of the battery that you attach to it.

https://aws.amazon.com/snowcone/features/

> For a light workload at 25% CPU usage, the device can run on a 65 W battery for approximately 6 hours.

The docs are a little inconsistent on wattage requirement. The feature page above says "60 W+ (20 V) USB-C power adapters", but in the docs:

https://docs.aws.amazon.com/snowball/latest/snowcone-guide/s...

> any USB-C power adapter that is rated for 45W+


My team may also be looking for something with similar requirements in the near future; do you have any recommendations for solutions that you're considering? I know a number of earth science researchers who have mentioned ideas that would really benefit from that sort of thing as well.


Answered my own question! Thanks for posting! Has some cool applications for sure, but too much draw for set-it-and-forget-it types of operations.


On behalf of future searchers trying to avoid the "nvm I figured it out" problem, what was the answer?


All of this makes sense to me, except I have no idea how they fit 100 TB of capacity in a box that small. Aren't HDDs maxing out around 16 TB these days? They couldn't fit 6 drives in there.


You can fit 100 TB in a Snowball, not the Snowcone, which maxes out at 8 TB.

Having dealt with a few Snowballs from time to time, there is zero doubt they have space in those things for 100 TB. They're not small or lightweight devices.

I'm super happy to see this device. It fills a niche a lot of my work falls into nicely.


Not sure where you're seeing 100TB?

> AWS Snowcone weighs 4.5 pounds and includes 8 terabytes of usable storage


Which makes it interesting in the other direction. If this box is worth two hundred dollars a month and two thousand dollars upon loss, why not put in a drive twice as big for an extra $200?


I was curious about this too, but going back now to take a closer look at that screenshot, those are three other Snowball models listed in the table, not three more Snowcone models.


If they really wanted to fit 100TB in a box that size, they could probably do it with flash. It would just be ludicrously expensive.


What happened to the Google/AWS services where you could ship a raw HDD? It seems like they've both retired that service in favour of more expensive custom hardware.


The HDD solution was problematic according to: https://news.ycombinator.com/item?id=23554752


Sort of EC2/EBS in a local NAS device? Interesting.


Is there any open hardware equivalent to this? I can see it being very useful for crowdsourced archiving projects.


It's funny hearing Amazon Polly trying to pronounce "vCPUs" like "vee-pus".


At first I thought it was the bottle.


"raspberry pi but with NSA"


At first glance, I thought Amazon was shipping a Kindle to customers with storage as a bonus.


Capitalizing on the fact that sneakernet is still faster than Internet.


Tactical edge computing?


They market this stuff aggressively to the government as a military IT solution.


Computers and software are essential to ISR, mission planning, and personnel.

Starlink will help with this, but sat data connections (IME) are perpetually saturated.


Is this used for AI type scenarios as a data buffer to s3?


What's the price?


$60 + $6 per day after 5 days. It adds up, examples at bottom: https://aws.amazon.com/snowcone/pricing/
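
In code, based on the numbers quoted above (my reading of that page, not an official calculator; the first 5 days are included in the $60 job fee):

    def snowcone_job_cost(days_on_site):
        # $60 covers the job and the first 5 on-site days; $6/day after.
        return 60 + max(0, days_on_site - 5) * 6

    print(snowcone_job_cost(5))     # $60
    print(snowcone_job_cost(30))    # $210
    print(snowcone_job_cost(365))   # $2220: why long-term use needs private pricing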



I see this and think "oh look, a surveillance company made another spy device they want me to buy and put in my home."


It's a RasPi that glows in the dark



