What's the best AWS S3 protocol alternative?
51 points by nickolov on May 31, 2023 | 34 comments
That's it, with batteries included, such as file grant privilege/permission control or presigned URLs with path support.
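
For readers unfamiliar with presigned URLs, here is a minimal sketch of how a query-string presigned GET URL is built under AWS Signature Version 4, using only the Python standard library. The function name `presign_get`, the host, and the credentials are hypothetical placeholders, and S3-compatible services may deviate from this in details, so treat it as an illustration rather than a reference implementation.

```python
import hashlib
import hmac
from urllib.parse import quote


def _hmac(key: bytes, msg: str) -> bytes:
    """HMAC-SHA256 helper returning raw bytes."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def presign_get(host, path, access_key, secret_key, region, amz_date, expires=3600):
    """Build a SigV4 presigned GET URL (hypothetical helper, sketch only).

    amz_date is a UTC timestamp formatted as YYYYMMDDTHHMMSSZ.
    """
    date = amz_date[:8]  # the YYYYMMDD part scopes the signing key
    scope = f"{date}/{region}/s3/aws4_request"
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    # Canonical query string: sorted by name, strictly percent-encoded.
    canonical_query = "&".join(
        f"{quote(k, safe='-_.~')}={quote(v, safe='-_.~')}"
        for k, v in sorted(params.items())
    )
    canonical_uri = quote(path, safe="/-_.~")
    canonical_request = "\n".join(
        ["GET", canonical_uri, canonical_query,
         f"host:{host}\n", "host", "UNSIGNED-PAYLOAD"]
    )
    string_to_sign = "\n".join(
        ["AWS4-HMAC-SHA256", amz_date, scope,
         hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()]
    )
    # Derive the signing key: date -> region -> service -> "aws4_request".
    key = _hmac(("AWS4" + secret_key).encode("utf-8"), date)
    for part in (region, "s3", "aws4_request"):
        key = _hmac(key, part)
    signature = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"https://{host}{canonical_uri}?{canonical_query}&X-Amz-Signature={signature}"
```

Note that the signature covers one exact object path, so a URL signed for `/a/b.txt` does not grant access to anything else under `/a/`.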



I believe Ceph [1][2] could be a good alternative. It can be self-hosted, and I believe some cloud providers also offer it. Here are some differences between S3 and Ceph [3].

[1] - https://ceph.io/en/

[2] - https://docs.ceph.com/en/latest/radosgw/s3/

[3] - https://www.lightbitslabs.com/blog/ceph-storage/


And incidentally Ceph supports both S3 and Swift (which I think is the best S3 protocol alternative):

https://docs.ceph.com/en/latest/radosgw/swift/


DigitalOcean Spaces uses Ceph as its backend and it is NOT stable (at least with s3fs). I had to migrate away from it, could not do a full stack on DO (aka "no AWS"), and had to swap in Wasabi.


Just here to rant about 'S3 compliance' a little - I have no real awareness of the topic at hand.

Tangentially, this is a lament about the lack of choice.

Last night I tried to set up Backblaze (S3 'compatible') behind Mattermost.

That compatibility doesn't go very far when clients rely on vendor-specific headers.

Backblaze won't work with clients that try to send x-amz-sdk-checksum-algorithm; Mattermost taught me this.

TLDR: S3 isn't always S3


Did you happen to submit this bug to Backblaze for triage? If not, I'll get it to the right people!

> When you're using an SDK, you can set the value of the x-amz-sdk-checksum-algorithm parameter to the algorithm that you want Amazon S3 to use when calculating the checksum. Amazon S3 automatically calculates the checksum value.

> When you're using the REST API, you don't use the x-amz-sdk-checksum-algorithm parameter. Instead, you use one of the algorithm-specific headers (for example, x-amz-checksum-crc32).

https://docs.aws.amazon.com/AmazonS3/latest/userguide/checki...
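
To make the quoted distinction concrete, here is a small sketch of computing the value of the algorithm-specific `x-amz-checksum-crc32` header by hand, as a REST client would. The helper name `crc32_checksum_header` is hypothetical; the assumption is that the header carries the base64 encoding of the big-endian 32-bit CRC32 of the request body.

```python
import base64
import zlib


def crc32_checksum_header(body: bytes) -> dict:
    """Return the x-amz-checksum-crc32 header for a request body (sketch)."""
    crc = zlib.crc32(body) & 0xFFFFFFFF  # unsigned 32-bit CRC32
    # Assumed encoding: base64 of the 4-byte big-endian CRC.
    value = base64.b64encode(crc.to_bytes(4, "big")).decode("ascii")
    return {"x-amz-checksum-crc32": value}


headers = crc32_checksum_header(b"hello world")
```

A client that sends only this header, and not `x-amz-sdk-checksum-algorithm`, may still be rejected by a backend that implements neither, so check the provider's list of supported headers first.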


I hadn't considered it! Your username is appropriate, too much to do :D

My requirements are pretty low so I was likely to just let it slide...

...but with your help (and perhaps from others on understanding the pieces), I'll consider it!

I'm remarkably unfamiliar with S3-style services; I only dabble occasionally.

Edit: Choosing the appropriate headers may come down to Mattermost/clients -- Backblaze might technically be doing the right thing by saying "I don't know what to do with these".


What can I say, I want to solve all the things :) I will report back.


+1

Please do. Backblaze lists x-amz-sdk-checksum-algorithm as unsupported [1]. Would be great to have it supported to be able to use it with Mattermost and other tools that use min.io for S3.

[1] https://www.backblaze.com/b2/docs/s3_compatible_api.html


Backblaze is not a good choice for most things unless you are using it strictly for backup. Their S3 compatibility, like most S3 compatibility, was a cruel joke last I looked at it.

I publish a mildly(?) popular offloading plugin for WordPress and adding Backblaze support was my biggest regret.


That's unfortunate to hear... I assumed the API to be more standardized in the strictest sense.

I was truly surprised to see Amazon branded headers in something claiming to be 'standards compliant'

They may be optional, but their presence simply complicates things; particularly when clients opt into them.

It devalues what is common/reused, locking you into either Amazon or another protocol entirely!


It’s also very, very slow.

I switched to Wasabi and am using the C++ AWS SDK. I’m impressed by their performance

My setup is symmetric gigabit domestic fibre from a 2013 MacBook Pro.


Depends what you are after.

For public access, something like WebDAV is probably the lowest-hanging fruit.

However, it's still HTTP-based, so not great for speed or partial writes.

HDFS is not worth the hassle. The same goes for Ceph: it's a lot of work for worse performance than ZFS/pNFS.

If you want speed and POSIX, then something like GPFS from IBM is where to go; failing that, Lustre, which you can get from AWS if you want to try it.


The wording is a bit unclear, but I believe they were asking for an alternative to AWS that still uses the S3 protocol, not for an alternative to the S3 protocol.


I assumed the opposite. But now they have plenty of answers for both cases.


Protocol alternative? What are you specifically looking for?

It's object storage; the protocol is not that nuanced. Any other storage product will have its own quirks. Google Cloud Storage has an XML API that is similar enough and has signed URLs, and just about every enterprise storage company has an "object" product.


Maybe Minio: https://github.com/minio/minio / https://min.io

I've only used it as a fairly straightforward object store though, so I'm not sure about privileges/permissions (etc).


Depending on what you use it for, the license could be an issue.


For others, it's AGPL


If you are looking for an S3 protocol alternative, it's probably Swift. [0]

It doesn't have the widespread software adoption that S3 has.

If you want something that is accessible via HTTP and that's it, take a look at SeaweedFS. You can put files in over different gateways but access them over HTTP. The rights management is probably not what you want. (You mentioned ACLs.)

If you want some storage that you can access over HTTP, just set up anything and use a web server like nginx in front of it...

The other commenters are right: you need to be more specific about your requirements.

Edit: added link and some text I somehow missed.

[0] https://docs.openstack.org/api-ref/object-store/index.html
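
Since presigned URLs were one of the stated requirements, here is a rough sketch of Swift's counterpart, the TempURL middleware: the signature is an HMAC over the method, expiry timestamp, and object path, keyed with the account's temp URL key. The function name, host, and key below are hypothetical, and which digest is accepted (SHA-256 is assumed here) depends on how the cluster is configured.

```python
import hmac
from hashlib import sha256


def swift_temp_url(base_url, path, key, expires_at, method="GET"):
    """Build a Swift TempURL (hypothetical helper, sketch only).

    path looks like /v1/AUTH_account/container/object;
    expires_at is a Unix timestamp; key is the account's temp URL key.
    """
    # The signed body is method, expiry, and path separated by newlines.
    hmac_body = f"{method}\n{expires_at}\n{path}"
    sig = hmac.new(key.encode("utf-8"), hmac_body.encode("utf-8"), sha256).hexdigest()
    return f"{base_url}{path}?temp_url_sig={sig}&temp_url_expires={expires_at}"


url = swift_temp_url(
    "https://swift.example.com",          # placeholder endpoint
    "/v1/AUTH_demo/container/object",     # placeholder object path
    "mykey",                              # placeholder temp URL key
    1700000000,
)
```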


By 'swift', he's probably talking about OpenStack Swift [1].

[1] https://docs.openstack.org/swift/latest/


Min.io is a good alternative. See https://min.io/download#/


You say protocol alternative, but assuming you're more concerned with AWS as the host than S3 as the protocol you might try https://github.com/minio/minio

If you do feel an aversion to the protocol then the rclone backend list would be a good starting point

https://rclone.org/overview/

I like recent (v3) SMB over the network personally.


Is that

1. "alternative implementation with the AWS S3 protocol"

2. "alternative protocol with similar features to AWS S3"

?


Everyone seems to be using MinIO these days.


The Storj uplink package has benefits around parallelism, erasure coding, etc. that improve on the S3 protocol in many ways: https://docs.storj.io/dcs/getting-started/quickstart-uplink-...


I'm not sure what "protocol alternative" means or what your goals are, but it's worth suggesting that pretty much every other big cloud provider offers an S3-compatible blob storage product. Are you looking for something you can deploy yourself or just an alternative to S3 itself in general?


It doesn't currently meet all of the OP's requirements, but I've been working on https://gemdrive.io/


Is there an actual S3 protocol definition/RFC anywhere? How does one go about implementing an SDK or do most people just use AWS's SDKs?


R2 doesn’t yet have fine-grained controls/file grants, but it does have a lot of batteries included and zero egress fees.

Edit: Disclaimer: I work at Cloudflare.


Your question is unclear. Do you really mean the protocol, which is also implemented in alternatives like MinIO, or the AWS service?


I've had good luck with Wasabi: no issues in the past couple of years so far (around 5 TB stored).


Has anyone tried remote HDFS?


seaweedfs


askhn



