We moved over to Garage after running Minio in production with ~2PB, after about 2 years of headache. Minio does not deal with small files very well, understandably so, since they don't keep a separate index of the files other than what is straight on disk. While SSDs can mask this issue to some extent, spinning rust, not so much. And speaking of replication, this just works... Minio's approach, even with synchronous mode turned on, tends to fall behind, and again small files will pretty much break it altogether.
We saw about 20-30x performance gain overall after moving to garage for our specific use case.
quick question for advice - we have been evaluating minio for an in-house deployed storage for ML data. this is financial data, for which we have to comply with a crap ton of regulations.
so we wanted lots of compliance features - like access logs, access approvals, short lived (time bound) accesses, etc etc.
how would you compare garage vs minio on that front ?
As a competing theory, since both Minio and Garage are open source, if it were my stack I'd patch them to log with the granularity one wished since in my mental model the system of record will always have more information than a simple HTTP proxy in front of them
Plus, in the spirit of open source, it's very likely that if one person has this need then others have this need, too, and thus the whole ecosystem grows versus everyone having one more point of failure in the HTTP traversal
Hmm... maybe??? If you have a central audit log, what is the probability that whatever gets implemented in all the open (and closed) source projects will be compatible?
Why not? The application logs who, when and what happened to disk. This is application specific audit events and such patches should be welcome upstream.
A log scraper takes care of long-term storage, search and indexing. Because you want your audit logs stored in a central location eventually. This is not bound to the application and upstream shouldn’t be concerned with how one does this.
That is assuming the application is aware of “who” is doing it. I can commit to GitHub any name/email address I want, but only GitHub proxy servers know who actually sent the commit.
That's a very specific property of git, stemming from its distributed nature, which allows one to push the history of a repo fetched from elsewhere.
The receiver of the push is still considered an application server in this case. Whether GitHub solves this with a proxy or by reimplementing the git protocol and solving it in process is an internal detail on their end. GitHub is still “the application”. Other git forges do this type of auth in the same process without any proxies, Gitlab or Gerrit for example, open source and self hosted, making this easy to confirm.
In fact, for such a hypothetical proxy to be able to solve this scenario, the proxy must have an implementation of git itself. How else would it know how to extract the committer email and cross check that it matches the logged in user's email?
An application almost always has the best view of what a resource is, the permissions set on it and it almost always has awareness of “who” is acting upon said resource.
> That's a very specific property of git, stemming from its distributed nature.
Not at all. For example, authentication by a proxy server is old-as-the-internet. There's a name for it, I think, "proxy authentication"?[1] I've def had to write support for it many times in the past. It was the way to do SSO for self-hosted apps before modern SSO.
> In fact, for such a hypothetical proxy to be able to solve this scenario, the proxy must have an implementation of git itself.
Ummm, have you ever done a `git clone` before? Do you note the two most common types of urls: https/ssh. Both of these are standard implementations. Logging the user that is authenticating is literally how they do rate limiting and audit logging. The actual git server doesn't need to know anything about the current user or whether or not they are authenticated at all.
Enough of shifting the goal posts. This was about applications doing their own audit logging, and I still don’t understand what’s wrong with that. It was not about made up claims that applications or a git server don’t know who is acting upon them. Yes, a proxy may know “who” and can perform additional auth and logging at that level, but it often has a much less granular view of “what”. In the case of git over http, I doubt nginx out of the box has any idea of what a branch or a committer email is; at best you will only see a request to the repo name and git-upload-pack url.
What I'm really missing in this space is something like this for content addressed blob storage.
I feel like a lot of complexity and performance overhead could be reduced if you only store immutable blobs under their hash (e.g. Blake3). Combined with a soft delete this would make all operations idempotent, blobs trivially cacheable, and all state a CRDT/monotonically mergeable/coordination free.
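To make the idea above concrete, here is a minimal in-memory sketch, not a real storage engine. The `BlobStore` class and its method names are hypothetical, and BLAKE2b from the Python standard library stands in for Blake3 (which is not in the standard library).

```python
import hashlib


class BlobStore:
    """Toy content-addressed store: the key is always the hash of the data."""

    def __init__(self):
        self.blobs = {}          # hex digest -> bytes
        self.tombstones = set()  # soft-deleted digests, only ever grows

    def put(self, data: bytes) -> str:
        # Assumption: BLAKE2b stands in for Blake3 in this sketch.
        key = hashlib.blake2b(data, digest_size=32).hexdigest()
        self.blobs[key] = data   # repeating the same put changes nothing
        return key

    def get(self, key: str):
        if key in self.tombstones:
            return None
        return self.blobs.get(key)

    def delete(self, key: str) -> None:
        # Soft delete: adding a tombstone twice is the same as adding it once.
        self.tombstones.add(key)

    def merge(self, other: "BlobStore") -> None:
        # Union of blobs and union of tombstones. The order and number of
        # merges does not matter, so replicas converge without coordination.
        self.blobs.update(other.blobs)
        self.tombstones |= other.tombstones
```

Since both pieces of state only grow and merging is a set union, two replicas can exchange state in any order and end up identical. One consequence of this particular design is that a tombstoned blob stays hidden even if it is put again.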
There is stuff like IPFS in the large, but I want this for local deployments as an S3 replacement, when the metadata is stored elsewhere like git or a database.
I would settle for first-class support for object hashes. Let an object have metadata, available in the inventory, that gives zero or more hashes of the data. SHA256, some Blake family hash, and at least one decent tree hash should be supported. There should be a way to ask the store to add a hash to an existing object, and it should work on multipart objects.
IOW I would settle for content verification even without content addressing.
S3 has an extremely half-hearted implementation of this for “integrity”.
That's how we use S3 in Peergos (built on IPFS). You can get S3 to verify the sha256 of a block on write and reject the write if it doesn't match. This means many mutually untrusting users can all write to the same bucket at the same time with no possibility for conflict. We talk about this more here:
Yeah, and as far as I understood they use the key hash to address the overall object descriptor. So in theory using the hash of the file instead of the hash of the key should be a simple-ish change.
Tbh I'm not sure if content aware chunking isn't a siren's call:
- It sounds great on paper, but once you start storing encrypted (which you have to do if you want e2e encryption) or compressed blobs (e.g. images) it won't work anymore.
- Ideally you would store things with enough fine grained blobs that blob-level deduplication would suffice.
- Storing a blob across your cluster has additional compute, lookup, bookkeeping, and communication overhead, resulting in worse latency. Storing an object as a contiguous unit makes the cache/storage hierarchies happy and allows for optimisations like using `sendfile`.
- Storing the blobs as a unit makes computational storage easier to implement, where instead of reading the blob and processing it, you would send a small WASM program to the storage server (or drive? https://semiconductor.samsung.com/us/ssd/smart-ssd/) and only receive the computation result back.
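For readers who have not met content aware chunking before, the sketch below shows the basic mechanism the list above is weighing. It uses a simple gear-style rolling hash; the table seed, the `mask` value and the size limits are arbitrary choices for illustration, not parameters from any particular system.

```python
import random

_rng = random.Random(0)
# One fixed pseudo-random 32-bit value per possible byte (illustrative seed).
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def chunk(data: bytes, mask: int = 0x3FF, min_size: int = 64, max_size: int = 8192):
    """Split data wherever a rolling hash of the recent bytes hits a pattern."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Older bytes are shifted out of the 32-bit state, so the hash only
        # depends on roughly the last 32 bytes seen.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

Boundaries depend only on nearby content, which is why an insertion early in a file tends to leave later chunks unchanged. It is also why the first bullet holds: once data is encrypted or compressed, a small edit changes the bytes everywhere and the boundaries no longer line up.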
It certainly has a lot of overlap and is a very interesting project, but like most projects in this space, I feel like it's already doing too much. I think that might be because many of these systems also try to be user facing?
E.g. it tries to solve the "mutability problem" (having human readable identifiers point to changing blobs); there are blobs and collections and documents; there is a whole resolver system with their ticket stuff
All of these things are interesting problems, that I'd definitely like to see solved some day, but I'd be more than happy with an "S3 for blobs" :D.
Perkeep has (at least until last I checked it) the very interesting property of being completely impossible for me to make heads or tails of while also looking extremely interesting and useful.
So in the hope of triggering someone to give me the missing link (maybe even a hyperlink) for me to understand it, here is the situation:
I'm a SW dev who has also done a lot of sysadmin work. Yes, I have managed to install it. And that is about it. There seem to be so many features there but I really really don't understand how I am supposed to use the product or the documentation for that matter.
I could start an import of Twitter or something else and it kind of shows up. Same with anything else: photos etc.
It clearly does something but it was impossible to understand what I am supposed to do next, both from the ui and also from the docs.
Perkeep is such a cool, interesting concept, but it seems like it's on life-support.
If I'm not mistaken, it used to be funded by creator Brad Fitz, who could afford to hire a full-time developer on his Google salary, but that time has sadly passed.
It suffers from having so many cool use-cases that it struggles to find a balance in presentation.
I was curious to see if I could help, and I wondered if you saw their mailing list? It seems to have some folks complaining about things they wish it did, which strangely enough is often a good indication of what it currently does
There are also "Show Perkeep"-ish posts like this one <https://groups.google.com/g/perkeep/c/mHoUUcBz2Yw> where the user made their own Pocket implementation complete with original page snapshotting
The thing that most stood out to me was the number of folks who wanted to use Perkeep to manage its own content AND serve as the metadata system of record for external content (think: an existing MP3 library owned by an inflexible media player such as iTunes). So between that and your "import Twitter" comment, it seems one of its current hurdles is that the use case one might have for a system like this needs to be "all in", otherwise it becomes the same problem as a removable USB drive for storing stuff: "oh, damn, is that on my computer or on the external drive?"
Besides a personal photo store, I use the storage part as a file store at work (basically, indexing is off), with a simplifying wrapper for upload/download: github.com/tgulacsi/camproxy
With the adaptive block hashing (varying block sizes), it beats gzip for compression.
Yeah, there are plenty of dead and abandoned projects in this space. Maybe the concept is worthless without a tool for metadata management?
Also I should probably have specified that by "missing" I mean, "there is nothing well maintained and production grade" ^^'
Yeah I've been following it on and off since it was camli-store. Maybe it tried to do too much at once and didn't focus on just the blob part enough, but I feel like it never really reached a coherent state and story.
Something related that I've been thinking about is that there aren't many popular data storage systems out there that use HTTP/3 and/or gRPC for the lower latency. I don't just mean object storage, but database servers too.
Recently I benchmarked the latency to some popular RPC, cache, and DB platforms and was shocked at how high the latency was. Everyone still talks about 1 ms as the latency floor, when it should be the ceiling.
Yeah QUIC would probably be a good protocol for such a system. Roundtrips are also expensive, ideally your client library would probably cache as much data as the local disk can hold.
Kademlia could certainly be a part of a solution to this, but it's a long road from the algorithm to the binary that you can start on a bunch of machines to get the service, e.g. something like SeaweedFS.
BitTorrent might actually be the closest thing we have to this, but it is at the opposite end of the latency/distribution spectrum.
But you don't really handle blobs in real life: they can't really be handled, they don't have memorable names (by design). So you need an abstraction layer on top of them. You can use zfs, which will deduplicate similar blobs. You can use restic for backups, which will also deduplicate similar parts of a file, also in an idempotent way. And you can use git, which will deduplicate files based on their hash.
More like based on the prior that all projects in that space aren't in the best of health. Thanks for the github link, that didn't pop up in my quick google search.
I am using seaweed for a project right now. Some things to consider with seaweed.
- It works pretty well, at least up to the 15B objects I am using it for. Running on 2 machines with about 300TB (500TB raw) of storage on each.
- The documentation, specifically with regards to operations like how to backup things, or different failure modes of the components can be sparse.
- One example of the above is I spun up a second filer instance (which is supposed to sync automatically) which caused the master server to emit an error while it was syncing. The only way to know if it was working was watching the new filer's storage slowly grow.
- Seaweed has a pretty high bus factor, though the dev is pretty responsive and seems to accept PRs at a steady rate.
I use seaweed as well. It has some warts as well as some feature incompleteness but I think the simplicity of the project itself is a pretty nice feature. It’s grokkable mostly pretty quickly since it’s only one dev and the codebase is pretty small
Nothing content-addressed in RADOS. It's just a key-value store with more powerful operations than get/put, and more in the strong consensus camp than the parent's request for coordination-free things.
Can you point me towards resources that help me understand the trade offs being implied here? I feel like there is a ton of knowledge behind your statement that flies right past me because I don’t know the background behind why the things you are saying are important.
It's a huge field, basically distributed computing, burdened here with the glorious purpose of durable data storage. Any introductory text long enough becomes essentially a university-level computer science course.
RADOS is the underlying storage protocol used by Ceph (https://ceph.com/). Ceph is a distributed POSIX-compliant (very few exceptions) filesystem project that along the way implemented simpler things such as block devices for virtual machines and S3-compatible object storage. Clients send read/write/arbitrary-operation commands to OSDs (the storage servers), which deal internally with consistency, replication, recovery from data loss, and so on. Replication is usually leader and two followers. A write is only acknowledged after the OSD can guarantee that all later reads -- including ones sent to replicas -- will see the write. You can implement a filesystem or network block device on top of that, run a database on it, and not suffer data loss. But every write needs to be communicated to replicas, replica crashes need to be resolved quickly to be able to continue accepting writes (to maintain the strong consistency requirement), and so on.
On the other end of the spectrum, we have Cassandra. Cassandra is roughly a key-value store where the value consists of named cells, think SQL table columns. Concurrent writes to the same cell are resolved by Last Write Wins (LWW) (by timestamp, ties resolved by comparing values). Writes going to different servers act as concurrent writes, even if there were hours or days between them -- they are only resolved when the two servers manage to gossip about the state of their data, at which time both servers storing that key choose the same LWW winner.
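As a rough illustration of the Last Write Wins rule described above (a sketch of the idea only, not Cassandra's actual implementation), a cell can be modelled as a `(timestamp, value)` pair and merged by taking the larger pair. The `Cell` type and `lww_merge` name are made up for this example.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    timestamp: int
    value: bytes


def lww_merge(a: Cell, b: Cell) -> Cell:
    # Tuples compare field by field: the higher timestamp wins, and on a
    # timestamp tie the larger value wins.
    return max(a, b)
```

Because the merge is commutative, associative and idempotent, two servers that gossip in either order, any number of times, pick the same winner.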
In Cassandra, consistency is a caller-chosen quantity, from weak to durable-for-write-once to okay. (They added stronger consistency models in their later years, but I don't know much about them so I'm ignoring them here.) A writer can say "as long as my write succeeds at one server, I'm good" which means readers talking to a different server might not see it for a while. A writer can say "my write needs to succeed at majority of live servers", and then if a reader requires the same "quorum", we have a guarantee that the write wasn't lost due to a malfunction. It's still LWW, so the data can be overwritten by someone else without noticing. You couldn't implement a reliable "read, increment, write" counter directly on top of this level of consistency. (But once again, they added some sort of transactions later.)
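The quorum guarantee above comes down to simple arithmetic: if a write reaches `w` replicas and a read consults `r` replicas out of `n`, the two sets must share at least one replica whenever `r + w > n`. A tiny sketch, with hypothetical function names and `n` standing for the number of replicas holding a key:

```python
def quorum(n: int) -> int:
    """Smallest majority of n replicas."""
    return n // 2 + 1


def read_sees_write(n: int, w: int, r: int) -> bool:
    """True if every read set of size r overlaps every write set of size w."""
    return r + w > n
```

With three replicas, quorum writes and quorum reads (2 + 2 > 3) always overlap, while "one server is enough" on both sides (1 + 1 ≤ 3) does not, which is the "might not see it for a while" case.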
The grandparent was asking for content-addressed storage enabling a coordination-free data store. So something more along the lines of Cassandra than RADOS.
Content-addressed means that e.g. you can only store "Hello, world" under the key SHA256("Hello, world"). Generally, that means you need to store that hash somewhere, to ever see your data again. Doing this essentially removes the LWW overwrite problem -- assuming no hash collisions, only "Hello, world" can ever be stored at that key.
I have a pet project implementing content-addressed convergent encryption to an S3 backend, using symlinks in a git repo as the place to store the hashes, at https://github.com/bazil/plop -- it's woefully underdocumented but basically a simpler rewrite of the core of https://bazil.org/ which got stuck in CRDT merge hell. What that basically gets me is that e.g. ~/photos is a git repo with symlinks to a FUSE filesystem that manifests the contents on demand from S3-compatible storage. It can use multiple S3 backends, though active replication is not implemented (it'll just try until a write succeeds somewhere; reads are tried wider and wider until they succeed; you can prioritize specific backends to e.g. read/write nearby first and over the internet only when needed). Plop is basically a coordination-free content-addressed store, with convergent encryption. If you set up a background job to replicate between the S3 backends, it's quite reliable. (I'm intentionally allowing a window of only-one-replica-has-the-data, to keep things simpler.)
Here's some of the more industry-oriented writings from my bookmarks. As I said, it really is a university course (or three, or a PhD).
I upvoted this but I also wanted to say as well that this summary is valuable for me to gain a better groundwork for an undoubtedly complex topic. Thank you for the additional context.
I have used Garage for a long time. It's great, but the AWS sigv4 protocol for accessing it is just frustrating. Why can't I just send my API key as a header? I don't need the full AWS SDK to get and put files, and the AWS sigv4 is a ton of extra complexity to add to my projects. I don't care about the "security benefits" of AWS sigv4. I hope the authors consider a different authentication scheme so I can recommend Garage more readily.
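For context on the complexity being complained about, the sketch below shows only the final steps of AWS Signature Version 4: deriving the signing key and signing an already-built "string to sign". Building the canonical request and the string to sign (sorted headers, hashed payload, credential scope) is the larger part and is omitted here. The function names are made up for this example.

```python
import hashlib
import hmac


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def signing_key(secret_key: str, date: str, region: str, service: str = "s3") -> bytes:
    # Chain of four HMAC-SHA256 steps, as in the SigV4 key derivation.
    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date)  # date is YYYYMMDD
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")


def sign(secret_key: str, date: str, region: str, string_to_sign: str,
         service: str = "s3") -> str:
    # The string to sign is assumed to be built elsewhere (not shown).
    key = signing_key(secret_key, date, region, service)
    return hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
```

Compared with sending a static key in a header, the server here has to recompute the same chain from the stored secret, which is also why it needs the secret itself rather than a hash of it.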
Implementing v4 on the server side also requires the service to keep the token as plain text. If it's a persistent password, rather than an ephemeral key, that opens up another whole host of security issues around password storage. And on the flip side requiring the client to hit an endpoint to receive a session based token is even more crippling from a performance perspective.
This is not intended for commercial services. Realistically, this software was made for people who keep servers in their basement. The security profile of LAN users is very different than public AWS.
Of course I know FOSS software runs most of the internet. But not all FOSS software equally. “Use the right tool for the job” and all that.
Why by the end of the year? Garage has been around for a while. Its lack of enterprise adoption is not due to its youth, but rather that it’s the wrong tool for the job.
There are plenty of FOSS object stores that exist already and are better targeted at enterprise workloads. Garage is amazing, I run it on my home server, but it’s not really “enterprise” software. And it’s not trying to be.
(Also I know plenty of people from AWS and it seems that a few products are FOSS based but plenty are written in house. Running on Linux, of course)
"lack of enterprise adoption" - That you know of! Most organizations don't blog when they start using a software (:
"wrong tool for the job" - What is the right tool? If it checks all the boxes compared to Minio, and outperforms Minio, it is not unlikely to be used. Minio itself was originally FOSS after all, and it is not without its problems. I'm sure there's many devops folk that welcome an alternative.
AWS (+ S3) is cost prohibitive for many workloads, even at Fortune 500 scale. Plenty of opportunity here.
Enterprise adoption isn't the goal of every software project. If people adopt it, great, but I don't think that all maintainers are targeting this. Garage is explicitly not targeting performance, for example, nor is it targeting a rich feature set.
Minio is certainly trying to be the enterprise-ready FOSS front-end for an object store. I can name a few other alternatives, like SeaweedFS, Ceph, Swift that are also trying to provide rich features. I'm not sure who checks all the boxes compared to Minio or others, depends on the boxes I guess.
Tried this for my own homelab; either I misconfigured it or its (working) memory use grows linearly at 2x the size of the stored data. So, for example, if I put 1GB of data, seaweed would immediately and constantly consume 2GB of memory!
That is odd. It likely has something to do with the index caching and how many replication volumes you configured. By default it indexes all file metadata in RAM (I think) but that wouldn't justify that type of memory usage. I've always used mostly default configurations in Docker Swarm, similar to this:
depending on what you need it for nextcloud has WebDAV (clients can interact with it, and windows can mount your home folder directly, i just tried it out a couple days ago.) I've never used webdav before so i'm unsure of what other use cases there are, but the nextcloud implementation (whatever it may be) was friction-free - everything just worked.
I don’t understand why everyone wants to replicate AWS APIs for things that are not AWS.
S3 is a horrible interface with a terrible lack of features. It’s just file storage without any of the benefits of a file system - no metadata, no directory structures, no ability to search, sort, or filter.
Combine that with high latency network file access and an overly verbose API. You literally have a bucket for storing files, when you used to have a toolbox with drawers, folders, and labels.
Replicating a real file system is not that hard, and when you lose the original reason for using a bucket (because you were stuck in the swamp with nothing else to carry your files in), why keep using it when you’re out of the mud?
Does your file system have search? Mine doesn’t. Instead I have software that implements search on top of it. Does it support filtering? Mine uses software on top again. Which an S3 api totally supports.
Does your remote file server magically avoid network latency? Mine doesn’t.
In case you didn’t know, inside the bucket you can use a full path for S3 files. So you can have directories or folders or whatever.
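To show what "directories" mean on a flat key space, here is a small emulation of an S3-style listing with a prefix and a delimiter. This is a standalone sketch over a list of keys, not a call into any S3 client; the `list_objects` helper is hypothetical.

```python
def list_objects(keys, prefix="", delimiter="/"):
    """Return (objects, common_prefixes) the way a prefix/delimiter listing does."""
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Everything up to the next delimiter is rolled up into one
            # "folder" entry instead of being listed key by key.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)
```

Listing with `prefix="photos/"` over keys like `photos/2023/x.jpg` and `photos/readme.md` returns `photos/readme.md` as an object and `photos/2023/` as a common prefix, which is what clients render as a folder.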
One benefit of this system (KV-style access) is that it supports concurrent usage better. Not every system needs it, but if you’re using an object store you might.
What personal experience do you have in this area? In particular, how have you handled greater than single-server scale, storage-level corruption, network partitions, and atomicity under concurrent access?
You have server-client state. The concept of opened files, directories, and their states. Locks. The ability for multiple writers to write to the same file while still providing POSIX guarantees.
All of those need to correctly handle failure of both the client and the server.
CephFS implements that with a Metadata server that has lots of logic and needs plenty of RAM.
A distributed file system like CephFS is more convenient than S3 in multiple ways, and I agree it's preferable for most use cases. But it's undoubtedly more complex to build.
It's a legitimate question and I'm glad you asked! (I'm not the author of Garage and have no affiliation).
Filesystems impose a lot of constraints on data-consistency that make things go slow. In particular, when it comes to mutating directory structure. There's also another set of consistency constraints when it comes to dealing with file's contents. Object stores relax or remove these constraints, which allows them to "go faster". You should, however, carefully consider if the constraints are really unnecessary for your case. The typical use-case for object stores is something like storing volume snapshots, VM images, layers of layered filesystems etc. They would perform poorly if you wanted to use them to store the files of your programming project, for example.
> I don’t understand why everyone wants to replicate AWS APIs for things that are not AWS.
It's mostly just S3, really. You don't see anywhere near as many "clones" of other AWS services like EC2, for instance.
And there's a ton of value in being able to develop against an S3 clone like Garage or Minio and deploy against S3 - or being able to retarget an existing application which expected S3 to one of those clones.
S3 exposes effectively all the metadata that POSIX APIs do, in addition to all the custom metadata headers you can add.
Implementing a filesystem versus an object store involves severe tradeoffs in scalability and complexity that are rarely worth it for users that just want a giant bucket to dump things in.
The API doesn't matter that much, but everything already supports S3, so why not save time on client libraries and implement it? It's not like some alternative PUT/GET/DELETE API will be much simpler-- though naturally LIST could be implemented myriad ways.
You wouldn't want your "interactive" user filesystem on S3, no, but as the storage backend for a server application it makes sense. In those cases you very often are just storing everything in a single flat folder with all the associated metadata in your application's DB instead
By reducing the API surface (to essentially just GET, PUT, DELETE), it increases the flexibility of the backend. It's almost trivial to do a union mount with object storage, where half the files go to one server and half go to another (based on a hash of the name). This can be, and is, done with POSIX filesystems too, but it requires more work to fully satisfy the semantics. One of the biggest complications is having to support file modification and mmap. With S3 you can instead only modify a file by fully replacing it with PUT. Which again might be unacceptable for a desktop OS filesystem, but many server applications already satisfy this constraint by default.
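The "hash of the name" routing mentioned above can be sketched in a few lines. The backend names and the `backend_for` helper are hypothetical, and plain modulo hashing is used for brevity.

```python
import hashlib


def backend_for(key: str, backends: list) -> str:
    """Pick the backend responsible for an object, based only on its name."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


class UnionStore:
    """GET/PUT/DELETE over several dict-backed 'servers' (stand-ins for real ones)."""

    def __init__(self, names):
        self.names = list(names)
        self.servers = {name: {} for name in self.names}

    def put(self, key, data):
        self.servers[backend_for(key, self.names)][key] = data

    def get(self, key):
        return self.servers[backend_for(key, self.names)].get(key)

    def delete(self, key):
        self.servers[backend_for(key, self.names)].pop(key, None)
```

Since every operation names a whole object, routing is a pure function of the key and no server needs to know about the others. One caveat of modulo hashing is that adding or removing a backend remaps most keys; schemes such as consistent hashing are the usual way to limit that.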
Because at this point it's a well known API. I bet people want to recreate AWS without the Amazon part, and so this is for them.
Which, to your point, makes no sense because as you rightly point out, people use S3 because of the Amazon services and ecosystem it is integrated with - not at all because it is "good tech"
Storage is generally sticky but I wouldn’t be so quick to dismiss that reason because it might explain why anything would fail to displace it; a bunch of software is written against S3 and the entire ecosystem around it is quite rich. It doesn’t explain the initial popularity but does explain stickiness. Initial popularity was because it was the first good REST API to do cloud storage AND the price was super reasonable.
Oh, I’m definitely not saying integration or compatibility have nothing to do with it - only that “horrible interface with a terrible lack of features” seems impossible to reconcile with its immense popularity.
First mover + ecosystem wins over interface. And also, I really don’t have so many issues with the interface as others seem to and we had to implement it from scratch for R2. There’s features we extended on top of S3 so it’s not really the interface so much but rather that Amazon doesn’t really invest in adding features so much. I suspect it’s really hard for them to do so. We also exposed a native JS API that was more ergonomic but the S3 endpoint saw more traffic which again points to the ecosystem advantage - people have an S3 codebase and want as close to a drop-in replacement as possible.
They were first mover for cloud storage and combining object storage with HTTP. Previous attempts were WebDAV (not object storage, and very complicated to the point that no one implemented it) and Hadoop, which didn’t have HTTP and couldn’t scale like S3’s design.
I can only answer for Garage and not others. Garage is the result of the desired organization of the collective behind it: deuxfleurs. The model is that of people willing to establish a horizontal governance, with none being forced to do anything because it all works by consensus. The idea is to have an infrastructure serving the collective, not a self hosted thing that everyone has to maintain, not something in a data center because it has clear ecological impacts, but something in-between. Something that can be hosted on second-hand machines, at home, but taking the low reliability of machines/electricity/residential internet into account. Some kind of cluster, but not the kind you find in the cloud where machines are supposed to be kind of always on, linked with high-bandwidth, low-latency network: quite the opposite actually.
deuxfleurs thought long and hard about the kind of infra this would translate to. The base came fast enough: some kind of storage, based on a standard (even de-facto only is good because it means it is proven), that would tolerate some nodes going down. The decision of doing a Dynamo-like thing to be accessed through S3 with eventual consistency made sense.
So Garage is not "simply" an S3 storage system: it is a system to store blobs in an unreliable but still trusted consumer-grade network of passable machines.
Minio assumes each node has identical hardware. Garage is designed for use-cases like self-hosting, where nodes are not expected to have identical hardware.
not just about cost! improved performance/latency can allow workloads that previously required a local SSD/NVMe to actually run on distributed storage or an object store.
it cannot be overstated how slow Ceph/Minio/etc can be compared to local NVMe. there is plenty of room for improvement.
Last time I looked at Garage it only supported paired storage replication, such that if I had a 10GB disk in location A and a 1TB disk in each of locations B and C, it would only support "RAID1-esque" mirroring, so my storage would be limited to 10GB
A bit of an off-topic question: I would like to programmatically generate S3 credentials that allow only read access or r/w access to only a certain set of prefixes.
Imagine something like "Dropbox": You have a set of users, each user has his own prefix, but also users want to be able to share certain prefixes with other users.
(Users are managed externally in a Postgres DB - MinIO does currently not know about them).
I found this really difficult to achieve with MinIO, since this appears to require an AssumeRole request, which is almost not documented in any way and I did not find a Typescript example. Additionally, there's a weird set of restrictions in place for MinIO (and also AWS) that makes this really difficult to do, e.g. the size of policies is limited, which effectively limits the number of prefixes a user can share. I found this really difficult to work around.
Can anyone suggest a way to do this? Can garage do this? Am I just approaching this from the wrong side?
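For reference, the kind of AWS-style policy document the question is about can be generated from a list of prefixes as below. This is only a sketch of the document format: the `prefix_policy` helper and the bucket and prefix names are made up, and whether a given S3-compatible store accepts such a policy (and how it is attached to temporary credentials) is not covered here.

```python
import json


def prefix_policy(bucket, read_prefixes, write_prefixes=()):
    """Build a policy allowing reads on some prefixes and writes on others."""

    def arns(prefixes):
        return [f"arn:aws:s3:::{bucket}/{p}*" for p in prefixes]

    # Writable prefixes are treated as readable too in this sketch.
    readable = sorted(set(read_prefixes) | set(write_prefixes))
    statements = [
        {   # Listing is granted on the bucket, narrowed by the s3:prefix condition.
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{bucket}"],
            "Condition": {"StringLike": {"s3:prefix": [p + "*" for p in readable]}},
        },
        {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": arns(readable)},
    ]
    if write_prefixes:
        statements.append({
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:DeleteObject"],
            "Resource": arns(sorted(write_prefixes)),
        })
    return {"Version": "2012-10-17", "Statement": statements}


def policy_size(policy) -> int:
    """Serialized size in bytes, to compare against whatever limit applies."""
    return len(json.dumps(policy, separators=(",", ":")).encode("utf-8"))
```

Every shared prefix adds entries to both the condition and the resource list, so the serialized size grows with the number of shares, which is the policy size limit described above.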
from an operations point of view, I am surprised anyone likes Raft. I have yet to see any application implement Raft in a way that does not spectacularly fail in production and require manual intervention to resolve.
CRDTs do not have the same failure scenarios and favor uptime over consistency.
Apache Ozone is an alternative for an object store running on top of Hadoop. Maybe someone who has experience running this in a production environment can comment on it.
Files are normally stored hierarchically (e.g. atomically move directories), and updated in place. Objects are normally considered to exist in a flat namespace and are written/replaced atomically. Object storage requires less expensive (in a distributed system) metadata operations. This means it's both easier and faster to scale out object storage.
From the perspective of consistency guarantees, object storage gives fewer of such guarantees (this is seen as allowing implementations to be faster than typical file-systems). For example, since there isn't a concept of directories in an object store, the implementation doesn't need to deal with the problems that arise while copying or moving directories with files open in those directories.
There are some non-storage functions that are performed only by filesystems, but not object storage. For example, suid bits.
It's also much more common to use object stores for larger chunks of data such as whole disk snapshots, VM images etc., while filesystems aim for the middle size (small being RDBMSs), such as text files you'd open in a text editor. Consequently, they are optimized for these objectives. Filesystems care a lot about what happens when random small incremental and possibly overlapping updates happen to the same file, while object stores care about performance of sequential reads and writes the most.
This excludes the notion of "distributed" as both can be distributed (and in different ways). I suppose you meant to ask about the difference between "distributed object storage" and "distributed filesystem".
I believe OpenStack Swift in particular is known to work well in some large organizations [1], NVIDIA is one of those and also invested in its maintenance [2].
Is this formally verified by any chance? I feel like there's space where formal designs could be expressed in TLA+ such that it's easier for the community to keep track of the design.
Docker is young and fashionable, every windows script kiddy uses it nowadays!
And then comes to the Docker forum complaining about strange issues, not realizing Docker Desktop is a different product, it uses a Linux VM to run the Docker engine, which was built for Linux ;-)
I explicitly wrote "old-school Docker Swarm", as that is missing love for years and everyone with 2 IT FTEs seems to be moving to k8s.