MinIO: A Bare Metal Drop-In for AWS S3

tinco · on Aug 10, 2021

Is HDFS nice? I did a lot of research before settling on Ceph for our in-house storage cluster, and I don't remember even considering HDFS and I don't really know why. Ceph also is a drop-in for S3 for bare metal clusters.

I've been running Ceph for about a year now, and the start up was a bit rough. We are actually on second hand hard drives, that had a lot of bad apples, and the failures weren't actually very transparent to deal with, which was a bit of a disappointment. Maybe my expectations were too high, but I was hoping it would just sort of fix itself (i.e. down the relevant drive, send me a notification, and ensure continuity). I feel I had to learn way too much about Ceph to be able to operate it properly. Besides that the performance is also not stellar, it apparently scales with CPU frequency, which is a bit astonishing to me, but I've never designed a distributed filesystem so who am I to judge.

I was looking for something that would scale with the company. Now we've got 70 drives, maybe next year 100 and the next year 200. Now all our drives are 4TB, but I'd like to switch them out for 14TB or 18TB drives as we go along. We're not in a position to just drop 100k on a batch of shiny state of the art machines at once. Many filesystems assume the number of drives in your cluster never changes, it's crazy.

Cixelyn · on Aug 10, 2021

Curious -- any reason you didn't just go with a single machine export + expansion disk shelves on something like ZFS? Installing a MinIO gateway would also act as a bare drop-in for S3 too.

Asking since we're in the same position as yourself w/ high double-digit disks trying to figure out our plan moving forward. Right now we're just using a very large beefy node w/ shelves. ZFS (via TrueNAS) does give us pretty good guarantees on failed disks + automated notifications when stuff goes wrong.

Obviously a single system won't scale past a few hundred disks so we are looking at alternatives including Ceph, GlusterFS, and BeeGFS. From the outside looking in, Ceph seems like it might be more complexity than it's worth until you hit the 10s of PB range with completely standardized hardware?

tinco · on Aug 10, 2021

Some of our rendering processes take multiple days to complete, and the blackbox software we use doesn't have a pause button. So it's not that we're in need of 99.99999% uptime, but there's actually never a moment where rebooting a machine would be convenient (or indeed cost us money). Being distributed over nodes means I can reboot them and the processes are not disrupted.

merb · on Aug 10, 2021

for k8s there is also kadalu btw. which is based on glusterfs, but simplified.

gpapilion · on Aug 10, 2021

HDFS doesn't really work as a normal filesystem. I think some other commenters pointed out the challenges with FUSE.

If I recall correctly there isn't really a way to modify an existing file via HDFS, so you'd have to copy/edit/replace. Append used to be an issue, but that got sorted out a few years back.

Erasure coding is available in the latest versions. Which helps with replication costs.

I think, HDFS may just be a simpler setup than other solutions. (which is to say its not all that simple, but easier than some other choices). And I wouldn't use HDFS as a replacement for block storage, which is something I've seen done with Ceph.

tinco · on Aug 10, 2021

Thanks, we actually use Ceph as a straight up filesystem, that gets mounted on linux machines and then exposed to our windows based processing nodes (they are human operated) over SMB. I think that explains why HDFS is not a good fit for us.

ryanmarsh · on Aug 10, 2021

What about S3 didn't meet your use case? I don't work for AWS. I don't care if they lose business, I am interested in how different companies parse their requirements into manage vs. rent.

tinco · on Aug 10, 2021

One aspect is that we have a lot of data that has PII in it, and we feel safer if we anonymise that locally before sending it into the cloud. Once the data is cleaned up it's actually sent to GCS for consumption in our product). Another aspect is that this data has to be accessible as windows fileshares (i.e. SMB) to our data processing team. The datasets are in the range of several 100's of GB to several TB, each of the team members works on several of those datasets per day. This would strain our uplink too and maybe the bandwidth would be costly as well.

nycjay · on Aug 12, 2021

If you are writing a ton of small files (we have billions of audit blobs we write) the API put costs can quickly creep up on your. We pay much more for those than on the actual storage costs. If you want to use tags on your objects, they charge you per tag per object per month - again, another huge cost. We missed that when pricing S3 out, and needed to do a project to pull out all of the tags we had, and are currently working on batching up multiple blobs into one larger blobs to hopefully reduce our API costs by an order of magnitude. This is purely a cost decision for us, adding complexity to our application and its operation. S3 seems better suited for fewer larger files. Our backups and other use cases like that work perfectly.

jeffbee · on Aug 10, 2021

HDFS has pretty much all of Ceph's flaws plus it has a non-scalable metadata server, the "NameNode". If you're already up and running with Ceph I can think of no reason to abuse yourself with HDFS.

wingmanjd · on Aug 10, 2021

We're spinning up a medium sized Proxmox clusters (~50 nodes in total) to replace our aging Xen clusters. I saw Ceph is available on the Proxmox platform, but was hesitant to make all the VM storage backed by Ceph (throwing all the eggs into a single basket).

What were some of the other hurdles you faced in your Ceph deployment?

tinco · on Aug 10, 2021

We've been playing around with migrating our bare metals to proxmox as well. Though one main argument, being able to reboot/manage crashed GPU accelerated nodes, was invalidated by proxmox (KVM?) itself crashing whenever the GPU would crash, so it didn't solve our core problem. This is of course also due to that we're not using industrial components, but it is what it is.

I found Ceph's error messages very hard to debug. Just google around a bit for the manuals of how to deal with buggy or fully defective drives. There's a lot of SSH'ing in, running vague commands looking up id's of drives and matching them to linux device mount points and reading vague error logs.

To me as a high level operator it feels it should be simple. If a drive supplies a block of data, and that data fails its checksum, it's gone. The drive already does its very best internally to cope with physical issues, if the drive couldn't come up with valid data, it's toast or as close to toast as anyone should be comfortable with. So it's simple, fail a checksum, out of the cluster, send me an e-mail, I don't get why Ceph has to be so much more complicated than that.

tasqa · on Aug 10, 2021

I found proxmox not to be very user friendly growing to such cluster sizes. Proxmox itself has been very stable and supports pretty much anything but the GUI is not that great if you have many nodes and VMS, and the API can be lacking. However, using ceph as a backing store for VM images is pretty easy in proxmox. I have not used the cephFS stuff. I used it in a separate cluster both physically and standalone (not using proxmox integration).

So RBD is easy, S3 is somewhat more complicated as you need to run multiple gateways, but still very doable. The FS stuff also needs extra daemons, but I have not yet tested it.

pram · on Aug 10, 2021

You have to use something like FUSE to mount HDFS, if that is your intention. It's not really like Ceph. Unless your app is written to use the HDFS API directly it's going to be a bigger rigmarole to store stuff.

justinholmes · on Aug 10, 2021

Did you not evaluate linstor?

tinco · on Aug 10, 2021

Thanks, I didn't but it looks interesting, I'll research it later.

0x000000001 · on Aug 10, 2021

Be warned that their code quality is pretty bad. There was a bug I was dealing with last year where it did not delete objects but returned the correct HTTP response code indicating it did. This was widespread, not just some edge case I encountered. Their broken test suite doesn't actually verify the object on disk changed. I tried to engage them but they blew me off.

tobias3 · on Aug 10, 2021

Minio isn't durable. Any S3 operation might not be on disk after it is completed successfully. They had an environment variable MINIO_DRIVE_SYNC for a while that fixed some cases. Looking at the current code this setting is called MINIO_FS_OSYNC now (for some reason) https://github.com/minio/minio/pull/9581/commits/ce63c75575a... (but I wouldn't trust that... are they fsyncing directories correctly? Making sure object metadata gets deleted with the data in one transaction etc.). Totally undocumented, too.

I guess this makes minio "fast". But it might eat your data. Please use something like Ceph+RadosGW instead. It might be okay for running tests where durability isn't a requirement.

tyingq · on Aug 10, 2021

That had me curious, so I searched a bit in their issues.

Their attitude about it isn't great: https://github.com/minio/minio/issues/3536

That's too bad, as it seems well thought out in other areas, like clustering.

dragonsh · on Aug 11, 2021

MinIO team care about an issue if you are paid customer, not for people who use the open source. Indeed MinIO is not even fully S3 compatible with many edge cases and close the issues related to it by saying it’s not a priority.

You might want to look at other options as well like SeaweedFS [0] a POSIX compliant S3 compatible distributed file system.

[0] https://github.com/chrislusf/seaweedfs

tobias3 · on Aug 11, 2021

I haven't used seaweedfs yet, but it looks better (and small file/object performance should be miles better). W.r.t. to fsync/durability with the seaweedfs API to a volume server you have to turn fsync on via paramter and it is disabled by default. With S3 it is probably also off by default and you can turn it on per bucket: https://github.com/chrislusf/seaweedfs/wiki/Path-Specific-Co... .

Both should default to fsync on, with the option to turn it off. So not a great choice of defaults. Again, it probably looks good in benchmarks when people naively compare S3 stores. But it just shouldn't eat your data per default.

pbadenski · on Aug 10, 2021

We tried to use it a year ago or so, because of the performance promise. We were getting random glitches every few thousand files during operations. There was no obvious pattern, so difficult to reproduce, and as far as I remember there was mentions of it in github. Hopefully they acknowledge and get over this hump, as it seems like a promising project all together.

tailspin2019 · on Aug 11, 2021

Yep, same experience unfortunately

mtalantikite · on Aug 10, 2021

I also hit a very frustrating issue in minio where CORS headers weren't being set properly and there were many similar cases in their issues history. Their response was basically "works for me, sorry".

I'm pretty sure there was something weird going on with how minio was reading the config state, as I definitely was not the only one hitting it. Luckily I only had to use it for local testing in the project, but the whole thing didn't leave me feeling good.

[1] https://github.com/minio/minio/issues/11111

kapilvt · on Aug 10, 2021

Github issue link? They seem to have a solid ci setup, and I know several large enterprises using. But I found a bug for my usage != bad code quality.

0x000000001 · on Aug 10, 2021

My usage was "setup a basic single node for testing, upload a file with mc client, delete a file with mc client". They failed that test. It was responding with 200s but the file was never deleted.

There are loads of issues like this on their github: https://github.com/minio/minio/issues/8873

tyingq · on Aug 10, 2021

That's an interesting issue. Boiled down to the object name having '//' in it, which drove a certain direction for the shard location that wasn't the same shard location that the delete function looked in.

Sounds like the shard hashing happens before or after object name normalization depending on the operation. Ouch.

_hyn3 · on Aug 11, 2021

Based on sib comment, it looks like it was related to a poor name sanitation / matching function (which says, to me, that's risky if you have untrusted names), but this could also be caused by a lazy or delayed deletion strategy.

etaioinshrdlu · on Aug 10, 2021

I had issues with frequent crashes due to various panics a while a go. It eventually went away after a version upgrade. But now reading this I don’t feel terribly confident in using minio long term.

Areading314 · on Aug 10, 2021

Is Ceph with Rados Gateway a better alternative to this?

hardwaresofton · on Aug 10, 2021

CERN runs at least in part on Ceph and it's well documented:

https://www.youtube.com/watch?v=OopRMUYiY5E

0x000000001 · on Aug 10, 2021

I have a 500PB Ceph setup @ work, but I don't maintain it. It's been solid.

nateoda · on Aug 10, 2021

I would say no in production. I was recently testing a ceph + rgw as an on prem s3 solution, but high throughput puts + ls caused an index corruption that “lost” files according to future LS’s, the file was still there if you directly get it. When this was reported it was already found multiple years ago, and never fixed

marbu · on Aug 10, 2021

Could you reference a bug url? I tried to find it via tracker.ceph.com but failed to do so (I don't claim that the problem doesn't exist). That said referencing a bug url would be nice if you want to increase credibility of your claim.

AdamProut · on Aug 11, 2021

Could be: https://tracker.ceph.com/issues/24744

I know this bug has hampered our use of ceph at singlestore. Note that this is not an eventual consistency issue. When it happens the list command will permanently miss files.

jquaint · on Aug 11, 2021

100% agree. I pin the version we use because you never know if it will come with even more bugs.

merb · on Aug 10, 2021

what would be a better way to export a nfs storage to s3 than? swift, like it does for glusterfs?

heipei · on Aug 10, 2021

I don't know when this was written, but MinIO does not have a great story (or really any story) around horizontal scalability. Yes, you can set it up in "distributed mode", but that is completely fixed at setup time and requires a certain number of nodes right from the beginning.

For anyone who wants HA and horizontal elastic scalability, checkout SeaweedFS instead, it is based on the Facebook "Haystack" paper: https://github.com/chrislusf/seaweedfs

chrislusf · on Aug 10, 2021

Thanks! SeaweedFS has a linearly scalable architecture and performs much better.

It is also very easy to run. Just run this: "docker run -p 8333:8333 chrislusf/seaweedfs server -s3"

hardwaresofton · on Aug 10, 2021

SeaweedFS is an amazing project, thanks so much for making it.

I know I'm asking quite a biased source but are there any shortcomings of SeaweedFS that are well known? Any hangups/weird corners that you can think of just off the top of your head?

yencabulator · on Aug 11, 2021

I spent some hours looking at SeaweedFS, and walked away with the impression that most of the code outside of the happy paths wasn't exercised much.

For example, if you batch upload data, and the default 8 volumes happen to fill at the same time, you get transient errors until it has managed to create new volumes.

hardwaresofton · on Aug 11, 2021

I haven't looked at it as deeply -- thanks for pointing this out. Up until now I actually wanted to use MinIO for running an S3 service (Ceph + RadosGW is also an option for me, and this thread make me consider it over MinIO though it was always a strong contender). In the process of researching I earmarked SeaweedFS and cortx[0], but SeaweedFS attracted me way more -- looks like it would fit me "just right".

It'd be nice if there were an issue to explain this shortcoming, does the project know it happens and it's like a "hopefully we'll have auto scaling/adjusting volumes/online configuration update in the future" or something? How would one mitigate that?

[0]: https://github.com/Seagate/cortx

yencabulator · on Aug 11, 2021

Well, the author has to be aware of it: https://github.com/chrislusf/seaweedfs/issues/2216

chrislusf · on Aug 11, 2021

For this particular issue, it was fixed in a PR that checks whether a volume is "crowded" and creates new volumes before they fill up. Not really an issue any more.

chrislusf · on Aug 11, 2021

It really depends on the use case.

The project is still growing and there are different edge cases for each features, especially new ones. However, in general, I feel the project is structured layer-by-layer, and should be easy to fix the problems.

Some parts are complicated, e.g., FUSE mount. It's hard or impossible to be fully POSIX compliant. SeaweedFS has come a long way and has improved quite a lot, but maybe do not run your database on it just yet, until SeaweedFS supports block storage later.

killingtime74 · on Aug 10, 2021

I agree it’s great (and mostly used) for integration testing, do significant number of users actually use it for storage?

tuananh · on Aug 11, 2021

is it ready for production use? i can't find docs on how to run multiple master anywhere.

gunapologist99 · on Aug 10, 2021

Seaweedfs has made some questionable security decisions (https://github.com/chrislusf/seaweedfs/issues/1937).

mosselman · on Aug 11, 2021

You can still pass the -ip flag right? This mostly means you probably need to read some 'Seaweedfs in production' sort of guide.

gunapologist99 · on Aug 11, 2021

> you probably need to read some 'Seaweedfs in production' sort of guide.

This comes across as slightly condescending.

As I'm sure you'd agree, secure by default is very important, and it's what most responsible distributions aim for (i.e., Debian/Ubuntu). Starting up a daemon should not launch it in the most open way possible, but instead the most restricted way possible.

A reasonable expectation is that you should not have to pass the -ip flag; daemons should default to a secure configuration (which probably means defaulting to -ip 127.0.0.1, which you should be able to easily override, if that is your intention, and achieve the default behavior by simply passing -ip 0.0.0.0/0).

mosselman · on Aug 13, 2021

> This comes across as slightly condescending.

As does:

> As I'm sure you'd agree, secure by default is very important

I just meant it in a practical sense, you (as in people), need to read a guide in order to make it production ready instead of seafweed being production ready by default. I checked, there even is a guide in the repo, so I guess people need to read it.

Already__Taken · on Aug 10, 2021

wow that's got everything and almost the kitchen sink.

chrislusf · on Aug 10, 2021

Actually SeaweedFS already supports asynchronous backup to data sinks such as S3,GCP,Azure,etc.

I did not heard of this "kitchen" sink before. :)

But adding one more sink should be trivial.

xrd · on Aug 10, 2021

It's amazing. I use it everywhere.

The only limitation is that you don't have all the IAM access rules that you get with AWS.

Oh wait, that's exactly why I love it.

xrd · on Aug 10, 2021

I feel like generating my own "S3 signed URLs" from Minio (as a node script) is a much better way to layer security than that IAM mess.

And, the mc command line client is awesome.

And, it all runs inside dokku which is incredible.

andrewxdiamond · on Aug 10, 2021

What specifically do you have issues with when it comes to IAM?

It’s a complicated tool for sure, but it comes from the natural complication of dealing with auth in a very flexible way.

lazide · on Aug 10, 2021

Not the parent, but IMO it is an awkward way of thinking of permissions in an automated environment. It fits a human model much better, where you have a long lived entity which is self contained and expected to be trustworthy. Alice should have access to all finance data in this bucket because she works in finance, or Bob should be able to access these EC2 instances because he admins them.

It causes weird and overly broad privileges though usually, because you need to give permission to do any possible thing the job or user of the credentials COULD need to do, all the time.

This happens because any action to limit the scope usually causes more human friction than it is worth.

Ideally, when it is requested they do something, they get handed a token for the scope they are doing it in, which only gives them access to do the specific things they will need to do on the specific thing they need to do it, and only for the time they plausibly will need to do it for. This is a huge hassle for humans, and adds a lot of time and friction. For machines, it can be as simple as signing and passing along some simple data structures.

So for example, Alice would get a token allowing access to Q4 ‘20 only if that was plausibly correct and necessary, and then only for how long it took to do whatever she was doing. Bob would only get a token to access the specific EC2 instance that needs him to log into it because of a failure of the management tools that otherwise would fix things - and only after telling the token issuing tool/authority that, where it can be logged.

It makes a huge difference in limiting the scope compromises, catching security breeches and security bugs early on, identifying the true scope and accessibility of data, etc.

Also, since no commonly issued token should probably ever provide access to get everything - where the IAM model pretty much requires that a job that gets any random one thing, has to be able to get ALL things, then you also end up with the potential for runtime performance optimizations, since you can prune in advance the set of possible values to search/return.

orf · on Aug 10, 2021

You can model this with a combination of explicit conditions and principal/resource tags. You also can apply a specific custom policy with every role assumption that can be both time bound and more restrictive than the role policies themselves. All IAM stuff is also very heavily logged.

But overall I’m not sure constantly reaching out to IAM to retrieve scoped permissions for every single action makes much sense. Aside from the obvious latency issues the master set of credentials needs to have permissions to be able to request these scoped time-bound keys, and so them being leaked is just as bad as they can be used to just re-request access to “Q2 data”. Ok, so we need some logic to say “Alice should only be able to request these keys once a day” or some such, and these arbitrary requirements are much more complex to implement and a lot more fragile.

So it only makes sense if you’re expecting it to be materially more common for a service to somehow leak these time-bound single access keys but not leak any other credentials. Which isn’t an assumption that would hold up I think.

So what’s the point?

OJFord · on Aug 10, 2021

> It causes weird and overly broad privileges though usually, because you need to give permission to do any possible thing the job or user of the credentials COULD need to do, all the time.

Not really, unless you mean it needs permission to assume all the roles it could need in order to have the permissions it requires.

lazide · on Aug 10, 2021

That is exactly what I mean. A web server that makes database requests needs permission to do any query a web request would need to be able to trigger - not just permission for the specific query that makes sense for the specific request it is serving at the time.

It’s the difference between ‘can query the database’ and ‘can retrieve user Clarice’s profile information because she just made a request To the profile edit page’

Does that make sense?

OJFord · on Aug 10, 2021

Yes, I understand, but the point I'm making is that it does support 'roles', 'assuming' them for a period of time, then dropping those privileges again, or 'assuming' a different one, etc.

The 'because' isn't there, but I'm not really sure what that would look like, at least in a meaningful (not trivially spoofable) way.

lazide · on Aug 10, 2021

But no one is creating a role For read, modify, write for every distinct user bit of data no? Or at least I hope not and I doubt the system would function if they tried.

Tokens can do that.

OJFord · on Aug 10, 2021

But don't you just move the problem to the token-granting authority?

Don't get me wrong, I do see the hypothetical benefit, I'm just having trouble envisaging a practical solution. Is there something else not on AWS (or third-party for it) that works as you'd like IAM to?

lazide · on Aug 10, 2021

I don’t think you understand my comment? (And the top level comment?)

Your token grantor is just taking in whatever request state you have (session, permissions granted, whatever), and stuffing them into the token that gets passed around. Then the various back ends and client calls also do that, and where there is a permissions check (or conversion) necessary, say on a backend API call to access something, it checks it the callers token has the right permission.

I don’t want IAM to work that way. I don’t want to use IAM for this? It’s the wrong tool.

There are a ton of various signed token frameworks, all with various trade offs. JWT (ugh), Gaia Mint (internal Google), etc.

It tends to work best where there are multiple layers of services, as you have an abstraction layer you can do checking/audits, etc. at.

OJFord · on Aug 11, 2021

> a backend API call to access something, it checks it the callers token has the right permission.

Fine, but then that backend handler has permission itself to do whatever it is, regardless of whether or not the token does, since as you said, it COULD need it to service that request?

> I don’t want IAM to work that way. I don’t want to use IAM for this? It’s the wrong tool.

Yes.. I.. agree, I'm confused now why we're even talking about IAM, it's solving a slightly similar problem at a different level; isn't particularly useful here.

lazide · on Aug 11, 2021

Token checking can happen literally at the DB layer, or storage layer if you have such a system (like Google or AWS or whatever).

And I don't know why, you're the only who responded to my reply to 'What specifically do you have issues with when it comes to IAM?'

Haha

OJFord · on Aug 11, 2021

Well the use case wasn't clear to me 'up there', I thought you were arguing against IAM in general, or for cases where others do use it.

Essentially I suppose I disagree that it's only good for human users with long-lived roles, but I'm not saying it's the right tool for per-request granular authn, and I'd be surprised to learn that anyone is saying or (trying to be) using it like that. IAM's not even for end human users, (as in of your application) nevermind breaking further down into different types of request from them or on their behalf.

lazide · on Aug 12, 2021

I wasn’t referring to end users? I was referring to admins, employees of the company, etc, which is what IAM is a good fit for.

Using that as the sole way to Scope process/machine access though IS a weird fit in an automated environment for the reasons I laid out. You either come up with a broad scope that covers everything the job/process/machine could ever need to do or access (and then hope there is no exploit or bug that results in it accessing more), or build something like a token system that lets you get/scope access or permission in the context of the work it is doing on behalf of someone else. Which requires investment, but fits what should really be happening better. That is more of the ‘zero trust’ model, but certainly not all of it.

rhizome · on Aug 10, 2021

You answered your own question. "Very flexible" is a plus for Amazon because they can cover everybody's use-cases with a single concept. "Very flexible" is a minus for end users because they only need to take care of their own use-case.

So you can say it's a "natural" complication, and you'd be right, but that says nothing about usability, which is where "issues" tends to come in.

killingtime74 · on Aug 10, 2021

Probably learning curve

tyingq · on Aug 10, 2021

The gateway feature, where Minio works as a local cache for actual AWS S3 buckets, looks pretty nice.

https://docs.min.io/docs/minio-gateway-for-s3.html

eropple · on Aug 10, 2021

I've used this, years ago, at a company for exactly this, and it's really solid, I've also used it in a developer environment as a more expansive "fake S3" than the simpler ones I'd run across at the time. Good stuff.

mdaniel · on Aug 10, 2021

One will wish to be cautious, as they recently changed their license to AGPL-3.0: https://github.com/minio/minio/blob/master/LICENSE because they're afraid of AWS offering Minio as a hosted service, I guess

tyingq · on Aug 10, 2021

That seems okay, since you can use any S3 client library. So, good advice, but probably very few folks would have a need to touch the server side source.

Minio's client side libraries appear to be packaged separately, and Apache licensed: https://github.com/minio/minio-go

https://github.com/minio/minio-js

(Etc)

whateveracct · on Aug 10, 2021

And if you do touch the server-side code..do it in an open source fork?

marcinzm · on Aug 10, 2021

They more likely afraid of smaller and upcoming cloud providers offering it as an S3 drop in.

corobo · on Aug 10, 2021

This plus CDNs I'd imagine too. S3 protocol is the new FTP and minio ticks the box quickly, they want their share of that (and deserve it imo)

syshum · on Aug 10, 2021

Why would AWS offer Minio, a clone of an AWS Service, as a service?

That seems very confusing

remram · on Aug 10, 2021

Amazon wouldn't, but another cloud service might decide to run this rather than implementing their own S3-compatible object storage from scratch. Or they might use part of Minio's code to make their existing object storage solution compatible with S3.

adolph · on Aug 10, 2021

IIRC from this podcast with Anand Babu Periasamy [0], they already do.

https://www.dataengineeringpodcast.com/minio-object-storage-...

remram · on Aug 10, 2021

Who does what? Amazon runs Minio?

The notes don't mention this and the audio is over one hour, would you mind clarifying?

adolph · on Aug 11, 2021

another cloud service [Azure] decided to run this rather than implementing their own S3-compatible object storage from scratch

MonkeyClub · on Aug 10, 2021

So that AWS can still get people to pay subscription fees to them, instead of using their own hardware with a FOSS solution, if MinIO becomes too popular.

remram · on Aug 10, 2021

I don't understand. Aren't you describing S3? Why would Amazon offer a second version of S3?

hdjjhhvvhga · on Aug 10, 2021

I understood it as a joke referencing Aamazon's tendency to take just any open source product that happens to gain enough popularity, rename it and offer it as a shiny new feature of AWS.

cyberge99 · on Aug 10, 2021

Self hosted as a feature. Akin to managed vs colo hosting.

motives · on Aug 10, 2021

They pretty much already offer this with outposts (its technically their hardware but its on your premises).

remram · on Aug 10, 2021

By "self-hosted" you still mean still running on AWS hardware? I don't understand. Why would anybody pay EBS rates instead of S3 rates, to get data stored in the same place by the same people?

benlivengood · on Aug 10, 2021

Unless you're running a locally patched version AGPL is indistinguishable from GPL.

kkielhofner · on Aug 10, 2021

While this is true a word of warning: if you ever end up in a due diligence situation or a source code audit AGPL can really freak people out and hang up/derail the process until you get them to understand this point. If you can at all.

OJFord · on Aug 10, 2021

I'm pretty sure that - the prospect of AWS hosting and rebranding minio - was a joke.

gunapologist99 · on Aug 10, 2021

Pretty sure AWS would like to have something that at least looks and feels sorta like S3.

And, being S3-compatible at an API level would be a big bonus for a company the size of AWS, especially if it had nearly native compatibility with the aws-cli tool.

joeyrobert · on Aug 10, 2021

I know Minio can be used for production workloads, but Minio as a localhost substitute for an S3-like store is underrated. S3 dependent projects I work on spin up a Minio instance via docker-compose to get a fully local development experience.

mekster · on Aug 11, 2021

In other words, I'm not sure what other use case it has by using Minio on top of more expensive block storage than using any other native S3 storage services.

neilv · on Aug 10, 2021

MinIO is something I'll look into. And, as another example to the article's, it might also come in handy for some data needs for factories with imperfect Internet reliability (e.g., when the main submarine cable between the factory and AWS Singapore gets severed :).

This first example from the article sounds very valid, but is still personally funny to me, because it's related to the first use I made of S3, but in the opposite direction (due to different technical needs than the article's):

> If an Airline has a fleet of 100 aircraft that produce 200 TB of telemetry each week and has poor network connectivity at its hub.

Years ago, I helped move a tricky Linux-filesystem-based storage scheme for flight data recorder captures to S3. I ended up making a bespoke layer for local-caching, encryption (integrating a proven method and implementation, not rolling own, of course), compression, and legacy backward-compatibility.

That was a pretty interesting challenge of architecture, operations, and systems-y software development. And the occasional non-mainstream technical requirements we encounter are why projects like this MinIO are interesting.

wooptoo · on Aug 10, 2021

MinIO is great. I use MinIO together with FolderSync (Android only) to automatically backup the photos from my phone to my local NAS. It runs a scheduled job every night and they're saved in the original HEIC format.

I've also used MinIO to mock an S3 service for integration tests, complete with auth and whatnot.

jpeeler · on Aug 10, 2021

Were you already using MinIO? As somebody who wants to eventually backup photos on my phone, I'm curious why not just use Syncthing for that?

wooptoo · on Aug 10, 2021

Tbh I wasn't aware of Syncthing. In this use case it would work just as well I suppose.

One of the advantages of MinIO would be the wide compatibility with other S3 storage services. If my NAS had downtime while on holiday I could spin up a new bucket on S3/Backblaze/Wasabi and backup everything in a few minutes.

hjanssen · on Aug 11, 2021

At work, we use MinIO as a replacement of the S3 Api on our CI servers since we dont want to call production APIs for integration testing.

One of the challenges we had was "pre-filling" the MinIO server with testdata. Some tests require reading a testfile from our mocked S3 API. We wanted to have those testfiles instantly available at MinIO startup, but couldn not get that to work with docker volume mounts, MinIO simply would not recognize the files and serve a 404.

Has anyone got that working or a proposal for an alternative solution? We resorted to uploading the files via the application on startup (if it is in "testing mode"), but that does feel like a dirty hack.

bambam24 · on Aug 10, 2021

Guys, don’t store your data in minio. Its a sandbox, not an actual object storage. Companies uses minio to store their temporary data not the actual critical data.

For example if you have a project which you store objects to S33, At CI pipeline you don’t want to store temp files into S3 for cosr purposes. So instead you store at minio. A company must be crazy to use minio as their real data storage.

didip · on Aug 10, 2021

Anyone compared MinIO vs Ceph? I like MinIO because it seems exponentially simpler to setup but I don't know about its distributed and scalability stories.

opyh · on Aug 10, 2021

While I can't say much about its handling when using it distributed, I have had some negative experiences with MinIO/ceph when handling files > 10G.

One example: missing error handling for interrupted uploads leading to files that looked as if they had been uploaded, but had not.

Both ceph and MinIO's implementations differ from AWS original S3 server implementation, in subtle ways. ceph worked more reliably, but IIRC, both for MinIO and ceph, there is no guarantee that a file you upload is readable directly after upload. You have to poll if it is there, which might take a long time for bigger files (I guess because of the hash generation). AWS's original behavior is to keep the socket open until you can actually retrieve the file, which isn't necessary better, as it can lead to other errors like network timeouts.

I got it working halfway reliably by splitting uploads into multiple smaller files, and adding retry with exponential backoff. Then I figured out that using local node storage and handling distribution manually was much more efficient for my use case.

So for larger use cases, I'd take the 'drop in' claim with a grain of salt. YMMV :)

chrislusf · on Aug 10, 2021

Try SeaweedFS. It should be much more scalable than MinIO or CEPH. Large files are well supported.

didip · on Aug 12, 2021

Do you have a helm chart for SeaweedFS, I have been observing it for a long time too.

siscia · on Aug 10, 2021

I wonder if this kind of article push business into the consulting funnel.

Can the author let us know if he is happy with the business results of these articles?

WhyNotHugo · on Aug 11, 2021

I use minio a lot for development.

Some of my applications rely on S3 in production, but I don't want that dependency when running running the application on my machine - I use minio as a drop-in replacement for development.

Since I use docker compose to handle my app's services (postgres, rabbitmq, etc), adding minio into it is a perfect fix.

sam0x17 · on Aug 10, 2021

> Unfortunately, none of the above are available to the public, let alone something outsiders can run on their own hardware.

This is misleading. While there are no bare metal projects I'm aware of, there are 10+ S3-API compatible S3 alternatives, such as Wasabi, Digital Ocean Objects, etc., to name a few.

dzonga · on Aug 10, 2021

Something not really discussed a lot, FoundationDB works well as a blob / object store.

_hyn3 · on Aug 11, 2021

Is the project still alive? It sounds great if so, but if it's good at that, why doesn't Apple run it for iCloud instead of GCP? (Aside from the obvious massive scale issues, but it seems like Apple would be able to afford to engineer that.)

dzonga · on Aug 11, 2021

yeah, apple recently open sourced it. https://apple.github.io/foundationdb/index.html

tonymet · on Aug 11, 2021

Given all the config and services , why not just write to hdfs?

toolslive · on Aug 10, 2021

don't use minio for object storage. Use it because you need an s3 interface (and the object store you want to use doesn't provide it). It's actually pretty straight forward to build an integration if minio doesn't provide it. Implementation tip: make the minio side stateless. Have fun.

sebyx07 · on Aug 10, 2021

https://www.storj.io uses minio underneath, but they treat minio with respect, as a partner and pay their cut

jeffbarr · on Aug 10, 2021

Amazon S3 on Outposts (more info at https://aws.amazon.com/s3/outposts/ ) runs on-premises, offers durable storage, high throughput, the S3 API, currently scales to 380 TB, and doesn't require you to watch for and deal with failing disks.

I believe that it addresses many of the OP's reasons for deciding against S3.

Nullabillity · on Aug 10, 2021

At least disclose your conflicts of interest when writing spam like this.

GiorgioG · on Aug 10, 2021

And only costs $169,000 to start

gunapologist99 · on Aug 10, 2021

Hey, let's be fair. The storage-optimized instances start at only $425,931. ;)

oneplane · on Aug 10, 2021

It's not like buying hardware, support personnel hours and write-off administration is that much cheaper, unless you're willing to discard some features, but at that point you're no longer comparing things equally.

rodgerd · on Aug 10, 2021

Have Outposts fixed their "always-on, lose connectivity, lose your outpost" problem that they had when I first asked about them?

Can they scale down to "I need to spin up an S3 thing for local testing" for the cost of the storage and CPU?

Am I locked into a multi-year agreement, or can I just go and throw it away in a month and stop paying?

speedgoose · on Aug 10, 2021

I'm not going to contact AWS sales when I can easily use minio on Docker or Kubernetes.