It's still too confusing. Too much terminology, too many settings. 3 pages and over half a dozen screenshots to explain how to make a bucket private. Too complicated.
Google Drive folder permissions are easier. Phrases like "Anyone with the link can view" are understandable. Phrases like "Block public and cross-account access to buckets that have public policies" are not.
Hide the fine-grained control in an "Advanced" panel, for those who really need it.
All of AWS' access control is too confusing. Unless you spend a lot of time managing AWS it's hard to remember how to configure IAM and ACLs. I have to read the docs almost every time I change something just to be sure I don't screw it up. At my last job our team actively avoided touching IAM as much as possible because we all hated it.
You can tell how old a service is by its name. Most of the old ones have cutesy in-joke names. Most of the new ones are named exactly for what they do. They realized having 400 cutesy joke names was exhausting.
I have the opposite sentiment, from a more infrastructure rather than application-services perspective. Earlier names like S3, EC2, SQS are more descriptive. Aurora, Redshift, Pinpoint, Fargate, Macie, etc. give me no idea what they actually... do. Given that the earlier services are much more foundational technologies, rather than value-adds and higher-level ones, doesn't that make sense? Somewhere around Lambda, or a little prior, is when I think the names started being more influenced by "product people" than traditional engineers. I think the names probably have more to do with the intended audience than anything else at this point.
While it might be an 'industry wide acronym' and it might make sense for people with experience [1] it is not immediately apparent to certain people who would like to try AWS or who aren't doing these things for a living.
Things should be labeled easily for the lowest common denominator. If you want to use an acronym it should appear after the full name of the item.
Not this: IAM
But this: Identity and Access Management (IAM). And the full description should appear everywhere. Not just in one place (like in a legal contract or the beginning of Jeff Barr's blog post) but everywhere.
Why not? Why not do everything to make it easier? It's not like we are talking about a printed piece where space matters. People reading on mobile? So what, still spell it out. There are always new users who don't know the jargon.
And if you want to have an express page then take the verbose page and make an express page from it for people who don't want it spelled out.
[1] Just like people with experience know what a MAC address is vs. a Mac Computer
This is great feedback, and I appreciate your taking the time to write it up.
I definitely try to do this in my blog posts, but I do assume that my readers have been following along for some time, and that they have internalized some of the names.
As you noticed, I like to use the full ("formal") name at the start of each post, and then switch to the informal name after that. In fact, I have a shortcode system that makes this fairly easy. The first reference spells out the entire name and is linked; the others use a short name and are not linked.
Many posts, especially when posted to HN or linked from some other blog, are read by people who aren't regulars. Besides, people scan; they don't always read end to end.
Should EBS no longer be called Elastic Block Store because people new to Linux don't know the difference between block and object storage? Where does it stop? Should programming languages no longer use terms like lambda functions and generics that take some technical reading to understand? New users can read one of the countless AWS books on Safari or one of the countless guides online if they need a technical brush-up beforehand.
AWS is a technical product, it's not Wordpress or Mailchimp. Just like Cadence makes EDA products and MathWorks makes maths software, AWS makes datacenter computing available on the cloud. It's not aimed at users who have a few minutes of experience with some Lubuntu desktops at home.
Everything about AWS is too confusing. Exporting data from DynamoDB, if it's more than 100 records, requires setting up Data Pipeline and so on. An equivalent dump from something like MongoDB would be one command.
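Even the do-it-yourself route without Data Pipeline means writing a script. Something like this minimal boto3 sketch (table name made up; it just dumps DynamoDB's raw attribute-value format):

```python
import json
import boto3

dynamodb = boto3.client("dynamodb")
paginator = dynamodb.get_paginator("scan")

# Hypothetical table name; scan pages through the whole table.
items = []
for page in paginator.paginate(TableName="MyTable"):
    items.extend(page["Items"])

# Items come back as DynamoDB attribute-value maps, e.g. {"N": "42"}.
with open("dump.json", "w") as f:
    json.dump(items, f)
```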
Heck, even when I worked there and routinely had to modify IAM policies, it was a routine source of stress and concern. Especially the unexpected limits that would suddenly strike you from nowhere. It was as if they really wanted you to be overly permissive instead of tightly restrictive.
This again is misleading: users blaming AWS for making buckets public.
Buckets start out __private__.
The UI when making a bucket public comes with a big warning.
Then a big warning label is attached next to the bucket itself.
It does show, I think, the scale of AWS: the user base is so large that this isn't clear to everyone. This tool is nice because it means that a perhaps more experienced admin can lock down an entire account a bit more, so newer devs looking to shortcut things by making them world-public aren't as able to (which will slow them down, for sure).
Except one of the biggest initial selling points is hosting static websites, which requires the bucket to be public. They even have a shortcut UI for setting this up, but then they highlight this "security issue" that you have a public bucket, which doesn't make sense.
If you've gone to the trouble to specify the bucket is a website with default documents, etc, surely they can filter these buckets out of the ALERT! PUBLIC BUCKET! list...
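For what it's worth, that filtering is easy to script yourself; a rough boto3 sketch (it treats any bucket with a website configuration as intentionally public):

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

def is_website_bucket(name):
    # Buckets without a website configuration raise an error here.
    try:
        s3.get_bucket_website(Bucket=name)
        return True
    except ClientError as e:
        if e.response["Error"]["Code"] == "NoSuchWebsiteConfiguration":
            return False
        raise

for bucket in s3.list_buckets()["Buckets"]:
    if not is_website_bucket(bucket["Name"]):
        print("worth checking:", bucket["Name"])
```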
> Except one of the biggest initial selling points is hosting static websites, which requires the bucket to be public.
Maybe S3 should just be separated into buckets and websites. They can work the same under the hood, but call them something different to prevent this issue. Right now a lot of the new (and existing) security settings don't make any sense in the context of websites, and having the functionality to make websites within S3 creates security issues for everyone else.
While buckets are great for pushing your static content to, I would probably put a CDN in front of it, which is pretty easy to do with CloudFront.
Not only is it faster, it'll likely be cheaper, due to reads from S3 being more expensive than content read from a CDN. (AFAIK)
Edit: that said, I used to use S3 for hosting a few toy sites that were inside jokes between my friends. For sites like that, maybe glitch.com or GitHub Pages might work better ¯\_(ツ)_/¯
My understanding was most of these issues came from setting permissions to "Everyone" and assuming an implied "with access to this aws account" that isn't there. I don't think it's people explicitly turning it to private. That said, it's a couple of years since I used AWS.
The tricky case is a bucket which is otherwise private but contains individual files that were uploaded with a public ACL (or had the ACL modified, of course). This is not made apparent anywhere in the interface (by which I mean there isn't some big "public" sticker), and is part of what this feature is trying to address.
They're still crap though: e.g. you can have public resources (or broadly accessible and findable within orgs) inside locked-down "folders", there is no way to make a "folder"'s access recurse, and if you find out that a "folder" had the wrong access rights it's absolute hell to fix.
Or get a list of documents sorted by access level or by the person you've shared them with. Like a CA you hired 3 years ago to file your taxes no longer needs access to your financial information.
This. I've been using AWS for my side projects, but strongly considered moving to GCP at some point. Sometimes I swear I feel like I need a PhD in AWS literature to even set up a box with the "right" permissions for my friends. Sure, I'm no sysadmin, but it's mind boggling that setting up a cloud instance feels harder than doing it all on premise.
They are private by default, but it was far too easy for a user to enable public access through a bucket policy that seemed sane to the untrained eye.
In the systems I've designed, I've built multiple layers of protection around sensitive buckets. I figure it's safest to assume that someone will eventually screw up the policy.
When possible, I deny access to the bucket except for explicit whitelisted approvals (alternatively, requests coming from IAM roles or users in the current account).
I also encrypt buckets by default and often deny unencrypted uploads. The concept of public access doesn't map in quite the same way to customer-managed KMS keys, so you have good odds that S3 requests would fail on decryption even when a bucket becomes public.
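A rough sketch of what that deny-unencrypted-uploads layer can look like (not my exact setup): it's a commonly documented bucket-policy pattern, shown here with boto3 and a made-up bucket name.

```python
import json
import boto3

s3 = boto3.client("s3")

# Deny any PutObject request that doesn't ask for SSE-KMS,
# so plaintext uploads fail outright.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyUnencryptedUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-sensitive-bucket/*",
            "Condition": {
                "StringNotEquals": {
                    "s3:x-amz-server-side-encryption": "aws:kms"
                }
            },
        }
    ],
}

s3.put_bucket_policy(Bucket="my-sensitive-bucket", Policy=json.dumps(policy))
```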
Next layer of protection is monitoring via Config rules and Trusted Advisor checks, combined with alarms to ensure that these alerts are actually brought to someone's attention.
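And a sketch of wiring up one of the managed Config rules for that monitoring layer (it assumes a Config recorder is already running; the rule name is arbitrary):

```python
import boto3

config = boto3.client("config")

# AWS-managed rule that flags buckets allowing public reads.
config.put_config_rule(
    ConfigRule={
        "ConfigRuleName": "s3-public-read-prohibited",
        "Source": {
            "Owner": "AWS",
            "SourceIdentifier": "S3_BUCKET_PUBLIC_READ_PROHIBITED",
        },
    }
)
```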
Finally, except for development accounts, I insist on automating configuration of the environment (and aim to use IAM to prevent manual changes).
This is great if you can pull it off, but you don't get there overnight. And you certainly don't get there by relying only on the basic documentation. I would say you don't even necessarily get there by relying on AWS partners. I've seen more than a couple disappoint.
All of this to say that I'm happy to add one more guardrail (with much lower overhead).
This is exactly the right approach. AWS hammers people about the "shared responsibility model"; being in the cloud doesn't stop you from doing stupid stuff. You have all kinds of tools to help you do the right thing.
They are private by default. This is just making public/private settings very explicit, and also helping to ensure that anything that was public before the default setting was changed to private is known.
This is really just extra CYA because people who really don't know what they are doing are apparently still being put in charge of buckets by companies.
Some folks use S3 kind of like a quick Dropbox to share big files, even internally in an org. Bob in accounting to Sue in finance.
One solution I've found if folks INSIST they do not want any login / security: expose the bucket contents using time-expiring shared links (presigned links). This has worked great in one use case I had where users INSISTED on no security with sensitive data. Users can then cut / paste / download / share links for a few hours (all they really want - send it to a few folks on a project). Even if their email is hacked you are OK. Not ideal, but better than nothing.
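For reference, generating those links is only a couple of lines with boto3; a minimal sketch with made-up bucket/key names:

```python
import boto3

s3 = boto3.client("s3")

# The resulting URL works for anyone who has it, but only until it expires.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "internal-share", "Key": "report.xlsx"},
    ExpiresIn=4 * 3600,  # four hours
)
print(url)
```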
If you look inside orgs, the desire for no security is sometimes surprisingly high!
Those folks are wrong, and the correct answer here is for AWS to stop them. Google Drive and Box are appropriate solutions for that use case. Maybe AWS should launch one too.
You’ve never had to figure out how to make your buckets private. They are private by default. This new feature adds another level of protection against making a bucket public.
Making a bucket public is a very simple, understandable option in the interface. The screenshots were just there to illustrate how, even if the bucket is public, the new feature will block it at the account level.
Comparing Google Drive to S3 is not really valid. No one is using Google Drive as an enterprise-grade storage solution that can serve millions of requests.
The problem is really that sharing securely is way too complicated. So inexperienced or lazy people give up and make the bucket public so they can get on with whatever it was they were trying to accomplish.
I’m sure Linux administration will be just as overwhelming and confusing to me once I start getting into it as an experienced (20+ years) Microsoft developer who has cut my teeth on all things AWS except for the machine learning parts....
Great changes! But, there should also be an account-wide option just to ban public uploads to a bucket. For anyone running a website, checking any of these 'recommended' boxes is going to take down prod. (I just started checking boxes impulsively to see what would happen; fortunately any changes revert quickly!)
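For what it's worth, there is an account-wide knob exposed through the s3control API, though it governs public ACLs and policies broadly rather than uploads specifically; a minimal boto3 sketch with a made-up account ID:

```python
import boto3

s3control = boto3.client("s3control")

# Applies the block-public-access switches to every bucket in the
# account rather than one bucket at a time.
s3control.put_public_access_block(
    AccountId="111122223333",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)
```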
"Too confusing" can be feature though.
It's the older Microsoft strategy: now you have good ground for integrators and value-added resellers.
Guys who know the specific platform, so they sell it and promote it at every chance.
Absolutely agree. Telling people about your bad design doesn't make it any less bad. Especially when "telling people" happens in a flood of other noise, like one line in the documentation for a whole tool.
Default-insecure is bad design almost anywhere I can think of (and Django Storages is changing their default in the next version - they realised it's probably not best as it is).
This is a useful and much-needed update. I know lots of stories where people got caught off guard by ACLs setting things public that they thought were private.
The usual scenario is that the bucket is set as "private" and you think it's private, but then you realize some app or script has set the ACL for objects to "public."
"got caught off guard by ACLs setting things public that they thought were private."
A lot of these stories are written in this way, where Amazon's ACLs "set things public" when the user wanted things private.
To add a little reality here:
Amazon defaults are generally to private.
* You get no network ingress to an instance by default (zero).
* You have no username/password login by default (zero) - you need to use SSH with a public key.
* S3 buckets are resource-owner and AWS-account-owner only to start.
My one issue - I wish they made buckets with no permissions at all by default, and made the account owner or resource owner add themselves.
What is happening is that, despite LOTS of docs out there, users find it easier during development or quick integration to do things like make buckets WORLD readable. This is BY FAR the most common issue: USERS (not Amazon) setting things to totally public by choice.
I've had folks tell me:
* It will only be public for a few days / a short period of time while we work out X (once everything is used to public access, it's actually HARDER to switch to secure access).
* I can't be bothered to learn IAM permissions (one of the most useful features AWS has for example relative to a place like google).
I like this new thing not because amazon is at fault, but because users are super lazy, and now some annoying "admin" can block them from making things totally public when they want to share with a partner, separate project etc.
So a good feature.
Now make S3 buckets with NO permissions by default (not even creator / account owner) so folks can learn a bit of AWS right away to get going.
People blame the power tool manufacturer after they get hurt because they wired open the guard, taped down the safety, and then developed a habit of holding the saw in one hand and the work in the other.
Some people just can't accept that they dug their own hole, even when they clearly went out of their way to find the shovel.
This is where I have actually come to appreciate SELinux. A newer dev comes along and sets a folder in the document root to 777, thinking that will solve their problems. SELinux still needs to explicitly allow that folder to be writable. That slows them down long enough to come find me, and then we get to have the proper conversation about what needs to happen. A folder in the document root with 777 scares the crap out of me.
Do you know of any good intro to SELinux guides? I'm hoping to use it to lock down webroots to prevent other users from modifying them even if the user messes up all the permissions via SFTP.
Sadly, no. I only play a sysadmin on TV ;-) I don't fully feel like I have grokked SELinux, but when things behave unexpectedly, it takes me less and less time to remember to check if SELinux is involved. I have at least come to accept that it is there to help, and it has saved me from doing some really dumb things.
Things to know: there is a specific setting to allow httpd to write to a folder, and there is a way to list files (`ls -Z`) that shows you the SELinux context for files/folders. httpd error_log entries will just give permission-denied errors, but if you feel like the perms are correct, SELinux will probably be why you're getting denied. That's how I started learning: one error at a time.
ACLs can be set on a per-object basis. So you might have set up the bucket thinking you set it up as "non-public", but then some software sets objects to public when it stores them.
And we all know about misbehaving software and bugs. And how would you know that had happened? I guess you'd have to run something that monitored all the ACLs? (And you'd pay for those monitoring operations, although probably not a significant cost).
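That monitoring is at least scriptable; a rough boto3 sketch that walks a (hypothetical) bucket and flags any object whose ACL grants the AllUsers group access:

```python
import boto3

s3 = boto3.client("s3")
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"

# One get_object_acl call per object, so for big buckets this is
# exactly the per-request monitoring cost mentioned above.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="my-bucket"):
    for obj in page.get("Contents", []):
        acl = s3.get_object_acl(Bucket="my-bucket", Key=obj["Key"])
        for grant in acl["Grants"]:
            if grant["Grantee"].get("URI") == ALL_USERS:
                print("public object:", obj["Key"])
```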
It makes a _lot_ of sense to be able to tell a bucket "No public access allowed in here". Which I understand is included in this new feature set? That alone is only sensible.
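If I'm reading the announcement right, that per-bucket switch is exactly what shipped. A minimal boto3 sketch of turning it on for a made-up bucket:

```python
import boto3

s3 = boto3.client("s3")

# Turns on all four block-public-access switches for this one bucket.
s3.put_public_access_block(
    Bucket="my-bucket",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)
```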
Nice, I didn't know about the free bucket permission check tools, that's good. Looks like here's the announcement: https://aws.amazon.com/about-aws/whats-new/2018/02/aws-trust... Sounds like you can still only automate it (using AWS dashboard anyway) as "Business and Enterprise support customers."
But yeah, simply being able to set a bucket as "never allow public access in here" makes a _lot_ of sense, an easy way to deal with "misbehaving" software components (whether the developers who wrote them believed it was misbehavior or not).
In most of the web software I write these days, I make _nothing_ public in S3, but provide an endpoint that will HTTP redirect to a signed URL to grant access. It's a bit of extra work (cpu cycles for the software that is, relatively trivial for the developer), but makes it easier to keep track of what I've granted access to how and when.
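A bare-bones sketch of what that endpoint can look like (Flask here; the bucket and route names are made up):

```python
import boto3
from flask import Flask, redirect

app = Flask(__name__)
s3 = boto3.client("s3")

# The app authorizes the request however it likes, then hands back a
# short-lived signed URL instead of making the object public.
@app.route("/files/<path:key>")
def fetch(key):
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": "app-private-assets", "Key": key},
        ExpiresIn=300,  # five minutes
    )
    return redirect(url, code=302)
```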
If people using AWS for your organization can't be bothered to take a few minutes to RTFM in its entirety (AWS has excellent documentation, compared to Azure's docs.microsoft.com behemoth of confusion), they shouldn't have access to reconfigure your entire cloud infrastructure. Buckets are sensibly private by default. If you don't know what you're doing, leave them that way.
This is sarcasm, right? AWS docs are okay, but not great. Honestly, their documents are pretty piss poor. They have contradictory statements depending on which tool, they have 100 exceptions for integrating their own toolchain, and the perspective and context is never clear. Sure, their basic stuff is well documented... but it's constantly changing, and their system isn't versioned in a way that is easy to upgrade or transition legacy systems.
Related question, anyone know how to grant access to S3 objects via CloudFront only, since CloudFront bandwidth usage is cheaper than S3 objects accessed directly?
I've got it working for top-level documents using Origin Access Identity, but subfolders (example.com/test/index.html) don't work. I'm surprised this use case isn't better documented, because it saves you money and you don't need to make your bucket public.
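One workaround that's often suggested for the subfolder case is rewriting directory-style URIs before CloudFront forwards them to S3, e.g. with a small Lambda@Edge function on the origin-request event. A hedged sketch (names illustrative):

```python
# Lambda@Edge origin-request handler (Python runtime): rewrites
# directory-style URIs so CloudFront asks S3 for the subfolder's
# index.html instead of the bare prefix.
def handler(event, context):
    request = event["Records"][0]["cf"]["request"]
    uri = request["uri"]
    if uri.endswith("/"):
        request["uri"] = uri + "index.html"
    elif "." not in uri.split("/")[-1]:
        # No file extension, treat it as a directory request.
        request["uri"] = uri + "/index.html"
    return request
```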
Depending on where your original bucket is hosted, it is very possible that CloudFront ends up costing more in data transfer.
For instance, a us-east-1 bucket has a data transfer rate to the internet starting at $0.09/GB. That's flat, regardless of where the requester is located.
On the other hand, while US- and Europe-based transfer pricing is cheaper in CloudFront (starting at $0.085/GB), all the other regions are more expensive... for example, South America starts at $0.25/GB (!).
Of course the reverse is true: a bucket originally hosted in South America would probably do well to enable CloudFront, as requests from every other region become cheaper...
And yet another wrinkle: I think CloudFront eventually becomes cheaper than S3 (with enough data usage). So for very large customers, maybe it is true that CloudFront will always be cheaper. Clear as mud.
Totally private (no permission) buckets by default.
Some additional support for common use cases that drive the shortcut to public-readable (often sharing internally for a project). Not sure if this would be a good thing, but folks are used to defining a group and adding users to a group (projects / departments, etc.) in an enterprise context. AWS supports this with IAM, but I wonder if surfacing a UI / something here might simplify it in some way.
Bob says he'd like to play around with data X on S3, or grab report Z from S3. People are used to the Dropbox / Drive / Box model: that's often an invite by email, or an allow-*@domain.com permission. But how does this work in S3? How do you invite someone, and how do they self-provision their credentials without admin involvement, if they're not an AWS/S3 user? One thing you can do: make the bucket world readable.
A big user base plus more stuff collecting in S3 from various systems seems to drive some risk. Not sure how S3 solves for the above, or if it should (email or domain invites, self-provisioning of credentials if you're not on the platform, etc). Tie into WorkDocs somehow? Push into competitor tools?
This issue also comes up in the cross-account / org situation. You want to invite someone outside of the org to read data, etc. There are tools to do this, but it takes more work than folks are used to with other approaches. Pretty soon something is world readable.
API side for developers works great.
I'd be curious if others have use cases that create gravity towards world-readable - because S3 by default is not, yet a relatively high number of people end up there for some reason. It'd be interesting to try to give them what they want in addition to lockdown tools.
Everything I found was either a PDF file (a common use case where you'd want the bucket to be public) or an HTML file in a bucket-hosted website (another use case where public is the correct behavior).