This is why you are strongly urged not to rely on one region or AZ.

pid-1 · on March 9, 2022

Given the total amount of money I've lost due to a single AZ being down, it was totally worth it to NOT go multi-AZ or multi-region so far.

Multi-AZ isn't that hard, but it generally incurs extra costs (one NAT gateway per AZ, etc.).

But multi region in AWS is a royal pain in the ass. Many services (like SSO) do not play well with multi region setups, making things really complicated even if you IaCed your whole stack.
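To make the "one NAT gateway per AZ" point concrete, here is a minimal sketch of how that fixed cost scales with the number of AZs. The hourly rate is a made-up placeholder, not a real AWS price, and the function name is hypothetical. Data processing charges are ignored.

```python
# Rough sketch: fixed NAT gateway cost when running one gateway per AZ.
# HYPOTHETICAL_HOURLY_RATE is a placeholder, NOT an actual AWS price;
# look up the current rate for your region before using this.
HYPOTHETICAL_HOURLY_RATE = 0.05
HOURS_PER_MONTH = 730  # common approximation for one month


def nat_gateway_monthly_cost(az_count: int, hourly_rate: float) -> float:
    """Return the fixed monthly cost of one NAT gateway per AZ."""
    if az_count < 1:
        raise ValueError("az_count must be at least 1")
    return az_count * hourly_rate * HOURS_PER_MONTH


single_az = nat_gateway_monthly_cost(1, HYPOTHETICAL_HOURLY_RATE)
three_az = nat_gateway_monthly_cost(3, HYPOTHETICAL_HOURLY_RATE)
print(f"1 AZ: {single_az:.2f}/month, 3 AZs: {three_az:.2f}/month")
```

The fixed part scales linearly with the AZ count, which is the extra cost being described here.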

evrydayhustling · on March 9, 2022

Those costs are the actual reason you are encouraged to go multi-AZ!

(I actually love that we have strategies and infrastructure for multi-region... it just tends to come up at scales and for applications where it is not justified.)

systemvoltage · on March 9, 2022

Seems like there would be a conflict of interest between increasing the robustness of a single AZ (so it never goes down, or has its own redundancy) and the increased revenue from multi-AZ deployments.

What's the point of cloud if we have to manage robustness of their own infrastructure. I can understand if that's due to natural disasters and earthquakes, but the idea should be that a single AZ should never go down barring extraordinary circumstances. AWS should be auto-balancing, handling downtimes of a single AZ without the customer ever noticing it.

It might not be a good analogy, but if a single Cloudflare edge datacenter goes down, it will automatically route traffic through others. Transparent and painless to the customer. I understand AWS is huge, and different services have different redundancy mechanisms, but just conceptually it feels like they're in a conflict of interest to increase robustness of their data centers - "We told you to have multi-AZ deployment, not our fault".

Another way to put this: as an AWS customer, make sure to factor the roughly 3x cost, plus the management overhead, of a multi-AZ deployment into your total costs.

9wzYQbTYsAIc · on March 10, 2022

> What's the point of cloud if we have to manage robustness of their own infrastructure.

Worth deliberating on. I’m curious as to what the lifetime cost of ownership for an on-prem data center is relative to lifetime cost of operating in the cloud.

thedougd · on March 9, 2022

They would simply charge for the privilege. An EC2 'always on' or whatever option that enabled your instance to live migrate between availability zones would be a nice and expensive option.

systemvoltage · on March 9, 2022

Definitely. Then I wonder why we need the cloud :) if not for services (not EC2). A lot of mid-sized companies are re-evaluating: https://www.economist.com/business/2021/07/03/do-the-costs-o...

Johnny555 · on March 9, 2022

I would strongly urge not using us-east-1 -- of all the regions we're in, it's by far the most problematic. Use us-east-2 if you need good latency to the East Coast.

temp0826 · on March 9, 2022

Not sure if it's still the case, but when I was there us-east-1 was a SPOF for some services worldwide. I think if DynamoDB went down in the region it was a big, big issue.

Johnny555 · on March 10, 2022

The only SPOF I know of for us-east-1 today is the control plane for Route53 - it's distributed and DNS queries will continue to work when us-east-1 is down (including health check based failover), but you can't make any DNS changes when us-east-1 is down.

m34 · on March 9, 2022

Might be true for running stuff in different regions/AZs, but if the provisioning region is down (e.g. when deploying Lambda@Edge), one does not really have an alternative.

tyingq · on March 9, 2022

Good advice, though AWS still has some services that don't work completely independently. CloudFront, because of certificates. Route53. The control API for IAM (adding/removing roles, etc). And I wish they didn't have global-looking endpoints (like https://sts.amazonaws.com) that aren't really global or resilient.

ranman · on March 9, 2022

STS will let you use regional endpoints now, right?

tyingq · on March 9, 2022

Yes. It's just that the "global endpoint" is misleading. They don't repoint it if it fails. It really shouldn't exist given that's how it functions.
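For anyone who wants to avoid the global endpoint, here is a minimal sketch of building a regional STS endpoint URL. It assumes the standard `aws` partition, where regional endpoints follow the `https://sts.<region>.amazonaws.com` pattern (China regions use a different domain suffix). The function name and the region check are illustrative, not part of any SDK.

```python
import re

# Loose sanity check for region names like "us-east-2" or "ap-southeast-2".
# This is an illustrative pattern, not an official list of valid regions.
_REGION_PATTERN = re.compile(r"[a-z]{2}(-[a-z]+)+-\d+")


def sts_regional_endpoint(region: str) -> str:
    """Build the regional STS endpoint URL for a standard-partition region."""
    if not _REGION_PATTERN.fullmatch(region):
        raise ValueError(f"does not look like an AWS region name: {region!r}")
    return f"https://sts.{region}.amazonaws.com"


print(sts_regional_endpoint("us-east-2"))
```

The resulting URL can be passed to an SDK client as its endpoint, for example `endpoint_url` in boto3. Recent AWS SDKs and the CLI also honor `AWS_STS_REGIONAL_ENDPOINTS=regional` to pick the regional endpoint automatically; check the defaults for your SDK version.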

didip · on March 9, 2022

Multi AZ is great and should be by default, but multi Region is expensive.

hughrr · on March 9, 2022

This. We have multi AZ in more than one region and I occasionally dream of Bezos wearing only a top hat and waistcoat laughing maniacally while diving into a large vat of gold coins.

jamesfinlayson · on March 10, 2022

Not always possible - Australia (currently) only has one region and if you're in a regulated industry (banking or government stuff) they require data to be in Australia.