Building on my last blog post, there are cases where transit gateways let you significantly cut down on the number of resources that you deploy. Where those resources are lightly used, this can lead to a substantial reduction in costs. For example, if you deploy VPCs with NAT gateways so that instances can occasionally get out to the internet (for example, to get OS patches or other software updates[1]), you’re paying almost $400/year for each one, even though they may be mostly idle. The cost problem is exacerbated if you maintain multiple NAT gateways per VPC to ensure high availability. A transit gateway allows you to share a single NAT gateway (or a small set for HA) among several VPCs. A similar argument is readily made for VPN connections if you need to connect your corporate data center to more than one of your VPCs.
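As a rough check on that yearly figure, here is a minimal sketch of the arithmetic. The hourly rate is an assumption chosen to be consistent with "almost $400/year"; the function name and parameters are hypothetical, and the result ignores per-GB data processing charges.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years

# Assumption: an hourly NAT gateway rate of $0.045. Check the current
# price for your region before relying on this number.
ASSUMED_NAT_HOURLY_USD = 0.045


def annual_idle_cost(hourly_rate_usd: float,
                     gateways_per_vpc: int = 1,
                     vpc_count: int = 1) -> float:
    """Fixed yearly charge for always-on gateways, before any data charges."""
    return hourly_rate_usd * HOURS_PER_YEAR * gateways_per_vpc * vpc_count


if __name__ == "__main__":
    # One mostly idle NAT gateway in one VPC.
    print(f"${annual_idle_cost(ASSUMED_NAT_HOURLY_USD):.2f} per year")
    # Two NAT gateways per VPC for HA, across three VPCs.
    print(f"${annual_idle_cost(ASSUMED_NAT_HOURLY_USD, 2, 3):.2f} per year")
```

At the assumed rate, a single gateway comes to roughly $394 per year, and the fixed charge scales linearly with the number of gateways you keep running.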
Here’s an example of three accounts, each with one private and one public network and a VPN connection back to a corporate network. In this example, the public networks exist only to permit instances to access the internet through a NAT gateway. The VPCs are interconnected using VPC peering connections:
[1] If you’re doing OS updates on many instances, you should look into following a ‘Cattle not Pets’ methodology: adopt or create a new AMI to use when redeploying your servers, so that doing updates becomes a matter of replacing old servers with new ones. With a good Continuous Deployment system in place, this can be done on a routine basis. (I wanted to say “on a whim” there, but that might be a tad too cavalier.)
Obviously, there are a lot of different ways this diagram could have been drawn. Many times, you’d want to have at least one public subnet where servers can be accessed from the internet, and most designs have multiple subnets in multiple availability zones to ensure high availability. I’ve omitted those details here to unclutter the diagram and make the discussion easier to follow. For this conversation, the important aspect of the diagram above is that every VPC has its own NAT gateway (and a public subnet to host it), its own internet gateway, and its own VPN gateway back to a corporate data center.
With a transit gateway, multiple accounts can share a NAT gateway and VPN Gateway to reduce the total resource count, and therefore the costs:
In this diagram, the peering connections have been replaced with a TGW and attachments, and the three NAT gateways and three VPN gateways have been consolidated into one of each. This diagram assumes that the TGW is hosted in its own account. Depending on the circumstances, you might deploy the TGW into one of the existing accounts and continue to use a few of the existing resources while eliminating some from the other accounts. The specifics will vary, but there’s always going to be an opportunity to reduce the number of deployed resources by some significant factor.
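To put rough numbers on that reduction, here is a minimal sketch that counts resources in the two designs. It assumes the VPCs are fully meshed with peering connections in the first design and that each VPC needs exactly one TGW attachment in the second. The function names are hypothetical, and the sketch does not count a VPN attachment or any extra VPC that hosts the shared gateways.

```python
def full_mesh_peering_connections(vpc_count: int) -> int:
    """Peering connections needed so every VPC can reach every other VPC."""
    return vpc_count * (vpc_count - 1) // 2


def resources_before(vpc_count: int) -> dict:
    """Each VPC has its own NAT, internet, and VPN gateway, plus mesh peering."""
    return {
        "peering_connections": full_mesh_peering_connections(vpc_count),
        "nat_gateways": vpc_count,
        "internet_gateways": vpc_count,
        "vpn_gateways": vpc_count,
    }


def resources_after(vpc_count: int) -> dict:
    """One TGW with one attachment per VPC and one shared gateway of each kind."""
    return {
        "transit_gateways": 1,
        "tgw_vpc_attachments": vpc_count,
        "nat_gateways": 1,
        "internet_gateways": 1,
        "vpn_gateways": 1,
    }


if __name__ == "__main__":
    for n in (3, 10):
        print(n, resources_before(n), resources_after(n))
```

With three VPCs, the number of connections is the same in both designs (three peering connections versus three attachments), so the reduction comes from the gateways. At ten VPCs, a full mesh needs 45 peering connections while the TGW design needs 10 attachments.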
Savings aren’t guaranteed, because transit gateway attachments have their own costs: each attachment carries an hourly charge, and data transferred over it is billed per gigabyte. A cost-benefit analysis is therefore needed to determine the optimal design. It’s unusual for the utilization of gateway resources to be so heavy that using an attachment wouldn’t be less expensive, but I won’t say it can’t happen. In most cases, the costs of the TGW attachments are far less than those of maintaining an often-idle NAT or VPN gateway.
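One way to approach that cost-benefit analysis is a small model that compares the two designs under your own numbers. The sketch below is illustrative only: the `Rates` class, both functions, and every rate in the usage example are assumptions, so substitute current prices for your region. For simplicity it prices NAT and VPN gateways at a single hourly rate and ignores any attachment needed by a VPC that hosts the shared gateways.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass(frozen=True)
class Rates:
    """All values are assumptions; replace them with current prices."""
    gateway_hourly: float     # one NAT or VPN gateway, USD per hour
    gateway_per_gb: float     # gateway data processing, USD per GB
    attachment_hourly: float  # one TGW attachment, USD per hour
    attachment_per_gb: float  # TGW data processing, USD per GB


def dedicated_cost(rates: Rates, vpc_count: int,
                   gateways_per_vpc: int, gb_per_year: float) -> float:
    """Yearly cost when every VPC runs its own gateways."""
    fixed = vpc_count * gateways_per_vpc * rates.gateway_hourly * HOURS_PER_YEAR
    return fixed + gb_per_year * rates.gateway_per_gb


def shared_cost(rates: Rates, vpc_count: int,
                shared_gateways: int, gb_per_year: float) -> float:
    """Yearly cost when VPCs reach shared gateways through TGW attachments."""
    hourly = (shared_gateways * rates.gateway_hourly
              + vpc_count * rates.attachment_hourly)
    per_gb = rates.gateway_per_gb + rates.attachment_per_gb
    return hourly * HOURS_PER_YEAR + gb_per_year * per_gb


if __name__ == "__main__":
    # Illustrative rates only, not a quote of current pricing.
    rates = Rates(gateway_hourly=0.045, gateway_per_gb=0.045,
                  attachment_hourly=0.05, attachment_per_gb=0.02)
    for gateways in (1, 3):  # e.g. one NAT, or two NATs plus a VPN gateway
        before = dedicated_cost(rates, 3, gateways, gb_per_year=500)
        after = shared_cost(rates, 3, gateways, gb_per_year=500)
        print(f"{gateways} per VPC: dedicated ${before:.2f}, shared ${after:.2f}")
```

Running the model with one gateway per VPC and then with several shows how sensitive the result is to how many gateways you consolidate, which is why the comparison is worth doing for your own design rather than assuming the outcome.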
If it makes economic sense to use shared NAT gateways for VPCs that otherwise need only private subnets, your configuration can be simplified further, because you get to eliminate the public subnets that host the NAT gateway and the corresponding internet gateway. There’s also a potential security win here if you can cut down on the risk of somebody creating a public instance in a VPC where one might not be allowed.
The solution isn’t perfect. Transit gateways don’t support the sharing of resources that need to be available for inbound access from the internet, because you can’t associate an Elastic IP in one VPC with an instance in another. (There are ways to solve that problem if your security model requires complete control of egress and ingress, but they depend on using third-party services.)
This is just a taste of the kinds of things that can be shared. A few other possibilities that seem to be of common interest:
- Sharing a domain controller
- Sharing Service Endpoints
- Centralizing DNS
There are many examples of resource sharing that can benefit from use of the Transit Gateway service. If you’ve got an interesting one that you’d like to discuss, leave a comment below or contact us.