Before You Start¶
AWS networking does not exist in isolation. Every VPC, route table, and Transit Gateway attachment lives inside an account, is governed by IAM policies, is subject to service quotas, and should be deployed through code. Getting these non-networking foundations right before you touch a single CIDR block prevents the class of problems that are hardest to fix later: permission errors that block cross-account sharing, quota exhaustion during production deployments, untagged resources that make cost attribution impossible, and manual configurations that drift silently.
This page covers the prerequisites that sit underneath the networking layer. None of them are networking-specific, but all of them shape how your network architecture operates in practice.

Non-networking foundations — Drawio Source
Identity and Access Management (IAM)¶
IAM is the control plane for everything in AWS. Every API call — creating a VPC, attaching a Transit Gateway, sharing a resource through AWS RAM — is an IAM-authorized action. If your IAM foundation is weak, your networking automation will fail in ways that are difficult to diagnose: cross-account resource sharing silently denied, service-linked roles missing permissions, or deployment pipelines blocked by overly restrictive SCPs.
What matters for networking specifically:
- Cross-account roles and trust policies — Multi-account networking is the norm, not the exception. Transit Gateway sharing, Route 53 Resolver rule associations, VPC Lattice service networks, and AWS RAM shares all require cross-account IAM trust. Design your role trust policies to support the resource-sharing patterns you'll use, not just the account you're currently working in.
- Service-linked roles — Services like AWS Transit Gateway, Amazon VPC Lattice, and AWS Network Firewall create service-linked roles automatically. These roles have specific permissions that the service needs to operate. Understand that they exist, that they're created on first use, and that SCPs or permission boundaries that block
iam:CreateServiceLinkedRolewill break service provisioning. - Service Control Policies (SCPs) — In an AWS Organizations environment, SCPs set the maximum permissions boundary for every account. A common mistake is deploying an SCP that restricts
ec2:*actions without realizing that VPC, subnet, route table, security group, and ENI operations all live under theec2:namespace. Test SCPs against networking actions explicitly. - Condition keys for network resources — IAM supports condition keys like
ec2:Vpc,ec2:Subnet, andaws:RequestedRegionthat let you scope permissions to specific network boundaries. Use them to prevent accidental resource creation in the wrong VPC or Region.
Key insight: The most common IAM failure in networking is not "permission denied on a single resource" — it's a cross-account trust relationship that works in development (same account) and breaks in production (different account, different Organization). Test cross-account IAM paths early.
Infrastructure as Code¶
Networking infrastructure is almost always managed as code, and for good reason: network resources are long-lived, interdependent, and shared across teams. A VPC created manually in the console becomes a liability the moment someone needs to replicate it, audit it, or understand why a route exists. The three primary tools for AWS infrastructure as code are AWS CloudFormation, Terraform, and AWS CDK, and each has a different operational model.
Choosing your tool:
- AWS CloudFormation — Native AWS service, tightly integrated with AWS Organizations (StackSets for multi-account deployment), drift detection, and change sets. Best when your environment is AWS-only and you want deployment orchestration managed by AWS. Networking resources are well-supported, including custom resources for operations CloudFormation doesn't cover natively.
- Terraform — Provider-based, state-file-driven, supports multi-cloud. The AWS provider covers networking resources comprehensively. Best when your team already uses Terraform, when you need to manage resources across multiple clouds, or when you prefer HCL's declarative style. Requires managing state files (use remote backends like S3 + DynamoDB locking).
- AWS CDK — Generates CloudFormation under the hood, but lets you define infrastructure in TypeScript, Python, Java, Go, or C#. Best when your team prefers general-purpose programming languages and wants to build reusable constructs. The
aws-ec2module provides L2 constructs for VPCs, subnets, and routing that handle common patterns with sensible defaults.
Patterns that matter for networking IaC:
- Separate network infrastructure from workload infrastructure — Network resources (VPCs, Transit Gateway attachments, route tables) change on a different cadence than application resources. Deploy them in separate stacks or state files so that a workload deployment cannot accidentally modify shared network infrastructure.
- Use outputs and cross-stack references — Networking stacks produce values (VPC IDs, subnet IDs, route table IDs) that workload stacks consume. Export these cleanly through CloudFormation exports, Terraform remote state data sources, or SSM Parameter Store.
- Parameterize CIDR blocks — Never hardcode CIDR ranges. Pass them as parameters or look them up from IPAM. This is what makes your templates reusable across environments.
- Test with ephemeral environments — Spin up a complete network stack in a test account, validate connectivity, and tear it down. This catches quota issues, IAM trust failures, and routing errors before they reach production.
Key insight: The tool matters less than the discipline. Pick one, use it consistently, and never allow manual console changes to network resources in production accounts. Drift between code and reality is the single largest source of networking incidents in mature AWS environments.
Service Quotas¶
AWS imposes default limits on networking resources, and many of them are lower than you'd expect. A default of 5 VPCs per Region sounds generous until you're running a multi-account landing zone where each account needs at least one VPC. A default of 50 routes per route table sounds fine until you peer with 60 VPCs. Quota exhaustion during a production deployment is an avoidable incident.
Quotas that catch networking teams most often:
| Resource | Default Limit | Why It Matters |
|---|---|---|
| VPCs per Region | 5 | Multi-account environments hit this quickly in shared-services accounts |
| Subnets per VPC | 200 | Rarely an issue, but large multi-AZ designs with many tiers can approach it |
| Routes per route table | 50 | Transit Gateway and VPC peering each consume a route entry per destination |
| Security groups per ENI | 5 | Microservice architectures with fine-grained SG rules hit this |
| VPC peering connections per VPC | 50 | Mesh topologies exhaust this; Transit Gateway is the answer |
| Transit Gateway attachments | 5,000 | Large organizations with hundreds of accounts approach this |
| Participants per RAM share | 5,000 | Relevant for Transit Gateway and VPC Lattice sharing at scale |
Best practices for quota management:
- Audit quotas before designing — Run
aws service-quotas list-service-quotas --service-code ec2and--service-code vpcin every Region you plan to use. Compare defaults against your architecture's resource count. - Request increases proactively — Quota increase requests can take hours to days. Submit them as part of your environment provisioning, not when a deployment fails.
- Monitor applied quotas in CloudWatch — Service Quotas publishes utilization metrics. Set alarms at 80% utilization so you have time to request increases before hitting the wall.
- Account for quotas in your IaC — If your Terraform or CloudFormation template creates resources that approach a quota, document it. Better yet, add a pre-deployment check.
Key insight: Quotas are not bugs — they're guardrails. But the default values assume a single-workload account, not a shared-services or networking account. Treat quota planning as part of architecture design, not as an afterthought.
IPv6 Readiness¶
IPv6 is not a future consideration — it's a current requirement for modern AWS networking. Every new VPC should be dual-stack from day one, and your non-networking foundations must be ready to support that. IPv6 readiness failures are subtle: a VPC with IPv6 enabled but security groups that only have IPv4 rules, or IaC templates that create route tables without ::/0 entries, or NACLs that block IPv6 traffic silently.
What IPv6 readiness means for your prerequisites:
- Security groups need explicit IPv6 rules — IPv4 rules do not automatically apply to IPv6 traffic. A security group that allows
0.0.0.0/0on port 443 does not allow[::]/0on port 443. Every security group in your IaC templates must include both protocol families. - NACLs must permit IPv6 — The default NACL allows all traffic (IPv4 and IPv6), but custom NACLs need explicit IPv6 entries. A custom NACL that allows
0.0.0.0/0inbound does not allow IPv6 inbound. - IaC templates must include dual-stack from the start — VPC templates should allocate both an IPv4 CIDR and an Amazon-provided IPv6
/56. Subnet templates should assign both an IPv4 CIDR and an IPv6/64. Route tables need both0.0.0.0/0and::/0entries. Retrofitting IPv6 into existing templates is significantly more work than including it from the beginning. - IAM policies referencing
ec2:actions — IPv6-specific actions (likeec2:AssignIpv6Addresses) must not be inadvertently blocked by SCPs or permission boundaries that restrictec2:*actions. - DNS resolution — Applications must handle dual-stack DNS responses (AAAA records alongside A records). Test that your application code, health checks, and monitoring tools work correctly when DNS returns IPv6 addresses.
Key insight: The most common IPv6 failure is not "the VPC doesn't support it" — it's "the security group only has IPv4 rules." Build IPv6 into your IaC templates, security group modules, and NACL baselines from the start, and this entire class of problems disappears.
Tagging Strategy¶
Tags are metadata, but in AWS networking they're also the mechanism for cost allocation, automation targeting, and policy enforcement. An untagged VPC is a VPC that no one can attribute cost to, no automation can reliably target, and no compliance rule can evaluate. In a multi-account networking environment with hundreds of VPCs and thousands of subnets, consistent tagging is the difference between operational visibility and chaos.
A minimum viable tagging standard for network resources:
Environment—production,staging,development. Gates automation behavior (for example, don't auto-delete production resources).Owner— The team responsible for the resource. Critical when a Transit Gateway attachment is consuming capacity and you need to know who to contact.CostCenter— For billing allocation. Network resources (NAT gateways, Transit Gateway attachments, VPC endpoints) generate significant cost that must be attributed.ManagedBy—cloudformation,terraform,cdk, ormanual. Identifies how the resource is managed and whether it's safe to modify.Name— Descriptive and consistent. Use a naming convention like{env}-{region}-{purpose}-{type}(for example,prod-use1-shared-tgw).
Enforcement mechanisms:
- AWS Organizations Tag Policies — Define required tags and allowed values at the Organization level. Non-compliant resources are flagged (or blocked, depending on enforcement mode).
- SCP-based enforcement — Deny
ec2:CreateVpc,ec2:CreateSubnet, and other networking actions unless required tags are present in the request. - AWS Config rules —
required-tagsrule evaluates existing resources and flags non-compliance for remediation.
Key insight: Enforce tagging at creation time, not after the fact. Retroactively tagging hundreds of network resources is painful and error-prone. An SCP or IAM policy condition that requires tags on resource creation is far cheaper than a quarterly tagging remediation effort.
AWS Well-Architected Framework¶
The Well-Architected Framework provides the architectural principles that networking decisions should align with. Two pillars are directly relevant to network architecture: Security and Reliability. A third — Cost Optimization — becomes critical once you're running Transit Gateway, NAT gateways, and VPC endpoints at scale.
Security Pillar — what applies to networking:
- Defense in depth — Layer security controls: security groups at the ENI, NACLs at the subnet, Network Firewall or GWLB at the VPC perimeter, and AWS WAF at the application edge. No single layer is sufficient alone.
- Least privilege for network access — Default-deny security groups, explicit route table entries, and PrivateLink instead of public endpoints wherever possible.
- Reduce attack surface — Private subnets by default, VPC endpoints for AWS service access, no public IPs unless explicitly required.
Reliability Pillar — what applies to networking:
- Multi-AZ by default — Every networking component (NAT gateway, Network Firewall endpoints, load balancers) should be deployed across multiple Availability Zones. Single-AZ networking is a single point of failure for everything above it.
- Limit blast radius — Use multiple VPCs and accounts to contain failures. A routing misconfiguration in one VPC should not propagate to others.
- Test failure modes — Simulate Availability Zone failure, Transit Gateway attachment loss, and DNS resolution failure. Know what breaks and how it recovers.
Cost Optimization Pillar — what applies to networking:
- Understand data transfer pricing — Cross-AZ, cross-Region, Transit Gateway processing, NAT gateway processing, and VPC endpoint hourly charges all add up. Architect to minimize unnecessary data movement.
- Use VPC endpoints to avoid NAT costs — Traffic to S3, DynamoDB, and other AWS services through VPC endpoints avoids NAT gateway processing charges entirely.
- Right-size your connectivity — Don't deploy Transit Gateway when VPC peering suffices for two VPCs. Don't deploy AWS Cloud WAN when a single-Region Transit Gateway covers your topology.
Key insight: Well-Architected is not a checklist to complete after building — it's a set of principles to apply during design. Review the Security and Reliability pillars before you draw your first network diagram, not after.
Documentation¶
-
IAM User Guide
Identity and access management fundamentals, policies, roles, and cross-account access patterns.
-
AWS CloudFormation
AWS-native infrastructure as code with StackSets, drift detection, and change sets.
-
Terraform AWS Provider
HashiCorp's AWS provider documentation covering all networking resources.
-
Service Quotas
View, manage, and request increases for AWS service limits.
-
Tagging Best Practices
AWS tagging strategies, tag policies, and enforcement mechanisms.
-
Well-Architected Framework
Architectural best practices across Security, Reliability, Performance, Cost, and Operational Excellence.
Next Steps¶
With these foundations in place, work through the Foundation topics in this order:
- AWS Organizations — Multi-account structure that governs how network resources are shared and isolated
- Amazon VPC — The isolated virtual network that everything else builds on
- Regions and Availability Zones — Geographic distribution and the multi-AZ patterns that drive reliability
- CIDR Planning — IP address allocation strategy that prevents conflicts at scale
- Subnets — Network segmentation within a VPC
- IPAM — Centralized IP address management for multi-account environments