Network and security
Atlas uses one shared VPC per root, split across two availability zones with explicit security group boundaries between the edge, workloads, database, and Kafka.
Network shape
| Concern | Current implementation |
|---|---|
| Availability zones | 2 dynamically selected AZs |
| Subnets | 2 public and 2 private subnets |
| Internet egress | 1 NAT gateway per public subnet |
| Private S3 path | S3 gateway VPC endpoint attached to private route tables |
| Flow logs | VPC Flow Logs to CloudWatch |
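The network shape in the table can be sketched in Terraform roughly as follows. This is a minimal illustration, not the Atlas module itself: every resource name, variable, and CIDR below is an assumption, `domain = "vpc"` on the Elastic IP assumes AWS provider v5 or later, and flow logs are left out to keep the sketch short.

```hcl
# Sketch only: names, variables, and CIDRs are illustrative assumptions.
variable "aws_region" {
  type    = string
  default = "us-east-1" # placeholder
}

variable "vpc_cidr" {
  type    = string
  default = "10.0.0.0/16" # placeholder
}

# Select two AZs at plan time instead of hard-coding their names.
data "aws_availability_zones" "available" {
  state = "available"
}

locals {
  azs = slice(data.aws_availability_zones.available.names, 0, 2)
}

resource "aws_vpc" "this" {
  cidr_block           = var.vpc_cidr
  enable_dns_support   = true
  enable_dns_hostnames = true
}

resource "aws_internet_gateway" "this" {
  vpc_id = aws_vpc.this.id
}

# One public and one private subnet per AZ (2 + 2 in total).
resource "aws_subnet" "public" {
  count                   = 2
  vpc_id                  = aws_vpc.this.id
  availability_zone       = local.azs[count.index]
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index)
  map_public_ip_on_launch = true
}

resource "aws_subnet" "private" {
  count             = 2
  vpc_id            = aws_vpc.this.id
  availability_zone = local.azs[count.index]
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index + 10)
}

# Public subnets share one route table that points at the internet gateway.
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.this.id
}

resource "aws_route" "public_internet" {
  route_table_id         = aws_route_table.public.id
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.this.id
}

resource "aws_route_table_association" "public" {
  count          = 2
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

# One NAT gateway per public subnet, each with its own Elastic IP.
resource "aws_eip" "nat" {
  count  = 2
  domain = "vpc"
}

resource "aws_nat_gateway" "this" {
  count         = 2
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id
  depends_on    = [aws_internet_gateway.this]
}

# Each private route table egresses through the NAT gateway in its own AZ.
resource "aws_route_table" "private" {
  count  = 2
  vpc_id = aws_vpc.this.id
}

resource "aws_route" "private_nat" {
  count                  = 2
  route_table_id         = aws_route_table.private[count.index].id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.this[count.index].id
}

resource "aws_route_table_association" "private" {
  count          = 2
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}

# S3 gateway endpoint attached to the private route tables only.
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.this.id
  service_name      = "com.amazonaws.${var.aws_region}.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = aws_route_table.private[*].id
}
```

Attaching the endpoint only to the private route tables means S3 traffic from private workloads bypasses the NAT gateways, while everything else still egresses through them.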
Security group model
| Security group | Allows |
|---|---|
| alb-sg | inbound 80 and 443 from alb_ingress_cidrs, outbound unrestricted |
| ecs-sg | inbound 8080 only from alb-sg, outbound unrestricted |
| dashboard backend SG | inbound target port from alb-sg, outbound unrestricted |
| msk-connect-sg | no inbound, outbound unrestricted for connector workers |
| msk-sg | inbound 9098 from ecs-sg and msk-connect-sg, inbound 9198 from msk_public_access_cidrs |
| RDS SG | inbound 5432 from allowed CIDRs plus the dashboard backend SG |
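As an illustration of the first two rows, the sketch below chains alb-sg and ecs-sg by security group reference rather than by CIDR. The resource names and the vpc_id variable are assumptions; the ports and the alb_ingress_cidrs input come from the table. It uses the standalone rule resources from the AWS provider, which may differ from how the Atlas module declares its rules.

```hcl
# Sketch only: resource names and the vpc_id variable are assumptions.
variable "vpc_id" { type = string }

variable "alb_ingress_cidrs" {
  type = list(string)
}

resource "aws_security_group" "alb" {
  name   = "alb-sg"
  vpc_id = var.vpc_id
}

resource "aws_security_group" "ecs" {
  name   = "ecs-sg"
  vpc_id = var.vpc_id
}

locals {
  # One entry per (port, CIDR) pair, keyed so for_each gets stable addresses.
  alb_ingress = {
    for rule in flatten([
      for port in [80, 443] : [
        for cidr in var.alb_ingress_cidrs : { port = port, cidr = cidr }
      ]
    ]) : "${rule.port}-${rule.cidr}" => rule
  }
}

# alb-sg: inbound 80 and 443 from every CIDR in alb_ingress_cidrs.
resource "aws_vpc_security_group_ingress_rule" "alb_web" {
  for_each          = local.alb_ingress
  security_group_id = aws_security_group.alb.id
  ip_protocol       = "tcp"
  from_port         = each.value.port
  to_port           = each.value.port
  cidr_ipv4         = each.value.cidr
}

# ecs-sg: inbound 8080 only from members of alb-sg, never from a CIDR.
resource "aws_vpc_security_group_ingress_rule" "ecs_from_alb" {
  security_group_id            = aws_security_group.ecs.id
  referenced_security_group_id = aws_security_group.alb.id
  ip_protocol                  = "tcp"
  from_port                    = 8080
  to_port                      = 8080
}

# Outbound unrestricted on both groups. Terraform removes the default
# allow-all egress rule on groups it creates, so it is declared explicitly.
resource "aws_vpc_security_group_egress_rule" "alb_all" {
  security_group_id = aws_security_group.alb.id
  ip_protocol       = "-1"
  cidr_ipv4         = "0.0.0.0/0"
}

resource "aws_vpc_security_group_egress_rule" "ecs_all" {
  security_group_id = aws_security_group.ecs.id
  ip_protocol       = "-1"
  cidr_ipv4         = "0.0.0.0/0"
}
```

Referencing alb-sg instead of a subnet CIDR keeps the workload boundary intact even if the ALB's subnets or addresses change.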
Ingress model
- The ALB is internet-facing and sits in public subnets.
- Port 80 redirects to 443.
- The default HTTPS listener action forwards to the events ingestion target group.
- Additional listener rules route the dashboard backend and Kafka UI by hostname.
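The listener layout above could look roughly like this in Terraform. All variables, the rule priority, and the choice of a permanent (301) redirect are assumptions made for the sketch; only the redirect, the default forward to events ingestion, and hostname-based rules come from the list.

```hcl
# Sketch only: every variable below is a hypothetical input.
variable "alb_arn" { type = string }
variable "certificate_arn" { type = string }
variable "events_target_group_arn" { type = string }
variable "dashboard_target_group_arn" { type = string }

variable "dashboard_hostname" {
  type = string # for example "dashboard.example.com"
}

# Port 80 only redirects to HTTPS.
resource "aws_lb_listener" "http" {
  load_balancer_arn = var.alb_arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type = "redirect"

    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301" # assumption: permanent redirect
    }
  }
}

# Default HTTPS action: forward to the events ingestion target group.
resource "aws_lb_listener" "https" {
  load_balancer_arn = var.alb_arn
  port              = 443
  protocol          = "HTTPS"
  certificate_arn   = var.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = var.events_target_group_arn
  }
}

# Hostname rule: requests for the dashboard host go to the dashboard backend.
resource "aws_lb_listener_rule" "dashboard" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 10 # assumption: any priority unique on this listener

  action {
    type             = "forward"
    target_group_arn = var.dashboard_target_group_arn
  }

  condition {
    host_header {
      values = [var.dashboard_hostname]
    }
  }
}
```

A Kafka UI rule would follow the same pattern with its own hostname, target group, and priority.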
Current exposure notes
ALB exposure
Both the staging example values and the committed production values currently set alb_ingress_cidrs = ["0.0.0.0/0"]. The module supports a tighter allow-list, but as committed the ALB accepts traffic from any IPv4 address.
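A tighter allow-list is a values-only change. The fragment below is a hypothetical tfvars entry; the CIDRs are reserved documentation ranges standing in for real office or partner egress addresses.

```hcl
# Hypothetical tfvars fragment: replace the placeholder ranges with real ones.
alb_ingress_cidrs = [
  "203.0.113.0/24",   # placeholder: office egress range
  "198.51.100.17/32", # placeholder: single partner address
]
```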
MSK public access
The VPC module currently opens inbound port 9198 on msk-sg to msk_public_access_cidrs. That enables public IAM + TLS access for external clients such as ClickHouse validation flows.
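One way to express this rule, sketched under the assumption that the module receives the msk-sg ID as an input and uses standalone rule resources, is a for_each over the CIDR list:

```hcl
# Sketch only: msk_security_group_id is a hypothetical input.
variable "msk_security_group_id" { type = string }

variable "msk_public_access_cidrs" {
  type    = list(string)
  default = []
}

# One ingress rule per allowed CIDR on the public IAM port.
resource "aws_vpc_security_group_ingress_rule" "msk_public_iam" {
  for_each          = toset(var.msk_public_access_cidrs)
  security_group_id = var.msk_security_group_id
  description       = "Public IAM + TLS access for external clients"
  ip_protocol       = "tcp"
  from_port         = 9198
  to_port           = 9198
  cidr_ipv4         = each.value
}
```

With this shape, an empty msk_public_access_cidrs list creates no rules at all, so public access on 9198 can be closed by changing values alone.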
Dashboard database access
The staging example keeps RDS on a public subnet group with publicly_accessible = true and open CIDR defaults. Production committed values move the database to private subnets and disable public accessibility.
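The production posture can be sketched as a private subnet group plus publicly_accessible = false. Everything else here is an assumption for illustration: the identifiers, instance class, storage size, username, and the PostgreSQL engine (inferred only from port 5432 in the security group table).

```hcl
# Sketch only: inputs, identifiers, and sizing are hypothetical.
variable "private_subnet_ids" { type = list(string) }
variable "rds_security_group_id" { type = string }

resource "aws_db_subnet_group" "dashboard" {
  name       = "dashboard-private" # hypothetical name
  subnet_ids = var.private_subnet_ids
}

resource "aws_db_instance" "dashboard" {
  identifier        = "dashboard-db" # hypothetical identifier
  engine            = "postgres"     # assumption based on port 5432
  instance_class    = "db.t4g.micro" # placeholder size
  allocated_storage = 20             # placeholder, in GiB
  username          = "dashboard"    # placeholder

  # Let RDS generate the master password and store it in Secrets Manager.
  manage_master_user_password = true

  # Production-style placement: private subnets, no public endpoint.
  db_subnet_group_name   = aws_db_subnet_group.dashboard.name
  vpc_security_group_ids = [var.rds_security_group_id]
  publicly_accessible    = false
}
```

The staging example differs in the two placement inputs: a subnet group built from public subnets and publicly_accessible = true.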
Operational implications
- ECS tasks stay in private subnets with assign_public_ip = false.
- Hostname routing is managed at the ALB listener level, not inside a separate ingress service.
- Security hardening happens primarily through input values, not by changing the root module graph.
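The first point corresponds to the network_configuration block of an ECS service. The sketch below assumes Fargate with awsvpc networking and uses hypothetical inputs and names throughout; only the private subnets and assign_public_ip = false are taken from the list above.

```hcl
# Sketch only: all inputs and the service name are hypothetical.
variable "ecs_cluster_arn" { type = string }
variable "task_definition_arn" { type = string }
variable "private_subnet_ids" { type = list(string) }
variable "ecs_security_group_id" { type = string }

resource "aws_ecs_service" "events" {
  name            = "events-ingestion" # hypothetical name
  cluster         = var.ecs_cluster_arn
  task_definition = var.task_definition_arn
  desired_count   = 2         # placeholder
  launch_type     = "FARGATE" # assumption

  network_configuration {
    subnets          = var.private_subnet_ids
    security_groups  = [var.ecs_security_group_id]
    assign_public_ip = false # tasks reach the internet only via NAT
  }
}
```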
Some older OpenSpec pages describe stricter or different networking assumptions. The current Terraform code is the source of truth for what Atlas actually provisions today.