
When you’re running automation at scale, whether it’s using Ansible contol node or Ansible Automation Platform (AAP) with its automation controller, understanding how to plan your environment capacity is super important. This ensures your automation doesn’t just work, but works reliably and efficiently, even under heavy loads.
Let’s break it down for both scenarios:
- Traditional Ansible control node (CLI Users)
- Ansible automation controller (AAP)
Introduction to Ansible Nodes and Their Roles
When planning capacity for your Ansible environment—whether you are using the open-source Ansible CLI or the enterprise-grade Ansible Automation Platform (AAP)—it’s important to understand the key components involved. Each node type plays a specific role in the automation ecosystem.
Ansible control node (Open Source Ansible CLI)
In the open-source Ansible (CLI-only) setup, the control node is the machine where you install Ansible. This is where you keep playbooks and inventories, execute automation jobs, and manage connections to the target (managed) nodes over SSH or other connection plugins. Everything runs locally on the Control Node, and scaling is done vertically by adding more CPU and memory or distributing workloads across multiple Control Nodes, which has to be managed manually.
Ansible automation controller
In Ansible Automation Platform (AAP), the automation controller provides the web UI, API, and management layer (previously known as Tower). It allows you to centrally manage playbooks, inventories, and credentials, as well as trigger and monitor jobs via UI, API, or schedules. It also supports scaling and clustering using different node types, while providing job logging, reporting, and RBAC capabilities.
Execution node
Execution nodes are the workhorses that run the actual Ansible playbooks. They connect to the managed hosts over SSH or other supported connection methods and handle all task execution. These nodes need to be scaled based on job concurrency and fork usage to ensure jobs do not queue unnecessarily.
Hop node
Hop nodes act as jump or bastion nodes when Execution Nodes cannot directly connect to control nodes to join the automation mesh. These nodes can have a very low CPU and memory footprint and are mainly used for special networking setups where direct communication is not possible.
Hybrid node
Hybrid nodes combine both control plane and execution roles in a single node. This setup is often used in small or demo environments where minimizing node count is important. However, in this case, both control and execution tasks compete for the same system resources, so careful sizing and monitoring are necessary.\
Also learn 8 ways to speed up your Ansible playbooks
Ansible control node Capacity Planning
If you are using the Ansible control node (The CLI way), your capacity planning is mostly about ensuring the machine (or jump server) where you run the Ansible commands has enough juice to handle the load.
Key considerations for CLI users:
- Number of Managed Hosts: How many hosts are you running the playbooks against?
- Concurrency (Forks): The
forks
parameter (default is 5) controls how many hosts Ansible will target simultaneously. More forks = more concurrent SSH sessions = higher CPU & memory usage. - Network Bandwidth: Since Ansible uses SSH, network bandwidth (and latency) between the control machine and managed nodes can become a bottleneck.
- Tasks Per Host: If your playbook is heavy (many tasks, lots of loops), it puts additional CPU/memory strain on the control machine.
Example Quick Math:
If you plan to run against 300 hosts, with forks=10, and average 20 tasks per host, your control machine needs to handle up to 3000 tasks per playbook run, maintaining up to 10 concurrent SSH connections at any given time.
Ansible control node Resource Calculator (CLI-only)
Estimate your Ansible Control Node sizing based on workload.
Tip for CLI Users:
Scale your control machine vertically (more CPU & RAM) if you’re seeing slowness. Also, always monitor CPU, RAM, disk IOPS, and network latency during playbook runs.
Ansible automation controller Capacity Planning
When you move into Ansible Automation Platform (AAP), capacity planning takes a different shape. It’s not just one control machine, but a clustered environment with specialized node roles— controller, execution, database, and optionally hop nodes.
What You Need to Plan For:
- Managed Hosts Count
- Tasks/hour per host
- Concurrency: Max concurrent jobs
- Forks per job
- Node Specs (CPU, RAM, Disk IOPS)
- Job Event Volume & Processing Needs
Example Planning Exercise:
Let’s take a real-world example:
Parameter | Value |
---|---|
Managed Hosts | 1000 |
Tasks per Hour per Host | 1000 (≈16 per min) |
Max Concurrent Jobs | 10 |
Forks per job | 5 |
Average Event Size | 1 MB |
Preferred Node Spec | 4 vCPU, 16 GB RAM, 3000 IOPS |
Calculations:
- Execution Capacity Needed:
(10 jobs * 5 forks) + (10 jobs * 1 base control task)
= 60 execution capacity - Control Capacity Needed: At least 10 control capacity to manage 10 jobs in parallel.
- Event Rate Calculation:
1000 tasks/hour * 1000 hosts
= 1,000,000 tasks/hour. Assuming 6 events per task →1,000,000 tasks * 6 events
= ~6 million events/hour. That’s ~1666 events/sec
Suggested Node Setup (For 1000 hosts workload):
- 2 Control Nodes (4 vCPU, 16 GB RAM, 3000 IOPS each)
- Over-provision control capacity to avoid UI/API delays.
- Adjust capacity settings if needed.
- 2 Execution Nodes (same spec)
- Can handle the 60 execution capacity demand comfortably.
- Add more nodes if you expect job queueing or job isolation needs.
- 1 Database Node (same spec)
- Monitor disk IOPS carefully—ensure enough performance for high event writes.
Key Insight:
With 1000 hosts and such a high task volume, make sure you:
- Regularly monitor event processing rates.
- Reserve enough control node resources purely for event handling.
- Consider horizontally scaling both Control and Execution planes if event processing delays or job queuing are observed.
Also learn 5 ways to make your Ansible modules work faster
Ansible automation controller Sizing Calculator
Estimate your AAP Control and Execution Node requirements.
Node Sizing and Scaling Best Practices
Automation controller nodes
- More CPU & RAM = handle more concurrent jobs + faster event processing.
- Scale vertically (CPU/RAM) to improve event processing.
- Scale horizontally (add nodes) to improve API availability.
Execution nodes
- More CPU & RAM = more forks → more concurrent job executions.
- Vertical scaling increases job density on one node.
- Horizontal scaling (adding nodes) allows job isolation and grouping (prod/dev separation).
Hop nodes:
- Generally low resource needs.
- Monitor network bandwidth carefully if used as a central hub.
Hybrid nodes
- Perform both Control + Execution tasks.
- Scaling vertically increases both control capacity and execution capacity.
- Recommended only if simplicity is your priority; otherwise, keep Control and Execution planes separate.
Key Takeaways
Aspect | Ansible CLI | Ansible Controller (AAP) |
---|---|---|
Scaling Approach | Vertical scaling of control machine | Mix of vertical & horizontal scaling |
Key Bottlenecks | CPU, RAM, SSH session limits | Control & Execution node separation, job events processing |
Concurrency Control | Forks parameter | Job templates, forks, capacity adjustments |
Event Processing | CLI doesn’t have event streaming overhead | Control nodes handle job events, can become bottleneck |
Final Thoughts
Whether you are using Ansible control node or the Ansible automation controller, capacity planning is not a set and forget activity. It’s a cycle of:
- Estimate
- Deploy
- Monitor
- Adjust
Always test your workload in a controlled lab or staging environment before going to production. And when in doubt, overestimate event processing needs—it’s one of the most common causes of Controller UI/API slowness when underestimated.