
Ansible Roles: Reusable Automation Packages

Playbooks get unwieldy fast. What starts as a clean 30-line YAML file turns into a 400-line monster with duplicated tasks, inconsistent variable names, and no hope of reuse across projects. Ansible roles fix this by giving your automation a standardized structure that you can share, version, and compose like building blocks.

Original content from computingforgeeks.com - post 165301

This guide walks through building two real roles from scratch: a base_system role that handles OS-level setup across Rocky Linux and Ubuntu, and an nginx_app role that depends on it. Along the way, we cover the directory structure, variable precedence, import_role vs include_role, dependency chains, and a real cross-platform gotcha that will bite you if you don’t plan for it. If you’ve been writing Ansible playbooks and want to level up, roles are the next step.

Current as of April 2026. Verified on Rocky Linux 10.1 and Ubuntu 24.04 with ansible-core 2.16.14 and community.general 12.5.0.

Prerequisites

Before starting, you need:

  • Ansible installed on a control node (tested with ansible-core 2.16.14 on Rocky Linux 10.1)
  • Two or more managed nodes with SSH access configured. This guide uses Rocky Linux 10.1 (10.0.1.11) and Ubuntu 24.04 (10.0.1.12)
  • Working knowledge of Ansible playbooks and ad-hoc commands
  • The community.general and ansible.posix collections installed (ansible-galaxy collection install community.general ansible.posix)

If you haven’t set up Ansible yet, follow Install and Configure Ansible on Linux first. Keep the Ansible cheat sheet handy as a quick reference while working through this guide.

What Is an Ansible Role?

A role is a standardized directory structure that packages related tasks, handlers, variables, templates, and files into a single reusable unit. Instead of cramming everything into one playbook, you break your automation into roles like base_system, nginx_app, postgresql, each handling one concern.

The practical benefits are significant. Roles enforce the DRY principle: write your base system setup once, use it in every project. Teams can own specific roles independently. You can version them with Git tags and publish them to Ansible Galaxy for the community. Most importantly, roles make your automation testable because each role has a clear boundary and a predictable interface through variables.

Role Directory Structure

The ansible-galaxy command scaffolds the standard role layout for you:

ansible-galaxy role init roles/base_system

The output confirms the role was created:

- Role roles/base_system was created successfully

List the generated files to see the full structure:

find roles/base_system -type f | sort

The layout looks like this:

roles/base_system/defaults/main.yml
roles/base_system/handlers/main.yml
roles/base_system/meta/main.yml
roles/base_system/README.md
roles/base_system/tasks/main.yml
roles/base_system/tests/inventory
roles/base_system/tests/test.yml
roles/base_system/vars/main.yml

Each directory serves a specific purpose. Here’s what goes where:

| Directory | Purpose | When You Need It |
|---|---|---|
| defaults/ | Default variable values (lowest precedence) | Always. This is the role’s public API |
| tasks/ | The main task list executed by the role | Always. The core of the role |
| handlers/ | Handlers triggered by notify in tasks | When tasks need service restarts or reloads |
| vars/ | High-precedence variables (overrides defaults) | For values that should rarely be changed |
| templates/ | Jinja2 templates deployed with template module | Config files that need variable substitution |
| files/ | Static files deployed with copy module | Scripts, certificates, anything deployed as-is |
| meta/ | Role metadata and dependency declarations | When publishing to Galaxy or chaining roles |
| tests/ | Test playbook and inventory for CI/CD | When testing the role in isolation |

You won’t always need every directory. A simple role might only have tasks/ and defaults/. Delete what you don’t use.

Build Your First Role: base_system

This role handles the common ground that every server needs: base packages, an admin user, timezone, and firewall ports. The key challenge is making it work across both RHEL and Debian families without separate roles.

Define Defaults

Open the defaults file. These values serve as the role’s public interface, and consumers can override any of them:

vi roles/base_system/defaults/main.yml

Add the following variable definitions:

---
base_packages_rhel:
  - vim-enhanced
  - tmux
  - curl
  - wget

base_packages_debian:
  - vim
  - tmux
  - curl
  - wget

firewall_allowed_ports:
  - "22/tcp"

admin_user: deployer

Notice the separate package lists for each OS family. Package names differ between distributions (vim-enhanced on RHEL, vim on Debian), and trying to unify them into one list leads to failures.

Write the Tasks

The tasks file is where the actual work happens. Open it:

vi roles/base_system/tasks/main.yml

Add the OS-aware task definitions:

---
- name: Install base packages (RHEL)
  ansible.builtin.dnf:
    name: "{{ base_packages_rhel }}"
    state: present
  when: ansible_os_family == "RedHat"

- name: Install base packages (Debian)
  ansible.builtin.apt:
    name: "{{ base_packages_debian }}"
    state: present
    update_cache: true
  when: ansible_os_family == "Debian"

- name: Create admin user
  ansible.builtin.user:
    name: "{{ admin_user }}"
    groups: "{{ (ansible_os_family == 'RedHat') | ternary('wheel', 'sudo') }}"
    append: true
    shell: /bin/bash
    create_home: true

- name: Set timezone to UTC
  community.general.timezone:
    name: UTC
  notify: Restart cron

- name: Configure firewall ports (RHEL)
  ansible.posix.firewalld:
    port: "{{ item }}"
    permanent: true
    state: enabled
    immediate: true
  loop: "{{ firewall_allowed_ports }}"
  when: ansible_os_family == "RedHat"

Two patterns worth noting here. The when: ansible_os_family conditionals let the same role run on both RHEL and Debian without branching into separate roles. The ternary filter on the user task picks wheel on RHEL or sudo on Debian for the admin group. We’ll come back to why that ternary is critical in the troubleshooting section.
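The task list above only opens ports on RHEL-family hosts through firewalld. As a hedged sketch of the same when pattern on the Debian side, the task below uses the community.general.ufw module and splits each "port/proto" entry from firewall_allowed_ports. It is not part of the original role, and it assumes ufw is installed on the target hosts.

```yaml
# Sketch only: could be appended to roles/base_system/tasks/main.yml.
# Assumes ufw is already installed on the Debian/Ubuntu hosts.
- name: Configure firewall ports (Debian)
  community.general.ufw:
    rule: allow
    # "80/tcp" -> port "80", proto "tcp"
    port: "{{ item.split('/')[0] }}"
    proto: "{{ item.split('/')[1] }}"
  loop: "{{ firewall_allowed_ports }}"
  when: ansible_os_family == "Debian"
```

This only adds allow rules. Turning ufw on is a separate step (the module’s state: enabled), and it is safest to do that only after the SSH rule is in place.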

Add Handlers

Handlers run only when triggered by a notify directive, which keeps them from executing unnecessarily. Open the handlers file:

vi roles/base_system/handlers/main.yml

The cron service has different names across OS families, so the handler accounts for that:

---
- name: Restart cron
  ansible.builtin.service:
    name: "{{ (ansible_os_family == 'RedHat') | ternary('crond', 'cron') }}"
    state: restarted

When the timezone task reports a change, Ansible triggers this handler at the end of the play. If the timezone was already set to UTC, the handler never fires.

Build a Dependent Role: nginx_app

Real infrastructure involves layers. A web server role shouldn’t reinstall base packages or recreate the admin user. It should declare a dependency on base_system and focus on its own job. That’s what meta/main.yml dependencies do.

Scaffold the second role:

ansible-galaxy role init roles/nginx_app

Declare the Dependency

Edit the metadata file to declare base_system as a dependency and pass custom variables (opening HTTP/HTTPS ports in addition to SSH):

vi roles/nginx_app/meta/main.yml

Add the following:

---
galaxy_info:
  author: John Kibet
  description: Deploy and configure Nginx web server
  license: MIT
  min_ansible_version: "2.16"
  platforms:
    - name: EL
      versions: ["10"]
    - name: Ubuntu
      versions: [noble]

dependencies:
  - role: base_system
    vars:
      firewall_allowed_ports:
        - "22/tcp"
        - "80/tcp"
        - "443/tcp"

When Ansible applies nginx_app, it first resolves and executes base_system with the overridden firewall_allowed_ports list. The dependency runs before any tasks in nginx_app itself.

Role Defaults and Tasks

Set the Nginx defaults:

vi roles/nginx_app/defaults/main.yml

These are the variables consumers can override when including the role:

---
nginx_port: 80
nginx_server_name: _
nginx_root: /usr/share/nginx/html

Now write the tasks. Open the tasks file:

vi roles/nginx_app/tasks/main.yml

The tasks handle installation, configuration, and service management across both OS families:

---
- name: Install Nginx (RHEL)
  ansible.builtin.dnf:
    name: nginx
    state: present
  when: ansible_os_family == "RedHat"

- name: Install Nginx (Debian)
  ansible.builtin.apt:
    name: nginx
    state: present
  when: ansible_os_family == "Debian"

- name: Deploy Nginx configuration
  ansible.builtin.template:
    src: nginx.conf.j2
    dest: "{{ (ansible_os_family == 'RedHat') | ternary('/etc/nginx/conf.d/app.conf', '/etc/nginx/sites-enabled/app.conf') }}"
    mode: "0644"
  notify: Reload Nginx

- name: Remove default site (Debian)
  ansible.builtin.file:
    path: /etc/nginx/sites-enabled/default
    state: absent
  when: ansible_os_family == "Debian"
  notify: Reload Nginx

- name: Start and enable Nginx
  ansible.builtin.service:
    name: nginx
    state: started
    enabled: true

The config path differs between distributions: RHEL uses /etc/nginx/conf.d/ while Debian uses /etc/nginx/sites-enabled/. The ternary filter handles this cleanly in one task instead of duplicating it with when conditionals.

Create the Jinja2 Template

Create the templates directory and add the Nginx config template:

mkdir -p roles/nginx_app/templates

Edit the template file:

vi roles/nginx_app/templates/nginx.conf.j2

The template uses role variables for all configurable values:

server {
    listen {{ nginx_port }};
    server_name {{ nginx_server_name }};
    root {{ nginx_root }};
    index index.html;
    location / {
        try_files $uri $uri/ =404;
    }
}

Don’t forget the handler. Create roles/nginx_app/handlers/main.yml:

vi roles/nginx_app/handlers/main.yml

Add the reload handler:

---
- name: Reload Nginx
  ansible.builtin.service:
    name: nginx
    state: reloaded

Run the Roles

Create a playbook that applies the nginx_app role (which automatically pulls in base_system as a dependency):

vi site.yml

The playbook is minimal because all the logic lives in the roles:

---
- name: Configure servers with roles
  hosts: all
  become: true
  roles:
    - nginx_app

Execute it:

ansible-playbook -i inventory site.yml

The output shows the full dependency chain in action. Notice how base_system tasks run first on both hosts, then nginx_app tasks follow:

PLAY [Configure servers with roles] ********************************************

TASK [Gathering Facts] *********************************************************
ok: [managed-rocky]
ok: [managed-ubuntu]

TASK [base_system : Install base packages (RHEL)] ******************************
skipping: [managed-ubuntu]
ok: [managed-rocky]

TASK [base_system : Install base packages (Debian)] ****************************
skipping: [managed-rocky]
ok: [managed-ubuntu]

TASK [base_system : Create admin user] *****************************************
ok: [managed-rocky]
changed: [managed-ubuntu]

TASK [base_system : Set timezone to UTC] ***************************************
ok: [managed-rocky]
changed: [managed-ubuntu]

TASK [base_system : Configure firewall ports (RHEL)] ***************************
skipping: [managed-ubuntu]
ok: [managed-rocky] => (item=22/tcp)
ok: [managed-rocky] => (item=80/tcp)
ok: [managed-rocky] => (item=443/tcp)

TASK [nginx_app : Install Nginx (RHEL)] ****************************************
skipping: [managed-ubuntu]
ok: [managed-rocky]

TASK [nginx_app : Install Nginx (Debian)] **************************************
skipping: [managed-rocky]
changed: [managed-ubuntu]

TASK [nginx_app : Deploy Nginx configuration] **********************************
ok: [managed-rocky]
changed: [managed-ubuntu]

TASK [nginx_app : Remove default site (Debian)] ********************************
skipping: [managed-rocky]
changed: [managed-ubuntu]

TASK [nginx_app : Start and enable Nginx] **************************************
ok: [managed-rocky]
ok: [managed-ubuntu]

RUNNING HANDLER [base_system : Restart cron] ***********************************
changed: [managed-ubuntu]

RUNNING HANDLER [nginx_app : Reload Nginx] *************************************
changed: [managed-ubuntu]

PLAY RECAP *********************************************************************
managed-rocky              : ok=8    changed=0    unreachable=0    failed=0    skipped=3    rescued=0    ignored=0
managed-ubuntu             : ok=10   changed=7    unreachable=0    failed=0    skipped=3    rescued=0    ignored=0

The Rocky node shows changed=0 because the roles were already applied during testing. The Ubuntu node shows changed=7 for the first run. This is idempotency at work: run it again and both will show zero changes.
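The ansible-playbook command above reads an inventory that this guide does not show. A minimal sketch in YAML format, using the host names from the play output and the addresses from the prerequisites, could look like this. The file name inventory.yml and the webservers group are assumptions for illustration.

```yaml
# inventory.yml (hypothetical file name)
all:
  children:
    webservers:            # assumed group name
      hosts:
        managed-rocky:
          ansible_host: 10.0.1.11
        managed-ubuntu:
          ansible_host: 10.0.1.12
```

With this file, the run command becomes ansible-playbook -i inventory.yml site.yml. The .yml extension matters because Ansible’s YAML inventory plugin selects files by extension.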

Variable Precedence in Roles

Understanding where to put variables is one of the trickiest parts of roles. Ansible has over 20 levels of variable precedence, but for roles, four levels matter most:

  1. Role defaults (defaults/main.yml) have the lowest precedence. They’re meant to be overridden
  2. Role vars (vars/main.yml) have higher precedence. Use these for values that shouldn’t change often
  3. Role parameters (passed when including the role in a playbook) override both defaults and vars
  4. Extra vars (--extra-vars on the command line) override everything, always

In production, you’ll want to keep most values in defaults/ so consumers can customize the role without forking it. Reserve vars/ for internal constants that callers should not change (like OS-specific paths).
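To make the levels concrete, here is a small sketch of a role parameter overriding a default. The admin_user variable comes from the base_system defaults; the value opsadmin is a made-up example.

```yaml
# site.yml variant (sketch): pass a role parameter alongside the role name
- name: Configure servers with roles
  hosts: all
  become: true
  roles:
    - role: base_system
      admin_user: opsadmin   # hypothetical value; beats defaults/main.yml
```

Running this play creates opsadmin instead of the default deployer account. Adding --extra-vars "admin_user=someone_else" on the command line would still win over the role parameter.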

Here’s a practical example. The base_system role defines admin_user: deployer in defaults. Running the playbook normally creates that user:

id deployer

The system confirms the user exists with the expected group membership:

uid=1001(deployer) gid=1001(deployer) groups=1001(deployer),10(wheel)

Override it at runtime with --extra-vars:

ansible-playbook site.yml --extra-vars "admin_user=superadmin"

Now a second user is created under the new name (the deployer account from the earlier run is left in place). Running id superadmin shows:

uid=1002(superadmin) gid=1002(superadmin) groups=1002(superadmin),10(wheel)

Extra vars always win. This is useful for CI/CD pipelines where you want to inject environment-specific values without touching role files. For a complete variable reference, check the Ansible cheat sheet which covers all precedence levels.

import_role vs include_role

Ansible gives you two ways to use roles inside tasks: import_role (static) and include_role (dynamic). The difference matters more than you’d expect.

With import_role, Ansible parses the role at playbook load time. All tasks are visible upfront. Run --list-tasks on a playbook that uses import_role:

ansible-playbook test-import.yml --list-tasks

Every task from the role appears in the listing:

playbook: test-import.yml
  play #1 (all): Test import_role	TAGS: []
    tasks:
      base_system : Install base packages (RHEL)	TAGS: []
      base_system : Install base packages (Debian)	TAGS: []
      base_system : Create admin user	TAGS: []
      base_system : Set timezone to UTC	TAGS: []
      base_system : Configure firewall ports (RHEL)	TAGS: []

Now try the same with include_role:

ansible-playbook test-include.yml --list-tasks

The individual tasks are hidden because they’re resolved at runtime:

playbook: test-include.yml
  play #1 (all): Test include_role	TAGS: []
    tasks:
      Include base_system	TAGS: []

Here’s when to use each:

| Feature | import_role (static) | include_role (dynamic) |
|---|---|---|
| Parsing time | Playbook load | Runtime (when reached) |
| Tasks visible in --list-tasks | Yes | No |
| Effect of when | Applied to every task inside the role | Applied only to the include itself |
| Can loop over | No | Yes |
| Tags apply to | All tasks inside the role | Only the include statement |
| Best for | Standard role application | Conditional or looped roles |

The rule of thumb: use import_role (or the roles: keyword in a play) by default. Switch to include_role only when you need to loop over a role or conditionally include it based on runtime facts. If you’re working with Ansible Vault for sensitive variables, both import and include methods handle encrypted variables the same way.
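As an illustration of the two cases where include_role is the better fit, the sketch below loops over a list of role names and gates the include on a runtime fact. The optional_roles variable and the role_name loop variable are hypothetical names introduced for this example.

```yaml
- name: Apply optional roles
  hosts: all
  become: true
  vars:
    optional_roles:          # hypothetical variable
      - nginx_app
  tasks:
    - name: Apply each optional role on Debian-family hosts only
      ansible.builtin.include_role:
        name: "{{ role_name }}"
      loop: "{{ optional_roles }}"
      loop_control:
        # Use a custom loop variable so loops inside the roles can keep using "item"
        loop_var: role_name
      when: ansible_os_family == "Debian"
```

The same task written with import_role would fail at parse time, because static imports cannot be looped.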

The Wheel Group Error: Writing OS-Aware Roles

This is the kind of gotcha that costs you an hour if you don’t know about it. When the base_system role was first tested with a hardcoded groups: wheel on the admin user task, it worked fine on Rocky Linux. Then it hit Ubuntu:

fatal: [managed-ubuntu]: FAILED! => {"changed": false, "msg": "Group wheel does not exist"}

Ubuntu (and all Debian-based systems) use the sudo group instead of wheel for administrative access. The fix uses Ansible’s ternary filter to pick the right group based on the OS family:

groups: "{{ (ansible_os_family == 'RedHat') | ternary('wheel', 'sudo') }}"

This pattern applies broadly. Any time you write a role that targets multiple OS families, watch for these differences:

| Item | RHEL/Rocky | Ubuntu/Debian |
|---|---|---|
| Admin group | wheel | sudo |
| Cron service | crond | cron |
| Package manager | dnf | apt |
| Nginx config path | /etc/nginx/conf.d/ | /etc/nginx/sites-enabled/ |
| Firewall tool | firewalld | ufw |
| Mandatory access control | SELinux (enforcing by default) | AppArmor (enabled by default) |

In production, you’ll encounter this with database roles, LEMP stack roles, and basically any role that touches system-level resources. Build the OS-awareness in from day one. Retrofitting it later means rewriting and retesting everything.
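When the number of OS-specific values grows, inline ternary filters get hard to read. One common alternative, sketched here as an assumption rather than something the roles in this guide do, is to keep one variables file per OS family and load it with include_vars. The base_system_admin_group and base_system_cron_service names are hypothetical.

```yaml
# File 1: roles/base_system/vars/RedHat.yml
base_system_admin_group: wheel
base_system_cron_service: crond
---
# File 2: roles/base_system/vars/Debian.yml
base_system_admin_group: sudo
base_system_cron_service: cron
---
# File 3: roles/base_system/tasks/main.yml (first task)
# Inside a role, a relative include_vars path is looked up in the role's vars/ directory.
- name: Load OS-family variables
  ansible.builtin.include_vars: "{{ ansible_os_family }}.yml"
```

Later tasks can then use groups: "{{ base_system_admin_group }}" without any conditional logic. The trade-off is that values loaded through include_vars sit high in the precedence order, so inventory variables will not override them.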

Organizing a Multi-Role Project

Once you have several roles, project layout matters. Here’s the structure that scales well:

project/
├── ansible.cfg
├── inventory/
│   ├── production
│   └── staging
├── group_vars/
│   ├── all.yml
│   └── webservers.yml
├── host_vars/
│   └── managed-rocky.yml
├── roles/
│   ├── base_system/
│   └── nginx_app/
├── site.yml
├── webservers.yml
└── dbservers.yml

Split your playbooks by function: site.yml applies everything, webservers.yml targets only web servers. Group vars let you set variables per inventory group without cluttering role defaults. This separation becomes essential when managing larger deployments like Kubernetes clusters where dozens of roles interact.
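A short sketch of how two of these files might fit together is shown below. The webservers group must exist in the inventory, and the server name is a placeholder value.

```yaml
# File 1: group_vars/webservers.yml
nginx_server_name: www.example.com   # placeholder; overrides the role default "_"
---
# File 2: webservers.yml
- name: Configure web servers
  hosts: webservers
  become: true
  roles:
    - nginx_app
```

Because group_vars/ sits next to the playbooks, Ansible picks it up automatically, and group variables outrank role defaults.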

Sharing Roles with Ansible Galaxy

Ansible Galaxy is both a public registry and a CLI tool. You can pull community roles or publish your own. To install a role from Galaxy:

ansible-galaxy role install geerlingguy.docker

For production use, pin roles to specific versions in a requirements.yml file:

---
roles:
  - name: geerlingguy.docker
    version: "7.4.1"
  - name: geerlingguy.nginx
    version: "3.2.0"

Install all pinned roles at once:

ansible-galaxy install -r requirements.yml

Pinning versions prevents surprises. Without a pin, ansible-galaxy installs the latest release, so a fresh install or a forced reinstall (--force) can silently pull in a breaking version. Discovering that at 2 AM is not a fun way to start your morning. For managing Docker containers with Ansible, Galaxy roles can save significant setup time.
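The same requirements.yml format can also pin roles that live in a Git repository rather than on Galaxy, which pairs well with tagging role releases. The sketch below uses a hypothetical repository URL and tag.

```yaml
---
roles:
  - name: base_system
    # Hypothetical repository URL and tag, for illustration only
    src: https://git.example.com/ops/ansible-role-base-system.git
    scm: git
    version: "1.2.0"
```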

Troubleshooting

Error: “Group wheel does not exist”

This occurs on Debian/Ubuntu when a task hardcodes groups: wheel. The admin group on Debian systems is sudo, not wheel. Use the ternary filter as shown above to handle both families.

Role dependency runs twice

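If two roles both depend on base_system, Ansible runs it only once by default (this is called deduplication). But if the two declarations pass different variables, tags, or when conditions, Ansible treats them as distinct and runs the role once per declaration, each with its own variable set. This is usually what you want. If it is not, make the dependency declarations identical, for example by moving the differing values into group_vars. Note that allow_duplicates already defaults to false; setting allow_duplicates: true in base_system’s meta/main.yml does the opposite and makes the role run every time it is listed, even with identical parameters.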
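One way to keep base_system from running more than once is to declare the dependency identically everywhere and set the port list in a single place. The sketch below assumes a group_vars/all.yml file as in the project layout shown earlier.

```yaml
# File 1: group_vars/all.yml (define the port list once for every host)
firewall_allowed_ports:
  - "22/tcp"
  - "80/tcp"
  - "443/tcp"
---
# File 2: roles/nginx_app/meta/main.yml, and the same in any other dependent role
dependencies:
  - role: base_system   # no per-role vars, so every declaration is identical
```

Group variables outrank role defaults, so base_system still sees the extended list while Ansible can deduplicate the dependency.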

Handler not firing after template change

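Handlers only run when the notifying task reports changed. If the rendered template is identical to the file already on disk (same variables, same template), the task reports ok and the handler won’t fire. This is correct behavior. Note that ansible.builtin.meta: flush_handlers only runs handlers that have already been notified, immediately rather than at the end of the play, and --force-handlers only makes notified handlers run even if a later task fails on that host. Neither triggers a handler that was never notified. To reload the service regardless, run a dedicated service task, or set changed_when: true on the notifying task for that run.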
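Where flush_handlers does help is when a later task depends on the handler having already run. The sketch below places it right after the template task and then checks that Nginx responds. The uri check is an addition for illustration and assumes an index page exists under nginx_root.

```yaml
- name: Deploy Nginx configuration
  ansible.builtin.template:
    src: nginx.conf.j2
    dest: "{{ (ansible_os_family == 'RedHat') | ternary('/etc/nginx/conf.d/app.conf', '/etc/nginx/sites-enabled/app.conf') }}"
    mode: "0644"
  notify: Reload Nginx

# Runs handlers that were notified so far; does nothing if none were notified
- name: Run pending handlers now instead of at the end of the play
  ansible.builtin.meta: flush_handlers

# Illustrative check; assumes an index page is served from nginx_root
- name: Confirm Nginx answers on the configured port
  ansible.builtin.uri:
    url: "http://127.0.0.1:{{ nginx_port }}/"
    status_code: 200
```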

Production Hardening Tips

Before using roles in production, consider these practices from real-world deployments:

  • Tag everything. Add tags to tasks so you can run subsets: ansible-playbook site.yml --tags "nginx" skips base_system entirely when you only need to update the web config
  • Use ansible-lint. It catches common mistakes like using deprecated modules, missing FQCNs, and incorrect mode formats before they reach production
  • Test roles in isolation. The tests/ directory exists for a reason. Create a test playbook that applies only one role against a throwaway VM
  • Version your roles with Git tags. When something breaks in production, you need to know which role version caused it and roll back to the previous tag
  • Keep defaults minimal. Every variable in defaults/main.yml is part of the role’s public API. Once published, renaming a default variable is a breaking change for every consumer
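As a sketch of the tagging advice in the list above, here is the Nginx template task with tags attached. The tag names are illustrative choices, not part of the original role.

```yaml
# roles/nginx_app/tasks/main.yml (excerpt); tag names are hypothetical
- name: Deploy Nginx configuration
  ansible.builtin.template:
    src: nginx.conf.j2
    dest: "{{ (ansible_os_family == 'RedHat') | ternary('/etc/nginx/conf.d/app.conf', '/etc/nginx/sites-enabled/app.conf') }}"
    mode: "0644"
  notify: Reload Nginx
  tags:
    - nginx
    - nginx_config
```

With this in place, ansible-playbook site.yml --tags nginx_config runs only the tasks carrying that tag. Fact gathering still runs, and the Reload Nginx handler still fires if the task reports a change.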

For more complex automation patterns, explore the official Ansible roles documentation which covers advanced features like role argument validation and conditional imports.
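Role argument validation, mentioned above, is driven by a meta/argument_specs.yml file. A minimal sketch for base_system, covering two of its defaults, might look like the following; the descriptions are illustrative wording.

```yaml
# roles/base_system/meta/argument_specs.yml (sketch)
---
argument_specs:
  main:
    short_description: Base OS setup for RHEL and Debian families
    options:
      admin_user:
        type: str
        default: deployer
        description: Name of the admin account to create.
      firewall_allowed_ports:
        type: list
        elements: str
        default:
          - "22/tcp"
        description: Ports to open, written as port/protocol.
```

When this file is present, Ansible validates the supplied variables against the spec before the role’s first task runs, so a wrongly typed value fails early with a clear message.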
