Backup and restore PostgreSQL databases with Ansible
There are multiple ways to backup and restore a database. With PostgreSQL we can either use SQL dumps or Continuous Archiving and Point-in-Time Recovery. This post is about the simple and straighforward SQL dumps, automated using Ansible.
This is a continuation of Install and configure PostgreSQL with Ansible since I am going to use the same structure and variables.
What we will achieve
In this example I will assume that we have an application app_name
with a database db_name
. We will:
- Create a SQL dump of the database on the server
- Download the dump to a local folder
- Copy the dump to the server
- And restore the dump
Variables
We will reference vars.yml
in our playbook to externalize some configuration and make our playbook more reusable. The variables in question are the name of the database we want to backup, application's name that will be used to create a backup folder of the same name, and a system user user_app
to assign ownership to the folder.
vars.yml
:
---
user_app: app1user
app_name: app1
db_name: app1
The inventory file
The playbook expects that we configure the target machines using an inventory file. Just create ansible.cfg
and hosts.ini
files with the proper connection information, or modify the playbook itself.
ansible.cfg
:
[defaults]
inventory = hosts.ini
hosts.ini
:
[app1]
192.168.122.89
Create a database SQL dump
Let's create the playbook!
---
- hosts: all
become: yes
vars_files:
- vars.yml
tasks:
- name: "Set timestamp of the backup"
set_fact:
now: "{{ lookup("pipe", "date +%F-%T") }}"
- name: "Create a backup directory"
file:
path: "/var/lib/{{ app_name }}/backups/{{ now }}/"
mode: 0777
owner: "{{ user_app }}"
state: directory
First, we need a directory to store the backups. The first task generates a timestamp that is later used to distribute dumps into directories based on the time of the backup. The main task creates the folder structure based on the user_app
and now
variables.
This is done with file
module that can create a directory tree that doesn't exist and set the ownership and permissions. user_app
is a system user name that has to exist on the target machine.
- name: "Back up the database"
postgresql_db:
state: dump
name: "{{ db_name }}"
target: "/var/lib/{{ app_name }}/backups/{{ now }}/{{ db_name }}.dump.gz"
become: yes
become_user: postgres
- name: "Fetch backup from the server"
fetch:
src: "/var/lib/{{ app_name }}/backups/{{ now }}/{{ db_name }}.dump.gz"
dest: "backups/{{ now }}/{{ db_name }}.dump.gz"
flat: yes
We use postgresql_db
Ansible module that can create the dump file automatically. The format of the file is determined by the target
file extension. In this case, the dump will be also compressed with Gzip. postgresql_db
module is usually invoked as the postgres
user, because no other user can administer the PostgreSQL cluster by default.
In the last task for backing up the database, we also download the file to a local folder using fetch
module. The usage is straightforward.
Restore a database from a SQL dump
The restore playbook will work with one more variable, backup_file
, that points to the dump file we want to restore. We can pass the variable to Ansible when invoking the playbook, e.g. --extra-vars "backup_file=2021-03-28-19:38:12/app1"
.
---
- hosts: all
become: yes
vars_files:
- vars.yml
tasks:
- name: "Create a backup directory"
file:
path: "/var/lib/{{ app_name }}/backups/uploads/{{ backup_file | dirname }}"
mode: 0777
owner: "{{ user_app }}"
state: directory
- name: "Copy backup to the server"
copy:
src: "backups/{{ backup_file }}.dump.gz"
dest: "/var/lib/{{ app_name }}/backups/uploads/{{ backup_file }}.dump.gz"
- name: "Restore the database"
postgresql_db:
state: restore
name: "{{db_name}}"
target: "/var/lib/{{ app_name }}/backups/uploads/{{ backup_file }}.dump.gz"
become: yes
become_user: postgres
At the beginning, we extract the time-based directory name and use it to create a folder under a new directory uploads
. That's where we copy our local dump before using it on the server. For restoring the database, we again use the postgresql_db
run as the postgres
user.
And that's it!
Last updated on 29.3.2021.