When you run TimescaleDB in a containerized environment, you can use
continuous archiving with a WAL-E container.
These containers are sometimes referred to as sidecars, because they run
alongside the main container. A WAL-E sidecar image
works with TimescaleDB as well as regular PostgreSQL. In this section, you
can set up archiving to your local filesystem with a main TimescaleDB
container called timescaledb, and a WAL-E sidecar called wale. When you are
ready to implement this in your production deployment, you can adapt the
instructions here to do archiving against cloud providers such as AWS S3, and
run it in an orchestration framework such as Kubernetes.
To make TimescaleDB use the WAL-E sidecar for archiving, the two containers need
to share a network. To do this, you need to create a Docker network and then
launch TimescaleDB with archiving turned on, using the newly created network.
When you launch TimescaleDB, you need to explicitly set the location of the
write-ahead log (POSTGRES_INITDB_WALDIR) and data directory (PGDATA) so that
you can share them with the WAL-E sidecar. Both must reside in a Docker volume.
By default, a volume is created for /var/lib/postgresql/data. When you have
started TimescaleDB, you can log in and create tables and data.
Deprecation
This section describes a feature that is deprecated on Timescale. We strongly recommend that you do not use this feature in a production environment. If you need more information, contact us.
Create the Docker network:
docker network create timescaledb-net
Launch TimescaleDB, with archiving turned on:
docker run \
  --name timescaledb \
  --network timescaledb-net \
  -e POSTGRES_PASSWORD=insecure \
  -e POSTGRES_INITDB_WALDIR=/var/lib/postgresql/data/pg_wal \
  -e PGDATA=/var/lib/postgresql/data/pg_data \
  timescale/timescaledb:latest-pg10 postgres \
  -cwal_level=archive \
  -carchive_mode=on \
  -carchive_command="/usr/bin/wget wale/wal-push/%f -O -" \
  -carchive_timeout=600 \
  -ccheckpoint_timeout=700 \
  -cmax_wal_senders=1
Connect to TimescaleDB with psql inside the container:
docker exec -it timescaledb psql -U postgres
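Before you start the sidecar, it can be useful to confirm that the archiving settings passed on the command line are active. The following is a minimal sketch, assuming the container name timescaledb and the postgres user from the commands above. The function name show_archive_settings is illustrative and is not part of TimescaleDB or WAL-E.

```shell
# Print the archiving-related settings of the running TimescaleDB container.
# Assumes the container is named "timescaledb", as in the commands above.
show_archive_settings() {
  for setting in wal_level archive_mode archive_command archive_timeout; do
    printf '%s = ' "$setting"
    # -A: unaligned output, -t: rows only, -c: run a single SQL command
    docker exec timescaledb psql -U postgres -Atc "SHOW $setting;"
  done
}
```

Call show_archive_settings from the Docker host while the container is running. On PostgreSQL 10, the value archive for wal_level is still accepted but is reported as replica, so seeing replica here does not indicate a problem.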
The WAL-E Docker image runs a web endpoint that accepts WAL-E commands across an HTTP API. This allows PostgreSQL to communicate with the WAL-E sidecar over the internal network to trigger archiving. You can also use the container to invoke WAL-E directly. The Docker image accepts the standard WAL-E environment variables to configure the archiving backend, so you can archive to services such as AWS S3. For information about configuring a backend, see the official WAL-E documentation.
To enable the WAL-E docker image to perform archiving, it needs to use the same
network and data volumes as the TimescaleDB container. It also needs to know the
location of the write-ahead log and data directories. You can pass all this
information to WAL-E when you start it. In this example, the WAL-E image listens
for commands on the timescaledb-net internal network at port 80, and writes
backups to ~/backups on the Docker host.
Start the WAL-E container with the required information about the TimescaleDB container. In this example, the WAL-E container is called wale:
docker run \
  --name wale \
  --network timescaledb-net \
  --volumes-from timescaledb \
  -v ~/backups:/backups \
  -e WALE_LOG_DESTINATION=stderr \
  -e PGWAL=/var/lib/postgresql/data/pg_wal \
  -e PGDATA=/var/lib/postgresql/data/pg_data \
  -e PGHOST=timescaledb \
  -e PGPASSWORD=insecure \
  -e PGUSER=postgres \
  -e WALE_FILE_PREFIX=file://localhost/backups \
  timescale/timescaledb-wale:latest
Start the backup:
docker exec wale wal-e backup-push /var/lib/postgresql/data/pg_data
Alternatively, you can start the backup using the sidecar's HTTP endpoint. This requires exposing the sidecar's port 80 on the Docker host by mapping it to an open port, for example with -p 8080:80 when you start the sidecar. In this example, it is mapped to port 8080:
curl http://localhost:8080/backup-push
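To confirm that the base backup was written, you can ask WAL-E to list the backups it knows about and look at the backup directory on the Docker host. This is a sketch under the assumptions of this example: the sidecar is named wale and backups are written to ~/backups. The function name verify_base_backup is hypothetical.

```shell
# List the base backups known to WAL-E, then show the host backup directory.
# Optional argument: the backup directory on the Docker host (default ~/backups).
verify_base_backup() {
  backup_dir="${1:-$HOME/backups}"
  # backup-list is a standard WAL-E subcommand; it uses the WALE_FILE_PREFIX
  # that was set when the sidecar container was started.
  docker exec wale wal-e backup-list
  ls "$backup_dir"
}
```

Run verify_base_backup after the backup-push command has finished. An empty list usually means the backup has not completed or was written to a different prefix.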
You should do base backups at regular intervals, such as daily, to minimize the amount of WAL replay and to make recoveries faster. To make a new base backup, re-trigger a base backup as shown here, either manually or on a schedule. If you run TimescaleDB on Kubernetes, there is built-in support for scheduling cron jobs that can invoke base backups using the WAL-E container's HTTP API.
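On a plain Docker host, one way to schedule base backups is a cron entry that calls the sidecar's HTTP endpoint. The following sketch only builds the crontab line. The 02:00 schedule, the function name backup_cron_line, and the mapping of the sidecar to port 8080 on the host are assumptions for illustration.

```shell
# Build a crontab line that triggers a base backup once a day.
# Arguments (all optional): hour (0-23), minute (0-59), sidecar URL.
backup_cron_line() {
  hour="${1:-2}"
  minute="${2:-0}"
  url="${3:-http://localhost:8080/backup-push}"
  # --fail makes curl exit non-zero on an HTTP error, so cron can report it.
  printf '%s %s * * * curl --fail --silent --show-error %s\n' \
    "$minute" "$hour" "$url"
}
```

To install the entry for the current user, you could append it to the existing crontab with (crontab -l 2>/dev/null; backup_cron_line) | crontab -. Review the resulting crontab before relying on it.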
To recover the database instance from the backup archive, create a new TimescaleDB container, and restore the database and configuration files from the base backup. Then you can relaunch the sidecar and the database.
Create the Docker container:
docker create \
  --name timescaledb-recovered \
  --network timescaledb-net \
  -e POSTGRES_PASSWORD=insecure \
  -e POSTGRES_INITDB_WALDIR=/var/lib/postgresql/data/pg_wal \
  -e PGDATA=/var/lib/postgresql/data/pg_data \
  timescale/timescaledb:latest-pg10 postgres
Restore the database files from the base backup:
docker run -it --rm \
  -v ~/backups:/backups \
  --volumes-from timescaledb-recovered \
  -e WALE_LOG_DESTINATION=stderr \
  -e WALE_FILE_PREFIX=file://localhost/backups \
  timescale/timescaledb-wale:latest \
  wal-e \
  backup-fetch /var/lib/postgresql/data/pg_data LATEST
Recreate the configuration files. Normally, these are backed up from the original database instance. In this example, they are recreated from the sample files in the image:
docker run -it --rm \
  --volumes-from timescaledb-recovered \
  timescale/timescaledb:latest-pg10 \
  cp /usr/local/share/postgresql/pg_ident.conf.sample /var/lib/postgresql/data/pg_data/pg_ident.conf
docker run -it --rm \
  --volumes-from timescaledb-recovered \
  timescale/timescaledb:latest-pg10 \
  cp /usr/local/share/postgresql/postgresql.conf.sample /var/lib/postgresql/data/pg_data/postgresql.conf
docker run -it --rm \
  --volumes-from timescaledb-recovered \
  timescale/timescaledb:latest-pg10 \
  sh -c 'echo "local all postgres trust" > /var/lib/postgresql/data/pg_data/pg_hba.conf'
Create a recovery.conf file that tells PostgreSQL how to recover:
docker run -it --rm \
  --volumes-from timescaledb-recovered \
  timescale/timescaledb:latest-pg10 \
  sh -c 'echo "restore_command='\''/usr/bin/wget wale/wal-fetch/%f -O -'\''" > /var/lib/postgresql/data/pg_data/recovery.conf'
When you have recovered the data and the configuration files, and have created a recovery configuration file, you can relaunch the sidecar. You might need to remove the old one first. When you relaunch the sidecar, it replays the last WAL segments that might be missing from the base backup. Then you can relaunch the database, and check that recovery was successful.
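If a container named wale still exists from the earlier steps, docker run fails with a name conflict when you try to reuse the name. A minimal sketch for removing the old sidecar only when it exists follows. The function name remove_old_sidecar is hypothetical, and the default name wale is taken from this example.

```shell
# Remove an existing container so that its name can be reused.
# Optional argument: the container name (default "wale").
remove_old_sidecar() {
  name="${1:-wale}"
  # docker container inspect exits non-zero when no such container exists.
  if docker container inspect "$name" >/dev/null 2>&1; then
    # -f stops the container first if it is still running.
    docker rm -f "$name"
  fi
}
```

Removing the sidecar container does not delete the backups, because they are stored in ~/backups on the Docker host.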
Relaunch the WAL-E sidecar:
docker run \
  --name wale \
  --network timescaledb-net \
  -v ~/backups:/backups \
  --volumes-from timescaledb-recovered \
  -e WALE_LOG_DESTINATION=stderr \
  -e PGWAL=/var/lib/postgresql/data/pg_wal \
  -e PGDATA=/var/lib/postgresql/data/pg_data \
  -e PGHOST=timescaledb \
  -e PGPASSWORD=insecure \
  -e PGUSER=postgres \
  -e WALE_FILE_PREFIX=file://localhost/backups \
  timescale/timescaledb-wale:latest
Relaunch the TimescaleDB Docker container:
docker start timescaledb-recovered
Verify that the database started up and recovered successfully:
docker logs timescaledb-recovered
Don't worry if you see some archive recovery errors in the log at this stage. This happens because the recovery is not completely finalized until no more files can be found in the archive. See the PostgreSQL documentation on continuous archiving for more information.
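In addition to reading the logs, you can ask the recovered server whether it has left recovery. This sketch assumes the container name timescaledb-recovered and the trust entry for the postgres user written to pg_hba.conf earlier. The function name recovery_finished is illustrative.

```shell
# Succeed (exit status 0) when the recovered server has finished archive recovery.
recovery_finished() {
  # pg_is_in_recovery() returns "f" once the server accepts normal read-write traffic.
  state="$(docker exec timescaledb-recovered \
    psql -U postgres -Atc 'SELECT pg_is_in_recovery();')"
  [ "$state" = "f" ]
}
```

For example, recovery_finished && echo "recovery complete" prints the message only after the server has replayed all available WAL and started normal operation.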