These are just defensive cleanup tasks that we run.
In the good case, there's nothing to kill or remove, so they trigger an
error like this:
> Error response from daemon: Cannot kill container: something: No such container: something
and:
> Error: No such container: something
People often ask us if this is a problem, so instead of always having to
answer with "no, this is to be expected", we'd rather eliminate it now
and make logs cleaner.
In the event that:
- a container is really stuck and needs cleanup using kill/rm
- and cleanup fails, and we fail to report it because of error
suppression (`2>/dev/null`)
.. we'd still get an error when launching ("container name already in use .."),
so it shouldn't be too hard to investigate.
In short, this makes Synapse a 2nd class citizen,
preparing for a future where it's just one-of-many homeserver software
options.
We also no longer have a default Postgres superuser password,
which improves security.
The changelog explains more as to why this was done
and how to proceed from here.
We need to suppress systemd service-stopping requests in certain rare
cases like https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/771
That issue seems to describe a case, where a migration from mxisd to
ma1sd was happening (DB files had just been moved), and then we were
attemping to stop `matrix-ma1sd.service` so we could import that database into
Postgres. However, there's neither `matrix-mxisd.service`, nor
`matrix-ma1sd.service` after `migrate_mxisd.yml` had just run, so
stopping `matrix-ma1sd.service` was failing.
In cases where pgloader is not enough and we need to do some additional
migration work after it, we can now use
`additional_psql_statements_list` and
`additional_psql_statements_db_name`.
This is to be used when migrating `matrix-registration`'s data at the
very least.
We were running into conflicts, because having initialized
the roles (users) and databases, trying to import leads to
errors (role XXX already exists, etc.).
We were previously ignoring the Synapse database (`homeserver`)
when upgrading/importing, because that one gets created by default
whenever the container starts.
For our additional databases, it's a similar situation now.
It's not created by default as soon as Postgres starts with an empty
database, but rather we create it as part of running the playbook.
So we either need to skip those role/database creation statements
while upgrading/importing, or to avoid creating the additional database
and rely on the import for that. I've gone for the former, because
it's already similar to what we were doing and it's simpler
(it lets `setup_postgres.yml` be the same in all scenarios).
Instead of passing the connection string, we can now pass a name of a
variable, which contains a connection string.
Both are supported for having extra flexibility.
Since we'll likely have generic SQLite database importing
via [pgloader](https://pgloader.io/) for migrating bridge
databases from SQLite to Postgres, we'd rather avoid
calling the "import Synapse SQLite database" command
as just `--tags=import-sqlite-db`.
Similarly, for the media store, we'd like to mention that it's
related to Synapse as well.
We'd like to be more explicit, so as to be less confusing,
especially in light of other homeserver implementations
coming in the future.
While these modules are really nice and helpful, we can't use them
for at least 2 reasons:
- for us, Postgres runs in a container on a private Docker network
(`--network=matrix`) without usually being exposed to the host.
These modules execute on the host so they won't be able to reach it.
- these modules require `psycopg2`, so we need to install it before
using it. This might or might not be its own can of worms.
The tasks in `create_additional_databases.yml` will likely
ensure `matrix-postgres.service` is started, etc.
If no additional databases are defined, we'd rather not execute that
file and all these tasks that it may do in the future.
> Invalid data passed to 'loop', it requires a list, got this instead: matrix_postgres_additional_databases. Hint: If you passed a list/dict of just one element, try adding wantlist=True to your lookup invocation or use q/query instead of lookup.
Well, or working around it, as I've done in this commit (which seems
more sane than `wantlist=True` stuff).
To avoid needing to have `jq` installed on the machine, we could:
- try to run jq in a Docker container using some small image providing
that
- better yet, avoid `jq` altogether
Moving it above the "uninstalling" set of tasks is better.
Extracting it out to another file at the same time, for readability,
especially given that it will probably have to become more complex in
the future (potentially installing `jq`, etc.)
If a service is enabled, a database for it is created in postgres with a uniqque password. The service can then use this database for data storage instead of relying on sqlite.
The Docker 19.04 -> 20.10 upgrade contains the following change
in `/usr/lib/systemd/system/docker.service`:
```
-BindsTo=containerd.service
-After=network-online.target firewalld.service containerd.service
+After=network-online.target firewalld.service containerd.service multi-user.target
-Requires=docker.socket
+Requires=docker.socket containerd.service
Wants=network-online.target
```
The `multi-user.target` requirement in `After` seems to be in conflict
with our `WantedBy=multi-user.target` and `After=docker.service` /
`Requires=docker.service` definitions, causing the following error on
startup for all of our systemd services:
> Job matrix-synapse.service/start deleted to break ordering cycle starting with multi-user.target/start
A workaround which appears to work is to add `DefaultDependencies=no`
to all of our services.