Enabling metrics and graphs (Prometheus, Grafana) for your Matrix server (optional)
The playbook can install Prometheus with Grafana and configure performance metrics of your homeserver with graphs for you.
Warning
Metrics and resulting graphs can contain a lot of information. This includes system specs but also usage patterns. This applies especially to small personal/family scale homeservers. Someone might be able to figure out when you wake up and go to sleep by looking at the graphs over time. Think about this before enabling anonymous access. And you should really not forget to change your Grafana password.
Most of our docker containers run with limited system access, but the prometheus-node-exporter has access to the host network stack and (readonly) root filesystem. This is required to report on them. If you don't like that, you can set prometheus_node_exporter_enabled: false (which is actually the default). You will still get Synapse metrics with this container disabled. Both of the dashboards will always be enabled, so you can still look at historical data after disabling either source.
Adjusting DNS records
By default, this playbook installs the Grafana web user interface on the stats. subdomain (stats.example.com) and requires you to create a CNAME record for stats, which targets matrix.example.com.
When setting it, replace example.com with your own domain.
Note: It is possible to install Prometheus without installing Grafana. In this case it is not required to create the CNAME record.
Adjusting the playbook configuration
Configure Prometheus
Prometheus is an open-source systems monitoring and alerting toolkit. To enable it, add the following configuration to your inventory/host_vars/matrix.example.com/vars.yml file:
prometheus_enabled: true
# Uncomment to enable Node Exporter.
# prometheus_node_exporter_enabled: true
# Uncomment to enable nginx Log Exporter.
# matrix_prometheus_nginxlog_exporter_enabled: true
Name | Description |
---|---|
prometheus_enabled | Prometheus is a time series database. It holds all the data we're going to talk about. |
prometheus_node_exporter_enabled | Node Exporter is an addon of sorts to Prometheus that collects generic system information such as CPU, memory, filesystem, and even system temperatures. |
matrix_prometheus_nginxlog_exporter_enabled | nginx Log Exporter is an addon of sorts to expose nginx logs to Prometheus. |
Note: the retention policy of Prometheus metrics is 15 days by default. Older data gets deleted automatically.
Enable metrics and graphs for Postgres (optional)
Expanding on the metrics exposed by the Synapse exporter and the Node exporter, the playbook can also install and configure the PostgreSQL Server Exporter that exposes more detailed information about what's happening on your Postgres database.
To enable it, add the following configuration to your vars.yml file:
Note: prometheus_postgres_exporter_database_username has nothing to do with your Matrix user ID. It can be any string you'd like.
prometheus_postgres_exporter_enabled: true
# The username for the user that the exporter uses to connect to the database.
# Uncomment and adjust this part if you'd like to use a username different than the default.
# prometheus_postgres_exporter_database_username: "matrix_prometheus_postgres_exporter"
# The password for the user that the exporter uses to connect to the database. By default, this is auto-generated by the playbook.
# Uncomment and adjust this part if you'd like to set the password by yourself.
# prometheus_postgres_exporter_database_password: "PASSWORD_HERE"
Name | Description |
---|---|
prometheus_postgres_exporter_enabled | Enable the Postgres Prometheus exporter. This sets up the docker container, connects it to the database and adds a 'job' to the Prometheus config which tells Prometheus about this new exporter. The default is 'false'. |
prometheus_postgres_exporter_container_labels_traefik_enabled | If set to true, exposes the Postgres exporter metrics on https://matrix.example.com/metrics/postgres-exporter for usage with an external Prometheus server. To password-protect the metrics, see matrix_metrics_exposure_http_basic_auth_users below. |
Extending the configuration
There are some additional things you may wish to configure about Prometheus.
Take a look at:
- Prometheus role's defaults/main.yml for some variables that you can customize via your vars.yml file. You can override settings (even those that don't have dedicated playbook variables) using the prometheus_configuration_extension_yaml variable.
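As a minimal sketch of what such an override might look like, the snippet below raises the global scrape interval. It assumes that prometheus_configuration_extension_yaml accepts a YAML string that is merged over the generated Prometheus configuration, as similarly named variables elsewhere in the playbook do. The 30s value is purely illustrative. Check the role's defaults/main.yml before relying on it.

```yaml
# Assumption: the value is a YAML string merged over the generated Prometheus config.
prometheus_configuration_extension_yaml: |
  global:
    # Illustrative value only, not a recommendation.
    scrape_interval: 30s
```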
Configure Grafana
Grafana is an open source visualization and analytics software. To enable it, add the following configuration to your vars.yml file. Make sure to replace USERNAME_HERE and PASSWORD_HERE.
Notes:
- grafana_default_admin_user has nothing to do with your Matrix user ID. It can be any string you'd like.
- Changing the username/password subsequently won't work.
grafana_enabled: true
grafana_default_admin_user: "USERNAME_HERE"
grafana_default_admin_password: "PASSWORD_HERE"
# Uncomment to allow viewing Grafana without logging in.
# grafana_anonymous_access: true
Name | Description |
---|---|
grafana_enabled | Grafana is the visual component. It shows (on the stats.example.com subdomain) the dashboards with the graphs that we're interested in. |
grafana_default_admin_user, grafana_default_admin_password | By default Grafana creates a user with admin as the username and password. You are asked to change the credentials on first login. If you feel this is insecure and you want to change them beforehand, you can do that here. |
grafana_anonymous_access | By default you need to log in to see graphs. If you want to publicly share your graphs (e.g. when asking for help in #synapse:matrix.org) you'll want to enable this option. |
Adjusting the Grafana URL (optional)
By tweaking the grafana_hostname variable, you can easily make the service available at a different hostname than the default one.
Example additional configuration for your vars.yml file:
# Change the default hostname
grafana_hostname: grafana.example.com
After changing the domain, you may need to adjust your DNS records to point the Grafana domain to the Matrix server.
Installing
After configuring the playbook and potentially adjusting your DNS records, run the playbook with playbook tags as below:
ansible-playbook -i inventory/hosts setup.yml --tags=setup-all,start
The shortcut commands with the just program are also available: just install-all or just setup-all.
just install-all is useful for maintaining your setup quickly (2x-5x faster than just setup-all) when its components remain unchanged. If you adjust your vars.yml to remove other components, you'd need to run just setup-all, or these components will still remain installed. Note these shortcuts run the ensure-matrix-users-created tag too.
Collecting metrics to an external Prometheus server
If the integrated Prometheus server is enabled (prometheus_enabled: true), metrics are collected by it from each service via communication that happens over the container network. Each service does not need to expose its metrics "publicly".
When you'd like to collect metrics from an external Prometheus server, you need to expose service metrics outside of the container network.
The playbook provides a single endpoint (https://matrix.example.com/metrics/*), under which various services may expose their metrics (e.g. /metrics/node-exporter, /metrics/postgres-exporter, /metrics/hookshot, etc). To expose the metrics of all services under /metrics/*, use matrix_metrics_exposure_enabled. To protect access using Basic Authentication, see matrix_metrics_exposure_http_basic_auth_enabled and matrix_metrics_exposure_http_basic_auth_users below.
When using matrix_metrics_exposure_enabled, you don't need to expose metrics for individual services one by one.
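For illustration, exposing every service's metrics behind a single set of credentials might look roughly like this in vars.yml. The prometheus username and HTPASSWD_HASH_HERE are placeholders, not values the playbook provides. One way to produce an entry is `htpasswd -nb USERNAME PASSWORD`, which prints a `username:hash` line to standard output.

```yaml
# Expose all supported services under https://matrix.example.com/metrics/*
matrix_metrics_exposure_enabled: true

# Require Basic Authentication for everything beneath /metrics
matrix_metrics_exposure_http_basic_auth_enabled: true

# Raw htpasswd content (placeholder entry; replace with your own generated line)
matrix_metrics_exposure_http_basic_auth_users: "prometheus:HTPASSWD_HASH_HERE"
```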
The following variables may be of interest:
Name | Description |
---|---|
matrix_metrics_exposure_enabled | Set this to true to enable metrics exposure for all services on https://matrix.example.com/metrics/*. If you think this is too much, refer to the helpful (but nonexhaustive) list of individual matrix_SERVICE_metrics_proxying_enabled (or similar) variables below for exposing metrics on a per-service basis. |
matrix_metrics_exposure_http_basic_auth_enabled | Set this to true to protect all https://matrix.example.com/metrics/* endpoints with Basic Authentication (see the other variables below for supplying the actual credentials). When enabled, all endpoints beneath /metrics will be protected with the same credentials. |
matrix_metrics_exposure_http_basic_auth_users | Set this to the Basic Authentication credentials (raw htpasswd file content) used to protect /metrics/*. This htpasswd-file needs to be generated with the htpasswd tool and can include multiple username/password pairs. |
matrix_synapse_metrics_enabled | Set this to true to make Synapse expose metrics (locally, on the container network). |
matrix_synapse_metrics_proxying_enabled | Set this to true to expose Synapse's metrics on https://matrix.example.com/metrics/synapse/main-process and https://matrix.example.com/metrics/synapse/worker/TYPE-ID. Read below if you're running a Synapse worker setup (matrix_synapse_workers_enabled: true). To password-protect the metrics, see matrix_metrics_exposure_http_basic_auth_users above. |
prometheus_node_exporter_enabled | Set this to true to enable the node (general system stats) exporter (locally, on the container network). |
prometheus_node_exporter_container_labels_traefik_enabled | Set this to true to expose the node (general system stats) metrics on https://matrix.example.com/metrics/node-exporter. To password-protect the metrics, see matrix_metrics_exposure_http_basic_auth_users above. |
prometheus_postgres_exporter_enabled | Set this to true to enable the Postgres exporter (locally, on the container network). |
prometheus_postgres_exporter_container_labels_traefik_enabled | Set this to true to expose the Postgres exporter metrics on https://matrix.example.com/metrics/postgres-exporter. To password-protect the metrics, see matrix_metrics_exposure_http_basic_auth_users above. |
matrix_prometheus_nginxlog_exporter_enabled | Set this to true to enable the nginx Log exporter (locally, on the container network). |
matrix_sliding_sync_metrics_enabled | Set this to true to make Sliding Sync expose metrics (locally, on the container network). |
matrix_sliding_sync_metrics_proxying_enabled | Set this to true to expose the Sliding Sync metrics on https://matrix.example.com/metrics/sliding-sync. To password-protect the metrics, see matrix_metrics_exposure_http_basic_auth_users above. |
matrix_bridge_hookshot_metrics_enabled | Set this to true to make Hookshot expose metrics (locally, on the container network). |
matrix_bridge_hookshot_metrics_proxying_enabled | Set this to true to expose the Hookshot metrics on https://matrix.example.com/metrics/hookshot. To password-protect the metrics, see matrix_metrics_exposure_http_basic_auth_users above. |
matrix_SERVICE_metrics_proxying_enabled | Various other services/roles may provide similar _metrics_enabled and _metrics_proxying_enabled variables for exposing their metrics. Refer to each role for details. To password-protect the metrics, see matrix_metrics_exposure_http_basic_auth_users above or the matrix_SERVICE_container_labels_metrics_middleware_basic_auth_enabled / matrix_SERVICE_container_labels_metrics_middleware_basic_auth_users variables provided by each role. |
matrix_media_repo_metrics_enabled | Set this to true to make media-repo expose metrics (locally, on the container network). |
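If exposing everything is more than you need, a per-service setup might look like the sketch below. It uses only variables from the table above and exposes just the Synapse and Node Exporter metrics. The credentials entry is a placeholder, and the choice of services is only an example.

```yaml
# Synapse: produce metrics and expose them under /metrics/synapse/...
matrix_synapse_metrics_enabled: true
matrix_synapse_metrics_proxying_enabled: true

# Node Exporter: enable it and expose it under /metrics/node-exporter
prometheus_node_exporter_enabled: true
prometheus_node_exporter_container_labels_traefik_enabled: true

# Protect the exposed endpoints with Basic Authentication
matrix_metrics_exposure_http_basic_auth_enabled: true
# Placeholder htpasswd entry; replace with your own generated line
matrix_metrics_exposure_http_basic_auth_users: "prometheus:HTPASSWD_HASH_HERE"
```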
Collecting Synapse worker metrics to an external Prometheus server
If you are using workers (matrix_synapse_workers_enabled: true) and have enabled matrix_synapse_metrics_proxying_enabled as described above, the playbook will also automatically expose the metrics of all Synapse workers at https://matrix.example.com/metrics/synapse/worker/ID, where ID corresponds to the worker id as exemplified in matrix_synapse_workers_enabled_list.
The playbook also generates an example config file (/matrix/synapse/external_prometheus.yml.template) with all the correct paths, which you can copy to your Prometheus server and adapt to your needs. Make sure to edit the specified password_file path and contents, as well as the path to your synapse-v2.rules. It will look a bit like this:
scrape_configs:
  - job_name: 'synapse'
    metrics_path: /metrics/synapse/main-process
    scheme: https
    basic_auth:
      username: prometheus
      password_file: /etc/prometheus/password.pwd
    static_configs:
      - targets: ['matrix.example.com:443']
        labels:
          job: "master"
          index: 1
  - job_name: 'matrix-synapse-synapse-worker-generic-worker-0'
    metrics_path: /metrics/synapse/worker/generic-worker-0
    scheme: https
    basic_auth:
      username: prometheus
      password_file: /etc/prometheus/password.pwd
    static_configs:
      - targets: ['matrix.example.com:443']
        labels:
          job: "generic_worker"
          index: 18111
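Other exposed endpoints can be scraped in the same way. As a sketch, a scrape job for the Node Exporter metrics on the external Prometheus server might look like the following. It assumes the node exporter is exposed on /metrics/node-exporter (via prometheus_node_exporter_container_labels_traefik_enabled or matrix_metrics_exposure_enabled) and protected by the same Basic Authentication credentials as above. The job name is arbitrary and is not generated by the playbook.

```yaml
scrape_configs:
  # Hypothetical job name; choose whatever fits your setup.
  - job_name: 'matrix-node-exporter'
    metrics_path: /metrics/node-exporter
    scheme: https
    basic_auth:
      username: prometheus
      password_file: /etc/prometheus/password.pwd
    static_configs:
      - targets: ['matrix.example.com:443']
```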
Troubleshooting
As with all other services, you can find the logs in systemd-journald by logging in to the server with SSH and running the commands below:
- journalctl -fu matrix-prometheus for Prometheus
- journalctl -fu matrix-grafana for Grafana
More information
- Enabling synapse-usage-exporter for Synapse usage statistics
- Understanding Synapse Performance Issues Through Grafana Graphs at the Synapse GitHub Wiki
- The Prometheus scraping rules (we use v2)
- The Synapse Grafana dashboard
- The Node Exporter dashboard (for generic non-synapse performance graphs)
- The PostgreSQL dashboard (generic Postgres dashboard)