diff --git a/CHANGELOG.md b/CHANGELOG.md index d8ebdd0ed..153be11fa 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,3 +1,23 @@ +# 2024-02-14 + +## Much larger Synapse caches and cache auto-tuning enabled by default + +Thanks to [FSG-Cat](https://github.com/FSG-Cat), the playbook now uses much larger caches and enables Synapse's [cache auto-tuning functionality](https://matrix-org.github.io/synapse/latest/usage/configuration/config_documentation.html#caches-and-associated-values). +This work and the default values used by the playbook are inspired by [Tom Foster](https://github.com/tcpipuk)'s [Synapse homeserver guide](https://tcpipuk.github.io/synapse/deployment/synapse.html). + +The playbook has always used a very conservative cache factor (`matrix_synapse_caches_global_factor`) value of `0.5`, which may be OK for small and underactive deployments, but is not ideal for larger servers. Paradoxically, a small global cache factor value [does not necessarily decrease RAM usage as a whole](https://github.com/matrix-org/synapse/issues/3939). + +The playbook now uses **a 20x larger cache factor** (currently `10`), adjusts a few other cache-related variables, and **enables cache auto-tuning** via the following variables: + +- `matrix_synapse_cache_autotuning_max_cache_memory_usage` - defaults to 1/8 of total RAM with a cap of 2GB; values are specified in bytes +- `matrix_synapse_cache_autotuning_target_cache_memory_usage` - defaults to 1/16 of total RAM with a cap of 1GB; values are specified in bytes +- `matrix_synapse_cache_autotuning_min_cache_ttl` - defaults to `30s` + +These values should be good defaults for most servers, but may change over time as we experiment further. + +Refer to our new [Tuning caches and cache autotuning](docs/maintenance-synapse.md#tuning-caches-and-cache-autotuning) documentation section for more details. 
+ + # 2024-01-31 ## (Backward-compatibility break) Minor changes necessary for some people serving a static website at the base domain diff --git a/docs/maintenance-synapse.md b/docs/maintenance-synapse.md index ed1dee7c2..93c150022 100644 --- a/docs/maintenance-synapse.md +++ b/docs/maintenance-synapse.md @@ -74,8 +74,32 @@ Synapse's presence feature which tracks which users are online and which are off If you have enough compute resources (CPU & RAM), you can make Synapse better use of them by [enabling load-balancing with workers](configuring-playbook-synapse.md#load-balancing-with-workers). -Tuning Synapse's cache factor can help reduce RAM usage. [See the upstream documentation](https://github.com/element-hq/synapse#help-synapse-is-slow-and-eats-all-my-ram-cpu) for more information on what value to set the cache factor to. Use the variable `matrix_synapse_caches_global_factor` to set the cache factor. - [Tuning your PostgreSQL database](maintenance-postgres.md#tuning-postgresql) could also improve Synapse performance. The playbook tunes the integrated Postgres database automatically, but based on your needs you may wish to adjust tuning variables manually. If you're using an [external Postgres database](configuring-playbook-external-postgres.md), you will aslo need to tune Postgres manually. +### Tuning caches and cache autotuning + +Tuning Synapse's cache factor is useful for performance increases but also as part of controlling Synapse's memory use. Use the variable `matrix_synapse_caches_global_factor` to set the cache factor as part of this process. + +**The playbook defaults the global cache factor to a large value** (e.g. `10`). A smaller value (e.g. `0.5`) will decrease the amount used for caches, but will [not necessarily decrease RAM usage as a whole](https://github.com/matrix-org/synapse/issues/3939). 
+ +Tuning the cache factor on its own is useful only to a limited degree (it's a crude tool in isolation), so users who tune their cache factor should likely look into tuning the autotune variables as well (see below). + +Cache autotuning is **enabled by default** and controlled via the following variables: + +- `matrix_synapse_cache_autotuning_max_cache_memory_usage` - defaults to 1/8 of total RAM with a cap of 2GB; values are specified in bytes +- `matrix_synapse_cache_autotuning_target_cache_memory_usage` - defaults to 1/16 of total RAM with a cap of 1GB; values are specified in bytes +- `matrix_synapse_cache_autotuning_min_cache_ttl` - defaults to `30s` + +You can **learn more about cache autotuning and the global cache factor settings** in [Synapse's documentation on caches and associated values](https://matrix-org.github.io/synapse/latest/usage/configuration/config_documentation.html#caches-and-associated-values). + +To **disable cache auto-tuning**, unset all values: + +```yml +matrix_synapse_cache_autotuning_max_cache_memory_usage: '' +matrix_synapse_cache_autotuning_target_cache_memory_usage: '' +matrix_synapse_cache_autotuning_min_cache_ttl: '' +``` + +Users who wish to lower Synapse's RAM footprint should look into lowering the global cache factor and tweaking the autotune variables (or disabling auto-tuning). If your cache factor is too low for a given autotune setting, your caches will not reach the autotune thresholds and autotune won't be able to do its job. Therefore, when auto-tuning is enabled (which it is by default), it's recommended to keep your cache factor large. + See also [How do I optimize this setup for a low-power server?](faq.md#how-do-i-optimize-this-setup-for-a-low-power-server).
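Reviewer note: the documented defaults (1/8 of total RAM capped at 2GB, and 1/16 of total RAM capped at 1GB, both in bytes) can be sanity-checked with a small sketch. This is not the playbook's implementation, which computes the values in Jinja from `ansible_memtotal_mb`. The helper name `default_autotune_bytes` is hypothetical, and reading "2GB"/"1GB" as 2 GiB/1 GiB is an assumption.

```python
GIB = 1024 ** 3  # assumption: "GB" in the docs is read as GiB


def default_autotune_bytes(total_ram_bytes: int) -> tuple[int, int]:
    """Return (max_cache_memory_usage, target_cache_memory_usage) in bytes.

    Hypothetical helper mirroring the documented defaults:
    max    = 1/8 of total RAM, capped at 2 GiB
    target = 1/16 of total RAM, capped at 1 GiB
    """
    max_usage = min(total_ram_bytes // 8, 2 * GIB)
    target_usage = min(total_ram_bytes // 16, 1 * GIB)
    return max_usage, target_usage


if __name__ == "__main__":
    # Print the documented defaults for a few example server sizes.
    for ram_gib in (2, 4, 16, 64):
        max_usage, target_usage = default_autotune_bytes(ram_gib * GIB)
        print(f"{ram_gib} GiB RAM -> max={max_usage} target={target_usage}")
```

Under these assumptions, the caps only take effect on hosts with 16 GiB of RAM or more; smaller hosts get the proportional values.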
diff --git a/roles/custom/matrix-synapse/defaults/main.yml b/roles/custom/matrix-synapse/defaults/main.yml index 13e16b360..9b9f068fd 100644 --- a/roles/custom/matrix-synapse/defaults/main.yml +++ b/roles/custom/matrix-synapse/defaults/main.yml @@ -547,8 +547,23 @@ matrix_synapse_event_cache_size: "100K" # Controls cache sizes for Synapse. # Raise this to increase cache sizes or lower it to potentially lower memory use. -# To learn more, see: https://github.com/matrix-org/synapse/issues/3939 -matrix_synapse_caches_global_factor: 0.5 +# To learn more, see: +# - https://matrix-org.github.io/synapse/latest/usage/configuration/config_documentation.html#caching +# - https://github.com/matrix-org/synapse/issues/3939 +# Default cache timings are taken from https://tcpipuk.github.io/synapse/deployment/synapse.html +# The idea behind these timings is that entries become eligible for eviction early, but are kept around for a long time when they are not forced out. +# Long cache lifetimes together with the low minimum TTL allow autotuning to be the primary eviction method, assuming the cache size limit is hit before other limits. +matrix_synapse_caches_global_factor: 10 +matrix_synapse_caches_expire_caches: true +matrix_synapse_caches_cache_entry_ttl: "1080m" +matrix_synapse_caches_sync_response_cache_duration: "2m" +matrix_synapse_cache_autotuning_min_cache_ttl: "30s" +# The cache tuning math used here is derived from the math used to autotune sizes for Postgres. +# The memtotal variable can in theory be overridden to make Synapse think it has less RAM to work with. +# But if you're at the point of considering that, just override the math or put static values in.
+matrix_synapse_memtotal_kb: "{{ ansible_memtotal_mb*1024|int }}" +matrix_synapse_cache_autotuning_max_cache_memory_usage: "{{ 2147483648 if (matrix_synapse_memtotal_kb|int/8)/1024 >= 2048 else (matrix_synapse_memtotal_kb|int * 1024 / 8)|int }}" +matrix_synapse_cache_autotuning_target_cache_memory_usage: "{{ 1073741824 if (matrix_synapse_memtotal_kb|int/16)/1024 >= 1024 else (matrix_synapse_memtotal_kb|int * 1024 / 16)|int }}" # Controls whether Synapse will federate at all. # Disable this to completely isolate your server from the rest of the Matrix network. diff --git a/roles/custom/matrix-synapse/tasks/validate_config.yml b/roles/custom/matrix-synapse/tasks/validate_config.yml index 2f1a5e1c0..530ce4be6 100644 --- a/roles/custom/matrix-synapse/tasks/validate_config.yml +++ b/roles/custom/matrix-synapse/tasks/validate_config.yml @@ -89,6 +89,9 @@ - {'old': 'matrix_synapse_send_federation', 'new': ''} - {'old': 'matrix_synapse_start_pushers', 'new': ''} - {'old': 'matrix_synapse_spam_checker', 'new': ''} + - {'old': 'matrix_synapse_caches_autotuning_max_cache_memory_usage', 'new': 'matrix_synapse_cache_autotuning_max_cache_memory_usage'} + - {'old': 'matrix_synapse_caches_autotuning_target_cache_memory_usage', 'new': 'matrix_synapse_cache_autotuning_target_cache_memory_usage'} + - {'old': 'matrix_synapse_caches_autotuning_min_cache_ttl', 'new': 'matrix_synapse_cache_autotuning_min_cache_ttl'} - name: (Deprecation) Catch and report renamed settings in matrix_synapse_configuration_extension_yaml ansible.builtin.fail: diff --git a/roles/custom/matrix-synapse/templates/synapse/homeserver.yaml.j2 b/roles/custom/matrix-synapse/templates/synapse/homeserver.yaml.j2 index b6cc6428b..5206dce3f 100644 --- a/roles/custom/matrix-synapse/templates/synapse/homeserver.yaml.j2 +++ b/roles/custom/matrix-synapse/templates/synapse/homeserver.yaml.j2 @@ -760,49 +760,48 @@ federation_domain_whitelist: {{ matrix_synapse_federation_domain_whitelist|to_js # The number of events to cache in memory.
Not affected by # caches.global_factor. # -event_cache_size: "{{ matrix_synapse_event_cache_size }}" +event_cache_size: {{ matrix_synapse_event_cache_size | to_json }} caches: - # Controls the global cache factor, which is the default cache factor - # for all caches if a specific factor for that cache is not otherwise - # set. - # - # This can also be set by the "SYNAPSE_CACHE_FACTOR" environment - # variable. Setting by environment variable takes priority over - # setting through the config file. - # - # Defaults to 0.5, which will half the size of all caches. - # - global_factor: {{ matrix_synapse_caches_global_factor }} - - # A dictionary of cache name to cache factor for that individual - # cache. Overrides the global cache factor for a given cache. - # - # These can also be set through environment variables comprised - # of "SYNAPSE_CACHE_FACTOR_" + the name of the cache in capital - # letters and underscores. Setting by environment variable - # takes priority over setting through the config file. - # Ex. SYNAPSE_CACHE_FACTOR_GET_USERS_WHO_SHARE_ROOM_WITH_USER=2.0 - # - # Some caches have '*' and other characters that are not - # alphanumeric or underscores. These caches can be named with or - # without the special characters stripped. For example, to specify - # the cache factor for `*stateGroupCache*` via an environment - # variable would be `SYNAPSE_CACHE_FACTOR_STATEGROUPCACHE=2.0`. - # - per_cache_factors: - #get_users_who_share_room_with_user: 2.0 + # Controls the global cache factor, which is the default cache factor + # for all caches if a specific factor for that cache is not otherwise + # set. + # + # This can also be set by the "SYNAPSE_CACHE_FACTOR" environment + # variable. Setting by environment variable takes priority over + # setting through the config file. + # + # Defaults to 0.5, which will half the size of all caches. 
+ # + global_factor: {{ matrix_synapse_caches_global_factor | to_json }} + # A dictionary of cache name to cache factor for that individual + # cache. Overrides the global cache factor for a given cache. + # + # These can also be set through environment variables comprised + # of "SYNAPSE_CACHE_FACTOR_" + the name of the cache in capital + # letters and underscores. Setting by environment variable + # takes priority over setting through the config file. + # Ex. SYNAPSE_CACHE_FACTOR_GET_USERS_WHO_SHARE_ROOM_WITH_USER=2.0 + # + # Some caches have '*' and other characters that are not + # alphanumeric or underscores. These caches can be named with or + # without the special characters stripped. For example, to specify + # the cache factor for `*stateGroupCache*` via an environment + # variable would be `SYNAPSE_CACHE_FACTOR_STATEGROUPCACHE=2.0`. + # + per_cache_factors: + #get_users_who_share_room_with_user: 2.0 # Controls whether cache entries are evicted after a specified time # period. Defaults to true. Uncomment to disable this feature. # - #expire_caches: false + expire_caches: {{ matrix_synapse_caches_expire_caches | to_json }} # If expire_caches is enabled, this flag controls how long an entry can # be in a cache without having been accessed before being evicted. # Defaults to 30m. Uncomment to set a different time to live for cache entries. # - #cache_entry_ttl: 30m + cache_entry_ttl: {{ matrix_synapse_caches_cache_entry_ttl | to_json }} # Controls how long the results of a /sync request are cached for after # a successful response is returned. A higher duration can help clients with @@ -811,7 +810,12 @@ caches: # By default, this is zero, which means that sync responses are not cached # at all. 
# - #sync_response_cache_duration: 2m + sync_response_cache_duration: {{ matrix_synapse_caches_sync_response_cache_duration | to_json }} + + cache_autotuning: + max_cache_memory_usage: {{ matrix_synapse_cache_autotuning_max_cache_memory_usage | to_json }} + target_cache_memory_usage: {{ matrix_synapse_cache_autotuning_target_cache_memory_usage | to_json }} + min_cache_ttl: {{ matrix_synapse_cache_autotuning_min_cache_ttl | to_json }} ## Database ##
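Reviewer note: the template hunk above switches the cache settings to `| to_json`. As a rough illustration of how that affects quoting in the rendered `homeserver.yaml`, the sketch below assumes that Ansible's `to_json` filter behaves like Python's `json.dumps` for plain scalars. The helper `render_scalar` is hypothetical and is not part of the playbook.

```python
import json


def render_scalar(value) -> str:
    """Approximate what `{{ value | to_json }}` emits for a plain scalar.

    Assumption: for strings, numbers and booleans, Ansible's to_json
    filter produces the same text as json.dumps.
    """
    return json.dumps(value)


if __name__ == "__main__":
    # Strings are quoted, numbers stay bare, booleans become lowercase.
    for name, value in [
        ("event_cache_size", "100K"),
        ("global_factor", 10),
        ("expire_caches", True),
        ("cache_entry_ttl", "1080m"),
        ("min_cache_ttl", ""),  # the "unset" value from the docs
    ]:
        print(f"{name}: {render_scalar(value)}")
```

Because JSON scalars are also valid YAML scalars, this keeps the rendered file well-formed even when a value is an empty string.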