fix(server): Fix delay with multiple ml servers (#16284)

* Prospective fix for ensuring that known active ML servers are used to reduce search delay.
* Added some logging and renamed backoff const.
* Fix lint issues.
* Update to use env vars for timeouts and updated documentation and strings.
* Fix docs.
* Make counter logic clearer.
* Minor readability improvements.
* Extract skipUrl logic per feedback, and change log to verbose.
* Make code harder to read.
2025-11-07 17:27:20 +00:00 · 2025-02-28 03:14:09 +11:00 · a808b8610e
commit a808b8610e
parent c70c9067b0
5 changed files with 82 additions and 1 deletions
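
The commit message and the documentation changes in this commit describe trying each ML server in order and temporarily ignoring servers that did not respond. A minimal sketch of that idea follows. It is not the code from this commit: `ServerAvailability`, `requestWithFailover`, and the `send` callback are hypothetical names, and throwing when no server is usable is an assumption of the sketch. Only the 30000 ms backoff value comes from the documented default of `MACHINE_LEARNING_AVAILABILITY_BACKOFF_TIME`.

```typescript
// Hypothetical sketch; names and structure are assumptions, not Immich's code.

// Tracks which servers recently failed so they can be skipped for a while.
class ServerAvailability {
  private failedAt = new Map<string, number>();
  private backoffMs: number;
  private now: () => number;

  constructor(backoffMs: number, now: () => number = () => Date.now()) {
    this.backoffMs = backoffMs;
    this.now = now;
  }

  markFailed(url: string): void {
    this.failedAt.set(url, this.now());
  }

  markHealthy(url: string): void {
    this.failedAt.delete(url);
  }

  // A server is skipped while its last failure is younger than the backoff.
  shouldSkip(url: string): boolean {
    const failed = this.failedAt.get(url);
    return failed !== undefined && this.now() - failed < this.backoffMs;
  }

  // Servers worth trying, in their configured order (first to last).
  candidates(urls: string[]): string[] {
    return urls.filter((url) => !this.shouldSkip(url));
  }
}

// Tries each candidate in order until one call succeeds.
async function requestWithFailover<T>(
  urls: string[],
  availability: ServerAvailability,
  send: (url: string) => Promise<T>,
): Promise<T> {
  for (const url of availability.candidates(urls)) {
    try {
      const result = await send(url);
      availability.markHealthy(url);
      return result;
    } catch {
      // Remember the failure so later requests skip this server for now.
      availability.markFailed(url);
    }
  }
  throw new Error('No machine learning server responded');
}
```

Passing the clock in as a function keeps the backoff logic deterministic under test. A caller could construct `new ServerAvailability(30000)` once and reuse it across requests so that failures observed by one request shorten the wait for the next.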
--- a/docs/docs/administration/system-settings.md
+++ b/docs/docs/administration/system-settings.md
@@ -98,6 +98,14 @@ The default Immich log level is `Log` (commonly known as `Info`). The Immich adm
 Through this setting, you can manage all the settings related to machine learning in Immich, from the setting of remote machine learning to the model and its parameters
 You can choose to disable a certain type of machine learning, for example smart search or facial recognition.

+### URL
+
+The built-in machine learning server (`http://immich-machine-learning:3003`) is configured by default, but you can change this URL or add additional servers.
+
+Hosting the `immich-machine-learning` container on a machine with a more powerful GPU can be helpful for processing a large number of photos (such as during batch import) or for faster search.
+
+If more than one URL is provided, the servers are tried one at a time, in order from first to last, until one responds successfully. Servers that don't respond are temporarily ignored until they come back online.
+
 ### Smart Search

 The [smart search](/docs/features/searching) settings allow you to change the [CLIP model](https://openai.com/research/clip). Larger models will typically provide [more accurate search results](https://github.com/immich-app/immich/discussions/11862) but consume more processing power and RAM. When [changing the CLIP model](/docs/FAQ#can-i-use-a-custom-clip-model) it is mandatory to re-run the Smart Search job on all images to fully apply the change.
--- a/docs/docs/install/environment-variables.md
+++ b/docs/docs/install/environment-variables.md
@@ -168,6 +168,8 @@ Redis (Sentinel) URL example JSON before encoding:
 | `MACHINE_LEARNING_ANN_TUNING_LEVEL`                         | ARM-NN GPU tuning level (1: rapid, 2: normal, 3: exhaustive)                                        |               `2`               | machine learning |
 | `MACHINE_LEARNING_DEVICE_IDS`<sup>\*4</sup>                 | Device IDs to use in multi-GPU environments                                                         |               `0`               | machine learning |
 | `MACHINE_LEARNING_MAX_BATCH_SIZE__FACIAL_RECOGNITION`       | Set the maximum number of faces that will be processed at once by the facial recognition model      |  None (`1` if using OpenVINO)   | machine learning |
+| `MACHINE_LEARNING_PING_TIMEOUT`                             | How long (ms) to wait for a PING response when checking if an ML server is available                |             `2000`              | server           |
+| `MACHINE_LEARNING_AVAILABILITY_BACKOFF_TIME`                | How long (ms) to ignore ML servers that are offline before trying again                             |             `30000`             | server           |

 \*1: It is recommended to begin with this parameter when changing the concurrency levels of the machine learning service and then tune the other ones.