Resource requirements
A typical Packit Service deployment consists of the following services, with the resource requirements listed below.
CPU requirements
| Deployment | Requested (always assigned) | Limit |
|---|---|---|
| postgres | 30m | 1 |
| redict | 10m | 10m |
| flower | 5m | 50m |
| nginx | 5m | 10m |
| pushgateway | 5m | 10m |
| tokman | 20m (prod, 5m otherwise) | 50m |
| dashboard | 5m | 50m |
| fedmsg | 5m | 50m |
| beat | 5m | 50m |
| service | 10m | 200m |
| worker (generic) | 100m | 400m |
| worker (short) | 80m | 400m |
| worker (long) | 100m | 600m |
Memory requirements
| Deployment | Requested (always assigned) | Limit |
|---|---|---|
| postgres | 1Gi (prod, 256Mi otherwise) | 1536Mi (prod, 512Mi otherwise) |
| redict | 128Mi | 256Mi |
| flower | 80Mi | 128Mi |
| nginx | 8Mi | 32Mi |
| pushgateway | 16Mi | 32Mi |
| tokman | 100Mi (prod, 88Mi otherwise) | 160Mi (prod, 128Mi otherwise) |
| dashboard | 128Mi | 256Mi |
| fedmsg | 88Mi | 128Mi |
| beat | 160Mi | 256Mi |
| service | 320Mi | 1Gi (prod, 512Mi otherwise) |
| worker (generic) | 384Mi | 1024Mi |
| worker (short) | 320Mi | 640Mi |
| worker (long) | 384Mi | 1024Mi |
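The tables above mix units (`1` vs. `30m` for CPU, `1Gi` vs. `1536Mi` for memory), so comparing or summing the values requires normalizing them first. A minimal sketch of such a normalization follows; the helper names `cpu_to_millicores` and `memory_to_mebibytes` are hypothetical and not part of any Packit tooling, and only the suffixes that appear in the tables are handled.

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "30m" or "1" to millicores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    # A bare number means whole cores; 1 core = 1000 millicores.
    return round(float(quantity) * 1000)


def memory_to_mebibytes(quantity: str) -> int:
    """Convert a memory quantity such as "1Gi" or "1536Mi" to MiB."""
    quantity = quantity.strip()
    if quantity.endswith("Gi"):
        return round(float(quantity[:-2]) * 1024)
    if quantity.endswith("Mi"):
        return int(quantity[:-2])
    # Other suffixes are deliberately rejected in this sketch.
    raise ValueError(f"unsupported memory quantity: {quantity!r}")


if __name__ == "__main__":
    # postgres in production: CPU limit of 1 core, memory limit of 1536Mi
    print(cpu_to_millicores("1"), memory_to_mebibytes("1536Mi"))
```

Rejecting unknown suffixes instead of guessing avoids silently misreading a value such as `320m`, which for memory would not mean mebibytes.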
Currently allowed requirements / limits
| Resource | Allowed to request | Limit |
|---|---|---|
| CPU | 3 | 12 |
| Memory | 6Gi | 8Gi |
Total for production
| Deployment | Memory request | Memory limit | CPU request | CPU limit |
|---|---|---|---|---|
| non-scalable¹ | 2052Mi | 3808Mi | 100m | 1480m |
| 2× short worker | 640Mi | 1280Mi | 160m | 800m |
| 2× long worker | 768Mi | 2048Mi | 200m | 1200m |
| Σ | 3460Mi | 7136Mi | 460m | 3480m |
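The totals above can be reproduced from the per-deployment tables. The following sketch uses the production values, with memory in Mi and CPU in millicores, sums them, and compares the result with the currently allowed quota. The data layout and helper names are illustrative assumptions, not an existing script.

```python
# Each row: (memory request Mi, memory limit Mi, CPU request m, CPU limit m),
# taken from the production values in the tables above.
NON_SCALABLE = {
    "postgres": (1024, 1536, 30, 1000),
    "redict": (128, 256, 10, 10),
    "flower": (80, 128, 5, 50),
    "nginx": (8, 32, 5, 10),
    "pushgateway": (16, 32, 5, 10),
    "tokman": (100, 160, 20, 50),
    "dashboard": (128, 256, 5, 50),
    "fedmsg": (88, 128, 5, 50),
    "beat": (160, 256, 5, 50),
    "service": (320, 1024, 10, 200),
}
WORKERS = {
    "short": (320, 640, 80, 400),
    "long": (384, 1024, 100, 600),
}
REPLICAS = {"short": 2, "long": 2}

# Currently allowed quota in the same column order (6Gi, 8Gi, 3 cores, 12 cores).
QUOTA = (6 * 1024, 8 * 1024, 3000, 12000)


def total(rows):
    """Sum rows column by column."""
    result = (0, 0, 0, 0)
    for row in rows:
        result = tuple(a + b for a, b in zip(result, row))
    return result


def scale(row, replicas):
    """Multiply every column of a row by the number of replicas."""
    return tuple(replicas * value for value in row)


non_scalable_total = total(NON_SCALABLE.values())
worker_totals = {name: scale(row, REPLICAS[name]) for name, row in WORKERS.items()}
production_total = total([non_scalable_total, *worker_totals.values()])
fits_quota = all(used <= allowed for used, allowed in zip(production_total, QUOTA))

if __name__ == "__main__":
    print("non-scalable:", non_scalable_total)
    print("total:", production_total)
    print("fits current quota:", fits_quota)
```

With these inputs the memory limit total sits 1056Mi below the 8Gi limit quota, which is the tight margin the proposal below addresses.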
Proposed changes
Revert to the pre-MP+ resources. They were higher for service, workers and postgres; the lower values were used because of a hardcoded check in the templates.

Pre-MP+ memory requests/limits for the production deployment:

| Deployment | Requested | Limit |
|---|---|---|
| postgres | 2Gi | 4Gi |
| service | 320Mi | 4Gi |

With the current setup (2× short-running and 2× long-running workers), we would need:

| Resource | Request | Limit |
|---|---|---|
| CPU | 460m | 3480m |
| Memory | 4484Mi | 12768Mi |

Multiplying the memory quotas by 3 would leave ~11Gi of memory, which should be enough to scale up by a few more workers if needed. This setup would also allow scaling up to 8 workers per queue.

Request adjustments of the quotas so that we have some buffer (database migrations, higher load on service, etc.) and could also permanently scale up the workers if we find the service to be more reliable that way.
- Based on the calculations above, 2× the current memory quotas would be sufficient, but if we were to scale the workers up too (and account for possible adjustments, e.g., Redict), we should probably go for 3×.
Migrate Tokman to a different toolchain. It is a small, self-contained app, so it should be easy to migrate to either Rust or Go, which should leave a smaller footprint.
- Opened an issue for testing out running without the Tokman deployment: https://github.com/packit/tokman/issues/72
- An issue for the migration is to be opened if the previous issue “fails” (i.e., Tokman is still needed, or dropping it negatively affects the number of requests to GitHub).
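The memory arithmetic behind the quota proposal can be checked with a short sketch. It starts from the current non-scalable production totals (2052Mi requested, 3808Mi limit), swaps in the pre-MP+ values for postgres and service, and adds the workers. It assumes that the service request stays at 320Mi, which is the value that makes the totals add up to 4484Mi, and that both quota columns are multiplied by the same factor. The function names are hypothetical.

```python
# Memory only, as (request Mi, limit Mi).
NON_SCALABLE_CURRENT = (2052, 3808)
CURRENT = {"postgres": (1024, 1536), "service": (320, 1024)}
# Pre-MP+ values; the 320Mi service request is an assumption (see lead-in).
PRE_MP = {"postgres": (2048, 4096), "service": (320, 4096)}
WORKERS = {"short": (320, 640), "long": (384, 1024)}
QUOTA = (6 * 1024, 8 * 1024)  # currently allowed request / limit


def required(workers_per_queue: int = 2) -> tuple[int, int]:
    """Memory needed with pre-MP+ postgres/service and N workers per queue."""
    request, limit = NON_SCALABLE_CURRENT
    for name in PRE_MP:
        # Replace the current value of the deployment with its pre-MP+ value.
        request += PRE_MP[name][0] - CURRENT[name][0]
        limit += PRE_MP[name][1] - CURRENT[name][1]
    for worker_request, worker_limit in WORKERS.values():
        request += workers_per_queue * worker_request
        limit += workers_per_queue * worker_limit
    return request, limit


def headroom(factor: int, workers_per_queue: int = 2) -> tuple[int, int]:
    """Memory left (request, limit) if the quotas are multiplied by `factor`."""
    request, limit = required(workers_per_queue)
    return QUOTA[0] * factor - request, QUOTA[1] * factor - limit


if __name__ == "__main__":
    print("required:", required())
    for factor in (2, 3):
        print(f"{factor}x quota headroom:", headroom(factor))
    print("3x quota, 8 workers per queue:", headroom(3, workers_per_queue=8))
```

A non-negative headroom in both columns means the configuration fits; for 3× quotas the limit headroom is 11808Mi, i.e., the ~11Gi mentioned above.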
1. Includes the non-scalable deployments, i.e., those that run just one pod each (e.g., dashboard, redict, postgres).