[rocm-libraries] ROCm/rocm-libraries#6563 (commit 6559ac9)

[CK] Add render group to AITER and FA dockers

## Motivation

The AITER and FA test dockers (`Dockerfile.aiter`, `Dockerfile.fa`)
inherit from the `rocm/pytorch` base image. Recent updates to that base
image dropped the `render` group from `/etc/group`, so every parallel
test stage now fails on the test agents with:

```
docker: Error response from daemon: Unable to find group render:
no matching entries in group file.
```

The `--group-add render` argument that Jenkins passes to `docker run` is
resolved against the **container's** `/etc/group`, not the host's. So even
though the test agents have `render` in their own `/etc/group` (GID 109),
the lookup inside the container fails.
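
The lookup behaviour described above can be approximated with a small shell sketch. The helper name `group_in_file` and the two temporary files are hypothetical; they stand in for the host's and the container's `/etc/group`, and the `video` GID is illustrative. This only mimics a name lookup in a group file and is not Docker's actual implementation.

```shell
#!/bin/sh
# Hypothetical helper: succeed if group NAME has an entry in group file FILE.
# Group file format: name:password:GID:member-list
group_in_file() {
    awk -F: -v g="$1" '$1 == g { found = 1 } END { exit !found }' "$2"
}

# Illustrative files standing in for the host's and the container's /etc/group.
host_groups=$(mktemp)
container_groups=$(mktemp)
trap 'rm -f "$host_groups" "$container_groups"' EXIT

printf 'root:x:0:\nvideo:x:44:\nrender:x:109:\n' > "$host_groups"
printf 'root:x:0:\nvideo:x:44:\n' > "$container_groups"

# The host entry does not help: only the container's file is consulted.
for f in "$host_groups" "$container_groups"; do
    if group_in_file render "$f"; then
        echo "render: found"
    else
        echo "render: no matching entries in group file"
    fi
done
```

The second iteration reports the missing entry even though the first file contains `render`, which is the situation the test agents hit.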

This pattern affects every recent develop build
([#673](http://micimaster.amd.com/blue/organizations/jenkins/rocm-libraries-folder%2FComposable%20Kernel/detail/develop/673),
[#674](http://micimaster.amd.com/blue/organizations/jenkins/rocm-libraries-folder%2FComposable%20Kernel/detail/develop/674),
[#686](http://micimaster.amd.com/blue/organizations/jenkins/rocm-libraries-folder%2FComposable%20Kernel/detail/develop/686),
[#688](http://micimaster.amd.com/blue/organizations/jenkins/rocm-libraries-folder%2FComposable%20Kernel/detail/develop/688),
[#699](http://micimaster.amd.com/blue/organizations/jenkins/rocm-libraries-folder%2FComposable%20Kernel/detail/develop/699),
[#708](http://micimaster.amd.com/blue/organizations/jenkins/rocm-libraries-folder%2FComposable%20Kernel/detail/develop/708)
over 6 consecutive days), where the AITER tests fail within seconds and
the cascading failure aborts all downstream Build/FMHA/TILE_ENGINE stages.

## Technical Details

Add `groupadd -f render` to both `Dockerfile.aiter` and `Dockerfile.fa`,
mirroring what the main `Dockerfile` already does (`Dockerfile:96`) and
what `Dockerfile.pytorch` does (`Dockerfile.pytorch:4`). The diff also adds
`groupadd -f video` alongside it. The `-f` flag makes each call idempotent:
`groupadd` exits successfully without changes if the group already exists.

This guarantees the `render` group is always present in the container,
regardless of whether the base image happens to ship it.
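
The idempotency argument can be illustrated with a minimal sketch that runs without root. The function `ensure_group` is a hypothetical stand-in for `groupadd -f NAME`: it edits a scratch file rather than `/etc/group`, and its GID choice (current maximum plus one) is a simplification, not how `groupadd` allocates GIDs.

```shell
#!/bin/sh
# Hypothetical stand-in for `groupadd -f NAME`, operating on a scratch file
# instead of /etc/group so that it can be run without root.
ensure_group() {
    name=$1
    file=$2
    # Already present: succeed without touching the file (the `-f` behaviour).
    if awk -F: -v g="$name" '$1 == g { f = 1 } END { exit !f }' "$file"; then
        return 0
    fi
    # Otherwise append an entry; the GID choice here is a simplification.
    gid=$(awk -F: '$3 > max { max = $3 } END { print max + 1 }' "$file")
    printf '%s:x:%s:\n' "$name" "$gid" >> "$file"
}

groups_file=$(mktemp)
trap 'rm -f "$groups_file"' EXIT
printf 'root:x:0:\nvideo:x:44:\n' > "$groups_file"

ensure_group render "$groups_file"   # base image lacks render: entry is added
ensure_group render "$groups_file"   # second call is a no-op
ensure_group video "$groups_file"    # base image ships video: no-op

grep -c '^render:' "$groups_file"    # prints 1
```

Whether or not the starting file contains the group, the end state is the same, which is the property the Dockerfile change relies on.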

## Test Plan
Triggering AITER CI job:

## Test Result

## Submission Checklist

- [x] Look over the contributing guidelines at
  https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
Yi DING
2026-04-21 05:36:37 +00:00
committed by assistant-librarian[bot]
parent 60ff5693c4
commit eaaed3e35e
2 changed files with 4 additions and 0 deletions

@@ -34,6 +34,8 @@ RUN pip install pandas zmq einops ninja tabulate vcs_versioning && \
 python3 setup.py develop && \
 groupadd -g 1001 jenkins && \
 useradd -u 1001 -g 1001 -m -s /bin/bash jenkins && \
+groupadd -f video && \
+groupadd -f render && \
 chown -R jenkins:jenkins /home/jenkins && \
 chmod -R a+rwx /home/jenkins && \
 chown -R jenkins:jenkins /tmp && \

@@ -36,6 +36,8 @@ RUN set -x ; \
 MAX_JOBS=$(nproc) GPU_ARCHS="$GPU_ARCHS" /opt/venv/bin/python3 -u -m pip install --no-build-isolation -v . && \
 groupadd -g 1001 jenkins && \
 useradd -u 1001 -g 1001 -m -s /bin/bash jenkins && \
+groupadd -f video && \
+groupadd -f render && \
 chown -R jenkins:jenkins /home/jenkins && \
 chmod -R a+rwx /home/jenkins && \
 chown -R jenkins:jenkins /tmp && \