mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-04-20 06:49:15 +00:00
[rocm-libraries] ROCm/rocm-libraries#5713 (commit e179279)
Adding New Notification Detection ## Motivation Restricting one of the notification failure patterns to match a specific missing drivers log pattern. This will help reduce the noise of erroneous logs. Also adding a new failure pattern to notify us of Github access issues. ## Technical Details - Set the failure pattern to match the exact failure observed in the logs. - Switching to a plain substring search so special characters are handled literally. - Added a new failure pattern for Github access errors. ## Test Plan - Force a failure using the known failure patterns. ## Test Result The forced failures were triggered and caught by the notification system. ## Submission Checklist - [ ] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
This commit is contained in:
committed by
assistant-librarian[bot]
parent
ba2fb0224f
commit
5a4243096b
@@ -22,12 +22,13 @@ PATTERNS=(
|
||||
'login attempt to .* failed with status: 401 Unauthorized'
|
||||
'docker login failed'
|
||||
'HTTP request sent .* 404 Not Found'
|
||||
'cat: .* No such file or directory'
|
||||
'/sys/module/amdgpu/version: No such file or directory'
|
||||
'GPU not found'
|
||||
'Could not connect to Redis at .* Connection timed out'
|
||||
'unauthorized: your account must log in with a Personal Access Token'
|
||||
'sccache: error: Server startup failed: Address in use'
|
||||
'No space left on device'
|
||||
'Could not resolve host: github.com'
|
||||
)
|
||||
|
||||
DESCRIPTIONS=(
|
||||
@@ -40,10 +41,11 @@ DESCRIPTIONS=(
|
||||
"Docker login failed"
|
||||
"Sccache Error"
|
||||
"Device space error"
|
||||
"Unable to access Github"
|
||||
)
|
||||
|
||||
# Indices into PATTERNS/DESCRIPTIONS for which a node name lookup is performed.
|
||||
NODE_PATTERN_INDICES=(3 4 8) # cat: No such file, GPU not found, No space left on device
|
||||
NODE_PATTERN_INDICES=(3 4 8 9)
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Fetch and scan the log.
|
||||
@@ -92,7 +94,7 @@ process_block() {
|
||||
if [[ "$node_idx" == "$i" ]]; then
|
||||
node_name=$(wget -q --no-check-certificate -O - "${BUILD_URL}consoleText" | awk '
|
||||
/NODE_NAME[[:space:]]*=/ { node = $NF }
|
||||
/'"$pattern"'/ { print node; exit }
|
||||
index($0, "'"$pattern"'") { print node; exit }
|
||||
')
|
||||
break
|
||||
fi
|
||||
|
||||
Reference in New Issue
Block a user