Test comprehensive dataset (#2685)

* Add CSV-driven convolution test pipeline

- Add test_grouped_convnd_fwd_dataset_xdl.cpp with CSV reader functionality
- Add complete dataset generation toolchain in test_data/
- Add Jenkins integration with RUN_CONV_COMPREHENSIVE_DATASET parameter
- Ready for comprehensive convolution testing with scalable datasets

* Update convolution test dataset generation pipeline

* Add 2D and 3D dataset CSV files

* Remove CSV test dataset files from repository

* Update generate_test_dataset.sh

* Fix channel division for MIOpen to CK conversion

* Remove unnecessary test files

* Fix clang-format-18 formatting issues

* TEST: Enable comprehensive dataset tests by default

* Fix test_data path in Jenkins - build runs from build directory

* Add Python dependencies and debug output for CSV generation

* Remove Python package installation - not needed

* Add better debugging for generate_test_dataset.sh execution

* Fix Jenkinsfile syntax error - escape dollar signs

* Add PyTorch to Docker image for convolution test dataset generation

- Install PyTorch CPU version for lightweight model execution
- Fixes Jenkins CI failures where CSV files were empty due to missing PyTorch
- Model generation scripts require PyTorch to extract convolution parameters

* Add debugging to understand Jenkins directory structure and CSV file status

- Print current working directory
- List CSV files in test_data directory
- Show line counts of CSV files
- Will help diagnose why tests fail in Jenkins

* Fix clang-format-18 formatting issues

- Applied clang-format-18 to test file
- Fixed brace placement and whitespace issues

* Add detailed debugging for CSV dataset investigation

- Check generated_datasets directory contents
- List all CSV files with line counts
- Show first 5 lines of main CSV file
- Applied clang-format-18 formatting
- This will help identify why CSV files are empty in Jenkins
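
The checks listed above (CSV files present, line counts, first five lines) could also be gathered from Python. This is only a sketch: `summarize_csv_files` is a hypothetical helper that is not part of this commit, and `generated_datasets` is the directory name taken from the message above.

```python
from pathlib import Path


def summarize_csv_files(directory, preview_lines=5):
    """Return {file name: (line count, first few lines)} for each CSV in directory."""
    summary = {}
    for csv_path in sorted(Path(directory).glob("*.csv")):
        with csv_path.open(newline="") as handle:
            lines = handle.read().splitlines()
        summary[csv_path.name] = (len(lines), lines[:preview_lines])
    return summary


if __name__ == "__main__":
    # Directory name taken from the commit message; adjust to the real location.
    for name, (count, head) in summarize_csv_files("generated_datasets").items():
        print(f"{name}: {count} lines")
        for line in head:
            print(f"  {line}")
```

An empty CSV shows up with a line count of 0, which is the symptom the Jenkins investigation was looking for.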

* Keep testing: add PyTorch installation to the shell script

* Use virtual environment for PyTorch installation

- Jenkins user doesn't have permission to write to /.local
- Create virtual environment in current directory (./pytorch_venv)
- Install PyTorch in virtual environment to avoid permission issues
- Use PYTHON_CMD variable to run all Python scripts with correct interpreter
- Virtual environment will be reused if it already exists
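
The commit does this in `generate_test_dataset.sh`. As a rough Python equivalent of the same create-or-reuse idea, the standard `venv` module can be used. `ensure_venv` is a hypothetical helper name; only the `./pytorch_venv` location and the role of `PYTHON_CMD` come from the message above.

```python
import os
import venv
from pathlib import Path


def ensure_venv(venv_dir="pytorch_venv", with_pip=True):
    """Create venv_dir if it is missing and return the path to its interpreter."""
    venv_dir = Path(venv_dir)
    # Interpreter location differs between Windows and POSIX layouts.
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    python_cmd = venv_dir / bin_dir / exe
    if not python_cmd.exists():
        # Writes only inside venv_dir, so no access to the user's home is needed.
        venv.create(venv_dir, with_pip=with_pip)
    return python_cmd
```

The returned path plays the role of `PYTHON_CMD`: later steps run scripts through that interpreter rather than the system one, and a second call reuses the existing environment.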

* Remove debug code and reduce verbose logging in Jenkins

- Remove bash -x and debug commands from Jenkinsfile execute_args
- Remove all debug system() calls and getcwd from C++ test file
- Remove unistd.h include that was only needed for getcwd
- Remove debug print in CSV parser
- Add set +x to generate_test_dataset.sh to disable command echo
- Redirect Python script stdout to /dev/null for cleaner output

This makes Jenkins logs much cleaner while still showing progress messages.

* Install GPU build of PyTorch

* Clean up and optimize comprehensive dataset test pipeline

- Reorder Jenkinsfile execution: build -> generate data -> run test
- Remove commented-out debug code from generate_test_dataset.sh
- Ensure all files end with proper newline character (POSIX compliance)
- Keep useful status messages while removing development debug prints
- Set MAX_ITERATIONS=0 for unlimited test generation in production

* Add configuration modes to reduce test execution time

- Add --mode option (half/full) to generate_model_configs.py
  - half mode (default): ~278 configs (224 2D + 54 3D) -> ~1,058 total tests
  - full mode: ~807 configs (672 2D + 135 3D) -> ~3,093 total tests
- Update generate_test_dataset.sh to use CONFIG_MODE environment variable
- Keeps all model types but reduces parameter combinations intelligently
- Fixes Jenkins timeout issue (was running 3,669 tests taking 17+ hours)
- Default half mode should complete in ~4-5 hours instead of 17+ hours
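
The 3D figure for half mode can be checked by hand. The sketch below assumes the configurations are a plain Cartesian product of the parameter lists in `generate_3d_configs(mode='half')` with no filtering. `count_configs` is a hypothetical helper; precision and channel count are omitted because each has a single value in this commit.

```python
from itertools import product

# Parameter lists copied from generate_3d_configs(mode='half') in this commit.
MODELS_3D = ['r3d_18', 'mc3_18', 'r2plus1d_18']
HALF_3D = {
    'batch_sizes': [1, 4, 8],
    'temporal_sizes': [8, 16],
    'input_dims': [(112, 112), (224, 224), (224, 320)],
}


def count_configs(models, batch_sizes, temporal_sizes, input_dims):
    """Count combinations, assuming a plain Cartesian product with no filtering."""
    return sum(1 for _ in product(models, batch_sizes, temporal_sizes, input_dims))


print(count_configs(MODELS_3D, **HALF_3D))  # 3 * 3 * 2 * 3 = 54
```

This reproduces the 54 3D configurations quoted for half mode. The 2D count is not reproduced here because the 2D model list is not fully visible in the diff.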

* Add small mode for quick testing of comprehensive dataset

* Jenkins pipeline test done

* Jenkins test done

* Trigger CI build

* Remove test comment and set data generation mode to half

---------

Co-authored-by: Bartłomiej Kocot <barkocot@amd.com>
JH-Leon-KIM-AMD
2025-08-26 23:18:05 +03:00
committed by GitHub
parent 508e7912f9
commit 19d5327c45
5 changed files with 186 additions and 65 deletions

@@ -10,8 +10,12 @@ import csv
import itertools
import argparse
def generate_2d_configs():
"""Generate all 2D model configuration combinations"""
def generate_2d_configs(mode='full'):
"""Generate all 2D model configuration combinations
Args:
mode: 'small' for minimal set (~50 configs), 'half' for reduced set (~250 configs), 'full' for comprehensive set (~500 configs)
"""
# Define parameter ranges
models_2d = [
@@ -24,15 +28,37 @@ def generate_2d_configs():
'shufflenet_v2_x1_0'
]
batch_sizes = [1, 4, 8, 16, 32]
# Input dimensions: (height, width)
input_dims = [
(64, 64), (128, 128), (224, 224), (256, 256), (512, 512), # Square
(224, 320), (224, 448), (320, 224), (448, 224), # Rectangular
(227, 227), # AlexNet preferred
(299, 299) # Inception preferred
]
if mode == 'small':
# Minimal set for quick testing
batch_sizes = [1, 8] # Just two batch sizes
# Very limited input dimensions - only 2 key sizes
input_dims = [
(224, 224), # Standard (most common)
(256, 256), # Medium
]
# Use only first 3 models for minimal testing
models_2d = models_2d[:3] # Only resnet18, resnet34, resnet50
elif mode == 'half':
# Reduced set for faster testing
batch_sizes = [1, 8, 32] # Small, medium, large
# Reduced input dimensions - 5 key sizes
input_dims = [
(64, 64), # Small
(224, 224), # Standard (most common)
(512, 512), # Large
(224, 320), # Rectangular
(227, 227), # AlexNet preferred
]
else: # full mode
# More comprehensive but still limited
batch_sizes = [1, 4, 8, 16, 32]
# More dimensions but skip some redundant ones
input_dims = [
(64, 64), (128, 128), (224, 224), (256, 256), (512, 512), # Square
(224, 320), (320, 224), # Rectangular (reduced from 4)
(227, 227), # AlexNet preferred
(299, 299) # Inception preferred
]
precisions = ['fp32'] #, 'fp16', 'bf16']
channels = [3] # Most models expect RGB
@@ -68,19 +94,44 @@ def generate_2d_configs():
return configs
def generate_3d_configs():
"""Generate all 3D model configuration combinations"""
def generate_3d_configs(mode='full'):
"""Generate all 3D model configuration combinations
Args:
mode: 'small' for minimal set (~10 configs), 'half' for reduced set (~50 configs), 'full' for comprehensive set (~100 configs)
"""
models_3d = ['r3d_18', 'mc3_18', 'r2plus1d_18']
batch_sizes = [1, 2, 4, 8] # 3D models are more memory intensive
temporal_sizes = [8, 16, 32]
# 3D input dimensions: (height, width)
input_dims = [
(112, 112), (224, 224), (256, 256), # Standard sizes
(224, 320), (320, 224) # Rectangular
]
if mode == 'small':
# Minimal set for quick testing
batch_sizes = [1, 4] # Just two batch sizes
temporal_sizes = [8] # Only smallest temporal size
# Very limited spatial dimensions
input_dims = [
(112, 112), # Standard for 3D
]
# Use only first model for minimal testing
models_3d = models_3d[:1] # Only r3d_18
elif mode == 'half':
# Reduced set for faster testing
batch_sizes = [1, 4, 8] # Skip batch_size=2
temporal_sizes = [8, 16] # Skip 32 (most expensive)
# Reduced spatial dimensions
input_dims = [
(112, 112), # Small (common for video)
(224, 224), # Standard
(224, 320) # Rectangular
]
else: # full mode
# More comprehensive but still reasonable
batch_sizes = [1, 2, 4, 8] # 3D models are more memory intensive
temporal_sizes = [8, 16, 32]
# More dimensions
input_dims = [
(112, 112), (224, 224), (256, 256), # Standard sizes
(224, 320), (320, 224) # Rectangular
]
precisions = ['fp32'] #, 'fp16'] # Skip bf16 for 3D to reduce combinations
channels = [3]
@@ -142,19 +193,23 @@ def main():
help='Output file for 2D configurations')
parser.add_argument('--output-3d', type=str, default='model_configs_3d.csv',
help='Output file for 3D configurations')
parser.add_argument('--mode', choices=['small', 'half', 'full'], default='full',
help='Configuration mode: small (~60 total), half (~300 total) or full (~600 total) (default: full)')
parser.add_argument('--limit', type=int,
help='Limit number of configurations per type (for testing)')
args = parser.parse_args()
print(f"Generating {args.mode} model configurations...")
print("Generating 2D model configurations...")
configs_2d = generate_2d_configs()
configs_2d = generate_2d_configs(mode=args.mode)
if args.limit:
configs_2d = configs_2d[:args.limit]
save_configs_to_csv(configs_2d, args.output_2d, "2D")
print("Generating 3D model configurations...")
configs_3d = generate_3d_configs()
configs_3d = generate_3d_configs(mode=args.mode)
if args.limit:
configs_3d = configs_3d[:args.limit]
save_configs_to_csv(configs_3d, args.output_3d, "3D")
@@ -164,4 +219,4 @@ def main():
print(" Update generate_test_dataset.sh to read from these CSV files")
if __name__ == "__main__":
main()
main()
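
One way to keep a default quoted in help text in step with the actual `default=` value is argparse's `%(default)s` placeholder. This is a minimal sketch, not the script's full CLI: `build_parser` is a hypothetical name, and only the `--mode` option with its choices and default mirrors the diff above.

```python
import argparse


def build_parser():
    """Build an illustrative parser containing only the --mode option."""
    parser = argparse.ArgumentParser(description="Illustrative parser for --mode only")
    parser.add_argument(
        '--mode', choices=['small', 'half', 'full'], default='full',
        # %(default)s is expanded by argparse when the help is formatted,
        # so the text always reflects the real default.
        help='Configuration mode (default: %(default)s)')
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Generating {args.mode} model configurations...")
```

With this form, changing `default=` later updates the `--help` output automatically.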