* [feat]: Enhance CPU feature detection and support for AVX512 extensions
- Added cmake/DetectCPU.cmake for automatic CPU feature detection.
- Updated CMakeLists.txt to include auto-detection logic for AVX512 features.
- Modified install.sh to include new AVX512_VBMI option for FP8 MoE.
- Enhanced _cpu_detect.py to support progressive matching of CPU variants.
- Created scripts/check_cpu_features.py for manual CPU feature checks.
- Updated setup.py to reflect changes in CPU variant building and environment variables.
* [fix](kt-kernel): Add conditional inclusion of FP8 MoE for AVX512 BF16 support
* [chore](kt-kernel): update project version to 0.5.0 in CMakeLists.txt and version.py
* [fix](kt-kernel): fix AVX512 CPU instruction set detection
* [feat](kt-kernel): AVX512 fallback kernel for RAW-INT4
* [fix](kt-kernel): fix setup version issue
* [fix](kt-kernel): update install for custom build
* [docs](kt-kernel): add installation guide for various CPU instruction sets
* [fix](kt-kernel): fix _mm512_dpbusd_epi32_compat fallback implementation
* [style](kt-kernel): apply clang-format
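
Note on the _mm512_dpbusd_epi32_compat fix: a fallback for this intrinsic has to reproduce the VPDPBUSD lane semantics, i.e. each 32-bit lane adds four products of an unsigned byte from the first operand and a signed byte from the second operand to the accumulator, with wraparound and no saturation. The sketch below is a scalar reference model in Python for checking a fallback against. It is not the code in this change, and the names dpbusd_lane and dpbusd are hypothetical.

```python
def dpbusd_lane(acc: int, a: int, b: int) -> int:
    """Reference model for one 32-bit lane of VPDPBUSD (hypothetical helper).

    acc is the signed 32-bit accumulator; a and b are 32-bit words holding
    four packed bytes each. Bytes of a are read as unsigned, bytes of b as
    signed. The result wraps to 32 bits (no saturation).
    """
    total = acc
    for i in range(4):
        ua = (a >> (8 * i)) & 0xFF          # unsigned byte from a
        sb = (b >> (8 * i)) & 0xFF          # raw byte from b
        if sb >= 0x80:
            sb -= 0x100                     # reinterpret as signed int8
        total += ua * sb
    total &= 0xFFFFFFFF                     # wrap to 32 bits
    return total - (1 << 32) if total >= 0x80000000 else total


def dpbusd(acc, a, b):
    """Apply the lane model across equal-length lists of lanes (16 for a 512-bit vector)."""
    return [dpbusd_lane(x, y, z) for x, y, z in zip(acc, a, b)]
```

Comparing a fallback's output lane by lane against this model on random inputs, including bytes >= 0x80 in both operands, is one way to catch a signed/unsigned mix-up.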