560.28.03

555.58.02
(cherry picked from commit 1795a8bb20)
2026-01-27 11:39:46 +00:00 · 2024-07-19 15:45:15 -07:00 · 2024-07-19 15:38:11 -07:00 · 2024-07-19 15:38:08 -07:00 · 2024-07-19 15:38:04 -07:00 · 2024-07-19 15:38:00 -07:00
1874 changed files with 447254 additions and 216510 deletions
--- a/.github/ISSUE_TEMPLATE/20_build_bug.yml
+++ b/.github/ISSUE_TEMPLATE/20_build_bug.yml
@@ -32,6 +32,14 @@ body:
    description: "Which kernel are you running? (output of `uname -a`, say if you built it yourself)."
  validations:
    required: true
+- type: checkboxes
+  id: sw_host_kernel_stable
+  attributes:
+    label: "Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels."
+    options:
+    - label: "I am running on a stable kernel release."
+  validations:
+    required: true
 - type: textarea
  id: bug_description
  attributes:
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,178 +0,0 @@
-# Changelog
-
-## Release 545 Entries
-
-### [545.23.06] 2023-10-17
-
-#### Fixed
-
- Fix always-false conditional, [#493](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/493) by @meme8383
-
-#### Added
-
- Added beta-quality support for GeForce and Workstation GPUs. Please see the "Open Linux Kernel Modules" chapter in the NVIDIA GPU driver end user README for details.
-
-## Release 535 Entries
-
-### [535.113.01] 2023-09-21
-
-#### Fixed
-
- Fixed building main against current centos stream 8 fails, [#550](https://github.com/NVIDIA/open-gpu-kernel-modules/issues/550) by @airlied
-
-### [535.104.05] 2023-08-22
-
-### [535.98] 2023-08-08
-
-### [535.86.10] 2023-07-31
-
-### [535.86.05] 2023-07-18
-
-### [535.54.03] 2023-06-14
-
-### [535.43.02] 2023-05-30
-
-#### Fixed
-
- Fixed console restore with traditional VGA consoles.
-
-#### Added
-
- Added support for Run Time D3 (RTD3) on Ampere and later GPUs.
- Added support for G-Sync on desktop GPUs.
-
-## Release 530 Entries
-
-### [530.41.03] 2023-03-23
-
-### [530.30.02] 2023-02-28
-
-#### Changed
-
- GSP firmware is now distributed as `gsp_tu10x.bin` and `gsp_ga10x.bin` to better reflect the GPU architectures supported by each firmware file in this release.
-    - The .run installer will continue to install firmware to /lib/firmware/nvidia/<version> and the nvidia.ko kernel module will load the appropriate firmware for each GPU at runtime.
-  
-#### Fixed
-
- Add support for resizable BAR on Linux when NVreg_EnableResizableBar=1 module param is set. [#3](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/3) by @sjkelly
-
-#### Added
-
- Support for power management features like Suspend, Hibernate and Resume.
-
-## Release 525 Entries
-
-### [525.116.04] 2023-05-09
-
-### [525.116.03] 2023-04-25
-
-### [525.105.17] 2023-03-30
-
-### [525.89.02] 2023-02-08
-
-### [525.85.12] 2023-01-30
-
-### [525.85.05] 2023-01-19
-
-#### Fixed
-
- Fix build problems with Clang 15.0, [#377](https://github.com/NVIDIA/open-gpu-kernel-modules/issues/377) by @ptr1337
-
-### [525.78.01] 2023-01-05
-
-### [525.60.13] 2022-12-05
-
-### [525.60.11] 2022-11-28
-
-#### Fixed
-
- Fixed nvenc compatibility with usermode clients [#104](https://github.com/NVIDIA/open-gpu-kernel-modules/issues/104)
-
-### [525.53] 2022-11-10
-
-#### Changed
-
- GSP firmware is now distributed as multiple firmware files: this release has `gsp_tu10x.bin` and `gsp_ad10x.bin` replacing `gsp.bin` from previous releases.
-    - Each file is named after a GPU architecture and supports GPUs from one or more architectures. This allows GSP firmware to better leverage each architecture's capabilities.
-    - The .run installer will continue to install firmware to `/lib/firmware/nvidia/<version>` and the `nvidia.ko` kernel module will load the appropriate firmware for each GPU at runtime.
-
-#### Fixed
-
- Add support for IBT (indirect branch tracking) on supported platforms, [#256](https://github.com/NVIDIA/open-gpu-kernel-modules/issues/256) by @rnd-ash
- Return EINVAL when [failing to] allocating memory, [#280](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/280) by @YusufKhan-gamedev
- Fix various typos in nvidia/src/kernel, [#16](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/16) by @alexisgeoffrey
- Added support for rotation in X11, Quadro Sync, Stereo, and YUV 4:2:0 on Turing.
-
-## Release 520 Entries
-
-### [520.61.07] 2022-10-20
-
-### [520.56.06] 2022-10-12
-
-#### Added
-
- Introduce support for GeForce RTX 4090 GPUs.
-
-### [520.61.05] 2022-10-10
-
-#### Added
-
- Introduce support for NVIDIA H100 GPUs.
-
-#### Fixed
-
- Fix/Improve Makefile, [#308](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/308/) by @izenynn
- Make nvLogBase2 more efficient, [#177](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/177/) by @DMaroo
- nv-pci: fixed always true expression, [#195](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/195/) by @ValZapod
-
-## Release 515 Entries
-
-### [515.76] 2022-09-20
-
-#### Fixed
-
- Improved compatibility with new Linux kernel releases
- Fixed possible excessive GPU power draw on an idle X11 or Wayland desktop when driving high resolutions or refresh rates
-
-### [515.65.07] 2022-10-19
-
-### [515.65.01] 2022-08-02
-
-#### Fixed
-
- Collection of minor fixes to issues, [#6](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/61) by @Joshua-Ashton
- Remove unnecessary use of acpi_bus_get_device().
-
-### [515.57] 2022-06-28
-
-#### Fixed
-
- Backtick is deprecated, [#273](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/273) by @arch-user-france1
-
-### [515.48.07] 2022-05-31
-
-#### Added
-
- List of compatible GPUs in README.md.
-
-#### Fixed
-
- Fix various README capitalizations, [#8](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/8) by @27lx 
- Automatically tag bug report issues, [#15](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/15) by @thebeanogamer
- Improve conftest.sh Script, [#37](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/37) by @Nitepone
- Update HTTP link to HTTPS, [#101](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/101) by @alcaparra
- moved array sanity check to before the array access, [#117](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/117) by @RealAstolfo
- Fixed some typos, [#122](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/122) by @FEDOyt
- Fixed capitalization, [#123](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/123) by @keroeslux
- Fix typos in NVDEC Engine Descriptor, [#126](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/126) from @TrickyDmitriy
- Extranous apostrohpes in a makefile script [sic], [#14](https://github.com/NVIDIA/open-gpu-kernel-modules/issues/14) by @kiroma
- HDMI no audio @ 4K above 60Hz, [#75](https://github.com/NVIDIA/open-gpu-kernel-modules/issues/75) by @adolfotregosa
- dp_configcaps.cpp:405: array index sanity check in wrong place?, [#110](https://github.com/NVIDIA/open-gpu-kernel-modules/issues/110) by @dcb314
- NVRM kgspInitRm_IMPL: missing NVDEC0 engine, cannot initialize GSP-RM, [#116](https://github.com/NVIDIA/open-gpu-kernel-modules/issues/116) by @kfazz
- ERROR: modpost: "backlight_device_register" [...nvidia-modeset.ko] undefined, [#135](https://github.com/NVIDIA/open-gpu-kernel-modules/issues/135) by @sndirsch
- aarch64 build fails, [#151](https://github.com/NVIDIA/open-gpu-kernel-modules/issues/151) by @frezbo
-
-### [515.43.04] 2022-05-11
-
- Initial release.
-
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
 # NVIDIA Linux Open GPU Kernel Module Source

 This is the source release of the NVIDIA Linux open GPU kernel modules,
-version 545.23.06.
+version 560.28.03.


 ## How to Build
@@ -17,7 +17,7 @@ as root:

 Note that the kernel modules built here must be used with GSP
 firmware and user-space NVIDIA GPU driver components from a corresponding
-545.23.06 driver release.  This can be achieved by installing
+560.28.03 driver release.  This can be achieved by installing
 the NVIDIA GPU driver from the .run file using the `--no-kernel-modules`
 option.  E.g.,

@@ -74,7 +74,7 @@ kernel.

 The NVIDIA open kernel modules support the same range of Linux kernel
 versions that are supported with the proprietary NVIDIA kernel modules.
-This is currently Linux kernel 3.10 or newer.
+This is currently Linux kernel 4.15 or newer.


 ## How to Contribute
@@ -179,16 +179,16 @@ software applications.

 ## Compatible GPUs

-The NVIDIA open kernel modules can be used on any Turing or later GPU
-(see the table below). However, in the __DRIVER_VERION__ release, GeForce and
-Workstation support is considered to be Beta quality. The open kernel modules
-are suitable for broad usage, and NVIDIA requests feedback on any issues
-encountered specific to them.
+The NVIDIA open kernel modules can be used on any Turing or later GPU (see the
+table below).

 For details on feature support and limitations, see the NVIDIA GPU driver
 end user README here:

-https://us.download.nvidia.com/XFree86/Linux-x86_64/545.23.06/README/kernel_open.html
+https://us.download.nvidia.com/XFree86/Linux-x86_64/560.28.03/README/kernel_open.html
+
+For vGPU support, please refer to the README.vgpu packaged in the vGPU Host
+Package for more details.

 In the below table, if three IDs are listed, the first is the PCI Device 
 ID, the second is the PCI Subsystem Vendor ID, and the third is the PCI
@@ -648,9 +648,12 @@ Subsystem Device ID.
 | NVIDIA T1000 8GB                                | 1FF0 17AA 1612 |
 | NVIDIA T400 4GB                                 | 1FF2 1028 1613 |
 | NVIDIA T400 4GB                                 | 1FF2 103C 1613 |
+| NVIDIA T400E                                    | 1FF2 103C 18FF |
 | NVIDIA T400 4GB                                 | 1FF2 103C 8A80 |
 | NVIDIA T400 4GB                                 | 1FF2 10DE 1613 |
+| NVIDIA T400E                                    | 1FF2 10DE 18FF |
 | NVIDIA T400 4GB                                 | 1FF2 17AA 1613 |
+| NVIDIA T400E                                    | 1FF2 17AA 18FF |
 | Quadro T1000                                    | 1FF9           |
 | NVIDIA A100-SXM4-40GB                           | 20B0           |
 | NVIDIA A100-PG509-200                           | 20B0 10DE 1450 |
@@ -658,6 +661,7 @@ Subsystem Device ID.
 | NVIDIA A100-SXM4-80GB                           | 20B2 10DE 147F |
 | NVIDIA A100-SXM4-80GB                           | 20B2 10DE 1622 |
 | NVIDIA A100-SXM4-80GB                           | 20B2 10DE 1623 |
+| NVIDIA PG509-210                                | 20B2 10DE 1625 |
 | NVIDIA A100-SXM-64GB                            | 20B3 10DE 14A7 |
 | NVIDIA A100-SXM-64GB                            | 20B3 10DE 14A8 |
 | NVIDIA A100 80GB PCIe                           | 20B5 10DE 1533 |
@@ -665,6 +669,7 @@ Subsystem Device ID.
 | NVIDIA PG506-232                                | 20B6 10DE 1492 |
 | NVIDIA A30                                      | 20B7 10DE 1532 |
 | NVIDIA A30                                      | 20B7 10DE 1804 |
+| NVIDIA A30                                      | 20B7 10DE 1852 |
 | NVIDIA A800-SXM4-40GB                           | 20BD 10DE 17F4 |
 | NVIDIA A100-PCIE-40GB                           | 20F1 10DE 145F |
 | NVIDIA A800-SXM4-80GB                           | 20F3 10DE 179B |
@@ -681,6 +686,7 @@ Subsystem Device ID.
 | NVIDIA A800 40GB Active                         | 20F6 103C 180A |
 | NVIDIA A800 40GB Active                         | 20F6 10DE 180A |
 | NVIDIA A800 40GB Active                         | 20F6 17AA 180A |
+| NVIDIA AX800                                    | 20FD 10DE 17F8 |
 | NVIDIA GeForce GTX 1660 Ti                      | 2182           |
 | NVIDIA GeForce GTX 1660                         | 2184           |
 | NVIDIA GeForce GTX 1650 SUPER                   | 2187           |
@@ -743,11 +749,18 @@ Subsystem Device ID.
 | NVIDIA H800 PCIe                                | 2322 10DE 17A4 |
 | NVIDIA H800                                     | 2324 10DE 17A6 |
 | NVIDIA H800                                     | 2324 10DE 17A8 |
+| NVIDIA H20                                      | 2329 10DE 198B |
+| NVIDIA H20                                      | 2329 10DE 198C |
 | NVIDIA H100 80GB HBM3                           | 2330 10DE 16C0 |
 | NVIDIA H100 80GB HBM3                           | 2330 10DE 16C1 |
 | NVIDIA H100 PCIe                                | 2331 10DE 1626 |
+| NVIDIA H200                                     | 2335 10DE 18BE |
+| NVIDIA H200                                     | 2335 10DE 18BF |
 | NVIDIA H100                                     | 2339 10DE 17FC |
 | NVIDIA H800 NVL                                 | 233A 10DE 183A |
+| NVIDIA GH200 120GB                              | 2342 10DE 16EB |
+| NVIDIA GH200 120GB                              | 2342 10DE 1805 |
+| NVIDIA GH200 480GB                              | 2342 10DE 1809 |
 | NVIDIA GeForce RTX 3060 Ti                      | 2414           |
 | NVIDIA GeForce RTX 3080 Ti Laptop GPU           | 2420           |
 | NVIDIA RTX A5500 Laptop GPU                     | 2438           |
@@ -800,6 +813,7 @@ Subsystem Device ID.
 | NVIDIA RTX A2000 12GB                           | 2571 10DE 1611 |
 | NVIDIA RTX A2000 12GB                           | 2571 17AA 1611 |
 | NVIDIA GeForce RTX 3050                         | 2582           |
+| NVIDIA GeForce RTX 3050                         | 2584           |
 | NVIDIA GeForce RTX 3050 Ti Laptop GPU           | 25A0           |
 | NVIDIA GeForce RTX 3050Ti Laptop GPU            | 25A0 103C 8928 |
 | NVIDIA GeForce RTX 3050Ti Laptop GPU            | 25A0 103C 89F9 |
@@ -815,6 +829,14 @@ Subsystem Device ID.
 | NVIDIA GeForce RTX 3050 4GB Laptop GPU          | 25AB           |
 | NVIDIA GeForce RTX 3050 6GB Laptop GPU          | 25AC           |
 | NVIDIA GeForce RTX 2050                         | 25AD           |
+| NVIDIA RTX A1000                                | 25B0 1028 1878 |
+| NVIDIA RTX A1000                                | 25B0 103C 1878 |
+| NVIDIA RTX A1000                                | 25B0 10DE 1878 |
+| NVIDIA RTX A1000                                | 25B0 17AA 1878 |
+| NVIDIA RTX A400                                 | 25B2 1028 1879 |
+| NVIDIA RTX A400                                 | 25B2 103C 1879 |
+| NVIDIA RTX A400                                 | 25B2 10DE 1879 |
+| NVIDIA RTX A400                                 | 25B2 17AA 1879 |
 | NVIDIA A16                                      | 25B6 10DE 14A9 |
 | NVIDIA A2                                       | 25B6 10DE 157E |
 | NVIDIA RTX A2000 Laptop GPU                     | 25B8           |
@@ -832,6 +854,8 @@ Subsystem Device ID.
 | NVIDIA RTX A2000 Embedded GPU                   | 25FA           |
 | NVIDIA RTX A500 Embedded GPU                    | 25FB           |
 | NVIDIA GeForce RTX 4090                         | 2684           |
+| NVIDIA GeForce RTX 4090 D                       | 2685           |
+| NVIDIA GeForce RTX 4070 Ti SUPER                | 2689           |
 | NVIDIA RTX 6000 Ada Generation                  | 26B1 1028 16A1 |
 | NVIDIA RTX 6000 Ada Generation                  | 26B1 103C 16A1 |
 | NVIDIA RTX 6000 Ada Generation                  | 26B1 10DE 16A1 |
@@ -840,17 +864,28 @@ Subsystem Device ID.
 | NVIDIA RTX 5000 Ada Generation                  | 26B2 103C 17FA |
 | NVIDIA RTX 5000 Ada Generation                  | 26B2 10DE 17FA |
 | NVIDIA RTX 5000 Ada Generation                  | 26B2 17AA 17FA |
+| NVIDIA RTX 5880 Ada Generation                  | 26B3 1028 1934 |
+| NVIDIA RTX 5880 Ada Generation                  | 26B3 103C 1934 |
+| NVIDIA RTX 5880 Ada Generation                  | 26B3 10DE 1934 |
+| NVIDIA RTX 5880 Ada Generation                  | 26B3 17AA 1934 |
 | NVIDIA L40                                      | 26B5 10DE 169D |
 | NVIDIA L40                                      | 26B5 10DE 17DA |
 | NVIDIA L40S                                     | 26B9 10DE 1851 |
 | NVIDIA L40S                                     | 26B9 10DE 18CF |
+| NVIDIA L20                                      | 26BA 10DE 1957 |
+| NVIDIA L20                                      | 26BA 10DE 1990 |
+| NVIDIA GeForce RTX 4080 SUPER                   | 2702           |
 | NVIDIA GeForce RTX 4080                         | 2704           |
+| NVIDIA GeForce RTX 4070 Ti SUPER                | 2705           |
+| NVIDIA GeForce RTX 4070                         | 2709           |
 | NVIDIA GeForce RTX 4090 Laptop GPU              | 2717           |
 | NVIDIA RTX 5000 Ada Generation Laptop GPU       | 2730           |
 | NVIDIA GeForce RTX 4090 Laptop GPU              | 2757           |
 | NVIDIA RTX 5000 Ada Generation Embedded GPU     | 2770           |
 | NVIDIA GeForce RTX 4070 Ti                      | 2782           |
+| NVIDIA GeForce RTX 4070 SUPER                   | 2783           |
 | NVIDIA GeForce RTX 4070                         | 2786           |
+| NVIDIA GeForce RTX 4060 Ti                      | 2788           |
 | NVIDIA GeForce RTX 4080 Laptop GPU              | 27A0           |
 | NVIDIA RTX 4000 SFF Ada Generation              | 27B0 1028 16FA |
 | NVIDIA RTX 4000 SFF Ada Generation              | 27B0 103C 16FA |
@@ -864,6 +899,7 @@ Subsystem Device ID.
 | NVIDIA RTX 4000 Ada Generation                  | 27B2 103C 181B |
 | NVIDIA RTX 4000 Ada Generation                  | 27B2 10DE 181B |
 | NVIDIA RTX 4000 Ada Generation                  | 27B2 17AA 181B |
+| NVIDIA L2                                       | 27B6 10DE 1933 |
 | NVIDIA L4                                       | 27B8 10DE 16CA |
 | NVIDIA L4                                       | 27B8 10DE 16EE |
 | NVIDIA RTX 4000 Ada Generation Laptop GPU       | 27BA           |
@@ -872,13 +908,25 @@ Subsystem Device ID.
 | NVIDIA RTX 3500 Ada Generation Embedded GPU     | 27FB           |
 | NVIDIA GeForce RTX 4060 Ti                      | 2803           |
 | NVIDIA GeForce RTX 4060 Ti                      | 2805           |
+| NVIDIA GeForce RTX 4060                         | 2808           |
 | NVIDIA GeForce RTX 4070 Laptop GPU              | 2820           |
+| NVIDIA GeForce RTX 3050 A Laptop GPU            | 2822           |
 | NVIDIA RTX 3000 Ada Generation Laptop GPU       | 2838           |
 | NVIDIA GeForce RTX 4070 Laptop GPU              | 2860           |
 | NVIDIA GeForce RTX 4060                         | 2882           |
 | NVIDIA GeForce RTX 4060 Laptop GPU              | 28A0           |
 | NVIDIA GeForce RTX 4050 Laptop GPU              | 28A1           |
+| NVIDIA RTX 2000 Ada Generation                  | 28B0 1028 1870 |
+| NVIDIA RTX 2000 Ada Generation                  | 28B0 103C 1870 |
+| NVIDIA RTX 2000E Ada Generation                 | 28B0 103C 1871 |
+| NVIDIA RTX 2000 Ada Generation                  | 28B0 10DE 1870 |
+| NVIDIA RTX 2000E Ada Generation                 | 28B0 10DE 1871 |
+| NVIDIA RTX 2000 Ada Generation                  | 28B0 17AA 1870 |
+| NVIDIA RTX 2000E Ada Generation                 | 28B0 17AA 1871 |
 | NVIDIA RTX 2000 Ada Generation Laptop GPU       | 28B8           |
+| NVIDIA RTX 1000 Ada Generation Laptop GPU       | 28B9           |
+| NVIDIA RTX 500 Ada Generation Laptop GPU        | 28BA           |
+| NVIDIA RTX 500 Ada Generation Laptop GPU        | 28BB           |
 | NVIDIA GeForce RTX 4060 Laptop GPU              | 28E0           |
 | NVIDIA GeForce RTX 4050 Laptop GPU              | 28E1           |
 | NVIDIA RTX 2000 Ada Generation Embedded GPU     | 28F8           |
--- a/kernel-open/Kbuild
+++ b/kernel-open/Kbuild
@@ -70,9 +70,9 @@ $(foreach _module, $(NV_KERNEL_MODULES), \

 EXTRA_CFLAGS += -I$(src)/common/inc
 EXTRA_CFLAGS += -I$(src)
-EXTRA_CFLAGS += -Wall $(DEFINES) $(INCLUDES) -Wno-cast-qual -Wno-error -Wno-format-extra-args
+EXTRA_CFLAGS += -Wall $(DEFINES) $(INCLUDES) -Wno-cast-qual -Wno-format-extra-args
 EXTRA_CFLAGS += -D__KERNEL__ -DMODULE -DNVRM
-EXTRA_CFLAGS += -DNV_VERSION_STRING=\"545.23.06\"
+EXTRA_CFLAGS += -DNV_VERSION_STRING=\"560.28.03\"

 ifneq ($(SYSSRCHOST1X),)
 EXTRA_CFLAGS += -I$(SYSSRCHOST1X)
@@ -118,7 +118,7 @@ ifeq ($(ARCH),x86_64)
 endif

 ifeq ($(ARCH),powerpc)
- EXTRA_CFLAGS += -mlittle-endian -mno-strict-align -mno-altivec
+ EXTRA_CFLAGS += -mlittle-endian -mno-strict-align
 endif

 EXTRA_CFLAGS += -DNV_UVM_ENABLE
@@ -134,6 +134,16 @@ ifneq ($(wildcard /proc/sgi_uv),)
 EXTRA_CFLAGS += -DNV_CONFIG_X86_UV
 endif

+ifdef VGX_FORCE_VFIO_PCI_CORE
+ EXTRA_CFLAGS += -DNV_VGPU_FORCE_VFIO_PCI_CORE
+endif
+
+WARNINGS_AS_ERRORS ?=
+ifeq ($(WARNINGS_AS_ERRORS),1)
+ ccflags-y += -Werror
+else
+ ccflags-y += -Wno-error
+endif

 #
 # The conftest.sh script tests various aspects of the target kernel.
@@ -160,6 +170,10 @@ NV_CONFTEST_CMD := /bin/sh $(NV_CONFTEST_SCRIPT) \
 NV_CFLAGS_FROM_CONFTEST := $(shell $(NV_CONFTEST_CMD) build_cflags)

 NV_CONFTEST_CFLAGS = $(NV_CFLAGS_FROM_CONFTEST) $(EXTRA_CFLAGS) -fno-pie
+NV_CONFTEST_CFLAGS += $(call cc-disable-warning,pointer-sign)
+NV_CONFTEST_CFLAGS += $(call cc-option,-fshort-wchar,)
+NV_CONFTEST_CFLAGS += $(call cc-option,-Werror=incompatible-pointer-types,)
+NV_CONFTEST_CFLAGS += -Wno-error

 NV_CONFTEST_COMPILE_TEST_HEADERS := $(obj)/conftest/macros.h
 NV_CONFTEST_COMPILE_TEST_HEADERS += $(obj)/conftest/functions.h
@@ -219,106 +233,7 @@ $(obj)/conftest/patches.h: $(NV_CONFTEST_SCRIPT)
 	@mkdir -p $(obj)/conftest
 	@$(NV_CONFTEST_CMD) patch_check > $@

-
-# Each of these headers is checked for presence with a test #include; a
-# corresponding #define will be generated in conftest/headers.h.
-NV_HEADER_PRESENCE_TESTS = \
- asm/system.h \
- drm/drmP.h \
- drm/drm_aperture.h \
- drm/drm_auth.h \
- drm/drm_gem.h \
- drm/drm_crtc.h \
- drm/drm_color_mgmt.h \
- drm/drm_atomic.h \
- drm/drm_atomic_helper.h \
- drm/drm_atomic_state_helper.h \
- drm/drm_encoder.h \
- drm/drm_atomic_uapi.h \
- drm/drm_drv.h \
- drm/drm_fbdev_generic.h \
- drm/drm_framebuffer.h \
- drm/drm_connector.h \
- drm/drm_probe_helper.h \
- drm/drm_blend.h \
- drm/drm_fourcc.h \
- drm/drm_prime.h \
- drm/drm_plane.h \
- drm/drm_vblank.h \
- drm/drm_file.h \
- drm/drm_ioctl.h \
- drm/drm_device.h \
- drm/drm_mode_config.h \
- drm/drm_modeset_lock.h \
- dt-bindings/interconnect/tegra_icc_id.h \
- generated/autoconf.h \
- generated/compile.h \
- generated/utsrelease.h \
- linux/efi.h \
- linux/kconfig.h \
- linux/platform/tegra/mc_utils.h \
- linux/printk.h \
- linux/ratelimit.h \
- linux/prio_tree.h \
- linux/log2.h \
- linux/of.h \
- linux/bug.h \
- linux/sched.h \
- linux/sched/mm.h \
- linux/sched/signal.h \
- linux/sched/task.h \
- linux/sched/task_stack.h \
- xen/ioemu.h \
- linux/fence.h \
- linux/dma-fence.h \
- linux/dma-resv.h \
- soc/tegra/chip-id.h \
- soc/tegra/fuse.h \
- soc/tegra/tegra_bpmp.h \
- video/nv_internal.h \
- linux/platform/tegra/dce/dce-client-ipc.h \
- linux/nvhost.h \
- linux/nvhost_t194.h \
- linux/host1x-next.h \
- asm/book3s/64/hash-64k.h \
- asm/set_memory.h \
- asm/prom.h \
- asm/powernv.h \
- linux/atomic.h \
- asm/barrier.h \
- asm/opal-api.h \
- sound/hdaudio.h \
- asm/pgtable_types.h \
- asm/page.h \
- linux/stringhash.h \
- linux/dma-map-ops.h \
- rdma/peer_mem.h \
- sound/hda_codec.h \
- linux/dma-buf.h \
- linux/time.h \
- linux/platform_device.h \
- linux/mutex.h \
- linux/reset.h \
- linux/of_platform.h \
- linux/of_device.h \
- linux/of_gpio.h \
- linux/gpio.h \
- linux/gpio/consumer.h \
- linux/interconnect.h \
- linux/pm_runtime.h \
- linux/clk.h \
- linux/clk-provider.h \
- linux/ioasid.h \
- linux/stdarg.h \
- linux/iosys-map.h \
- asm/coco.h \
- linux/vfio_pci_core.h \
- linux/mdev.h \
- soc/tegra/bpmp-abi.h \
- soc/tegra/bpmp.h \
- linux/sync_file.h \
- linux/cc_platform.h \
- asm/cpufeature.h
+include $(src)/header-presence-tests.mk

 # Filename to store the define for the header in $(1); this is only consumed by
 # the rule below that concatenates all of these together.
--- a/kernel-open/Makefile
+++ b/kernel-open/Makefile
@@ -28,7 +28,7 @@ else
  else
    KERNEL_UNAME ?= $(shell uname -r)
    KERNEL_MODLIB := /lib/modules/$(KERNEL_UNAME)
-    KERNEL_SOURCES := $(shell test -d $(KERNEL_MODLIB)/source && echo $(KERNEL_MODLIB)/source || echo $(KERNEL_MODLIB)/build)
+    KERNEL_SOURCES := $(shell ((test -d $(KERNEL_MODLIB)/source && echo $(KERNEL_MODLIB)/source) || (test -d $(KERNEL_MODLIB)/build/source && echo $(KERNEL_MODLIB)/build/source)) || echo $(KERNEL_MODLIB)/build)
  endif

  KERNEL_OUTPUT := $(KERNEL_SOURCES)
@@ -42,7 +42,11 @@ else
  else
    KERNEL_UNAME ?= $(shell uname -r)
    KERNEL_MODLIB := /lib/modules/$(KERNEL_UNAME)
-    ifeq ($(KERNEL_SOURCES), $(KERNEL_MODLIB)/source)
+    # $(filter pattern...,text) - Returns all whitespace-separated words in text that
+    # do match any of the pattern words, removing any words that do not match.
+    # Set the KERNEL_OUTPUT only if either $(KERNEL_MODLIB)/source or
+    # $(KERNEL_MODLIB)/build/source path matches the KERNEL_SOURCES.
+    ifneq ($(filter $(KERNEL_SOURCES),$(KERNEL_MODLIB)/source $(KERNEL_MODLIB)/build/source),)
      KERNEL_OUTPUT := $(KERNEL_MODLIB)/build
      KBUILD_PARAMS := KBUILD_OUTPUT=$(KERNEL_OUTPUT)
    endif
@@ -57,12 +61,15 @@ else
      -e 's/armv[0-7]\w\+/arm/' \
      -e 's/aarch64/arm64/' \
      -e 's/ppc64le/powerpc/' \
+      -e 's/riscv64/riscv/' \
    )
  endif

  NV_KERNEL_MODULES ?= $(wildcard nvidia nvidia-uvm nvidia-vgpu-vfio nvidia-modeset nvidia-drm nvidia-peermem)
  NV_KERNEL_MODULES := $(filter-out $(NV_EXCLUDE_KERNEL_MODULES), \
                                    $(NV_KERNEL_MODULES))
+  INSTALL_MOD_DIR ?= kernel/drivers/video
+
  NV_VERBOSE ?=
  SPECTRE_V2_RETPOLINE ?= 0

@@ -74,7 +81,7 @@ else
  KBUILD_PARAMS += NV_KERNEL_SOURCES=$(KERNEL_SOURCES)
  KBUILD_PARAMS += NV_KERNEL_OUTPUT=$(KERNEL_OUTPUT)
  KBUILD_PARAMS += NV_KERNEL_MODULES="$(NV_KERNEL_MODULES)"
-  KBUILD_PARAMS += INSTALL_MOD_DIR=kernel/drivers/video
+  KBUILD_PARAMS += INSTALL_MOD_DIR="$(INSTALL_MOD_DIR)"
  KBUILD_PARAMS += NV_SPECTRE_V2=$(SPECTRE_V2_RETPOLINE)

  .PHONY: modules module clean clean_conftest modules_install
--- a/kernel-open/common/inc/nv-firmware.h
+++ b/kernel-open/common/inc/nv-firmware.h
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 2022-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2022-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -44,6 +44,7 @@ typedef enum
    NV_FIRMWARE_CHIP_FAMILY_GA10X = 4,
    NV_FIRMWARE_CHIP_FAMILY_AD10X = 5,
    NV_FIRMWARE_CHIP_FAMILY_GH100 = 6,
+    NV_FIRMWARE_CHIP_FAMILY_GB10X = 8,
    NV_FIRMWARE_CHIP_FAMILY_END,
 } nv_firmware_chip_family_t;

@@ -52,6 +53,7 @@ static inline const char *nv_firmware_chip_family_to_string(
 )
 {
    switch (fw_chip_family) {
+        case NV_FIRMWARE_CHIP_FAMILY_GB10X: return "gb10x";
        case NV_FIRMWARE_CHIP_FAMILY_GH100: return "gh100";
        case NV_FIRMWARE_CHIP_FAMILY_AD10X: return "ad10x";
        case NV_FIRMWARE_CHIP_FAMILY_GA10X: return "ga10x";
@@ -66,13 +68,13 @@ static inline const char *nv_firmware_chip_family_to_string(
    return NULL;
 }

-// The includer (presumably nv.c) may optionally define
-// NV_FIRMWARE_PATH_FOR_FILENAME(filename)
-// to return a string "path" given a gsp_*.bin or gsp_log_*.bin filename.
+// The includer may optionally define
+// NV_FIRMWARE_FOR_NAME(name)
+// to return a platform-defined string for a given gsp_* or gsp_log_* name.
 //
-// The function nv_firmware_path will then be available.
-#if defined(NV_FIRMWARE_PATH_FOR_FILENAME)
-static inline const char *nv_firmware_path(
+// The function nv_firmware_for_chip_family will then be available.
+#if defined(NV_FIRMWARE_FOR_NAME)
+static inline const char *nv_firmware_for_chip_family(
    nv_firmware_type_t fw_type,
    nv_firmware_chip_family_t fw_chip_family
 )
@@ -81,15 +83,16 @@ static inline const char *nv_firmware_path(
    {
        switch (fw_chip_family)
        {
+            case NV_FIRMWARE_CHIP_FAMILY_GB10X:  // fall through
            case NV_FIRMWARE_CHIP_FAMILY_GH100:  // fall through
            case NV_FIRMWARE_CHIP_FAMILY_AD10X:  // fall through
            case NV_FIRMWARE_CHIP_FAMILY_GA10X:
-                return NV_FIRMWARE_PATH_FOR_FILENAME("gsp_ga10x.bin");
+                return NV_FIRMWARE_FOR_NAME("gsp_ga10x");

            case NV_FIRMWARE_CHIP_FAMILY_GA100:  // fall through
            case NV_FIRMWARE_CHIP_FAMILY_TU11X:  // fall through
            case NV_FIRMWARE_CHIP_FAMILY_TU10X:
-                return NV_FIRMWARE_PATH_FOR_FILENAME("gsp_tu10x.bin");
+                return NV_FIRMWARE_FOR_NAME("gsp_tu10x");

            case NV_FIRMWARE_CHIP_FAMILY_END:  // fall through
            case NV_FIRMWARE_CHIP_FAMILY_NULL:
@@ -100,15 +103,16 @@ static inline const char *nv_firmware_path(
    {
        switch (fw_chip_family)
        {
+            case NV_FIRMWARE_CHIP_FAMILY_GB10X:  // fall through
            case NV_FIRMWARE_CHIP_FAMILY_GH100:  // fall through
            case NV_FIRMWARE_CHIP_FAMILY_AD10X:  // fall through
            case NV_FIRMWARE_CHIP_FAMILY_GA10X:
-                return NV_FIRMWARE_PATH_FOR_FILENAME("gsp_log_ga10x.bin");
+                return NV_FIRMWARE_FOR_NAME("gsp_log_ga10x");

            case NV_FIRMWARE_CHIP_FAMILY_GA100:  // fall through
            case NV_FIRMWARE_CHIP_FAMILY_TU11X:  // fall through
            case NV_FIRMWARE_CHIP_FAMILY_TU10X:
-                return NV_FIRMWARE_PATH_FOR_FILENAME("gsp_log_tu10x.bin");
+                return NV_FIRMWARE_FOR_NAME("gsp_log_tu10x");

            case NV_FIRMWARE_CHIP_FAMILY_END:  // fall through
            case NV_FIRMWARE_CHIP_FAMILY_NULL:
@@ -118,15 +122,15 @@ static inline const char *nv_firmware_path(

    return "";
 }
-#endif  // defined(NV_FIRMWARE_PATH_FOR_FILENAME)
+#endif  // defined(NV_FIRMWARE_FOR_NAME)

-// The includer (presumably nv.c) may optionally define
-// NV_FIRMWARE_DECLARE_GSP_FILENAME(filename)
+// The includer may optionally define
+// NV_FIRMWARE_DECLARE_GSP(name)
 // which will then be invoked (at the top-level) for each
-// gsp_*.bin (but not gsp_log_*.bin)
-#if defined(NV_FIRMWARE_DECLARE_GSP_FILENAME)
-NV_FIRMWARE_DECLARE_GSP_FILENAME("gsp_ga10x.bin")
-NV_FIRMWARE_DECLARE_GSP_FILENAME("gsp_tu10x.bin")
-#endif  // defined(NV_FIRMWARE_DECLARE_GSP_FILENAME)
+// gsp_* (but not gsp_log_*)
+#if defined(NV_FIRMWARE_DECLARE_GSP)
+NV_FIRMWARE_DECLARE_GSP("gsp_ga10x")
+NV_FIRMWARE_DECLARE_GSP("gsp_tu10x")
+#endif  // defined(NV_FIRMWARE_DECLARE_GSP)

-#endif  // NV_FIRMWARE_DECLARE_GSP_FILENAME
+#endif  // NV_FIRMWARE_DECLARE_GSP
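
The nv-firmware.h change above swaps file names ("gsp_ga10x.bin") for bare names ("gsp_ga10x") and leaves it to the includer to turn a name into a platform-defined string. Below is a minimal sketch of how an includer might do that. It is not code from this commit: the `sketch_*` types and functions are hypothetical, the enum is a reduced stand-in for `nv_firmware_chip_family_t`, and the `"nvidia/560.28.03/" name ".bin"` mapping is only an assumption about what a Linux includer could choose.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed includer-side definition: map a bare firmware name to a
 * path-like string via string-literal concatenation. The directory
 * layout here is hypothetical. */
#define NV_FIRMWARE_FOR_NAME(name) "nvidia/560.28.03/" name ".bin"

/* Reduced, hypothetical stand-ins for the header's enums. */
typedef enum { SKETCH_FW_TYPE_GSP, SKETCH_FW_TYPE_GSP_LOG } sketch_fw_type_t;

typedef enum {
    SKETCH_FAMILY_NULL = 0,
    SKETCH_FAMILY_TU10X,
    SKETCH_FAMILY_GA10X,
    SKETCH_FAMILY_GB10X,
    SKETCH_FAMILY_END
} sketch_family_t;

/* Mirrors the shape of nv_firmware_for_chip_family(): newer families
 * share the ga10x image, older ones share the tu10x image, and
 * anything else yields an empty string. */
const char *sketch_firmware_for_chip_family(sketch_fw_type_t type,
                                            sketch_family_t family)
{
    switch (family) {
        case SKETCH_FAMILY_GB10X:  /* fall through */
        case SKETCH_FAMILY_GA10X:
            return (type == SKETCH_FW_TYPE_GSP)
                       ? NV_FIRMWARE_FOR_NAME("gsp_ga10x")
                       : NV_FIRMWARE_FOR_NAME("gsp_log_ga10x");

        case SKETCH_FAMILY_TU10X:
            return (type == SKETCH_FW_TYPE_GSP)
                       ? NV_FIRMWARE_FOR_NAME("gsp_tu10x")
                       : NV_FIRMWARE_FOR_NAME("gsp_log_tu10x");

        case SKETCH_FAMILY_NULL:   /* fall through */
        case SKETCH_FAMILY_END:
            break;
    }
    return "";
}

/* Usage: print the string each known family resolves to. */
void sketch_print_firmware(void)
{
    for (int f = SKETCH_FAMILY_TU10X; f < SKETCH_FAMILY_END; f++) {
        const char *path =
            sketch_firmware_for_chip_family(SKETCH_FW_TYPE_GSP,
                                            (sketch_family_t)f);
        assert(strlen(path) > 0);  /* every known family maps to a name */
        printf("family %d -> %s\n", f, path);
    }
}
```

Because the header now passes only the bare name, a platform that does not load firmware from files could define `NV_FIRMWARE_FOR_NAME` differently without touching the switch statements.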
--- a/kernel-open/common/inc/nv-hypervisor.h
+++ b/kernel-open/common/inc/nv-hypervisor.h
@@ -37,13 +37,11 @@ typedef enum _HYPERVISOR_TYPE
    OS_HYPERVISOR_UNKNOWN
 } HYPERVISOR_TYPE;

-#define CMD_VGPU_VFIO_WAKE_WAIT_QUEUE         0
-#define CMD_VGPU_VFIO_INJECT_INTERRUPT        1
-#define CMD_VGPU_VFIO_REGISTER_MDEV           2
-#define CMD_VGPU_VFIO_PRESENT                 3
-#define CMD_VFIO_PCI_CORE_PRESENT             4
+#define CMD_VFIO_WAKE_REMOVE_GPU              1
+#define CMD_VGPU_VFIO_PRESENT                 2
+#define CMD_VFIO_PCI_CORE_PRESENT             3

-#define MAX_VF_COUNT_PER_GPU 64
+#define MAX_VF_COUNT_PER_GPU                  64

 typedef enum _VGPU_TYPE_INFO
 {
@@ -54,17 +52,11 @@ typedef enum _VGPU_TYPE_INFO

 typedef struct
 {
-    void  *vgpuVfioRef;
-    void  *waitQueue;
    void  *nv;
-    NvU32 *vgpuTypeIds;
-    NvU8 **vgpuNames;
-    NvU32  numVgpuTypes;
-    NvU32  domain;
-    NvU8   bus;
-    NvU8   slot;
-    NvU8   function;
-    NvBool is_virtfn;
+    NvU32 domain;
+    NvU32 bus;
+    NvU32 device;
+    NvU32 return_status;
 } vgpu_vfio_info;

 typedef struct
--- a/kernel-open/common/inc/nv-ioctl-numbers.h
+++ b/kernel-open/common/inc/nv-ioctl-numbers.h
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 2020-2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2020-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -39,5 +39,6 @@
 #define NV_ESC_QUERY_DEVICE_INTR     (NV_IOCTL_BASE + 13)
 #define NV_ESC_SYS_PARAMS            (NV_IOCTL_BASE + 14)
 #define NV_ESC_EXPORT_TO_DMABUF_FD   (NV_IOCTL_BASE + 17)
+#define NV_ESC_WAIT_OPEN_COMPLETE    (NV_IOCTL_BASE + 18)

 #endif
--- a/kernel-open/common/inc/nv-ioctl.h
+++ b/kernel-open/common/inc/nv-ioctl.h
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 2020-2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2020-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -142,4 +142,10 @@ typedef struct nv_ioctl_export_to_dma_buf_fd
    NvU32       status;
 } nv_ioctl_export_to_dma_buf_fd_t;

+typedef struct nv_ioctl_wait_open_complete
+{
+    int rc;
+    NvU32 adapterStatus;
+} nv_ioctl_wait_open_complete_t;
+
 #endif
--- a/kernel-open/common/inc/nv-linux.h
+++ b/kernel-open/common/inc/nv-linux.h
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 2001-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2001-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -35,6 +35,7 @@
 #include "os-interface.h"
 #include "nv-timer.h"
 #include "nv-time.h"
+#include "nv-chardev-numbers.h"

 #define NV_KERNEL_NAME "Linux"

@@ -57,14 +58,10 @@
 #include <linux/version.h>
 #include <linux/utsname.h>

-#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 32)
-#error "This driver does not support kernels older than 2.6.32!"
-#elif LINUX_VERSION_CODE < KERNEL_VERSION(2, 7, 0)
-#  define KERNEL_2_6
-#elif LINUX_VERSION_CODE >= KERNEL_VERSION(3, 0, 0)
-#  define KERNEL_3
-#else
-#error "This driver does not support development kernels!"
+#if LINUX_VERSION_CODE == KERNEL_VERSION(4, 4, 0)
+// Version 4.4 is allowed, temporarily, although not officially supported.
+#elif LINUX_VERSION_CODE < KERNEL_VERSION(4, 15, 0)
+#error "This driver does not support kernels older than Linux 4.15!"
 #endif

 #if defined (CONFIG_SMP) && !defined (__SMP__)
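
As a rough illustration of the version gate in the hunk above, the following standalone sketch reproduces the comparison in user space. It assumes the classic `(a << 16) + (b << 8) + c` encoding for `KERNEL_VERSION`. All `MOCK_`/`mock_` names are hypothetical and not part of the driver.

```c
#include <assert.h>

/* Assumption: classic kernel encoding of a version triple. */
#define MOCK_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Hypothetical helper mirroring the #if/#elif chain: returns 1 if the
 * build would proceed, 0 if it would hit the #error.
 */
static int mock_version_gate(unsigned int code)
{
    if (code == MOCK_KERNEL_VERSION(4, 4, 0))
        return 1;   /* 4.4 is tolerated, though not officially supported */
    if (code < MOCK_KERNEL_VERSION(4, 15, 0))
        return 0;   /* older than 4.15: rejected */
    return 1;       /* 4.15 and newer: accepted */
}

/* A few illustrative cases for the gate. */
static void mock_version_gate_examples(void)
{
    assert(mock_version_gate(MOCK_KERNEL_VERSION(4, 4, 0)) == 1);
    assert(mock_version_gate(MOCK_KERNEL_VERSION(4, 14, 0)) == 0);
    assert(mock_version_gate(MOCK_KERNEL_VERSION(4, 15, 0)) == 1);
}
```

Under this assumed encoding the equality test matches only a sublevel of 0, so in the sketch a code such as 4.4.1 falls through to the 4.15 comparison.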
@@ -406,37 +403,6 @@ extern int nv_pat_mode;
 #define NV_GFP_DMA32 (NV_GFP_KERNEL)
 #endif

-extern NvBool nvos_is_chipset_io_coherent(void);
-
-#if defined(NVCPU_X86_64)
-#define CACHE_FLUSH()  asm volatile("wbinvd":::"memory")
-#define WRITE_COMBINE_FLUSH() asm volatile("sfence":::"memory")
-#elif defined(NVCPU_AARCH64)
-    static inline void nv_flush_cache_cpu(void *info)
-    {
-        if (!nvos_is_chipset_io_coherent())
-        {
-#if defined(NV_FLUSH_CACHE_ALL_PRESENT)
-            flush_cache_all();
-#else
-            WARN_ONCE(0, "NVRM: kernel does not support flush_cache_all()\n");
-#endif
-        }
-    }
-#define CACHE_FLUSH()            nv_flush_cache_cpu(NULL)
-#define CACHE_FLUSH_ALL()        on_each_cpu(nv_flush_cache_cpu, NULL, 1)
-#define WRITE_COMBINE_FLUSH()    mb()
-#elif defined(NVCPU_PPC64LE)
-#define CACHE_FLUSH()            asm volatile("sync;  \n" \
-                                              "isync; \n" ::: "memory")
-#define WRITE_COMBINE_FLUSH()    CACHE_FLUSH()
-#elif defined(NVCPU_RISCV64)
-#define CACHE_FLUSH()            mb()
-#define WRITE_COMBINE_FLUSH()    CACHE_FLUSH()
-#else
-#error "CACHE_FLUSH() and WRITE_COMBINE_FLUSH() need to be defined for this architecture."
-#endif
-
 typedef enum
 {
    NV_MEMORY_TYPE_SYSTEM,      /* Memory mapped for ROM, SBIOS and physical RAM. */
@@ -504,7 +470,9 @@ static inline void *nv_vmalloc(unsigned long size)
    void *ptr = __vmalloc(size, GFP_KERNEL);
 #endif
    if (ptr)
+    {
        NV_MEMDBG_ADD(ptr, size);
+    }
    return ptr;
 }

@@ -522,7 +490,9 @@ static inline void *nv_ioremap(NvU64 phys, NvU64 size)
    void *ptr = ioremap(phys, size);
 #endif
    if (ptr)
+    {
        NV_MEMDBG_ADD(ptr, size);
+    }
    return ptr;
 }

@@ -558,8 +528,9 @@ static inline void *nv_ioremap_cache(NvU64 phys, NvU64 size)
 #endif

    if (ptr)
+    {
        NV_MEMDBG_ADD(ptr, size);
-
+    }
    return ptr;
 }

@@ -575,8 +546,9 @@ static inline void *nv_ioremap_wc(NvU64 phys, NvU64 size)
 #endif

    if (ptr)
+    {
        NV_MEMDBG_ADD(ptr, size);
-
+    }
    return ptr;
 }

@@ -705,7 +677,9 @@ static inline NvUPtr nv_vmap(struct page **pages, NvU32 page_count,
    /* All memory cached in PPC64LE; can't honor 'cached' input. */
    ptr = vmap(pages, page_count, VM_MAP, prot);
    if (ptr)
+    {
        NV_MEMDBG_ADD(ptr, page_count * PAGE_SIZE);
+    }
    return (NvUPtr)ptr;
 }

@@ -866,16 +840,16 @@ static inline dma_addr_t nv_phys_to_dma(struct device *dev, NvU64 pa)
 #define NV_PRINT_AT(nv_debug_level,at)                                           \
    {                                                                            \
        nv_printf(nv_debug_level,                                                \
-            "NVRM: VM: %s:%d: 0x%p, %d page(s), count = %d, flags = 0x%08x, "    \
+            "NVRM: VM: %s:%d: 0x%p, %d page(s), count = %d, "                    \
            "page_table = 0x%p\n",  __FUNCTION__, __LINE__, at,                  \
            at->num_pages, NV_ATOMIC_READ(at->usage_count),                      \
-            at->flags, at->page_table);                                          \
+            at->page_table);                                                     \
    }

 #define NV_PRINT_VMA(nv_debug_level,vma)                                                 \
    {                                                                                    \
        nv_printf(nv_debug_level,                                                        \
-            "NVRM: VM: %s:%d: 0x%lx - 0x%lx, 0x%08x bytes @ 0x%016llx, 0x%p, 0x%p\n",    \
+            "NVRM: VM: %s:%d: 0x%lx - 0x%lx, 0x%08lx bytes @ 0x%016llx, 0x%p, 0x%p\n",    \
            __FUNCTION__, __LINE__, vma->vm_start, vma->vm_end, NV_VMA_SIZE(vma),        \
            NV_VMA_OFFSET(vma), NV_VMA_PRIVATE(vma), NV_VMA_FILE(vma));                  \
    }
@@ -1108,6 +1082,8 @@ static inline void nv_kmem_ctor_dummy(void *arg)
        kmem_cache_destroy(kmem_cache);     \
    }

+#define NV_KMEM_CACHE_ALLOC_ATOMIC(kmem_cache)     \
+    kmem_cache_alloc(kmem_cache, GFP_ATOMIC)
 #define NV_KMEM_CACHE_ALLOC(kmem_cache)     \
    kmem_cache_alloc(kmem_cache, GFP_KERNEL)
 #define NV_KMEM_CACHE_FREE(ptr, kmem_cache) \
@@ -1134,6 +1110,23 @@ static inline void *nv_kmem_cache_zalloc(struct kmem_cache *k, gfp_t flags)
 #endif
 }

+static inline int nv_kmem_cache_alloc_stack_atomic(nvidia_stack_t **stack)
+{
+    nvidia_stack_t *sp = NULL;
+#if defined(NVCPU_X86_64)
+    if (rm_is_altstack_in_use())
+    {
+        sp = NV_KMEM_CACHE_ALLOC_ATOMIC(nvidia_stack_t_cache);
+        if (sp == NULL)
+            return -ENOMEM;
+        sp->size = sizeof(sp->stack);
+        sp->top = sp->stack + sp->size;
+    }
+#endif
+    *stack = sp;
+    return 0;
+}
+
 static inline int nv_kmem_cache_alloc_stack(nvidia_stack_t **stack)
 {
    nvidia_stack_t *sp = NULL;
@@ -1189,6 +1182,16 @@ typedef struct nvidia_pte_s {
    unsigned int    page_count;
 } nvidia_pte_t;

+#if defined(CONFIG_DMA_SHARED_BUFFER)
+/* Standard dma_buf-related information. */
+struct nv_dma_buf
+{
+    struct dma_buf *dma_buf;
+    struct dma_buf_attachment *dma_attach;
+    struct sg_table *sgt;
+};
+#endif // CONFIG_DMA_SHARED_BUFFER
+
 typedef struct nv_alloc_s {
    struct nv_alloc_s *next;
    struct device     *dev;
@@ -1380,7 +1383,19 @@ typedef struct nv_dma_map_s {
         i < dm->mapping.discontig.submap_count;                              \
         i++, sm = &dm->mapping.discontig.submaps[i])

+/*
+ * On 4K ARM kernels, keep the max submap size a multiple of 64K to keep nv-p2p happy.
+ * Despite 4K OS pages, we still use 64K P2P pages because dependent modules still use 64K.
+ * Instead of (4G-4K), use (4G-64K) as the max submap size, since the mapped IOVA range
+ * must be aligned to a 64K boundary.
+ */
+#if defined(CONFIG_ARM64_4K_PAGES)
+#define NV_DMA_U32_MAX_4K_PAGES           ((NvU32)((NV_U32_MAX >> PAGE_SHIFT) + 1))
+#define NV_DMA_SUBMAP_MAX_PAGES           ((NvU32)(NV_DMA_U32_MAX_4K_PAGES - 16))
+#else
 #define NV_DMA_SUBMAP_MAX_PAGES           ((NvU32)(NV_U32_MAX >> PAGE_SHIFT))
+#endif
+
 #define NV_DMA_SUBMAP_IDX_TO_PAGE_IDX(s)  (s * NV_DMA_SUBMAP_MAX_PAGES)

 /*
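
The arithmetic behind the `CONFIG_ARM64_4K_PAGES` branch above can be checked with a small standalone sketch. It assumes a page shift of 12 (4K pages) and a 64K P2P page size, as the comment describes. The `MOCK_`/`mock_` names are hypothetical stand-ins for the driver macros.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_PAGE_SHIFT  12u              /* assumption: 4K OS pages */
#define MOCK_P2P_PAGE    (64u * 1024u)    /* 64K P2P page size */

/* Mirrors NV_DMA_U32_MAX_4K_PAGES: the number of 4K pages in 4 GiB. */
#define MOCK_U32_MAX_4K_PAGES \
    ((uint32_t)((UINT32_MAX >> MOCK_PAGE_SHIFT) + 1))

/* Mirrors the new NV_DMA_SUBMAP_MAX_PAGES: 16 pages (64K) fewer. */
#define MOCK_SUBMAP_MAX_PAGES \
    ((uint32_t)(MOCK_U32_MAX_4K_PAGES - 16))

/* Max submap size in bytes with the new definition: 4G - 64K. */
static uint64_t mock_submap_max_bytes(void)
{
    return (uint64_t)MOCK_SUBMAP_MAX_PAGES << MOCK_PAGE_SHIFT;
}

/* Max submap size in bytes with the generic definition: 4G - 4K. */
static uint64_t mock_generic_submap_max_bytes(void)
{
    return (uint64_t)(UINT32_MAX >> MOCK_PAGE_SHIFT) << MOCK_PAGE_SHIFT;
}

/* The new limit is 64K-aligned; the generic one is not. */
static void mock_submap_check(void)
{
    assert(mock_submap_max_bytes() % MOCK_P2P_PAGE == 0);
    assert(mock_generic_submap_max_bytes() % MOCK_P2P_PAGE != 0);
}
```

In this sketch the new limit works out to 0xFFFF0000 bytes, while the generic limit of 0xFFFFF000 bytes leaves a remainder when divided by 64K.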
@@ -1460,6 +1475,11 @@ typedef struct coherent_link_info_s {
     * baremetal OS environment it is System Physical Address(SPA) and in the case
     * of virutalized OS environment it is Intermediate Physical Address(IPA) */
    NvU64 gpu_mem_pa;
+
+    /* Physical address of the reserved portion of the GPU memory, applicable
+     * only in Grace Hopper self hosted passthrough virtualization platform. */
+    NvU64 rsvd_mem_pa;
+
    /* Bitmap of NUMA node ids, corresponding to the reserved PXMs,
     * available for adding GPU memory to the kernel as system RAM */
    DECLARE_BITMAP(free_node_bitmap, MAX_NUMNODES);
@@ -1607,6 +1627,30 @@ typedef struct nv_linux_state_s {

    struct nv_dma_device dma_dev;
    struct nv_dma_device niso_dma_dev;
+
+    /*
+     * Background kthread for handling deferred open operations
+     * (e.g. from O_NONBLOCK).
+     *
+     * Adding to open_q and reading/writing is_accepting_opens
+     * are protected by nvl->open_q_lock (not nvl->ldata_lock).
+     * This allows new deferred open operations to be enqueued without
+     * blocking behind previous ones (which hold nvl->ldata_lock).
+     *
+     * Adding to open_q is only safe if is_accepting_opens is true.
+     * This prevents open operations from racing with device removal.
+     *
+     * Stopping open_q is only safe after setting is_accepting_opens to false.
+     * This ensures that the open_q (and the larger nvl structure) will
+     * outlive any of the open operations enqueued.
+     */
+    nv_kthread_q_t open_q;
+    NvBool is_accepting_opens;
+    struct semaphore open_q_lock;
+#if defined(NV_VGPU_KVM_BUILD)
+    wait_queue_head_t wait;
+    NvS32 return_status;
+#endif
 } nv_linux_state_t;

 extern nv_linux_state_t *nv_linux_devices;
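
The ordering rules in the `open_q` comment above can be modeled with a simplified, single-threaded sketch. Real locking is deliberately left out: comments mark where `open_q_lock` would be held. The `mock_` types and functions are hypothetical and only illustrate the flag-before-stop ordering, not the driver's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, single-threaded model of the deferred-open queue state. */
struct mock_open_q {
    bool is_accepting_opens;
    bool stopped;
    int  pending;   /* number of enqueued deferred opens */
};

/*
 * Returns 0 on success, -1 if the device no longer accepts opens.
 * In the driver, this check-and-enqueue would run under open_q_lock.
 */
static int mock_enqueue_open(struct mock_open_q *q)
{
    if (!q->is_accepting_opens)
        return -1;
    /* Invariant: a queue that still accepts opens has not been stopped. */
    assert(!q->stopped);
    q->pending++;
    return 0;
}

/* Removal path: clear the flag first, then drain and stop the queue. */
static void mock_stop_open_q(struct mock_open_q *q)
{
    q->is_accepting_opens = false;  /* would be done under open_q_lock */
    q->pending = 0;                 /* stands in for flushing queued items */
    q->stopped = true;
}
```

Because the flag is cleared before the queue is stopped, no enqueue can succeed once stopping has begun. This is the property the comment relies on to let the queue outlive its enqueued operations.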
@@ -1656,7 +1700,7 @@ typedef struct

    nvidia_stack_t *sp;
    nv_alloc_t *free_list;
-    void *nvptr;
+    nv_linux_state_t *nvptr;
    nvidia_event_t *event_data_head, *event_data_tail;
    NvBool dataless_event_pending;
    nv_spinlock_t fp_lock;
@@ -1667,6 +1711,12 @@ typedef struct
    nv_alloc_mapping_context_t mmap_context;
    struct address_space mapping;

+    nv_kthread_q_item_t open_q_item;
+    struct completion open_complete;
+    nv_linux_state_t *deferred_open_nvl;
+    int open_rc;
+    NV_STATUS adapter_status;
+
    struct list_head entry;
 } nv_linux_file_private_t;

@@ -1675,6 +1725,21 @@ static inline nv_linux_file_private_t *nv_get_nvlfp_from_nvfp(nv_file_private_t
    return container_of(nvfp, nv_linux_file_private_t, nvfp);
 }

+static inline int nv_wait_open_complete_interruptible(nv_linux_file_private_t *nvlfp)
+{
+    return wait_for_completion_interruptible(&nvlfp->open_complete);
+}
+
+static inline void nv_wait_open_complete(nv_linux_file_private_t *nvlfp)
+{
+    wait_for_completion(&nvlfp->open_complete);
+}
+
+static inline NvBool nv_is_open_complete(nv_linux_file_private_t *nvlfp)
+{
+    return completion_done(&nvlfp->open_complete);
+}
+
 #define NV_SET_FILE_PRIVATE(filep,data) ((filep)->private_data = (data))
 #define NV_GET_LINUX_FILE_PRIVATE(filep) ((nv_linux_file_private_t *)(filep)->private_data)

@@ -1756,12 +1821,18 @@ static inline NV_STATUS nv_check_gpu_state(nv_state_t *nv)
 extern NvU32 NVreg_EnableUserNUMAManagement;
 extern NvU32 NVreg_RegisterPCIDriver;
 extern NvU32 NVreg_EnableResizableBar;
+extern NvU32 NVreg_EnableNonblockingOpen;

 extern NvU32 num_probed_nv_devices;
 extern NvU32 num_nv_devices;

 #define NV_FILE_INODE(file) (file)->f_inode

+static inline int nv_is_control_device(struct inode *inode)
+{
+    return (minor((inode)->i_rdev) == NV_MINOR_DEVICE_NUMBER_CONTROL_DEVICE);
+}
+
 #if defined(NV_DOM0_KERNEL_PRESENT) || defined(NV_VGPU_KVM_BUILD)
 #define NV_VGX_HYPER
 #if defined(NV_XEN_IOEMU_INJECT_MSI)
@@ -1955,31 +2026,6 @@ static inline NvBool nv_platform_use_auto_online(nv_linux_state_t *nvl)
    return nvl->numa_info.use_auto_online;
 }

-typedef struct {
-    NvU64 base;
-    NvU64 size;
-    NvU32 nodeId;
-    int ret;
-} remove_numa_memory_info_t;
-
-static void offline_numa_memory_callback
-(
-    void *args
-)
-{
-#ifdef NV_OFFLINE_AND_REMOVE_MEMORY_PRESENT
-    remove_numa_memory_info_t *pNumaInfo = (remove_numa_memory_info_t *)args;
-#ifdef NV_REMOVE_MEMORY_HAS_NID_ARG
-    pNumaInfo->ret = offline_and_remove_memory(pNumaInfo->nodeId,
-                                               pNumaInfo->base,
-                                               pNumaInfo->size);
-#else
-    pNumaInfo->ret = offline_and_remove_memory(pNumaInfo->base,
-                                               pNumaInfo->size);
-#endif
-#endif
-}
-
 typedef enum
 {
    NV_NUMA_STATUS_DISABLED             = 0,
@@ -2040,4 +2086,7 @@ typedef enum
 #include <linux/clk-provider.h>
 #endif

+#define NV_EXPORT_SYMBOL(symbol)        EXPORT_SYMBOL_GPL(symbol)
+#define NV_CHECK_EXPORT_SYMBOL(symbol)  NV_IS_EXPORT_SYMBOL_PRESENT_##symbol
+
 #endif  /* _NV_LINUX_H_ */
--- a/kernel-open/common/inc/nv-lock.h
+++ b/kernel-open/common/inc/nv-lock.h
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 2017 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2017-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -37,6 +37,7 @@

 #if defined(CONFIG_PREEMPT_RT) || defined(CONFIG_PREEMPT_RT_FULL)
 typedef raw_spinlock_t            nv_spinlock_t;
+#define NV_DEFINE_SPINLOCK(lock)  DEFINE_RAW_SPINLOCK(lock)
 #define NV_SPIN_LOCK_INIT(lock)   raw_spin_lock_init(lock)
 #define NV_SPIN_LOCK_IRQ(lock)    raw_spin_lock_irq(lock)
 #define NV_SPIN_UNLOCK_IRQ(lock)  raw_spin_unlock_irq(lock)
@@ -47,6 +48,7 @@ typedef raw_spinlock_t            nv_spinlock_t;
 #define NV_SPIN_UNLOCK_WAIT(lock) raw_spin_unlock_wait(lock)
 #else
 typedef spinlock_t                nv_spinlock_t;
+#define NV_DEFINE_SPINLOCK(lock)  DEFINE_SPINLOCK(lock)
 #define NV_SPIN_LOCK_INIT(lock)   spin_lock_init(lock)
 #define NV_SPIN_LOCK_IRQ(lock)    spin_lock_irq(lock)
 #define NV_SPIN_UNLOCK_IRQ(lock)  spin_unlock_irq(lock)
--- a/kernel-open/common/inc/nv-mm.h
+++ b/kernel-open/common/inc/nv-mm.h
@@ -29,27 +29,33 @@
 typedef int vm_fault_t;
 #endif

-/* pin_user_pages
+/*
+ * pin_user_pages()
+ *
 * Presence of pin_user_pages() also implies the presence of unpin-user_page().
- * Both were added in the v5.6-rc1
+ * Both were added in v5.6.
 *
- * pin_user_pages() was added by commit eddb1c228f7951d399240
- * ("mm/gup: introduce pin_user_pages*() and FOLL_PIN") in v5.6-rc1 (2020-01-30)
- *
- * Removed vmas parameter from pin_user_pages() by commit 40896a02751
- * ("mm/gup: remove vmas parameter from pin_user_pages()")
- * in linux-next, expected in v6.5-rc1 (2023-05-17)
+ * pin_user_pages() was added by commit eddb1c228f79
+ * ("mm/gup: introduce pin_user_pages*() and FOLL_PIN") in v5.6.
 *
+ * Removed vmas parameter from pin_user_pages() by commit 4c630f307455
+ * ("mm/gup: remove vmas parameter from pin_user_pages()") in v6.5.
 */

 #include <linux/mm.h>
 #include <linux/sched.h>
-#if defined(NV_PIN_USER_PAGES_PRESENT)
+
+/*
+ * pin_user_pages() is an inline function on FreeBSD, which breaks its conftest.
+ * Because it simply maps to get_user_pages(), we can replace NV_PIN_USER_PAGES
+ * with NV_GET_USER_PAGES on FreeBSD.
+ */
+#if defined(NV_PIN_USER_PAGES_PRESENT) && !defined(NV_BSD)
    #if defined(NV_PIN_USER_PAGES_HAS_ARGS_VMAS)
-        #define NV_PIN_USER_PAGES pin_user_pages
+        #define NV_PIN_USER_PAGES(start, nr_pages, gup_flags, pages) \
+            pin_user_pages(start, nr_pages, gup_flags, pages, NULL)
    #else
-        #define NV_PIN_USER_PAGES(start, nr_pages, gup_flags, pages, vmas) \
-            pin_user_pages(start, nr_pages, gup_flags, pages)
+        #define NV_PIN_USER_PAGES pin_user_pages
    #endif // NV_PIN_USER_PAGES_HAS_ARGS_VMAS
    #define NV_UNPIN_USER_PAGE unpin_user_page
 #else
@@ -57,80 +63,83 @@ typedef int vm_fault_t;
    #define NV_UNPIN_USER_PAGE put_page
 #endif // NV_PIN_USER_PAGES_PRESENT

-/* get_user_pages
+/*
+ * get_user_pages()
 *
- * The 8-argument version of get_user_pages was deprecated by commit
- * (2016 Feb 12: cde70140fed8429acf7a14e2e2cbd3e329036653)for the non-remote case
+ * The 8-argument version of get_user_pages() was deprecated by commit
+ * cde70140fed8 ("mm/gup: Overload get_user_pages() functions") in v4.6-rc1.
 * (calling get_user_pages with current and current->mm).
 *
- * Completely moved to the 6 argument version of get_user_pages -
- * 2016 Apr 4: c12d2da56d0e07d230968ee2305aaa86b93a6832
+ * Completely moved to the 6-argument version of get_user_pages() by
+ * commit c12d2da56d0e ("mm/gup: Remove the macro overload API migration
+ * helpers from the get_user*() APIs") in v4.6-rc4.
 *
- * write and force parameters were replaced with gup_flags by -
- * 2016 Oct 12: 768ae309a96103ed02eb1e111e838c87854d8b51
+ * write and force parameters were replaced with gup_flags by
+ * commit 768ae309a961 ("mm: replace get_user_pages() write/force parameters
+ * with gup_flags") in v4.9.
 *
 * A 7-argument version of get_user_pages was introduced into linux-4.4.y by
- * commit 8e50b8b07f462ab4b91bc1491b1c91bd75e4ad40 which cherry-picked the
- * replacement of the write and force parameters with gup_flags
+ * commit 8e50b8b07f46 ("mm: replace get_user_pages() write/force parameters
+ * with gup_flags") which cherry-picked the replacement of the write and
+ * force parameters with gup_flags.
 *
- * Removed vmas parameter from get_user_pages() by commit 7bbf9c8c99
- * ("mm/gup: remove unused vmas parameter from get_user_pages()")
- * in linux-next, expected in v6.5-rc1 (2023-05-17)
+ * Removed vmas parameter from get_user_pages() by commit 54d020692b34
+ * ("mm/gup: remove unused vmas parameter from get_user_pages()") in v6.5.
 *
 */

 #if defined(NV_GET_USER_PAGES_HAS_ARGS_FLAGS)
-    #define NV_GET_USER_PAGES(start, nr_pages, flags, pages, vmas) \
-        get_user_pages(start, nr_pages, flags, pages)
-#elif defined(NV_GET_USER_PAGES_HAS_ARGS_FLAGS_VMAS)
    #define NV_GET_USER_PAGES get_user_pages
+#elif defined(NV_GET_USER_PAGES_HAS_ARGS_FLAGS_VMAS)
+    #define NV_GET_USER_PAGES(start, nr_pages, flags, pages) \
+        get_user_pages(start, nr_pages, flags, pages, NULL)
 #elif defined(NV_GET_USER_PAGES_HAS_ARGS_TSK_FLAGS_VMAS)
-    #define NV_GET_USER_PAGES(start, nr_pages, flags, pages, vmas) \
-        get_user_pages(current, current->mm, start, nr_pages, flags, pages, vmas)
+    #define NV_GET_USER_PAGES(start, nr_pages, flags, pages) \
+        get_user_pages(current, current->mm, start, nr_pages, flags, pages, NULL)
 #else
    static inline long NV_GET_USER_PAGES(unsigned long start,
                                         unsigned long nr_pages,
                                         unsigned int flags,
-                                         struct page **pages,
-                                         struct vm_area_struct **vmas)
+                                         struct page **pages)
    {
        int write = flags & FOLL_WRITE;
        int force = flags & FOLL_FORCE;

    #if defined(NV_GET_USER_PAGES_HAS_ARGS_WRITE_FORCE_VMAS)
-        return get_user_pages(start, nr_pages, write, force, pages, vmas);
+        return get_user_pages(start, nr_pages, write, force, pages, NULL);
    #else
        // NV_GET_USER_PAGES_HAS_ARGS_TSK_WRITE_FORCE_VMAS
        return get_user_pages(current, current->mm, start, nr_pages, write,
-                              force, pages, vmas);
+                              force, pages, NULL);
    #endif // NV_GET_USER_PAGES_HAS_ARGS_WRITE_FORCE_VMAS
    }
 #endif // NV_GET_USER_PAGES_HAS_ARGS_FLAGS

-/* pin_user_pages_remote
+/*
+ * pin_user_pages_remote()
 *
- * pin_user_pages_remote() was added by commit eddb1c228f7951d399240
- * ("mm/gup: introduce pin_user_pages*() and FOLL_PIN") in v5.6 (2020-01-30)
+ * pin_user_pages_remote() was added by commit eddb1c228f79
+ * ("mm/gup: introduce pin_user_pages*() and FOLL_PIN") in v5.6.
 *
 * pin_user_pages_remote() removed 'tsk' parameter by commit
- * 64019a2e467a ("mm/gup: remove task_struct pointer for  all gup code")
- * in v5.9-rc1 (2020-08-11). *
+ * 64019a2e467a ("mm/gup: remove task_struct pointer for all gup code")
+ * in v5.9.
 *
 * Removed unused vmas parameter from pin_user_pages_remote() by commit
- * 83bcc2e132("mm/gup: remove unused vmas parameter from pin_user_pages_remote()")
- * in linux-next, expected in v6.5-rc1 (2023-05-14)
+ * 0b295316b3a9 ("mm/gup: remove unused vmas parameter from
+ * pin_user_pages_remote()") in v6.5.
 *
 */

 #if defined(NV_PIN_USER_PAGES_REMOTE_PRESENT)
    #if defined(NV_PIN_USER_PAGES_REMOTE_HAS_ARGS_TSK_VMAS)
-        #define NV_PIN_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, vmas, locked) \
-            pin_user_pages_remote(NULL, mm, start, nr_pages, flags, pages, vmas, locked)
+        #define NV_PIN_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, locked) \
+            pin_user_pages_remote(NULL, mm, start, nr_pages, flags, pages, NULL, locked)
    #elif defined(NV_PIN_USER_PAGES_REMOTE_HAS_ARGS_VMAS)
-        #define NV_PIN_USER_PAGES_REMOTE pin_user_pages_remote
+        #define NV_PIN_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, locked) \
+            pin_user_pages_remote(mm, start, nr_pages, flags, pages, NULL, locked)
    #else
-        #define NV_PIN_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, vmas, locked) \
-            pin_user_pages_remote(mm, start, nr_pages, flags, pages, locked)
+        #define NV_PIN_USER_PAGES_REMOTE pin_user_pages_remote
    #endif // NV_PIN_USER_PAGES_REMOTE_HAS_ARGS_TSK_VMAS
 #else
    #define NV_PIN_USER_PAGES_REMOTE NV_GET_USER_PAGES_REMOTE
@@ -138,7 +147,7 @@ typedef int vm_fault_t;

 /*
 * get_user_pages_remote() was added by commit 1e9877902dc7
- * ("mm/gup: Introduce get_user_pages_remote()") in v4.6 (2016-02-12).
+ * ("mm/gup: Introduce get_user_pages_remote()") in v4.6.
 *
 * Note that get_user_pages_remote() requires the caller to hold a reference on
 * the task_struct (if non-NULL and if this API has tsk argument) and the mm_struct.
@@ -148,37 +157,35 @@ typedef int vm_fault_t;
 *
 * get_user_pages_remote() write/force parameters were replaced
 * with gup_flags by commit 9beae1ea8930 ("mm: replace get_user_pages_remote()
- * write/force parameters with gup_flags") in v4.9 (2016-10-13).
+ * write/force parameters with gup_flags") in v4.9.
 *
 * get_user_pages_remote() added 'locked' parameter by commit 5b56d49fc31d
- * ("mm: add locked parameter to get_user_pages_remote()") in
- * v4.10 (2016-12-14).
+ * ("mm: add locked parameter to get_user_pages_remote()") in v4.10.
 *
 * get_user_pages_remote() removed 'tsk' parameter by
 * commit 64019a2e467a ("mm/gup: remove task_struct pointer for
- * all gup code") in v5.9-rc1 (2020-08-11).
+ * all gup code") in v5.9.
 *
- * Removed vmas parameter from get_user_pages_remote() by commit a4bde14d549 
- * ("mm/gup: remove vmas parameter from get_user_pages_remote()")
- * in linux-next, expected in v6.5-rc1 (2023-05-14)
+ * Removed vmas parameter from get_user_pages_remote() by commit ca5e863233e8
+ * ("mm/gup: remove vmas parameter from get_user_pages_remote()") in v6.5.
 *
 */

 #if defined(NV_GET_USER_PAGES_REMOTE_PRESENT)
    #if defined(NV_GET_USER_PAGES_REMOTE_HAS_ARGS_FLAGS_LOCKED)
-        #define NV_GET_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, vmas, locked) \
-            get_user_pages_remote(mm, start, nr_pages, flags, pages, locked)
-
-    #elif defined(NV_GET_USER_PAGES_REMOTE_HAS_ARGS_FLAGS_LOCKED_VMAS)
        #define NV_GET_USER_PAGES_REMOTE get_user_pages_remote

+    #elif defined(NV_GET_USER_PAGES_REMOTE_HAS_ARGS_FLAGS_LOCKED_VMAS)
+        #define NV_GET_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, locked) \
+            get_user_pages_remote(mm, start, nr_pages, flags, pages, NULL, locked)
+
    #elif defined(NV_GET_USER_PAGES_REMOTE_HAS_ARGS_TSK_FLAGS_LOCKED_VMAS)
-        #define NV_GET_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, vmas, locked) \
-            get_user_pages_remote(NULL, mm, start, nr_pages, flags, pages, vmas, locked)
+        #define NV_GET_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, locked) \
+            get_user_pages_remote(NULL, mm, start, nr_pages, flags, pages, NULL, locked)

    #elif defined(NV_GET_USER_PAGES_REMOTE_HAS_ARGS_TSK_FLAGS_VMAS)
-        #define NV_GET_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, vmas, locked) \
-            get_user_pages_remote(NULL, mm, start, nr_pages, flags, pages, vmas)
+        #define NV_GET_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, locked) \
+            get_user_pages_remote(NULL, mm, start, nr_pages, flags, pages, NULL)

    #else
        // NV_GET_USER_PAGES_REMOTE_HAS_ARGS_TSK_WRITE_FORCE_VMAS
@@ -187,14 +194,13 @@ typedef int vm_fault_t;
                                                    unsigned long nr_pages,
                                                    unsigned int flags,
                                                    struct page **pages,
-                                                    struct vm_area_struct **vmas,
                                                    int *locked)
        {
            int write = flags & FOLL_WRITE;
            int force = flags & FOLL_FORCE;

            return get_user_pages_remote(NULL, mm, start, nr_pages, write, force,
-                                         pages, vmas);
+                                         pages, NULL);
        }
    #endif // NV_GET_USER_PAGES_REMOTE_HAS_ARGS_FLAGS_LOCKED
 #else
@@ -204,18 +210,17 @@ typedef int vm_fault_t;
                                                    unsigned long nr_pages,
                                                    unsigned int flags,
                                                    struct page **pages,
-                                                    struct vm_area_struct **vmas,
                                                    int *locked)
        {
            int write = flags & FOLL_WRITE;
            int force = flags & FOLL_FORCE;

-            return get_user_pages(NULL, mm, start, nr_pages, write, force, pages, vmas);
+            return get_user_pages(NULL, mm, start, nr_pages, write, force, pages, NULL);
        }

    #else
-        #define NV_GET_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, vmas, locked) \
-            get_user_pages(NULL, mm, start, nr_pages, flags, pages, vmas)
+        #define NV_GET_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, locked) \
+            get_user_pages(NULL, mm, start, nr_pages, flags, pages, NULL)
    #endif // NV_GET_USER_PAGES_HAS_ARGS_TSK_WRITE_FORCE_VMAS
 #endif // NV_GET_USER_PAGES_REMOTE_PRESENT
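
The wrapper pattern in this file gives callers one fixed argument list and passes NULL for `vmas` on kernels that still take it. A minimal user-space sketch follows, with a mock function standing in for the kernel API. `mock_get_user_pages`, `MOCK_GUP_HAS_ARGS_FLAGS_VMAS`, and `MOCK_GET_USER_PAGES` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for an older get_user_pages() that still takes a
 * trailing vmas argument. It reports nr_pages pinned when vmas is NULL.
 */
static long mock_get_user_pages(unsigned long start, unsigned long nr_pages,
                                unsigned int flags, void **pages, void **vmas)
{
    (void)start; (void)flags; (void)pages;
    return (vmas == NULL) ? (long)nr_pages : -1;
}

/* Assumption: a conftest-style feature macro selects the signature. */
#define MOCK_GUP_HAS_ARGS_FLAGS_VMAS 1

#if defined(MOCK_GUP_HAS_ARGS_FLAGS_VMAS)
    /* Older signature: supply NULL for the unused vmas argument. */
    #define MOCK_GET_USER_PAGES(start, nr_pages, flags, pages) \
        mock_get_user_pages(start, nr_pages, flags, pages, NULL)
#else
    /* Newer signature: the wrapper would be a plain alias. */
    #define MOCK_GET_USER_PAGES mock_get_user_pages
#endif

/* Callers always use the four-argument form, whatever lies underneath. */
static void mock_gup_check(void)
{
    assert(MOCK_GET_USER_PAGES(0x1000UL, 4UL, 0U, NULL) == 4);
}
```

Keeping `vmas` out of the wrapper's argument list means call sites need no per-kernel conditionals; only the macro definition varies.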

--- a/kernel-open/common/inc/nv-pgprot.h
+++ b/kernel-open/common/inc/nv-pgprot.h
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 2015 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2015-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -60,6 +60,7 @@ static inline pgprot_t pgprot_modify_writecombine(pgprot_t old_prot)
 #endif /* !defined(NV_VMWARE) */

 #if defined(NVCPU_AARCH64)
+extern NvBool nvos_is_chipset_io_coherent(void);
 /*
 * Don't rely on the kernel's definition of pgprot_noncached(), as on 64-bit
 * ARM that's not for system memory, but device memory instead. For I/O cache
--- a/kernel-open/common/inc/nv-procfs-utils.h
+++ b/kernel-open/common/inc/nv-procfs-utils.h
@@ -92,6 +92,24 @@ typedef struct file_operations nv_proc_ops_t;
 #endif

 #define NV_DEFINE_SINGLE_PROCFS_FILE_HELPER(name, lock)                     \
+    static ssize_t nv_procfs_read_lock_##name(                              \
+        struct file *file,                                                  \
+        char __user *buf,                                                   \
+        size_t size,                                                        \
+        loff_t *ppos                                                        \
+    )                                                                       \
+    {                                                                       \
+        int ret;                                                            \
+        ret = nv_down_read_interruptible(&lock);                            \
+        if (ret < 0)                                                        \
+        {                                                                   \
+            return ret;                                                     \
+        }                                                                   \
+        size = seq_read(file, buf, size, ppos);                             \
+        up_read(&lock);                                                     \
+        return size;                                                        \
+    }                                                                       \
+                                                                            \
    static int nv_procfs_open_##name(                                       \
        struct inode *inode,                                                \
        struct file *filep                                                  \
@@ -104,11 +122,6 @@ typedef struct file_operations nv_proc_ops_t;
        {                                                                   \
            return ret;                                                     \
        }                                                                   \
-        ret = nv_down_read_interruptible(&lock);                            \
-        if (ret < 0)                                                        \
-        {                                                                   \
-            single_release(inode, filep);                                   \
-        }                                                                   \
        return ret;                                                         \
    }                                                                       \
                                                                            \
@@ -117,7 +130,6 @@ typedef struct file_operations nv_proc_ops_t;
        struct file *filep                                                  \
    )                                                                       \
    {                                                                       \
-        up_read(&lock);                                                     \
        return single_release(inode, filep);                                \
    }

@@ -127,46 +139,7 @@ typedef struct file_operations nv_proc_ops_t;
    static const nv_proc_ops_t nv_procfs_##name##_fops = {                  \
        NV_PROC_OPS_SET_OWNER()                                             \
        .NV_PROC_OPS_OPEN    = nv_procfs_open_##name,                       \
-        .NV_PROC_OPS_READ    = seq_read,                                    \
-        .NV_PROC_OPS_LSEEK   = seq_lseek,                                   \
-        .NV_PROC_OPS_RELEASE = nv_procfs_release_##name,                    \
-    };
-
-
-#define NV_DEFINE_SINGLE_PROCFS_FILE_READ_WRITE(name, lock,                 \
-write_callback)                                                             \
-    NV_DEFINE_SINGLE_PROCFS_FILE_HELPER(name, lock)                         \
-                                                                            \
-    static ssize_t nv_procfs_write_##name(                                  \
-        struct file *file,                                                  \
-        const char __user *buf,                                             \
-        size_t size,                                                        \
-        loff_t *ppos                                                        \
-    )                                                                       \
-    {                                                                       \
-        ssize_t ret;                                                        \
-        struct seq_file *s;                                                 \
-                                                                            \
-        s = file->private_data;                                             \
-        if (s == NULL)                                                      \
-        {                                                                   \
-            return -EIO;                                                    \
-        }                                                                   \
-                                                                            \
-        ret = write_callback(s, buf + *ppos, size - *ppos);                 \
-        if (ret == 0)                                                       \
-        {                                                                   \
-            /* avoid infinite loop */                                       \
-            ret = -EIO;                                                     \
-        }                                                                   \
-        return ret;                                                         \
-    }                                                                       \
-                                                                            \
-    static const nv_proc_ops_t nv_procfs_##name##_fops = {                  \
-        NV_PROC_OPS_SET_OWNER()                                             \
-        .NV_PROC_OPS_OPEN    = nv_procfs_open_##name,                       \
-        .NV_PROC_OPS_READ    = seq_read,                                    \
-        .NV_PROC_OPS_WRITE   = nv_procfs_write_##name,                      \
+        .NV_PROC_OPS_READ    = nv_procfs_read_lock_##name,                  \
        .NV_PROC_OPS_LSEEK   = seq_lseek,                                   \
        .NV_PROC_OPS_RELEASE = nv_procfs_release_##name,                    \
    };
--- a/kernel-open/common/inc/nv-proto.h
+++ b/kernel-open/common/inc/nv-proto.h
@@ -88,4 +88,7 @@ int           nv_linux_add_device_locked(nv_linux_state_t *);
 void          nv_linux_remove_device_locked(nv_linux_state_t *);
 NvBool        nv_acpi_power_resource_method_present(struct pci_dev *);

+int           nv_linux_init_open_q(nv_linux_state_t *);
+void          nv_linux_stop_open_q(nv_linux_state_t *);
+
 #endif /* _NV_PROTO_H_ */
--- a/kernel-open/common/inc/nv.h
+++ b/kernel-open/common/inc/nv.h
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 1999-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 1999-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -110,15 +110,15 @@ typedef enum _TEGRASOC_WHICH_CLK
    TEGRASOC_WHICH_CLK_DSIPLL_CLKOUTPN,
    TEGRASOC_WHICH_CLK_DSIPLL_CLKOUTA,
    TEGRASOC_WHICH_CLK_SPPLL0_VCO,
-    TEGRASOC_WHICH_CLK_SPPLL0_CLKOUTPN,
    TEGRASOC_WHICH_CLK_SPPLL0_CLKOUTA,
    TEGRASOC_WHICH_CLK_SPPLL0_CLKOUTB,
+    TEGRASOC_WHICH_CLK_SPPLL0_CLKOUTPN,
+    TEGRASOC_WHICH_CLK_SPPLL1_CLKOUTPN,
+    TEGRASOC_WHICH_CLK_SPPLL0_DIV27,
+    TEGRASOC_WHICH_CLK_SPPLL1_DIV27,
    TEGRASOC_WHICH_CLK_SPPLL0_DIV10,
    TEGRASOC_WHICH_CLK_SPPLL0_DIV25,
-    TEGRASOC_WHICH_CLK_SPPLL0_DIV27,
    TEGRASOC_WHICH_CLK_SPPLL1_VCO,
-    TEGRASOC_WHICH_CLK_SPPLL1_CLKOUTPN,
-    TEGRASOC_WHICH_CLK_SPPLL1_DIV27,
    TEGRASOC_WHICH_CLK_VPLL0_REF,
    TEGRASOC_WHICH_CLK_VPLL0,
    TEGRASOC_WHICH_CLK_VPLL1,
@@ -132,7 +132,7 @@ typedef enum _TEGRASOC_WHICH_CLK
    TEGRASOC_WHICH_CLK_DSI_PIXEL,
    TEGRASOC_WHICH_CLK_PRE_SOR0,
    TEGRASOC_WHICH_CLK_PRE_SOR1,
-    TEGRASOC_WHICH_CLK_DP_LINK_REF,
+    TEGRASOC_WHICH_CLK_DP_LINKA_REF,
    TEGRASOC_WHICH_CLK_SOR_LINKA_INPUT,
    TEGRASOC_WHICH_CLK_SOR_LINKA_AFIFO,
    TEGRASOC_WHICH_CLK_SOR_LINKA_AFIFO_M,
@@ -143,7 +143,7 @@ typedef enum _TEGRASOC_WHICH_CLK
    TEGRASOC_WHICH_CLK_PLLHUB,
    TEGRASOC_WHICH_CLK_SOR0,
    TEGRASOC_WHICH_CLK_SOR1,
-    TEGRASOC_WHICH_CLK_SOR_PAD_INPUT,
+    TEGRASOC_WHICH_CLK_SOR_PADA_INPUT,
    TEGRASOC_WHICH_CLK_PRE_SF0,
    TEGRASOC_WHICH_CLK_SF0,
    TEGRASOC_WHICH_CLK_SF1,
@@ -221,7 +221,6 @@ typedef struct
 #define NV_RM_PAGE_MASK     (NV_RM_PAGE_SIZE - 1)

 #define NV_RM_TO_OS_PAGE_SHIFT      (os_page_shift - NV_RM_PAGE_SHIFT)
-#define NV_RM_PAGES_PER_OS_PAGE     (1U << NV_RM_TO_OS_PAGE_SHIFT)
 #define NV_RM_PAGES_TO_OS_PAGES(count) \
    ((((NvUPtr)(count)) >> NV_RM_TO_OS_PAGE_SHIFT) + \
     ((((count) & ((1 << NV_RM_TO_OS_PAGE_SHIFT) - 1)) != 0) ? 1 : 0))
@@ -333,7 +332,9 @@ typedef struct nv_soc_irq_info_s {

 #define NV_MAX_SOC_IRQS              6
 #define NV_MAX_DPAUX_NUM_DEVICES     4
-#define NV_MAX_SOC_DPAUX_NUM_DEVICES 2 // From SOC_DEV_MAPPING
+
+#define NV_MAX_SOC_DPAUX_NUM_DEVICES 2
+

 #define NV_IGPU_LEGACY_STALL_IRQ     70
 #define NV_IGPU_MAX_STALL_IRQS       3
@@ -467,12 +468,6 @@ typedef struct nv_state_t
        NvHandle hDisp;
    } rmapi;

-    /* Bool to check if ISO iommu enabled */
-    NvBool iso_iommu_present;
-
-    /* Bool to check if NISO iommu enabled */
-    NvBool niso_iommu_present;
-
    /* Bool to check if dma-buf is supported */
    NvBool dma_buf_supported;

@@ -484,13 +479,23 @@ typedef struct nv_state_t

    /* Bool to check if the GPU has a coherent sysmem link */
    NvBool coherent;
-} nv_state_t;

-// These define need to be in sync with defines in system.h
-#define OS_TYPE_LINUX   0x1
-#define OS_TYPE_FREEBSD 0x2
-#define OS_TYPE_SUNOS   0x3
-#define OS_TYPE_VMWARE  0x4
+    /*
+     * NUMA node ID of the CPU to which the GPU is attached.
+     * Holds NUMA_NO_NODE on platforms that don't support NUMA configuration.
+     */
+    NvS32 cpu_numa_node_id;
+
+    struct {
+        /* Bool to check if ISO iommu enabled */
+        NvBool iso_iommu_present;
+        /* Bool to check if NISO iommu enabled */
+        NvBool niso_iommu_present;
+        /* Display SMMU Stream IDs */
+        NvU32 dispIsoStreamId;
+        NvU32 dispNisoStreamId;
+    } iommus;
+} nv_state_t;

 #define NVFP_TYPE_NONE       0x0
 #define NVFP_TYPE_REFCOUNTED 0x1
@@ -600,6 +605,15 @@ typedef enum
    NV_POWER_STATE_RUNNING
 } nv_power_state_t;

+typedef struct
+{
+    const char *vidmem_power_status;
+    const char *dynamic_power_status;
+    const char *gc6_support;
+    const char *gcoff_support;
+    const char *s0ix_status;
+} nv_power_info_t;
+
 #define NV_PRIMARY_VGA(nv)      ((nv)->primary_vga)

 #define NV_IS_CTL_DEVICE(nv)    ((nv)->flags & NV_FLAG_CONTROL)
@@ -612,11 +626,19 @@ typedef enum
 #define NV_IS_DEVICE_IN_SURPRISE_REMOVAL(nv)    \
        (((nv)->flags & NV_FLAG_IN_SURPRISE_REMOVAL) != 0)

+/*
+ * For console setup by EFI GOP, the base address is BAR1.
+ * For console setup by VBIOS, the base address is BAR2 + 16MB.
+ */
+#define NV_IS_CONSOLE_MAPPED(nv, addr)  \
+        (((addr) == (nv)->bars[NV_GPU_BAR_INDEX_FB].cpu_address) || \
+         ((addr) == ((nv)->bars[NV_GPU_BAR_INDEX_IMEM].cpu_address + 0x1000000)))
+
 #define NV_SOC_IS_ISO_IOMMU_PRESENT(nv)     \
-        ((nv)->iso_iommu_present)
+        ((nv)->iommus.iso_iommu_present)

 #define NV_SOC_IS_NISO_IOMMU_PRESENT(nv)     \
-        ((nv)->niso_iommu_present)
+        ((nv)->iommus.niso_iommu_present)
 /*
 * GPU add/remove events
 */
@@ -761,7 +783,7 @@ nv_state_t*  NV_API_CALL  nv_get_ctl_state       (void);

 void   NV_API_CALL  nv_set_dma_address_size      (nv_state_t *, NvU32 );

-NV_STATUS  NV_API_CALL  nv_alias_pages           (nv_state_t *, NvU32, NvU32, NvU32, NvU64, NvU64 *, void **);
+NV_STATUS  NV_API_CALL  nv_alias_pages           (nv_state_t *, NvU32, NvU64, NvU32, NvU32, NvU64, NvU64 *, void **);
 NV_STATUS  NV_API_CALL  nv_alloc_pages           (nv_state_t *, NvU32, NvU64, NvBool, NvU32, NvBool, NvBool, NvS32, NvU64 *, void **);
 NV_STATUS  NV_API_CALL  nv_free_pages            (nv_state_t *, NvU32, NvBool, NvU32, void *);

@@ -779,8 +801,6 @@ NV_STATUS NV_API_CALL   nv_register_phys_pages   (nv_state_t *, NvU64 *, NvU64,
 void      NV_API_CALL   nv_unregister_phys_pages (nv_state_t *, void *);

 NV_STATUS  NV_API_CALL  nv_dma_map_sgt           (nv_dma_device_t *, NvU64, NvU64 *, NvU32, void **);
-NV_STATUS  NV_API_CALL  nv_dma_map_pages         (nv_dma_device_t *, NvU64, NvU64 *, NvBool, NvU32, void **);
-NV_STATUS  NV_API_CALL  nv_dma_unmap_pages       (nv_dma_device_t *, NvU64, NvU64 *, void **);

 NV_STATUS  NV_API_CALL  nv_dma_map_alloc         (nv_dma_device_t *, NvU64, NvU64 *, NvBool, void **);
 NV_STATUS  NV_API_CALL  nv_dma_unmap_alloc       (nv_dma_device_t *, NvU64, NvU64 *, void **);
@@ -807,6 +827,7 @@ void   NV_API_CALL  nv_acpi_methods_init         (NvU32 *);
 void   NV_API_CALL  nv_acpi_methods_uninit       (void);

 NV_STATUS  NV_API_CALL  nv_acpi_method           (NvU32, NvU32, NvU32, void *, NvU16, NvU32 *, void *, NvU16 *);
+NV_STATUS  NV_API_CALL  nv_acpi_d3cold_dsm_for_upstream_port (nv_state_t *, NvU8 *, NvU32, NvU32, NvU32 *);
 NV_STATUS  NV_API_CALL  nv_acpi_dsm_method       (nv_state_t *, NvU8 *, NvU32, NvBool, NvU32, void *, NvU16, NvU32 *, void *, NvU16 *);
 NV_STATUS  NV_API_CALL  nv_acpi_ddc_method       (nv_state_t *, void *, NvU32 *, NvBool);
 NV_STATUS  NV_API_CALL  nv_acpi_dod_method       (nv_state_t *, NvU32 *, NvU32 *);
@@ -830,7 +851,7 @@ void       NV_API_CALL  nv_put_firmware(const void *);
 nv_file_private_t* NV_API_CALL nv_get_file_private(NvS32, NvBool, void **);
 void               NV_API_CALL nv_put_file_private(void *);

-NV_STATUS NV_API_CALL nv_get_device_memory_config(nv_state_t *, NvU64 *, NvU64 *, NvU32 *, NvS32 *);
+NV_STATUS NV_API_CALL nv_get_device_memory_config(nv_state_t *, NvU64 *, NvU64 *, NvU64 *, NvU32 *, NvS32 *);
 NV_STATUS NV_API_CALL nv_get_egm_info(nv_state_t *, NvU64 *, NvU64 *, NvS32 *);

 NV_STATUS NV_API_CALL nv_get_ibmnpu_genreg_info(nv_state_t *, NvU64 *, NvU64 *, void**);
@@ -868,18 +889,18 @@ void      NV_API_CALL nv_cap_drv_exit(void);
 NvBool    NV_API_CALL nv_is_gpu_accessible(nv_state_t *);
 NvBool    NV_API_CALL nv_match_gpu_os_info(nv_state_t *, void *);

-NvU32     NV_API_CALL nv_get_os_type(void);
-
 void      NV_API_CALL nv_get_updated_emu_seg(NvU32 *start, NvU32 *end);
+void      NV_API_CALL nv_get_screen_info(nv_state_t *, NvU64 *, NvU32 *, NvU32 *, NvU32 *, NvU32 *, NvU64 *);
+
 struct dma_buf;
 typedef struct nv_dma_buf nv_dma_buf_t;
 struct drm_gem_object;

 NV_STATUS NV_API_CALL nv_dma_import_sgt  (nv_dma_device_t *, struct sg_table *, struct drm_gem_object *);
 void NV_API_CALL nv_dma_release_sgt(struct sg_table *, struct drm_gem_object *);
-NV_STATUS NV_API_CALL nv_dma_import_dma_buf      (nv_dma_device_t *, struct dma_buf *, NvU32 *, void **, struct sg_table **, nv_dma_buf_t **);
-NV_STATUS NV_API_CALL nv_dma_import_from_fd      (nv_dma_device_t *, NvS32, NvU32 *, void **, struct sg_table **, nv_dma_buf_t **);
-void      NV_API_CALL nv_dma_release_dma_buf     (void *, nv_dma_buf_t *);
+NV_STATUS NV_API_CALL nv_dma_import_dma_buf      (nv_dma_device_t *, struct dma_buf *, NvU32 *, struct sg_table **, nv_dma_buf_t **);
+NV_STATUS NV_API_CALL nv_dma_import_from_fd      (nv_dma_device_t *, NvS32, NvU32 *, struct sg_table **, nv_dma_buf_t **);
+void      NV_API_CALL nv_dma_release_dma_buf     (nv_dma_buf_t *);

 void      NV_API_CALL nv_schedule_uvm_isr        (nv_state_t *);

@@ -895,6 +916,8 @@ typedef void (*nvTegraDceClientIpcCallback)(NvU32, NvU32, NvU32, void *, void *)
 NV_STATUS NV_API_CALL nv_get_num_phys_pages      (void *, NvU32 *);
 NV_STATUS NV_API_CALL nv_get_phys_pages          (void *, void *, NvU32 *);

+void      NV_API_CALL nv_get_disp_smmu_stream_ids (nv_state_t *, NvU32 *, NvU32 *);
+
 /*
 * ---------------------------------------------------------------------------
 *
@@ -921,6 +944,7 @@ NV_STATUS  NV_API_CALL  rm_ioctl                 (nvidia_stack_t *, nv_state_t *
 NvBool     NV_API_CALL  rm_isr                   (nvidia_stack_t *, nv_state_t *, NvU32 *);
 void       NV_API_CALL  rm_isr_bh                (nvidia_stack_t *, nv_state_t *);
 void       NV_API_CALL  rm_isr_bh_unlocked       (nvidia_stack_t *, nv_state_t *);
+NvBool     NV_API_CALL  rm_is_msix_allowed       (nvidia_stack_t *, nv_state_t *);
 NV_STATUS  NV_API_CALL  rm_power_management      (nvidia_stack_t *, nv_state_t *, nv_pm_action_t);
 NV_STATUS  NV_API_CALL  rm_stop_user_channels    (nvidia_stack_t *, nv_state_t *);
 NV_STATUS  NV_API_CALL  rm_restart_user_channels (nvidia_stack_t *, nv_state_t *);
@@ -940,6 +964,7 @@ void       NV_API_CALL  rm_parse_option_string   (nvidia_stack_t *, const char *
 char*      NV_API_CALL  rm_remove_spaces         (const char *);
 char*      NV_API_CALL  rm_string_token          (char **, const char);
 void       NV_API_CALL  rm_vgpu_vfio_set_driver_vm(nvidia_stack_t *, NvBool);
+NV_STATUS  NV_API_CALL  rm_get_adapter_status_external(nvidia_stack_t *, nv_state_t *);

 NV_STATUS  NV_API_CALL  rm_run_rc_callback       (nvidia_stack_t *, nv_state_t *);
 void       NV_API_CALL  rm_execute_work_item     (nvidia_stack_t *, void *);
@@ -969,10 +994,10 @@ NV_STATUS  NV_API_CALL  rm_p2p_init_mapping       (nvidia_stack_t *, NvU64, NvU6
 NV_STATUS  NV_API_CALL  rm_p2p_destroy_mapping    (nvidia_stack_t *, NvU64);
 NV_STATUS  NV_API_CALL  rm_p2p_get_pages          (nvidia_stack_t *, NvU64, NvU32, NvU64, NvU64, NvU64 *, NvU32 *, NvU32 *, NvU32 *, NvU8 **, void *);
 NV_STATUS  NV_API_CALL  rm_p2p_get_gpu_info       (nvidia_stack_t *, NvU64, NvU64, NvU8 **, void **);
-NV_STATUS  NV_API_CALL  rm_p2p_get_pages_persistent (nvidia_stack_t *,  NvU64, NvU64, void **, NvU64 *, NvU32 *, void *, void *);
+NV_STATUS  NV_API_CALL  rm_p2p_get_pages_persistent (nvidia_stack_t *,  NvU64, NvU64, void **, NvU64 *, NvU32 *, void *, void *, void **);
 NV_STATUS  NV_API_CALL  rm_p2p_register_callback  (nvidia_stack_t *, NvU64, NvU64, NvU64, void *, void (*)(void *), void *);
 NV_STATUS  NV_API_CALL  rm_p2p_put_pages          (nvidia_stack_t *, NvU64, NvU32, NvU64, void *);
-NV_STATUS  NV_API_CALL  rm_p2p_put_pages_persistent(nvidia_stack_t *, void *, void *);
+NV_STATUS  NV_API_CALL  rm_p2p_put_pages_persistent(nvidia_stack_t *, void *, void *, void *);
 NV_STATUS  NV_API_CALL  rm_p2p_dma_map_pages      (nvidia_stack_t *, nv_dma_device_t *, NvU8 *, NvU64, NvU32, NvU64 *, void **);
 NV_STATUS  NV_API_CALL  rm_dma_buf_dup_mem_handle (nvidia_stack_t *, nv_state_t *, NvHandle, NvHandle, NvHandle, NvHandle, void *, NvHandle, NvU64, NvU64, NvHandle *, void **);
 void       NV_API_CALL  rm_dma_buf_undup_mem_handle(nvidia_stack_t *, nv_state_t *, NvHandle, NvHandle);
@@ -1006,9 +1031,7 @@ void       NV_API_CALL rm_enable_dynamic_power_management(nvidia_stack_t *, nv_s
 NV_STATUS  NV_API_CALL rm_ref_dynamic_power(nvidia_stack_t *, nv_state_t *, nv_dynamic_power_mode_t);
 void       NV_API_CALL rm_unref_dynamic_power(nvidia_stack_t *, nv_state_t *, nv_dynamic_power_mode_t);
 NV_STATUS  NV_API_CALL rm_transition_dynamic_power(nvidia_stack_t *, nv_state_t *, NvBool, NvBool *);
-const char* NV_API_CALL rm_get_vidmem_power_status(nvidia_stack_t *, nv_state_t *);
-const char* NV_API_CALL rm_get_dynamic_power_management_status(nvidia_stack_t *, nv_state_t *);
-const char* NV_API_CALL rm_get_gpu_gcx_support(nvidia_stack_t *, nv_state_t *, NvBool);
+void       NV_API_CALL rm_get_power_info(nvidia_stack_t *, nv_state_t *, nv_power_info_t *);

 void       NV_API_CALL rm_acpi_notify(nvidia_stack_t *, nv_state_t *, NvU32);
 void       NV_API_CALL rm_acpi_nvpcf_notify(nvidia_stack_t *);
@@ -1020,13 +1043,12 @@ NV_STATUS  NV_API_CALL  nv_vgpu_create_request(nvidia_stack_t *, nv_state_t *, c
 NV_STATUS  NV_API_CALL  nv_vgpu_delete(nvidia_stack_t *, const NvU8 *, NvU16);
 NV_STATUS  NV_API_CALL  nv_vgpu_get_type_ids(nvidia_stack_t *, nv_state_t *, NvU32 *, NvU32 *, NvBool, NvU8, NvBool);
 NV_STATUS  NV_API_CALL  nv_vgpu_get_type_info(nvidia_stack_t *, nv_state_t *, NvU32, char *, int, NvU8);
-NV_STATUS  NV_API_CALL  nv_vgpu_get_bar_info(nvidia_stack_t *, nv_state_t *, const NvU8 *, NvU64 *, NvU32, void *, NvBool *);
+NV_STATUS  NV_API_CALL  nv_vgpu_get_bar_info(nvidia_stack_t *, nv_state_t *, const NvU8 *, NvU64 *,
+                                             NvU64 *, NvU64 *, NvU32 *, NvBool *, NvU8 *);
 NV_STATUS  NV_API_CALL  nv_vgpu_get_hbm_info(nvidia_stack_t *, nv_state_t *, const NvU8 *, NvU64 *, NvU64 *);
-NV_STATUS  NV_API_CALL  nv_vgpu_start(nvidia_stack_t *, const NvU8 *, void *, NvS32 *, NvU8 *, NvU32);
-NV_STATUS  NV_API_CALL  nv_vgpu_get_sparse_mmap(nvidia_stack_t *, nv_state_t *, const NvU8 *, NvU64 **, NvU64 **, NvU32 *);
 NV_STATUS  NV_API_CALL  nv_vgpu_process_vf_info(nvidia_stack_t *, nv_state_t *, NvU8, NvU32, NvU8, NvU8, NvU8, NvBool, void *);
-NV_STATUS  NV_API_CALL  nv_vgpu_update_request(nvidia_stack_t *, const NvU8 *, NvU32, NvU64 *, NvU64 *, const char *);
 NV_STATUS  NV_API_CALL  nv_gpu_bind_event(nvidia_stack_t *);
+NV_STATUS  NV_API_CALL  nv_gpu_unbind_event(nvidia_stack_t *, NvU32, NvBool *);

 NV_STATUS NV_API_CALL nv_get_usermap_access_params(nv_state_t*, nv_usermap_access_params_t*);
 nv_soc_irq_type_t NV_API_CALL nv_get_current_irq_type(nv_state_t*);
@@ -1057,6 +1079,9 @@ NV_STATUS   NV_API_CALL rm_run_nano_timer_callback(nvidia_stack_t *, nv_state_t
 void        NV_API_CALL nv_cancel_nano_timer(nv_state_t *, nv_nano_timer_t *);
 void        NV_API_CALL nv_destroy_nano_timer(nv_state_t *nv, nv_nano_timer_t *);

+// Host1x specific functions.
+NV_STATUS NV_API_CALL nv_get_syncpoint_aperture(NvU32, NvU64 *, NvU64 *, NvU32 *);
+
 #if defined(NVCPU_X86_64)

 static inline NvU64 nv_rdtsc(void)
--- a/kernel-open/common/inc/nv_uvm_interface.h
+++ b/kernel-open/common/inc/nv_uvm_interface.h
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 2013-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2013-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -62,10 +62,10 @@ typedef struct
 /*******************************************************************************
    nvUvmInterfaceRegisterGpu

-    Registers the GPU with the provided UUID for use. A GPU must be registered
-    before its UUID can be used with any other API. This call is ref-counted so
-    every nvUvmInterfaceRegisterGpu must be paired with a corresponding
-    nvUvmInterfaceUnregisterGpu.
+    Registers the GPU with the provided physical UUID for use. A GPU must be
+    registered before its UUID can be used with any other API. This call is
+    ref-counted so every nvUvmInterfaceRegisterGpu must be paired with a
+    corresponding nvUvmInterfaceUnregisterGpu.

    You don't need to call nvUvmInterfaceSessionCreate before calling this.

@@ -79,12 +79,13 @@ NV_STATUS nvUvmInterfaceRegisterGpu(const NvProcessorUuid *gpuUuid, UvmGpuPlatfo
 /*******************************************************************************
    nvUvmInterfaceUnregisterGpu

-    Unregisters the GPU with the provided UUID. This drops the ref count from
-    nvUvmInterfaceRegisterGpu. Once the reference count goes to 0 the device may
-    no longer be accessible until the next nvUvmInterfaceRegisterGpu call. No
-    automatic resource freeing is performed, so only make the last unregister
-    call after destroying all your allocations associated with that UUID (such
-    as those from nvUvmInterfaceAddressSpaceCreate).
+    Unregisters the GPU with the provided physical UUID. This drops the ref
+    count from nvUvmInterfaceRegisterGpu. Once the reference count goes to 0
+    the device may no longer be accessible until the next
+    nvUvmInterfaceRegisterGpu call. No automatic resource freeing is performed,
+    so only make the last unregister call after destroying all your allocations
+    associated with that UUID (such as those from
+    nvUvmInterfaceAddressSpaceCreate).

    If the UUID is not found, no operation is performed.
 */
@@ -121,10 +122,10 @@ NV_STATUS nvUvmInterfaceSessionDestroy(uvmGpuSessionHandle session);
    nvUvmInterfaceDeviceCreate

    Creates a device object under the given session for the GPU with the given
-    UUID. Also creates a partition object for the device iff bCreateSmcPartition
-    is true and pGpuInfo->smcEnabled is true. pGpuInfo->smcUserClientInfo will
-    be used to determine the SMC partition in this case. A device handle is
-    returned in the device output parameter.
+    physical UUID. Also creates a partition object for the device iff
+    bCreateSmcPartition is true and pGpuInfo->smcEnabled is true.
+    pGpuInfo->smcUserClientInfo will be used to determine the SMC partition in
+    this case. A device handle is returned in the device output parameter.

    Error codes:
      NV_ERR_GENERIC
@@ -161,6 +162,7 @@ void nvUvmInterfaceDeviceDestroy(uvmGpuDeviceHandle device);
 NV_STATUS nvUvmInterfaceAddressSpaceCreate(uvmGpuDeviceHandle device,
                                           unsigned long long vaBase,
                                           unsigned long long vaSize,
+                                           NvBool enableAts,
                                           uvmGpuAddressSpaceHandle *vaSpace,
                                           UvmGpuAddressSpaceInfo *vaSpaceInfo);

@@ -422,33 +424,6 @@ NV_STATUS nvUvmInterfacePmaPinPages(void *pPma,
                                    NvU64 pageSize,
                                    NvU32 flags);

-/*******************************************************************************
-    nvUvmInterfacePmaUnpinPages
-
-    This function will unpin the physical memory allocated using PMA. The pages
-    passed as input must be already pinned, else this function will return an
-    error and rollback any change if any page is not previously marked "pinned".
-    Behaviour is undefined if any blacklisted pages are unpinned.
-
-    Arguments:
-        pPma[IN]             - Pointer to PMA object.
-        pPages[IN]           - Array of pointers, containing the PA base
-                               address of each page to be unpinned.
-        pageCount [IN]       - Number of pages required to be unpinned.
-        pageSize [IN]        - Page size of each page to be unpinned.
-
-    Error codes:
-        NV_ERR_INVALID_ARGUMENT       - Invalid input arguments.
-        NV_ERR_GENERIC                - Unexpected error. We try hard to avoid
-                                        returning this error code as is not very
-                                        informative.
-        NV_ERR_NOT_SUPPORTED          - Operation not supported on broken FB
-*/
-NV_STATUS nvUvmInterfacePmaUnpinPages(void *pPma,
-                                      NvU64 *pPages,
-                                      NvLength pageCount,
-                                      NvU64 pageSize);
-
 /*******************************************************************************
    nvUvmInterfaceMemoryFree

@@ -638,6 +613,8 @@ NV_STATUS nvUvmInterfaceQueryCopyEnginesCaps(uvmGpuDeviceHandle device,
    nvUvmInterfaceGetGpuInfo

    Return various gpu info, refer to the UvmGpuInfo struct for details.
+    The input UUID is for the physical GPU and the pGpuClientInfo identifies
+    the SMC partition if SMC is enabled and the partition exists.
    If no gpu matching the uuid is found, an error will be returned.

    On Ampere+ GPUs, pGpuClientInfo contains SMC information provided by the
@@ -645,6 +622,9 @@ NV_STATUS nvUvmInterfaceQueryCopyEnginesCaps(uvmGpuDeviceHandle device,

    Error codes:
      NV_ERR_GENERIC
+      NV_ERR_NO_MEMORY
+      NV_ERR_GPU_UUID_NOT_FOUND
+      NV_ERR_INSUFFICIENT_PERMISSIONS
      NV_ERR_INSUFFICIENT_RESOURCES
 */
 NV_STATUS nvUvmInterfaceGetGpuInfo(const NvProcessorUuid *gpuUuid,
@@ -857,7 +837,7 @@ NV_STATUS nvUvmInterfaceGetEccInfo(uvmGpuDeviceHandle device,
        UVM GPU UNLOCK

    Arguments:
-        gpuUuid[IN]          - UUID of the GPU to operate on
+        device[IN]           - Device handle associated with the gpu
        bOwnInterrupts       - Set to NV_TRUE for UVM to take ownership of the
                               replayable page fault interrupts. Set to NV_FALSE
                               to return ownership of the page fault interrupts
@@ -973,14 +953,45 @@ NV_STATUS nvUvmInterfaceGetNonReplayableFaults(UvmGpuFaultInfo *pFaultInfo,
    NOTES:
    - This function DOES NOT acquire the RM API or GPU locks. That is because
    it is called during fault servicing, which could produce deadlocks.
+    - This function should not be called when interrupts are disabled.

    Arguments:
-        device[IN]        - Device handle associated with the gpu
+        pFaultInfo[IN]        - Information provided by RM for fault handling.
+                                Used for obtaining the device handle without locks.
+        bCopyAndFlush[IN]     - Instructs RM to perform the flush in the Copy+Flush mode.
+                                In this mode, RM will perform a copy of the packets from
+                                the HW buffer to UVM's SW buffer as part of performing
+                                the flush. This mode gives UVM the opportunity to observe
+                                the packets contained within the HW buffer at the time
+                                of issuing the call.

    Error codes:
      NV_ERR_INVALID_ARGUMENT
 */
-NV_STATUS nvUvmInterfaceFlushReplayableFaultBuffer(uvmGpuDeviceHandle device);
+NV_STATUS nvUvmInterfaceFlushReplayableFaultBuffer(UvmGpuFaultInfo *pFaultInfo,
+                                                   NvBool bCopyAndFlush);
+
+/*******************************************************************************
+    nvUvmInterfaceTogglePrefetchFaults
+
+    This function sends an RPC to GSP in order to toggle the prefetch fault PRI.
+
+    NOTES:
+    - This function DOES NOT acquire the RM API or GPU locks. That is because
+    it is called during fault servicing, which could produce deadlocks.
+    - This function should not be called when interrupts are disabled.
+
+    Arguments:
+        pFaultInfo[IN]        - Information provided by RM for fault handling.
+                                Used for obtaining the device handle without locks.
+        bEnable[IN]           - Instructs RM whether to enable or disable generating
+                                faults on prefetch.
+
+    Error codes:
+      NV_ERR_INVALID_ARGUMENT
+*/
+NV_STATUS nvUvmInterfaceTogglePrefetchFaults(UvmGpuFaultInfo *pFaultInfo,
+                                             NvBool bEnable);

 /*******************************************************************************
    nvUvmInterfaceInitAccessCntrInfo
@@ -1087,7 +1098,8 @@ void nvUvmInterfaceDeRegisterUvmOps(void);

    Error codes:
      NV_ERR_INVALID_ARGUMENT
-      NV_ERR_OBJECT_NOT_FOUND : If device object associated with the uuids aren't found.
+      NV_ERR_OBJECT_NOT_FOUND : If device object associated with the device
+                                handles isn't found.
 */
 NV_STATUS nvUvmInterfaceP2pObjectCreate(uvmGpuDeviceHandle device1,
                                        uvmGpuDeviceHandle device2,
@@ -1140,6 +1152,8 @@ void nvUvmInterfaceP2pObjectDestroy(uvmGpuSessionHandle session,
        NV_ERR_NOT_READY                - Returned when querying the PTEs requires a deferred setup
                                          which has not yet completed. It is expected that the caller
                                          will reattempt the call until a different code is returned.
+                                          As an example, multi-node systems which require querying
+                                          PTEs from the Fabric Manager may return this code.
 */
 NV_STATUS nvUvmInterfaceGetExternalAllocPtes(uvmGpuAddressSpaceHandle vaSpace,
                                             NvHandle hMemory,
@@ -1449,18 +1463,7 @@ NV_STATUS nvUvmInterfacePagingChannelPushStream(UvmGpuPagingChannelHandle channe
                                                NvU32 methodStreamSize);

 /*******************************************************************************
-    CSL Interface and Locking
-
-    The following functions do not acquire the RM API or GPU locks and must not be called
-    concurrently with the same UvmCslContext parameter in different threads. The caller must
-    guarantee this exclusion.
-
-    * nvUvmInterfaceCslRotateIv
-    * nvUvmInterfaceCslEncrypt
-    * nvUvmInterfaceCslDecrypt
-    * nvUvmInterfaceCslSign
-    * nvUvmInterfaceCslQueryMessagePool
-    * nvUvmInterfaceCslIncrementIv
+    Cryptography Services Library (CSL) Interface
 */

 /*******************************************************************************
@@ -1471,8 +1474,11 @@ NV_STATUS nvUvmInterfacePagingChannelPushStream(UvmGpuPagingChannelHandle channe
    The lifetime of the context is the same as the lifetime of the secure channel
    it is paired with.

+    Locking: This function acquires an API lock.
+    Memory : This function dynamically allocates memory.
+
    Arguments:
-        uvmCslContext[IN/OUT] - The CSL context.
+        uvmCslContext[IN/OUT] - The CSL context associated with a channel.
        channel[IN]           - Handle to a secure channel.

    Error codes:
@@ -1490,30 +1496,62 @@ NV_STATUS nvUvmInterfaceCslInitContext(UvmCslContext *uvmCslContext,

    If context is already deinitialized then function returns immediately.

+    Locking: This function does not acquire an API or GPU lock.
+    Memory : This function may free memory.
+
    Arguments:
-        uvmCslContext[IN] - The CSL context.
+        uvmCslContext[IN] - The CSL context associated with a channel.
 */
 void nvUvmInterfaceDeinitCslContext(UvmCslContext *uvmCslContext);

+/*******************************************************************************
+    nvUvmInterfaceCslRotateKey
+
+    Disables channels and rotates keys.
+
+    This function disables channels and rotates associated keys. The channels
+    associated with the given CSL contexts must be idled before this function is
+    called. To trigger key rotation all allocated channels for a given key must
+    be present in the list. If the function returns successfully then the CSL
+    contexts have been updated with the new key.
+
+    Locking: This function attempts to acquire the GPU lock. In case of failure
+             to acquire the return code is NV_ERR_STATE_IN_USE. The caller must
+             guarantee that no CSL function, including this one, is invoked
+             concurrently with the CSL contexts in contextList.
+    Memory : This function dynamically allocates memory.
+
+    Arguments:
+        contextList[IN/OUT]  - An array of pointers to CSL contexts.
+        contextListCount[IN] - Number of CSL contexts in contextList. Its value
+                               must be greater than 0.
+    Error codes:
+        NV_ERR_INVALID_ARGUMENT - contextList is NULL or contextListCount is 0.
+        NV_ERR_STATE_IN_USE     - Unable to acquire lock / resource. Caller
+                                  can retry at a later time.
+        NV_ERR_GENERIC          - A failure other than _STATE_IN_USE occurred
+                                  when attempting to acquire a lock.
+*/
+NV_STATUS nvUvmInterfaceCslRotateKey(UvmCslContext *contextList[],
+                                     NvU32 contextListCount);
+
 /*******************************************************************************
    nvUvmInterfaceCslRotateIv

    Rotates the IV for a given channel and operation.

    This function will rotate the IV on both the CPU and the GPU.
-    Outstanding messages that have been encrypted by the GPU should first be
-    decrypted before calling this function with operation equal to
-    UVM_CSL_OPERATION_DECRYPT. Similarly, outstanding messages that have been
-    encrypted by the CPU should first be decrypted before calling this function
-    with operation equal to UVM_CSL_OPERATION_ENCRYPT. For a given operation
-    the channel must be idle before calling this function. This function can be
-    called regardless of the value of the IV's message counter.
+    For a given operation the channel must be idle before calling this function.
+    This function can be called regardless of the value of the IV's message counter.

-    See "CSL Interface and Locking" for locking requirements.
-    This function does not perform dynamic memory allocation.
+    Locking: This function attempts to acquire the GPU lock. If the lock cannot be
+             acquired, it returns NV_ERR_STATE_IN_USE. The caller must guarantee
+             that no CSL function, including this one, is invoked concurrently with
+             the same CSL context.
+    Memory : This function does not dynamically allocate memory.

 Arguments:
-        uvmCslContext[IN/OUT] - The CSL context.
+        uvmCslContext[IN/OUT] - The CSL context associated with a channel.
        operation[IN]         - Either
                                - UVM_CSL_OPERATION_ENCRYPT
                                - UVM_CSL_OPERATION_DECRYPT
@@ -1521,7 +1559,11 @@ Arguments:
    Error codes:
      NV_ERR_INSUFFICIENT_RESOURCES - The rotate operation would cause a counter
                                      to overflow.
+      NV_ERR_STATE_IN_USE           - Unable to acquire lock / resource. Caller
+                                      can retry at a later time.
      NV_ERR_INVALID_ARGUMENT       - Invalid value for operation.
+      NV_ERR_GENERIC                - A failure other than _STATE_IN_USE occurred
+                                      when attempting to acquire a lock.
 */
 NV_STATUS nvUvmInterfaceCslRotateIv(UvmCslContext *uvmCslContext,
                                    UvmCslOperation operation);
@@ -1538,11 +1580,13 @@ NV_STATUS nvUvmInterfaceCslRotateIv(UvmCslContext *uvmCslContext,
    The encryptIV can be obtained from nvUvmInterfaceCslIncrementIv.
    However, it is optional. If it is NULL, the next IV in line will be used.

-    See "CSL Interface and Locking" for locking requirements.
-    This function does not perform dynamic memory allocation.
+    Locking: This function does not acquire an API or GPU lock.
+             The caller must guarantee that no CSL function, including this one,
+             is invoked concurrently with the same CSL context.
+    Memory : This function does not dynamically allocate memory.

 Arguments:
-        uvmCslContext[IN/OUT] - The CSL context.
+        uvmCslContext[IN/OUT] - The CSL context associated with a channel.
        bufferSize[IN]        - Size of the input and output buffers in
                                units of bytes. Value can range from 1 byte
                                to (2^32) - 1 bytes.
@@ -1553,8 +1597,9 @@ Arguments:
                                Its size is UVM_CSL_CRYPT_AUTH_TAG_SIZE_BYTES.

    Error codes:
-      NV_ERR_INVALID_ARGUMENT       - The size of the data is 0 bytes.
-                                    - The encryptIv has already been used.
+      NV_ERR_INVALID_ARGUMENT - The CSL context is not associated with a channel.
+                              - The size of the data is 0 bytes.
+                              - The encryptIv has already been used.
 */
 NV_STATUS nvUvmInterfaceCslEncrypt(UvmCslContext *uvmCslContext,
                                   NvU32 bufferSize,
@@ -1573,8 +1618,15 @@ NV_STATUS nvUvmInterfaceCslEncrypt(UvmCslContext *uvmCslContext,
    maximized when the input and output buffers are 16-byte aligned. This is
    natural alignment for AES block.

-    See "CSL Interface and Locking" for locking requirements.
-    This function does not perform dynamic memory allocation.
+    During a key rotation event the previous key is stored in the CSL context.
+    This allows data encrypted by the GPU to be decrypted with the previous key.
+    The keyRotationId parameter identifies which key is used. Key rotation IDs
+    start at 0 and increment by one for each key rotation event.
+
+    Locking: This function does not acquire an API or GPU lock.
+             The caller must guarantee that no CSL function, including this one,
+             is invoked concurrently with the same CSL context.
+    Memory : This function does not dynamically allocate memory.

    Arguments:
        uvmCslContext[IN/OUT] - The CSL context.
@@ -1583,6 +1635,8 @@ NV_STATUS nvUvmInterfaceCslEncrypt(UvmCslContext *uvmCslContext,
        decryptIv[IN]         - IV used to decrypt the ciphertext. Its value can either be given by
                                nvUvmInterfaceCslIncrementIv, or, if NULL, the CSL context's
                                internal counter is used.
+        keyRotationId[IN]     - Specifies the key that is used for decryption.
+                                A value of NV_U32_MAX specifies the current key.
        inputBuffer[IN]       - Address of ciphertext input buffer.
        outputBuffer[OUT]     - Address of plaintext output buffer.
        addAuthData[IN]       - Address of the plaintext additional authenticated data used to
@@ -1603,6 +1657,7 @@ NV_STATUS nvUvmInterfaceCslDecrypt(UvmCslContext *uvmCslContext,
                                   NvU32 bufferSize,
                                   NvU8 const *inputBuffer,
                                   UvmCslIv const *decryptIv,
+                                   NvU32 keyRotationId,
                                   NvU8 *outputBuffer,
                                   NvU8 const *addAuthData,
                                   NvU32 addAuthDataSize,
@@ -1616,11 +1671,13 @@ NV_STATUS nvUvmInterfaceCslDecrypt(UvmCslContext *uvmCslContext,
    Auth and input buffers must not overlap. If they do then calling this function produces
    undefined behavior.

-    See "CSL Interface and Locking" for locking requirements.
-    This function does not perform dynamic memory allocation.
+    Locking: This function does not acquire an API or GPU lock.
+             The caller must guarantee that no CSL function, including this one,
+             is invoked concurrently with the same CSL context.
+    Memory : This function does not dynamically allocate memory.

    Arguments:
-        uvmCslContext[IN/OUT] - The CSL context.
+        uvmCslContext[IN/OUT] - The CSL context associated with a channel.
        bufferSize[IN]        - Size of the input buffer in units of bytes.
                                Value can range from 1 byte to (2^32) - 1 bytes.
        inputBuffer[IN]       - Address of plaintext input buffer.
@@ -1629,7 +1686,8 @@ NV_STATUS nvUvmInterfaceCslDecrypt(UvmCslContext *uvmCslContext,

    Error codes:
      NV_ERR_INSUFFICIENT_RESOURCES - The signing operation would cause a counter overflow to occur.
-      NV_ERR_INVALID_ARGUMENT       - The size of the data is 0 bytes.
+      NV_ERR_INVALID_ARGUMENT       - The CSL context is not associated with a channel.
+                                    - The size of the data is 0 bytes.
 */
 NV_STATUS nvUvmInterfaceCslSign(UvmCslContext *uvmCslContext,
                                NvU32 bufferSize,
@@ -1641,8 +1699,10 @@ NV_STATUS nvUvmInterfaceCslSign(UvmCslContext *uvmCslContext,

    Returns the number of messages that can be encrypted before the message counter will overflow.

-    See "CSL Interface and Locking" for locking requirements.
-    This function does not perform dynamic memory allocation.
+    Locking: This function does not acquire an API or GPU lock.
+             The caller must guarantee that no CSL function, including this one,
+             is invoked concurrently with the same CSL context.
+    Memory : This function does not dynamically allocate memory.

    Arguments:
        uvmCslContext[IN/OUT] - The CSL context.
@@ -1666,8 +1726,10 @@ NV_STATUS nvUvmInterfaceCslQueryMessagePool(UvmCslContext *uvmCslContext,
    can be used in nvUvmInterfaceCslEncrypt. If operation is UVM_CSL_OPERATION_DECRYPT then
    the returned IV can be used in nvUvmInterfaceCslDecrypt.

-    See "CSL Interface and Locking" for locking requirements.
-    This function does not perform dynamic memory allocation.
+    Locking: This function does not acquire an API or GPU lock.
+             The caller must guarantee that no CSL function, including this one,
+             is invoked concurrently with the same CSL context.
+    Memory : This function does not dynamically allocate memory.

 Arguments:
        uvmCslContext[IN/OUT] - The CSL context.
@@ -1675,7 +1737,7 @@ Arguments:
                                - UVM_CSL_OPERATION_ENCRYPT
                                - UVM_CSL_OPERATION_DECRYPT
        increment[IN]         - The amount by which the IV is incremented. Can be 0.
-        iv[out]               - If non-NULL, a buffer to store the incremented IV.
+        iv[OUT]               - If non-NULL, a buffer to store the incremented IV.

    Error codes:
      NV_ERR_INVALID_ARGUMENT       - The value of the operation parameter is illegal.
@@ -1687,4 +1749,42 @@ NV_STATUS nvUvmInterfaceCslIncrementIv(UvmCslContext *uvmCslContext,
                                       NvU64 increment,
                                       UvmCslIv *iv);

+/*******************************************************************************
+    nvUvmInterfaceCslLogEncryption
+
+    Checks and logs information about encryptions associated with the given
+    CSL context.
+
+    For contexts associated with channels, this function does not modify elements of
+    the UvmCslContext, and must be called for every CPU/GPU encryption.
+
+    For the context associated with fault buffers, bufferSize can encompass multiple
+    encryption invocations, and the UvmCslContext will be updated following a key
+    rotation event.
+
+    In either case the IV remains unmodified after this function is called.
+
+    Locking: This function does not acquire an API or GPU lock.
+             The caller must guarantee that no CSL function, including this one,
+             is invoked concurrently with the same CSL context.
+    Memory : This function does not dynamically allocate memory.
+
+    Arguments:
+        uvmCslContext[IN/OUT] - The CSL context.
+        operation[IN]         - If the CSL context is associated with a fault
+                                buffer, this argument is ignored. If it is
+                                associated with a channel, it must be either
+                                - UVM_CSL_OPERATION_ENCRYPT
+                                - UVM_CSL_OPERATION_DECRYPT
+        bufferSize[IN]        - The size of the buffer(s) encrypted by the
+                                external entity in units of bytes.
+
+    Error codes:
+      NV_ERR_INSUFFICIENT_RESOURCES - The encryption would cause a counter
+                                      to overflow.
+*/
+NV_STATUS nvUvmInterfaceCslLogEncryption(UvmCslContext *uvmCslContext,
+                                         UvmCslOperation operation,
+                                         NvU32 bufferSize);
+
 #endif // _NV_UVM_INTERFACE_H_
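The locking notes above say that nvUvmInterfaceCslRotateKey and nvUvmInterfaceCslRotateIv return NV_ERR_STATE_IN_USE when the GPU lock cannot be acquired, and that the caller can retry later. A minimal sketch of such a retry loop follows. Every `Demo`/`demo_` name is a hypothetical stand-in (the real status codes and functions are not used here), and the retry bound is an assumption, not something this patch prescribes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real NV_STATUS codes live in nvstatuscodes.h. */
typedef enum {
    DEMO_OK = 0,
    DEMO_ERR_STATE_IN_USE,
    DEMO_ERR_GENERIC
} DemoStatus;

/* Hypothetical callback shape standing in for a rotate call. */
typedef DemoStatus (*DemoRotateFn)(void *ctx);

/* Retry only while the lock is contended; any other status is final. */
static inline DemoStatus demo_rotate_with_retry(DemoRotateFn rotate, void *ctx,
                                                unsigned maxAttempts,
                                                unsigned *attemptsOut)
{
    DemoStatus status = DEMO_ERR_STATE_IN_USE;
    unsigned attempts = 0;

    assert(rotate != NULL && maxAttempts > 0);

    while (attempts < maxAttempts) {
        status = rotate(ctx);
        attempts++;
        if (status != DEMO_ERR_STATE_IN_USE)
            break;
        /* A real caller would back off or reschedule here. */
    }

    if (attemptsOut != NULL)
        *attemptsOut = attempts;
    return status;
}

/* Mock rotate: reports contention until the countdown reaches zero. */
static inline DemoStatus demo_mock_rotate(void *ctx)
{
    unsigned *busyCount = ctx;

    if (*busyCount > 0) {
        (*busyCount)--;
        return DEMO_ERR_STATE_IN_USE;
    }
    return DEMO_OK;
}
```

Bounding the attempts keeps a permanently contended lock from turning into an unbounded spin; a real caller would also have to keep the CSL contexts free of concurrent CSL calls for the whole loop, as the locking notes require.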
--- a/kernel-open/common/inc/nv_uvm_types.h
+++ b/kernel-open/common/inc/nv_uvm_types.h
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 2014-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2014-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -39,12 +39,13 @@
 // are multiple BIG page sizes in RM. These defines are used as flags to "0"
 // should be OK when user is not sure which pagesize allocation it wants
 //
-#define UVM_PAGE_SIZE_DEFAULT    0x0
-#define UVM_PAGE_SIZE_4K         0x1000
-#define UVM_PAGE_SIZE_64K        0x10000
-#define UVM_PAGE_SIZE_128K       0x20000
-#define UVM_PAGE_SIZE_2M         0x200000
-#define UVM_PAGE_SIZE_512M       0x20000000
+#define UVM_PAGE_SIZE_DEFAULT    0x0ULL
+#define UVM_PAGE_SIZE_4K         0x1000ULL
+#define UVM_PAGE_SIZE_64K        0x10000ULL
+#define UVM_PAGE_SIZE_128K       0x20000ULL
+#define UVM_PAGE_SIZE_2M         0x200000ULL
+#define UVM_PAGE_SIZE_512M       0x20000000ULL
+#define UVM_PAGE_SIZE_256G       0x4000000000ULL

 //
 // When modifying flags, make sure they are compatible with the mirrored
@@ -131,6 +132,8 @@ typedef struct UvmGpuMemoryInfo_tag
    //      This is only valid if deviceDescendant is NV_TRUE.
    //      When egm is NV_TRUE, this is also the UUID of the GPU
    //      for which EGM is local.
+    //      If the GPU has SMC enabled, the UUID is the GI UUID.
+    //      Otherwise, it is the UUID for the physical GPU.
    //      Note: If the allocation is owned by a device in
    //      an SLI group and the allocation is broadcast
    //      across the SLI group, this UUID will be any one
@@ -265,6 +268,7 @@ typedef struct UvmGpuChannelInfo_tag

    // The errorNotifier is filled out when the channel hits an RC error.
    NvNotification    *errorNotifier;
+    NvNotification    *keyRotationNotifier;

    NvU32              hwRunlistId;
    NvU32              hwChannelId;
@@ -290,13 +294,13 @@ typedef struct UvmGpuChannelInfo_tag

    // GPU VAs of both GPFIFO and GPPUT are needed in Confidential Computing
    // so a channel can be controlled via another channel (SEC2 or WLC/LCIC)
-    NvU64             gpFifoGpuVa;
-    NvU64             gpPutGpuVa;
-    NvU64             gpGetGpuVa;
+    NvU64              gpFifoGpuVa;
+    NvU64              gpPutGpuVa;
+    NvU64              gpGetGpuVa;
    // GPU VA of work submission offset is needed in Confidential Computing
    // so CE channels can ring doorbell of other channels as required for
    // WLC/LCIC work submission
-    NvU64             workSubmissionOffsetGpuVa;
+    NvU64              workSubmissionOffsetGpuVa;
 } UvmGpuChannelInfo;

 typedef enum
@@ -392,6 +396,7 @@ typedef enum
    UVM_LINK_TYPE_NVLINK_2,
    UVM_LINK_TYPE_NVLINK_3,
    UVM_LINK_TYPE_NVLINK_4,
+    UVM_LINK_TYPE_NVLINK_5,
    UVM_LINK_TYPE_C2C,
 } UVM_LINK_TYPE;

@@ -544,6 +549,10 @@ typedef struct UvmGpuP2PCapsParams_tag
    // the GPUs are direct peers.
    NvU32 peerIds[2];

+    // Out: egmPeerIds[i] contains gpu[i]'s EGM peer id of gpu[1 - i]. Only defined
+    // if the GPUs are direct peers and EGM is enabled in the system.
+    NvU32 egmPeerIds[2];
+
    // Out: UVM_LINK_TYPE
    NvU32 p2pLink;

@@ -559,11 +568,6 @@ typedef struct UvmGpuP2PCapsParams_tag
    // second, not taking into account the protocols overhead. The reported
    // bandwidth for indirect peers is zero.
    NvU32 totalLinkLineRateMBps;
-
-    // Out: True if the peers have a indirect link to communicate. On P9
-    // systems, this is true if peers are connected to different NPUs that
-    // forward the requests between them.
-    NvU32 indirectAccess      : 1;
 } UvmGpuP2PCapsParams;

 // Platform-wide information
@@ -572,8 +576,11 @@ typedef struct UvmPlatformInfo_tag
    // Out: ATS (Address Translation Services) is supported
    NvBool atsSupported;

-    // Out: AMD SEV (Secure Encrypted Virtualization) is enabled
-    NvBool sevEnabled;
+    // Out: True if HW trusted execution, such as AMD's SEV-SNP or Intel's TDX,
+    // is enabled in the VM, indicating that Confidential Computing must also
+    // be enabled in the GPU(s); these two security features are either both
+    // enabled, or both disabled.
+    NvBool confComputingEnabled;
 } UvmPlatformInfo;

 typedef struct UvmGpuClientInfo_tag
@@ -595,6 +602,8 @@ typedef struct UvmGpuConfComputeCaps_tag
 {
    // Out: GPU's confidential compute mode
    UvmGpuConfComputeMode mode;
+    // Out: True if key rotation is enabled for UVM keys
+    NvBool bKeyRotationEnabled;
 } UvmGpuConfComputeCaps;

 #define UVM_GPU_NAME_LENGTH 0x40
@@ -604,7 +613,8 @@ typedef struct UvmGpuInfo_tag
    // Printable gpu name
    char name[UVM_GPU_NAME_LENGTH];

-    // Uuid of this gpu
+    // Uuid of the physical GPU or GI UUID if nvUvmInterfaceGetGpuInfo()
+    // requested information for a valid SMC partition.
    NvProcessorUuid uuid;

    // Gpu architecture; NV2080_CTRL_MC_ARCH_INFO_ARCHITECTURE_*
@@ -688,8 +698,21 @@ typedef struct UvmGpuInfo_tag
    NvU64 nvswitchMemoryWindowStart;

    // local EGM properties
+    // NV_TRUE if EGM is enabled
    NvBool   egmEnabled;
+
+    // Peer ID to reach local EGM when EGM is enabled
    NvU8     egmPeerId;
+
+    // EGM base address to offset in the GMMU PTE entry for EGM mappings
+    NvU64    egmBaseAddr;
+
+    // If connectedToSwitch is NV_TRUE,
+    // nvswitchEgmMemoryWindowStart tells the base address for the GPU's EGM memory in the
+    // NVSwitch address space. It is used when creating PTEs of GPU memory mappings
+    // to NVSwitch peers.
+    NvU64 nvswitchEgmMemoryWindowStart;
+
 } UvmGpuInfo;

 typedef struct UvmGpuFbInfo_tag
@@ -698,9 +721,10 @@ typedef struct UvmGpuFbInfo_tag
    // RM regions that are not registered with PMA either.
    NvU64 maxAllocatableAddress;

-    NvU32 heapSize;         // RAM in KB available for user allocations
-    NvU32 reservedHeapSize; // RAM in KB reserved for internal RM allocation
-    NvBool bZeroFb;         // Zero FB mode enabled.
+    NvU32 heapSize;          // RAM in KB available for user allocations
+    NvU32 reservedHeapSize;  // RAM in KB reserved for internal RM allocation
+    NvBool bZeroFb;          // Zero FB mode enabled.
+    NvU64 maxVidmemPageSize; // Largest GPU page size to access vidmem.
 } UvmGpuFbInfo;

 typedef struct UvmGpuEccInfo_tag
@@ -778,14 +802,14 @@ typedef NV_STATUS (*uvmEventResume_t) (void);
 /*******************************************************************************
    uvmEventStartDevice
    This function will be called by the GPU driver once it has finished its
-    initialization to tell the UVM driver that this GPU has come up.
+    initialization to tell the UVM driver that this physical GPU has come up.
 */
 typedef NV_STATUS (*uvmEventStartDevice_t) (const NvProcessorUuid *pGpuUuidStruct);

 /*******************************************************************************
    uvmEventStopDevice
-    This function will be called by the GPU driver to let UVM know that a GPU
-    is going down.
+    This function will be called by the GPU driver to let UVM know that a
+    physical GPU is going down.
 */
 typedef NV_STATUS (*uvmEventStopDevice_t) (const NvProcessorUuid *pGpuUuidStruct);

@@ -816,7 +840,7 @@ typedef NV_STATUS (*uvmEventServiceInterrupt_t) (void *pDeviceObject,
 /*******************************************************************************
    uvmEventIsrTopHalf_t
    This function will be called by the GPU driver to let UVM know
-    that an interrupt has occurred.
+    that an interrupt has occurred on the given physical GPU.

    Returns:
        NV_OK if the UVM driver handled the interrupt
@@ -923,11 +947,6 @@ typedef struct UvmGpuFaultInfo_tag
        // CSL context used for performing decryption of replayable faults when
        // Confidential Computing is enabled.
        UvmCslContext cslCtx;
-
-        // Indicates whether UVM owns the replayable fault buffer.
-        // The value of this field is always NV_TRUE When Confidential Computing
-        // is disabled.
-        NvBool bUvmOwnsHwFaultBuffer;
    } replayable;
    struct
    {
@@ -1074,4 +1093,21 @@ typedef enum UvmCslOperation
    UVM_CSL_OPERATION_DECRYPT
 } UvmCslOperation;

+typedef enum UVM_KEY_ROTATION_STATUS {
+    // Key rotation complete/not in progress
+    UVM_KEY_ROTATION_STATUS_IDLE = 0,
+    // RM is waiting for clients to report their channels are idle for key rotation
+    UVM_KEY_ROTATION_STATUS_PENDING = 1,
+    // Key rotation is in progress
+    UVM_KEY_ROTATION_STATUS_IN_PROGRESS = 2,
+    // Key rotation timeout failure, RM will RC non-idle channels.
+    // UVM should never see this status value.
+    UVM_KEY_ROTATION_STATUS_FAILED_TIMEOUT = 3,
+    // Key rotation failed because upper threshold was crossed, RM will RC non-idle channels
+    UVM_KEY_ROTATION_STATUS_FAILED_THRESHOLD = 4,
+    // Internal RM failure while rotating keys for a certain channel, RM will RC the channel.
+    UVM_KEY_ROTATION_STATUS_FAILED_ROTATION = 5,
+    UVM_KEY_ROTATION_STATUS_MAX_COUNT = 6,
+} UVM_KEY_ROTATION_STATUS;
+
 #endif // _NV_UVM_TYPES_H_
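The UVM_PAGE_SIZE_* hunk above adds a ULL suffix to every constant and introduces a 256G value that cannot fit in 32 bits. The sketch below copies those values and shows that arithmetic on them is carried out in 64 bits. The `demo_` helpers are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the UVM_PAGE_SIZE_* hunk. */
#define UVM_PAGE_SIZE_DEFAULT    0x0ULL
#define UVM_PAGE_SIZE_4K         0x1000ULL
#define UVM_PAGE_SIZE_64K        0x10000ULL
#define UVM_PAGE_SIZE_128K       0x20000ULL
#define UVM_PAGE_SIZE_2M         0x200000ULL
#define UVM_PAGE_SIZE_512M       0x20000000ULL
#define UVM_PAGE_SIZE_256G       0x4000000000ULL

/* Hypothetical helper: true if pageSize is a nonzero power of two. */
static inline int demo_is_pow2(uint64_t pageSize)
{
    return pageSize != 0 && (pageSize & (pageSize - 1)) == 0;
}

/* Hypothetical helper: bytes covered by `count` pages of `pageSize`.
 * Because the constants are unsigned long long, the multiplication is
 * performed in 64 bits even when the caller passes a small page size. */
static inline uint64_t demo_span_bytes(uint64_t pageSize, uint32_t count)
{
    assert(demo_is_pow2(pageSize));
    return pageSize * count;
}
```

With the previous unsuffixed constants, an expression such as `UVM_PAGE_SIZE_512M * 16` would have been evaluated as `int` and overflowed; with the suffix it yields 8 GiB.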
--- a/kernel-open/common/inc/nvkms-api-types.h
+++ b/kernel-open/common/inc/nvkms-api-types.h
@@ -58,6 +58,7 @@ typedef NvU32 NvKmsFrameLockHandle;
 typedef NvU32 NvKmsDeferredRequestFifoHandle;
 typedef NvU32 NvKmsSwapGroupHandle;
 typedef NvU32 NvKmsVblankSyncObjectHandle;
+typedef NvU32 NvKmsVblankSemControlHandle;

 struct NvKmsSize {
    NvU16 width;
@@ -439,9 +440,9 @@ struct NvKmsLayerCapabilities {
    NvBool supportsWindowMode              :1;

    /*!
-     * Whether layer supports HDR pipe.
+     * Whether layer supports ICtCp pipe.
     */
-    NvBool supportsHDR                     :1;
+    NvBool supportsICtCp                   :1;


    /*!
--- a/kernel-open/common/inc/nvkms-kapi.h
+++ b/kernel-open/common/inc/nvkms-kapi.h
@@ -158,13 +158,17 @@ struct NvKmsKapiDeviceResourcesInfo {

        NvU32 hasVideoMemory;

+        NvU32 numDisplaySemaphores;
+
        NvU8  genericPageKind;

        NvBool  supportsSyncpts;
+
+        NvBool requiresVrrSemaphores;
    } caps;

    NvU64 supportedSurfaceMemoryFormats[NVKMS_KAPI_LAYER_MAX];
-    NvBool supportsHDR[NVKMS_KAPI_LAYER_MAX];
+    NvBool supportsICtCp[NVKMS_KAPI_LAYER_MAX];
 };

 #define NVKMS_KAPI_LAYER_MASK(layerType) (1 << (layerType))
@@ -210,18 +214,26 @@ struct NvKmsKapiStaticDisplayInfo {
    NvU32 headMask;
 };

-struct NvKmsKapiSyncpt {
+struct NvKmsKapiSyncParams {
+    union {
+        struct {
+            /*!
+             * Possible syncpt use case in kapi.
+             * For pre-syncpt, use only id and value
+             * and for post-syncpt, use only fd.
+             */
+            NvU32   preSyncptId;
+            NvU32   preSyncptValue;
+        } syncpt;

-    /*!
-     * Possible syncpt use case in kapi.
-     * For pre-syncpt, use only id and value
-     * and for post-syncpt, use only fd.
-     */
-    NvBool  preSyncptSpecified;
-    NvU32   preSyncptId;
-    NvU32   preSyncptValue;
+        struct {
+            NvU32 index;
+        } semaphore;
+    } u;

-    NvBool  postSyncptRequested;
+    NvBool preSyncptSpecified;
+    NvBool postSyncptRequested;
+    NvBool semaphoreSpecified;
 };

 struct NvKmsKapiLayerConfig {
@@ -231,7 +243,7 @@ struct NvKmsKapiLayerConfig {
        NvU8 surfaceAlpha;
    } compParams;
    struct NvKmsRRParams rrParams;
-    struct NvKmsKapiSyncpt syncptParams;
+    struct NvKmsKapiSyncParams syncParams;

    struct {
        struct NvKmsHDRStaticMetadata val;
@@ -319,7 +331,6 @@ struct NvKmsKapiHeadModeSetConfig {

    struct {
        struct {
-            NvBool specified;
            NvU32 depth;
            NvU32 start;
            NvU32 end;
@@ -327,7 +338,6 @@ struct NvKmsKapiHeadModeSetConfig {
        } input;

        struct {
-            NvBool specified;
            NvBool enabled;
            struct NvKmsLutRamps *pRamps;
        } output;
@@ -342,7 +352,8 @@ struct NvKmsKapiHeadRequestedConfig {
        NvBool modeChanged         : 1;
        NvBool hdrInfoFrameChanged : 1;
        NvBool colorimetryChanged  : 1;
-        NvBool lutChanged      : 1;
+        NvBool ilutChanged         : 1;
+        NvBool olutChanged         : 1;
    } flags;

    struct NvKmsKapiCursorRequestedConfig cursorRequestedConfig;
@@ -368,6 +379,8 @@ struct NvKmsKapiHeadReplyConfig {

 struct NvKmsKapiModeSetReplyConfig {
    enum NvKmsFlipResult flipResult;
+    NvBool vrrFlip;
+    NvS32 vrrSemaphoreIndex;
    struct NvKmsKapiHeadReplyConfig
        headReplyConfig[NVKMS_KAPI_MAX_HEADS];
 };
@@ -490,6 +503,8 @@ typedef enum NvKmsKapiRegisterWaiterResultRec {
    NVKMS_KAPI_REG_WAITER_ALREADY_SIGNALLED,
 } NvKmsKapiRegisterWaiterResult;

+typedef void NvKmsKapiSuspendResumeCallbackFunc(NvBool suspend);
+
 struct NvKmsKapiFunctionsTable {

    /*!
@@ -1399,6 +1414,96 @@ struct NvKmsKapiFunctionsTable {
        NvU64 index,
        NvU64 new_value
    );
+
+    /*!
+     * Set the callback function for suspending and resuming the display system.
+     */
+    void
+    (*setSuspendResumeCallback)
+    (
+        NvKmsKapiSuspendResumeCallbackFunc *function
+    );
+
+    /*!
+     * Immediately reset the specified display semaphore to the pending state.
+     *
+     * Must be called prior to applying a mode set that utilizes the specified
+     * display semaphore for synchronization.
+     *
+     * \param [in] device         The device which will utilize the semaphore.
+     *
+     * \param [in] semaphoreIndex Index of the desired semaphore within the
+     *                            NVKMS semaphore pool. Must be less than
+     *                            NvKmsKapiDeviceResourcesInfo::caps::numDisplaySemaphores
+     *                            for the specified device.
+     */
+    NvBool
+    (*resetDisplaySemaphore)
+    (
+        struct NvKmsKapiDevice *device,
+        NvU32 semaphoreIndex
+    );
+
+    /*!
+     * Immediately set the specified display semaphore to the displayable state.
+     *
+     * Must be called after \ref resetDisplaySemaphore to indicate a mode
+     * configuration change that utilizes the specified display semaphore for
+     * synchronization may proceed.
+     *
+     * \param [in] device         The device which will utilize the semaphore.
+     *
+     * \param [in] semaphoreIndex Index of the desired semaphore within the
+     *                            NVKMS semaphore pool. Must be less than
+     *                            NvKmsKapiDeviceResourcesInfo::caps::numDisplaySemaphores
+     *                            for the specified device.
+     */
+    void
+    (*signalDisplaySemaphore)
+    (
+        struct NvKmsKapiDevice *device,
+        NvU32 semaphoreIndex
+    );
+
+    /*!
+     * Immediately cancel use of a display semaphore by resetting its value to
+     * its initial state.
+     *
+     * This can be used by clients to restore a semaphore to a consistent state
+     * when they have prepared it for use by previously calling
+     * \ref resetDisplaySemaphore() on it, but are then prevented from
+     * submitting the associated hardware operations to consume it due to the
+     * subsequent failure of some software or hardware operation.
+     *
+     * \param [in] device         The device which will utilize the semaphore.
+     *
+     * \param [in] semaphoreIndex Index of the desired semaphore within the
+     *                            NVKMS semaphore pool. Must be less than
+     *                            NvKmsKapiDeviceResourcesInfo::caps::numDisplaySemaphores
+     *                            for the specified device.
+     */
+    void
+    (*cancelDisplaySemaphore)
+    (
+        struct NvKmsKapiDevice *device,
+        NvU32 semaphoreIndex
+    );
+
+    /*!
+     * Signal the VRR semaphore at the specified index from the CPU.
+     * If the device does not support VRR semaphores, this is a no-op.
+     * Returns true if the signal succeeds or is a no-op; otherwise returns false.
+     *
+     * \param [in]  device  A device allocated using allocateDevice().
+     *
+     * \param [in]  index   The VRR semaphore index to be signalled.
+     */
+    NvBool
+    (*signalVrrSemaphore)
+    (
+        struct NvKmsKapiDevice *device,
+        NvS32 index
+    );
 };

 /** @} */
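The comments on resetDisplaySemaphore, signalDisplaySemaphore and cancelDisplaySemaphore describe a small lifecycle: reset before the mode set, signal once the change may proceed, cancel if the hardware operations could not be submitted. The toy model below sketches that ordering. It does not call NVKMS: the state names, the `DemoDevice` type and the `submit` callback are all hypothetical, and the pool size merely stands in for `caps.numDisplaySemaphores`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical states named after the wording in the header comments. */
typedef enum {
    DEMO_SEM_INITIAL,
    DEMO_SEM_PENDING,
    DEMO_SEM_DISPLAYABLE
} DemoSemState;

#define DEMO_NUM_SEMAPHORES 4u /* stands in for caps.numDisplaySemaphores */

typedef struct {
    DemoSemState sems[DEMO_NUM_SEMAPHORES];
} DemoDevice;

/* Model of reset: rejects out-of-range indices, else marks pending. */
static inline bool demo_reset(DemoDevice *dev, uint32_t index)
{
    if (index >= DEMO_NUM_SEMAPHORES)
        return false;
    dev->sems[index] = DEMO_SEM_PENDING;
    return true;
}

/* Model of signal: the pending change may now proceed. */
static inline void demo_signal(DemoDevice *dev, uint32_t index)
{
    assert(index < DEMO_NUM_SEMAPHORES);
    dev->sems[index] = DEMO_SEM_DISPLAYABLE;
}

/* Model of cancel: restore the semaphore to its initial state. */
static inline void demo_cancel(DemoDevice *dev, uint32_t index)
{
    assert(index < DEMO_NUM_SEMAPHORES);
    dev->sems[index] = DEMO_SEM_INITIAL;
}

/* Reset, try to submit, then signal on success or cancel on failure. */
static inline bool demo_flip(DemoDevice *dev, uint32_t index,
                             bool (*submit)(void))
{
    if (!demo_reset(dev, index))
        return false;
    if (!submit()) {
        demo_cancel(dev, index);
        return false;
    }
    demo_signal(dev, index);
    return true;
}

static inline bool demo_submit_ok(void)   { return true; }
static inline bool demo_submit_fail(void) { return false; }
```

The point of the cancel path is that a failed submission never leaves a semaphore stuck in the pending state.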
--- a/kernel-open/common/inc/nvmisc.h
+++ b/kernel-open/common/inc/nvmisc.h
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 1993-2020 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 1993-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -67,6 +67,9 @@ extern "C" {
 #define NVBIT64(b)                NVBIT_TYPE(b, NvU64)
 #endif

+// Concatenate two 32-bit values into a 64-bit value
+#define NV_CONCAT_32_TO_64(hi, lo) ((((NvU64)(hi)) << 32) | ((NvU64)(lo)))
+
 // Helper macro's for 32 bit bitmasks
 #define NV_BITMASK32_ELEMENT_SIZE            (sizeof(NvU32) << 3)
 #define NV_BITMASK32_IDX(chId)               (((chId) & ~(0x1F)) >> 5)  
@@ -494,6 +497,23 @@ do                                                      \
 //
 #define NV_TWO_N_MINUS_ONE(n) (((1ULL<<(n/2))<<((n+1)/2))-1)

+//
+// Create a 64b bitmask with n bits set
+// This is the same as ((1ULL<<n) - 1), but it doesn't overflow for n=64
+//
+// ...
+// n=-1, 0x0000000000000000
+// n=0,  0x0000000000000000
+// n=1,  0x0000000000000001
+// ...
+// n=63, 0x7FFFFFFFFFFFFFFF
+// n=64, 0xFFFFFFFFFFFFFFFF
+// n=65, 0xFFFFFFFFFFFFFFFF
+// n=66, 0xFFFFFFFFFFFFFFFF
+// ...
+//
+#define NV_BITMASK64(n) (((n) < 1) ? 0ULL : (NV_U64_MAX >> (((n) > 64) ? 0 : (64 - (n)))))
+
 #define DRF_READ_1WORD_BS(d,r,f,v) \
    ((DRF_EXTENT_MW(NV##d##r##f)<8)?DRF_READ_1BYTE_BS(NV##d##r##f,(v)): \
    ((DRF_EXTENT_MW(NV##d##r##f)<16)?DRF_READ_2BYTE_BS(NV##d##r##f,(v)): \
@@ -574,6 +594,13 @@ nvMaskPos32(const NvU32 mask, const NvU32 bitIdx)
    n32 = BIT_IDX_32(LOWESTBIT(n32));\
 }

+// Destructive operation on n64
+#define LOWESTBITIDX_64(n64)         \
+{                                    \
+    n64 = BIT_IDX_64(LOWESTBIT(n64));\
+}
+
+
 // Destructive operation on n32
 #define HIGHESTBITIDX_32(n32)   \
 {                               \
@@ -918,6 +945,14 @@ static NV_FORCEINLINE void *NV_NVUPTR_TO_PTR(NvUPtr address)
 // Use (lo) if (b) is less than 64, and (hi) if >= 64.
 //
 #define NV_BIT_SET_128(b, lo, hi)              { nvAssert( (b) < 128 ); if ( (b) < 64 ) (lo) |= NVBIT64(b); else (hi) |= NVBIT64( b & 0x3F ); }
+//
+// Clear the bit at pos (b) for U64 which is < 128.
+// Use (lo) if (b) is less than 64, and (hi) if >= 64.
+//
+#define NV_BIT_CLEAR_128(b, lo, hi)            { nvAssert( (b) < 128 ); if ( (b) < 64 ) (lo) &= ~NVBIT64(b); else (hi) &= ~NVBIT64( (b) & 0x3F ); }
+
+// Get the number of elements in the specified fixed-size array
+#define NV_ARRAY_ELEMENTS(x)                   ((sizeof(x)/sizeof((x)[0])))

 #ifdef __cplusplus
 }
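The table in the NV_BITMASK64 comment can be checked directly. The sketch below uses `DEMO_`-prefixed variants of the two new macros with every macro argument parenthesized, so that expression arguments such as `a + b` behave as expected. The typedef and NV_U64_MAX definition are stand-ins for what nvtypes.h provides, and the wrapper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t NvU64;        /* stand-in for the nvtypes.h typedef */
#define NV_U64_MAX UINT64_MAX  /* stand-in for the nvtypes.h limit   */

/* Same expressions as the nvmisc.h hunk, with arguments parenthesized. */
#define DEMO_CONCAT_32_TO_64(hi, lo) ((((NvU64)(hi)) << 32) | ((NvU64)(lo)))
#define DEMO_BITMASK64(n) \
    (((n) < 1) ? 0ULL : (NV_U64_MAX >> (((n) > 64) ? 0 : (64 - (n)))))

/* Runtime wrappers so the shift count is never a compile-time constant. */
static inline NvU64 demo_mask64(int n)
{
    return DEMO_BITMASK64(n);
}

static inline NvU64 demo_concat(uint32_t hi, uint32_t lo)
{
    return DEMO_CONCAT_32_TO_64(hi, lo);
}

/* Spot-check the boundary rows of the table from the header comment. */
static inline void demo_check_bitmask_table(void)
{
    assert(demo_mask64(0) == 0);
    assert(demo_mask64(1) == 0x0000000000000001ULL);
    assert(demo_mask64(64) == 0xFFFFFFFFFFFFFFFFULL);
}
```

Shifting NV_U64_MAX right by `64 - n` instead of shifting 1 left by `n` is what avoids the undefined shift by 64 when all bits are requested.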
--- a/kernel-open/common/inc/nvstatuscodes.h
+++ b/kernel-open/common/inc/nvstatuscodes.h
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 2014-2020 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2014-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -150,6 +150,11 @@ NV_STATUS_CODE(NV_ERR_NVLINK_CONFIGURATION_ERROR,      0x00000078, "Nvlink Confi
 NV_STATUS_CODE(NV_ERR_RISCV_ERROR,                     0x00000079, "Generic RISC-V assert or halt")
 NV_STATUS_CODE(NV_ERR_FABRIC_MANAGER_NOT_PRESENT,      0x0000007A, "Fabric Manager is not loaded")
 NV_STATUS_CODE(NV_ERR_ALREADY_SIGNALLED,               0x0000007B, "Semaphore Surface value already >= requested wait value")
+NV_STATUS_CODE(NV_ERR_QUEUE_TASK_SLOT_NOT_AVAILABLE,   0x0000007C, "PMU RPC error due to no queue slot available for this event")
+NV_STATUS_CODE(NV_ERR_KEY_ROTATION_IN_PROGRESS,        0x0000007D, "Operation not allowed as key rotation is in progress")
+NV_STATUS_CODE(NV_ERR_TEST_ONLY_CODE_NOT_ENABLED,      0x0000007E, "Test-only code path not enabled")
+NV_STATUS_CODE(NV_ERR_SECURE_BOOT_FAILED,              0x0000007F, "GFW secure boot failed")
+NV_STATUS_CODE(NV_ERR_INSUFFICIENT_ZBC_ENTRY,          0x00000080, "No more ZBC entry for the client")

 // Warnings:
 NV_STATUS_CODE(NV_WARN_HOT_SWITCH,                     0x00010001, "WARNING Hot switch")
--- a/kernel-open/common/inc/nvtypes.h
+++ b/kernel-open/common/inc/nvtypes.h
@@ -145,7 +145,18 @@ typedef   signed short     NvS16; /* -32768 to 32767                         */
 #endif

 // Macro to build an NvU32 from four bytes, listed from msb to lsb
-#define NvU32_BUILD(a, b, c, d) (((a) << 24) | ((b) << 16) | ((c) << 8) | (d))
+#define NvU32_BUILD(a, b, c, d) \
+    ((NvU32)( \
+     (((NvU32)(a) & 0xff) << 24) | \
+     (((NvU32)(b) & 0xff) << 16) | \
+     (((NvU32)(c) & 0xff) << 8)  | \
+     (((NvU32)(d) & 0xff))))
+
+// Macro to build an NvU64 from two DWORDS, listed from msb to lsb
+#define NvU64_BUILD(a, b) \
+    ((NvU64)( \
+     (((NvU64)(a) & ~0U) << 32) | \
+     (((NvU64)(b) & ~0U))))

 #if NVTYPES_USE_STDINT
 typedef uint32_t           NvV32; /* "void": enumerated or multiple fields   */
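Illustrative note (not part of the patch): the sketch below mirrors the shape of the new NvU32_BUILD and NvU64_BUILD macros with <stdint.h> types, to show what the added casts and masks buy. The EX_ names and ex_ helpers are hypothetical, and the sketch assumes a 32-bit unsigned int so that ~0U equals 0xFFFFFFFF.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of NvU32_BUILD: each operand is widened to 32 bits
 * and masked to one byte before shifting, so oversized or negative inputs
 * cannot spill into neighbouring bytes. */
#define EX_U32_BUILD(a, b, c, d) \
    ((uint32_t)( \
     (((uint32_t)(a) & 0xff) << 24) | \
     (((uint32_t)(b) & 0xff) << 16) | \
     (((uint32_t)(c) & 0xff) << 8)  | \
     (((uint32_t)(d) & 0xff))))

/* Hypothetical mirror of NvU64_BUILD: two 32-bit halves, msb first. */
#define EX_U64_BUILD(a, b) \
    ((uint64_t)( \
     (((uint64_t)(a) & ~0U) << 32) | \
     (((uint64_t)(b) & ~0U))))

/* Pack four characters, most significant first. */
static inline uint32_t ex_fourcc(char a, char b, char c, char d)
{
    return EX_U32_BUILD(a, b, c, d);
}

/* Join a high and a low 32-bit word into one 64-bit value. */
static inline uint64_t ex_join(uint32_t hi, uint32_t lo)
{
    return EX_U64_BUILD(hi, lo);
}

/* Small self-check of the masking behaviour. */
static inline void ex_build_selfcheck(void)
{
    /* 0x1ff and -1 are both reduced to 0xff by the per-byte mask. */
    assert(EX_U32_BUILD(0x1ff, 0, 0, -1) == 0xFF0000FFu);
}
```

With the previous unmasked form, an operand such as 0x80 shifted left by 24 would overflow a signed int, and an operand wider than one byte would overwrite adjacent fields; the casts and masks avoid both.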
--- a/kernel-open/common/inc/os-interface.h
+++ b/kernel-open/common/inc/os-interface.h
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 1999-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 1999-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -67,7 +67,6 @@ typedef struct os_wait_queue os_wait_queue;
 * ---------------------------------------------------------------------------
 */

-NvU64       NV_API_CALL  os_get_num_phys_pages       (void);
 NV_STATUS   NV_API_CALL  os_alloc_mem                (void **, NvU64);
 void        NV_API_CALL  os_free_mem                 (void *);
 NV_STATUS   NV_API_CALL  os_get_current_time         (NvU32 *, NvU32 *);
@@ -105,7 +104,6 @@ void*       NV_API_CALL  os_map_kernel_space         (NvU64, NvU64, NvU32);
 void        NV_API_CALL  os_unmap_kernel_space       (void *, NvU64);
 void*       NV_API_CALL  os_map_user_space           (NvU64, NvU64, NvU32, NvU32, void **);
 void        NV_API_CALL  os_unmap_user_space         (void *, NvU64, void *);
-NV_STATUS   NV_API_CALL  os_flush_cpu_cache          (void);
 NV_STATUS   NV_API_CALL  os_flush_cpu_cache_all      (void);
 NV_STATUS   NV_API_CALL  os_flush_user_cache         (void);
 void        NV_API_CALL  os_flush_cpu_write_combine_buffer(void);
@@ -153,6 +151,7 @@ void        NV_API_CALL  os_release_rwlock_read      (void *);
 void        NV_API_CALL  os_release_rwlock_write     (void *);
 NvBool      NV_API_CALL  os_semaphore_may_sleep      (void);
 NV_STATUS   NV_API_CALL  os_get_version_info         (os_version_info*);
+NV_STATUS   NV_API_CALL  os_get_is_openrm            (NvBool *);
 NvBool      NV_API_CALL  os_is_isr                   (void);
 NvBool      NV_API_CALL  os_pat_supported            (void);
 void        NV_API_CALL  os_dump_stack               (void);
@@ -162,10 +161,9 @@ NvBool      NV_API_CALL  os_is_vgx_hyper             (void);
 NV_STATUS   NV_API_CALL  os_inject_vgx_msi           (NvU16, NvU64, NvU32);
 NvBool      NV_API_CALL  os_is_grid_supported        (void);
 NvU32       NV_API_CALL  os_get_grid_csp_support     (void);
-void        NV_API_CALL  os_get_screen_info          (NvU64 *, NvU32 *, NvU32 *, NvU32 *, NvU32 *, NvU64, NvU64);
 void        NV_API_CALL  os_bug_check                (NvU32, const char *);
 NV_STATUS   NV_API_CALL  os_lock_user_pages          (void *, NvU64, void **, NvU32);
-NV_STATUS   NV_API_CALL  os_lookup_user_io_memory    (void *, NvU64, NvU64 **, void**);
+NV_STATUS   NV_API_CALL  os_lookup_user_io_memory    (void *, NvU64, NvU64 **);
 NV_STATUS   NV_API_CALL  os_unlock_user_pages        (NvU64, void *);
 NV_STATUS   NV_API_CALL  os_match_mmap_offset        (void *, NvU64, NvU64 *);
 NV_STATUS   NV_API_CALL  os_get_euid                 (NvU32 *);
@@ -200,6 +198,8 @@ nv_cap_t*   NV_API_CALL  os_nv_cap_create_file_entry  (nv_cap_t *, const char *,
 void        NV_API_CALL  os_nv_cap_destroy_entry      (nv_cap_t *);
 int         NV_API_CALL  os_nv_cap_validate_and_dup_fd(const nv_cap_t *, int);
 void        NV_API_CALL  os_nv_cap_close_fd           (int);
+NvS32       NV_API_CALL  os_imex_channel_get          (NvU64);
+NvS32       NV_API_CALL  os_imex_channel_count        (void);

 enum os_pci_req_atomics_type {
    OS_INTF_PCIE_REQ_ATOMICS_32BIT,
@@ -221,6 +221,7 @@ extern NvU8  os_page_shift;
 extern NvBool os_cc_enabled;
 extern NvBool os_cc_tdx_enabled;
 extern NvBool os_dma_buf_enabled;
+extern NvBool os_imex_channel_is_supported;

 /*
 * ---------------------------------------------------------------------------
@@ -230,14 +231,12 @@ extern NvBool os_dma_buf_enabled;
 * ---------------------------------------------------------------------------
 */

-#define NV_DBG_INFO       0x1
-#define NV_DBG_SETUP      0x2
+#define NV_DBG_INFO       0x0
+#define NV_DBG_SETUP      0x1
+#define NV_DBG_USERERRORS 0x2
 #define NV_DBG_WARNINGS   0x3
 #define NV_DBG_ERRORS     0x4
-#define NV_DBG_HW_ERRORS  0x5
-#define NV_DBG_FATAL      0x6

-#define NV_DBG_FORCE_LEVEL(level) ((level) | (1 << 8))

 void NV_API_CALL  out_string(const char *str);
 int  NV_API_CALL  nv_printf(NvU32 debuglevel, const char *printf_format, ...);
--- a/src/nvidia/arch/nvalloc/common/inc/gsp/gsp_error.h
+++ b/src/nvidia/arch/nvalloc/common/inc/gsp/gsp_error.h
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -21,15 +21,19 @@
 * DEALINGS IN THE SOFTWARE.
 */

-#ifndef GSP_ERROR_H
-#define GSP_ERROR_H
+#ifndef NV_MEMORY_AREA_H
+#define NV_MEMORY_AREA_H

-// Definitions for GSP-RM to report errors to CPU-RM via mailbox
-#define NV_GSP_ERROR_CODE     7:0
-#define NV_GSP_ERROR_REASON  15:8
-#define NV_GSP_ERROR_TASK    23:16
-#define NV_GSP_ERROR_SKIPPED 27:24
-#define NV_GSP_ERROR_TAG     31:28
-#define NV_GSP_ERROR_TAG_VAL  0xE
+typedef struct MemoryRange
+{
+    NvU64 start;
+    NvU64 size;
+} MemoryRange;

-#endif // GSP_ERROR_H
+typedef struct MemoryArea
+{
+    MemoryRange *pRanges;
+    NvU64 numRanges;
+} MemoryArea;
+
+#endif /* NV_MEMORY_AREA_H */
--- a/kernel-open/common/inc/rm-gpu-ops.h
+++ b/kernel-open/common/inc/rm-gpu-ops.h
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 1999-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 1999-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -37,7 +37,7 @@ NV_STATUS  NV_API_CALL  rm_gpu_ops_create_session (nvidia_stack_t *, nvgpuSessio
 NV_STATUS  NV_API_CALL  rm_gpu_ops_destroy_session (nvidia_stack_t *, nvgpuSessionHandle_t);
 NV_STATUS  NV_API_CALL  rm_gpu_ops_device_create (nvidia_stack_t *, nvgpuSessionHandle_t, const nvgpuInfo_t *, const NvProcessorUuid *, nvgpuDeviceHandle_t *, NvBool);
 NV_STATUS  NV_API_CALL  rm_gpu_ops_device_destroy (nvidia_stack_t *, nvgpuDeviceHandle_t);
-NV_STATUS  NV_API_CALL  rm_gpu_ops_address_space_create(nvidia_stack_t *, nvgpuDeviceHandle_t, unsigned long long, unsigned long long, nvgpuAddressSpaceHandle_t *, nvgpuAddressSpaceInfo_t);
+NV_STATUS  NV_API_CALL  rm_gpu_ops_address_space_create(nvidia_stack_t *, nvgpuDeviceHandle_t, unsigned long long, unsigned long long, NvBool, nvgpuAddressSpaceHandle_t *, nvgpuAddressSpaceInfo_t);
 NV_STATUS  NV_API_CALL  rm_gpu_ops_dup_address_space(nvidia_stack_t *, nvgpuDeviceHandle_t, NvHandle, NvHandle, nvgpuAddressSpaceHandle_t *, nvgpuAddressSpaceInfo_t);
 NV_STATUS  NV_API_CALL  rm_gpu_ops_address_space_destroy(nvidia_stack_t *, nvgpuAddressSpaceHandle_t);
 NV_STATUS  NV_API_CALL  rm_gpu_ops_memory_alloc_fb(nvidia_stack_t *, nvgpuAddressSpaceHandle_t, NvLength, NvU64 *, nvgpuAllocInfo_t);
@@ -45,7 +45,6 @@ NV_STATUS  NV_API_CALL  rm_gpu_ops_memory_alloc_fb(nvidia_stack_t *, nvgpuAddres
 NV_STATUS  NV_API_CALL  rm_gpu_ops_pma_alloc_pages(nvidia_stack_t *, void *, NvLength, NvU32 , nvgpuPmaAllocationOptions_t, NvU64 *);
 NV_STATUS  NV_API_CALL  rm_gpu_ops_pma_free_pages(nvidia_stack_t *, void *, NvU64 *, NvLength , NvU32, NvU32);
 NV_STATUS  NV_API_CALL  rm_gpu_ops_pma_pin_pages(nvidia_stack_t *, void *, NvU64 *, NvLength , NvU32, NvU32);
-NV_STATUS  NV_API_CALL  rm_gpu_ops_pma_unpin_pages(nvidia_stack_t *, void *, NvU64 *, NvLength , NvU32);
 NV_STATUS  NV_API_CALL  rm_gpu_ops_get_pma_object(nvidia_stack_t *, nvgpuDeviceHandle_t, void **, const nvgpuPmaStatistics_t *);
 NV_STATUS  NV_API_CALL  rm_gpu_ops_pma_register_callbacks(nvidia_stack_t *sp, void *, nvPmaEvictPagesCallback, nvPmaEvictRangeCallback, void *);
 void       NV_API_CALL  rm_gpu_ops_pma_unregister_callbacks(nvidia_stack_t *sp, void *);
@@ -76,7 +75,8 @@ NV_STATUS NV_API_CALL rm_gpu_ops_own_page_fault_intr(nvidia_stack_t *, nvgpuDevi
 NV_STATUS  NV_API_CALL rm_gpu_ops_init_fault_info(nvidia_stack_t *, nvgpuDeviceHandle_t, nvgpuFaultInfo_t);
 NV_STATUS  NV_API_CALL rm_gpu_ops_destroy_fault_info(nvidia_stack_t *, nvgpuDeviceHandle_t, nvgpuFaultInfo_t);
 NV_STATUS  NV_API_CALL rm_gpu_ops_get_non_replayable_faults(nvidia_stack_t *, nvgpuFaultInfo_t, void *, NvU32 *);
-NV_STATUS  NV_API_CALL rm_gpu_ops_flush_replayable_fault_buffer(nvidia_stack_t *, nvgpuDeviceHandle_t);
+NV_STATUS  NV_API_CALL rm_gpu_ops_flush_replayable_fault_buffer(nvidia_stack_t *, nvgpuFaultInfo_t, NvBool);
+NV_STATUS  NV_API_CALL rm_gpu_ops_toggle_prefetch_faults(nvidia_stack_t *, nvgpuFaultInfo_t, NvBool);
 NV_STATUS  NV_API_CALL rm_gpu_ops_has_pending_non_replayable_faults(nvidia_stack_t *, nvgpuFaultInfo_t, NvBool *);
 NV_STATUS  NV_API_CALL rm_gpu_ops_init_access_cntr_info(nvidia_stack_t *, nvgpuDeviceHandle_t, nvgpuAccessCntrInfo_t, NvU32);
 NV_STATUS  NV_API_CALL rm_gpu_ops_destroy_access_cntr_info(nvidia_stack_t *, nvgpuDeviceHandle_t, nvgpuAccessCntrInfo_t);
@@ -103,12 +103,14 @@ NV_STATUS  NV_API_CALL rm_gpu_ops_paging_channel_push_stream(nvidia_stack_t *, n

 NV_STATUS  NV_API_CALL rm_gpu_ops_ccsl_context_init(nvidia_stack_t *, struct ccslContext_t **, nvgpuChannelHandle_t);
 NV_STATUS  NV_API_CALL rm_gpu_ops_ccsl_context_clear(nvidia_stack_t *, struct ccslContext_t *);
+NV_STATUS  NV_API_CALL rm_gpu_ops_ccsl_rotate_key(nvidia_stack_t *, UvmCslContext *[], NvU32);
 NV_STATUS  NV_API_CALL rm_gpu_ops_ccsl_rotate_iv(nvidia_stack_t *, struct ccslContext_t *, NvU8);
 NV_STATUS  NV_API_CALL rm_gpu_ops_ccsl_encrypt(nvidia_stack_t *, struct ccslContext_t *, NvU32, NvU8 const *, NvU8 *, NvU8 *);
 NV_STATUS  NV_API_CALL rm_gpu_ops_ccsl_encrypt_with_iv(nvidia_stack_t *, struct ccslContext_t *, NvU32, NvU8 const *, NvU8*, NvU8 *, NvU8 *);
-NV_STATUS  NV_API_CALL rm_gpu_ops_ccsl_decrypt(nvidia_stack_t *, struct ccslContext_t *, NvU32, NvU8 const *, NvU8 const *, NvU8 *, NvU8 const *, NvU32, NvU8 const *);
+NV_STATUS  NV_API_CALL rm_gpu_ops_ccsl_decrypt(nvidia_stack_t *, struct ccslContext_t *, NvU32, NvU8 const *, NvU8 const *, NvU32, NvU8 *, NvU8 const *, NvU32, NvU8 const *);
 NV_STATUS  NV_API_CALL rm_gpu_ops_ccsl_sign(nvidia_stack_t *, struct ccslContext_t *, NvU32, NvU8 const *, NvU8 *);
 NV_STATUS  NV_API_CALL rm_gpu_ops_ccsl_query_message_pool(nvidia_stack_t *, struct ccslContext_t *, NvU8, NvU64 *);
 NV_STATUS  NV_API_CALL rm_gpu_ops_ccsl_increment_iv(nvidia_stack_t *, struct ccslContext_t *, NvU8, NvU64, NvU8 *);
+NV_STATUS  NV_API_CALL rm_gpu_ops_ccsl_log_encryption(nvidia_stack_t *, struct ccslContext_t *, NvU8, NvU32);

 #endif
--- a/kernel-open/conftest.sh
+++ b/kernel-open/conftest.sh
@@ -14,6 +14,13 @@ OUTPUT=$4
 XEN_PRESENT=1
 PREEMPT_RT_PRESENT=0

+# We also use conftest.sh on FreeBSD to check which symbols are provided
+# by the Linux kernel programming interface (linuxkpi) when compiling nvidia-drm.ko
+OS_FREEBSD=0
+if [ "$OS" = "FreeBSD" ] ; then
+    OS_FREEBSD=1
+fi
+
 # VGX_BUILD parameter defined only for VGX builds (vGPU Host driver)
 # VGX_KVM_BUILD parameter defined only vGPU builds on KVM hypervisor
 # GRID_BUILD parameter defined only for GRID builds (GRID Guest driver)
@@ -64,7 +71,7 @@ test_header_presence() {
    TEST_CFLAGS="-E -M $CFLAGS"

    file="$1"
-    file_define=NV_`echo $file | tr '/.\-a-z' '___A-Z'`_PRESENT
+    file_define=NV_`echo $file | tr '/.-' '___' | tr 'a-z' 'A-Z'`_PRESENT

    CODE="#include <$file>"

@@ -205,11 +212,6 @@ CONFTEST_PREAMBLE="#include \"conftest/headers.h\"
    #if defined(NV_LINUX_KCONFIG_H_PRESENT)
    #include <linux/kconfig.h>
    #endif
-    #if defined(NV_GENERATED_AUTOCONF_H_PRESENT)
-    #include <generated/autoconf.h>
-    #else
-    #include <linux/autoconf.h>
-    #endif
    #if defined(CONFIG_XEN) && \
        defined(CONFIG_XEN_INTERFACE_VERSION) &&  !defined(__XEN_INTERFACE_VERSION__)
    #define __XEN_INTERFACE_VERSION__ CONFIG_XEN_INTERFACE_VERSION
@@ -222,6 +224,17 @@ CONFTEST_PREAMBLE="#include \"conftest/headers.h\"
    #endif
    #endif"

+# FreeBSD's Linux compatibility does not have autoconf.h defined
+# anywhere yet, so only add this part on Linux
+if [ ${OS_FREEBSD} -ne 1 ] ; then
+    CONFTEST_PREAMBLE="${CONFTEST_PREAMBLE}
+        #if defined(NV_GENERATED_AUTOCONF_H_PRESENT)
+        #include <generated/autoconf.h>
+        #else
+        #include <linux/autoconf.h>
+        #endif"
+fi
+
 test_configuration_option() {
    #
    # Check to see if the given configuration option is defined
@@ -308,16 +321,57 @@ compile_check_conftest() {
    fi
 }

-export_symbol_present_conftest() {
-    #
-    # Check Module.symvers to see whether the given symbol is present.
-    #
+check_symbol_exists() {
+    # Check that the given symbol is available

    SYMBOL="$1"
    TAB='	'

-    if grep -e "${TAB}${SYMBOL}${TAB}.*${TAB}EXPORT_SYMBOL\(_GPL\)\?\s*\$" \
-               "$OUTPUT/Module.symvers" >/dev/null 2>&1; then
+    if [ ${OS_FREEBSD} -ne 1 ] ; then
+        # Linux:
+        # ------
+        #
+        # Check Module.symvers to see whether the given symbol is present.
+        #
+        if grep -e "${TAB}${SYMBOL}${TAB}.*${TAB}EXPORT_SYMBOL.*\$" \
+                   "$OUTPUT/Module.symvers" >/dev/null 2>&1; then
+            return 0
+        fi
+    else
+        # FreeBSD:
+        # ------
+        #
+        # Check if any of the linuxkpi or drm kernel module files contain
+        # references to this symbol.
+
+        # Get the /boot/kernel/ and /boot/modules paths, convert the list to a
+        # space separated list instead of semicolon separated so we can iterate
+        # over it.
+        if [ -z "${CONFTEST_BSD_KMODPATHS}" ] ; then
+            KMODPATHS=`sysctl -n kern.module_path | sed -e "s/;/ /g"`
+        else
+            KMODPATHS="${CONFTEST_BSD_KMODPATHS}"
+        fi
+
+        for KMOD in linuxkpi.ko linuxkpi_gplv2.ko drm.ko dmabuf.ko ; do
+            for KMODPATH in $KMODPATHS; do
+                if [ -e "$KMODPATH/$KMOD" ] ; then
+                    if nm "$KMODPATH/$KMOD" | grep "$SYMBOL" >/dev/null 2>&1 ; then
+                        return 0
+                    fi
+                fi
+            done
+        done
+    fi
+
+    return 1
+}
+
+export_symbol_present_conftest() {
+
+    SYMBOL="$1"
+
+    if check_symbol_exists $SYMBOL; then
        echo "#define NV_IS_EXPORT_SYMBOL_PRESENT_$SYMBOL 1" |
            append_conftest "symbols"
    else
@@ -1206,6 +1260,36 @@ compile_test() {
            compile_check_conftest "$CODE" "NV_VFIO_DEVICE_OPS_HAS_BIND_IOMMUFD" "" "types"
        ;;

+        vfio_device_ops_has_detach_ioas)
+            #
+            # Determine if 'vfio_device_ops' struct has 'detach_ioas' field.
+            #
+            # Added by commit 9048c7341c4df9cae04c154a8b0f556dbe913358 ("vfio-iommufd: Add detach_ioas
+            # support for physical VFIO devices")
+            #
+            CODE="
+            #include <linux/pci.h>
+            #include <linux/vfio.h>
+            int conftest_vfio_device_ops_has_detach_ioas(void) {
+                return offsetof(struct vfio_device_ops, detach_ioas);
+            }"
+
+            compile_check_conftest "$CODE" "NV_VFIO_DEVICE_OPS_HAS_DETACH_IOAS" "" "types"
+        ;;
+
+        pfn_address_space)
+            #
+            # Determine if 'struct pfn_address_space' structure is present or not.
+            #
+            CODE="
+            #include <linux/memory-failure.h>
+            void conftest_pfn_address_space() {
+                struct pfn_address_space pfn_address_space;
+            }"
+
+            compile_check_conftest "$CODE" "NV_PFN_ADDRESS_SPACE_STRUCT_PRESENT" "" "types"
+        ;;
+
        pci_irq_vector_helpers)
            #
            # Determine if pci_alloc_irq_vectors(), pci_free_irq_vectors()
@@ -1332,6 +1416,42 @@ compile_test() {
            compile_check_conftest "$CODE" "NV_VFIO_REGISTER_EMULATED_IOMMU_DEV_PRESENT" "" "functions"
        ;;

+        bus_type_has_iommu_ops)
+            #
+            # Determine if 'bus_type' structure has an 'iommu_ops' field.
+            #
+            # This field was removed by commit 17de3f5fdd35 ("iommu: Retire bus ops")
+            # in v6.8
+            #
+            CODE="
+            #include <linux/device.h>
+
+            int conftest_bus_type_has_iommu_ops(void) {
+                return offsetof(struct bus_type, iommu_ops);
+            }"
+
+            compile_check_conftest "$CODE" "NV_BUS_TYPE_HAS_IOMMU_OPS" "" "types"
+        ;;
+
+        eventfd_signal_has_counter_arg)
+            #
+            # Determine if eventfd_signal() function has an additional 'counter' argument.
+            #
+            # This argument was removed by commit 3652117f8548 ("eventfd: simplify
+            # eventfd_signal()") in v6.8
+            #
+            CODE="
+            #include <linux/eventfd.h>
+
+            void conftest_eventfd_signal_has_counter_arg(void) {
+                struct eventfd_ctx *ctx;
+
+                eventfd_signal(ctx, 1);
+            }"
+
+            compile_check_conftest "$CODE" "NV_EVENTFD_SIGNAL_HAS_COUNTER_ARG" "" "types"
+        ;;
+
        drm_available)
            # Determine if the DRM subsystem is usable
            CODE="
@@ -1343,7 +1463,7 @@ compile_test() {
            #include <drm/drm_drv.h>
            #endif

-            #if !defined(CONFIG_DRM) && !defined(CONFIG_DRM_MODULE)
+            #if !defined(CONFIG_DRM) && !defined(CONFIG_DRM_MODULE) && !defined(__FreeBSD__)
            #error DRM not enabled
            #endif

@@ -1807,7 +1927,7 @@ compile_test() {
            #include <drm/drmP.h>
            #endif
            #include <drm/drm_atomic.h>
-            #if !defined(CONFIG_DRM) && !defined(CONFIG_DRM_MODULE)
+            #if !defined(CONFIG_DRM) && !defined(CONFIG_DRM_MODULE) && !defined(__FreeBSD__)
            #error DRM not enabled
            #endif
            void conftest_drm_atomic_modeset_available(void) {
@@ -3012,6 +3132,22 @@ compile_test() {

        ;;

+        foll_longterm_present)
+            #
+            # Determine if FOLL_LONGTERM enum is present or not
+            #
+            # Added by commit 932f4a630a69 ("mm/gup: replace
+            # get_user_pages_longterm() with FOLL_LONGTERM") in
+            # v5.2
+            #
+            CODE="
+            #include <linux/mm.h>
+            int foll_longterm = FOLL_LONGTERM;
+            "
+
+            compile_check_conftest "$CODE" "NV_FOLL_LONGTERM_PRESENT" "" "types"
+        ;;
+
        vfio_pin_pages_has_vfio_device_arg)
            #
            # Determine if vfio_pin_pages() kABI accepts "struct vfio_device *"
@@ -5068,11 +5204,15 @@ compile_test() {
            # commit 49a3f51dfeee ("drm/gem: Use struct dma_buf_map in GEM
            # vmap ops and convert GEM backends") in v5.11.
            #
+            # Note that the 'map' argument type is changed from 'struct dma_buf_map'
+            # to 'struct iosys_map' by commit 7938f4218168 ("dma-buf-map: Rename
+            # to iosys-map") in v5.18.
+            #
            CODE="
            #include <drm/drm_gem.h>
            int conftest_drm_gem_object_vmap_has_map_arg(
-                    struct drm_gem_object *obj, struct dma_buf_map *map) {
-                return obj->funcs->vmap(obj, map);
+                    struct drm_gem_object *obj) {
+                return obj->funcs->vmap(obj, NULL);
            }"

            compile_check_conftest "$CODE" "NV_DRM_GEM_OBJECT_VMAP_HAS_MAP_ARG" "" "types"
@@ -5112,25 +5252,23 @@ compile_test() {
            compile_check_conftest "$CODE" "NV_PCI_CLASS_MULTIMEDIA_HD_AUDIO_PRESENT" "" "generic"
        ;;

-        unsafe_follow_pfn)
+        follow_pfn)
            #
-            # Determine if unsafe_follow_pfn() is present.
+            # Determine if follow_pfn() is present.
            #
-            # unsafe_follow_pfn() was added by commit 69bacee7f9ad
-            # ("mm: Add unsafe_follow_pfn") in v5.13-rc1.
-            #
-            # Note: this commit never made it to the linux kernel, so
-            # unsafe_follow_pfn() never existed.
+            # follow_pfn() was added by commit 3b6748e2dd69
+            # ("mm: introduce follow_pfn()") in v2.6.31-rc1, and removed
+            # by linux-next commit 233eb0bf3b94 ("mm: remove
+            # follow_pfn").
            #
            CODE="
            #include <linux/mm.h>
-            void conftest_unsafe_follow_pfn(void) {
-                unsafe_follow_pfn();
+            void conftest_follow_pfn(void) {
+                follow_pfn();
            }"

-            compile_check_conftest "$CODE" "NV_UNSAFE_FOLLOW_PFN_PRESENT" "" "functions"
+            compile_check_conftest "$CODE" "NV_FOLLOW_PFN_PRESENT" "" "functions"
        ;;
-
        drm_plane_atomic_check_has_atomic_state_arg)
            #
            # Determine if drm_plane_helper_funcs::atomic_check takes 'state'
@@ -5203,10 +5341,16 @@ compile_test() {
            # Added by commit 7b7b27214bba ("mm/memory_hotplug: introduce
            # add_memory_driver_managed()") in v5.8.
            #
+            # Before commit 3a0aaefe4134 ("mm/memory_hotplug: guard more
+            # declarations by CONFIG_MEMORY_HOTPLUG") in v5.10,
+            # add_memory_driver_managed() was not guarded.
+            #
            CODE="
            #include <linux/memory_hotplug.h>
            void conftest_add_memory_driver_managed() {
+            #if defined(CONFIG_MEMORY_HOTPLUG)
                add_memory_driver_managed();
+            #endif
            }"

            compile_check_conftest "$CODE" "NV_ADD_MEMORY_DRIVER_MANAGED_PRESENT" "" "functions"
@@ -5410,7 +5554,8 @@ compile_test() {

        of_dma_configure)
            #
-            # Determine if of_dma_configure() function is present
+            # Determine if of_dma_configure() function is present, and how
+            # many arguments it takes.
            #
            # Added by commit 591c1ee465ce ("of: configure the platform
            # device dma parameters") in v3.16.  However, it was a static,
@@ -5420,17 +5565,69 @@ compile_test() {
            # commit 1f5c69aa51f9 ("of: Move of_dma_configure() to device.c
            # to help re-use") in v4.1.
            #
-            CODE="
+            # It subsequently began taking a third parameter with commit
+            # 3d6ce86ee794 ("drivers: remove force dma flag from buses")
+            # in v4.18.
+            #
+
+            echo "$CONFTEST_PREAMBLE
            #if defined(NV_LINUX_OF_DEVICE_H_PRESENT)
            #include <linux/of_device.h>
            #endif
+
            void conftest_of_dma_configure(void)
            {
                of_dma_configure();
            }
-            "
+            " > conftest$$.c

-            compile_check_conftest "$CODE" "NV_OF_DMA_CONFIGURE_PRESENT" "" "functions"
+            $CC $CFLAGS -c conftest$$.c > /dev/null 2>&1
+            rm -f conftest$$.c
+
+            if [ -f conftest$$.o ]; then
+                rm -f conftest$$.o
+
+                echo "#undef NV_OF_DMA_CONFIGURE_PRESENT" | append_conftest "functions"
+                echo "#undef NV_OF_DMA_CONFIGURE_ARGUMENT_COUNT" | append_conftest "functions"
+            else
+                echo "#define NV_OF_DMA_CONFIGURE_PRESENT" | append_conftest "functions"
+
+                echo "$CONFTEST_PREAMBLE
+                #if defined(NV_LINUX_OF_DEVICE_H_PRESENT)
+                #include <linux/of_device.h>
+                #endif
+
+                void conftest_of_dma_configure(void) {
+                    of_dma_configure(NULL, NULL, false);
+                }" > conftest$$.c
+
+                $CC $CFLAGS -c conftest$$.c > /dev/null 2>&1
+                rm -f conftest$$.c
+
+                if [ -f conftest$$.o ]; then
+                    rm -f conftest$$.o
+                    echo "#define NV_OF_DMA_CONFIGURE_ARGUMENT_COUNT 3" | append_conftest "functions"
+                    return
+                fi
+
+                echo "$CONFTEST_PREAMBLE
+                #if defined(NV_LINUX_OF_DEVICE_H_PRESENT)
+                #include <linux/of_device.h>
+                #endif
+
+                void conftest_of_dma_configure(void) {
+                    of_dma_configure(NULL, NULL);
+                }" > conftest$$.c
+
+                $CC $CFLAGS -c conftest$$.c > /dev/null 2>&1
+                rm -f conftest$$.c
+
+                if [ -f conftest$$.o ]; then
+                    rm -f conftest$$.o
+                    echo "#define NV_OF_DMA_CONFIGURE_ARGUMENT_COUNT 2" | append_conftest "functions"
+                    return
+                fi
+            fi
        ;;

        icc_get)
@@ -5669,22 +5866,6 @@ compile_test() {
            compile_check_conftest "$CODE" "NV_GPIO_TO_IRQ_PRESENT" "" "functions"
        ;;

-        migrate_vma_setup)
-            #
-            # Determine if migrate_vma_setup() function is present
-            #
-            # Added by commit a7d1f22bb74f ("mm: turn migrate_vma upside
-            # down") in v5.4.
-            #
-            CODE="
-            #include <linux/migrate.h>
-            int conftest_migrate_vma_setup(void) {
-                migrate_vma_setup();
-            }"
-
-            compile_check_conftest "$CODE" "NV_MIGRATE_VMA_SETUP_PRESENT" "" "functions"
-        ;;
-
        migrate_vma_added_flags)
            #
            # Determine if migrate_vma structure has flags
@@ -5795,6 +5976,24 @@ compile_test() {
            compile_check_conftest "$CODE" "NV_MM_PASID_DROP_PRESENT" "" "functions"
        ;;

+        iommu_is_dma_domain)
+            #
+            # Determine if iommu_is_dma_domain() function is present.
+            # This also assumes that the iommu_get_domain_for_dev() function
+            # is present.
+            #
+            # Added by commit bf3aed4660c6 ("iommu: Introduce explicit type
+            # for non-strict DMA domains") in v5.15
+            #
+            CODE="
+            #include <linux/iommu.h>
+            void conftest_iommu_is_dma_domain(void) {
+                iommu_is_dma_domain();
+            }"
+
+            compile_check_conftest "$CODE" "NV_IOMMU_IS_DMA_DOMAIN_PRESENT" "" "functions"
+        ;;
+
        drm_crtc_state_has_no_vblank)
            #
            # Determine if the 'drm_crtc_state' structure has 'no_vblank'.
@@ -6483,6 +6682,21 @@ compile_test() {
            compile_check_conftest "$CODE" "NV_FIND_NEXT_BIT_WRAP_PRESENT" "" "functions"
        ;;

+        crypto_tfm_ctx_aligned)
+            # Determine if 'crypto_tfm_ctx_aligned' is defined.
+            #
+            # Removed by commit 25c74a39e0f6 ("crypto: hmac - remove unnecessary
+            # alignment logic") in v6.7.
+            #
+            CODE="
+            #include <crypto/algapi.h>
+            void conftest_crypto_tfm_ctx_aligned(void) {
+                  (void)crypto_tfm_ctx_aligned();
+            }"
+
+            compile_check_conftest "$CODE" "NV_CRYPTO_TFM_CTX_ALIGNED_PRESENT" "" "functions"
+        ;;
+
        crypto)
            #
            # Determine if we support various crypto functions.
@@ -6604,9 +6818,9 @@ compile_test() {
            # 'supported_colorspaces' argument.
            #
            # The 'u32 supported_colorspaces' argument was added to
-            # drm_mode_create_dp_colorspace_property() by linux-next commit
+            # drm_mode_create_dp_colorspace_property() by commit
            # c265f340eaa8 ("drm/connector: Allow drivers to pass list of
-            # supported colorspaces").
+            # supported colorspaces") in v6.5.
            #
            # To test if drm_mode_create_dp_colorspace_property() has the
            # 'supported_colorspaces' argument, declare a function prototype
@@ -6634,6 +6848,148 @@ compile_test() {
            compile_check_conftest "$CODE" "NV_DRM_MODE_CREATE_DP_COLORSPACE_PROPERTY_HAS_SUPPORTED_COLORSPACES_ARG" "" "types"
        ;;

+        drm_syncobj_features_present)
+            # Determine if DRIVER_SYNCOBJ and DRIVER_SYNCOBJ_TIMELINE DRM
+            # driver features are present. Timeline DRM synchronization objects
+            # may only be used if both of these are supported by the driver.
+            #
+            # DRIVER_SYNCOBJ_TIMELINE added by commit 060cebb20cdb ("drm:
+            # introduce a capability flag for syncobj timeline support") in
+            # v5.2
+            #
+            # DRIVER_SYNCOBJ added by commit e9083420bbac ("drm: introduce
+            # sync objects (v4)") in v4.12
+            CODE="
+            #if defined(NV_DRM_DRM_DRV_H_PRESENT)
+            #include <drm/drm_drv.h>
+            #endif
+            int features = DRIVER_SYNCOBJ | DRIVER_SYNCOBJ_TIMELINE;"
+
+            compile_check_conftest "$CODE" "NV_DRM_SYNCOBJ_FEATURES_PRESENT" "" "types"
+        ;;
+
+        stack_trace)
+            # Determine if functions stack_trace_{save,print} are present.
+            # Added by commit e9b98e162 ("stacktrace: Provide helpers for
+            # common stack trace operations") in v5.2.
+            CODE="
+            #include <linux/stacktrace.h>
+            void conftest_stack_trace(void) {
+                stack_trace_save();
+                stack_trace_print();
+            }"
+
+            compile_check_conftest "$CODE" "NV_STACK_TRACE_PRESENT" "" "functions"
+        ;;
+
+        drm_unlocked_ioctl_flag_present)
+            # Determine if DRM_UNLOCKED IOCTL flag is present.
+            #
+            # DRM_UNLOCKED was removed by commit 2798ffcc1d6a ("drm: Remove
+            # locking for legacy ioctls and DRM_UNLOCKED") in v6.8.
+            #
+            # DRM_UNLOCKED definition was moved from drmP.h to drm_ioctl.h by
+            # commit 2640981f3600 ("drm: document drm_ioctl.[hc]") in v4.12.
+            CODE="
+            #if defined(NV_DRM_DRM_IOCTL_H_PRESENT)
+            #include <drm/drm_ioctl.h>
+            #endif
+            #if defined(NV_DRM_DRMP_H_PRESENT)
+            #include <drm/drmP.h>
+            #endif
+            int flags = DRM_UNLOCKED;"
+
+            compile_check_conftest "$CODE" "NV_DRM_UNLOCKED_IOCTL_FLAG_PRESENT" "" "types"
+        ;;
+
+        fault_flag_remote_present)
+            # Determine if FAULT_FLAG_REMOTE is present in the kernel, either
+            # as a define or an enum
+            #
+            # FAULT_FLAG_REMOTE define added by Kernel commit 1b2ee1266ea6
+            # ("mm/core: Do not enforce PKEY permissions on remote mm access")
+            # in v4.6
+            # FAULT_FLAG_REMOTE changed from define to enum by Kernel commit
+            # da2f5eb3d344 ("mm/doc: turn fault flags into an enum") in v5.13
+            # FAULT_FLAG_REMOTE moved from `mm.h` to `mm_types.h` by Kernel
+            # commit 36090def7bad ("mm: move tlb_flush_pending inline helpers
+            # to mm_inline.h") in v5.17
+            #
+            CODE="
+            #include <linux/mm.h>
+            int fault_flag_remote = FAULT_FLAG_REMOTE;
+            "
+
+            compile_check_conftest "$CODE" "NV_MM_HAS_FAULT_FLAG_REMOTE" "" "types"
+        ;;
+
+        drm_framebuffer_obj_present)
+            #
+            # Determine if the drm_framebuffer struct has an obj member.
+            #
+            # Added by commit 4c3dbb2c312c ("drm: Add GEM backed framebuffer
+            # library") in v4.14.
+            #
+            CODE="
+            #if defined(NV_DRM_DRMP_H_PRESENT)
+            #include <drm/drmP.h>
+            #endif
+
+            #if defined(NV_DRM_DRM_FRAMEBUFFER_H_PRESENT)
+            #include <drm/drm_framebuffer.h>
+            #endif
+
+            int conftest_drm_framebuffer_obj_present(void) {
+                return offsetof(struct drm_framebuffer, obj);
+            }"
+
+            compile_check_conftest "$CODE" "NV_DRM_FRAMEBUFFER_OBJ_PRESENT" "" "types"
+        ;;
+
+        drm_color_ctm_3x4_present)
+            # Determine if struct drm_color_ctm_3x4 is present.
+            #
+            # struct drm_color_ctm_3x4 was added by commit 6872a189be50
+            # ("drm/amd/display: Add 3x4 CTM support for plane CTM") in v6.8.
+            CODE="
+            #include <uapi/drm/drm_mode.h>
+            struct drm_color_ctm_3x4 ctm;"
+
+            compile_check_conftest "$CODE" "NV_DRM_COLOR_CTM_3X4_PRESENT" "" "types"
+        ;;
+
+        drm_color_lut)
+            # Determine if struct drm_color_lut is present.
+            #
+            # struct drm_color_lut was added by commit 5488dc16fde7
+            # ("drm: introduce pipe color correction properties") in v4.6.
+            CODE="
+            #include <uapi/drm/drm_mode.h>
+            struct drm_color_lut lut;"
+
+            compile_check_conftest "$CODE" "NV_DRM_COLOR_LUT_PRESENT" "" "types"
+        ;;
+
+        drm_property_blob_put)
+            #
+            # Determine if function drm_property_blob_put() is present.
+            #
+            # Added by commit 6472e5090be7 ("drm: Introduce
+            # drm_property_blob_{get,put}()") in v4.12, when it replaced
+            # drm_property_unreference_blob().
+            #
+
+            CODE="
+            #if defined(NV_DRM_DRM_PROPERTY_H_PRESENT)
+            #include <drm/drm_property.h>
+            #endif
+            void conftest_drm_property_blob_put(void) {
+                drm_property_blob_put();
+            }"
+
+            compile_check_conftest "$CODE" "NV_DRM_PROPERTY_BLOB_PUT_PRESENT" "" "functions"
+        ;;
+
        # When adding a new conftest entry, please use the correct format for
        # specifying the relevant upstream Linux kernel commit.  Please
        # avoid specifying -rc kernels, and only use SHAs that actually exist
@@ -6935,10 +7291,12 @@ case "$5" in
        #
        VERBOSE=$6
        iommu=CONFIG_VFIO_IOMMU_TYPE1
+        iommufd_vfio_container=CONFIG_IOMMUFD_VFIO_CONTAINER
        mdev=CONFIG_VFIO_MDEV
        kvm=CONFIG_KVM_VFIO
        vfio_pci_core=CONFIG_VFIO_PCI_CORE
        VFIO_IOMMU_PRESENT=0
+        VFIO_IOMMUFD_VFIO_CONTAINER_PRESENT=0
        VFIO_MDEV_PRESENT=0
        KVM_PRESENT=0
        VFIO_PCI_CORE_PRESENT=0
@@ -6948,6 +7306,10 @@ case "$5" in
                VFIO_IOMMU_PRESENT=1
            fi

+            if (test_configuration_option ${iommufd_vfio_container} || test_configuration_option ${iommufd_vfio_container}_MODULE); then
+                VFIO_IOMMUFD_VFIO_CONTAINER_PRESENT=1
+            fi
+
            if (test_configuration_option ${mdev} || test_configuration_option ${mdev}_MODULE); then
                VFIO_MDEV_PRESENT=1
            fi
@@ -6960,36 +7322,23 @@ case "$5" in
                VFIO_PCI_CORE_PRESENT=1
            fi

-            # When this sanity check is run via nvidia-installer, it sets ARCH as aarch64.
-            # But, when it is run via Kbuild, ARCH is set as arm64
-            if [ "$ARCH" = "aarch64" ]; then
-                ARCH="arm64"
-            fi
-
-            if [ "$VFIO_IOMMU_PRESENT" != "0" ] && [ "$KVM_PRESENT" != "0" ] ; then
-
-                # On x86_64, vGPU requires MDEV framework to be present.
-                # On aarch64, vGPU requires MDEV or vfio-pci-core framework to be present.
-                if ([ "$ARCH" = "arm64" ] && ([ "$VFIO_MDEV_PRESENT" != "0" ] || [ "$VFIO_PCI_CORE_PRESENT" != "0" ])) ||
-                   ([ "$ARCH" = "x86_64" ] && [ "$VFIO_MDEV_PRESENT" != "0" ];) then
+            if ([ "$VFIO_IOMMU_PRESENT" != "0" ] || [ "$VFIO_IOMMUFD_VFIO_CONTAINER_PRESENT" != "0" ]) && [ "$KVM_PRESENT" != "0" ] ; then
+                # vGPU requires either MDEV or vfio-pci-core framework to be present.
+                if [ "$VFIO_MDEV_PRESENT" != "0" ] || [ "$VFIO_PCI_CORE_PRESENT" != "0" ]; then
                    exit 0
                fi
            fi

            echo "Below CONFIG options are missing on the kernel for installing";
            echo "NVIDIA vGPU driver on KVM host";
-            if [ "$VFIO_IOMMU_PRESENT" = "0" ]; then
-                echo "CONFIG_VFIO_IOMMU_TYPE1";
+            if [ "$VFIO_IOMMU_PRESENT" = "0" ] && [ "$VFIO_IOMMUFD_VFIO_CONTAINER_PRESENT" = "0" ]; then
+                echo "either CONFIG_VFIO_IOMMU_TYPE1 or CONFIG_IOMMUFD_VFIO_CONTAINER";
            fi

-            if [ "$ARCH" = "arm64" ] && [ "$VFIO_MDEV_PRESENT" = "0" ] && [ "$VFIO_PCI_CORE_PRESENT" = "0" ]; then
+            if [ "$VFIO_MDEV_PRESENT" = "0" ] && [ "$VFIO_PCI_CORE_PRESENT" = "0" ]; then
                echo "either CONFIG_VFIO_MDEV or CONFIG_VFIO_PCI_CORE";
            fi

-            if [ "$ARCH" = "x86_64" ] && [ "$VFIO_MDEV_PRESENT" = "0" ]; then
-                echo "CONFIG_VFIO_MDEV";
-            fi
-
            if [ "$KVM_PRESENT" = "0" ]; then
                echo "CONFIG_KVM";
            fi
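The relaxed vGPU sanity check above reduces to a single boolean predicate: an IOMMU backend (TYPE1 or the IOMMUFD VFIO container), plus KVM VFIO support, plus either MDEV or vfio-pci-core, with no per-architecture special case. A minimal C sketch of that predicate follows. The struct `vgpu_kconfig` and the function `vgpu_kconfig_ok` are hypothetical names used only for illustration; the real check is the shell code in the hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the *_PRESENT shell variables (built-in or module). */
struct vgpu_kconfig {
    bool vfio_iommu_type1;        /* CONFIG_VFIO_IOMMU_TYPE1 */
    bool iommufd_vfio_container;  /* CONFIG_IOMMUFD_VFIO_CONTAINER */
    bool vfio_mdev;               /* CONFIG_VFIO_MDEV */
    bool vfio_pci_core;           /* CONFIG_VFIO_PCI_CORE */
    bool kvm_vfio;                /* CONFIG_KVM_VFIO */
};

/* Returns true when the combination would let the sanity check exit 0. */
bool vgpu_kconfig_ok(const struct vgpu_kconfig *c)
{
    assert(c != NULL);

    bool iommu_backend  = c->vfio_iommu_type1 || c->iommufd_vfio_container;
    bool vfio_framework = c->vfio_mdev || c->vfio_pci_core;

    return iommu_backend && c->kvm_vfio && vfio_framework;
}
```

Under this reading, a kernel with only `CONFIG_IOMMUFD_VFIO_CONTAINER`, `CONFIG_VFIO_PCI_CORE`, and `CONFIG_KVM_VFIO` passes on any architecture, which the removed x86_64 branch would have rejected for lack of MDEV.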
--- a/kernel-open/header-presence-tests.mk
+++ b/kernel-open/header-presence-tests.mk
@@ -0,0 +1,103 @@
+# Each of these headers is checked for presence with a test #include; a
+# corresponding #define will be generated in conftest/headers.h.
+NV_HEADER_PRESENCE_TESTS = \
+  asm/system.h \
+  drm/drmP.h \
+  drm/drm_aperture.h \
+  drm/drm_auth.h \
+  drm/drm_gem.h \
+  drm/drm_crtc.h \
+  drm/drm_color_mgmt.h \
+  drm/drm_atomic.h \
+  drm/drm_atomic_helper.h \
+  drm/drm_atomic_state_helper.h \
+  drm/drm_encoder.h \
+  drm/drm_atomic_uapi.h \
+  drm/drm_drv.h \
+  drm/drm_fbdev_generic.h \
+  drm/drm_framebuffer.h \
+  drm/drm_connector.h \
+  drm/drm_probe_helper.h \
+  drm/drm_blend.h \
+  drm/drm_fourcc.h \
+  drm/drm_prime.h \
+  drm/drm_plane.h \
+  drm/drm_vblank.h \
+  drm/drm_file.h \
+  drm/drm_ioctl.h \
+  drm/drm_device.h \
+  drm/drm_mode_config.h \
+  drm/drm_modeset_lock.h \
+  drm/drm_property.h \
+  dt-bindings/interconnect/tegra_icc_id.h \
+  generated/autoconf.h \
+  generated/compile.h \
+  generated/utsrelease.h \
+  linux/efi.h \
+  linux/kconfig.h \
+  linux/platform/tegra/mc_utils.h \
+  linux/printk.h \
+  linux/ratelimit.h \
+  linux/prio_tree.h \
+  linux/log2.h \
+  linux/of.h \
+  linux/bug.h \
+  linux/sched.h \
+  linux/sched/mm.h \
+  linux/sched/signal.h \
+  linux/sched/task.h \
+  linux/sched/task_stack.h \
+  xen/ioemu.h \
+  linux/fence.h \
+  linux/dma-fence.h \
+  linux/dma-resv.h \
+  soc/tegra/chip-id.h \
+  soc/tegra/fuse.h \
+  soc/tegra/fuse-helper.h \
+  soc/tegra/tegra_bpmp.h \
+  video/nv_internal.h \
+  linux/platform/tegra/dce/dce-client-ipc.h \
+  linux/nvhost.h \
+  linux/nvhost_t194.h \
+  linux/host1x-next.h \
+  asm/book3s/64/hash-64k.h \
+  asm/set_memory.h \
+  asm/prom.h \
+  asm/powernv.h \
+  linux/atomic.h \
+  asm/barrier.h \
+  asm/opal-api.h \
+  sound/hdaudio.h \
+  asm/pgtable_types.h \
+  asm/page.h \
+  linux/stringhash.h \
+  linux/dma-map-ops.h \
+  rdma/peer_mem.h \
+  sound/hda_codec.h \
+  linux/dma-buf.h \
+  linux/time.h \
+  linux/platform_device.h \
+  linux/mutex.h \
+  linux/reset.h \
+  linux/of_platform.h \
+  linux/of_device.h \
+  linux/of_gpio.h \
+  linux/gpio.h \
+  linux/gpio/consumer.h \
+  linux/interconnect.h \
+  linux/pm_runtime.h \
+  linux/clk.h \
+  linux/clk-provider.h \
+  linux/ioasid.h \
+  linux/stdarg.h \
+  linux/iosys-map.h \
+  asm/coco.h \
+  linux/vfio_pci_core.h \
+  linux/mdev.h \
+  soc/tegra/bpmp-abi.h \
+  soc/tegra/bpmp.h \
+  linux/sync_file.h \
+  linux/cc_platform.h \
+  asm/cpufeature.h \
+  linux/mpi.h
+
--- a/kernel-open/nvidia-drm/nv-kthread-q.c
+++ b/kernel-open/nvidia-drm/nv-kthread-q.c
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 2016 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2016-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -176,7 +176,7 @@ static struct task_struct *thread_create_on_node(int (*threadfn)(void *data),
 {

    unsigned i, j;
-    const static unsigned attempts = 3;
+    static const unsigned attempts = 3;
    struct task_struct *thread[3];

    for (i = 0;; i++) {
--- a/kernel-open/nvidia-drm/nv-pci-table.c
+++ b/kernel-open/nvidia-drm/nv-pci-table.c
@@ -25,6 +25,15 @@
 #include <linux/module.h>

 #include "nv-pci-table.h"
+#include "cpuopsys.h"
+
+#if defined(NV_BSD)
+/* Define the PCI vendor ID and classes that FreeBSD's linuxkpi is missing */
+#define PCI_VENDOR_ID_NVIDIA 0x10de
+#define PCI_CLASS_DISPLAY_VGA 0x0300
+#define PCI_CLASS_DISPLAY_3D 0x0302
+#define PCI_CLASS_BRIDGE_OTHER 0x0680
+#endif

 /* Devices supported by RM */
 struct pci_device_id nv_pci_table[] = {
@@ -48,7 +57,7 @@ struct pci_device_id nv_pci_table[] = {
 };

 /* Devices supported by all drivers in nvidia.ko */
-struct pci_device_id nv_module_device_table[] = {
+struct pci_device_id nv_module_device_table[4] = {
    {
        .vendor      = PCI_VENDOR_ID_NVIDIA,
        .device      = PCI_ANY_ID,
@@ -76,4 +85,6 @@ struct pci_device_id nv_module_device_table[] = {
    { }
 };

+#if defined(NV_LINUX)
 MODULE_DEVICE_TABLE(pci, nv_module_device_table);
+#endif
--- a/kernel-open/nvidia-drm/nv-pci-table.h
+++ b/kernel-open/nvidia-drm/nv-pci-table.h
@@ -27,5 +27,6 @@
 #include <linux/pci.h>

 extern struct pci_device_id nv_pci_table[];
+extern struct pci_device_id nv_module_device_table[4];

 #endif /* _NV_PCI_TABLE_H_ */
--- a/kernel-open/nvidia-drm/nvidia-drm-conftest.h
+++ b/kernel-open/nvidia-drm/nvidia-drm-conftest.h
@@ -24,6 +24,7 @@
 #define __NVIDIA_DRM_CONFTEST_H__

 #include "conftest.h"
+#include "nvtypes.h"

 /*
 * NOTE: This file is expected to get included at the top before including any
@@ -72,4 +73,121 @@
 #undef NV_DRM_COLOR_MGMT_AVAILABLE
 #endif

+/*
+ * Adapt to quirks in FreeBSD's Linux kernel compatibility layer.
+ */
+#if defined(NV_BSD)
+
+#include <linux/rwsem.h>
+#include <sys/param.h>
+#include <sys/lock.h>
+#include <sys/sx.h>
+
+/* For nv_drm_gem_prime_force_fence_signal */
+#ifndef spin_is_locked
+#define spin_is_locked(lock) mtx_owned(lock.m)
+#endif
+
+#ifndef rwsem_is_locked
+#define rwsem_is_locked(sem) (((sem)->sx.sx_lock & (SX_LOCK_SHARED)) \
+                              || ((sem)->sx.sx_lock & ~(SX_LOCK_FLAGMASK & ~SX_LOCK_SHARED)))
+#endif
+
+/*
+ * FreeBSD does not define vm_flags_t in its linuxkpi, since there is already
+ * a FreeBSD vm_flags_t (of a different size) and they don't want the names to
+ * collide. Temporarily redefine it when including nv-mm.h
+ */
+#define vm_flags_t unsigned long
+#include "nv-mm.h"
+#undef vm_flags_t
+
+/*
+ * sys/nv.h and nvidia/nv.h use the same header guard, so we need to
+ * clear it for nvlist_t to get loaded.
+ */
+#undef _NV_H_
+#include <sys/nv.h>
+
+/*
+ * For now, just use set_page_dirty, since the lock variant has not
+ * yet been ported to FreeBSD (in progress). This calls
+ * vm_page_dirty. Used in nv-mm.h.
+ */
+#define set_page_dirty_lock set_page_dirty
+
+/*
+ * FreeBSD does not implement drm_atomic_state_free, so simply
+ * default to drm_atomic_state_put.
+ */
+#define drm_atomic_state_free drm_atomic_state_put
+
+#if __FreeBSD_version < 1300000
+/* redefine LIST_HEAD_INIT to the linux version */
+#include <linux/list.h>
+#define LIST_HEAD_INIT(name) LINUX_LIST_HEAD_INIT(name)
+#endif
+
+/*
+ * FreeBSD currently has only vmf_insert_pfn_prot defined, and it has a
+ * static assert warning not to use it since all of DRM's usages are in
+ * loops with the vm obj lock(s) held. Instead we should use the lkpi
+ * function itself directly. For us, none of this applies, so we can just
+ * wrap it in our own definition of vmf_insert_pfn.
+ */
+#ifndef NV_VMF_INSERT_PFN_PRESENT
+#define NV_VMF_INSERT_PFN_PRESENT 1
+
+#if __FreeBSD_version < 1300000
+#define VM_SHARED       (1 << 17)
+
+/* Not present in 12.2 */
+static inline vm_fault_t
+lkpi_vmf_insert_pfn_prot_locked(struct vm_area_struct *vma, unsigned long addr,
+    unsigned long pfn, pgprot_t prot)
+{
+       vm_object_t vm_obj = vma->vm_obj;
+       vm_page_t page;
+       vm_pindex_t pindex;
+
+       VM_OBJECT_ASSERT_WLOCKED(vm_obj);
+       pindex = OFF_TO_IDX(addr - vma->vm_start);
+       if (vma->vm_pfn_count == 0)
+               vma->vm_pfn_first = pindex;
+       MPASS(pindex <= OFF_TO_IDX(vma->vm_end));
+
+       page = vm_page_grab(vm_obj, pindex, VM_ALLOC_NORMAL);
+       if (page == NULL) {
+               page = PHYS_TO_VM_PAGE(IDX_TO_OFF(pfn));
+               vm_page_xbusy(page);
+               if (vm_page_insert(page, vm_obj, pindex)) {
+                       vm_page_xunbusy(page);
+                       return (VM_FAULT_OOM);
+               }
+               page->valid = VM_PAGE_BITS_ALL;
+       }
+       pmap_page_set_memattr(page, pgprot2cachemode(prot));
+       vma->vm_pfn_count++;
+
+       return (VM_FAULT_NOPAGE);
+}
+#endif
+
+static inline vm_fault_t
+vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
+    unsigned long pfn)
+{
+       vm_fault_t ret;
+
+       VM_OBJECT_WLOCK(vma->vm_obj);
+       ret = lkpi_vmf_insert_pfn_prot_locked(vma, addr, pfn, vma->vm_page_prot);
+       VM_OBJECT_WUNLOCK(vma->vm_obj);
+
+       return (ret);
+}
+
+#endif
+
+#endif /* defined(NV_BSD) */
+
 #endif /* defined(__NVIDIA_DRM_CONFTEST_H__) */
--- a/kernel-open/nvidia-drm/nvidia-drm-crtc.c
+++ b/kernel-open/nvidia-drm/nvidia-drm-crtc.c
@@ -42,12 +42,6 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>

-#if defined(NV_LINUX_NVHOST_H_PRESENT) && defined(CONFIG_TEGRA_GRHOST)
-#include <linux/nvhost.h>
-#elif defined(NV_LINUX_HOST1X_NEXT_H_PRESENT)            
-#include <linux/host1x-next.h>
-#endif
-
 #if defined(NV_DRM_DRM_COLOR_MGMT_H_PRESENT)
 #include <drm/drm_color_mgmt.h>
 #endif
@@ -92,11 +86,22 @@ static void nv_drm_plane_destroy(struct drm_plane *plane)
    nv_drm_free(nv_plane);
 }

+static inline void
+plane_config_clear(struct NvKmsKapiLayerConfig *layerConfig)
+{
+    if (layerConfig == NULL) {
+        return;
+    }
+
+    memset(layerConfig, 0, sizeof(*layerConfig));
+    layerConfig->csc = NVKMS_IDENTITY_CSC_MATRIX;
+}
+
 static inline void
 plane_req_config_disable(struct NvKmsKapiLayerRequestedConfig *req_config)
 {
    /* Clear layer config */
-    memset(&req_config->config, 0, sizeof(req_config->config));
+    plane_config_clear(&req_config->config);

    /* Set flags to get cleared layer config applied */
    req_config->flags.surfaceChanged = NV_TRUE;
@@ -113,6 +118,45 @@ cursor_req_config_disable(struct NvKmsKapiCursorRequestedConfig *req_config)
    req_config->flags.surfaceChanged = NV_TRUE;
 }

+#if defined(NV_DRM_COLOR_MGMT_AVAILABLE)
+static void color_mgmt_config_ctm_to_csc(struct NvKmsCscMatrix *nvkms_csc,
+                                         struct drm_color_ctm  *drm_ctm)
+{
+    int y;
+
+    /* CTM is a 3x3 matrix while ours is 3x4. Zero out the last column. */
+    nvkms_csc->m[0][3] = nvkms_csc->m[1][3] = nvkms_csc->m[2][3] = 0;
+
+    for (y = 0; y < 3; y++) {
+        int x;
+
+        for (x = 0; x < 3; x++) {
+            /*
+             * Values in the CTM are encoded in S31.32 sign-magnitude fixed-
+             * point format, while NvKms CSC values are signed 2's-complement
+             * S15.16 (Ssign-extend12-3.16?) fixed-point format.
+             */
+            NvU64 ctmVal = drm_ctm->matrix[y*3 + x];
+            NvU64 signBit = ctmVal & (1ULL << 63);
+            NvU64 magnitude = ctmVal & ~signBit;
+
+            /*
+             * Drop the low 16 bits of the fractional part and the high 17 bits
+             * of the integral part. Drop 17 bits to avoid corner cases where
+             * the highest resulting bit is a 1, causing the `cscVal = -cscVal`
+             * line to result in a positive number.
+             */
+            NvS32 cscVal = (magnitude >> 16) & ((1ULL << 31) - 1);
+            if (signBit) {
+                cscVal = -cscVal;
+            }
+
+            nvkms_csc->m[y][x] = cscVal;
+        }
+    }
+}
+#endif /* NV_DRM_COLOR_MGMT_AVAILABLE */
+
 static void
 cursor_plane_req_config_update(struct drm_plane *plane,
                               struct drm_plane_state *plane_state,
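The per-coefficient arithmetic in `color_mgmt_config_ctm_to_csc` (moved earlier in the file by the hunk above) can be exercised in isolation. The sketch below is a userspace restatement under stated assumptions: `ctm_to_csc_value` is a hypothetical name, and `uint64_t`/`int32_t` stand in for `NvU64`/`NvS32`. It takes one S31.32 sign-magnitude value and returns the S15.16 two's-complement result.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert one CTM coefficient from S31.32 sign-magnitude to S15.16
 * two's complement, following the same steps as the driver code.
 */
int32_t ctm_to_csc_value(uint64_t ctm_val)
{
    uint64_t sign_bit  = ctm_val & (1ULL << 63);
    uint64_t magnitude = ctm_val & ~sign_bit;

    /*
     * Shift out the low 16 fractional bits, then keep only the low 31 bits
     * so the intermediate value is never negative and negation is safe.
     */
    int32_t csc_val = (int32_t)((magnitude >> 16) & ((1ULL << 31) - 1));
    assert(csc_val >= 0);

    return sign_bit ? -csc_val : csc_val;
}
```

For instance, 1.0 is encoded as `1ULL << 32` in S31.32 and comes out as `0x10000` in S15.16; setting bit 63 on the same input yields `-0x10000`.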
@@ -126,12 +170,10 @@ cursor_plane_req_config_update(struct drm_plane *plane,
        return;
    }

-    *req_config = (struct NvKmsKapiCursorRequestedConfig) {
-        .surface = to_nv_framebuffer(plane_state->fb)->pSurface,
-
-        .dstX = plane_state->crtc_x,
-        .dstY = plane_state->crtc_y,
-    };
+    memset(req_config, 0, sizeof(*req_config));
+    req_config->surface = to_nv_framebuffer(plane_state->fb)->pSurface;
+    req_config->dstX = plane_state->crtc_x;
+    req_config->dstY = plane_state->crtc_y;

 #if defined(NV_DRM_ALPHA_BLENDING_AVAILABLE)
    if (plane->blend_mode_property != NULL && plane->alpha_property != NULL) {
@@ -216,7 +258,6 @@ plane_req_config_update(struct drm_plane *plane,
 {
    struct nv_drm_plane *nv_plane = to_nv_plane(plane);
    struct NvKmsKapiLayerConfig old_config = req_config->config;
-    struct nv_drm_device *nv_dev = to_nv_device(plane->dev);
    struct nv_drm_plane_state *nv_drm_plane_state =
        to_nv_drm_plane_state(plane_state);

@@ -225,22 +266,22 @@ plane_req_config_update(struct drm_plane *plane,
        return 0;
    }

-    *req_config = (struct NvKmsKapiLayerRequestedConfig) {
-        .config = {
-            .surface = to_nv_framebuffer(plane_state->fb)->pSurface,
+    memset(req_config, 0, sizeof(*req_config));

-            /* Source values are 16.16 fixed point */
-            .srcX = plane_state->src_x >> 16,
-            .srcY = plane_state->src_y >> 16,
-            .srcWidth  = plane_state->src_w >> 16,
-            .srcHeight = plane_state->src_h >> 16,
+    req_config->config.surface = to_nv_framebuffer(plane_state->fb)->pSurface;

-            .dstX = plane_state->crtc_x,
-            .dstY = plane_state->crtc_y,
-            .dstWidth  = plane_state->crtc_w,
-            .dstHeight = plane_state->crtc_h,
-        },
-    };
+    /* Source values are 16.16 fixed point */
+    req_config->config.srcX = plane_state->src_x >> 16;
+    req_config->config.srcY = plane_state->src_y >> 16;
+    req_config->config.srcWidth  = plane_state->src_w >> 16;
+    req_config->config.srcHeight = plane_state->src_h >> 16;
+
+    req_config->config.dstX = plane_state->crtc_x;
+    req_config->config.dstY = plane_state->crtc_y;
+    req_config->config.dstWidth  = plane_state->crtc_w;
+    req_config->config.dstHeight = plane_state->crtc_h;
+
+    req_config->config.csc = old_config.csc;

 #if defined(NV_DRM_ROTATION_AVAILABLE)
    /*
@@ -344,49 +385,16 @@ plane_req_config_update(struct drm_plane *plane,
    req_config->config.inputColorSpace =
        nv_drm_plane_state->input_colorspace;

-    req_config->config.syncptParams.preSyncptSpecified = false;
-    req_config->config.syncptParams.postSyncptRequested = false;
+    req_config->config.syncParams.preSyncptSpecified = false;
+    req_config->config.syncParams.postSyncptRequested = false;
+    req_config->config.syncParams.semaphoreSpecified = false;

-    if (plane_state->fence != NULL || nv_drm_plane_state->fd_user_ptr) {
-        if (!nv_dev->supportsSyncpts) {
+    if (nv_drm_plane_state->fd_user_ptr) {
+        if (to_nv_device(plane->dev)->supportsSyncpts) {
+            req_config->config.syncParams.postSyncptRequested = true;
+        } else {
            return -1;
        }
-
-#if defined(NV_LINUX_NVHOST_H_PRESENT) && defined(CONFIG_TEGRA_GRHOST)
-#if defined(NV_NVHOST_DMA_FENCE_UNPACK_PRESENT)
-        if (plane_state->fence != NULL) {
-            int ret = nvhost_dma_fence_unpack(
-                          plane_state->fence,
-                          &req_config->config.syncptParams.preSyncptId,
-                          &req_config->config.syncptParams.preSyncptValue);
-            if (ret != 0) {
-                return ret;
-            }
-            req_config->config.syncptParams.preSyncptSpecified = true;
-        }
-#endif
-
-        if (nv_drm_plane_state->fd_user_ptr) {
-            req_config->config.syncptParams.postSyncptRequested = true;
-        }           
-#elif defined(NV_LINUX_HOST1X_NEXT_H_PRESENT)            
-        if (plane_state->fence != NULL) {            
-            int ret = host1x_fence_extract(            
-                      plane_state->fence,            
-                      &req_config->config.syncptParams.preSyncptId,            
-                      &req_config->config.syncptParams.preSyncptValue);            
-            if (ret != 0) {            
-                return ret;            
-            }            
-            req_config->config.syncptParams.preSyncptSpecified = true;            
-        }            
-
-        if (nv_drm_plane_state->fd_user_ptr) {            
-            req_config->config.syncptParams.postSyncptRequested = true;            
-        }
-#else
-        return -1;
-#endif
    }

 #if defined(NV_DRM_HAS_HDR_OUTPUT_METADATA)
@@ -578,6 +586,24 @@ static int nv_drm_plane_atomic_check(struct drm_plane *plane,
                return ret;
            }

+#if defined(NV_DRM_COLOR_MGMT_AVAILABLE)
+            if (crtc_state->color_mgmt_changed) {
+                /*
+                 * According to the comment in the Linux kernel's
+                 * drivers/gpu/drm/drm_color_mgmt.c, if this property is NULL,
+                 * the CTM needs to be changed to the identity matrix
+                 */
+                if (crtc_state->ctm) {
+                    color_mgmt_config_ctm_to_csc(&plane_requested_config->config.csc,
+                                                 (struct drm_color_ctm *)crtc_state->ctm->data);
+                } else {
+                    plane_requested_config->config.csc = NVKMS_IDENTITY_CSC_MATRIX;
+                }
+                plane_requested_config->config.cscUseMain = NV_FALSE;
+                plane_requested_config->flags.cscChanged = NV_TRUE;
+            }
+#endif /* NV_DRM_COLOR_MGMT_AVAILABLE */
+
            if (__is_async_flip_requested(plane, crtc_state)) {
                /*
                 * Async flip requests that the flip happen 'as soon as
@@ -618,9 +644,7 @@ static int nv_drm_plane_atomic_set_property(
        to_nv_drm_plane_state(state);

    if (property == nv_dev->nv_out_fence_property) {
-#if defined(NV_LINUX_NVHOST_H_PRESENT) && defined(CONFIG_TEGRA_GRHOST)
-        nv_drm_plane_state->fd_user_ptr = u64_to_user_ptr(val);
-#endif
+        nv_drm_plane_state->fd_user_ptr = (void __user *)(uintptr_t)(val);
        return 0;
    } else if (property == nv_dev->nv_input_colorspace_property) {
        nv_drm_plane_state->input_colorspace = val;
@@ -668,6 +692,38 @@ static int nv_drm_plane_atomic_get_property(
    return -EINVAL;
 }

+/**
+ * nv_drm_plane_atomic_reset - plane state reset hook
+ * @plane: DRM plane
+ *
+ * Allocate an empty DRM plane state.
+ */
+static void nv_drm_plane_atomic_reset(struct drm_plane *plane)
+{
+    struct nv_drm_plane_state *nv_plane_state =
+        nv_drm_calloc(1, sizeof(*nv_plane_state));
+
+    if (!nv_plane_state) {
+        return;
+    }
+
+    drm_atomic_helper_plane_reset(plane);
+
+    /*
+     * The drm atomic helper function allocates a state object that is the wrong
+     * size. Copy its contents into the one we allocated above and replace the
+     * pointer.
+     */
+    if (plane->state) {
+        nv_plane_state->base = *plane->state;
+        kfree(plane->state);
+        plane->state = &nv_plane_state->base;
+    } else {
+        kfree(nv_plane_state);
+    }
+}
+
+
 static struct drm_plane_state *
 nv_drm_plane_atomic_duplicate_state(struct drm_plane *plane)
 {
@@ -727,7 +783,7 @@ static const struct drm_plane_funcs nv_plane_funcs = {
    .update_plane           = drm_atomic_helper_update_plane,
    .disable_plane          = drm_atomic_helper_disable_plane,
    .destroy                = nv_drm_plane_destroy,
-    .reset                  = drm_atomic_helper_plane_reset,
+    .reset                  = nv_drm_plane_atomic_reset,
    .atomic_get_property    = nv_drm_plane_atomic_get_property,
    .atomic_set_property    = nv_drm_plane_atomic_set_property,
    .atomic_duplicate_state = nv_drm_plane_atomic_duplicate_state,
@@ -761,7 +817,7 @@ __nv_drm_atomic_helper_crtc_destroy_state(struct drm_crtc *crtc,
 #endif
 }

-static inline void nv_drm_crtc_duplicate_req_head_modeset_config(
+static inline bool nv_drm_crtc_duplicate_req_head_modeset_config(
    const struct NvKmsKapiHeadRequestedConfig *old,
    struct NvKmsKapiHeadRequestedConfig *new)
 {
@@ -773,14 +829,86 @@ static inline void nv_drm_crtc_duplicate_req_head_modeset_config(
     * there is no change in new configuration yet with respect
     * to older one!
     */
-    *new = (struct NvKmsKapiHeadRequestedConfig) {
-        .modeSetConfig = old->modeSetConfig,
-    };
+    memset(new, 0, sizeof(*new));
+    new->modeSetConfig = old->modeSetConfig;

    for (i = 0; i < ARRAY_SIZE(old->layerRequestedConfig); i++) {
-        new->layerRequestedConfig[i] = (struct NvKmsKapiLayerRequestedConfig) {
-            .config = old->layerRequestedConfig[i].config,
-        };
+        new->layerRequestedConfig[i].config =
+            old->layerRequestedConfig[i].config;
+    }
+
+    if (old->modeSetConfig.lut.input.pRamps) {
+        new->modeSetConfig.lut.input.pRamps =
+            nv_drm_calloc(1, sizeof(*new->modeSetConfig.lut.input.pRamps));
+
+        if (!new->modeSetConfig.lut.input.pRamps) {
+            return false;
+        }
+        *new->modeSetConfig.lut.input.pRamps =
+            *old->modeSetConfig.lut.input.pRamps;
+    }
+    if (old->modeSetConfig.lut.output.pRamps) {
+        new->modeSetConfig.lut.output.pRamps =
+            nv_drm_calloc(1, sizeof(*new->modeSetConfig.lut.output.pRamps));
+
+        if (!new->modeSetConfig.lut.output.pRamps) {
+            /*
+             * new->modeSetConfig.lut.input.pRamps is either NULL or it was
+             * just allocated
+             */
+            nv_drm_free(new->modeSetConfig.lut.input.pRamps);
+            new->modeSetConfig.lut.input.pRamps = NULL;
+            return false;
+        }
+        *new->modeSetConfig.lut.output.pRamps =
+            *old->modeSetConfig.lut.output.pRamps;
+    }
+    return true;
+}
+
+static inline struct nv_drm_crtc_state *nv_drm_crtc_state_alloc(void)
+{
+    struct nv_drm_crtc_state *nv_state = nv_drm_calloc(1, sizeof(*nv_state));
+    int i;
+
+    if (nv_state == NULL) {
+        return NULL;
+    }
+
+    for (i = 0; i < ARRAY_SIZE(nv_state->req_config.layerRequestedConfig); i++) {
+        plane_config_clear(&nv_state->req_config.layerRequestedConfig[i].config);
+    }
+    return nv_state;
+}
+
+
+/**
+ * nv_drm_atomic_crtc_reset - crtc state reset hook
+ * @crtc: DRM crtc
+ *
+ * Allocate an empty DRM crtc state.
+ */
+static void nv_drm_atomic_crtc_reset(struct drm_crtc *crtc)
+{
+    struct nv_drm_crtc_state *nv_state = nv_drm_crtc_state_alloc();
+
+    if (!nv_state) {
+        return;
+    }
+
+    drm_atomic_helper_crtc_reset(crtc);
+
+    /*
+     * The drm atomic helper function allocates a state object that is the wrong
+     * size. Copy its contents into the one we allocated above and replace the
+     * pointer.
+     */
+    if (crtc->state) {
+        nv_state->base = *crtc->state;
+        kfree(crtc->state);
+        crtc->state = &nv_state->base;
+    } else {
+        kfree(nv_state);
    }
 }
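Both new reset hooks (`nv_drm_plane_atomic_reset` and `nv_drm_atomic_crtc_reset`) follow the same pattern: allocate the larger driver state first, let the generic helper build a base-sized state, then copy that base into the driver state and swap the pointer. The following userspace sketch illustrates the pattern only. Every type and function name in it is a hypothetical stand-in, `calloc`/`free` replace `nv_drm_calloc`/`kfree`, and `helper_reset` is a simplified imitation of the DRM helper, not its actual implementation.

```c
#include <assert.h>
#include <stdlib.h>

struct base_state { int zpos; };       /* stands in for drm_plane_state */

struct wrapped_state {                 /* stands in for nv_drm_plane_state */
    struct base_state base;            /* first member, so pointers convert */
    int driver_private;
};

struct object { struct base_state *state; };  /* stands in for drm_plane */

/* Simplified stand-in for the generic helper: allocates a base-sized state. */
void helper_reset(struct object *obj)
{
    free(obj->state);
    obj->state = calloc(1, sizeof(*obj->state));
    if (obj->state) {
        obj->state->zpos = 7;          /* arbitrary default for the demo */
    }
}

/* Driver-style reset: ends with obj->state pointing into a wrapped_state. */
void wrapped_reset(struct object *obj)
{
    assert(obj != NULL);

    struct wrapped_state *ws = calloc(1, sizeof(*ws));
    if (!ws) {
        return;
    }

    helper_reset(obj);

    if (obj->state) {
        ws->base = *obj->state;        /* keep the defaults the helper set */
        free(obj->state);              /* drop the undersized allocation */
        obj->state = &ws->base;
    } else {
        free(ws);                      /* helper failed; nothing to wrap */
    }
}
```

Allocating the wrapper before calling the helper means an allocation failure leaves the object untouched, which matches the early `return` in the hunk.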

@@ -795,7 +923,7 @@ static inline void nv_drm_crtc_duplicate_req_head_modeset_config(
 static struct drm_crtc_state*
 nv_drm_atomic_crtc_duplicate_state(struct drm_crtc *crtc)
 {
-    struct nv_drm_crtc_state *nv_state = nv_drm_calloc(1, sizeof(*nv_state));
+    struct nv_drm_crtc_state *nv_state = nv_drm_crtc_state_alloc();

    if (nv_state == NULL) {
        return NULL;
@@ -807,17 +935,24 @@ nv_drm_atomic_crtc_duplicate_state(struct drm_crtc *crtc)
        return NULL;
    }

-    __drm_atomic_helper_crtc_duplicate_state(crtc, &nv_state->base);
-
    INIT_LIST_HEAD(&nv_state->nv_flip->list_entry);
    INIT_LIST_HEAD(&nv_state->nv_flip->deferred_flip_list);

-    nv_drm_crtc_duplicate_req_head_modeset_config(
-        &(to_nv_crtc_state(crtc->state)->req_config),
-        &nv_state->req_config);
+    /*
+     * nv_drm_crtc_duplicate_req_head_modeset_config potentially allocates
+     * nv_state->req_config.modeSetConfig.lut.{in,out}put.pRamps, so they should
+     * be freed in any following failure paths.
+     */
+    if (!nv_drm_crtc_duplicate_req_head_modeset_config(
+             &(to_nv_crtc_state(crtc->state)->req_config),
+             &nv_state->req_config)) {

-    nv_state->ilut_ramps = NULL;
-    nv_state->olut_ramps = NULL;
+        nv_drm_free(nv_state->nv_flip);
+        nv_drm_free(nv_state);
+        return NULL;
+    }
+
+    __drm_atomic_helper_crtc_duplicate_state(crtc, &nv_state->base);

    return &nv_state->base;
 }
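With this change, `nv_drm_crtc_duplicate_req_head_modeset_config` deep-copies the LUT ramps instead of sharing pointers, and rolls back the first allocation if the second one fails. A minimal sketch of that copy-with-rollback shape is shown below. `struct ramps`, `struct lut_cfg`, and `lut_cfg_duplicate` are hypothetical simplifications, and `calloc`/`free` stand in for `nv_drm_calloc`/`nv_drm_free`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct ramps { unsigned short red[4]; };   /* tiny stand-in for the LUT ramps */

struct lut_cfg {
    struct ramps *input;                   /* NULL when no input LUT is set */
    struct ramps *output;                  /* NULL when no output LUT is set */
};

/* Deep-copy old into dst; on failure dst owns nothing and false is returned. */
bool lut_cfg_duplicate(const struct lut_cfg *old, struct lut_cfg *dst)
{
    assert(old != NULL && dst != NULL);

    dst->input = NULL;
    dst->output = NULL;

    if (old->input) {
        dst->input = calloc(1, sizeof(*dst->input));
        if (!dst->input) {
            return false;
        }
        *dst->input = *old->input;
    }

    if (old->output) {
        dst->output = calloc(1, sizeof(*dst->output));
        if (!dst->output) {
            /* input is either NULL or was just allocated; free(NULL) is a no-op */
            free(dst->input);
            dst->input = NULL;
            return false;
        }
        *dst->output = *old->output;
    }

    return true;
}
```

Because each duplicated state owns its own copies, the destroy path can free both pointers unconditionally, as `nv_drm_atomic_crtc_destroy_state` does in the following hunk.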
@@ -842,8 +977,8 @@ static void nv_drm_atomic_crtc_destroy_state(struct drm_crtc *crtc,

    __nv_drm_atomic_helper_crtc_destroy_state(crtc, &nv_state->base);

-    nv_drm_free(nv_state->ilut_ramps);
-    nv_drm_free(nv_state->olut_ramps);
+    nv_drm_free(nv_state->req_config.modeSetConfig.lut.input.pRamps);
+    nv_drm_free(nv_state->req_config.modeSetConfig.lut.output.pRamps);

    nv_drm_free(nv_state);
 }
@@ -851,7 +986,7 @@ static void nv_drm_atomic_crtc_destroy_state(struct drm_crtc *crtc,
 static struct drm_crtc_funcs nv_crtc_funcs = {
    .set_config             = drm_atomic_helper_set_config,
    .page_flip              = drm_atomic_helper_page_flip,
-    .reset                  = drm_atomic_helper_crtc_reset,
+    .reset                  = nv_drm_atomic_crtc_reset,
    .destroy                = nv_drm_crtc_destroy,
    .atomic_duplicate_state = nv_drm_atomic_crtc_duplicate_state,
    .atomic_destroy_state   = nv_drm_atomic_crtc_destroy_state,
@@ -914,172 +1049,94 @@ static int color_mgmt_config_copy_lut(struct NvKmsLutRamps *nvkms_lut,
    return 0;
 }

-static void color_mgmt_config_ctm_to_csc(struct NvKmsCscMatrix *nvkms_csc,
-                                         struct drm_color_ctm  *drm_ctm)
-{
-    int y;
-
-    /* CTM is a 3x3 matrix while ours is 3x4. Zero out the last column. */
-    nvkms_csc->m[0][3] = nvkms_csc->m[1][3] = nvkms_csc->m[2][3] = 0;
-
-    for (y = 0; y < 3; y++) {
-        int x;
-
-        for (x = 0; x < 3; x++) {
-            /*
-             * Values in the CTM are encoded in S31.32 sign-magnitude fixed-
-             * point format, while NvKms CSC values are signed 2's-complement
-             * S15.16 (Ssign-extend12-3.16?) fixed-point format.
-             */
-            NvU64 ctmVal = drm_ctm->matrix[y*3 + x];
-            NvU64 signBit = ctmVal & (1ULL << 63);
-            NvU64 magnitude = ctmVal & ~signBit;
-
-            /*
-             * Drop the low 16 bits of the fractional part and the high 17 bits
-             * of the integral part. Drop 17 bits to avoid corner cases where
-             * the highest resulting bit is a 1, causing the `cscVal = -cscVal`
-             * line to result in a positive number.
-             */
-            NvS32 cscVal = (magnitude >> 16) & ((1ULL << 31) - 1);
-            if (signBit) {
-                cscVal = -cscVal;
-            }
-
-            nvkms_csc->m[y][x] = cscVal;
-        }
-    }
-}
-
-static int color_mgmt_config_set(struct nv_drm_crtc_state *nv_crtc_state,
-                                 struct NvKmsKapiHeadRequestedConfig *req_config)
+static int color_mgmt_config_set_luts(struct nv_drm_crtc_state *nv_crtc_state,
+                                      struct NvKmsKapiHeadRequestedConfig *req_config)
 {
    struct NvKmsKapiHeadModeSetConfig *modeset_config =
        &req_config->modeSetConfig;
    struct drm_crtc_state *crtc_state = &nv_crtc_state->base;
    int ret = 0;

-    struct drm_color_lut *degamma_lut = NULL;
-    struct drm_color_ctm *ctm = NULL;
-    struct drm_color_lut *gamma_lut = NULL;
-    uint64_t degamma_len = 0;
-    uint64_t gamma_len = 0;
-
-    int i;
-    struct drm_plane *plane;
-    struct drm_plane_state *plane_state;
-
    /*
     * According to the comment in the Linux kernel's
-     * drivers/gpu/drm/drm_color_mgmt.c, if any of these properties are NULL,
-     * that LUT or CTM needs to be changed to a linear LUT or identity matrix
-     * respectively.
+     * drivers/gpu/drm/drm_color_mgmt.c, if either property is NULL, that LUT
+     * needs to be changed to a linear LUT.
+     *
+     * On failure, any LUT ramps allocated in this function are freed when the
+     * subsequent atomic state cleanup calls nv_drm_atomic_crtc_destroy_state.
     */

-    req_config->flags.lutChanged = NV_TRUE;
    if (crtc_state->degamma_lut) {
-        nv_crtc_state->ilut_ramps = nv_drm_calloc(1, sizeof(*nv_crtc_state->ilut_ramps));
-        if (!nv_crtc_state->ilut_ramps) {
-            ret = -ENOMEM;
-            goto fail;
+        struct drm_color_lut *degamma_lut = NULL;
+        uint64_t degamma_len = 0;
+
+        if (!modeset_config->lut.input.pRamps) {
+            modeset_config->lut.input.pRamps =
+                nv_drm_calloc(1, sizeof(*modeset_config->lut.input.pRamps));
+            if (!modeset_config->lut.input.pRamps) {
+                return -ENOMEM;
+            }
        }

        degamma_lut = (struct drm_color_lut *)crtc_state->degamma_lut->data;
        degamma_len = crtc_state->degamma_lut->length /
                      sizeof(struct drm_color_lut);

-        if ((ret = color_mgmt_config_copy_lut(nv_crtc_state->ilut_ramps,
+        if ((ret = color_mgmt_config_copy_lut(modeset_config->lut.input.pRamps,
                                              degamma_lut,
                                              degamma_len)) != 0) {
-            goto fail;
+            return ret;
        }

-        modeset_config->lut.input.specified = NV_TRUE;
        modeset_config->lut.input.depth     = 30; /* specify the full LUT */
        modeset_config->lut.input.start     = 0;
        modeset_config->lut.input.end       = degamma_len - 1;
-        modeset_config->lut.input.pRamps    = nv_crtc_state->ilut_ramps;
    } else {
        /* setting input.end to 0 is equivalent to disabling the LUT, which
         * should be equivalent to a linear LUT */
-        modeset_config->lut.input.specified = NV_TRUE;
        modeset_config->lut.input.depth     = 30; /* specify the full LUT */
        modeset_config->lut.input.start     = 0;
        modeset_config->lut.input.end       = 0;
+
+        nv_drm_free(modeset_config->lut.input.pRamps);
        modeset_config->lut.input.pRamps    = NULL;
    }
-
-    nv_drm_for_each_new_plane_in_state(crtc_state->state, plane,
-                                       plane_state, i) {
-        struct nv_drm_plane *nv_plane = to_nv_plane(plane);
-        uint32_t layer = nv_plane->layer_idx;
-        struct NvKmsKapiLayerRequestedConfig *layer_config;
-
-        if (layer == NVKMS_KAPI_LAYER_INVALID_IDX || plane_state->crtc != crtc_state->crtc) {
-            continue;
-        }
-        layer_config = &req_config->layerRequestedConfig[layer];
-
-        if (layer == NVKMS_KAPI_LAYER_PRIMARY_IDX && crtc_state->ctm) {
-            ctm = (struct drm_color_ctm *)crtc_state->ctm->data;
-
-            color_mgmt_config_ctm_to_csc(&layer_config->config.csc, ctm);
-            layer_config->config.cscUseMain = NV_FALSE;
-        } else {
-            /* When crtc_state->ctm is unset, this also sets the main layer to
-             * the identity matrix.
-             */
-            layer_config->config.csc = NVKMS_IDENTITY_CSC_MATRIX;
-        }
-        layer_config->flags.cscChanged = NV_TRUE;
-    }
+    req_config->flags.ilutChanged = NV_TRUE;

    if (crtc_state->gamma_lut) {
-        nv_crtc_state->olut_ramps = nv_drm_calloc(1, sizeof(*nv_crtc_state->olut_ramps));
-        if (!nv_crtc_state->olut_ramps) {
-            ret = -ENOMEM;
-            goto fail;
+        struct drm_color_lut *gamma_lut = NULL;
+        uint64_t gamma_len = 0;
+
+        if (!modeset_config->lut.output.pRamps) {
+            modeset_config->lut.output.pRamps =
+                nv_drm_calloc(1, sizeof(*modeset_config->lut.output.pRamps));
+            if (!modeset_config->lut.output.pRamps) {
+                return -ENOMEM;
+            }
        }

        gamma_lut = (struct drm_color_lut *)crtc_state->gamma_lut->data;
        gamma_len = crtc_state->gamma_lut->length /
                    sizeof(struct drm_color_lut);

-        if ((ret = color_mgmt_config_copy_lut(nv_crtc_state->olut_ramps,
+        if ((ret = color_mgmt_config_copy_lut(modeset_config->lut.output.pRamps,
                                              gamma_lut,
                                              gamma_len)) != 0) {
-            goto fail;
+            return ret;
        }

-        modeset_config->lut.output.specified = NV_TRUE;
        modeset_config->lut.output.enabled   = NV_TRUE;
-        modeset_config->lut.output.pRamps    = nv_crtc_state->olut_ramps;
    } else {
        /* disabling the output LUT should be equivalent to setting a linear
         * LUT */
-        modeset_config->lut.output.specified = NV_TRUE;
        modeset_config->lut.output.enabled   = NV_FALSE;
+
+        nv_drm_free(modeset_config->lut.output.pRamps);
        modeset_config->lut.output.pRamps    = NULL;
    }
+    req_config->flags.olutChanged = NV_TRUE;

    return 0;
-
-fail:
-    /* free allocated state */
-    nv_drm_free(nv_crtc_state->ilut_ramps);
-    nv_drm_free(nv_crtc_state->olut_ramps);
-
-    /* remove dangling pointers */
-    nv_crtc_state->ilut_ramps = NULL;
-    nv_crtc_state->olut_ramps = NULL;
-    modeset_config->lut.input.pRamps = NULL;
-    modeset_config->lut.output.pRamps = NULL;
-
-    /* prevent attempts at reading NULLs */
-    modeset_config->lut.input.specified = NV_FALSE;
-    modeset_config->lut.output.specified = NV_FALSE;
-
-    return ret;
 }
 #endif /* NV_DRM_COLOR_MGMT_AVAILABLE */

@@ -1104,9 +1161,6 @@ static int nv_drm_crtc_atomic_check(struct drm_crtc *crtc,
    struct NvKmsKapiHeadRequestedConfig *req_config =
        &nv_crtc_state->req_config;
    int ret = 0;
-#if defined(NV_DRM_COLOR_MGMT_AVAILABLE)
-    struct nv_drm_device *nv_dev = to_nv_device(crtc_state->crtc->dev);
-#endif

    if (crtc_state->mode_changed) {
        drm_mode_to_nvkms_display_mode(&crtc_state->mode,
@@ -1150,15 +1204,8 @@ static int nv_drm_crtc_atomic_check(struct drm_crtc *crtc,
 #endif

 #if defined(NV_DRM_COLOR_MGMT_AVAILABLE)
-    if (nv_dev->drmMasterChangedSinceLastAtomicCommit &&
-        (crtc_state->degamma_lut ||
-         crtc_state->ctm ||
-         crtc_state->gamma_lut)) {
-
-        crtc_state->color_mgmt_changed = NV_TRUE;
-    }
    if (crtc_state->color_mgmt_changed) {
-        if ((ret = color_mgmt_config_set(nv_crtc_state, req_config)) != 0) {
+        if ((ret = color_mgmt_config_set_luts(nv_crtc_state, req_config)) != 0) {
            return ret;
        }
    }
@@ -1182,7 +1229,7 @@ static const struct drm_crtc_helper_funcs nv_crtc_helper_funcs = {

 static void nv_drm_plane_install_properties(
    struct drm_plane *plane,
-    NvBool supportsHDR)
+    NvBool supportsICtCp)
 {
    struct nv_drm_device *nv_dev = to_nv_device(plane->dev);

@@ -1198,7 +1245,7 @@ static void nv_drm_plane_install_properties(
    }

 #if defined(NV_DRM_HAS_HDR_OUTPUT_METADATA)
-    if (supportsHDR && nv_dev->nv_hdr_output_metadata_property) {
+    if (supportsICtCp && nv_dev->nv_hdr_output_metadata_property) {
        drm_object_attach_property(
            &plane->base, nv_dev->nv_hdr_output_metadata_property, 0);
    }
@@ -1384,7 +1431,7 @@ nv_drm_plane_create(struct drm_device *dev,
    if (plane_type != DRM_PLANE_TYPE_CURSOR) {
        nv_drm_plane_install_properties(
                plane,
-                pResInfo->supportsHDR[layer_idx]);
+                pResInfo->supportsICtCp[layer_idx]);
    }

    __nv_drm_plane_create_alpha_blending_properties(
@@ -1428,7 +1475,7 @@ static struct drm_crtc *__nv_drm_crtc_create(struct nv_drm_device *nv_dev,
        goto failed;
    }

-    nv_state = nv_drm_calloc(1, sizeof(*nv_state));
+    nv_state = nv_drm_crtc_state_alloc();
    if (nv_state == NULL) {
        goto failed_state_alloc;
    }
@@ -1607,7 +1654,7 @@ int nv_drm_get_crtc_crc32_v2_ioctl(struct drm_device *dev,
    struct NvKmsKapiCrcs crc32;

    if (!drm_core_check_feature(dev, DRIVER_MODESET)) {
-        return -ENOENT;
+        return -EOPNOTSUPP;
    }

    crtc = nv_drm_crtc_find(dev, filep, params->crtc_id);
@@ -1635,7 +1682,7 @@ int nv_drm_get_crtc_crc32_ioctl(struct drm_device *dev,
    struct NvKmsKapiCrcs crc32;

    if (!drm_core_check_feature(dev, DRIVER_MODESET)) {
-        return -ENOENT;
+        return -EOPNOTSUPP;
    }

    crtc = nv_drm_crtc_find(dev, filep, params->crtc_id);
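
Note on the removed `color_mgmt_config_ctm_to_csc()` helper in this file: its comments describe converting each DRM CTM coefficient (S31.32 sign-magnitude) to an NvKms CSC coefficient (S15.16 two's complement). The per-coefficient step can be sketched in isolation as below. This is a minimal user-space illustration, not driver code; the function name `ctm_coeff_to_s15_16` is hypothetical and `stdint.h` types stand in for `NvU64`/`NvS32`.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert one CTM coefficient, stored as S31.32
 * sign-magnitude, into an S15.16 two's-complement value.
 */
static inline int32_t ctm_coeff_to_s15_16(uint64_t ctm_val)
{
    uint64_t sign_bit  = ctm_val & (1ULL << 63);
    uint64_t magnitude = ctm_val & ~sign_bit;

    /*
     * Drop the low 16 fractional bits, then keep at most 31 bits so the
     * result is non-negative and the negation below cannot overflow.
     */
    int32_t csc_val = (int32_t)((magnitude >> 16) & ((1ULL << 31) - 1));

    return sign_bit ? -csc_val : csc_val;
}
```

With this sketch, a coefficient of 1.0 (`1ULL << 32`) becomes `0x00010000`, and the same magnitude with bit 63 set becomes its negation.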
--- a/kernel-open/nvidia-drm/nvidia-drm-crtc.h
+++ b/kernel-open/nvidia-drm/nvidia-drm-crtc.h
@@ -129,9 +129,6 @@ struct nv_drm_crtc_state {
     */
    struct NvKmsKapiHeadRequestedConfig req_config;

-    struct NvKmsLutRamps *ilut_ramps;
-    struct NvKmsLutRamps *olut_ramps;
-
    /**
     * @nv_flip:
     *
--- a/kernel-open/nvidia-drm/nvidia-drm-drv.c
+++ b/kernel-open/nvidia-drm/nvidia-drm-drv.c
@@ -74,6 +74,7 @@
 #endif

 #include <linux/pci.h>
+#include <linux/workqueue.h>

 /*
 * Commit fcd70cd36b9b ("drm: Split out drm_probe_helper.h")
@@ -372,19 +373,15 @@ static int nv_drm_create_properties(struct nv_drm_device *nv_dev)
        len++;
    }

-#if defined(NV_LINUX_NVHOST_H_PRESENT) && defined(CONFIG_TEGRA_GRHOST)
-    if (!nv_dev->supportsSyncpts) {
-        return 0;
+    if (nv_dev->supportsSyncpts) {
+        nv_dev->nv_out_fence_property =
+            drm_property_create_range(nv_dev->dev, DRM_MODE_PROP_ATOMIC,
+                    "NV_DRM_OUT_FENCE_PTR", 0, U64_MAX);
+        if (nv_dev->nv_out_fence_property == NULL) {
+            return -ENOMEM;
+        }
    }

-    nv_dev->nv_out_fence_property =
-        drm_property_create_range(nv_dev->dev, DRM_MODE_PROP_ATOMIC,
-            "NV_DRM_OUT_FENCE_PTR", 0, U64_MAX);
-    if (nv_dev->nv_out_fence_property == NULL) {
-        return -ENOMEM;
-    }
-#endif
-
    nv_dev->nv_input_colorspace_property =
        drm_property_create_enum(nv_dev->dev, 0, "NV_INPUT_COLORSPACE",
                                 enum_list, len);
@@ -405,6 +402,27 @@ static int nv_drm_create_properties(struct nv_drm_device *nv_dev)
    return 0;
 }

+#if defined(NV_DRM_ATOMIC_MODESET_AVAILABLE)
+/*
+ * We can't just call drm_kms_helper_hotplug_event directly because
+ * fbdev_generic may attempt to set a mode from inside the hotplug event
+ * handler. Because kapi event handling runs on nvkms_kthread_q, this blocks
+ * other event processing including the flip completion notifier expected by
+ * nv_drm_atomic_commit.
+ *
+ * Defer hotplug event handling to a work item so that nvkms_kthread_q can
+ * continue processing events while a DRM modeset is in progress.
+ */
+static void nv_drm_handle_hotplug_event(struct work_struct *work)
+{
+    struct delayed_work *dwork = to_delayed_work(work);
+    struct nv_drm_device *nv_dev =
+        container_of(dwork, struct nv_drm_device, hotplug_event_work);
+
+    drm_kms_helper_hotplug_event(nv_dev->dev);
+}
+#endif
+
 static int nv_drm_load(struct drm_device *dev, unsigned long flags)
 {
 #if defined(NV_DRM_ATOMIC_MODESET_AVAILABLE)
@@ -412,7 +430,7 @@ static int nv_drm_load(struct drm_device *dev, unsigned long flags)

    struct NvKmsKapiAllocateDeviceParams allocateDeviceParams;
    struct NvKmsKapiDeviceResourcesInfo resInfo;
-#endif
+#endif /* defined(NV_DRM_ATOMIC_MODESET_AVAILABLE) */
 #if defined(NV_DRM_FORMAT_MODIFIERS_PRESENT)
    NvU64 kind;
    NvU64 gen;
@@ -458,6 +476,22 @@ static int nv_drm_load(struct drm_device *dev, unsigned long flags)
        return -ENODEV;
    }

+#if defined(NV_DRM_FBDEV_GENERIC_AVAILABLE)
+    /*
+     * If fbdev is enabled, take modeset ownership now before other DRM clients
+     * can take master (and thus NVKMS ownership).
+     */
+    if (nv_drm_fbdev_module_param) {
+        if (!nvKms->grabOwnership(pDevice)) {
+            nvKms->freeDevice(pDevice);
+            NV_DRM_DEV_LOG_ERR(nv_dev, "Failed to grab NVKMS modeset ownership");
+            return -EBUSY;
+        }
+
+        nv_dev->hasFramebufferConsole = NV_TRUE;
+    }
+#endif
+
    mutex_lock(&nv_dev->lock);

    /* Set NvKmsKapiDevice */
@@ -483,6 +517,12 @@ static int nv_drm_load(struct drm_device *dev, unsigned long flags)
    nv_dev->semsurf_max_submitted_offset =
        resInfo.caps.semsurf.maxSubmittedOffset;

+    nv_dev->display_semaphores.count =
+        resInfo.caps.numDisplaySemaphores;
+    nv_dev->display_semaphores.next_index = 0;
+
+    nv_dev->requiresVrrSemaphores = resInfo.caps.requiresVrrSemaphores;
+
 #if defined(NV_DRM_FORMAT_MODIFIERS_PRESENT)
    gen = nv_dev->pageKindGeneration;
    kind = nv_dev->genericPageKind;
@@ -540,6 +580,7 @@ static int nv_drm_load(struct drm_device *dev, unsigned long flags)

    /* Enable event handling */

+    INIT_DELAYED_WORK(&nv_dev->hotplug_event_work, nv_drm_handle_hotplug_event);
    atomic_set(&nv_dev->enable_event_handling, true);

    init_waitqueue_head(&nv_dev->flip_event_wq);
@@ -567,6 +608,16 @@ static void __nv_drm_unload(struct drm_device *dev)
        return;
    }

+    /* Release modeset ownership if fbdev is enabled */
+
+#if defined(NV_DRM_FBDEV_GENERIC_AVAILABLE)
+    if (nv_dev->hasFramebufferConsole) {
+        drm_atomic_helper_shutdown(dev);
+        nvKms->releaseOwnership(nv_dev->pDevice);
+    }
+#endif
+
+    cancel_delayed_work_sync(&nv_dev->hotplug_event_work);
    mutex_lock(&nv_dev->lock);

    WARN_ON(nv_dev->subOwnershipGranted);
@@ -628,7 +679,6 @@ static int __nv_drm_master_set(struct drm_device *dev,
        !nvKms->grabOwnership(nv_dev->pDevice)) {
        return -EINVAL;
    }
-    nv_dev->drmMasterChangedSinceLastAtomicCommit = NV_TRUE;

    return 0;
 }
@@ -729,6 +779,7 @@ static int nv_drm_get_dev_info_ioctl(struct drm_device *dev,

    params->gpu_id = nv_dev->gpu_info.gpu_id;
    params->primary_index = dev->primary->index;
+    params->supports_alloc = false;
    params->generic_page_kind = 0;
    params->page_kind_generation = 0;
    params->sector_layout = 0;
@@ -736,21 +787,34 @@ static int nv_drm_get_dev_info_ioctl(struct drm_device *dev,
    params->supports_semsurf = false;

 #if defined(NV_DRM_ATOMIC_MODESET_AVAILABLE)
-    params->generic_page_kind = nv_dev->genericPageKind;
-    params->page_kind_generation = nv_dev->pageKindGeneration;
-    params->sector_layout = nv_dev->sectorLayout;
-    /* Semaphore surfaces are only supported if the modeset = 1 parameter is set */
-    if ((nv_dev->pDevice) != NULL && (nv_dev->semsurf_stride != 0)) {
-        params->supports_semsurf = true;
+    /* Memory allocation and semaphore surfaces are only supported
+     * if the modeset = 1 parameter is set */
+    if (nv_dev->pDevice != NULL) {
+        params->supports_alloc = true;
+        params->generic_page_kind = nv_dev->genericPageKind;
+        params->page_kind_generation = nv_dev->pageKindGeneration;
+        params->sector_layout = nv_dev->sectorLayout;
+
+        if (nv_dev->semsurf_stride != 0) {
+            params->supports_semsurf = true;
 #if defined(NV_SYNC_FILE_GET_FENCE_PRESENT)
-        params->supports_sync_fd = true;
+            params->supports_sync_fd = true;
 #endif /* defined(NV_SYNC_FILE_GET_FENCE_PRESENT) */
+        }
    }
 #endif /* defined(NV_DRM_ATOMIC_MODESET_AVAILABLE) */

    return 0;
 }

+static int nv_drm_get_drm_file_unique_id_ioctl(struct drm_device *dev,
+                                               void *data, struct drm_file *filep)
+{
+    struct drm_nvidia_get_drm_file_unique_id_params *params = data;
+    params->id = (u64)(filep->driver_priv);
+    return 0;
+}
+
 static int nv_drm_dmabuf_supported_ioctl(struct drm_device *dev,
                                         void *data, struct drm_file *filep)
 {
@@ -804,13 +868,18 @@ static int nv_drm_get_dpy_id_for_connector_id_ioctl(struct drm_device *dev,
                                                    struct drm_file *filep)
 {
    struct drm_nvidia_get_dpy_id_for_connector_id_params *params = data;
+    struct drm_connector *connector;
+    struct nv_drm_connector *nv_connector;
+    int ret = 0;
+
+    if (!drm_core_check_feature(dev, DRIVER_MODESET)) {
+        return -EOPNOTSUPP;
+    }
+
    // Importantly, drm_connector_lookup (with filep) will only return the
    // connector if we are master, a lessee with the connector, or not master at
    // all. It will return NULL if we are a lessee with other connectors.
-    struct drm_connector *connector =
-        nv_drm_connector_lookup(dev, filep, params->connectorId);
-    struct nv_drm_connector *nv_connector;
-    int ret = 0;
+    connector = nv_drm_connector_lookup(dev, filep, params->connectorId);

    if (!connector) {
        return -EINVAL;
@@ -843,6 +912,11 @@ static int nv_drm_get_connector_id_for_dpy_id_ioctl(struct drm_device *dev,
    int ret = -EINVAL;
 #if defined(NV_DRM_CONNECTOR_LIST_ITER_PRESENT)
    struct drm_connector_list_iter conn_iter;
+#endif
+    if (!drm_core_check_feature(dev, DRIVER_MODESET)) {
+        return -EOPNOTSUPP;
+    }
+#if defined(NV_DRM_CONNECTOR_LIST_ITER_PRESENT)
    nv_drm_connector_list_iter_begin(dev, &conn_iter);
 #endif

@@ -1055,6 +1129,10 @@ static int nv_drm_grant_permission_ioctl(struct drm_device *dev, void *data,
 {
    struct drm_nvidia_grant_permissions_params *params = data;

+    if (!drm_core_check_feature(dev, DRIVER_MODESET)) {
+        return -EOPNOTSUPP;
+    }
+
    if (params->type == NV_DRM_PERMISSIONS_TYPE_MODESET) {
        return nv_drm_grant_modeset_permission(dev, params, filep);
    } else if (params->type == NV_DRM_PERMISSIONS_TYPE_SUB_OWNER) {
@@ -1220,6 +1298,10 @@ static int nv_drm_revoke_permission_ioctl(struct drm_device *dev, void *data,
 {
    struct drm_nvidia_revoke_permissions_params *params = data;

+    if (!drm_core_check_feature(dev, DRIVER_MODESET)) {
+        return -EOPNOTSUPP;
+    }
+
    if (params->type == NV_DRM_PERMISSIONS_TYPE_MODESET) {
        if (!params->dpyId) {
            return -EINVAL;
@@ -1249,6 +1331,17 @@ static void nv_drm_postclose(struct drm_device *dev, struct drm_file *filep)
 }
 #endif /* NV_DRM_ATOMIC_MODESET_AVAILABLE */

+static int nv_drm_open(struct drm_device *dev, struct drm_file *filep)
+{
+    _Static_assert(sizeof(filep->driver_priv) >= sizeof(u64),
+                   "filep->driver_priv cannot hold a u64");
+    static atomic64_t id = ATOMIC_INIT(0);
+
+    filep->driver_priv = (void *)atomic64_inc_return(&id);
+
+    return 0;
+}
+
 #if defined(NV_DRM_MASTER_HAS_LEASES)
 static struct drm_master *nv_drm_find_lessee(struct drm_master *master,
                                             int lessee_id)
@@ -1492,6 +1585,9 @@ static const struct drm_ioctl_desc nv_drm_ioctls[] = {
    DRM_IOCTL_DEF_DRV(NVIDIA_GET_DEV_INFO,
                      nv_drm_get_dev_info_ioctl,
                      DRM_RENDER_ALLOW|DRM_UNLOCKED),
+    DRM_IOCTL_DEF_DRV(NVIDIA_GET_DRM_FILE_UNIQUE_ID,
+                      nv_drm_get_drm_file_unique_id_ioctl,
+                      DRM_RENDER_ALLOW|DRM_UNLOCKED),

 #if defined(NV_DRM_FENCE_AVAILABLE)
    DRM_IOCTL_DEF_DRV(NVIDIA_FENCE_SUPPORTED,
@@ -1517,9 +1613,21 @@ static const struct drm_ioctl_desc nv_drm_ioctls[] = {
                      DRM_RENDER_ALLOW|DRM_UNLOCKED),
 #endif

+    /*
+     * DRM_UNLOCKED is implicit for all non-legacy DRM driver IOCTLs since Linux
+     * v4.10 commit fa5386459f06 "drm: Used DRM_LEGACY for all legacy functions"
+     * (Linux v4.4 commit ea487835e887 "drm: Enforce unlocked ioctl operation
+     * for kms driver ioctls" previously did it only for drivers that set the
+     * DRM_MODESET flag), so this will race with SET_CLIENT_CAP. Linux v4.11
+     * commit dcf727ab5d17 "drm: setclientcap doesn't need the drm BKL" also
+     * removed locking from SET_CLIENT_CAP so there is no use attempting to lock
+     * manually. The latter commit acknowledges that this can expose userspace
+     * to inconsistent behavior when racing with itself, but accepts that risk.
+     */
    DRM_IOCTL_DEF_DRV(NVIDIA_GET_CLIENT_CAPABILITY,
                      nv_drm_get_client_capability_ioctl,
                      0),
+
 #if defined(NV_DRM_ATOMIC_MODESET_AVAILABLE)
    DRM_IOCTL_DEF_DRV(NVIDIA_GET_CRTC_CRC32,
                      nv_drm_get_crtc_crc32_ioctl,
@@ -1562,6 +1670,9 @@ static struct drm_driver nv_drm_driver = {
    .driver_features        =
 #if defined(NV_DRM_DRIVER_PRIME_FLAG_PRESENT)
                               DRIVER_PRIME |
+#endif
+#if defined(NV_DRM_SYNCOBJ_FEATURES_PRESENT)
+                               DRIVER_SYNCOBJ | DRIVER_SYNCOBJ_TIMELINE |
 #endif
                               DRIVER_GEM  | DRIVER_RENDER,

@@ -1573,14 +1684,14 @@ static struct drm_driver nv_drm_driver = {
    .num_ioctls             = ARRAY_SIZE(nv_drm_ioctls),

 /*
- * linux-next commit 71a7974ac701 ("drm/prime: Unexport helpers for fd/handle
- * conversion") unexports drm_gem_prime_handle_to_fd() and
+ * Linux kernel v6.6 commit 71a7974ac701 ("drm/prime: Unexport helpers
+ * for fd/handle conversion") unexports drm_gem_prime_handle_to_fd() and
 * drm_gem_prime_fd_to_handle().
 *
- * Prior linux-next commit 6b85aa68d9d5 ("drm: Enable PRIME import/export for
- * all drivers") made these helpers the default when .prime_handle_to_fd /
- * .prime_fd_to_handle are unspecified, so it's fine to just skip specifying
- * them if the helpers aren't present.
+ * Prior Linux kernel v6.6 commit 6b85aa68d9d5 ("drm: Enable PRIME
+ * import/export for all drivers") made these helpers the default when
+ * .prime_handle_to_fd / .prime_fd_to_handle are unspecified, so it's fine
+ * to just skip specifying them if the helpers aren't present.
 */
 #if NV_IS_EXPORT_SYMBOL_PRESENT_drm_gem_prime_handle_to_fd
    .prime_handle_to_fd     = drm_gem_prime_handle_to_fd,
@@ -1614,6 +1725,7 @@ static struct drm_driver nv_drm_driver = {
 #if defined(NV_DRM_ATOMIC_MODESET_AVAILABLE)
    .postclose              = nv_drm_postclose,
 #endif
+    .open                   = nv_drm_open,

    .fops                   = &nv_drm_fops,

@@ -1641,7 +1753,7 @@ static struct drm_driver nv_drm_driver = {
 * kernel supports atomic modeset and the 'modeset' kernel module
 * parameter is true.
 */
-static void nv_drm_update_drm_driver_features(void)
+void nv_drm_update_drm_driver_features(void)
 {
 #if defined(NV_DRM_ATOMIC_MODESET_AVAILABLE)

@@ -1667,11 +1779,12 @@ static void nv_drm_update_drm_driver_features(void)
 /*
 * Helper function for allocate/register DRM device for given NVIDIA GPU ID.
 */
-static void nv_drm_register_drm_device(const nv_gpu_info_t *gpu_info)
+void nv_drm_register_drm_device(const nv_gpu_info_t *gpu_info)
 {
    struct nv_drm_device *nv_dev = NULL;
    struct drm_device *dev = NULL;
    struct device *device = gpu_info->os_device_ptr;
+    bool bus_is_pci;

    DRM_DEBUG(
        "Registering device for NVIDIA GPU ID 0x08%x",
@@ -1705,8 +1818,15 @@ static void nv_drm_register_drm_device(const nv_gpu_info_t *gpu_info)
    dev->dev_private = nv_dev;
    nv_dev->dev = dev;

+    bus_is_pci =
+#if defined(NV_LINUX)
+        device->bus == &pci_bus_type;
+#elif defined(NV_BSD)
+        devclass_find("pci");
+#endif
+
 #if defined(NV_DRM_DEVICE_HAS_PDEV)
-    if (device->bus == &pci_bus_type) {
+    if (bus_is_pci) {
        dev->pdev = to_pci_dev(device);
    }
 #endif
@@ -1722,12 +1842,7 @@ static void nv_drm_register_drm_device(const nv_gpu_info_t *gpu_info)
    if (nv_drm_fbdev_module_param &&
        drm_core_check_feature(dev, DRIVER_MODESET)) {

-        if (!nvKms->grabOwnership(nv_dev->pDevice)) {
-            NV_DRM_DEV_LOG_ERR(nv_dev, "Failed to grab NVKMS modeset ownership");
-            goto failed_grab_ownership;
-        }
-
-        if (device->bus == &pci_bus_type) {
+        if (bus_is_pci) {
            struct pci_dev *pdev = to_pci_dev(device);

 #if defined(NV_DRM_APERTURE_REMOVE_CONFLICTING_PCI_FRAMEBUFFERS_HAS_DRIVER_ARG)
@@ -1737,8 +1852,6 @@ static void nv_drm_register_drm_device(const nv_gpu_info_t *gpu_info)
 #endif
        }
        drm_fbdev_generic_setup(dev, 32);
-
-        nv_dev->hasFramebufferConsole = NV_TRUE;
    }
 #endif /* defined(NV_DRM_FBDEV_GENERIC_AVAILABLE) */

@@ -1749,12 +1862,6 @@ static void nv_drm_register_drm_device(const nv_gpu_info_t *gpu_info)

    return; /* Success */

-#if defined(NV_DRM_FBDEV_GENERIC_AVAILABLE)
-failed_grab_ownership:
-
-    drm_dev_unregister(dev);
-#endif
-
 failed_drm_register:

    nv_drm_dev_free(dev);
@@ -1767,6 +1874,7 @@ failed_drm_alloc:
 /*
 * Enumerate NVIDIA GPUs and allocate/register DRM device for each of them.
 */
+#if defined(NV_LINUX)
 int nv_drm_probe_devices(void)
 {
    nv_gpu_info_t *gpu_info = NULL;
@@ -1809,6 +1917,7 @@ done:

    return ret;
 }
+#endif

 /*
 * Unregister all NVIDIA DRM devices.
@@ -1819,12 +1928,6 @@ void nv_drm_remove_devices(void)
        struct nv_drm_device *next = dev_list->next;
        struct drm_device *dev = dev_list->dev;

-#if defined(NV_DRM_FBDEV_GENERIC_AVAILABLE)
-        if (dev_list->hasFramebufferConsole) {
-            drm_atomic_helper_shutdown(dev);
-            nvKms->releaseOwnership(dev_list->pDevice);
-        }
-#endif
        drm_dev_unregister(dev);
        nv_drm_dev_free(dev);

@@ -1834,4 +1937,79 @@ void nv_drm_remove_devices(void)
    }
 }

+/*
+ * Handle system suspend and resume.
+ *
+ * Normally, a DRM driver would use drm_mode_config_helper_suspend() to save the
+ * current state on suspend and drm_mode_config_helper_resume() to restore it
+ * after resume. This works for upstream drivers because user-mode tasks are
+ * frozen before the suspend hook is called.
+ *
+ * In the case of nvidia-drm, the suspend hook is also called when 'suspend' is
+ * written to /proc/driver/nvidia/suspend, before user-mode tasks are frozen.
+ * However, we don't actually need to save and restore the display state because
+ * the driver requires a VT switch to an unused VT before suspending and a
+ * switch back to the application (or fbdev console) on resume. The DRM client
+ * (or fbdev helper functions) will restore the appropriate mode on resume.
+ *
+ */
+void nv_drm_suspend_resume(NvBool suspend)
+{
+    static DEFINE_MUTEX(nv_drm_suspend_mutex);
+    static NvU32 nv_drm_suspend_count = 0;
+    struct nv_drm_device *nv_dev;
+
+    mutex_lock(&nv_drm_suspend_mutex);
+
+    /*
+     * Count the number of times the driver is asked to suspend. Suspend all DRM
+     * devices on the first suspend call and resume them on the last resume
+     * call.  This is necessary because the kernel may call nvkms_suspend()
+     * simultaneously for each GPU, but NVKMS itself also suspends all GPUs on
+     * the first call.
+     */
+    if (suspend) {
+        if (nv_drm_suspend_count++ > 0) {
+            goto done;
+        }
+    } else {
+        BUG_ON(nv_drm_suspend_count == 0);
+
+        if (--nv_drm_suspend_count > 0) {
+            goto done;
+        }
+    }
+
+#if defined(NV_DRM_ATOMIC_MODESET_AVAILABLE)
+    nv_dev = dev_list;
+
+    /*
+     * NVKMS shuts down all heads on suspend. Update DRM state accordingly.
+     */
+    for (nv_dev = dev_list; nv_dev; nv_dev = nv_dev->next) {
+        struct drm_device *dev = nv_dev->dev;
+
+        if (!drm_core_check_feature(dev, DRIVER_MODESET)) {
+            continue;
+        }
+
+        if (suspend) {
+            drm_kms_helper_poll_disable(dev);
+#if defined(NV_DRM_FBDEV_GENERIC_AVAILABLE)
+            drm_fb_helper_set_suspend_unlocked(dev->fb_helper, 1);
+#endif
+            drm_mode_config_reset(dev);
+        } else {
+#if defined(NV_DRM_FBDEV_GENERIC_AVAILABLE)
+            drm_fb_helper_set_suspend_unlocked(dev->fb_helper, 0);
+#endif
+            drm_kms_helper_poll_enable(dev);
+        }
+    }
+#endif /* NV_DRM_ATOMIC_MODESET_AVAILABLE */
+
+done:
+    mutex_unlock(&nv_drm_suspend_mutex);
+}
+
 #endif /* NV_DRM_AVAILABLE */
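
Note on `nv_drm_suspend_resume()` above: the counting scheme (act on the first suspend and on the last resume, ignore everything in between) can be sketched outside the kernel as follows. This is an illustrative user-space approximation only. It assumes a pthread mutex in place of the kernel mutex, `assert()` in place of `BUG_ON()`, and a hypothetical `do_transition()` standing in for the per-device DRM calls.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t suspend_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned int suspend_count;  /* outstanding suspend requests */
static unsigned int transitions;    /* times the real work actually ran */

/* Hypothetical stand-in for the per-device suspend/resume work. */
static void do_transition(bool suspend)
{
    (void)suspend;
    transitions++;
}

/* Returns true if this call actually performed the transition. */
static bool suspend_resume(bool suspend)
{
    bool acted = false;

    pthread_mutex_lock(&suspend_mutex);

    if (suspend) {
        /* Only the first suspend request does the work. */
        if (suspend_count++ == 0) {
            do_transition(true);
            acted = true;
        }
    } else {
        /* A resume without a matching suspend is a caller bug. */
        assert(suspend_count > 0);

        /* Only the last resume request does the work. */
        if (--suspend_count == 0) {
            do_transition(false);
            acted = true;
        }
    }

    pthread_mutex_unlock(&suspend_mutex);
    return acted;
}
```

Two suspend calls followed by two resume calls therefore run the transition exactly twice: once when entering suspend and once when leaving it.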
--- a/kernel-open/nvidia-drm/nvidia-drm-drv.h
+++ b/kernel-open/nvidia-drm/nvidia-drm-drv.h
@@ -31,6 +31,12 @@ int nv_drm_probe_devices(void);

 void nv_drm_remove_devices(void);

+void nv_drm_suspend_resume(NvBool suspend);
+
+void nv_drm_register_drm_device(const nv_gpu_info_t *);
+
+void nv_drm_update_drm_driver_features(void);
+
 #endif /* defined(NV_DRM_AVAILABLE) */

 #endif /* __NVIDIA_DRM_DRV_H__ */
--- a/kernel-open/nvidia-drm/nvidia-drm-encoder.c
+++ b/kernel-open/nvidia-drm/nvidia-drm-encoder.c
@@ -300,7 +300,7 @@ void nv_drm_handle_display_change(struct nv_drm_device *nv_dev,

    nv_drm_connector_mark_connection_status_dirty(nv_encoder->nv_connector);

-    drm_kms_helper_hotplug_event(dev);
+    schedule_delayed_work(&nv_dev->hotplug_event_work, 0);
 }

 void nv_drm_handle_dynamic_display_connected(struct nv_drm_device *nv_dev,
@@ -347,6 +347,6 @@ void nv_drm_handle_dynamic_display_connected(struct nv_drm_device *nv_dev,
    drm_reinit_primary_mode_group(dev);
 #endif

-    drm_kms_helper_hotplug_event(dev);
+    schedule_delayed_work(&nv_dev->hotplug_event_work, 0);
 }
 #endif
--- a/kernel-open/nvidia-drm/nvidia-drm-fb.c
+++ b/kernel-open/nvidia-drm/nvidia-drm-fb.c
@@ -240,7 +240,7 @@ struct drm_framebuffer *nv_drm_internal_framebuffer_create(
        if (nv_dev->modifiers[i] == DRM_FORMAT_MOD_INVALID) {
            NV_DRM_DEV_DEBUG_DRIVER(
                nv_dev,
-                "Invalid format modifier for framebuffer object: 0x%016llx",
+                "Invalid format modifier for framebuffer object: 0x%016" NvU64_fmtx,
                modifier);
            return ERR_PTR(-EINVAL);
        }
--- a/kernel-open/nvidia-drm/nvidia-drm-fence.c
+++ b/kernel-open/nvidia-drm/nvidia-drm-fence.c
@@ -293,14 +293,12 @@ __nv_drm_prime_fence_context_new(
     * to check a return value.
     */

-    *nv_prime_fence_context = (struct nv_drm_prime_fence_context) {
-        .base.ops = &nv_drm_prime_fence_context_ops,
-        .base.nv_dev = nv_dev,
-        .base.context = nv_dma_fence_context_alloc(1),
-        .base.fenceSemIndex = p->index,
-        .pSemSurface = pSemSurface,
-        .pLinearAddress = pLinearAddress,
-    };
+    nv_prime_fence_context->base.ops = &nv_drm_prime_fence_context_ops;
+    nv_prime_fence_context->base.nv_dev = nv_dev;
+    nv_prime_fence_context->base.context = nv_dma_fence_context_alloc(1);
+    nv_prime_fence_context->base.fenceSemIndex = p->index;
+    nv_prime_fence_context->pSemSurface = pSemSurface;
+    nv_prime_fence_context->pLinearAddress = pLinearAddress;

    INIT_LIST_HEAD(&nv_prime_fence_context->pending);

@@ -465,10 +463,15 @@ int nv_drm_prime_fence_context_create_ioctl(struct drm_device *dev,
 {
    struct nv_drm_device *nv_dev = to_nv_device(dev);
    struct drm_nvidia_prime_fence_context_create_params *p = data;
-    struct nv_drm_prime_fence_context *nv_prime_fence_context =
-        __nv_drm_prime_fence_context_new(nv_dev, p);
+    struct nv_drm_prime_fence_context *nv_prime_fence_context;
    int err;

+    if (nv_dev->pDevice == NULL) {
+        return -EOPNOTSUPP;
+    }
+
+    nv_prime_fence_context = __nv_drm_prime_fence_context_new(nv_dev, p);
+
    if (!nv_prime_fence_context) {
        goto done;
    }
@@ -523,6 +526,11 @@ int nv_drm_gem_prime_fence_attach_ioctl(struct drm_device *dev,
    struct nv_drm_fence_context *nv_fence_context;
    nv_dma_fence_t *fence;

+    if (nv_dev->pDevice == NULL) {
+        ret = -EOPNOTSUPP;
+        goto done;
+    }
+
    if (p->__pad != 0) {
        NV_DRM_DEV_LOG_ERR(nv_dev, "Padding fields must be zeroed");
        goto done;
@@ -1261,18 +1269,16 @@ __nv_drm_semsurf_fence_ctx_new(
     * to check a return value.
     */

-    *ctx = (struct nv_drm_semsurf_fence_ctx) {
-        .base.ops = &nv_drm_semsurf_fence_ctx_ops,
-        .base.nv_dev = nv_dev,
-        .base.context = nv_dma_fence_context_alloc(1),
-        .base.fenceSemIndex = p->index,
-        .pSemSurface = pSemSurface,
-        .pSemMapping.pVoid = semMapping,
-        .pMaxSubmittedMapping = (volatile NvU64 *)maxSubmittedMapping,
-        .callback.local = NULL,
-        .callback.nvKms = NULL,
-        .current_wait_value = 0,
-    };
+    ctx->base.ops = &nv_drm_semsurf_fence_ctx_ops;
+    ctx->base.nv_dev = nv_dev;
+    ctx->base.context = nv_dma_fence_context_alloc(1);
+    ctx->base.fenceSemIndex = p->index;
+    ctx->pSemSurface = pSemSurface;
+    ctx->pSemMapping.pVoid = semMapping;
+    ctx->pMaxSubmittedMapping = (volatile NvU64 *)maxSubmittedMapping;
+    ctx->callback.local = NULL;
+    ctx->callback.nvKms = NULL;
+    ctx->current_wait_value = 0;

    spin_lock_init(&ctx->lock);
    INIT_LIST_HEAD(&ctx->pending_fences);
@@ -1312,6 +1318,10 @@ int nv_drm_semsurf_fence_ctx_create_ioctl(struct drm_device *dev,
    struct nv_drm_semsurf_fence_ctx *ctx;
    int err;

+    if (nv_dev->pDevice == NULL) {
+        return -EOPNOTSUPP;
+    }
+
    if (p->__pad != 0) {
        NV_DRM_DEV_LOG_ERR(nv_dev, "Padding fields must be zeroed");
        return -EINVAL;
@@ -1473,6 +1483,11 @@ int nv_drm_semsurf_fence_create_ioctl(struct drm_device *dev,
    int ret = -EINVAL;
    int fd;

+    if (nv_dev->pDevice == NULL) {
+        ret = -EOPNOTSUPP;
+        goto done;
+    }
+
    if (p->__pad != 0) {
        NV_DRM_DEV_LOG_ERR(nv_dev, "Padding fields must be zeroed");
        goto done;
@@ -1635,10 +1650,14 @@ int nv_drm_semsurf_fence_wait_ioctl(struct drm_device *dev,
    unsigned long flags;
    int ret = -EINVAL;

+    if (nv_dev->pDevice == NULL) {
+        return -EOPNOTSUPP;
+    }
+
    if (p->pre_wait_value >= p->post_wait_value) {
        NV_DRM_DEV_LOG_ERR(
            nv_dev,
-            "Non-monotonic wait values specified to fence wait: 0x%llu, 0x%llu",
+            "Non-monotonic wait values specified to fence wait: 0x%" NvU64_fmtx ", 0x%" NvU64_fmtx,
            p->pre_wait_value, p->post_wait_value);
        goto done;
    }
@@ -1743,6 +1762,11 @@ int nv_drm_semsurf_fence_attach_ioctl(struct drm_device *dev,
    nv_dma_fence_t *fence;
    int ret = -EINVAL;

+    if (nv_dev->pDevice == NULL) {
+        ret = -EOPNOTSUPP;
+        goto done;
+    }
+
    nv_gem = nv_drm_gem_object_lookup(nv_dev->dev, filep, p->handle);

    if (!nv_gem) {
--- a/kernel-open/nvidia-drm/nvidia-drm-gem-dma-buf.c
+++ b/kernel-open/nvidia-drm/nvidia-drm-gem-dma-buf.c
@@ -71,12 +71,42 @@ static int __nv_drm_gem_dma_buf_create_mmap_offset(
 static int __nv_drm_gem_dma_buf_mmap(struct nv_drm_gem_object *nv_gem,
                                     struct vm_area_struct *vma)
 {
+#if defined(NV_LINUX)
    struct dma_buf_attachment *attach = nv_gem->base.import_attach;
    struct dma_buf *dma_buf = attach->dmabuf;
+#endif
    struct file *old_file;
    int ret;

    /* check if buffer supports mmap */
+#if defined(NV_BSD)
+    /*
+     * Most of the FreeBSD DRM code refers to struct file*, which is actually
+     * a struct linux_file*. The dmabuf code in FreeBSD is not actually plumbed
+     * through the same linuxkpi bits it seems (probably so it can be used
+     * elsewhere), so dma_buf->file really is a native FreeBSD struct file...
+     */
+    if (!nv_gem->base.filp->f_op->mmap)
+        return -EINVAL;
+
+    /* readjust the vma */
+    get_file(nv_gem->base.filp);
+    old_file = vma->vm_file;
+    vma->vm_file = nv_gem->base.filp;
+    vma->vm_pgoff -= drm_vma_node_start(&nv_gem->base.vma_node);
+
+    ret = nv_gem->base.filp->f_op->mmap(nv_gem->base.filp, vma);
+
+    if (ret) {
+        /* restore old parameters on failure */
+        vma->vm_file = old_file;
+        vma->vm_pgoff += drm_vma_node_start(&nv_gem->base.vma_node);
+        fput(nv_gem->base.filp);
+    } else {
+        if (old_file)
+            fput(old_file);
+    }
+#else
    if (!dma_buf->file->f_op->mmap)
        return -EINVAL;

@@ -84,18 +114,20 @@ static int __nv_drm_gem_dma_buf_mmap(struct nv_drm_gem_object *nv_gem,
    get_file(dma_buf->file);
    old_file = vma->vm_file;
    vma->vm_file = dma_buf->file;
-    vma->vm_pgoff -= drm_vma_node_start(&nv_gem->base.vma_node);;
+    vma->vm_pgoff -= drm_vma_node_start(&nv_gem->base.vma_node);

    ret = dma_buf->file->f_op->mmap(dma_buf->file, vma);

    if (ret) {
        /* restore old parameters on failure */
        vma->vm_file = old_file;
+        vma->vm_pgoff += drm_vma_node_start(&nv_gem->base.vma_node);
        fput(dma_buf->file);
    } else {
        if (old_file)
            fput(old_file);
    }
+#endif

    return ret;
 }
--- a/kernel-open/nvidia-drm/nvidia-drm-gem-nvkms-memory.c
+++ b/kernel-open/nvidia-drm/nvidia-drm-gem-nvkms-memory.c
@@ -37,6 +37,9 @@
 #endif

 #include <linux/io.h>
+#if defined(NV_BSD)
+#include <vm/vm_pageout.h>
+#endif

 #include "nv-mm.h"

@@ -93,7 +96,17 @@ static vm_fault_t __nv_drm_gem_nvkms_handle_vma_fault(
    if (nv_nvkms_memory->pages_count == 0) {
        pfn = (unsigned long)(uintptr_t)nv_nvkms_memory->pPhysicalAddress;
        pfn >>= PAGE_SHIFT;
+#if defined(NV_LINUX)
+        /*
+         * Only Linux adds page_offset here. FreeBSD doesn't set pgoff, so there
+         * pfn stays the base physical address and the page index (pidx) is
+         * calculated from the virtual address.
+         *
+         * This only works because linux_cdev_pager_populate passes the pidx as
+         * vmf->virtual_address, which is then turned into a physical page number.
+         */
        pfn += page_offset;
+#endif
    } else {
        BUG_ON(page_offset >= nv_nvkms_memory->pages_count);
        pfn = page_to_pfn(nv_nvkms_memory->pages[page_offset]);
@@ -243,6 +256,15 @@ static int __nv_drm_nvkms_gem_obj_init(
    NvU64 *pages = NULL;
    NvU32 numPages = 0;

+    if ((size % PAGE_SIZE) != 0) {
+        NV_DRM_DEV_LOG_ERR(
+            nv_dev,
+            "NvKmsKapiMemory 0x%p size should be in a multiple of page size to "
+            "create a gem object",
+            pMemory);
+        return -EINVAL;
+    }
+
    nv_nvkms_memory->pPhysicalAddress = NULL;
    nv_nvkms_memory->pWriteCombinedIORemapAddress = NULL;
    nv_nvkms_memory->physically_mapped = false;
@@ -314,7 +336,7 @@ int nv_drm_dumb_create(
        ret = -ENOMEM;
        NV_DRM_DEV_LOG_ERR(
            nv_dev,
-            "Failed to allocate NvKmsKapiMemory for dumb object of size %llu",
+            "Failed to allocate NvKmsKapiMemory for dumb object of size %" NvU64_fmtu,
            args->size);
        goto nvkms_alloc_memory_failed;
    }
@@ -358,7 +380,7 @@ int nv_drm_gem_import_nvkms_memory_ioctl(struct drm_device *dev,
    int ret;

    if (!drm_core_check_feature(dev, DRIVER_MODESET)) {
-        ret = -EINVAL;
+        ret = -EOPNOTSUPP;
        goto failed;
    }

@@ -408,7 +430,7 @@ int nv_drm_gem_export_nvkms_memory_ioctl(struct drm_device *dev,
    int ret = 0;

    if (!drm_core_check_feature(dev, DRIVER_MODESET)) {
-        ret = -EINVAL;
+        ret = -EOPNOTSUPP;
        goto done;
    }

@@ -461,7 +483,7 @@ int nv_drm_gem_alloc_nvkms_memory_ioctl(struct drm_device *dev,
    int ret = 0;

    if (!drm_core_check_feature(dev, DRIVER_MODESET)) {
-        ret = -EINVAL;
+        ret = -EOPNOTSUPP;
        goto failed;
    }

@@ -529,14 +551,12 @@ static struct drm_gem_object *__nv_drm_gem_nvkms_prime_dup(
 {
    struct nv_drm_device *nv_dev = to_nv_device(dev);
    const struct nv_drm_device *nv_dev_src;
-    const struct nv_drm_gem_nvkms_memory *nv_nvkms_memory_src;
    struct nv_drm_gem_nvkms_memory *nv_nvkms_memory;
    struct NvKmsKapiMemory *pMemory;

    BUG_ON(nv_gem_src == NULL || nv_gem_src->ops != &nv_gem_nvkms_memory_ops);

    nv_dev_src = to_nv_device(nv_gem_src->base.dev);
-    nv_nvkms_memory_src = to_nv_nvkms_memory_const(nv_gem_src);

    if ((nv_nvkms_memory =
            nv_drm_calloc(1, sizeof(*nv_nvkms_memory))) == NULL) {
--- a/kernel-open/nvidia-drm/nvidia-drm-gem-user-memory.c
+++ b/kernel-open/nvidia-drm/nvidia-drm-gem-user-memory.c
@@ -36,6 +36,10 @@
 #include "linux/mm.h"
 #include "nv-mm.h"

+#if defined(NV_BSD)
+#include <vm/vm_pageout.h>
+#endif
+
 static inline
 void __nv_drm_gem_user_memory_free(struct nv_drm_gem_object *nv_gem)
 {
@@ -113,6 +117,10 @@ static vm_fault_t __nv_drm_gem_user_memory_handle_vma_fault(
    page_offset = vmf->pgoff - drm_vma_node_start(&gem->vma_node);

    BUG_ON(page_offset >= nv_user_memory->pages_count);
+
+#if !defined(NV_LINUX)
+    ret = vmf_insert_pfn(vma, address, page_to_pfn(nv_user_memory->pages[page_offset]));
+#else /* !defined(NV_LINUX) */
    ret = vm_insert_page(vma, address, nv_user_memory->pages[page_offset]);
    switch (ret) {
        case 0:
@@ -131,6 +139,7 @@ static vm_fault_t __nv_drm_gem_user_memory_handle_vma_fault(
            ret = VM_FAULT_SIGBUS;
            break;
    }
+#endif /* !defined(NV_LINUX) */

    return ret;
 }
@@ -170,7 +179,7 @@ int nv_drm_gem_import_userspace_memory_ioctl(struct drm_device *dev,
    if ((params->size % PAGE_SIZE) != 0) {
        NV_DRM_DEV_LOG_ERR(
            nv_dev,
-            "Userspace memory 0x%llx size should be in a multiple of page "
+            "Userspace memory 0x%" NvU64_fmtx " size should be in a multiple of page "
            "size to create a gem object",
            params->address);
        return -EINVAL;
@@ -183,7 +192,7 @@ int nv_drm_gem_import_userspace_memory_ioctl(struct drm_device *dev,
    if (ret != 0) {
        NV_DRM_DEV_LOG_ERR(
            nv_dev,
-            "Failed to lock user pages for address 0x%llx: %d",
+            "Failed to lock user pages for address 0x%" NvU64_fmtx ": %d",
            params->address, ret);
        return ret;
    }
--- a/kernel-open/nvidia-drm/nvidia-drm-gem.c
+++ b/kernel-open/nvidia-drm/nvidia-drm-gem.c
@@ -319,7 +319,7 @@ int nv_drm_gem_identify_object_ioctl(struct drm_device *dev,
    struct nv_drm_gem_object *nv_gem = NULL;

    if (!drm_core_check_feature(dev, DRIVER_MODESET)) {
-        return -EINVAL;
+        return -EOPNOTSUPP;
    }

    nv_dma_buf = nv_drm_gem_object_dma_buf_lookup(dev, filep, p->handle);
--- a/kernel-open/nvidia-drm/nvidia-drm-helper.c
+++ b/kernel-open/nvidia-drm/nvidia-drm-helper.c
@@ -45,8 +45,7 @@

 /*
 * The inclusion of drm_framebuffer.h was removed from drm_crtc.h by commit
- * 720cf96d8fecde29b72e1101f8a567a0ce99594f ("drm: Drop drm_framebuffer.h from
- * drm_crtc.h") in linux-next, expected in v5.19-rc7.
+ * 720cf96d8fec ("drm: Drop drm_framebuffer.h from drm_crtc.h") in v6.0.
 *
 * We only need drm_framebuffer.h for drm_framebuffer_put(), and it is always
 * present (v4.9+) when drm_framebuffer_{put,get}() is present (v4.12+), so it
--- a/kernel-open/nvidia-drm/nvidia-drm-helper.h
+++ b/kernel-open/nvidia-drm/nvidia-drm-helper.h
@@ -612,6 +612,19 @@ static inline int nv_drm_format_num_planes(uint32_t format)

 #endif /* defined(NV_DRM_FORMAT_MODIFIERS_PRESENT) */

+/*
+ * DRM_UNLOCKED was removed with commit 2798ffcc1d6a ("drm: Remove locking for
+ * legacy ioctls and DRM_UNLOCKED") in v6.8, but it was previously made
+ * implicit for all non-legacy DRM driver IOCTLs since Linux v4.10 commit
+ * fa5386459f06 "drm: Used DRM_LEGACY for all legacy functions" (Linux v4.4
+ * commit ea487835e887 "drm: Enforce unlocked ioctl operation for kms driver
+ * ioctls" previously did it only for drivers that set the DRM_MODESET flag), so
+ * it was effectively a no-op anyway.
+ */
+#if !defined(NV_DRM_UNLOCKED_IOCTL_FLAG_PRESENT)
+#define DRM_UNLOCKED 0
+#endif
+
 /*
 * drm_vma_offset_exact_lookup_locked() were added
 * by kernel commit 2225cfe46bcc which was Signed-off-by:
--- a/kernel-open/nvidia-drm/nvidia-drm-ioctl.h
+++ b/kernel-open/nvidia-drm/nvidia-drm-ioctl.h
@@ -52,6 +52,7 @@
 #define DRM_NVIDIA_SEMSURF_FENCE_CREATE             0x15
 #define DRM_NVIDIA_SEMSURF_FENCE_WAIT               0x16
 #define DRM_NVIDIA_SEMSURF_FENCE_ATTACH             0x17
+#define DRM_NVIDIA_GET_DRM_FILE_UNIQUE_ID           0x18

 #define DRM_IOCTL_NVIDIA_GEM_IMPORT_NVKMS_MEMORY                           \
    DRM_IOWR((DRM_COMMAND_BASE + DRM_NVIDIA_GEM_IMPORT_NVKMS_MEMORY),      \
@@ -71,7 +72,7 @@
 *
 * 'warning: suggest parentheses around arithmetic in operand of |'
 */
-#if defined(NV_LINUX)
+#if defined(NV_LINUX) || defined(NV_BSD)
 #define DRM_IOCTL_NVIDIA_FENCE_SUPPORTED                         \
    DRM_IO(DRM_COMMAND_BASE + DRM_NVIDIA_FENCE_SUPPORTED)
 #define DRM_IOCTL_NVIDIA_DMABUF_SUPPORTED                        \
@@ -157,6 +158,11 @@
              DRM_NVIDIA_SEMSURF_FENCE_ATTACH),                         \
              struct drm_nvidia_semsurf_fence_attach_params)

+#define DRM_IOCTL_NVIDIA_GET_DRM_FILE_UNIQUE_ID                         \
+    DRM_IOWR((DRM_COMMAND_BASE +                                        \
+              DRM_NVIDIA_GET_DRM_FILE_UNIQUE_ID),                       \
+              struct drm_nvidia_get_drm_file_unique_id_params)
+
 struct drm_nvidia_gem_import_nvkms_memory_params {
    uint64_t mem_size;           /* IN */

@@ -178,7 +184,10 @@ struct drm_nvidia_get_dev_info_params {
    uint32_t gpu_id;             /* OUT */
    uint32_t primary_index;      /* OUT; the "card%d" value */

-    /* See DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D definitions of these */
+    uint32_t supports_alloc;     /* OUT */
+    /* The generic_page_kind, page_kind_generation, and sector_layout
+     * fields are only valid if supports_alloc is true.
+     * See DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D definitions of these. */
    uint32_t generic_page_kind;    /* OUT */
    uint32_t page_kind_generation; /* OUT */
    uint32_t sector_layout;        /* OUT */
@@ -382,4 +391,8 @@ struct drm_nvidia_semsurf_fence_attach_params {
    uint64_t wait_value;            /* IN Semaphore value to reach before signal */
 };

+struct drm_nvidia_get_drm_file_unique_id_params {
+    uint64_t id;                    /* OUT Unique ID of the DRM file */
+};
+
 #endif /* _UAPI_NVIDIA_DRM_IOCTL_H_ */
--- a/kernel-open/nvidia-drm/nvidia-drm-linux.c
+++ b/kernel-open/nvidia-drm/nvidia-drm-linux.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2015, NVIDIA CORPORATION. All rights reserved.
+ * Copyright (c) 2015-2023, NVIDIA CORPORATION. All rights reserved.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
@@ -21,8 +21,6 @@
 */

 #include <linux/module.h>
-#include <linux/slab.h>
-#include <linux/err.h>

 #include "nvidia-drm-os-interface.h"
 #include "nvidia-drm.h"
@@ -31,261 +29,18 @@

 #if defined(NV_DRM_AVAILABLE)

-#if defined(NV_DRM_DRMP_H_PRESENT)
-#include <drm/drmP.h>
-#endif
-
-#if defined(NV_LINUX_SYNC_FILE_H_PRESENT)
-#include <linux/file.h>
-#include <linux/sync_file.h>
-#endif
-
-#include <linux/vmalloc.h>
-#include <linux/sched.h>
-
-#include "nv-mm.h"
-
 MODULE_PARM_DESC(
    modeset,
    "Enable atomic kernel modesetting (1 = enable, 0 = disable (default))");
-bool nv_drm_modeset_module_param = false;
 module_param_named(modeset, nv_drm_modeset_module_param, bool, 0400);

 #if defined(NV_DRM_FBDEV_GENERIC_AVAILABLE)
 MODULE_PARM_DESC(
    fbdev,
    "Create a framebuffer device (1 = enable, 0 = disable (default)) (EXPERIMENTAL)");
-bool nv_drm_fbdev_module_param = false;
 module_param_named(fbdev, nv_drm_fbdev_module_param, bool, 0400);
 #endif

-void *nv_drm_calloc(size_t nmemb, size_t size)
-{
-    size_t total_size = nmemb * size;
-    //
-    // Check for overflow.
-    //
-    if ((nmemb != 0) && ((total_size / nmemb) != size))
-    {
-        return NULL;
-    }
-    return kzalloc(nmemb * size, GFP_KERNEL);
-}
-
-void nv_drm_free(void *ptr)
-{
-    if (IS_ERR(ptr)) {
-        return;
-    }
-
-    kfree(ptr);
-}
-
-char *nv_drm_asprintf(const char *fmt, ...)
-{
-    va_list ap;
-    char *p;
-
-    va_start(ap, fmt);
-    p = kvasprintf(GFP_KERNEL, fmt, ap);
-    va_end(ap);
-
-    return p;
-}
-
-#if defined(NVCPU_X86) || defined(NVCPU_X86_64)
-  #define WRITE_COMBINE_FLUSH()    asm volatile("sfence":::"memory")
-#elif defined(NVCPU_PPC64LE)
-  #define WRITE_COMBINE_FLUSH()    asm volatile("sync":::"memory")
-#else
-  #define WRITE_COMBINE_FLUSH()    mb()
-#endif
-
-void nv_drm_write_combine_flush(void)
-{
-    WRITE_COMBINE_FLUSH();
-}
-
-int nv_drm_lock_user_pages(unsigned long address,
-                           unsigned long pages_count, struct page ***pages)
-{
-    struct mm_struct *mm = current->mm;
-    struct page **user_pages;
-    int pages_pinned;
-
-    user_pages = nv_drm_calloc(pages_count, sizeof(*user_pages));
-
-    if (user_pages == NULL) {
-        return -ENOMEM;
-    }
-
-    nv_mmap_read_lock(mm);
-
-    pages_pinned = NV_PIN_USER_PAGES(address, pages_count, FOLL_WRITE,
-                                     user_pages, NULL);
-    nv_mmap_read_unlock(mm);
-
-    if (pages_pinned < 0 || (unsigned)pages_pinned < pages_count) {
-        goto failed;
-    }
-
-    *pages = user_pages;
-
-    return 0;
-
-failed:
-
-    if (pages_pinned > 0) {
-        int i;
-
-        for (i = 0; i < pages_pinned; i++) {
-           NV_UNPIN_USER_PAGE(user_pages[i]);
-        }
-    }
-
-    nv_drm_free(user_pages);
-
-    return (pages_pinned < 0) ? pages_pinned : -EINVAL;
-}
-
-void nv_drm_unlock_user_pages(unsigned long  pages_count, struct page **pages)
-{
-    unsigned long i;
-
-    for (i = 0; i < pages_count; i++) {
-        set_page_dirty_lock(pages[i]);
-        NV_UNPIN_USER_PAGE(pages[i]);
-    }
-
-    nv_drm_free(pages);
-}
-
-void *nv_drm_vmap(struct page **pages, unsigned long pages_count)
-{
-    return vmap(pages, pages_count, VM_USERMAP, PAGE_KERNEL);
-}
-
-void nv_drm_vunmap(void *address)
-{
-    vunmap(address);
-}
-
-bool nv_drm_workthread_init(nv_drm_workthread *worker, const char *name)
-{
-    worker->shutting_down = false;
-    if (nv_kthread_q_init(&worker->q, name)) {
-        return false;
-    }
-
-    spin_lock_init(&worker->lock);
-
-    return true;
-}
-
-void nv_drm_workthread_shutdown(nv_drm_workthread *worker)
-{
-    unsigned long flags;
-
-    spin_lock_irqsave(&worker->lock, flags);
-    worker->shutting_down = true;
-    spin_unlock_irqrestore(&worker->lock, flags);
-
-    nv_kthread_q_stop(&worker->q);
-}
-
-void nv_drm_workthread_work_init(nv_drm_work *work,
-                                 void (*callback)(void *),
-                                 void *arg)
-{
-    nv_kthread_q_item_init(work, callback, arg);
-}
-
-int nv_drm_workthread_add_work(nv_drm_workthread *worker, nv_drm_work *work)
-{
-    unsigned long flags;
-    int ret = 0;
-
-    spin_lock_irqsave(&worker->lock, flags);
-    if (!worker->shutting_down) {
-        ret = nv_kthread_q_schedule_q_item(&worker->q, work);
-    }
-    spin_unlock_irqrestore(&worker->lock, flags);
-
-    return ret;
-}
-
-void nv_drm_timer_setup(nv_drm_timer *timer, void (*callback)(nv_drm_timer *nv_drm_timer))
-{
-    nv_timer_setup(timer, callback);
-}
-
-void nv_drm_mod_timer(nv_drm_timer *timer, unsigned long timeout_native)
-{
-    mod_timer(&timer->kernel_timer, timeout_native);
-}
-
-unsigned long nv_drm_timer_now(void)
-{
-    return jiffies;
-}
-
-unsigned long nv_drm_timeout_from_ms(NvU64 relative_timeout_ms)
-{
-    return jiffies + msecs_to_jiffies(relative_timeout_ms);
-}
-
-bool nv_drm_del_timer_sync(nv_drm_timer *timer)
-{
-    if (del_timer_sync(&timer->kernel_timer)) {
-        return true;
-    } else {
-        return false;
-    }
-}
-
-#if defined(NV_DRM_FENCE_AVAILABLE)
-int nv_drm_create_sync_file(nv_dma_fence_t *fence)
-{
-#if defined(NV_LINUX_SYNC_FILE_H_PRESENT)
-    struct sync_file *sync;
-    int fd = get_unused_fd_flags(O_CLOEXEC);
-
-    if (fd < 0) {
-        return fd;
-    }
-
-    /* sync_file_create() generates its own reference to the fence */
-    sync = sync_file_create(fence);
-
-    if (IS_ERR(sync)) {
-        put_unused_fd(fd);
-        return PTR_ERR(sync);
-    }
-
-    fd_install(fd, sync->file);
-
-    return fd;
-#else /* defined(NV_LINUX_SYNC_FILE_H_PRESENT) */
-    return -EINVAL;
-#endif  /* defined(NV_LINUX_SYNC_FILE_H_PRESENT) */
-}
-
-nv_dma_fence_t *nv_drm_sync_file_get_fence(int fd)
-{
-#if defined(NV_SYNC_FILE_GET_FENCE_PRESENT)
-    return sync_file_get_fence(fd);
-#else /* defined(NV_SYNC_FILE_GET_FENCE_PRESENT) */
-    return NULL;
-#endif  /* defined(NV_SYNC_FILE_GET_FENCE_PRESENT) */
-}
-#endif /* defined(NV_DRM_FENCE_AVAILABLE) */
-
-void nv_drm_yield(void)
-{
-    set_current_state(TASK_INTERRUPTIBLE);
-    schedule_timeout(1);
-}
-
 #endif /* NV_DRM_AVAILABLE */

 /*************************************************************************
--- a/kernel-open/nvidia-drm/nvidia-drm-modeset.c
+++ b/kernel-open/nvidia-drm/nvidia-drm-modeset.c
@@ -42,6 +42,16 @@
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>

+#if defined(NV_LINUX_NVHOST_H_PRESENT) && defined(CONFIG_TEGRA_GRHOST)
+#include <linux/nvhost.h>
+#elif defined(NV_LINUX_HOST1X_NEXT_H_PRESENT)
+#include <linux/host1x-next.h>
+#endif
+
+#if defined(NV_DRM_FENCE_AVAILABLE)
+#include "nvidia-dma-fence-helper.h"
+#endif
+
 struct nv_drm_atomic_state {
    struct NvKmsKapiRequestedModeSetConfig config;
    struct drm_atomic_state base;
@@ -146,6 +156,159 @@ static int __nv_drm_put_back_post_fence_fd(
    return ret;
 }

+#if defined(NV_DRM_FENCE_AVAILABLE)
+struct nv_drm_plane_fence_cb_data {
+    nv_dma_fence_cb_t dma_fence_cb;
+    struct nv_drm_device *nv_dev;
+    NvU32 semaphore_index;
+};
+
+static void
+__nv_drm_plane_fence_cb(
+    nv_dma_fence_t *fence,
+    nv_dma_fence_cb_t *cb_data
+)
+{
+    struct nv_drm_plane_fence_cb_data *fence_data =
+        container_of(cb_data, typeof(*fence_data), dma_fence_cb);
+    struct nv_drm_device *nv_dev = fence_data->nv_dev;
+
+    nv_dma_fence_put(fence);
+    nvKms->signalDisplaySemaphore(nv_dev->pDevice, fence_data->semaphore_index);
+    nv_drm_free(fence_data);
+}
+
+static int __nv_drm_convert_in_fences(
+    struct nv_drm_device *nv_dev,
+    struct drm_atomic_state *state,
+    struct drm_crtc *crtc,
+    struct drm_crtc_state *crtc_state)
+{
+    struct drm_plane *plane = NULL;
+    struct drm_plane_state *plane_state = NULL;
+    struct nv_drm_plane *nv_plane = NULL;
+    struct NvKmsKapiLayerRequestedConfig *plane_req_config = NULL;
+    struct NvKmsKapiHeadRequestedConfig *head_req_config =
+        &to_nv_crtc_state(crtc_state)->req_config;
+    struct nv_drm_plane_fence_cb_data *fence_data;
+    uint32_t semaphore_index;
+    int ret, i;
+
+    if (!crtc_state->active) {
+        return 0;
+    }
+
+    nv_drm_for_each_new_plane_in_state(state, plane, plane_state, i) {
+        if ((plane->type == DRM_PLANE_TYPE_CURSOR) ||
+            (plane_state->crtc != crtc) ||
+            (plane_state->fence == NULL)) {
+            continue;
+        }
+
+        nv_plane = to_nv_plane(plane);
+        plane_req_config =
+            &head_req_config->layerRequestedConfig[nv_plane->layer_idx];
+
+        if (nv_dev->supportsSyncpts) {
+#if defined(NV_LINUX_NVHOST_H_PRESENT) && defined(CONFIG_TEGRA_GRHOST)
+#if defined(NV_NVHOST_DMA_FENCE_UNPACK_PRESENT)
+            int ret =
+                nvhost_dma_fence_unpack(
+                    plane_state->fence,
+                    &plane_req_config->config.syncParams.u.syncpt.preSyncptId,
+                    &plane_req_config->config.syncParams.u.syncpt.preSyncptValue);
+            if (ret == 0) {
+                plane_req_config->config.syncParams.preSyncptSpecified = true;
+                continue;
+            }
+#endif
+#elif defined(NV_LINUX_HOST1X_NEXT_H_PRESENT)
+            int ret =
+                host1x_fence_extract(
+                    plane_state->fence,
+                    &plane_req_config->config.syncParams.u.syncpt.preSyncptId,
+                    &plane_req_config->config.syncParams.u.syncpt.preSyncptValue);
+            if (ret == 0) {
+                plane_req_config->config.syncParams.preSyncptSpecified = true;
+                continue;
+            }
+#endif
+        }
+
+        /*
+         * Syncpt extraction failed, or syncpts are not supported.
+         * Use general DRM fence support with semaphores instead.
+         */
+        if (plane_req_config->config.syncParams.postSyncptRequested) {
+            // Can't mix Syncpts and semaphores in a given request.
+            return -EINVAL;
+        }
+
+        semaphore_index = nv_drm_next_display_semaphore(nv_dev);
+
+        if (!nvKms->resetDisplaySemaphore(nv_dev->pDevice, semaphore_index)) {
+            NV_DRM_DEV_LOG_ERR(
+                nv_dev,
+                "Failed to initialize semaphore for plane fence");
+            /*
+             * This should only happen if the semaphore pool was somehow
+             * exhausted. Waiting a bit and retrying may help in that case.
+             */
+            return -EAGAIN;
+        }
+
+        plane_req_config->config.syncParams.semaphoreSpecified = true;
+        plane_req_config->config.syncParams.u.semaphore.index = semaphore_index;
+
+        fence_data = nv_drm_calloc(1, sizeof(*fence_data));
+
+        if (!fence_data) {
+            NV_DRM_DEV_LOG_ERR(
+                nv_dev,
+                "Failed to allocate callback data for plane fence");
+            nvKms->cancelDisplaySemaphore(nv_dev->pDevice, semaphore_index);
+            return -ENOMEM;
+        }
+
+        fence_data->nv_dev = nv_dev;
+        fence_data->semaphore_index = semaphore_index;
+
+        ret = nv_dma_fence_add_callback(plane_state->fence,
+                                        &fence_data->dma_fence_cb,
+                                        __nv_drm_plane_fence_cb);
+
+        switch (ret) {
+        case -ENOENT:
+            /* The fence is already signaled */
+            __nv_drm_plane_fence_cb(plane_state->fence,
+                                    &fence_data->dma_fence_cb);
+#if defined(fallthrough)
+            fallthrough;
+#else
+            /* Fallthrough */
+#endif
+        case 0:
+            /*
+             * The plane state's fence reference has either been consumed or
+             * belongs to the outstanding callback now.
+             */
+            plane_state->fence = NULL;
+            break;
+        default:
+            NV_DRM_DEV_LOG_ERR(
+                nv_dev,
+                "Failed plane fence callback registration");
+            /* Fence callback registration failed */
+            nvKms->cancelDisplaySemaphore(nv_dev->pDevice, semaphore_index);
+            nv_drm_free(fence_data);
+            return ret;
+        }
+    }
+
+    return 0;
+}
+#endif /* defined(NV_DRM_FENCE_AVAILABLE) */
+
 static int __nv_drm_get_syncpt_data(
    struct nv_drm_device *nv_dev,
    struct drm_crtc *crtc,
@@ -258,11 +421,6 @@ nv_drm_atomic_apply_modeset_config(struct drm_device *dev,
                               commit ? crtc->state : crtc_state;
        struct nv_drm_crtc *nv_crtc = to_nv_crtc(crtc);

-        requested_config->headRequestedConfig[nv_crtc->head] =
-            to_nv_crtc_state(new_crtc_state)->req_config;
-
-        requested_config->headsMask |= 1 << nv_crtc->head;
-
        if (commit) {
            struct drm_crtc_state *old_crtc_state = crtc_state;
            struct nv_drm_crtc_state *nv_new_crtc_state =
@@ -282,7 +440,27 @@ nv_drm_atomic_apply_modeset_config(struct drm_device *dev,

                nv_new_crtc_state->nv_flip = NULL;
            }
+
+#if defined(NV_DRM_FENCE_AVAILABLE)
+            ret = __nv_drm_convert_in_fences(nv_dev,
+                                             state,
+                                             crtc,
+                                             new_crtc_state);
+
+            if (ret != 0) {
+                return ret;
+            }
+#endif /* defined(NV_DRM_FENCE_AVAILABLE) */
        }
+
+        /*
+         * Do this deep copy after calling __nv_drm_convert_in_fences,
+         * which modifies the new CRTC state's req_config member
+         */
+        requested_config->headRequestedConfig[nv_crtc->head] =
+            to_nv_crtc_state(new_crtc_state)->req_config;
+
+        requested_config->headsMask |= 1 << nv_crtc->head;
    }

    if (commit && nvKms->systemInfo.bAllowWriteCombining) {
@@ -313,6 +491,10 @@ nv_drm_atomic_apply_modeset_config(struct drm_device *dev,
        }
    }

+    if (commit && nv_dev->requiresVrrSemaphores && reply_config.vrrFlip) {
+        nvKms->signalVrrSemaphore(nv_dev->pDevice, reply_config.vrrSemaphoreIndex);
+    }
+
    return 0;
 }

@@ -321,6 +503,24 @@ int nv_drm_atomic_check(struct drm_device *dev,
 {
    int ret = 0;

+#if defined(NV_DRM_COLOR_MGMT_AVAILABLE)
+    struct drm_crtc *crtc;
+    struct drm_crtc_state *crtc_state;
+    int i;
+
+    nv_drm_for_each_crtc_in_state(state, crtc, crtc_state, i) {
+        /*
+         * If color management changed on the CRTC, we need to update the
+         * CSC matrices of the CRTC's planes, so add those planes to the commit.
+         */
+        if (crtc_state->color_mgmt_changed) {
+            if ((ret = drm_atomic_add_affected_planes(state, crtc)) != 0) {
+                goto done;
+            }
+        }
+    }
+#endif /* NV_DRM_COLOR_MGMT_AVAILABLE */
+
    if ((ret = drm_atomic_helper_check(dev, state)) != 0) {
        goto done;
    }
@@ -488,7 +688,6 @@ int nv_drm_atomic_commit(struct drm_device *dev,

        goto done;
    }
-    nv_dev->drmMasterChangedSinceLastAtomicCommit = NV_FALSE;

    nv_drm_for_each_crtc_in_state(state, crtc, crtc_state, i) {
        struct nv_drm_crtc *nv_crtc = to_nv_crtc(crtc);
@@ -569,6 +768,9 @@ int nv_drm_atomic_commit(struct drm_device *dev,
                NV_DRM_DEV_LOG_ERR(
                    nv_dev,
                    "Flip event timeout on head %u", nv_crtc->head);
+                while (!list_empty(&nv_crtc->flip_list)) {
+                    __nv_drm_handle_flip_event(nv_crtc);
+                }
            }
        }
    }
--- a/kernel-open/nvidia-drm/nvidia-drm-os-interface.c
+++ b/kernel-open/nvidia-drm/nvidia-drm-os-interface.c
@@ -0,0 +1,285 @@
+/*
+ * Copyright (c) 2015-2023, NVIDIA CORPORATION. All rights reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <linux/slab.h>
+
+#include "nvidia-drm-os-interface.h"
+
+#if defined(NV_DRM_AVAILABLE)
+
+#if defined(NV_LINUX_SYNC_FILE_H_PRESENT)
+#include <linux/file.h>
+#include <linux/sync_file.h>
+#endif
+
+#include <linux/vmalloc.h>
+#include <linux/sched.h>
+#include <linux/device.h>
+
+#include "nv-mm.h"
+
+#if defined(NV_DRM_DRMP_H_PRESENT)
+#include <drm/drmP.h>
+#endif
+
+bool nv_drm_modeset_module_param = false;
+bool nv_drm_fbdev_module_param = false;
+
+void *nv_drm_calloc(size_t nmemb, size_t size)
+{
+    size_t total_size = nmemb * size;
+    //
+    // Check for overflow.
+    //
+    if ((nmemb != 0) && ((total_size / nmemb) != size))
+    {
+        return NULL;
+    }
+    return kzalloc(total_size, GFP_KERNEL);
+}
+
+void nv_drm_free(void *ptr)
+{
+    if (IS_ERR(ptr)) {
+        return;
+    }
+
+    kfree(ptr);
+}
+
+char *nv_drm_asprintf(const char *fmt, ...)
+{
+    va_list ap;
+    char *p;
+
+    va_start(ap, fmt);
+    p = kvasprintf(GFP_KERNEL, fmt, ap);
+    va_end(ap);
+
+    return p;
+}
+
+#if defined(NVCPU_X86) || defined(NVCPU_X86_64)
+  #define WRITE_COMBINE_FLUSH()    asm volatile("sfence":::"memory")
+#elif defined(NVCPU_PPC64LE)
+  #define WRITE_COMBINE_FLUSH()    asm volatile("sync":::"memory")
+#else
+  #define WRITE_COMBINE_FLUSH()    mb()
+#endif
+
+void nv_drm_write_combine_flush(void)
+{
+    WRITE_COMBINE_FLUSH();
+}
+
+int nv_drm_lock_user_pages(unsigned long address,
+                           unsigned long pages_count, struct page ***pages)
+{
+    struct mm_struct *mm = current->mm;
+    struct page **user_pages;
+    int pages_pinned;
+
+    user_pages = nv_drm_calloc(pages_count, sizeof(*user_pages));
+
+    if (user_pages == NULL) {
+        return -ENOMEM;
+    }
+
+    nv_mmap_read_lock(mm);
+
+    pages_pinned = NV_PIN_USER_PAGES(address, pages_count, FOLL_WRITE,
+                                     user_pages);
+    nv_mmap_read_unlock(mm);
+
+    if (pages_pinned < 0 || (unsigned)pages_pinned < pages_count) {
+        goto failed;
+    }
+
+    *pages = user_pages;
+
+    return 0;
+
+failed:
+
+    if (pages_pinned > 0) {
+        int i;
+
+        for (i = 0; i < pages_pinned; i++) {
+           NV_UNPIN_USER_PAGE(user_pages[i]);
+        }
+    }
+
+    nv_drm_free(user_pages);
+
+    return (pages_pinned < 0) ? pages_pinned : -EINVAL;
+}
+
+void nv_drm_unlock_user_pages(unsigned long  pages_count, struct page **pages)
+{
+    unsigned long i;
+
+    for (i = 0; i < pages_count; i++) {
+        set_page_dirty_lock(pages[i]);
+        NV_UNPIN_USER_PAGE(pages[i]);
+    }
+
+    nv_drm_free(pages);
+}
+
+/*
+ * The linuxkpi vmap() ignores its flags argument, as it
+ * doesn't seem to be needed. Define VM_USERMAP to 0
+ * so that the vmap() call below compiles.
+ *
+ * vmap: sys/compat/linuxkpi/common/src/linux_compat.c
+ */
+#if defined(NV_BSD)
+#define VM_USERMAP 0
+#endif
+
+void *nv_drm_vmap(struct page **pages, unsigned long pages_count)
+{
+    return vmap(pages, pages_count, VM_USERMAP, PAGE_KERNEL);
+}
+
+void nv_drm_vunmap(void *address)
+{
+    vunmap(address);
+}
+
+bool nv_drm_workthread_init(nv_drm_workthread *worker, const char *name)
+{
+    worker->shutting_down = false;
+    if (nv_kthread_q_init(&worker->q, name)) {
+        return false;
+    }
+
+    spin_lock_init(&worker->lock);
+
+    return true;
+}
+
+void nv_drm_workthread_shutdown(nv_drm_workthread *worker)
+{
+    unsigned long flags;
+
+    spin_lock_irqsave(&worker->lock, flags);
+    worker->shutting_down = true;
+    spin_unlock_irqrestore(&worker->lock, flags);
+
+    nv_kthread_q_stop(&worker->q);
+}
+
+void nv_drm_workthread_work_init(nv_drm_work *work,
+                                 void (*callback)(void *),
+                                 void *arg)
+{
+    nv_kthread_q_item_init(work, callback, arg);
+}
+
+int nv_drm_workthread_add_work(nv_drm_workthread *worker, nv_drm_work *work)
+{
+    unsigned long flags;
+    int ret = 0;
+
+    spin_lock_irqsave(&worker->lock, flags);
+    if (!worker->shutting_down) {
+        ret = nv_kthread_q_schedule_q_item(&worker->q, work);
+    }
+    spin_unlock_irqrestore(&worker->lock, flags);
+
+    return ret;
+}
+
+void nv_drm_timer_setup(nv_drm_timer *timer, void (*callback)(nv_drm_timer *nv_drm_timer))
+{
+    nv_timer_setup(timer, callback);
+}
+
+void nv_drm_mod_timer(nv_drm_timer *timer, unsigned long timeout_native)
+{
+    mod_timer(&timer->kernel_timer, timeout_native);
+}
+
+unsigned long nv_drm_timer_now(void)
+{
+    return jiffies;
+}
+
+unsigned long nv_drm_timeout_from_ms(NvU64 relative_timeout_ms)
+{
+    return jiffies + msecs_to_jiffies(relative_timeout_ms);
+}
+
+bool nv_drm_del_timer_sync(nv_drm_timer *timer)
+{
+    if (del_timer_sync(&timer->kernel_timer)) {
+        return true;
+    } else {
+        return false;
+    }
+}
+
+#if defined(NV_DRM_FENCE_AVAILABLE)
+int nv_drm_create_sync_file(nv_dma_fence_t *fence)
+{
+#if defined(NV_LINUX_SYNC_FILE_H_PRESENT)
+    struct sync_file *sync;
+    int fd = get_unused_fd_flags(O_CLOEXEC);
+
+    if (fd < 0) {
+        return fd;
+    }
+
+    /* sync_file_create() generates its own reference to the fence */
+    sync = sync_file_create(fence);
+
+    if (IS_ERR(sync)) {
+        put_unused_fd(fd);
+        return PTR_ERR(sync);
+    }
+
+    fd_install(fd, sync->file);
+
+    return fd;
+#else /* defined(NV_LINUX_SYNC_FILE_H_PRESENT) */
+    return -EINVAL;
+#endif  /* defined(NV_LINUX_SYNC_FILE_H_PRESENT) */
+}
+
+nv_dma_fence_t *nv_drm_sync_file_get_fence(int fd)
+{
+#if defined(NV_SYNC_FILE_GET_FENCE_PRESENT)
+    return sync_file_get_fence(fd);
+#else /* defined(NV_SYNC_FILE_GET_FENCE_PRESENT) */
+    return NULL;
+#endif  /* defined(NV_SYNC_FILE_GET_FENCE_PRESENT) */
+}
+#endif /* defined(NV_DRM_FENCE_AVAILABLE) */
+
+void nv_drm_yield(void)
+{
+    set_current_state(TASK_INTERRUPTIBLE);
+    schedule_timeout(1);
+}
+
+#endif /* NV_DRM_AVAILABLE */
--- a/kernel-open/nvidia-drm/nvidia-drm-os-interface.h
+++ b/kernel-open/nvidia-drm/nvidia-drm-os-interface.h
@@ -33,7 +33,7 @@
 #include "nvidia-dma-fence-helper.h"
 #endif

-#if defined(NV_LINUX)
+#if defined(NV_LINUX) || defined(NV_BSD)
 #include "nv-kthread-q.h"
 #include "linux/spinlock.h"

@@ -45,18 +45,18 @@ typedef struct nv_drm_workthread {

 typedef nv_kthread_q_item_t nv_drm_work;

-#else /* defined(NV_LINUX) */
+#else
 #error "Need to define deferred work primitives for this OS"
-#endif /* else defined(NV_LINUX) */
+#endif

-#if defined(NV_LINUX)
+#if defined(NV_LINUX) || defined(NV_BSD)
 #include "nv-timer.h"

 typedef struct nv_timer nv_drm_timer;

-#else /* defined(NV_LINUX) */
+#else
 #error "Need to define kernel timer callback primitives for this OS"
-#endif /* else defined(NV_LINUX) */
+#endif

 #if defined(NV_DRM_FBDEV_GENERIC_SETUP_PRESENT) && defined(NV_DRM_APERTURE_REMOVE_CONFLICTING_PCI_FRAMEBUFFERS_PRESENT)
 #define NV_DRM_FBDEV_GENERIC_AVAILABLE
--- a/kernel-open/nvidia-drm/nvidia-drm-priv.h
+++ b/kernel-open/nvidia-drm/nvidia-drm-priv.h
@@ -126,6 +126,7 @@ struct nv_drm_device {
    NvU64 modifiers[6 /* block linear */ + 1 /* linear */ + 1 /* terminator */];
 #endif

+    struct delayed_work hotplug_event_work;
    atomic_t enable_event_handling;

    /**
@@ -146,22 +147,18 @@ struct nv_drm_device {
    NvBool hasVideoMemory;

    NvBool supportsSyncpts;
+    NvBool requiresVrrSemaphores;
    NvBool subOwnershipGranted;
    NvBool hasFramebufferConsole;

-    /**
-     * @drmMasterChangedSinceLastAtomicCommit:
-     *
-     * This flag is set in nv_drm_master_set and reset after a completed atomic
-     * commit. It is used to restore or recommit state that is lost by the
-     * NvKms modeset owner change, such as the CRTC color management
-     * properties.
-     */
-    NvBool drmMasterChangedSinceLastAtomicCommit;
-
    struct drm_property *nv_out_fence_property;
    struct drm_property *nv_input_colorspace_property;

+    struct {
+        NvU32 count;
+        NvU32 next_index;
+    } display_semaphores;
+
 #if defined(NV_DRM_HAS_HDR_OUTPUT_METADATA)
    struct drm_property *nv_hdr_output_metadata_property;
 #endif
@@ -169,6 +166,19 @@ struct nv_drm_device {
    struct nv_drm_device *next;
 };

+static inline NvU32 nv_drm_next_display_semaphore(
+    struct nv_drm_device *nv_dev)
+{
+    NvU32 current_index = nv_dev->display_semaphores.next_index++;
+
+    if (nv_dev->display_semaphores.next_index >=
+        nv_dev->display_semaphores.count) {
+        nv_dev->display_semaphores.next_index = 0;
+    }
+
+    return current_index;
+}
+
 static inline struct nv_drm_device *to_nv_device(
    struct drm_device *dev)
 {
--- a/kernel-open/nvidia-drm/nvidia-drm-sources.mk
+++ b/kernel-open/nvidia-drm/nvidia-drm-sources.mk
@@ -0,0 +1,132 @@
+###########################################################################
+# Kbuild fragment for nvidia-drm.ko
+###########################################################################
+
+#
+# Define NVIDIA_DRM_SOURCES
+#
+
+NVIDIA_DRM_SOURCES =
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-drv.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-utils.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-crtc.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-encoder.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-connector.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-gem.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-fb.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-modeset.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-fence.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-helper.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nv-kthread-q.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nv-pci-table.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-gem-nvkms-memory.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-gem-user-memory.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-gem-dma-buf.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-format.c
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-os-interface.c
+
+#
+# Register the conftests needed by nvidia-drm.ko
+#
+
+NV_CONFTEST_GENERIC_COMPILE_TESTS += drm_available
+NV_CONFTEST_GENERIC_COMPILE_TESTS += drm_atomic_available
+NV_CONFTEST_GENERIC_COMPILE_TESTS += is_export_symbol_gpl_refcount_inc
+NV_CONFTEST_GENERIC_COMPILE_TESTS += is_export_symbol_gpl_refcount_dec_and_test
+NV_CONFTEST_GENERIC_COMPILE_TESTS += drm_alpha_blending_available
+NV_CONFTEST_GENERIC_COMPILE_TESTS += is_export_symbol_present_drm_gem_prime_fd_to_handle
+NV_CONFTEST_GENERIC_COMPILE_TESTS += is_export_symbol_present_drm_gem_prime_handle_to_fd
+
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_dev_unref
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_reinit_primary_mode_group
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += get_user_pages_remote
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += get_user_pages
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += pin_user_pages_remote
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += pin_user_pages
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_gem_object_lookup
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_atomic_state_ref_counting
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_driver_has_gem_prime_res_obj
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_atomic_helper_connector_dpms
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_connector_funcs_have_mode_in_name
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_connector_has_vrr_capable_property
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += vmf_insert_pfn
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_framebuffer_get
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_gem_object_get
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_dev_put
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_format_num_planes
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_connector_for_each_possible_encoder
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_rotation_available
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_vma_offset_exact_lookup_locked
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_gem_object_put_unlocked
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += nvhost_dma_fence_unpack
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += list_is_first
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += timer_setup
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += dma_fence_set_error
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += fence_set_error
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += sync_file_get_fence
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_aperture_remove_conflicting_pci_framebuffers
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_fbdev_generic_setup
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_connector_attach_hdr_output_metadata_property
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_helper_crtc_enable_color_mgmt
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_crtc_enable_color_mgmt
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_atomic_helper_legacy_gamma_set
+
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_bus_present
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_bus_has_bus_type
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_bus_has_get_irq
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_bus_has_get_name
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_device_list
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_legacy_dev_list
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_set_busid
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_connectors_changed
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_init_function_args
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_helper_mode_fill_fb_struct
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_master_drop_has_from_release_arg
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_unload_has_int_return_type
+NV_CONFTEST_TYPE_COMPILE_TESTS += vm_fault_has_address
+NV_CONFTEST_TYPE_COMPILE_TESTS += vm_ops_fault_removed_vma_arg
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_atomic_helper_crtc_destroy_state_has_crtc_arg
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_atomic_helper_plane_destroy_state_has_plane_arg
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_mode_object_find_has_file_priv_arg
+NV_CONFTEST_TYPE_COMPILE_TESTS += dma_buf_owner
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_connector_list_iter
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_atomic_helper_swap_state_has_stall_arg
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_prime_flag_present
+NV_CONFTEST_TYPE_COMPILE_TESTS += vm_fault_t
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_gem_object_has_resv
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_async_flip
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_pageflip_flags
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_vrr_enabled
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_format_modifiers_present
+NV_CONFTEST_TYPE_COMPILE_TESTS += mm_has_mmap_lock
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_vma_node_is_allowed_has_tag_arg
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_vma_offset_node_has_readonly
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_display_mode_has_vrefresh
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_master_set_has_int_return_type
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_gem_free_object
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_prime_pages_to_sg_has_drm_device_arg
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_gem_prime_callbacks
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_atomic_check_has_atomic_state_arg
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_gem_object_vmap_has_map_arg
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_plane_atomic_check_has_atomic_state_arg
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_device_has_pdev
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_no_vblank
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_mode_config_has_allow_fb_modifiers
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_has_hdr_output_metadata
+NV_CONFTEST_TYPE_COMPILE_TESTS += dma_resv_add_fence
+NV_CONFTEST_TYPE_COMPILE_TESTS += dma_resv_reserve_fences
+NV_CONFTEST_TYPE_COMPILE_TESTS += reservation_object_reserve_shared_has_num_fences_arg
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_connector_has_override_edid
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_master_has_leases
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_file_get_master
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_modeset_lock_all_end
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_connector_lookup
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_connector_put
+NV_CONFTEST_TYPE_COMPILE_TESTS += vm_area_struct_has_const_vm_flags
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_dumb_destroy
+NV_CONFTEST_TYPE_COMPILE_TESTS += fence_ops_use_64bit_seqno
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_aperture_remove_conflicting_pci_framebuffers_has_driver_arg
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_mode_create_dp_colorspace_property_has_supported_colorspaces_arg
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_syncobj_features_present
+NV_CONFTEST_TYPE_COMPILE_TESTS += drm_unlocked_ioctl_flag_present
--- a/kernel-open/nvidia-drm/nvidia-drm.Kbuild
+++ b/kernel-open/nvidia-drm/nvidia-drm.Kbuild
@@ -2,30 +2,16 @@
 # Kbuild fragment for nvidia-drm.ko
 ###########################################################################

+# Get our source file list and conftest list from the common file
+include $(src)/nvidia-drm/nvidia-drm-sources.mk
+
+# Linux-specific sources
+NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-linux.c
+
 #
 # Define NVIDIA_DRM_{SOURCES,OBJECTS}
 #

-NVIDIA_DRM_SOURCES =
-NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-drv.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-utils.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-crtc.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-encoder.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-connector.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-gem.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-fb.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-modeset.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-fence.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-linux.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-helper.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nv-kthread-q.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nv-pci-table.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-gem-nvkms-memory.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-gem-user-memory.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-gem-dma-buf.c
-NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-format.c
-
 NVIDIA_DRM_OBJECTS = $(patsubst %.c,%.o,$(NVIDIA_DRM_SOURCES))

 obj-m += nvidia-drm.o
@@ -44,107 +30,4 @@ NVIDIA_DRM_CFLAGS += -UDEBUG -U_DEBUG -DNDEBUG -DNV_BUILD_MODULE_INSTANCES=0

 $(call ASSIGN_PER_OBJ_CFLAGS, $(NVIDIA_DRM_OBJECTS), $(NVIDIA_DRM_CFLAGS))

-#
-# Register the conftests needed by nvidia-drm.ko
-#
-
 NV_OBJECTS_DEPEND_ON_CONFTEST += $(NVIDIA_DRM_OBJECTS)
-
-NV_CONFTEST_GENERIC_COMPILE_TESTS += drm_available
-NV_CONFTEST_GENERIC_COMPILE_TESTS += drm_atomic_available
-NV_CONFTEST_GENERIC_COMPILE_TESTS += is_export_symbol_gpl_refcount_inc
-NV_CONFTEST_GENERIC_COMPILE_TESTS += is_export_symbol_gpl_refcount_dec_and_test
-NV_CONFTEST_GENERIC_COMPILE_TESTS += drm_alpha_blending_available
-NV_CONFTEST_GENERIC_COMPILE_TESTS += is_export_symbol_present_drm_gem_prime_fd_to_handle
-NV_CONFTEST_GENERIC_COMPILE_TESTS += is_export_symbol_present_drm_gem_prime_handle_to_fd
-
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_dev_unref
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_reinit_primary_mode_group
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += get_user_pages_remote
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += get_user_pages
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += pin_user_pages_remote
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += pin_user_pages
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_gem_object_lookup
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_atomic_state_ref_counting
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_driver_has_gem_prime_res_obj
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_atomic_helper_connector_dpms
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_connector_funcs_have_mode_in_name
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_connector_has_vrr_capable_property
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += vmf_insert_pfn
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_framebuffer_get
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_gem_object_get
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_dev_put
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_format_num_planes
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_connector_for_each_possible_encoder
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_rotation_available
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_vma_offset_exact_lookup_locked
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_gem_object_put_unlocked
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += nvhost_dma_fence_unpack
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += list_is_first
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += timer_setup
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += dma_fence_set_error
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += fence_set_error
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += sync_file_get_fence
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_aperture_remove_conflicting_pci_framebuffers
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_fbdev_generic_setup
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_connector_attach_hdr_output_metadata_property
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_helper_crtc_enable_color_mgmt
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_crtc_enable_color_mgmt
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_atomic_helper_legacy_gamma_set
-
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_bus_present
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_bus_has_bus_type
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_bus_has_get_irq
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_bus_has_get_name
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_device_list
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_legacy_dev_list
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_set_busid
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_connectors_changed
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_init_function_args
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_helper_mode_fill_fb_struct
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_master_drop_has_from_release_arg
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_unload_has_int_return_type
-NV_CONFTEST_TYPE_COMPILE_TESTS += vm_fault_has_address
-NV_CONFTEST_TYPE_COMPILE_TESTS += vm_ops_fault_removed_vma_arg
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_atomic_helper_crtc_destroy_state_has_crtc_arg
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_atomic_helper_plane_destroy_state_has_plane_arg
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_mode_object_find_has_file_priv_arg
-NV_CONFTEST_TYPE_COMPILE_TESTS += dma_buf_owner
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_connector_list_iter
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_atomic_helper_swap_state_has_stall_arg
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_prime_flag_present
-NV_CONFTEST_TYPE_COMPILE_TESTS += vm_fault_t
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_gem_object_has_resv
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_async_flip
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_pageflip_flags
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_vrr_enabled
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_format_modifiers_present
-NV_CONFTEST_TYPE_COMPILE_TESTS += mm_has_mmap_lock
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_vma_node_is_allowed_has_tag_arg
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_vma_offset_node_has_readonly
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_display_mode_has_vrefresh
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_master_set_has_int_return_type
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_gem_free_object
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_prime_pages_to_sg_has_drm_device_arg
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_gem_prime_callbacks
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_atomic_check_has_atomic_state_arg
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_gem_object_vmap_has_map_arg
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_plane_atomic_check_has_atomic_state_arg
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_device_has_pdev
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_no_vblank
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_mode_config_has_allow_fb_modifiers
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_has_hdr_output_metadata
-NV_CONFTEST_TYPE_COMPILE_TESTS += dma_resv_add_fence
-NV_CONFTEST_TYPE_COMPILE_TESTS += dma_resv_reserve_fences
-NV_CONFTEST_TYPE_COMPILE_TESTS += reservation_object_reserve_shared_has_num_fences_arg
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_connector_has_override_edid
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_master_has_leases
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_file_get_master
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_modeset_lock_all_end
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_connector_lookup
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_connector_put
-NV_CONFTEST_TYPE_COMPILE_TESTS += vm_area_struct_has_const_vm_flags
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_dumb_destroy
-NV_CONFTEST_TYPE_COMPILE_TESTS += fence_ops_use_64bit_seqno
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_aperture_remove_conflicting_pci_framebuffers_has_driver_arg
-NV_CONFTEST_TYPE_COMPILE_TESTS += drm_mode_create_dp_colorspace_property_has_supported_colorspaces_arg
--- a/kernel-open/nvidia-drm/nvidia-drm.c
+++ b/kernel-open/nvidia-drm/nvidia-drm.c
@@ -45,6 +45,7 @@ int nv_drm_init(void)
        return -EINVAL;
    }

+    nvKms->setSuspendResumeCallback(nv_drm_suspend_resume);
    return nv_drm_probe_devices();
 #else
    return 0;
@@ -54,6 +55,7 @@ int nv_drm_init(void)
 void nv_drm_exit(void)
 {
 #if defined(NV_DRM_AVAILABLE)
+    nvKms->setSuspendResumeCallback(NULL);
    nv_drm_remove_devices();
 #endif
 }
--- a/kernel-open/nvidia-modeset/nv-kthread-q.c
+++ b/kernel-open/nvidia-modeset/nv-kthread-q.c
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 2016 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2016-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -176,7 +176,7 @@ static struct task_struct *thread_create_on_node(int (*threadfn)(void *data),
 {

    unsigned i, j;
-    const static unsigned attempts = 3;
+    static const unsigned attempts = 3;
    struct task_struct *thread[3];

    for (i = 0;; i++) {
--- a/kernel-open/nvidia-modeset/nvidia-modeset-linux.c
+++ b/kernel-open/nvidia-modeset/nvidia-modeset-linux.c
@@ -35,12 +35,13 @@
 #include <linux/list.h>
 #include <linux/rwsem.h>
 #include <linux/freezer.h>
+#include <linux/poll.h>
+#include <linux/cdev.h>

 #include <acpi/video.h>

 #include "nvstatus.h"

-#include "nv-register-module.h"
 #include "nv-modeset-interface.h"
 #include "nv-kref.h"

@@ -53,8 +54,13 @@
 #include "nv-kthread-q.h"
 #include "nv-time.h"
 #include "nv-lock.h"
+#include "nv-chardev-numbers.h"

-#if !defined(CONFIG_RETPOLINE)
+/*
+ * Commit aefb2f2e619b ("x86/bugs: Rename CONFIG_RETPOLINE =>
+ * CONFIG_MITIGATION_RETPOLINE") in v6.8 renamed CONFIG_RETPOLINE.
+ */
+#if !defined(CONFIG_RETPOLINE) && !defined(CONFIG_MITIGATION_RETPOLINE)
 #include "nv-retpoline.h"
 #endif

@@ -71,9 +77,15 @@ module_param_named(disable_hdmi_frl, disable_hdmi_frl, bool, 0400);
 static bool disable_vrr_memclk_switch = false;
 module_param_named(disable_vrr_memclk_switch, disable_vrr_memclk_switch, bool, 0400);

-static bool hdmi_deepcolor = false;
+static bool hdmi_deepcolor = true;
 module_param_named(hdmi_deepcolor, hdmi_deepcolor, bool, 0400);

+static bool vblank_sem_control = true;
+module_param_named(vblank_sem_control, vblank_sem_control, bool, 0400);
+
+static bool opportunistic_display_sync = true;
+module_param_named(opportunistic_display_sync, opportunistic_display_sync, bool, 0400);
+
 /* These parameters are used for fault injection tests.  Normally the defaults
 * should be used. */
 MODULE_PARM_DESC(fail_malloc, "Fail the Nth call to nvkms_alloc");
@@ -117,6 +129,30 @@ NvBool nvkms_hdmi_deepcolor(void)
    return hdmi_deepcolor;
 }

+NvBool nvkms_vblank_sem_control(void)
+{
+    return vblank_sem_control;
+}
+
+NvBool nvkms_opportunistic_display_sync(void)
+{
+    return opportunistic_display_sync;
+}
+
+NvBool nvkms_kernel_supports_syncpts(void)
+{
+/*
+ * Note this only checks that the kernel has the prerequisite
+ * support for syncpts; callers must also check that the hardware
+ * supports syncpts.
+ */
+#if (defined(CONFIG_TEGRA_GRHOST) || defined(NV_LINUX_HOST1X_NEXT_H_PRESENT))
+    return NV_TRUE;
+#else
+    return NV_FALSE;
+#endif
+}
+
 #define NVKMS_SYNCPT_STUBS_NEEDED

 /*************************************************************************
@@ -218,9 +254,23 @@ static inline int nvkms_read_trylock_pm_lock(void)

 static inline void nvkms_read_lock_pm_lock(void)
 {
-    while (!down_read_trylock(&nvkms_pm_lock)) {
-        try_to_freeze();
-        cond_resched();
+    if ((current->flags & PF_NOFREEZE)) {
+        /*
+         * Non-freezable tasks (i.e. kthreads in this case) don't have to worry
+         * about being frozen during system suspend, but do need to block so
+         * that the CPU can go idle during s2idle. Do a normal uninterruptible
+         * blocking wait for the PM lock.
+         */
+        down_read(&nvkms_pm_lock);
+    } else {
+        /*
+         * For freezable tasks, make sure we give the kernel an opportunity to
+         * freeze if taking the PM lock fails.
+         */
+        while (!down_read_trylock(&nvkms_pm_lock)) {
+            try_to_freeze();
+            cond_resched();
+        }
    }
 }

@@ -467,6 +517,8 @@ nvkms_event_queue_changed(nvkms_per_open_handle_t *pOpenKernel,

 static void nvkms_suspend(NvU32 gpuId)
 {
+    nvKmsKapiSuspendResume(NV_TRUE /* suspend */);
+
    if (gpuId == 0) {
        nvkms_write_lock_pm_lock();
    }
@@ -485,6 +537,8 @@ static void nvkms_resume(NvU32 gpuId)
    if (gpuId == 0) {
        nvkms_write_unlock_pm_lock();
    }
+
+    nvKmsKapiSuspendResume(NV_FALSE /* suspend */);
 }


@@ -813,49 +867,6 @@ void nvkms_free_timer(nvkms_timer_handle_t *handle)
    timer->cancel = NV_TRUE;
 }

-void* nvkms_get_per_open_data(int fd)
-{
-    struct file *filp = fget(fd);
-    struct nvkms_per_open *popen = NULL;
-    dev_t rdev = 0;
-    void *data = NULL;
-
-    if (filp == NULL) {
-        return NULL;
-    }
-
-    if (filp->f_inode == NULL) {
-        goto done;
-    }
-    rdev = filp->f_inode->i_rdev;
-
-    if ((MAJOR(rdev) != NVKMS_MAJOR_DEVICE_NUMBER) ||
-        (MINOR(rdev) != NVKMS_MINOR_DEVICE_NUMBER)) {
-        goto done;
-    }
-
-    popen = filp->private_data;
-    if (popen == NULL) {
-        goto done;
-    }
-
-    data = popen->data;
-
-done:
-    /*
-     * fget() incremented the struct file's reference count, which
-     * needs to be balanced with a call to fput().  It is safe to
-     * decrement the reference count before returning
-     * filp->private_data because core NVKMS is currently holding the
-     * nvkms_lock, which prevents the nvkms_close() => nvKmsClose()
-     * call chain from freeing the file out from under the caller of
-     * nvkms_get_per_open_data().
-     */
-    fput(filp);
-
-    return data;
-}
-
 NvBool nvkms_fd_is_nvidia_chardev(int fd)
 {
    struct file *filp = fget(fd);
@@ -1237,6 +1248,26 @@ void nvkms_close_from_kapi(struct nvkms_per_open *popen)
    nvkms_close_pm_unlocked(popen);
 }

+NvBool nvkms_ioctl_from_kapi_try_pmlock
+(
+    struct nvkms_per_open *popen,
+    NvU32 cmd, void *params_address, const size_t param_size
+)
+{
+    NvBool ret;
+
+    if (nvkms_read_trylock_pm_lock()) {
+        return NV_FALSE;
+    }
+
+    ret = nvkms_ioctl_common(popen,
+                             cmd,
+                             (NvU64)(NvUPtr)params_address, param_size) == 0;
+    nvkms_read_unlock_pm_lock();
+
+    return ret;
+}
+
 NvBool nvkms_ioctl_from_kapi
 (
    struct nvkms_per_open *popen,
@@ -1607,6 +1638,12 @@ static int nvkms_ioctl(struct inode *inode, struct file *filp,
    return status;
 }

+static long nvkms_unlocked_ioctl(struct file *filp, unsigned int cmd,
+                                 unsigned long arg)
+{
+    return nvkms_ioctl(filp->f_inode, filp, cmd, arg);
+}
+
 static unsigned int nvkms_poll(struct file *filp, poll_table *wait)
 {
    unsigned int mask = 0;
@@ -1634,17 +1671,73 @@ static unsigned int nvkms_poll(struct file *filp, poll_table *wait)
 * Module loading support code.
 *************************************************************************/

-static nvidia_module_t nvidia_modeset_module = {
+#define NVKMS_RDEV  (MKDEV(NV_MAJOR_DEVICE_NUMBER, \
+                           NV_MINOR_DEVICE_NUMBER_MODESET_DEVICE))
+
+static struct file_operations nvkms_fops = {
    .owner       = THIS_MODULE,
-    .module_name = "nvidia-modeset",
-    .instance    = 1, /* minor number: 255-1=254 */
-    .open        = nvkms_open,
-    .close       = nvkms_close,
-    .mmap        = nvkms_mmap,
-    .ioctl       = nvkms_ioctl,
    .poll        = nvkms_poll,
+    .unlocked_ioctl = nvkms_unlocked_ioctl,
+#if NVCPU_IS_X86_64 || NVCPU_IS_AARCH64
+    .compat_ioctl = nvkms_unlocked_ioctl,
+#endif
+    .mmap        = nvkms_mmap,
+    .open        = nvkms_open,
+    .release     = nvkms_close,
 };

+static struct cdev nvkms_device_cdev;
+
+static int __init nvkms_register_chrdev(void)
+{
+    int ret;
+
+    ret = register_chrdev_region(NVKMS_RDEV, 1, "nvidia-modeset");
+    if (ret < 0) {
+        return ret;
+    }
+
+    cdev_init(&nvkms_device_cdev, &nvkms_fops);
+    ret = cdev_add(&nvkms_device_cdev, NVKMS_RDEV, 1);
+    if (ret < 0) {
+        unregister_chrdev_region(NVKMS_RDEV, 1);
+        return ret;
+    }
+
+    return ret;
+}
+
+static void nvkms_unregister_chrdev(void)
+{
+    cdev_del(&nvkms_device_cdev);
+    unregister_chrdev_region(NVKMS_RDEV, 1);
+}
+
+void* nvkms_get_per_open_data(int fd)
+{
+    struct file *filp = fget(fd);
+    void *data = NULL;
+
+    if (filp) {
+        if (filp->f_op == &nvkms_fops && filp->private_data) {
+            struct nvkms_per_open *popen = filp->private_data;
+            data = popen->data;
+        }
+
+        /*
+         * fget() incremented the struct file's reference count, which needs to
+         * be balanced with a call to fput().  It is safe to decrement the
+         * reference count before returning filp->private_data because core
+         * NVKMS is currently holding the nvkms_lock, which prevents the
+         * nvkms_close() => nvKmsClose() call chain from freeing the file out
+         * from under the caller of nvkms_get_per_open_data().
+         */
+        fput(filp);
+    }
+
+    return data;
+}
+
 static int __init nvkms_init(void)
 {
    int ret;
@@ -1675,10 +1768,9 @@ static int __init nvkms_init(void)
    INIT_LIST_HEAD(&nvkms_timers.list);
    spin_lock_init(&nvkms_timers.lock);

-    ret = nvidia_register_module(&nvidia_modeset_module);
-
+    ret = nvkms_register_chrdev();
    if (ret != 0) {
-        goto fail_register_module;
+        goto fail_register_chrdev;
    }

    down(&nvkms_lock);
@@ -1697,8 +1789,8 @@ static int __init nvkms_init(void)
    return 0;

 fail_module_load:
-    nvidia_unregister_module(&nvidia_modeset_module);
-fail_register_module:
+    nvkms_unregister_chrdev();
+fail_register_chrdev:
    nv_kthread_q_stop(&nvkms_deferred_close_kthread_q);
 fail_deferred_close_kthread:
    nv_kthread_q_stop(&nvkms_kthread_q);
@@ -1762,7 +1854,7 @@ restart:
    nv_kthread_q_stop(&nvkms_deferred_close_kthread_q);
    nv_kthread_q_stop(&nvkms_kthread_q);

-    nvidia_unregister_module(&nvidia_modeset_module);
+    nvkms_unregister_chrdev();
    nvkms_free_rm();

    if (malloc_verbose) {
--- a/kernel-open/nvidia-modeset/nvidia-modeset-os-interface.h
+++ b/kernel-open/nvidia-modeset/nvidia-modeset-os-interface.h
@@ -100,6 +100,8 @@ NvBool nvkms_output_rounding_fix(void);
 NvBool nvkms_disable_hdmi_frl(void);
 NvBool nvkms_disable_vrr_memclk_switch(void);
 NvBool nvkms_hdmi_deepcolor(void);
+NvBool nvkms_vblank_sem_control(void);
+NvBool nvkms_opportunistic_display_sync(void);

 void   nvkms_call_rm    (void *ops);
 void*  nvkms_alloc      (size_t size,
@@ -302,6 +304,11 @@ NvU32 nvkms_enumerate_gpus(nv_gpu_info_t *gpu_info);

 NvBool nvkms_allow_write_combining(void);

+/*!
+ * Check if OS supports syncpoints.
+ */
+NvBool nvkms_kernel_supports_syncpts(void);
+
 /*!
 * Checks whether the fd is associated with an nvidia character device.
 */
@@ -326,6 +333,16 @@ NvBool nvkms_ioctl_from_kapi
    NvU32 cmd, void *params_address, const size_t params_size
 );

+/*!
+ * Like nvkms_ioctl_from_kapi, but return NV_FALSE instead of waiting if the
+ * power management read lock cannot be acquired.
+ */
+NvBool nvkms_ioctl_from_kapi_try_pmlock
+(
+    struct nvkms_per_open *popen,
+    NvU32 cmd, void *params_address, const size_t params_size
+);
+
 /*!
 * APIs for locking.
 */
--- a/kernel-open/nvidia-modeset/nvidia-modeset.Kbuild
+++ b/kernel-open/nvidia-modeset/nvidia-modeset.Kbuild
@@ -105,3 +105,4 @@ NV_CONFTEST_FUNCTION_COMPILE_TESTS += list_is_first
 NV_CONFTEST_FUNCTION_COMPILE_TESTS += ktime_get_real_ts64
 NV_CONFTEST_FUNCTION_COMPILE_TESTS += ktime_get_raw_ts64
 NV_CONFTEST_FUNCTION_COMPILE_TESTS += acpi_video_backlight_use_native
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += kernel_read_has_pointer_pos_arg
--- a/kernel-open/nvidia-modeset/nvkms.h
+++ b/kernel-open/nvidia-modeset/nvkms.h
@@ -103,6 +103,8 @@ NvBool nvKmsKapiGetFunctionsTableInternal
    struct NvKmsKapiFunctionsTable *funcsTable
 );

+void nvKmsKapiSuspendResume(NvBool suspend);
+
 NvBool nvKmsGetBacklight(NvU32 display_id, void *drv_priv, NvU32 *brightness);
 NvBool nvKmsSetBacklight(NvU32 display_id, void *drv_priv, NvU32 brightness);

--- a/kernel-open/nvidia-peermem/nvidia-peermem.c
+++ b/kernel-open/nvidia-peermem/nvidia-peermem.c
@@ -1,20 +1,25 @@
-/* SPDX-License-Identifier: Linux-OpenIB */
 /*
 * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved.
 * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved.
 *
- * Redistribution and use in source and binary forms, with or
- * without modification, are permitted provided that the following
- * conditions are met:
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
 *
- *  - Redistributions of source code must retain the above
- *    copyright notice, this list of conditions and the following
- *    disclaimer.
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
 *
- *  - Redistributions in binary form must reproduce the above
- *    copyright notice, this list of conditions and the following
- *    disclaimer in the documentation and/or other materials
- *    provided with the distribution.
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
@@ -43,7 +48,9 @@

 MODULE_AUTHOR("Yishai Hadas");
 MODULE_DESCRIPTION("NVIDIA GPU memory plug-in");
-MODULE_LICENSE("Linux-OpenIB");
+
+MODULE_LICENSE("Dual BSD/GPL");
+
 MODULE_VERSION(DRV_VERSION);
 enum {
        NV_MEM_PEERDIRECT_SUPPORT_DEFAULT = 0,
@@ -53,7 +60,20 @@ static int peerdirect_support = NV_MEM_PEERDIRECT_SUPPORT_DEFAULT;
 module_param(peerdirect_support, int, S_IRUGO);
 MODULE_PARM_DESC(peerdirect_support, "Set level of support for Peer-direct, 0 [default] or 1 [legacy, for example MLNX_OFED 4.9 LTS]");

-#define peer_err(FMT, ARGS...) printk(KERN_ERR "nvidia-peermem" " %s:%d " FMT, __FUNCTION__, __LINE__, ## ARGS)
+enum {
+        NV_MEM_PERSISTENT_API_SUPPORT_LEGACY = 0,
+        NV_MEM_PERSISTENT_API_SUPPORT_DEFAULT = 1,
+};
+static int persistent_api_support = NV_MEM_PERSISTENT_API_SUPPORT_DEFAULT;
+module_param(persistent_api_support, int, S_IRUGO);
+MODULE_PARM_DESC(persistent_api_support, "Set level of support for persistent APIs, 0 [legacy] or 1 [default]");
+
+#define peer_err(FMT, ARGS...) printk(KERN_ERR "nvidia-peermem" " %s:%d ERROR " FMT, __FUNCTION__, __LINE__, ## ARGS)
+#ifdef NV_MEM_DEBUG
+#define peer_trace(FMT, ARGS...) printk(KERN_DEBUG "nvidia-peermem" " %s:%d TRACE " FMT, __FUNCTION__, __LINE__, ## ARGS)
+#else
+#define peer_trace(FMT, ARGS...) do {} while (0)
+#endif

 #if defined(NV_MLNX_IB_PEER_MEM_SYMBOLS_PRESENT)

@@ -74,7 +94,10 @@ invalidate_peer_memory mem_invalidate_callback;
 static void *reg_handle = NULL;
 static void *reg_handle_nc = NULL;

+#define NV_MEM_CONTEXT_MAGIC ((u64)0xF1F4F1D0FEF0DAD0ULL)
+
 struct nv_mem_context {
+    u64 pad1;
    struct nvidia_p2p_page_table *page_table;
    struct nvidia_p2p_dma_mapping *dma_mapping;
    u64 core_context;
@@ -86,8 +109,22 @@ struct nv_mem_context {
    struct task_struct *callback_task;
    int sg_allocated;
    struct sg_table sg_head;
+    u64 pad2;
 };

+#define NV_MEM_CONTEXT_CHECK_OK(MC) ({                                  \
+    struct nv_mem_context *mc = (MC);                                   \
+    int rc = ((0 != mc) &&                                              \
+              (READ_ONCE(mc->pad1) == NV_MEM_CONTEXT_MAGIC) &&          \
+              (READ_ONCE(mc->pad2) == NV_MEM_CONTEXT_MAGIC));           \
+    if (!rc) {                                                          \
+        peer_trace("invalid nv_mem_context=%px pad1=%016llx pad2=%016llx\n", \
+                   mc,                                                  \
+                   mc?mc->pad1:0,                                       \
+                   mc?mc->pad2:0);                                      \
+    }                                                                   \
+    rc;                                                                 \
+})

 static void nv_get_p2p_free_callback(void *data)
 {
@@ -97,8 +134,9 @@ static void nv_get_p2p_free_callback(void *data)
    struct nvidia_p2p_dma_mapping *dma_mapping = NULL;

    __module_get(THIS_MODULE);
-    if (!nv_mem_context) {
-        peer_err("nv_get_p2p_free_callback -- invalid nv_mem_context\n");
+
+    if (!NV_MEM_CONTEXT_CHECK_OK(nv_mem_context)) {
+        peer_err("detected invalid context, skipping further processing\n");
        goto out;
    }

@@ -169,9 +207,11 @@ static int nv_mem_acquire(unsigned long addr, size_t size, void *peer_mem_privat
        /* Error case handled as not mine */
        return 0;

+    nv_mem_context->pad1 = NV_MEM_CONTEXT_MAGIC;
    nv_mem_context->page_virt_start = addr & GPU_PAGE_MASK;
    nv_mem_context->page_virt_end   = (addr + size + GPU_PAGE_SIZE - 1) & GPU_PAGE_MASK;
    nv_mem_context->mapped_size  = nv_mem_context->page_virt_end - nv_mem_context->page_virt_start;
+    nv_mem_context->pad2 = NV_MEM_CONTEXT_MAGIC;

    ret = nvidia_p2p_get_pages(0, 0, nv_mem_context->page_virt_start, nv_mem_context->mapped_size,
                               &nv_mem_context->page_table, nv_mem_dummy_callback, nv_mem_context);
@@ -195,6 +235,7 @@ static int nv_mem_acquire(unsigned long addr, size_t size, void *peer_mem_privat
    return 1;

 err:
+    memset(nv_mem_context, 0, sizeof(*nv_mem_context));
    kfree(nv_mem_context);

    /* Error case handled as not mine */
@@ -347,6 +388,7 @@ static void nv_mem_release(void *context)
        sg_free_table(&nv_mem_context->sg_head);
        nv_mem_context->sg_allocated = 0;
    }
+    memset(nv_mem_context, 0, sizeof(*nv_mem_context));
    kfree(nv_mem_context);
    module_put(THIS_MODULE);
    return;
@@ -444,32 +486,8 @@ static struct peer_memory_client nv_mem_client_nc = {
    .release        = nv_mem_release,
 };

-#endif /* NV_MLNX_IB_PEER_MEM_SYMBOLS_PRESENT */
-
-static int nv_mem_param_conf_check(void)
+static int nv_mem_legacy_client_init(void)
 {
-    int rc = 0;
-    switch (peerdirect_support) {
-    case NV_MEM_PEERDIRECT_SUPPORT_DEFAULT:
-    case NV_MEM_PEERDIRECT_SUPPORT_LEGACY:
-        break;
-    default:
-        peer_err("invalid peerdirect_support param value %d\n", peerdirect_support);
-        rc = -EINVAL;
-        break;
-    }
-    return rc;
-}
-
-static int __init nv_mem_client_init(void)
-{
-    int rc;
-    rc = nv_mem_param_conf_check();
-    if (rc) {
-        return rc;
-    }
-
-#if defined (NV_MLNX_IB_PEER_MEM_SYMBOLS_PRESENT)
    // off by one, to leave space for the trailing '1' which is flagging
    // the new client type
    BUG_ON(strlen(DRV_NAME) > IB_PEER_MEMORY_NAME_MAX-1);
@@ -498,19 +516,96 @@ static int __init nv_mem_client_init(void)
                         &mem_invalidate_callback);
    if (!reg_handle) {
        peer_err("nv_mem_client_init -- error while registering traditional client\n");
-        rc = -EINVAL;
-        goto out;
+        return -EINVAL;
+    }
+    return 0;
+}
+
+static int nv_mem_nc_client_init(void)
+{
+    // The nc client enables support for persistent pages.
+    if (persistent_api_support == NV_MEM_PERSISTENT_API_SUPPORT_LEGACY)
+    {
+        //
+        // If legacy behavior is forced via module param,
+        // both legacy and persistent clients are registered and are named
+        // "nv_mem"(legacy) and "nv_mem_nc"(persistent).
+        //
+        strcpy(nv_mem_client_nc.name, DRV_NAME "_nc");
+    }
+    else
+    {
+        //
+        // With default persistent behavior, the client name shall be "nv_mem"
+        // so that libraries can use the persistent client under the same name.
+        //
+        strcpy(nv_mem_client_nc.name, DRV_NAME);
    }

-    // The nc client enables support for persistent pages.
-    strcpy(nv_mem_client_nc.name, DRV_NAME "_nc");
    strcpy(nv_mem_client_nc.version, DRV_VERSION);
    reg_handle_nc = ib_register_peer_memory_client(&nv_mem_client_nc, NULL);
    if (!reg_handle_nc) {
        peer_err("nv_mem_client_init -- error while registering nc client\n");
-        rc = -EINVAL;
-        goto out;
+        return -EINVAL;
    }
+    return 0;
+}
+
+#endif /* NV_MLNX_IB_PEER_MEM_SYMBOLS_PRESENT */
+
+static int nv_mem_param_peerdirect_conf_check(void)
+{
+    int rc = 0;
+    switch (peerdirect_support) {
+    case NV_MEM_PEERDIRECT_SUPPORT_DEFAULT:
+    case NV_MEM_PEERDIRECT_SUPPORT_LEGACY:
+        break;
+    default:
+        peer_err("invalid peerdirect_support param value %d\n", peerdirect_support);
+        rc = -EINVAL;
+        break;
+    }
+    return rc;
+}
+
+static int nv_mem_param_persistent_api_conf_check(void)
+{
+    int rc = 0;
+    switch (persistent_api_support) {
+    case NV_MEM_PERSISTENT_API_SUPPORT_DEFAULT:
+    case NV_MEM_PERSISTENT_API_SUPPORT_LEGACY:
+        break;
+    default:
+        peer_err("invalid persistent_api_support param value %d\n", persistent_api_support);
+        rc = -EINVAL;
+        break;
+    }
+    return rc;
+}
+
+static int __init nv_mem_client_init(void)
+{
+#if defined (NV_MLNX_IB_PEER_MEM_SYMBOLS_PRESENT)
+    int rc;
+    rc = nv_mem_param_peerdirect_conf_check();
+    if (rc) {
+        return rc;
+    }
+
+    rc = nv_mem_param_persistent_api_conf_check();
+    if (rc) {
+        return rc;
+    }
+
+    if (persistent_api_support == NV_MEM_PERSISTENT_API_SUPPORT_LEGACY) {
+        rc = nv_mem_legacy_client_init();
+        if (rc)
+            goto out;
+    }
+
+    rc = nv_mem_nc_client_init();
+    if (rc)
+        goto out;

 out:
    if (rc) {
--- a/kernel-open/nvidia-uvm/clc96f.h
+++ b/kernel-open/nvidia-uvm/clc96f.h
@@ -0,0 +1,329 @@
+/*******************************************************************************
+    Copyright (c) 2012-2015 NVIDIA Corporation
+
+    Permission is hereby granted, free of charge, to any person obtaining a copy
+    of this software and associated documentation files (the "Software"), to
+    deal in the Software without restriction, including without limitation the
+    rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+    sell copies of the Software, and to permit persons to whom the Software is
+    furnished to do so, subject to the following conditions:
+
+    The above copyright notice and this permission notice shall be
+    included in all copies or substantial portions of the Software.
+
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+    THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+    FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+    DEALINGS IN THE SOFTWARE.
+
+*******************************************************************************/
+
+
+#ifndef _clc96f_h_
+#define _clc96f_h_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "nvtypes.h"
+
+/* class BLACKWELL_CHANNEL_GPFIFO  */
+/*
+ * Documentation for BLACKWELL_CHANNEL_GPFIFO can be found in dev_pbdma.ref,
+ * chapter "User Control Registers". It is documented as device NV_UDMA.
+ * The GPFIFO format itself is also documented in dev_pbdma.ref,
+ * NV_PPBDMA_GP_ENTRY_*. The pushbuffer format is documented in dev_ram.ref,
+ * chapter "FIFO DMA RAM", NV_FIFO_DMA_*.
+ *
+ * Note there is no .mfs file for this class.
+ */
+#define  BLACKWELL_CHANNEL_GPFIFO_A                           (0x0000C96F)
+
+#define NVC96F_TYPEDEF                             BLACKWELL_CHANNELChannelGPFifoA
+
+/* dma flow control data structure */
+typedef volatile struct Nvc96fControl_struct {
+ NvU32 Ignored00[0x23];        /*                                  0000-008b*/
+ NvU32 GPPut;                   /* GP FIFO put offset               008c-008f*/
+ NvU32 Ignored01[0x5c];
+} Nvc96fControl, BlackwellAControlGPFifo;
+
+/* fields and values */
+#define NVC96F_NUMBER_OF_SUBCHANNELS                               (8)
+#define NVC96F_SET_OBJECT                                          (0x00000000)
+#define NVC96F_SET_OBJECT_NVCLASS                                         15:0
+#define NVC96F_SET_OBJECT_ENGINE                                         20:16
+#define NVC96F_SET_OBJECT_ENGINE_SW                                 0x0000001f
+#define NVC96F_NOP                                                 (0x00000008)
+#define NVC96F_NOP_HANDLE                                                 31:0
+#define NVC96F_NON_STALL_INTERRUPT                                 (0x00000020)
+#define NVC96F_NON_STALL_INTERRUPT_HANDLE                                 31:0
+#define NVC96F_FB_FLUSH                                            (0x00000024) // Deprecated - use MEMBAR TYPE SYS_MEMBAR
+#define NVC96F_FB_FLUSH_HANDLE                                            31:0
+// NOTE - MEM_OP_A and MEM_OP_B have been replaced in gp100 with methods for
+// specifying the page address for a targeted TLB invalidate and the uTLB for
+// a targeted REPLAY_CANCEL for UVM.
+// The previous MEM_OP_A/B functionality is in MEM_OP_C/D, with slightly
+// rearranged fields.
+#define NVC96F_MEM_OP_A                                            (0x00000028)
+#define NVC96F_MEM_OP_A_TLB_INVALIDATE_CANCEL_TARGET_CLIENT_UNIT_ID        5:0  // only relevant for REPLAY_CANCEL_TARGETED
+#define NVC96F_MEM_OP_A_TLB_INVALIDATE_INVALIDATION_SIZE                   5:0  // Used to specify size of invalidate, used for invalidates which are not of the REPLAY_CANCEL_TARGETED type
+#define NVC96F_MEM_OP_A_TLB_INVALIDATE_CANCEL_TARGET_GPC_ID               10:6  // only relevant for REPLAY_CANCEL_TARGETED
+#define NVC96F_MEM_OP_A_TLB_INVALIDATE_INVAL_SCOPE                         7:6  // only relevant for invalidates with NVC96F_MEM_OP_C_TLB_INVALIDATE_REPLAY_NONE for invalidating  link TLB only, or non-link TLB only or all TLBs
+#define NVC96F_MEM_OP_A_TLB_INVALIDATE_INVAL_SCOPE_ALL_TLBS                  0
+#define NVC96F_MEM_OP_A_TLB_INVALIDATE_INVAL_SCOPE_LINK_TLBS                 1
+#define NVC96F_MEM_OP_A_TLB_INVALIDATE_INVAL_SCOPE_NON_LINK_TLBS             2
+#define NVC96F_MEM_OP_A_TLB_INVALIDATE_INVAL_SCOPE_RSVRVD                    3
+#define NVC96F_MEM_OP_A_TLB_INVALIDATE_CANCEL_MMU_ENGINE_ID                8:0  // only relevant for REPLAY_CANCEL_VA_GLOBAL
+#define NVC96F_MEM_OP_A_TLB_INVALIDATE_SYSMEMBAR                         11:11
+#define NVC96F_MEM_OP_A_TLB_INVALIDATE_SYSMEMBAR_EN                 0x00000001
+#define NVC96F_MEM_OP_A_TLB_INVALIDATE_SYSMEMBAR_DIS                0x00000000
+#define NVC96F_MEM_OP_A_TLB_INVALIDATE_TARGET_ADDR_LO                    31:12
+#define NVC96F_MEM_OP_B                                            (0x0000002c)
+#define NVC96F_MEM_OP_B_TLB_INVALIDATE_TARGET_ADDR_HI                     31:0
+#define NVC96F_MEM_OP_C                                            (0x00000030)
+#define NVC96F_MEM_OP_C_MEMBAR_TYPE                                        2:0
+#define NVC96F_MEM_OP_C_MEMBAR_TYPE_SYS_MEMBAR                      0x00000000
+#define NVC96F_MEM_OP_C_MEMBAR_TYPE_MEMBAR                          0x00000001
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PDB                                 0:0
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PDB_ONE                      0x00000000
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PDB_ALL                      0x00000001  // Probably nonsensical for MMU_TLB_INVALIDATE_TARGETED
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_GPC                                 1:1
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_GPC_ENABLE                   0x00000000
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_GPC_DISABLE                  0x00000001
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_REPLAY                              4:2  // only relevant if GPC ENABLE
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_REPLAY_NONE                  0x00000000
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_REPLAY_START                 0x00000001
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_REPLAY_START_ACK_ALL         0x00000002
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_REPLAY_CANCEL_TARGETED       0x00000003
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_REPLAY_CANCEL_GLOBAL         0x00000004
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_REPLAY_CANCEL_VA_GLOBAL      0x00000005
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_ACK_TYPE                            6:5  // only relevant if GPC ENABLE
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_ACK_TYPE_NONE                0x00000000
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_ACK_TYPE_GLOBALLY            0x00000001
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_ACK_TYPE_INTRANODE           0x00000002
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE                         9:7 //only relevant for REPLAY_CANCEL_VA_GLOBAL
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE_VIRT_READ                 0
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE_VIRT_WRITE                1
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE_VIRT_ATOMIC_STRONG        2
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE_VIRT_RSVRVD               3
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE_VIRT_ATOMIC_WEAK          4
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE_VIRT_ATOMIC_ALL           5
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE_VIRT_WRITE_AND_ATOMIC     6
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_ACCESS_TYPE_VIRT_ALL                  7
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL                    9:7  // Invalidate affects this level and all below
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_ALL         0x00000000  // Invalidate tlb caches at all levels of the page table
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_PTE_ONLY    0x00000001
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE0  0x00000002
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE1  0x00000003
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE2  0x00000004
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE3  0x00000005
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE4  0x00000006
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE5  0x00000007
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PDB_APERTURE                          11:10  // only relevant if PDB_ONE
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PDB_APERTURE_VID_MEM             0x00000000
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PDB_APERTURE_SYS_MEM_COHERENT    0x00000002
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PDB_APERTURE_SYS_MEM_NONCOHERENT 0x00000003
+#define NVC96F_MEM_OP_C_TLB_INVALIDATE_PDB_ADDR_LO                       31:12  // only relevant if PDB_ONE
+#define NVC96F_MEM_OP_C_ACCESS_COUNTER_CLR_TARGETED_NOTIFY_TAG            19:0
+// MEM_OP_D MUST be preceded by MEM_OPs A-C.
+#define NVC96F_MEM_OP_D                                            (0x00000034)
+#define NVC96F_MEM_OP_D_TLB_INVALIDATE_PDB_ADDR_HI                        26:0  // only relevant if PDB_ONE
+#define NVC96F_MEM_OP_D_OPERATION                                        31:27
+#define NVC96F_MEM_OP_D_OPERATION_MEMBAR                            0x00000005
+#define NVC96F_MEM_OP_D_OPERATION_MMU_TLB_INVALIDATE                0x00000009
+#define NVC96F_MEM_OP_D_OPERATION_MMU_TLB_INVALIDATE_TARGETED       0x0000000a
+#define NVC96F_MEM_OP_D_OPERATION_MMU_OPERATION                     0x0000000b
+#define NVC96F_MEM_OP_D_OPERATION_L2_PEERMEM_INVALIDATE             0x0000000d
+#define NVC96F_MEM_OP_D_OPERATION_L2_SYSMEM_INVALIDATE              0x0000000e
+// CLEAN_LINES is an alias for Tegra/GPU IP usage
+#define NVC96F_MEM_OP_B_OPERATION_L2_INVALIDATE_CLEAN_LINES         0x0000000e
+#define NVC96F_MEM_OP_D_OPERATION_L2_CLEAN_COMPTAGS                 0x0000000f
+#define NVC96F_MEM_OP_D_OPERATION_L2_FLUSH_DIRTY                    0x00000010
+#define NVC96F_MEM_OP_D_OPERATION_L2_SYSMEM_NCOH_INVALIDATE         0x00000011
+#define NVC96F_MEM_OP_D_OPERATION_L2_SYSMEM_COH_INVALIDATE          0x00000012
+#define NVC96F_MEM_OP_D_OPERATION_L2_WAIT_FOR_SYS_PENDING_READS     0x00000015
+#define NVC96F_MEM_OP_D_OPERATION_ACCESS_COUNTER_CLR                0x00000016
+#define NVC96F_MEM_OP_D_ACCESS_COUNTER_CLR_TYPE                            1:0
+#define NVC96F_MEM_OP_D_ACCESS_COUNTER_CLR_TYPE_MIMC                0x00000000
+#define NVC96F_MEM_OP_D_ACCESS_COUNTER_CLR_TYPE_MOMC                0x00000001
+#define NVC96F_MEM_OP_D_ACCESS_COUNTER_CLR_TYPE_ALL                 0x00000002
+#define NVC96F_MEM_OP_D_ACCESS_COUNTER_CLR_TYPE_TARGETED            0x00000003
+#define NVC96F_MEM_OP_D_ACCESS_COUNTER_CLR_TARGETED_TYPE                   2:2
+#define NVC96F_MEM_OP_D_ACCESS_COUNTER_CLR_TARGETED_TYPE_MIMC       0x00000000
+#define NVC96F_MEM_OP_D_ACCESS_COUNTER_CLR_TARGETED_TYPE_MOMC       0x00000001
+#define NVC96F_MEM_OP_D_ACCESS_COUNTER_CLR_TARGETED_BANK                   6:3
+#define NVC96F_MEM_OP_D_MMU_OPERATION_TYPE                               23:20
+#define NVC96F_MEM_OP_D_MMU_OPERATION_TYPE_RESERVED                 0x00000000
+#define NVC96F_MEM_OP_D_MMU_OPERATION_TYPE_VIDMEM_ACCESS_BIT_DUMP   0x00000001
+#define NVC96F_SEM_ADDR_LO                                         (0x0000005c)
+#define NVC96F_SEM_ADDR_LO_OFFSET                                         31:2
+#define NVC96F_SEM_ADDR_HI                                         (0x00000060)
+#define NVC96F_SEM_ADDR_HI_OFFSET                                         24:0
+#define NVC96F_SEM_PAYLOAD_LO                                      (0x00000064)
+#define NVC96F_SEM_PAYLOAD_LO_PAYLOAD                                     31:0
+#define NVC96F_SEM_PAYLOAD_HI                                      (0x00000068)
+#define NVC96F_SEM_PAYLOAD_HI_PAYLOAD                                     31:0
+#define NVC96F_SEM_EXECUTE                                         (0x0000006c)
+#define NVC96F_SEM_EXECUTE_OPERATION                                       2:0
+#define NVC96F_SEM_EXECUTE_OPERATION_ACQUIRE                        0x00000000
+#define NVC96F_SEM_EXECUTE_OPERATION_RELEASE                        0x00000001
+#define NVC96F_SEM_EXECUTE_OPERATION_ACQ_STRICT_GEQ                 0x00000002
+#define NVC96F_SEM_EXECUTE_OPERATION_ACQ_CIRC_GEQ                   0x00000003
+#define NVC96F_SEM_EXECUTE_OPERATION_ACQ_AND                        0x00000004
+#define NVC96F_SEM_EXECUTE_OPERATION_ACQ_NOR                        0x00000005
+#define NVC96F_SEM_EXECUTE_OPERATION_REDUCTION                      0x00000006
+#define NVC96F_SEM_EXECUTE_ACQUIRE_SWITCH_TSG                            12:12
+#define NVC96F_SEM_EXECUTE_ACQUIRE_SWITCH_TSG_DIS                   0x00000000
+#define NVC96F_SEM_EXECUTE_ACQUIRE_SWITCH_TSG_EN                    0x00000001
+#define NVC96F_SEM_EXECUTE_ACQUIRE_RECHECK                               18:18
+#define NVC96F_SEM_EXECUTE_ACQUIRE_RECHECK_DIS                      0x00000000
+#define NVC96F_SEM_EXECUTE_ACQUIRE_RECHECK_EN                       0x00000001
+#define NVC96F_SEM_EXECUTE_RELEASE_WFI                                   20:20
+#define NVC96F_SEM_EXECUTE_RELEASE_WFI_DIS                          0x00000000
+#define NVC96F_SEM_EXECUTE_RELEASE_WFI_EN                           0x00000001
+#define NVC96F_SEM_EXECUTE_PAYLOAD_SIZE                                  24:24
+#define NVC96F_SEM_EXECUTE_PAYLOAD_SIZE_32BIT                       0x00000000
+#define NVC96F_SEM_EXECUTE_PAYLOAD_SIZE_64BIT                       0x00000001
+#define NVC96F_SEM_EXECUTE_RELEASE_TIMESTAMP                             25:25
+#define NVC96F_SEM_EXECUTE_RELEASE_TIMESTAMP_DIS                    0x00000000
+#define NVC96F_SEM_EXECUTE_RELEASE_TIMESTAMP_EN                     0x00000001
+#define NVC96F_SEM_EXECUTE_REDUCTION                                     30:27
+#define NVC96F_SEM_EXECUTE_REDUCTION_IMIN                           0x00000000
+#define NVC96F_SEM_EXECUTE_REDUCTION_IMAX                           0x00000001
+#define NVC96F_SEM_EXECUTE_REDUCTION_IXOR                           0x00000002
+#define NVC96F_SEM_EXECUTE_REDUCTION_IAND                           0x00000003
+#define NVC96F_SEM_EXECUTE_REDUCTION_IOR                            0x00000004
+#define NVC96F_SEM_EXECUTE_REDUCTION_IADD                           0x00000005
+#define NVC96F_SEM_EXECUTE_REDUCTION_INC                            0x00000006
+#define NVC96F_SEM_EXECUTE_REDUCTION_DEC                            0x00000007
+#define NVC96F_SEM_EXECUTE_REDUCTION_FORMAT                              31:31
+#define NVC96F_SEM_EXECUTE_REDUCTION_FORMAT_SIGNED                  0x00000000
+#define NVC96F_SEM_EXECUTE_REDUCTION_FORMAT_UNSIGNED                0x00000001
+#define NVC96F_WFI                                                 (0x00000078)
+#define NVC96F_WFI_SCOPE                                                   0:0
+#define NVC96F_WFI_SCOPE_CURRENT_SCG_TYPE                           0x00000000
+#define NVC96F_WFI_SCOPE_CURRENT_VEID                               0x00000000
+#define NVC96F_WFI_SCOPE_ALL                                        0x00000001
+#define NVC96F_YIELD                                               (0x00000080)
+#define NVC96F_YIELD_OP                                                    1:0
+#define NVC96F_YIELD_OP_NOP                                         0x00000000
+#define NVC96F_YIELD_OP_TSG                                         0x00000003
+#define NVC96F_CLEAR_FAULTED                                       (0x00000084)
+// Note: RM provides the HANDLE as an opaque value; the internal detail fields
+// are intentionally not exposed to the driver through these defines.
+#define NVC96F_CLEAR_FAULTED_HANDLE                                       30:0
+#define NVC96F_CLEAR_FAULTED_TYPE                                        31:31
+#define NVC96F_CLEAR_FAULTED_TYPE_PBDMA_FAULTED                     0x00000000
+#define NVC96F_CLEAR_FAULTED_TYPE_ENG_FAULTED                       0x00000001
+
+
+/* GPFIFO entry format */
+#define NVC96F_GP_ENTRY__SIZE                                          8
+#define NVC96F_GP_ENTRY0_FETCH                                       0:0
+#define NVC96F_GP_ENTRY0_FETCH_UNCONDITIONAL                  0x00000000
+#define NVC96F_GP_ENTRY0_FETCH_CONDITIONAL                    0x00000001
+#define NVC96F_GP_ENTRY0_GET                                        31:2
+#define NVC96F_GP_ENTRY0_OPERAND                                    31:0
+#define NVC96F_GP_ENTRY0_PB_EXTENDED_BASE_OPERAND                   24:8
+#define NVC96F_GP_ENTRY1_GET_HI                                      7:0
+#define NVC96F_GP_ENTRY1_LEVEL                                       9:9
+#define NVC96F_GP_ENTRY1_LEVEL_MAIN                           0x00000000
+#define NVC96F_GP_ENTRY1_LEVEL_SUBROUTINE                     0x00000001
+#define NVC96F_GP_ENTRY1_LENGTH                                    30:10
+#define NVC96F_GP_ENTRY1_SYNC                                      31:31
+#define NVC96F_GP_ENTRY1_SYNC_PROCEED                         0x00000000
+#define NVC96F_GP_ENTRY1_SYNC_WAIT                            0x00000001
+#define NVC96F_GP_ENTRY1_OPCODE                                      7:0
+#define NVC96F_GP_ENTRY1_OPCODE_NOP                           0x00000000
+#define NVC96F_GP_ENTRY1_OPCODE_ILLEGAL                       0x00000001
+#define NVC96F_GP_ENTRY1_OPCODE_GP_CRC                        0x00000002
+#define NVC96F_GP_ENTRY1_OPCODE_PB_CRC                        0x00000003
+#define NVC96F_GP_ENTRY1_OPCODE_SET_PB_SEGMENT_EXTENDED_BASE  0x00000004
+
+/* dma method formats */
+#define NVC96F_DMA_METHOD_ADDRESS_OLD                              12:2
+#define NVC96F_DMA_METHOD_ADDRESS                                  11:0
+#define NVC96F_DMA_SUBDEVICE_MASK                                  15:4
+#define NVC96F_DMA_METHOD_SUBCHANNEL                               15:13
+#define NVC96F_DMA_TERT_OP                                         17:16
+#define NVC96F_DMA_TERT_OP_GRP0_INC_METHOD                         (0x00000000)
+#define NVC96F_DMA_TERT_OP_GRP0_SET_SUB_DEV_MASK                   (0x00000001)
+#define NVC96F_DMA_TERT_OP_GRP0_STORE_SUB_DEV_MASK                 (0x00000002)
+#define NVC96F_DMA_TERT_OP_GRP0_USE_SUB_DEV_MASK                   (0x00000003)
+#define NVC96F_DMA_TERT_OP_GRP2_NON_INC_METHOD                     (0x00000000)
+#define NVC96F_DMA_METHOD_COUNT_OLD                                28:18
+#define NVC96F_DMA_METHOD_COUNT                                    28:16
+#define NVC96F_DMA_IMMD_DATA                                       28:16
+#define NVC96F_DMA_SEC_OP                                          31:29
+#define NVC96F_DMA_SEC_OP_GRP0_USE_TERT                            (0x00000000)
+#define NVC96F_DMA_SEC_OP_INC_METHOD                               (0x00000001)
+#define NVC96F_DMA_SEC_OP_GRP2_USE_TERT                            (0x00000002)
+#define NVC96F_DMA_SEC_OP_NON_INC_METHOD                           (0x00000003)
+#define NVC96F_DMA_SEC_OP_IMMD_DATA_METHOD                         (0x00000004)
+#define NVC96F_DMA_SEC_OP_ONE_INC                                  (0x00000005)
+#define NVC96F_DMA_SEC_OP_RESERVED6                                (0x00000006)
+#define NVC96F_DMA_SEC_OP_END_PB_SEGMENT                           (0x00000007)
+/* dma incrementing method format */
+#define NVC96F_DMA_INCR_ADDRESS                                    11:0
+#define NVC96F_DMA_INCR_SUBCHANNEL                                 15:13
+#define NVC96F_DMA_INCR_COUNT                                      28:16
+#define NVC96F_DMA_INCR_OPCODE                                     31:29
+#define NVC96F_DMA_INCR_OPCODE_VALUE                               (0x00000001)
+#define NVC96F_DMA_INCR_DATA                                       31:0
+/* dma non-incrementing method format */
+#define NVC96F_DMA_NONINCR_ADDRESS                                 11:0
+#define NVC96F_DMA_NONINCR_SUBCHANNEL                              15:13
+#define NVC96F_DMA_NONINCR_COUNT                                   28:16
+#define NVC96F_DMA_NONINCR_OPCODE                                  31:29
+#define NVC96F_DMA_NONINCR_OPCODE_VALUE                            (0x00000003)
+#define NVC96F_DMA_NONINCR_DATA                                    31:0
+/* dma increment-once method format */
+#define NVC96F_DMA_ONEINCR_ADDRESS                                 11:0
+#define NVC96F_DMA_ONEINCR_SUBCHANNEL                              15:13
+#define NVC96F_DMA_ONEINCR_COUNT                                   28:16
+#define NVC96F_DMA_ONEINCR_OPCODE                                  31:29
+#define NVC96F_DMA_ONEINCR_OPCODE_VALUE                            (0x00000005)
+#define NVC96F_DMA_ONEINCR_DATA                                    31:0
+/* dma no-operation format */
+#define NVC96F_DMA_NOP                                             (0x00000000)
+/* dma immediate-data format */
+#define NVC96F_DMA_IMMD_ADDRESS                                    11:0
+#define NVC96F_DMA_IMMD_SUBCHANNEL                                 15:13
+#define NVC96F_DMA_IMMD_DATA                                       28:16
+#define NVC96F_DMA_IMMD_OPCODE                                     31:29
+#define NVC96F_DMA_IMMD_OPCODE_VALUE                               (0x00000004)
+/* dma set sub-device mask format */
+#define NVC96F_DMA_SET_SUBDEVICE_MASK_VALUE                        15:4
+#define NVC96F_DMA_SET_SUBDEVICE_MASK_OPCODE                       31:16
+#define NVC96F_DMA_SET_SUBDEVICE_MASK_OPCODE_VALUE                 (0x00000001)
+/* dma store sub-device mask format */
+#define NVC96F_DMA_STORE_SUBDEVICE_MASK_VALUE                      15:4
+#define NVC96F_DMA_STORE_SUBDEVICE_MASK_OPCODE                     31:16
+#define NVC96F_DMA_STORE_SUBDEVICE_MASK_OPCODE_VALUE               (0x00000002)
+/* dma use sub-device mask format */
+#define NVC96F_DMA_USE_SUBDEVICE_MASK_OPCODE                       31:16
+#define NVC96F_DMA_USE_SUBDEVICE_MASK_OPCODE_VALUE                 (0x00000003)
+/* dma end-segment format */
+#define NVC96F_DMA_ENDSEG_OPCODE                                   31:29
+#define NVC96F_DMA_ENDSEG_OPCODE_VALUE                             (0x00000007)
+/* dma legacy incrementing/non-incrementing formats */
+#define NVC96F_DMA_ADDRESS                                         12:2
+#define NVC96F_DMA_SUBCH                                           15:13
+#define NVC96F_DMA_OPCODE3                                         17:16
+#define NVC96F_DMA_OPCODE3_NONE                                    (0x00000000)
+#define NVC96F_DMA_COUNT                                           28:18
+#define NVC96F_DMA_OPCODE                                          31:29
+#define NVC96F_DMA_OPCODE_METHOD                                   (0x00000000)
+#define NVC96F_DMA_OPCODE_NONINC_METHOD                            (0x00000002)
+#define NVC96F_DMA_DATA                                            31:0
+
+#ifdef __cplusplus
+};     /* extern "C" */
+#endif
+
+#endif /* _clc96f_h_ */
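
Note on the `NVC96F_DMA_INCR_*` defines in the header above: they give each field as a `high:low` bit range rather than as a mask and shift. The following is a minimal sketch of how one incrementing-method header dword could be packed from those ranges. The `FIELD_*` macros and the `nvc96f_incr_header` function are hypothetical helpers written for this illustration and are not part of the patch. The sketch also assumes that the ADDRESS field holds the method byte offset divided by four, which this header does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Field ranges copied from clc96f.h ("high:low" bit positions). */
#define NVC96F_WFI                     (0x00000078)
#define NVC96F_DMA_INCR_ADDRESS        11:0
#define NVC96F_DMA_INCR_SUBCHANNEL     15:13
#define NVC96F_DMA_INCR_COUNT          28:16
#define NVC96F_DMA_INCR_OPCODE         31:29
#define NVC96F_DMA_INCR_OPCODE_VALUE   (0x00000001)

/* Hypothetical helpers, local to this sketch.
 * "0 ? hi:lo" evaluates to lo and "1 ? hi:lo" evaluates to hi, so a
 * range written as "31:29" can be split by the conditional operator. */
#define FIELD_LO(range)     (0 ? range)
#define FIELD_HI(range)     (1 ? range)
#define FIELD_MASK(range)   (0xFFFFFFFFu >> (31 - FIELD_HI(range) + FIELD_LO(range)))
#define FIELD_PUT(range, v) (((uint32_t)(v) & FIELD_MASK(range)) << FIELD_LO(range))

/* Pack the header dword of an incrementing method.
 * Assumption: ADDRESS is the method byte offset shifted right by 2. */
static uint32_t nvc96f_incr_header(uint32_t subch,
                                   uint32_t method_offset,
                                   uint32_t count)
{
    /* Reject values that would be silently truncated by the field masks. */
    assert((method_offset & 3u) == 0);
    assert((method_offset >> 2) <= FIELD_MASK(NVC96F_DMA_INCR_ADDRESS));
    assert(subch <= FIELD_MASK(NVC96F_DMA_INCR_SUBCHANNEL));
    assert(count <= FIELD_MASK(NVC96F_DMA_INCR_COUNT));

    return FIELD_PUT(NVC96F_DMA_INCR_OPCODE, NVC96F_DMA_INCR_OPCODE_VALUE) |
           FIELD_PUT(NVC96F_DMA_INCR_COUNT, count) |
           FIELD_PUT(NVC96F_DMA_INCR_SUBCHANNEL, subch) |
           FIELD_PUT(NVC96F_DMA_INCR_ADDRESS, method_offset >> 2);
}
```

Under these assumptions, `nvc96f_incr_header(0, NVC96F_WFI, 1)` evaluates to `0x2001001E`: opcode 1 in bits 31:29, count 1 in bits 28:16, subchannel 0, and address `0x78 >> 2 = 0x1E` in bits 11:0. The header dword would then be followed by `count` data dwords, per the `NVC96F_DMA_INCR_DATA` range of `31:0`.
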
--- a/kernel-open/nvidia-uvm/clc9b5.h
+++ b/kernel-open/nvidia-uvm/clc9b5.h
@@ -0,0 +1,460 @@
+/*******************************************************************************
+    Copyright (c) 1993-2004 NVIDIA Corporation
+
+    Permission is hereby granted, free of charge, to any person obtaining a copy
+    of this software and associated documentation files (the "Software"), to
+    deal in the Software without restriction, including without limitation the
+    rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+    sell copies of the Software, and to permit persons to whom the Software is
+    furnished to do so, subject to the following conditions:
+
+    The above copyright notice and this permission notice shall be
+    included in all copies or substantial portions of the Software.
+
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+    THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+    FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+    DEALINGS IN THE SOFTWARE.
+
+*******************************************************************************/
+
+
+
+#include "nvtypes.h"
+
+#ifndef _clc9b5_h_
+#define _clc9b5_h_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define BLACKWELL_DMA_COPY_A                                                            (0x0000C9B5)
+
+typedef volatile struct _clc9b5_tag0 {
+    NvV32 Reserved00[0x40];
+    NvV32 Nop;                                                                  // 0x00000100 - 0x00000103
+    NvV32 Reserved01[0xF];
+    NvV32 PmTrigger;                                                            // 0x00000140 - 0x00000143
+    NvV32 Reserved02[0x36];
+    NvV32 SetMonitoredFenceType;                                                // 0x0000021C - 0x0000021F
+    NvV32 SetMonitoredFenceSignalAddrBaseUpper;                                 // 0x00000220 - 0x00000223
+    NvV32 SetMonitoredFenceSignalAddrBaseLower;                                 // 0x00000224 - 0x00000227
+    NvV32 Reserved03[0x6];
+    NvV32 SetSemaphoreA;                                                        // 0x00000240 - 0x00000243
+    NvV32 SetSemaphoreB;                                                        // 0x00000244 - 0x00000247
+    NvV32 SetSemaphorePayload;                                                  // 0x00000248 - 0x0000024B
+    NvV32 SetSemaphorePayloadUpper;                                             // 0x0000024C - 0x0000024F
+    NvV32 Reserved04[0x1];
+    NvV32 SetRenderEnableA;                                                     // 0x00000254 - 0x00000257
+    NvV32 SetRenderEnableB;                                                     // 0x00000258 - 0x0000025B
+    NvV32 SetRenderEnableC;                                                     // 0x0000025C - 0x0000025F
+    NvV32 SetSrcPhysMode;                                                       // 0x00000260 - 0x00000263
+    NvV32 SetDstPhysMode;                                                       // 0x00000264 - 0x00000267
+    NvV32 Reserved05[0x26];
+    NvV32 LaunchDma;                                                            // 0x00000300 - 0x00000303
+    NvV32 Reserved06[0x3F];
+    NvV32 OffsetInUpper;                                                        // 0x00000400 - 0x00000403
+    NvV32 OffsetInLower;                                                        // 0x00000404 - 0x00000407
+    NvV32 OffsetOutUpper;                                                       // 0x00000408 - 0x0000040B
+    NvV32 OffsetOutLower;                                                       // 0x0000040C - 0x0000040F
+    NvV32 PitchIn;                                                              // 0x00000410 - 0x00000413
+    NvV32 PitchOut;                                                             // 0x00000414 - 0x00000417
+    NvV32 LineLengthIn;                                                         // 0x00000418 - 0x0000041B
+    NvV32 LineCount;                                                            // 0x0000041C - 0x0000041F
+    NvV32 Reserved07[0x38];
+    NvV32 SetSecureCopyMode;                                                    // 0x00000500 - 0x00000503
+    NvV32 SetDecryptIv0;                                                        // 0x00000504 - 0x00000507
+    NvV32 SetDecryptIv1;                                                        // 0x00000508 - 0x0000050B
+    NvV32 SetDecryptIv2;                                                        // 0x0000050C - 0x0000050F
+    NvV32 Reserved_SetAESCounter;                                               // 0x00000510 - 0x00000513
+    NvV32 SetDecryptAuthTagCompareAddrUpper;                                    // 0x00000514 - 0x00000517
+    NvV32 SetDecryptAuthTagCompareAddrLower;                                    // 0x00000518 - 0x0000051B
+    NvV32 Reserved08[0x5];
+    NvV32 SetEncryptAuthTagAddrUpper;                                           // 0x00000530 - 0x00000533
+    NvV32 SetEncryptAuthTagAddrLower;                                           // 0x00000534 - 0x00000537
+    NvV32 SetEncryptIvAddrUpper;                                                // 0x00000538 - 0x0000053B
+    NvV32 SetEncryptIvAddrLower;                                                // 0x0000053C - 0x0000053F
+    NvV32 Reserved09[0x10];
+    NvV32 SetCompressionParameters;                                             // 0x00000580 - 0x00000583
+    NvV32 SetDecompressOutLength;                                               // 0x00000584 - 0x00000587
+    NvV32 SetDecompressOutLengthAddrUpper;                                      // 0x00000588 - 0x0000058B
+    NvV32 SetDecompressOutLengthAddrLower;                                      // 0x0000058C - 0x0000058F
+    NvV32 SetDecompressChecksum;                                                // 0x00000590 - 0x00000593
+    NvV32 Reserved10[0x5A];
+    NvV32 SetMemoryScrubParameters;                                             // 0x000006FC - 0x000006FF
+    NvV32 SetRemapConstA;                                                       // 0x00000700 - 0x00000703
+    NvV32 SetRemapConstB;                                                       // 0x00000704 - 0x00000707
+    NvV32 SetRemapComponents;                                                   // 0x00000708 - 0x0000070B
+    NvV32 SetDstBlockSize;                                                      // 0x0000070C - 0x0000070F
+    NvV32 SetDstWidth;                                                          // 0x00000710 - 0x00000713
+    NvV32 SetDstHeight;                                                         // 0x00000714 - 0x00000717
+    NvV32 SetDstDepth;                                                          // 0x00000718 - 0x0000071B
+    NvV32 SetDstLayer;                                                          // 0x0000071C - 0x0000071F
+    NvV32 SetDstOrigin;                                                         // 0x00000720 - 0x00000723
+    NvV32 Reserved11[0x1];
+    NvV32 SetSrcBlockSize;                                                      // 0x00000728 - 0x0000072B
+    NvV32 SetSrcWidth;                                                          // 0x0000072C - 0x0000072F
+    NvV32 SetSrcHeight;                                                         // 0x00000730 - 0x00000733
+    NvV32 SetSrcDepth;                                                          // 0x00000734 - 0x00000737
+    NvV32 SetSrcLayer;                                                          // 0x00000738 - 0x0000073B
+    NvV32 SetSrcOrigin;                                                         // 0x0000073C - 0x0000073F
+    NvV32 Reserved12[0x1];
+    NvV32 SrcOriginX;                                                           // 0x00000744 - 0x00000747
+    NvV32 SrcOriginY;                                                           // 0x00000748 - 0x0000074B
+    NvV32 DstOriginX;                                                           // 0x0000074C - 0x0000074F
+    NvV32 DstOriginY;                                                           // 0x00000750 - 0x00000753
+    NvV32 Reserved13[0x270];
+    NvV32 PmTriggerEnd;                                                         // 0x00001114 - 0x00001117
+    NvV32 Reserved14[0x3BA];
+} blackwell_dma_copy_aControlPio;
+
+#define NVC9B5_NOP                                                              (0x00000100)
+#define NVC9B5_NOP_PARAMETER                                                    31:0
+#define NVC9B5_PM_TRIGGER                                                       (0x00000140)
+#define NVC9B5_PM_TRIGGER_V                                                     31:0
+#define NVC9B5_SET_MONITORED_FENCE_TYPE                                         (0x0000021C)
+#define NVC9B5_SET_MONITORED_FENCE_TYPE_TYPE                                    0:0
+#define NVC9B5_SET_MONITORED_FENCE_TYPE_TYPE_MONITORED_FENCE                    (0x00000000)
+#define NVC9B5_SET_MONITORED_FENCE_TYPE_TYPE_MONITORED_FENCE_EXT                (0x00000001)
+#define NVC9B5_SET_MONITORED_FENCE_SIGNAL_ADDR_BASE_UPPER                       (0x00000220)
+#define NVC9B5_SET_MONITORED_FENCE_SIGNAL_ADDR_BASE_UPPER_UPPER                 24:0
+#define NVC9B5_SET_MONITORED_FENCE_SIGNAL_ADDR_BASE_LOWER                       (0x00000224)
+#define NVC9B5_SET_MONITORED_FENCE_SIGNAL_ADDR_BASE_LOWER_LOWER                 31:0
+#define NVC9B5_SET_SEMAPHORE_A                                                  (0x00000240)
+#define NVC9B5_SET_SEMAPHORE_A_UPPER                                            24:0
+#define NVC9B5_SET_SEMAPHORE_B                                                  (0x00000244)
+#define NVC9B5_SET_SEMAPHORE_B_LOWER                                            31:0
+#define NVC9B5_SET_SEMAPHORE_PAYLOAD                                            (0x00000248)
+#define NVC9B5_SET_SEMAPHORE_PAYLOAD_PAYLOAD                                    31:0
+#define NVC9B5_SET_SEMAPHORE_PAYLOAD_UPPER                                      (0x0000024C)
+#define NVC9B5_SET_SEMAPHORE_PAYLOAD_UPPER_PAYLOAD                              31:0
+#define NVC9B5_SET_RENDER_ENABLE_A                                              (0x00000254)
+#define NVC9B5_SET_RENDER_ENABLE_A_UPPER                                        24:0
+#define NVC9B5_SET_RENDER_ENABLE_B                                              (0x00000258)
+#define NVC9B5_SET_RENDER_ENABLE_B_LOWER                                        31:0
+#define NVC9B5_SET_RENDER_ENABLE_C                                              (0x0000025C)
+#define NVC9B5_SET_RENDER_ENABLE_C_MODE                                         2:0
+#define NVC9B5_SET_RENDER_ENABLE_C_MODE_FALSE                                   (0x00000000)
+#define NVC9B5_SET_RENDER_ENABLE_C_MODE_TRUE                                    (0x00000001)
+#define NVC9B5_SET_RENDER_ENABLE_C_MODE_CONDITIONAL                             (0x00000002)
+#define NVC9B5_SET_RENDER_ENABLE_C_MODE_RENDER_IF_EQUAL                         (0x00000003)
+#define NVC9B5_SET_RENDER_ENABLE_C_MODE_RENDER_IF_NOT_EQUAL                     (0x00000004)
+#define NVC9B5_SET_SRC_PHYS_MODE                                                (0x00000260)
+#define NVC9B5_SET_SRC_PHYS_MODE_TARGET                                         1:0
+#define NVC9B5_SET_SRC_PHYS_MODE_TARGET_LOCAL_FB                                (0x00000000)
+#define NVC9B5_SET_SRC_PHYS_MODE_TARGET_COHERENT_SYSMEM                         (0x00000001)
+#define NVC9B5_SET_SRC_PHYS_MODE_TARGET_NONCOHERENT_SYSMEM                      (0x00000002)
+#define NVC9B5_SET_SRC_PHYS_MODE_TARGET_PEERMEM                                 (0x00000003)
+#define NVC9B5_SET_SRC_PHYS_MODE_BASIC_KIND                                     5:2
+#define NVC9B5_SET_SRC_PHYS_MODE_PEER_ID                                        8:6
+#define NVC9B5_SET_SRC_PHYS_MODE_FLA                                            9:9
+#define NVC9B5_SET_DST_PHYS_MODE                                                (0x00000264)
+#define NVC9B5_SET_DST_PHYS_MODE_TARGET                                         1:0
+#define NVC9B5_SET_DST_PHYS_MODE_TARGET_LOCAL_FB                                (0x00000000)
+#define NVC9B5_SET_DST_PHYS_MODE_TARGET_COHERENT_SYSMEM                         (0x00000001)
+#define NVC9B5_SET_DST_PHYS_MODE_TARGET_NONCOHERENT_SYSMEM                      (0x00000002)
+#define NVC9B5_SET_DST_PHYS_MODE_TARGET_PEERMEM                                 (0x00000003)
+#define NVC9B5_SET_DST_PHYS_MODE_BASIC_KIND                                     5:2
+#define NVC9B5_SET_DST_PHYS_MODE_PEER_ID                                        8:6
+#define NVC9B5_SET_DST_PHYS_MODE_FLA                                            9:9
+#define NVC9B5_LAUNCH_DMA                                                       (0x00000300)
+#define NVC9B5_LAUNCH_DMA_DATA_TRANSFER_TYPE                                    1:0
+#define NVC9B5_LAUNCH_DMA_DATA_TRANSFER_TYPE_NONE                               (0x00000000)
+#define NVC9B5_LAUNCH_DMA_DATA_TRANSFER_TYPE_PIPELINED                          (0x00000001)
+#define NVC9B5_LAUNCH_DMA_DATA_TRANSFER_TYPE_NON_PIPELINED                      (0x00000002)
+#define NVC9B5_LAUNCH_DMA_FLUSH_ENABLE                                          2:2
+#define NVC9B5_LAUNCH_DMA_FLUSH_ENABLE_FALSE                                    (0x00000000)
+#define NVC9B5_LAUNCH_DMA_FLUSH_ENABLE_TRUE                                     (0x00000001)
+#define NVC9B5_LAUNCH_DMA_FLUSH_TYPE                                            25:25
+#define NVC9B5_LAUNCH_DMA_FLUSH_TYPE_SYS                                        (0x00000000)
+#define NVC9B5_LAUNCH_DMA_FLUSH_TYPE_GL                                         (0x00000001)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_TYPE                                        4:3
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_TYPE_NONE                                   (0x00000000)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_TYPE_RELEASE_SEMAPHORE_NO_TIMESTAMP         (0x00000001)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_TYPE_RELEASE_SEMAPHORE_WITH_TIMESTAMP       (0x00000002)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_TYPE_RELEASE_ONE_WORD_SEMAPHORE             (0x00000001)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_TYPE_RELEASE_FOUR_WORD_SEMAPHORE            (0x00000002)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_TYPE_RELEASE_CONDITIONAL_INTR_SEMAPHORE     (0x00000003)
+#define NVC9B5_LAUNCH_DMA_INTERRUPT_TYPE                                        6:5
+#define NVC9B5_LAUNCH_DMA_INTERRUPT_TYPE_NONE                                   (0x00000000)
+#define NVC9B5_LAUNCH_DMA_INTERRUPT_TYPE_BLOCKING                               (0x00000001)
+#define NVC9B5_LAUNCH_DMA_INTERRUPT_TYPE_NON_BLOCKING                           (0x00000002)
+#define NVC9B5_LAUNCH_DMA_SRC_MEMORY_LAYOUT                                     7:7
+#define NVC9B5_LAUNCH_DMA_SRC_MEMORY_LAYOUT_BLOCKLINEAR                         (0x00000000)
+#define NVC9B5_LAUNCH_DMA_SRC_MEMORY_LAYOUT_PITCH                               (0x00000001)
+#define NVC9B5_LAUNCH_DMA_DST_MEMORY_LAYOUT                                     8:8
+#define NVC9B5_LAUNCH_DMA_DST_MEMORY_LAYOUT_BLOCKLINEAR                         (0x00000000)
+#define NVC9B5_LAUNCH_DMA_DST_MEMORY_LAYOUT_PITCH                               (0x00000001)
+#define NVC9B5_LAUNCH_DMA_MULTI_LINE_ENABLE                                     9:9
+#define NVC9B5_LAUNCH_DMA_MULTI_LINE_ENABLE_FALSE                               (0x00000000)
+#define NVC9B5_LAUNCH_DMA_MULTI_LINE_ENABLE_TRUE                                (0x00000001)
+#define NVC9B5_LAUNCH_DMA_REMAP_ENABLE                                          10:10
+#define NVC9B5_LAUNCH_DMA_REMAP_ENABLE_FALSE                                    (0x00000000)
+#define NVC9B5_LAUNCH_DMA_REMAP_ENABLE_TRUE                                     (0x00000001)
+#define NVC9B5_LAUNCH_DMA_COMPRESSION_ENABLE                                    11:11
+#define NVC9B5_LAUNCH_DMA_COMPRESSION_ENABLE_FALSE                              (0x00000000)
+#define NVC9B5_LAUNCH_DMA_COMPRESSION_ENABLE_TRUE                               (0x00000001)
+#define NVC9B5_LAUNCH_DMA_SRC_TYPE                                              12:12
+#define NVC9B5_LAUNCH_DMA_SRC_TYPE_VIRTUAL                                      (0x00000000)
+#define NVC9B5_LAUNCH_DMA_SRC_TYPE_PHYSICAL                                     (0x00000001)
+#define NVC9B5_LAUNCH_DMA_DST_TYPE                                              13:13
+#define NVC9B5_LAUNCH_DMA_DST_TYPE_VIRTUAL                                      (0x00000000)
+#define NVC9B5_LAUNCH_DMA_DST_TYPE_PHYSICAL                                     (0x00000001)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION                                   17:14
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_IMIN                              (0x00000000)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_IMAX                              (0x00000001)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_IXOR                              (0x00000002)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_IAND                              (0x00000003)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_IOR                               (0x00000004)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_IADD                              (0x00000005)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_INC                               (0x00000006)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_DEC                               (0x00000007)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_INVALIDA                          (0x00000008)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_INVALIDB                          (0x00000009)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_FADD                              (0x0000000A)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_FMIN                              (0x0000000B)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_FMAX                              (0x0000000C)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_INVALIDC                          (0x0000000D)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_INVALIDD                          (0x0000000E)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_INVALIDE                          (0x0000000F)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_SIGN                              18:18
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_SIGN_SIGNED                       (0x00000000)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_SIGN_UNSIGNED                     (0x00000001)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_ENABLE                            19:19
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_ENABLE_FALSE                      (0x00000000)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_REDUCTION_ENABLE_TRUE                       (0x00000001)
+#define NVC9B5_LAUNCH_DMA_COPY_TYPE                                             21:20
+#define NVC9B5_LAUNCH_DMA_COPY_TYPE_PROT2PROT                                   (0x00000000)
+#define NVC9B5_LAUNCH_DMA_COPY_TYPE_DEFAULT                                     (0x00000000)
+#define NVC9B5_LAUNCH_DMA_COPY_TYPE_SECURE                                      (0x00000001)
+#define NVC9B5_LAUNCH_DMA_COPY_TYPE_NONPROT2NONPROT                             (0x00000002)
+#define NVC9B5_LAUNCH_DMA_COPY_TYPE_RESERVED                                    (0x00000003)
+#define NVC9B5_LAUNCH_DMA_VPRMODE                                               22:22
+#define NVC9B5_LAUNCH_DMA_VPRMODE_VPR_NONE                                      (0x00000000)
+#define NVC9B5_LAUNCH_DMA_VPRMODE_VPR_VID2VID                                   (0x00000001)
+#define NVC9B5_LAUNCH_DMA_MEMORY_SCRUB_ENABLE                                   23:23
+#define NVC9B5_LAUNCH_DMA_MEMORY_SCRUB_ENABLE_FALSE                             (0x00000000)
+#define NVC9B5_LAUNCH_DMA_MEMORY_SCRUB_ENABLE_TRUE                              (0x00000001)
+#define NVC9B5_LAUNCH_DMA_RESERVED_START_OF_COPY                                24:24
+#define NVC9B5_LAUNCH_DMA_DISABLE_PLC                                           26:26
+#define NVC9B5_LAUNCH_DMA_DISABLE_PLC_FALSE                                     (0x00000000)
+#define NVC9B5_LAUNCH_DMA_DISABLE_PLC_TRUE                                      (0x00000001)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_PAYLOAD_SIZE                                27:27
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_PAYLOAD_SIZE_ONE_WORD                       (0x00000000)
+#define NVC9B5_LAUNCH_DMA_SEMAPHORE_PAYLOAD_SIZE_TWO_WORD                       (0x00000001)
+#define NVC9B5_LAUNCH_DMA_RESERVED_ERR_CODE                                     31:28
+#define NVC9B5_OFFSET_IN_UPPER                                                  (0x00000400)
+#define NVC9B5_OFFSET_IN_UPPER_UPPER                                            24:0
+#define NVC9B5_OFFSET_IN_LOWER                                                  (0x00000404)
+#define NVC9B5_OFFSET_IN_LOWER_VALUE                                            31:0
+#define NVC9B5_OFFSET_OUT_UPPER                                                 (0x00000408)
+#define NVC9B5_OFFSET_OUT_UPPER_UPPER                                           24:0
+#define NVC9B5_OFFSET_OUT_LOWER                                                 (0x0000040C)
+#define NVC9B5_OFFSET_OUT_LOWER_VALUE                                           31:0
+#define NVC9B5_PITCH_IN                                                         (0x00000410)
+#define NVC9B5_PITCH_IN_VALUE                                                   31:0
+#define NVC9B5_PITCH_OUT                                                        (0x00000414)
+#define NVC9B5_PITCH_OUT_VALUE                                                  31:0
+#define NVC9B5_LINE_LENGTH_IN                                                   (0x00000418)
+#define NVC9B5_LINE_LENGTH_IN_VALUE                                             31:0
+#define NVC9B5_LINE_COUNT                                                       (0x0000041C)
+#define NVC9B5_LINE_COUNT_VALUE                                                 31:0
+#define NVC9B5_SET_SECURE_COPY_MODE                                             (0x00000500)
+#define NVC9B5_SET_SECURE_COPY_MODE_MODE                                        0:0
+#define NVC9B5_SET_SECURE_COPY_MODE_MODE_ENCRYPT                                (0x00000000)
+#define NVC9B5_SET_SECURE_COPY_MODE_MODE_DECRYPT                                (0x00000001)
+#define NVC9B5_SET_SECURE_COPY_MODE_RESERVED_SRC_TARGET                         20:19
+#define NVC9B5_SET_SECURE_COPY_MODE_RESERVED_SRC_TARGET_LOCAL_FB                (0x00000000)
+#define NVC9B5_SET_SECURE_COPY_MODE_RESERVED_SRC_TARGET_COHERENT_SYSMEM         (0x00000001)
+#define NVC9B5_SET_SECURE_COPY_MODE_RESERVED_SRC_TARGET_NONCOHERENT_SYSMEM      (0x00000002)
+#define NVC9B5_SET_SECURE_COPY_MODE_RESERVED_SRC_TARGET_PEERMEM                 (0x00000003)
+#define NVC9B5_SET_SECURE_COPY_MODE_RESERVED_SRC_PEER_ID                        23:21
+#define NVC9B5_SET_SECURE_COPY_MODE_RESERVED_SRC_FLA                            24:24
+#define NVC9B5_SET_SECURE_COPY_MODE_RESERVED_DST_TARGET                         26:25
+#define NVC9B5_SET_SECURE_COPY_MODE_RESERVED_DST_TARGET_LOCAL_FB                (0x00000000)
+#define NVC9B5_SET_SECURE_COPY_MODE_RESERVED_DST_TARGET_COHERENT_SYSMEM         (0x00000001)
+#define NVC9B5_SET_SECURE_COPY_MODE_RESERVED_DST_TARGET_NONCOHERENT_SYSMEM      (0x00000002)
+#define NVC9B5_SET_SECURE_COPY_MODE_RESERVED_DST_TARGET_PEERMEM                 (0x00000003)
+#define NVC9B5_SET_SECURE_COPY_MODE_RESERVED_DST_PEER_ID                        29:27
+#define NVC9B5_SET_SECURE_COPY_MODE_RESERVED_DST_FLA                            30:30
+#define NVC9B5_SET_SECURE_COPY_MODE_RESERVED_END_OF_COPY                        31:31
+#define NVC9B5_SET_DECRYPT_IV0                                                  (0x00000504)
+#define NVC9B5_SET_DECRYPT_IV0_VALUE                                            31:0
+#define NVC9B5_SET_DECRYPT_IV1                                                  (0x00000508)
+#define NVC9B5_SET_DECRYPT_IV1_VALUE                                            31:0
+#define NVC9B5_SET_DECRYPT_IV2                                                  (0x0000050C)
+#define NVC9B5_SET_DECRYPT_IV2_VALUE                                            31:0
+#define NVC9B5_RESERVED_SET_AESCOUNTER                                          (0x00000510)
+#define NVC9B5_RESERVED_SET_AESCOUNTER_VALUE                                    31:0
+#define NVC9B5_SET_DECRYPT_AUTH_TAG_COMPARE_ADDR_UPPER                          (0x00000514)
+#define NVC9B5_SET_DECRYPT_AUTH_TAG_COMPARE_ADDR_UPPER_UPPER                    24:0
+#define NVC9B5_SET_DECRYPT_AUTH_TAG_COMPARE_ADDR_LOWER                          (0x00000518)
+#define NVC9B5_SET_DECRYPT_AUTH_TAG_COMPARE_ADDR_LOWER_LOWER                    31:0
+#define NVC9B5_SET_ENCRYPT_AUTH_TAG_ADDR_UPPER                                  (0x00000530)
+#define NVC9B5_SET_ENCRYPT_AUTH_TAG_ADDR_UPPER_UPPER                            24:0
+#define NVC9B5_SET_ENCRYPT_AUTH_TAG_ADDR_LOWER                                  (0x00000534)
+#define NVC9B5_SET_ENCRYPT_AUTH_TAG_ADDR_LOWER_LOWER                            31:0
+#define NVC9B5_SET_ENCRYPT_IV_ADDR_UPPER                                        (0x00000538)
+#define NVC9B5_SET_ENCRYPT_IV_ADDR_UPPER_UPPER                                  24:0
+#define NVC9B5_SET_ENCRYPT_IV_ADDR_LOWER                                        (0x0000053C)
+#define NVC9B5_SET_ENCRYPT_IV_ADDR_LOWER_LOWER                                  31:0
+#define NVC9B5_SET_COMPRESSION_PARAMETERS                                       (0x00000580)
+#define NVC9B5_SET_COMPRESSION_PARAMETERS_OPERATION                             0:0
+#define NVC9B5_SET_COMPRESSION_PARAMETERS_OPERATION_DECOMPRESS                  (0x00000000)
+#define NVC9B5_SET_COMPRESSION_PARAMETERS_OPERATION_COMPRESS                    (0x00000001)
+#define NVC9B5_SET_COMPRESSION_PARAMETERS_ALGO                                  3:1
+#define NVC9B5_SET_COMPRESSION_PARAMETERS_ALGO_SNAPPY                           (0x00000000)
+#define NVC9B5_SET_COMPRESSION_PARAMETERS_ALGO_LZ4_DATA_ONLY                    (0x00000001)
+#define NVC9B5_SET_COMPRESSION_PARAMETERS_ALGO_LZ4_BLOCK                        (0x00000002)
+#define NVC9B5_SET_COMPRESSION_PARAMETERS_ALGO_LZ4_BLOCK_CHECKSUM               (0x00000003)
+#define NVC9B5_SET_COMPRESSION_PARAMETERS_ALGO_DEFLATE                          (0x00000004)
+#define NVC9B5_SET_COMPRESSION_PARAMETERS_ALGO_SNAPPY_WITH_LONG_FETCH           (0x00000005)
+#define NVC9B5_SET_COMPRESSION_PARAMETERS_CHECK_SUM                             29:28
+#define NVC9B5_SET_COMPRESSION_PARAMETERS_CHECK_SUM_NONE                        (0x00000000)
+#define NVC9B5_SET_COMPRESSION_PARAMETERS_CHECK_SUM_ADLER32                     (0x00000001)
+#define NVC9B5_SET_COMPRESSION_PARAMETERS_CHECK_SUM_CRC32                       (0x00000002)
+#define NVC9B5_SET_COMPRESSION_PARAMETERS_CHECK_SUM_SNAPPY_CRC                  (0x00000003)
+#define NVC9B5_SET_DECOMPRESS_OUT_LENGTH                                        (0x00000584)
+#define NVC9B5_SET_DECOMPRESS_OUT_LENGTH_V                                      31:0
+#define NVC9B5_SET_DECOMPRESS_OUT_LENGTH_ADDR_UPPER                             (0x00000588)
+#define NVC9B5_SET_DECOMPRESS_OUT_LENGTH_ADDR_UPPER_UPPER                       24:0
+#define NVC9B5_SET_DECOMPRESS_OUT_LENGTH_ADDR_LOWER                             (0x0000058C)
+#define NVC9B5_SET_DECOMPRESS_OUT_LENGTH_ADDR_LOWER_LOWER                       31:0
+#define NVC9B5_SET_DECOMPRESS_CHECKSUM                                          (0x00000590)
+#define NVC9B5_SET_DECOMPRESS_CHECKSUM_V                                        31:0
+#define NVC9B5_SET_MEMORY_SCRUB_PARAMETERS                                      (0x000006FC)
+#define NVC9B5_SET_MEMORY_SCRUB_PARAMETERS_DISCARDABLE                          0:0
+#define NVC9B5_SET_MEMORY_SCRUB_PARAMETERS_DISCARDABLE_FALSE                    (0x00000000)
+#define NVC9B5_SET_MEMORY_SCRUB_PARAMETERS_DISCARDABLE_TRUE                     (0x00000001)
+#define NVC9B5_SET_REMAP_CONST_A                                                (0x00000700)
+#define NVC9B5_SET_REMAP_CONST_A_V                                              31:0
+#define NVC9B5_SET_REMAP_CONST_B                                                (0x00000704)
+#define NVC9B5_SET_REMAP_CONST_B_V                                              31:0
+#define NVC9B5_SET_REMAP_COMPONENTS                                             (0x00000708)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_X                                       2:0
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_X_SRC_X                                 (0x00000000)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_X_SRC_Y                                 (0x00000001)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_X_SRC_Z                                 (0x00000002)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_X_SRC_W                                 (0x00000003)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_X_CONST_A                               (0x00000004)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_X_CONST_B                               (0x00000005)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_X_NO_WRITE                              (0x00000006)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_Y                                       6:4
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_Y_SRC_X                                 (0x00000000)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_Y_SRC_Y                                 (0x00000001)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_Y_SRC_Z                                 (0x00000002)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_Y_SRC_W                                 (0x00000003)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_Y_CONST_A                               (0x00000004)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_Y_CONST_B                               (0x00000005)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_Y_NO_WRITE                              (0x00000006)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_Z                                       10:8
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_Z_SRC_X                                 (0x00000000)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_Z_SRC_Y                                 (0x00000001)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_Z_SRC_Z                                 (0x00000002)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_Z_SRC_W                                 (0x00000003)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_Z_CONST_A                               (0x00000004)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_Z_CONST_B                               (0x00000005)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_Z_NO_WRITE                              (0x00000006)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_W                                       14:12
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_W_SRC_X                                 (0x00000000)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_W_SRC_Y                                 (0x00000001)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_W_SRC_Z                                 (0x00000002)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_W_SRC_W                                 (0x00000003)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_W_CONST_A                               (0x00000004)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_W_CONST_B                               (0x00000005)
+#define NVC9B5_SET_REMAP_COMPONENTS_DST_W_NO_WRITE                              (0x00000006)
+#define NVC9B5_SET_REMAP_COMPONENTS_COMPONENT_SIZE                              17:16
+#define NVC9B5_SET_REMAP_COMPONENTS_COMPONENT_SIZE_ONE                          (0x00000000)
+#define NVC9B5_SET_REMAP_COMPONENTS_COMPONENT_SIZE_TWO                          (0x00000001)
+#define NVC9B5_SET_REMAP_COMPONENTS_COMPONENT_SIZE_THREE                        (0x00000002)
+#define NVC9B5_SET_REMAP_COMPONENTS_COMPONENT_SIZE_FOUR                         (0x00000003)
+#define NVC9B5_SET_REMAP_COMPONENTS_NUM_SRC_COMPONENTS                          21:20
+#define NVC9B5_SET_REMAP_COMPONENTS_NUM_SRC_COMPONENTS_ONE                      (0x00000000)
+#define NVC9B5_SET_REMAP_COMPONENTS_NUM_SRC_COMPONENTS_TWO                      (0x00000001)
+#define NVC9B5_SET_REMAP_COMPONENTS_NUM_SRC_COMPONENTS_THREE                    (0x00000002)
+#define NVC9B5_SET_REMAP_COMPONENTS_NUM_SRC_COMPONENTS_FOUR                     (0x00000003)
+#define NVC9B5_SET_REMAP_COMPONENTS_NUM_DST_COMPONENTS                          25:24
+#define NVC9B5_SET_REMAP_COMPONENTS_NUM_DST_COMPONENTS_ONE                      (0x00000000)
+#define NVC9B5_SET_REMAP_COMPONENTS_NUM_DST_COMPONENTS_TWO                      (0x00000001)
+#define NVC9B5_SET_REMAP_COMPONENTS_NUM_DST_COMPONENTS_THREE                    (0x00000002)
+#define NVC9B5_SET_REMAP_COMPONENTS_NUM_DST_COMPONENTS_FOUR                     (0x00000003)
+#define NVC9B5_SET_DST_BLOCK_SIZE                                               (0x0000070C)
+#define NVC9B5_SET_DST_BLOCK_SIZE_WIDTH                                         3:0
+#define NVC9B5_SET_DST_BLOCK_SIZE_WIDTH_ONE_GOB                                 (0x00000000)
+#define NVC9B5_SET_DST_BLOCK_SIZE_HEIGHT                                        7:4
+#define NVC9B5_SET_DST_BLOCK_SIZE_HEIGHT_ONE_GOB                                (0x00000000)
+#define NVC9B5_SET_DST_BLOCK_SIZE_HEIGHT_TWO_GOBS                               (0x00000001)
+#define NVC9B5_SET_DST_BLOCK_SIZE_HEIGHT_FOUR_GOBS                              (0x00000002)
+#define NVC9B5_SET_DST_BLOCK_SIZE_HEIGHT_EIGHT_GOBS                             (0x00000003)
+#define NVC9B5_SET_DST_BLOCK_SIZE_HEIGHT_SIXTEEN_GOBS                           (0x00000004)
+#define NVC9B5_SET_DST_BLOCK_SIZE_HEIGHT_THIRTYTWO_GOBS                         (0x00000005)
+#define NVC9B5_SET_DST_BLOCK_SIZE_DEPTH                                         11:8
+#define NVC9B5_SET_DST_BLOCK_SIZE_DEPTH_ONE_GOB                                 (0x00000000)
+#define NVC9B5_SET_DST_BLOCK_SIZE_DEPTH_TWO_GOBS                                (0x00000001)
+#define NVC9B5_SET_DST_BLOCK_SIZE_DEPTH_FOUR_GOBS                               (0x00000002)
+#define NVC9B5_SET_DST_BLOCK_SIZE_DEPTH_EIGHT_GOBS                              (0x00000003)
+#define NVC9B5_SET_DST_BLOCK_SIZE_DEPTH_SIXTEEN_GOBS                            (0x00000004)
+#define NVC9B5_SET_DST_BLOCK_SIZE_DEPTH_THIRTYTWO_GOBS                          (0x00000005)
+#define NVC9B5_SET_DST_BLOCK_SIZE_GOB_HEIGHT                                    15:12
+#define NVC9B5_SET_DST_BLOCK_SIZE_GOB_HEIGHT_GOB_HEIGHT_FERMI_8                 (0x00000001)
+#define NVC9B5_SET_DST_WIDTH                                                    (0x00000710)
+#define NVC9B5_SET_DST_WIDTH_V                                                  31:0
+#define NVC9B5_SET_DST_HEIGHT                                                   (0x00000714)
+#define NVC9B5_SET_DST_HEIGHT_V                                                 31:0
+#define NVC9B5_SET_DST_DEPTH                                                    (0x00000718)
+#define NVC9B5_SET_DST_DEPTH_V                                                  31:0
+#define NVC9B5_SET_DST_LAYER                                                    (0x0000071C)
+#define NVC9B5_SET_DST_LAYER_V                                                  31:0
+#define NVC9B5_SET_DST_ORIGIN                                                   (0x00000720)
+#define NVC9B5_SET_DST_ORIGIN_X                                                 15:0
+#define NVC9B5_SET_DST_ORIGIN_Y                                                 31:16
+#define NVC9B5_SET_SRC_BLOCK_SIZE                                               (0x00000728)
+#define NVC9B5_SET_SRC_BLOCK_SIZE_WIDTH                                         3:0
+#define NVC9B5_SET_SRC_BLOCK_SIZE_WIDTH_ONE_GOB                                 (0x00000000)
+#define NVC9B5_SET_SRC_BLOCK_SIZE_HEIGHT                                        7:4
+#define NVC9B5_SET_SRC_BLOCK_SIZE_HEIGHT_ONE_GOB                                (0x00000000)
+#define NVC9B5_SET_SRC_BLOCK_SIZE_HEIGHT_TWO_GOBS                               (0x00000001)
+#define NVC9B5_SET_SRC_BLOCK_SIZE_HEIGHT_FOUR_GOBS                              (0x00000002)
+#define NVC9B5_SET_SRC_BLOCK_SIZE_HEIGHT_EIGHT_GOBS                             (0x00000003)
+#define NVC9B5_SET_SRC_BLOCK_SIZE_HEIGHT_SIXTEEN_GOBS                           (0x00000004)
+#define NVC9B5_SET_SRC_BLOCK_SIZE_HEIGHT_THIRTYTWO_GOBS                         (0x00000005)
+#define NVC9B5_SET_SRC_BLOCK_SIZE_DEPTH                                         11:8
+#define NVC9B5_SET_SRC_BLOCK_SIZE_DEPTH_ONE_GOB                                 (0x00000000)
+#define NVC9B5_SET_SRC_BLOCK_SIZE_DEPTH_TWO_GOBS                                (0x00000001)
+#define NVC9B5_SET_SRC_BLOCK_SIZE_DEPTH_FOUR_GOBS                               (0x00000002)
+#define NVC9B5_SET_SRC_BLOCK_SIZE_DEPTH_EIGHT_GOBS                              (0x00000003)
+#define NVC9B5_SET_SRC_BLOCK_SIZE_DEPTH_SIXTEEN_GOBS                            (0x00000004)
+#define NVC9B5_SET_SRC_BLOCK_SIZE_DEPTH_THIRTYTWO_GOBS                          (0x00000005)
+#define NVC9B5_SET_SRC_BLOCK_SIZE_GOB_HEIGHT                                    15:12
+#define NVC9B5_SET_SRC_BLOCK_SIZE_GOB_HEIGHT_GOB_HEIGHT_FERMI_8                 (0x00000001)
+#define NVC9B5_SET_SRC_WIDTH                                                    (0x0000072C)
+#define NVC9B5_SET_SRC_WIDTH_V                                                  31:0
+#define NVC9B5_SET_SRC_HEIGHT                                                   (0x00000730)
+#define NVC9B5_SET_SRC_HEIGHT_V                                                 31:0
+#define NVC9B5_SET_SRC_DEPTH                                                    (0x00000734)
+#define NVC9B5_SET_SRC_DEPTH_V                                                  31:0
+#define NVC9B5_SET_SRC_LAYER                                                    (0x00000738)
+#define NVC9B5_SET_SRC_LAYER_V                                                  31:0
+#define NVC9B5_SET_SRC_ORIGIN                                                   (0x0000073C)
+#define NVC9B5_SET_SRC_ORIGIN_X                                                 15:0
+#define NVC9B5_SET_SRC_ORIGIN_Y                                                 31:16
+#define NVC9B5_SRC_ORIGIN_X                                                     (0x00000744)
+#define NVC9B5_SRC_ORIGIN_X_VALUE                                               31:0
+#define NVC9B5_SRC_ORIGIN_Y                                                     (0x00000748)
+#define NVC9B5_SRC_ORIGIN_Y_VALUE                                               31:0
+#define NVC9B5_DST_ORIGIN_X                                                     (0x0000074C)
+#define NVC9B5_DST_ORIGIN_X_VALUE                                               31:0
+#define NVC9B5_DST_ORIGIN_Y                                                     (0x00000750)
+#define NVC9B5_DST_ORIGIN_Y_VALUE                                               31:0
+#define NVC9B5_PM_TRIGGER_END                                                   (0x00001114)
+#define NVC9B5_PM_TRIGGER_END_V                                                 31:0
+
+#ifdef __cplusplus
+};     /* extern "C" */
+#endif
+#endif // _clc9b5_h
+
--- a/kernel-open/nvidia-uvm/ctrl2080mc.h
+++ b/kernel-open/nvidia-uvm/ctrl2080mc.h
@@ -1,5 +1,5 @@
 /*******************************************************************************
-    Copyright (c) 2013-2022 NVIDIA Corporation
+    Copyright (c) 2013-2023 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -34,6 +34,7 @@
 #define NV2080_CTRL_MC_ARCH_INFO_ARCHITECTURE_GA100                (0x00000170)
 #define NV2080_CTRL_MC_ARCH_INFO_ARCHITECTURE_GH100                (0x00000180)
 #define NV2080_CTRL_MC_ARCH_INFO_ARCHITECTURE_AD100                (0x00000190)
+#define NV2080_CTRL_MC_ARCH_INFO_ARCHITECTURE_GB100                (0x000001A0)

 /* valid ARCHITECTURE_GP10x implementation values */
 #define NV2080_CTRL_MC_ARCH_INFO_IMPLEMENTATION_GP100              (0x00000000)
--- a/kernel-open/nvidia-uvm/hwref/blackwell/gb100/dev_fault.h
+++ b/kernel-open/nvidia-uvm/hwref/blackwell/gb100/dev_fault.h
@@ -0,0 +1,546 @@
+/*******************************************************************************
+    Copyright (c) 2003-2016 NVIDIA Corporation
+
+    Permission is hereby granted, free of charge, to any person obtaining a copy
+    of this software and associated documentation files (the "Software"), to
+    deal in the Software without restriction, including without limitation the
+    rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+    sell copies of the Software, and to permit persons to whom the Software is
+    furnished to do so, subject to the following conditions:
+
+    The above copyright notice and this permission notice shall be
+    included in all copies or substantial portions of the Software.
+
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+    THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+    FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+    DEALINGS IN THE SOFTWARE.
+
+*******************************************************************************/
+
+
+#ifndef __gb100_dev_fault_h__
+#define __gb100_dev_fault_h__
+/* This file is autogenerated.  Do not edit */
+#define NV_PFAULT                                              /* ----G */
+#define NV_PFAULT_MMU_ENG_ID_GRAPHICS          384 /*       */
+#define NV_PFAULT_MMU_ENG_ID_DISPLAY           1 /*       */
+#define NV_PFAULT_MMU_ENG_ID_GSP               2 /*       */
+#define NV_PFAULT_MMU_ENG_ID_IFB               55 /*       */
+#define NV_PFAULT_MMU_ENG_ID_FLA               4 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1              256 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2              320 /*       */
+#define NV_PFAULT_MMU_ENG_ID_SEC               6 /*       */
+#define NV_PFAULT_MMU_ENG_ID_FSP               7 /*       */
+#define NV_PFAULT_MMU_ENG_ID_PERF              10 /*       */
+#define NV_PFAULT_MMU_ENG_ID_PERF0             10 /*       */
+#define NV_PFAULT_MMU_ENG_ID_PERF1             11 /*       */
+#define NV_PFAULT_MMU_ENG_ID_PERF2             12 /*       */
+#define NV_PFAULT_MMU_ENG_ID_PERF3             13 /*       */
+#define NV_PFAULT_MMU_ENG_ID_PERF4             14 /*       */
+#define NV_PFAULT_MMU_ENG_ID_PERF5             15 /*       */
+#define NV_PFAULT_MMU_ENG_ID_PERF6             16 /*       */
+#define NV_PFAULT_MMU_ENG_ID_PERF7             17 /*       */
+#define NV_PFAULT_MMU_ENG_ID_PERF8             18 /*       */
+#define NV_PFAULT_MMU_ENG_ID_PERF9             19 /*       */
+#define NV_PFAULT_MMU_ENG_ID_GSPLITE          20 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVDEC             28 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVDEC0            28 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVDEC1            29 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVDEC2            30 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVDEC3            31 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVDEC4            32 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVDEC5            33 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVDEC6            34 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVDEC7            35 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVJPG0            36 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVJPG1            37 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVJPG2            38 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVJPG3            39 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVJPG4            40 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVJPG5            41 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVJPG6            42 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVJPG7            43 /*       */
+#define NV_PFAULT_MMU_ENG_ID_GRCOPY            65 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE0               65 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE1               66 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE2               67 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE3               68 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE4               69 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE5               70 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE6               71 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE7               72 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE8               73 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE9               74 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE10               75 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE11               76 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE12               77 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE13               78 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE14               79 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE15               80 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE16               81 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE17               82 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE18               83 /*       */
+#define NV_PFAULT_MMU_ENG_ID_CE19               84 /*       */
+#define NV_PFAULT_MMU_ENG_ID_PWR_PMU           5 /*       */
+#define NV_PFAULT_MMU_ENG_ID_PTP               3 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVENC0            44 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVENC1            45 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVENC2            46 /*       */
+#define NV_PFAULT_MMU_ENG_ID_NVENC3            47 /*       */
+#define NV_PFAULT_MMU_ENG_ID_OFA0              48 /*       */
+#define NV_PFAULT_MMU_ENG_ID_PHYSICAL          56 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST0             85 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST1             86 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST2             87 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST3             88 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST4             89 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST5             90 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST6             91 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST7             92 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST8             93 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST9             94 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST10            95 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST11            96 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST12            97 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST13            98 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST14            99 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST15            100 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST16            101 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST17            102 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST18            103 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST19            104 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST20            105 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST21            106 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST22            107 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST23            108 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST24            109 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST25            110 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST26            111 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST27            112 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST28            113 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST29            114 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST30            115 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST31            116 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST32            117 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST33            118 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST34            119 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST35            120 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST36            121 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST37            122 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST38            123 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST39            124 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST40            125 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST41            126 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST42            127 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST43            128 /*       */
+#define NV_PFAULT_MMU_ENG_ID_HOST44            129 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN0          256  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN1          257  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN2          258  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN3          259  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN4          260  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN5          261  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN6          262  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN7          263  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN8          264  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN9          265  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN10         266 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN11         267 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN12         268 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN13         269 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN14         270 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN15         271 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN16         272 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN17         273 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN18         274 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN19         275 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN20         276 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN21         277 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN22         278 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN23         279 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN24         280 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN25         281 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN26         282 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN27         283 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN28         284 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN29         285 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN30         286 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN31         287 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN32         288 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN33         289 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN34         290 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN35         291 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN36         292 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN37         293 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN38         294 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN39         295 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN40         296 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN41         297 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN42         298 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN43         299 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN44         300 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN45         301 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN46         302 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN47         303 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN48         304 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN49         305 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN50         306 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN51         307 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN52         308 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN53         309 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN54         310 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN55         311 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN56         312 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN57         313 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN58         314 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN59         315 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN60         316 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN61         317 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN62         318 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR1_FN63         319 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN0          320  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN1          321  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN2          322  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN3          323  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN4          324  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN5          325  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN6          326  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN7          327  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN8          328  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN9          329  /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN10         330 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN11         331 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN12         332 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN13         333 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN14         334 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN15         335 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN16         336 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN17         337 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN18         338 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN19         339 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN20         340 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN21         341 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN22         342 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN23         343 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN24         344 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN25         345 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN26         346 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN27         347 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN28         348 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN29         349 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN30         350 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN31         351 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN32         352 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN33         353 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN34         354 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN35         355 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN36         356 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN37         357 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN38         358 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN39         359 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN40         360 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN41         361 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN42         362 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN43         363 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN44         364 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN45         365 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN46         366 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN47         367 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN48         368 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN49         369 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN50         370 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN51         371 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN52         372 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN53         373 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN54         374 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN55         375 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN56         376 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN57         377 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN58         378 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN59         379 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN60         380 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN61         381 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN62         382 /*       */
+#define NV_PFAULT_MMU_ENG_ID_BAR2_FN63         383 /*       */
+#define NV_PFAULT_FAULT_TYPE                             4:0 /*       */
+#define NV_PFAULT_FAULT_TYPE_PDE                  0x00000000 /*       */
+#define NV_PFAULT_FAULT_TYPE_PDE_SIZE             0x00000001 /*       */
+#define NV_PFAULT_FAULT_TYPE_PTE                  0x00000002 /*       */
+#define NV_PFAULT_FAULT_TYPE_VA_LIMIT_VIOLATION   0x00000003 /*       */
+#define NV_PFAULT_FAULT_TYPE_UNBOUND_INST_BLOCK   0x00000004 /*       */
+#define NV_PFAULT_FAULT_TYPE_PRIV_VIOLATION       0x00000005 /*       */
+#define NV_PFAULT_FAULT_TYPE_RO_VIOLATION         0x00000006 /*       */
+#define NV_PFAULT_FAULT_TYPE_WO_VIOLATION         0x00000007 /*       */
+#define NV_PFAULT_FAULT_TYPE_PITCH_MASK_VIOLATION 0x00000008 /*       */
+#define NV_PFAULT_FAULT_TYPE_WORK_CREATION        0x00000009 /*       */
+#define NV_PFAULT_FAULT_TYPE_UNSUPPORTED_APERTURE 0x0000000a /*       */
+#define NV_PFAULT_FAULT_TYPE_CC_VIOLATION         0x0000000b /*       */
+#define NV_PFAULT_FAULT_TYPE_UNSUPPORTED_KIND     0x0000000c /*       */
+#define NV_PFAULT_FAULT_TYPE_REGION_VIOLATION     0x0000000d /*       */
+#define NV_PFAULT_FAULT_TYPE_POISONED             0x0000000e /*       */
+#define NV_PFAULT_FAULT_TYPE_ATOMIC_VIOLATION     0x0000000f /*       */
+#define NV_PFAULT_CLIENT                       14:8 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_0        0x00000000 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_1        0x00000001 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_2        0x00000002 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_3        0x00000003 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_4        0x00000004 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_5        0x00000005 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_6        0x00000006 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_7        0x00000007 /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_0        0x00000008 /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_1        0x00000009 /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_2        0x0000000A /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_3        0x0000000B /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_4        0x0000000C /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_5        0x0000000D /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_6        0x0000000E /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_7        0x0000000F /*       */
+#define NV_PFAULT_CLIENT_GPC_RAST        0x00000010 /*       */
+#define NV_PFAULT_CLIENT_GPC_GCC         0x00000011 /*       */
+#define NV_PFAULT_CLIENT_GPC_GPCCS       0x00000012 /*       */
+#define NV_PFAULT_CLIENT_GPC_PROP_0      0x00000013 /*       */
+#define NV_PFAULT_CLIENT_GPC_PROP_1      0x00000014 /*       */
+#define NV_PFAULT_CLIENT_GPC_PROP_2      0x00000015 /*       */
+#define NV_PFAULT_CLIENT_GPC_PROP_3      0x00000016 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_8        0x00000021 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_9        0x00000022 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_10       0x00000023 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_11       0x00000024 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_12       0x00000025 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_13       0x00000026 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_14       0x00000027 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_15       0x00000028 /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_0     0x00000029 /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_1     0x0000002A /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_2     0x0000002B /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_3     0x0000002C /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_4     0x0000002D /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_5     0x0000002E /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_6     0x0000002F /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_7     0x00000030 /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_8        0x00000031 /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_9        0x00000032 /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_8     0x00000033 /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_9     0x00000034 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_16       0x00000035 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_17       0x00000036 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_18       0x00000037 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_19       0x00000038 /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_10       0x00000039 /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_11       0x0000003A /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_10    0x0000003B /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_11    0x0000003C /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_20       0x0000003D /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_21       0x0000003E /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_22       0x0000003F /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_23       0x00000040 /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_12       0x00000041 /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_13       0x00000042 /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_12    0x00000043 /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_13    0x00000044 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_24       0x00000045 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_25       0x00000046 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_26       0x00000047 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_27       0x00000048 /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_14       0x00000049 /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_15       0x0000004A /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_14    0x0000004B /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_15    0x0000004C /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_28       0x0000004D /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_29       0x0000004E /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_30       0x0000004F /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_31       0x00000050 /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_16       0x00000051 /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_17       0x00000052 /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_16    0x00000053 /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_17    0x00000054 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_32       0x00000055 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_33       0x00000056 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_34       0x00000057 /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_35       0x00000058 /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_18       0x00000059 /*       */
+#define NV_PFAULT_CLIENT_GPC_PE_19       0x0000005A /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_18    0x0000005B /*       */
+#define NV_PFAULT_CLIENT_GPC_TPCCS_19    0x0000005C /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_36       0x0000005D /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_37       0x0000005E /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_38       0x0000005F /*       */
+#define NV_PFAULT_CLIENT_GPC_T1_39       0x00000060 /*       */
+#define NV_PFAULT_CLIENT_GPC_ROP_0       0x00000070 /*       */
+#define NV_PFAULT_CLIENT_GPC_ROP_1       0x00000071 /*       */
+#define NV_PFAULT_CLIENT_GPC_ROP_2       0x00000072 /*       */
+#define NV_PFAULT_CLIENT_GPC_ROP_3       0x00000073 /*       */
+#define NV_PFAULT_CLIENT_GPC_GPM          0x00000017 /*       */
+#define NV_PFAULT_CLIENT_HUB_VIP         0x00000000 /*       */
+#define NV_PFAULT_CLIENT_HUB_CE0         0x00000001 /*       */
+#define NV_PFAULT_CLIENT_HUB_CE1         0x00000002 /*       */
+#define NV_PFAULT_CLIENT_HUB_DNISO       0x00000003 /*       */
+#define NV_PFAULT_CLIENT_HUB_DISPNISO    0x00000003 /*       */
+#define NV_PFAULT_CLIENT_HUB_FE0         0x00000004 /*       */
+#define NV_PFAULT_CLIENT_HUB_FE          0x00000004 /*       */
+#define NV_PFAULT_CLIENT_HUB_FECS0       0x00000005 /*       */
+#define NV_PFAULT_CLIENT_HUB_FECS        0x00000005 /*       */
+#define NV_PFAULT_CLIENT_HUB_HOST        0x00000006 /*       */
+#define NV_PFAULT_CLIENT_HUB_HOST_CPU    0x00000007 /*       */
+#define NV_PFAULT_CLIENT_HUB_HOST_CPU_NB 0x00000008 /*       */
+#define NV_PFAULT_CLIENT_HUB_ISO         0x00000009 /*       */
+#define NV_PFAULT_CLIENT_HUB_MMU         0x0000000A /*       */
+#define NV_PFAULT_CLIENT_HUB_NVDEC0      0x0000000B /*       */
+#define NV_PFAULT_CLIENT_HUB_NVDEC       0x0000000B /*       */
+#define NV_PFAULT_CLIENT_HUB_CE3         0x0000000C /*       */
+#define NV_PFAULT_CLIENT_HUB_NVENC1      0x0000000D /*       */
+#define NV_PFAULT_CLIENT_HUB_NISO        0x0000000E /*       */
+#define NV_PFAULT_CLIENT_HUB_ACTRS       0x0000000E /*       */
+#define NV_PFAULT_CLIENT_HUB_P2P         0x0000000F /*       */
+#define NV_PFAULT_CLIENT_HUB_PD          0x00000010 /*       */
+#define NV_PFAULT_CLIENT_HUB_PD0         0x00000010 /*       */
+#define NV_PFAULT_CLIENT_HUB_PERF0       0x00000011 /*       */
+#define NV_PFAULT_CLIENT_HUB_PERF        0x00000011 /*       */
+#define NV_PFAULT_CLIENT_HUB_PMU         0x00000012 /*       */
+#define NV_PFAULT_CLIENT_HUB_RASTERTWOD  0x00000013 /*       */
+#define NV_PFAULT_CLIENT_HUB_RASTERTWOD0 0x00000013 /*       */
+#define NV_PFAULT_CLIENT_HUB_SCC         0x00000014 /*       */
+#define NV_PFAULT_CLIENT_HUB_SCC0        0x00000014 /*       */
+#define NV_PFAULT_CLIENT_HUB_SCC_NB      0x00000015 /*       */
+#define NV_PFAULT_CLIENT_HUB_SCC_NB0     0x00000015 /*       */
+#define NV_PFAULT_CLIENT_HUB_SEC         0x00000016 /*       */
+#define NV_PFAULT_CLIENT_HUB_SSYNC       0x00000017 /*       */
+#define NV_PFAULT_CLIENT_HUB_SSYNC0      0x00000017 /*       */
+#define NV_PFAULT_CLIENT_HUB_GRCOPY      0x00000018 /*       */
+#define NV_PFAULT_CLIENT_HUB_CE2         0x00000018 /*       */
+#define NV_PFAULT_CLIENT_HUB_XV          0x00000019 /*       */
+#define NV_PFAULT_CLIENT_HUB_MMU_NB      0x0000001A /*       */
+#define NV_PFAULT_CLIENT_HUB_NVENC0      0x0000001B /*       */
+#define NV_PFAULT_CLIENT_HUB_NVENC       0x0000001B /*       */
+#define NV_PFAULT_CLIENT_HUB_DFALCON     0x0000001C /*       */
+#define NV_PFAULT_CLIENT_HUB_SKED0       0x0000001D /*       */
+#define NV_PFAULT_CLIENT_HUB_SKED        0x0000001D /*       */
+#define NV_PFAULT_CLIENT_HUB_PD1         0x0000001E /*       */
+#define NV_PFAULT_CLIENT_HUB_DONT_CARE   0x0000001F /*       */
+#define NV_PFAULT_CLIENT_HUB_HSCE0       0x00000020 /*       */
+#define NV_PFAULT_CLIENT_HUB_HSCE1       0x00000021 /*       */
+#define NV_PFAULT_CLIENT_HUB_HSCE2       0x00000022 /*       */
+#define NV_PFAULT_CLIENT_HUB_HSCE3       0x00000023 /*       */
+#define NV_PFAULT_CLIENT_HUB_HSCE4       0x00000024 /*       */
+#define NV_PFAULT_CLIENT_HUB_HSCE5       0x00000025 /*       */
+#define NV_PFAULT_CLIENT_HUB_HSCE6       0x00000026 /*       */
+#define NV_PFAULT_CLIENT_HUB_HSCE7       0x00000027 /*       */
+#define NV_PFAULT_CLIENT_HUB_SSYNC1      0x00000028 /*       */
+#define NV_PFAULT_CLIENT_HUB_SSYNC2      0x00000029 /*       */
+#define NV_PFAULT_CLIENT_HUB_HSHUB       0x0000002A /*       */
+#define NV_PFAULT_CLIENT_HUB_PTP_X0      0x0000002B /*       */
+#define NV_PFAULT_CLIENT_HUB_PTP_X1      0x0000002C /*       */
+#define NV_PFAULT_CLIENT_HUB_PTP_X2      0x0000002D /*       */
+#define NV_PFAULT_CLIENT_HUB_PTP_X3      0x0000002E /*       */
+#define NV_PFAULT_CLIENT_HUB_PTP_X4      0x0000002F /*       */
+#define NV_PFAULT_CLIENT_HUB_PTP_X5      0x00000030 /*       */
+#define NV_PFAULT_CLIENT_HUB_PTP_X6      0x00000031 /*       */
+#define NV_PFAULT_CLIENT_HUB_PTP_X7      0x00000032 /*       */
+#define NV_PFAULT_CLIENT_HUB_NVENC2      0x00000033 /*       */
+#define NV_PFAULT_CLIENT_HUB_VPR_SCRUBBER0 0x00000034 /*       */
+#define NV_PFAULT_CLIENT_HUB_VPR_SCRUBBER1 0x00000035 /*       */
+#define NV_PFAULT_CLIENT_HUB_SSYNC3      0x00000036 /*       */
+#define NV_PFAULT_CLIENT_HUB_FBFALCON    0x00000037 /*       */
+#define NV_PFAULT_CLIENT_HUB_CE_SHIM     0x00000038 /*       */
+#define NV_PFAULT_CLIENT_HUB_CE_SHIM0    0x00000038 /*       */
+#define NV_PFAULT_CLIENT_HUB_GSP         0x00000039 /*       */
+#define NV_PFAULT_CLIENT_HUB_NVDEC1      0x0000003A /*       */
+#define NV_PFAULT_CLIENT_HUB_NVDEC2      0x0000003B /*       */
+#define NV_PFAULT_CLIENT_HUB_NVJPG0      0x0000003C /*       */
+#define NV_PFAULT_CLIENT_HUB_NVDEC3      0x0000003D /*       */
+#define NV_PFAULT_CLIENT_HUB_NVDEC4      0x0000003E /*       */
+#define NV_PFAULT_CLIENT_HUB_OFA0        0x0000003F /*       */
+#define NV_PFAULT_CLIENT_HUB_SCC1        0x00000040 /*       */
+#define NV_PFAULT_CLIENT_HUB_SCC_NB1     0x00000041 /*       */
+#define NV_PFAULT_CLIENT_HUB_SCC2        0x00000042 /*       */
+#define NV_PFAULT_CLIENT_HUB_SCC_NB2     0x00000043 /*       */
+#define NV_PFAULT_CLIENT_HUB_SCC3        0x00000044 /*       */
+#define NV_PFAULT_CLIENT_HUB_SCC_NB3     0x00000045 /*       */
+#define NV_PFAULT_CLIENT_HUB_RASTERTWOD1 0x00000046 /*       */
+#define NV_PFAULT_CLIENT_HUB_RASTERTWOD2 0x00000047 /*       */
+#define NV_PFAULT_CLIENT_HUB_RASTERTWOD3 0x00000048 /*       */
+#define NV_PFAULT_CLIENT_HUB_GSPLITE1    0x00000049 /*       */
+#define NV_PFAULT_CLIENT_HUB_GSPLITE2    0x0000004A /*       */
+#define NV_PFAULT_CLIENT_HUB_GSPLITE3    0x0000004B /*       */
+#define NV_PFAULT_CLIENT_HUB_PD2         0x0000004C /*       */
+#define NV_PFAULT_CLIENT_HUB_PD3         0x0000004D /*       */
+#define NV_PFAULT_CLIENT_HUB_FE1         0x0000004E /*       */
+#define NV_PFAULT_CLIENT_HUB_FE2         0x0000004F /*       */
+#define NV_PFAULT_CLIENT_HUB_FE3         0x00000050 /*       */
+#define NV_PFAULT_CLIENT_HUB_FE4         0x00000051 /*       */
+#define NV_PFAULT_CLIENT_HUB_FE5         0x00000052 /*       */
+#define NV_PFAULT_CLIENT_HUB_FE6         0x00000053 /*       */
+#define NV_PFAULT_CLIENT_HUB_FE7         0x00000054 /*       */
+#define NV_PFAULT_CLIENT_HUB_FECS1       0x00000055 /*       */
+#define NV_PFAULT_CLIENT_HUB_FECS2       0x00000056 /*       */
+#define NV_PFAULT_CLIENT_HUB_FECS3       0x00000057 /*       */
+#define NV_PFAULT_CLIENT_HUB_FECS4       0x00000058 /*       */
+#define NV_PFAULT_CLIENT_HUB_FECS5       0x00000059 /*       */
+#define NV_PFAULT_CLIENT_HUB_FECS6       0x0000005A /*       */
+#define NV_PFAULT_CLIENT_HUB_FECS7       0x0000005B /*       */
+#define NV_PFAULT_CLIENT_HUB_SKED1       0x0000005C /*       */
+#define NV_PFAULT_CLIENT_HUB_SKED2       0x0000005D /*       */
+#define NV_PFAULT_CLIENT_HUB_SKED3       0x0000005E /*       */
+#define NV_PFAULT_CLIENT_HUB_SKED4       0x0000005F /*       */
+#define NV_PFAULT_CLIENT_HUB_SKED5       0x00000060 /*       */
+#define NV_PFAULT_CLIENT_HUB_SKED6       0x00000061 /*       */
+#define NV_PFAULT_CLIENT_HUB_SKED7       0x00000062 /*       */
+#define NV_PFAULT_CLIENT_HUB_ESC          0x00000063 /*       */
+#define NV_PFAULT_CLIENT_HUB_ESC0         0x00000063 /*       */
+#define NV_PFAULT_CLIENT_HUB_ESC1         0x00000064 /*       */
+#define NV_PFAULT_CLIENT_HUB_ESC2         0x00000065 /*       */
+#define NV_PFAULT_CLIENT_HUB_ESC3         0x00000066 /*       */
+#define NV_PFAULT_CLIENT_HUB_ESC4         0x00000067 /*       */
+#define NV_PFAULT_CLIENT_HUB_ESC5         0x00000068 /*       */
+#define NV_PFAULT_CLIENT_HUB_ESC6         0x00000069 /*       */
+#define NV_PFAULT_CLIENT_HUB_ESC7         0x0000006a /*       */
+#define NV_PFAULT_CLIENT_HUB_ESC8         0x0000006b /*       */
+#define NV_PFAULT_CLIENT_HUB_ESC9         0x0000006c /*       */
+#define NV_PFAULT_CLIENT_HUB_ESC10        0x0000006d /*       */
+#define NV_PFAULT_CLIENT_HUB_ESC11        0x0000006e /*       */
+#define NV_PFAULT_CLIENT_HUB_NVDEC5      0x0000006F /*       */
+#define NV_PFAULT_CLIENT_HUB_NVDEC6      0x00000070 /*       */
+#define NV_PFAULT_CLIENT_HUB_NVDEC7      0x00000071 /*       */
+#define NV_PFAULT_CLIENT_HUB_NVJPG1      0x00000072 /*       */
+#define NV_PFAULT_CLIENT_HUB_NVJPG2      0x00000073 /*       */
+#define NV_PFAULT_CLIENT_HUB_NVJPG3      0x00000074 /*       */
+#define NV_PFAULT_CLIENT_HUB_NVJPG4      0x00000075 /*       */
+#define NV_PFAULT_CLIENT_HUB_NVJPG5      0x00000076 /*       */
+#define NV_PFAULT_CLIENT_HUB_NVJPG6      0x00000077 /*       */
+#define NV_PFAULT_CLIENT_HUB_NVJPG7      0x00000078 /*       */
+#define NV_PFAULT_CLIENT_HUB_FSP         0x00000079 /*       */
+#define NV_PFAULT_CLIENT_HUB_BSI         0x0000007A /*       */
+#define NV_PFAULT_CLIENT_HUB_GSPLITE     0x0000007B /*       */
+#define NV_PFAULT_CLIENT_HUB_GSPLITE0    0x0000007B /*       */
+#define NV_PFAULT_CLIENT_HUB_VPR_SCRUBBER2 0x0000007C /*       */
+#define NV_PFAULT_CLIENT_HUB_VPR_SCRUBBER3 0x0000007D /*       */
+#define NV_PFAULT_CLIENT_HUB_VPR_SCRUBBER4 0x0000007E /*       */
+#define NV_PFAULT_CLIENT_HUB_NVENC3      0x0000007F /*       */
+#define NV_PFAULT_ACCESS_TYPE                 19:16 /*       */
+#define NV_PFAULT_ACCESS_TYPE_READ       0x00000000 /*       */
+#define NV_PFAULT_ACCESS_TYPE_WRITE      0x00000001 /*       */
+#define NV_PFAULT_ACCESS_TYPE_ATOMIC     0x00000002 /*       */
+#define NV_PFAULT_ACCESS_TYPE_PREFETCH   0x00000003 /*       */
+#define NV_PFAULT_ACCESS_TYPE_VIRT_READ          0x00000000 /*       */
+#define NV_PFAULT_ACCESS_TYPE_VIRT_WRITE         0x00000001 /*       */
+#define NV_PFAULT_ACCESS_TYPE_VIRT_ATOMIC        0x00000002 /*       */
+#define NV_PFAULT_ACCESS_TYPE_VIRT_ATOMIC_STRONG 0x00000002 /*       */
+#define NV_PFAULT_ACCESS_TYPE_VIRT_PREFETCH      0x00000003 /*       */
+#define NV_PFAULT_ACCESS_TYPE_VIRT_ATOMIC_WEAK   0x00000004 /*       */
+#define NV_PFAULT_ACCESS_TYPE_PHYS_READ          0x00000008 /*       */
+#define NV_PFAULT_ACCESS_TYPE_PHYS_WRITE         0x00000009 /*       */
+#define NV_PFAULT_ACCESS_TYPE_PHYS_ATOMIC        0x0000000a /*       */
+#define NV_PFAULT_ACCESS_TYPE_PHYS_PREFETCH      0x0000000b /*       */
+#define NV_PFAULT_MMU_CLIENT_TYPE             20:20 /*       */
+#define NV_PFAULT_MMU_CLIENT_TYPE_GPC    0x00000000 /*       */
+#define NV_PFAULT_MMU_CLIENT_TYPE_HUB    0x00000001 /*       */
+#define NV_PFAULT_GPC_ID                      28:24 /*       */
+#define NV_PFAULT_PROTECTED_MODE              29:29 /*       */
+#define NV_PFAULT_REPLAYABLE_FAULT_EN         30:30 /*       */
+#define NV_PFAULT_VALID                       31:31 /*       */
+#endif // __gb100_dev_fault_h__
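The `NV_PFAULT_*` fields above are written as `high:low` bit ranges (for example `4:0` or `14:8`), which are not valid C expressions on their own. A minimal sketch of one way to consume them follows, assuming the ranges describe bit positions within a single 32-bit fault-info word. The `FIELD_HI`, `FIELD_LO`, `FIELD_MASK`, and `FIELD_GET` helpers and the `describe_fault` function are hypothetical names introduced for illustration only; they are not part of this header. The sketch relies on the ternary operator to pick one side of the `high:low` pair.

```c
#include <stdint.h>
#include <stdio.h>

/* Field ranges and values copied from the definitions above. */
#define NV_PFAULT_FAULT_TYPE                 4:0
#define NV_PFAULT_FAULT_TYPE_PTE             0x00000002
#define NV_PFAULT_CLIENT                     14:8
#define NV_PFAULT_CLIENT_HUB_CE0             0x00000001
#define NV_PFAULT_ACCESS_TYPE                19:16
#define NV_PFAULT_ACCESS_TYPE_VIRT_WRITE     0x00000001
#define NV_PFAULT_MMU_CLIENT_TYPE            20:20
#define NV_PFAULT_MMU_CLIENT_TYPE_HUB        0x00000001
#define NV_PFAULT_VALID                      31:31

/*
 * Hypothetical helpers (not defined by the header).
 * "1 ? 4:0" parses as "1 ? 4 : 0" and yields the high bit;
 * "0 ? 4:0" parses as "0 ? 4 : 0" and yields the low bit.
 */
#define FIELD_HI(range)   (1 ? range)
#define FIELD_LO(range)   (0 ? range)
#define FIELD_MASK(range) \
    (0xFFFFFFFFu >> (31 - FIELD_HI(range) + FIELD_LO(range)))
#define FIELD_GET(range, word) \
    (((uint32_t)(word) >> FIELD_LO(range)) & FIELD_MASK(range))

/* Print the decoded fields of an assumed 32-bit fault-info word. */
static void describe_fault(uint32_t word)
{
    printf("valid=%u client_type=%u access=%u client=0x%02x type=0x%02x\n",
           (unsigned)FIELD_GET(NV_PFAULT_VALID, word),
           (unsigned)FIELD_GET(NV_PFAULT_MMU_CLIENT_TYPE, word),
           (unsigned)FIELD_GET(NV_PFAULT_ACCESS_TYPE, word),
           (unsigned)FIELD_GET(NV_PFAULT_CLIENT, word),
           (unsigned)FIELD_GET(NV_PFAULT_FAULT_TYPE, word));
}
```

Note that the `GPC` and `HUB` client values overlap numerically (for instance both `NV_PFAULT_CLIENT_GPC_T1_1` and `NV_PFAULT_CLIENT_HUB_CE0` are `0x1`), so a decoder would presumably need to read `NV_PFAULT_MMU_CLIENT_TYPE` first to decide which client table applies.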
--- a/kernel-open/nvidia-uvm/hwref/blackwell/gb100/dev_mmu.h
+++ b/kernel-open/nvidia-uvm/hwref/blackwell/gb100/dev_mmu.h
@@ -0,0 +1,560 @@
+/*******************************************************************************
+    Copyright (c) 2003-2016 NVIDIA Corporation
+
+    Permission is hereby granted, free of charge, to any person obtaining a copy
+    of this software and associated documentation files (the "Software"), to
+    deal in the Software without restriction, including without limitation the
+    rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+    sell copies of the Software, and to permit persons to whom the Software is
+    furnished to do so, subject to the following conditions:
+
+    The above copyright notice and this permission notice shall be
+    included in all copies or substantial portions of the Software.
+
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+    THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+    FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+    DEALINGS IN THE SOFTWARE.
+
+*******************************************************************************/
+
+
+#ifndef __gb100_dev_mmu_h__
+#define __gb100_dev_mmu_h__
+/* This file is autogenerated.  Do not edit */
+#define NV_MMU_PDE                                                      /* ----G */
+#define NV_MMU_PDE_APERTURE_BIG                       (0*32+1):(0*32+0) /* RWXVF */
+#define NV_MMU_PDE_APERTURE_BIG_INVALID                      0x00000000 /* RW--V */
+#define NV_MMU_PDE_APERTURE_BIG_VIDEO_MEMORY                 0x00000001 /* RW--V */
+#define NV_MMU_PDE_APERTURE_BIG_SYSTEM_COHERENT_MEMORY       0x00000002 /* RW--V */
+#define NV_MMU_PDE_APERTURE_BIG_SYSTEM_NON_COHERENT_MEMORY   0x00000003 /* RW--V */
+#define NV_MMU_PDE_SIZE                               (0*32+3):(0*32+2) /* RWXVF */
+#define NV_MMU_PDE_SIZE_FULL                                 0x00000000 /* RW--V */
+#define NV_MMU_PDE_SIZE_HALF                                 0x00000001 /* RW--V */
+#define NV_MMU_PDE_SIZE_QUARTER                              0x00000002 /* RW--V */
+#define NV_MMU_PDE_SIZE_EIGHTH                               0x00000003 /* RW--V */
+#define NV_MMU_PDE_ADDRESS_BIG_SYS                   (0*32+31):(0*32+4) /* RWXVF */
+#define NV_MMU_PDE_ADDRESS_BIG_VID                   (0*32+31-3):(0*32+4) /* RWXVF */
+#define NV_MMU_PDE_ADDRESS_BIG_VID_PEER             (0*32+31):(0*32+32-3) /* RWXVF */
+#define NV_MMU_PDE_ADDRESS_BIG_VID_PEER_0                    0x00000000 /* RW--V */
+#define NV_MMU_PDE_APERTURE_SMALL                     (1*32+1):(1*32+0) /* RWXVF */
+#define NV_MMU_PDE_APERTURE_SMALL_INVALID                    0x00000000 /* RW--V */
+#define NV_MMU_PDE_APERTURE_SMALL_VIDEO_MEMORY               0x00000001 /* RW--V */
+#define NV_MMU_PDE_APERTURE_SMALL_SYSTEM_COHERENT_MEMORY     0x00000002 /* RW--V */
+#define NV_MMU_PDE_APERTURE_SMALL_SYSTEM_NON_COHERENT_MEMORY 0x00000003 /* RW--V */
+#define NV_MMU_PDE_VOL_SMALL                          (1*32+2):(1*32+2) /* RWXVF */
+#define NV_MMU_PDE_VOL_SMALL_TRUE                            0x00000001 /* RW--V */
+#define NV_MMU_PDE_VOL_SMALL_FALSE                           0x00000000 /* RW--V */
+#define NV_MMU_PDE_VOL_BIG                            (1*32+3):(1*32+3) /* RWXVF */
+#define NV_MMU_PDE_VOL_BIG_TRUE                              0x00000001 /* RW--V */
+#define NV_MMU_PDE_VOL_BIG_FALSE                             0x00000000 /* RW--V */
+#define NV_MMU_PDE_ADDRESS_SMALL_SYS                 (1*32+31):(1*32+4) /* RWXVF */
+#define NV_MMU_PDE_ADDRESS_SMALL_VID                 (1*32+31-3):(1*32+4) /* RWXVF */
+#define NV_MMU_PDE_ADDRESS_SMALL_VID_PEER           (1*32+31):(1*32+32-3) /* RWXVF */
+#define NV_MMU_PDE_ADDRESS_SMALL_VID_PEER_0                  0x00000000 /* RW--V */
+#define NV_MMU_PDE_ADDRESS_SHIFT                             0x0000000c /*       */
+#define NV_MMU_PDE__SIZE                                              8
+#define NV_MMU_PTE                                                      /* ----G */
+#define NV_MMU_PTE_VALID                              (0*32+0):(0*32+0) /* RWXVF */
+#define NV_MMU_PTE_VALID_TRUE                                       0x1 /* RW--V */
+#define NV_MMU_PTE_VALID_FALSE                                      0x0 /* RW--V */
+#define NV_MMU_PTE_PRIVILEGE                          (0*32+1):(0*32+1) /* RWXVF */
+#define NV_MMU_PTE_PRIVILEGE_TRUE                                   0x1 /* RW--V */
+#define NV_MMU_PTE_PRIVILEGE_FALSE                                  0x0 /* RW--V */
+#define NV_MMU_PTE_READ_ONLY                          (0*32+2):(0*32+2) /* RWXVF */
+#define NV_MMU_PTE_READ_ONLY_TRUE                                  0x1  /* RW--V */
+#define NV_MMU_PTE_READ_ONLY_FALSE                                 0x0  /* RW--V */
+#define NV_MMU_PTE_ENCRYPTED                          (0*32+3):(0*32+3) /* RWXVF */
+#define NV_MMU_PTE_ENCRYPTED_TRUE                            0x00000001 /* R---V */
+#define NV_MMU_PTE_ENCRYPTED_FALSE                           0x00000000 /* R---V */
+#define NV_MMU_PTE_ADDRESS_SYS                      (0*32+31):(0*32+4) /* RWXVF */
+#define NV_MMU_PTE_ADDRESS_VID                      (0*32+31-3):(0*32+4) /* RWXVF */
+#define NV_MMU_PTE_ADDRESS_VID_PEER                (0*32+31):(0*32+32-3) /* RWXVF */
+#define NV_MMU_PTE_ADDRESS_VID_PEER_0                       0x00000000 /* RW--V */
+#define NV_MMU_PTE_ADDRESS_VID_PEER_1                       0x00000001 /* RW--V */
+#define NV_MMU_PTE_ADDRESS_VID_PEER_2                       0x00000002 /* RW--V */
+#define NV_MMU_PTE_ADDRESS_VID_PEER_3                       0x00000003 /* RW--V */
+#define NV_MMU_PTE_ADDRESS_VID_PEER_4                       0x00000004 /* RW--V */
+#define NV_MMU_PTE_ADDRESS_VID_PEER_5                       0x00000005 /* RW--V */
+#define NV_MMU_PTE_ADDRESS_VID_PEER_6                       0x00000006 /* RW--V */
+#define NV_MMU_PTE_ADDRESS_VID_PEER_7                       0x00000007 /* RW--V */
+#define NV_MMU_PTE_VOL                                (1*32+0):(1*32+0) /* RWXVF */
+#define NV_MMU_PTE_VOL_TRUE                                  0x00000001 /* RW--V */
+#define NV_MMU_PTE_VOL_FALSE                                 0x00000000 /* RW--V */
+#define NV_MMU_PTE_APERTURE                           (1*32+2):(1*32+1) /* RWXVF */
+#define NV_MMU_PTE_APERTURE_VIDEO_MEMORY                     0x00000000 /* RW--V */
+#define NV_MMU_PTE_APERTURE_PEER_MEMORY                      0x00000001 /* RW--V */
+#define NV_MMU_PTE_APERTURE_SYSTEM_COHERENT_MEMORY           0x00000002 /* RW--V */
+#define NV_MMU_PTE_APERTURE_SYSTEM_NON_COHERENT_MEMORY       0x00000003 /* RW--V */
+#define NV_MMU_PTE_LOCK                               (1*32+3):(1*32+3) /* RWXVF */
+#define NV_MMU_PTE_LOCK_TRUE                                        0x1 /* RW--V */
+#define NV_MMU_PTE_LOCK_FALSE                                       0x0 /* RW--V */
+#define NV_MMU_PTE_ATOMIC_DISABLE                     (1*32+3):(1*32+3) /* RWXVF */
+#define NV_MMU_PTE_ATOMIC_DISABLE_TRUE                              0x1 /* RW--V */
+#define NV_MMU_PTE_ATOMIC_DISABLE_FALSE                             0x0 /* RW--V */
+#define NV_MMU_PTE_COMPTAGLINE                      (1*32+20+11):(1*32+12) /* RWXVF */
+#define NV_MMU_PTE_READ_DISABLE                     (1*32+30):(1*32+30) /* RWXVF */
+#define NV_MMU_PTE_READ_DISABLE_TRUE                               0x1  /* RW--V */
+#define NV_MMU_PTE_READ_DISABLE_FALSE                              0x0  /* RW--V */
+#define NV_MMU_PTE_WRITE_DISABLE                    (1*32+31):(1*32+31) /* RWXVF */
+#define NV_MMU_PTE_WRITE_DISABLE_TRUE                              0x1  /* RW--V */
+#define NV_MMU_PTE_WRITE_DISABLE_FALSE                             0x0  /* RW--V */
+#define NV_MMU_PTE_ADDRESS_SHIFT                             0x0000000c /*       */
+#define NV_MMU_PTE__SIZE                                             8
+#define NV_MMU_PTE_COMPTAGS_NONE                                    0x0 /*       */
+#define NV_MMU_PTE_COMPTAGS_1                                       0x1 /*       */
+#define NV_MMU_PTE_COMPTAGS_2                                       0x2 /*       */
+#define NV_MMU_PTE_KIND                              (1*32+7):(1*32+4) /* RWXVF */
+#define NV_MMU_PTE_KIND_INVALID                       0x07 /* R---V */
+#define NV_MMU_PTE_KIND_PITCH                         0x00 /* R---V */
+#define NV_MMU_PTE_KIND_GENERIC_MEMORY                                                  0x6 /* R---V */
+#define NV_MMU_PTE_KIND_Z16                                                             0x1 /* R---V */
+#define NV_MMU_PTE_KIND_S8                                                              0x2 /* R---V */
+#define NV_MMU_PTE_KIND_S8Z24                                                           0x3 /* R---V */
+#define NV_MMU_PTE_KIND_ZF32_X24S8                                                      0x4 /* R---V */
+#define NV_MMU_PTE_KIND_Z24S8                                                           0x5 /* R---V */
+#define NV_MMU_PTE_KIND_GENERIC_MEMORY_COMPRESSIBLE                                     0x8 /* R---V */
+#define NV_MMU_PTE_KIND_GENERIC_MEMORY_COMPRESSIBLE_DISABLE_PLC                         0x9 /* R---V */
+#define NV_MMU_PTE_KIND_S8_COMPRESSIBLE_DISABLE_PLC                                     0xA /* R---V */
+#define NV_MMU_PTE_KIND_Z16_COMPRESSIBLE_DISABLE_PLC                                    0xB /* R---V */
+#define NV_MMU_PTE_KIND_S8Z24_COMPRESSIBLE_DISABLE_PLC                                  0xC /* R---V */
+#define NV_MMU_PTE_KIND_ZF32_X24S8_COMPRESSIBLE_DISABLE_PLC                             0xD /* R---V */
+#define NV_MMU_PTE_KIND_Z24S8_COMPRESSIBLE_DISABLE_PLC                                  0xE /* R---V */
+#define NV_MMU_PTE_KIND_SMSKED_MESSAGE                                                  0xF /* R---V */
+#define NV_MMU_VER1_PDE                                                      /* ----G */
+#define NV_MMU_VER1_PDE_APERTURE_BIG                       (0*32+1):(0*32+0) /* RWXVF */
+#define NV_MMU_VER1_PDE_APERTURE_BIG_INVALID                      0x00000000 /* RW--V */
+#define NV_MMU_VER1_PDE_APERTURE_BIG_VIDEO_MEMORY                 0x00000001 /* RW--V */
+#define NV_MMU_VER1_PDE_APERTURE_BIG_SYSTEM_COHERENT_MEMORY       0x00000002 /* RW--V */
+#define NV_MMU_VER1_PDE_APERTURE_BIG_SYSTEM_NON_COHERENT_MEMORY   0x00000003 /* RW--V */
+#define NV_MMU_VER1_PDE_SIZE                               (0*32+3):(0*32+2) /* RWXVF */
+#define NV_MMU_VER1_PDE_SIZE_FULL                                 0x00000000 /* RW--V */
+#define NV_MMU_VER1_PDE_SIZE_HALF                                 0x00000001 /* RW--V */
+#define NV_MMU_VER1_PDE_SIZE_QUARTER                              0x00000002 /* RW--V */
+#define NV_MMU_VER1_PDE_SIZE_EIGHTH                               0x00000003 /* RW--V */
+#define NV_MMU_VER1_PDE_ADDRESS_BIG_SYS                   (0*32+31):(0*32+4) /* RWXVF */
+#define NV_MMU_VER1_PDE_ADDRESS_BIG_VID                   (0*32+31-3):(0*32+4) /* RWXVF */
+#define NV_MMU_VER1_PDE_ADDRESS_BIG_VID_PEER             (0*32+31):(0*32+32-3) /* RWXVF */
+#define NV_MMU_VER1_PDE_ADDRESS_BIG_VID_PEER_0                    0x00000000 /* RW--V */
+#define NV_MMU_VER1_PDE_APERTURE_SMALL                     (1*32+1):(1*32+0) /* RWXVF */
+#define NV_MMU_VER1_PDE_APERTURE_SMALL_INVALID                    0x00000000 /* RW--V */
+#define NV_MMU_VER1_PDE_APERTURE_SMALL_VIDEO_MEMORY               0x00000001 /* RW--V */
+#define NV_MMU_VER1_PDE_APERTURE_SMALL_SYSTEM_COHERENT_MEMORY     0x00000002 /* RW--V */
+#define NV_MMU_VER1_PDE_APERTURE_SMALL_SYSTEM_NON_COHERENT_MEMORY 0x00000003 /* RW--V */
+#define NV_MMU_VER1_PDE_VOL_SMALL                          (1*32+2):(1*32+2) /* RWXVF */
+#define NV_MMU_VER1_PDE_VOL_SMALL_TRUE                            0x00000001 /* RW--V */
+#define NV_MMU_VER1_PDE_VOL_SMALL_FALSE                           0x00000000 /* RW--V */
+#define NV_MMU_VER1_PDE_VOL_BIG                            (1*32+3):(1*32+3) /* RWXVF */
+#define NV_MMU_VER1_PDE_VOL_BIG_TRUE                              0x00000001 /* RW--V */
+#define NV_MMU_VER1_PDE_VOL_BIG_FALSE                             0x00000000 /* RW--V */
+#define NV_MMU_VER1_PDE_ADDRESS_SMALL_SYS                 (1*32+31):(1*32+4) /* RWXVF */
+#define NV_MMU_VER1_PDE_ADDRESS_SMALL_VID                 (1*32+31-3):(1*32+4) /* RWXVF */
+#define NV_MMU_VER1_PDE_ADDRESS_SMALL_VID_PEER           (1*32+31):(1*32+32-3) /* RWXVF */
+#define NV_MMU_VER1_PDE_ADDRESS_SMALL_VID_PEER_0                  0x00000000 /* RW--V */
+#define NV_MMU_VER1_PDE_ADDRESS_SHIFT                             0x0000000c /*       */
+#define NV_MMU_VER1_PDE__SIZE                                              8
+#define NV_MMU_VER1_PTE                                                      /* ----G */
+#define NV_MMU_VER1_PTE_VALID                              (0*32+0):(0*32+0) /* RWXVF */
+#define NV_MMU_VER1_PTE_VALID_TRUE                                       0x1 /* RW--V */
+#define NV_MMU_VER1_PTE_VALID_FALSE                                      0x0 /* RW--V */
+#define NV_MMU_VER1_PTE_PRIVILEGE                          (0*32+1):(0*32+1) /* RWXVF */
+#define NV_MMU_VER1_PTE_PRIVILEGE_TRUE                                   0x1 /* RW--V */
+#define NV_MMU_VER1_PTE_PRIVILEGE_FALSE                                  0x0 /* RW--V */
+#define NV_MMU_VER1_PTE_READ_ONLY                          (0*32+2):(0*32+2) /* RWXVF */
+#define NV_MMU_VER1_PTE_READ_ONLY_TRUE                                  0x1  /* RW--V */
+#define NV_MMU_VER1_PTE_READ_ONLY_FALSE                                 0x0  /* RW--V */
+#define NV_MMU_VER1_PTE_ENCRYPTED                          (0*32+3):(0*32+3) /* RWXVF */
+#define NV_MMU_VER1_PTE_ENCRYPTED_TRUE                            0x00000001 /* R---V */
+#define NV_MMU_VER1_PTE_ENCRYPTED_FALSE                           0x00000000 /* R---V */
+#define NV_MMU_VER1_PTE_ADDRESS_SYS                      (0*32+31):(0*32+4) /* RWXVF */
+#define NV_MMU_VER1_PTE_ADDRESS_VID                      (0*32+31-3):(0*32+4) /* RWXVF */
+#define NV_MMU_VER1_PTE_ADDRESS_VID_PEER                (0*32+31):(0*32+32-3) /* RWXVF */
+#define NV_MMU_VER1_PTE_ADDRESS_VID_PEER_0                       0x00000000 /* RW--V */
+#define NV_MMU_VER1_PTE_ADDRESS_VID_PEER_1                       0x00000001 /* RW--V */
+#define NV_MMU_VER1_PTE_ADDRESS_VID_PEER_2                       0x00000002 /* RW--V */
+#define NV_MMU_VER1_PTE_ADDRESS_VID_PEER_3                       0x00000003 /* RW--V */
+#define NV_MMU_VER1_PTE_ADDRESS_VID_PEER_4                       0x00000004 /* RW--V */
+#define NV_MMU_VER1_PTE_ADDRESS_VID_PEER_5                       0x00000005 /* RW--V */
+#define NV_MMU_VER1_PTE_ADDRESS_VID_PEER_6                       0x00000006 /* RW--V */
+#define NV_MMU_VER1_PTE_ADDRESS_VID_PEER_7                       0x00000007 /* RW--V */
+#define NV_MMU_VER1_PTE_VOL                                (1*32+0):(1*32+0) /* RWXVF */
+#define NV_MMU_VER1_PTE_VOL_TRUE                                  0x00000001 /* RW--V */
+#define NV_MMU_VER1_PTE_VOL_FALSE                                 0x00000000 /* RW--V */
+#define NV_MMU_VER1_PTE_APERTURE                           (1*32+2):(1*32+1) /* RWXVF */
+#define NV_MMU_VER1_PTE_APERTURE_VIDEO_MEMORY                     0x00000000 /* RW--V */
+#define NV_MMU_VER1_PTE_APERTURE_PEER_MEMORY                      0x00000001 /* RW--V */
+#define NV_MMU_VER1_PTE_APERTURE_SYSTEM_COHERENT_MEMORY           0x00000002 /* RW--V */
+#define NV_MMU_VER1_PTE_APERTURE_SYSTEM_NON_COHERENT_MEMORY       0x00000003 /* RW--V */
+#define NV_MMU_VER1_PTE_ATOMIC_DISABLE                     (1*32+3):(1*32+3) /* RWXVF */
+#define NV_MMU_VER1_PTE_ATOMIC_DISABLE_TRUE                              0x1 /* RW--V */
+#define NV_MMU_VER1_PTE_ATOMIC_DISABLE_FALSE                             0x0 /* RW--V */
+#define NV_MMU_VER1_PTE_COMPTAGLINE                      (1*32+20+11):(1*32+12) /* RWXVF */
+#define NV_MMU_VER1_PTE_KIND                              (1*32+11):(1*32+4) /* RWXVF */
+#define NV_MMU_VER1_PTE_ADDRESS_SHIFT                             0x0000000c /*       */
+#define NV_MMU_VER1_PTE__SIZE                                             8
+#define NV_MMU_VER1_PTE_COMPTAGS_NONE                                    0x0 /*       */
+#define NV_MMU_VER1_PTE_COMPTAGS_1                                       0x1 /*       */
+#define NV_MMU_VER1_PTE_COMPTAGS_2                                       0x2 /*       */
+#define NV_MMU_NEW_PDE                                                      /* ----G */
+#define NV_MMU_NEW_PDE_IS_PTE                                           0:0 /* RWXVF */
+#define NV_MMU_NEW_PDE_IS_PTE_TRUE                                      0x1 /* RW--V */
+#define NV_MMU_NEW_PDE_IS_PTE_FALSE                                     0x0 /* RW--V */
+#define NV_MMU_NEW_PDE_IS_PDE                                           0:0 /* RWXVF */
+#define NV_MMU_NEW_PDE_IS_PDE_TRUE                                      0x0 /* RW--V */
+#define NV_MMU_NEW_PDE_IS_PDE_FALSE                                     0x1 /* RW--V */
+#define NV_MMU_NEW_PDE_VALID                                            0:0 /* RWXVF */
+#define NV_MMU_NEW_PDE_VALID_TRUE                                       0x1 /* RW--V */
+#define NV_MMU_NEW_PDE_VALID_FALSE                                      0x0 /* RW--V */
+#define NV_MMU_NEW_PDE_APERTURE                                         2:1 /* RWXVF */
+#define NV_MMU_NEW_PDE_APERTURE_INVALID                          0x00000000 /* RW--V */
+#define NV_MMU_NEW_PDE_APERTURE_VIDEO_MEMORY                     0x00000001 /* RW--V */
+#define NV_MMU_NEW_PDE_APERTURE_SYSTEM_COHERENT_MEMORY           0x00000002 /* RW--V */
+#define NV_MMU_NEW_PDE_APERTURE_SYSTEM_NON_COHERENT_MEMORY       0x00000003 /* RW--V */
+#define NV_MMU_NEW_PDE_VOL                                              3:3 /* RWXVF */
+#define NV_MMU_NEW_PDE_VOL_TRUE                                  0x00000001 /* RW--V */
+#define NV_MMU_NEW_PDE_VOL_FALSE                                 0x00000000 /* RW--V */
+#define NV_MMU_NEW_PDE_NO_ATS                                            5:5 /* RWXVF */
+#define NV_MMU_NEW_PDE_NO_ATS_TRUE                                       0x1 /* RW--V */
+#define NV_MMU_NEW_PDE_NO_ATS_FALSE                                      0x0 /* RW--V */
+#define NV_MMU_NEW_PDE_ADDRESS_SYS                                     53:8 /* RWXVF */
+#define NV_MMU_NEW_PDE_ADDRESS_VID             (35-3):8 /* RWXVF */
+#define NV_MMU_NEW_PDE_ADDRESS_VID_PEER       35:(36-3) /* RWXVF */
+#define NV_MMU_NEW_PDE_ADDRESS_VID_PEER_0                        0x00000000 /* RW--V */
+#define NV_MMU_NEW_PDE_ADDRESS_SHIFT                             0x0000000c /*       */
+#define NV_MMU_NEW_PDE__SIZE                                              8
+#define NV_MMU_NEW_DUAL_PDE                                                      /* ----G */
+#define NV_MMU_NEW_DUAL_PDE_IS_PTE                                           0:0 /* RWXVF */
+#define NV_MMU_NEW_DUAL_PDE_IS_PTE_TRUE                                      0x1 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_IS_PTE_FALSE                                     0x0 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_IS_PDE                                           0:0 /* RWXVF */
+#define NV_MMU_NEW_DUAL_PDE_IS_PDE_TRUE                                      0x0 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_IS_PDE_FALSE                                     0x1 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_VALID                                            0:0 /* RWXVF */
+#define NV_MMU_NEW_DUAL_PDE_VALID_TRUE                                       0x1 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_VALID_FALSE                                      0x0 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_APERTURE_BIG                                     2:1 /* RWXVF */
+#define NV_MMU_NEW_DUAL_PDE_APERTURE_BIG_INVALID                      0x00000000 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_APERTURE_BIG_VIDEO_MEMORY                 0x00000001 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_APERTURE_BIG_SYSTEM_COHERENT_MEMORY       0x00000002 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_APERTURE_BIG_SYSTEM_NON_COHERENT_MEMORY   0x00000003 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_VOL_BIG                                          3:3 /* RWXVF */
+#define NV_MMU_NEW_DUAL_PDE_VOL_BIG_TRUE                              0x00000001 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_VOL_BIG_FALSE                             0x00000000 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_NO_ATS                                       5:5 /* RWXVF */
+#define NV_MMU_NEW_DUAL_PDE_NO_ATS_TRUE                                  0x1 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_NO_ATS_FALSE                                 0x0 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_ADDRESS_BIG_SYS                                 53:(8-4) /* RWXVF */
+#define NV_MMU_NEW_DUAL_PDE_ADDRESS_BIG_VID         (35-3):(8-4) /* RWXVF */
+#define NV_MMU_NEW_DUAL_PDE_ADDRESS_BIG_VID_PEER   35:(36-3) /* RWXVF */
+#define NV_MMU_NEW_DUAL_PDE_ADDRESS_BIG_VID_PEER_0                    0x00000000 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_APERTURE_SMALL                                 66:65 /* RWXVF */
+#define NV_MMU_NEW_DUAL_PDE_APERTURE_SMALL_INVALID                    0x00000000 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_APERTURE_SMALL_VIDEO_MEMORY               0x00000001 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_APERTURE_SMALL_SYSTEM_COHERENT_MEMORY     0x00000002 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_APERTURE_SMALL_SYSTEM_NON_COHERENT_MEMORY 0x00000003 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_VOL_SMALL                                      67:67 /* RWXVF */
+#define NV_MMU_NEW_DUAL_PDE_VOL_SMALL_TRUE                            0x00000001 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_VOL_SMALL_FALSE                           0x00000000 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_ADDRESS_SMALL_SYS                             117:72 /* RWXVF */
+#define NV_MMU_NEW_DUAL_PDE_ADDRESS_SMALL_VID      (99-3):72 /* RWXVF */
+#define NV_MMU_NEW_DUAL_PDE_ADDRESS_SMALL_VID_PEER 99:(100-3) /* RWXVF */
+#define NV_MMU_NEW_DUAL_PDE_ADDRESS_SMALL_VID_PEER_0                  0x00000000 /* RW--V */
+#define NV_MMU_NEW_DUAL_PDE_ADDRESS_SHIFT                             0x0000000c /*       */
+#define NV_MMU_NEW_DUAL_PDE_ADDRESS_BIG_SHIFT 8 /*       */
+#define NV_MMU_NEW_DUAL_PDE__SIZE                                             16
+#define NV_MMU_NEW_PTE                                                      /* ----G */
+#define NV_MMU_NEW_PTE_VALID                                            0:0 /* RWXVF */
+#define NV_MMU_NEW_PTE_VALID_TRUE                                       0x1 /* RW--V */
+#define NV_MMU_NEW_PTE_VALID_FALSE                                      0x0 /* RW--V */
+#define NV_MMU_NEW_PTE_APERTURE                                         2:1 /* RWXVF */
+#define NV_MMU_NEW_PTE_APERTURE_VIDEO_MEMORY                     0x00000000 /* RW--V */
+#define NV_MMU_NEW_PTE_APERTURE_PEER_MEMORY                      0x00000001 /* RW--V */
+#define NV_MMU_NEW_PTE_APERTURE_SYSTEM_COHERENT_MEMORY           0x00000002 /* RW--V */
+#define NV_MMU_NEW_PTE_APERTURE_SYSTEM_NON_COHERENT_MEMORY       0x00000003 /* RW--V */
+#define NV_MMU_NEW_PTE_VOL                                              3:3 /* RWXVF */
+#define NV_MMU_NEW_PTE_VOL_TRUE                                  0x00000001 /* RW--V */
+#define NV_MMU_NEW_PTE_VOL_FALSE                                 0x00000000 /* RW--V */
+#define NV_MMU_NEW_PTE_ENCRYPTED                                        4:4 /* RWXVF */
+#define NV_MMU_NEW_PTE_ENCRYPTED_TRUE                            0x00000001 /* R---V */
+#define NV_MMU_NEW_PTE_ENCRYPTED_FALSE                           0x00000000 /* R---V */
+#define NV_MMU_NEW_PTE_PRIVILEGE                                        5:5 /* RWXVF */
+#define NV_MMU_NEW_PTE_PRIVILEGE_TRUE                                   0x1 /* RW--V */
+#define NV_MMU_NEW_PTE_PRIVILEGE_FALSE                                  0x0 /* RW--V */
+#define NV_MMU_NEW_PTE_READ_ONLY                                        6:6 /* RWXVF */
+#define NV_MMU_NEW_PTE_READ_ONLY_TRUE                                   0x1 /* RW--V */
+#define NV_MMU_NEW_PTE_READ_ONLY_FALSE                                  0x0 /* RW--V */
+#define NV_MMU_NEW_PTE_ATOMIC_DISABLE                                   7:7 /* RWXVF */
+#define NV_MMU_NEW_PTE_ATOMIC_DISABLE_TRUE                              0x1 /* RW--V */
+#define NV_MMU_NEW_PTE_ATOMIC_DISABLE_FALSE                             0x0 /* RW--V */
+#define NV_MMU_NEW_PTE_ADDRESS_SYS                                     53:8 /* RWXVF */
+#define NV_MMU_NEW_PTE_ADDRESS_VID             (35-3):8 /* RWXVF */
+#define NV_MMU_NEW_PTE_ADDRESS_VID_PEER       35:(36-3) /* RWXVF */
+#define NV_MMU_NEW_PTE_ADDRESS_VID_PEER_0                        0x00000000 /* RW--V */
+#define NV_MMU_NEW_PTE_ADDRESS_VID_PEER_1                        0x00000001 /* RW--V */
+#define NV_MMU_NEW_PTE_ADDRESS_VID_PEER_2                        0x00000002 /* RW--V */
+#define NV_MMU_NEW_PTE_ADDRESS_VID_PEER_3                        0x00000003 /* RW--V */
+#define NV_MMU_NEW_PTE_ADDRESS_VID_PEER_4                        0x00000004 /* RW--V */
+#define NV_MMU_NEW_PTE_ADDRESS_VID_PEER_5                        0x00000005 /* RW--V */
+#define NV_MMU_NEW_PTE_ADDRESS_VID_PEER_6                        0x00000006 /* RW--V */
+#define NV_MMU_NEW_PTE_ADDRESS_VID_PEER_7                        0x00000007 /* RW--V */
+#define NV_MMU_NEW_PTE_COMPTAGLINE   (20+35):36 /* RWXVF */
+#define NV_MMU_NEW_PTE_KIND                                           63:56 /* RWXVF */
+#define NV_MMU_NEW_PTE_ADDRESS_SHIFT                             0x0000000c /*       */
+#define NV_MMU_NEW_PTE__SIZE                                              8
+#define NV_MMU_VER2_PDE                                                      /* ----G */
+#define NV_MMU_VER2_PDE_IS_PTE                                           0:0 /* RWXVF */
+#define NV_MMU_VER2_PDE_IS_PTE_TRUE                                      0x1 /* RW--V */
+#define NV_MMU_VER2_PDE_IS_PTE_FALSE                                     0x0 /* RW--V */
+#define NV_MMU_VER2_PDE_IS_PDE                                           0:0 /* RWXVF */
+#define NV_MMU_VER2_PDE_IS_PDE_TRUE                                      0x0 /* RW--V */
+#define NV_MMU_VER2_PDE_IS_PDE_FALSE                                     0x1 /* RW--V */
+#define NV_MMU_VER2_PDE_VALID                                            0:0 /* RWXVF */
+#define NV_MMU_VER2_PDE_VALID_TRUE                                       0x1 /* RW--V */
+#define NV_MMU_VER2_PDE_VALID_FALSE                                      0x0 /* RW--V */
+#define NV_MMU_VER2_PDE_APERTURE                                         2:1 /* RWXVF */
+#define NV_MMU_VER2_PDE_APERTURE_INVALID                          0x00000000 /* RW--V */
+#define NV_MMU_VER2_PDE_APERTURE_VIDEO_MEMORY                     0x00000001 /* RW--V */
+#define NV_MMU_VER2_PDE_APERTURE_SYSTEM_COHERENT_MEMORY           0x00000002 /* RW--V */
+#define NV_MMU_VER2_PDE_APERTURE_SYSTEM_NON_COHERENT_MEMORY       0x00000003 /* RW--V */
+#define NV_MMU_VER2_PDE_VOL                                              3:3 /* RWXVF */
+#define NV_MMU_VER2_PDE_VOL_TRUE                                  0x00000001 /* RW--V */
+#define NV_MMU_VER2_PDE_VOL_FALSE                                 0x00000000 /* RW--V */
+#define NV_MMU_VER2_PDE_NO_ATS                                           5:5 /* RWXVF */
+#define NV_MMU_VER2_PDE_NO_ATS_TRUE                                      0x1 /* RW--V */
+#define NV_MMU_VER2_PDE_NO_ATS_FALSE                                     0x0 /* RW--V */
+#define NV_MMU_VER2_PDE_ADDRESS_SYS                                     53:8 /* RWXVF */
+#define NV_MMU_VER2_PDE_ADDRESS_VID             (35-3):8 /* RWXVF */
+#define NV_MMU_VER2_PDE_ADDRESS_VID_PEER       35:(36-3) /* RWXVF */
+#define NV_MMU_VER2_PDE_ADDRESS_VID_PEER_0                        0x00000000 /* RW--V */
+#define NV_MMU_VER2_PDE_ADDRESS_SHIFT                             0x0000000c /*       */
+#define NV_MMU_VER2_PDE__SIZE                                              8
+#define NV_MMU_VER2_DUAL_PDE                                                      /* ----G */
+#define NV_MMU_VER2_DUAL_PDE_IS_PTE                                           0:0 /* RWXVF */
+#define NV_MMU_VER2_DUAL_PDE_IS_PTE_TRUE                                      0x1 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_IS_PTE_FALSE                                     0x0 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_IS_PDE                                           0:0 /* RWXVF */
+#define NV_MMU_VER2_DUAL_PDE_IS_PDE_TRUE                                      0x0 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_IS_PDE_FALSE                                     0x1 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_VALID                                            0:0 /* RWXVF */
+#define NV_MMU_VER2_DUAL_PDE_VALID_TRUE                                       0x1 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_VALID_FALSE                                      0x0 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_APERTURE_BIG                                     2:1 /* RWXVF */
+#define NV_MMU_VER2_DUAL_PDE_APERTURE_BIG_INVALID                      0x00000000 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_APERTURE_BIG_VIDEO_MEMORY                 0x00000001 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_APERTURE_BIG_SYSTEM_COHERENT_MEMORY       0x00000002 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_APERTURE_BIG_SYSTEM_NON_COHERENT_MEMORY   0x00000003 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_VOL_BIG                                          3:3 /* RWXVF */
+#define NV_MMU_VER2_DUAL_PDE_VOL_BIG_TRUE                              0x00000001 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_VOL_BIG_FALSE                             0x00000000 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_NO_ATS                                      5:5 /* RWXVF */
+#define NV_MMU_VER2_DUAL_PDE_NO_ATS_TRUE                                 0x1 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_NO_ATS_FALSE                                0x0 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_ADDRESS_BIG_SYS                                 53:(8-4) /* RWXVF */
+#define NV_MMU_VER2_DUAL_PDE_ADDRESS_BIG_VID         (35-3):(8-4) /* RWXVF */
+#define NV_MMU_VER2_DUAL_PDE_ADDRESS_BIG_VID_PEER   35:(36-3) /* RWXVF */
+#define NV_MMU_VER2_DUAL_PDE_ADDRESS_BIG_VID_PEER_0                    0x00000000 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_APERTURE_SMALL                                 66:65 /* RWXVF */
+#define NV_MMU_VER2_DUAL_PDE_APERTURE_SMALL_INVALID                    0x00000000 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_APERTURE_SMALL_VIDEO_MEMORY               0x00000001 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_APERTURE_SMALL_SYSTEM_COHERENT_MEMORY     0x00000002 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_APERTURE_SMALL_SYSTEM_NON_COHERENT_MEMORY 0x00000003 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_VOL_SMALL                                      67:67 /* RWXVF */
+#define NV_MMU_VER2_DUAL_PDE_VOL_SMALL_TRUE                            0x00000001 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_VOL_SMALL_FALSE                           0x00000000 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_ADDRESS_SMALL_SYS                             117:72 /* RWXVF */
+#define NV_MMU_VER2_DUAL_PDE_ADDRESS_SMALL_VID      (99-3):72 /* RWXVF */
+#define NV_MMU_VER2_DUAL_PDE_ADDRESS_SMALL_VID_PEER 99:(100-3) /* RWXVF */
+#define NV_MMU_VER2_DUAL_PDE_ADDRESS_SMALL_VID_PEER_0                  0x00000000 /* RW--V */
+#define NV_MMU_VER2_DUAL_PDE_ADDRESS_SHIFT                             0x0000000c /*       */
+#define NV_MMU_VER2_DUAL_PDE_ADDRESS_BIG_SHIFT 8 /*       */
+#define NV_MMU_VER2_DUAL_PDE__SIZE                                             16
+#define NV_MMU_VER2_PTE                                                      /* ----G */
+#define NV_MMU_VER2_PTE_VALID                                            0:0 /* RWXVF */
+#define NV_MMU_VER2_PTE_VALID_TRUE                                       0x1 /* RW--V */
+#define NV_MMU_VER2_PTE_VALID_FALSE                                      0x0 /* RW--V */
+#define NV_MMU_VER2_PTE_APERTURE                                         2:1 /* RWXVF */
+#define NV_MMU_VER2_PTE_APERTURE_VIDEO_MEMORY                     0x00000000 /* RW--V */
+#define NV_MMU_VER2_PTE_APERTURE_PEER_MEMORY                      0x00000001 /* RW--V */
+#define NV_MMU_VER2_PTE_APERTURE_SYSTEM_COHERENT_MEMORY           0x00000002 /* RW--V */
+#define NV_MMU_VER2_PTE_APERTURE_SYSTEM_NON_COHERENT_MEMORY       0x00000003 /* RW--V */
+#define NV_MMU_VER2_PTE_VOL                                              3:3 /* RWXVF */
+#define NV_MMU_VER2_PTE_VOL_TRUE                                  0x00000001 /* RW--V */
+#define NV_MMU_VER2_PTE_VOL_FALSE                                 0x00000000 /* RW--V */
+#define NV_MMU_VER2_PTE_ENCRYPTED                                        4:4 /* RWXVF */
+#define NV_MMU_VER2_PTE_ENCRYPTED_TRUE                            0x00000001 /* R---V */
+#define NV_MMU_VER2_PTE_ENCRYPTED_FALSE                           0x00000000 /* R---V */
+#define NV_MMU_VER2_PTE_PRIVILEGE                                        5:5 /* RWXVF */
+#define NV_MMU_VER2_PTE_PRIVILEGE_TRUE                                   0x1 /* RW--V */
+#define NV_MMU_VER2_PTE_PRIVILEGE_FALSE                                  0x0 /* RW--V */
+#define NV_MMU_VER2_PTE_READ_ONLY                                        6:6 /* RWXVF */
+#define NV_MMU_VER2_PTE_READ_ONLY_TRUE                                   0x1 /* RW--V */
+#define NV_MMU_VER2_PTE_READ_ONLY_FALSE                                  0x0 /* RW--V */
+#define NV_MMU_VER2_PTE_ATOMIC_DISABLE                                   7:7 /* RWXVF */
+#define NV_MMU_VER2_PTE_ATOMIC_DISABLE_TRUE                              0x1 /* RW--V */
+#define NV_MMU_VER2_PTE_ATOMIC_DISABLE_FALSE                             0x0 /* RW--V */
+#define NV_MMU_VER2_PTE_ADDRESS_SYS                                     53:8 /* RWXVF */
+#define NV_MMU_VER2_PTE_ADDRESS_VID             (35-3):8 /* RWXVF */
+#define NV_MMU_VER2_PTE_ADDRESS_VID_PEER       35:(36-3) /* RWXVF */
+#define NV_MMU_VER2_PTE_ADDRESS_VID_PEER_0                        0x00000000 /* RW--V */
+#define NV_MMU_VER2_PTE_ADDRESS_VID_PEER_1                        0x00000001 /* RW--V */
+#define NV_MMU_VER2_PTE_ADDRESS_VID_PEER_2                        0x00000002 /* RW--V */
+#define NV_MMU_VER2_PTE_ADDRESS_VID_PEER_3                        0x00000003 /* RW--V */
+#define NV_MMU_VER2_PTE_ADDRESS_VID_PEER_4                        0x00000004 /* RW--V */
+#define NV_MMU_VER2_PTE_ADDRESS_VID_PEER_5                        0x00000005 /* RW--V */
+#define NV_MMU_VER2_PTE_ADDRESS_VID_PEER_6                        0x00000006 /* RW--V */
+#define NV_MMU_VER2_PTE_ADDRESS_VID_PEER_7                        0x00000007 /* RW--V */
+#define NV_MMU_VER2_PTE_COMPTAGLINE   (20+35):36 /* RWXVF */
+#define NV_MMU_VER2_PTE_KIND                                           63:56 /* RWXVF */
+#define NV_MMU_VER2_PTE_ADDRESS_SHIFT                             0x0000000c /*       */
+#define NV_MMU_VER2_PTE__SIZE                                              8
+#define NV_MMU_VER3_PDE                                                      /* ----G */
+#define NV_MMU_VER3_PDE_IS_PTE                                           0:0 /* RWXVF */
+#define NV_MMU_VER3_PDE_IS_PTE_TRUE                                      0x1 /* RW--V */
+#define NV_MMU_VER3_PDE_IS_PTE_FALSE                                     0x0 /* RW--V */
+#define NV_MMU_VER3_PDE_VALID                                            0:0 /* RWXVF */
+#define NV_MMU_VER3_PDE_VALID_TRUE                                       0x1 /* RW--V */
+#define NV_MMU_VER3_PDE_VALID_FALSE                                      0x0 /* RW--V */
+#define NV_MMU_VER3_PDE_APERTURE                                         2:1 /* RWXVF */
+#define NV_MMU_VER3_PDE_APERTURE_INVALID                          0x00000000 /* RW--V */
+#define NV_MMU_VER3_PDE_APERTURE_VIDEO_MEMORY                     0x00000001 /* RW--V */
+#define NV_MMU_VER3_PDE_APERTURE_SYSTEM_COHERENT_MEMORY           0x00000002 /* RW--V */
+#define NV_MMU_VER3_PDE_APERTURE_SYSTEM_NON_COHERENT_MEMORY       0x00000003 /* RW--V */
+#define NV_MMU_VER3_PDE_PCF                                                                        5:3 /* RWXVF */
+#define NV_MMU_VER3_PDE_PCF_VALID_CACHED_ATS_ALLOWED__OR__INVALID_ATS_ALLOWED               0x00000000 /* RW--V */
+#define NV_MMU_VER3_PDE_PCF_VALID_CACHED_ATS_ALLOWED                                        0x00000000 /* RW--V */
+#define NV_MMU_VER3_PDE_PCF_INVALID_ATS_ALLOWED                                             0x00000000 /* RW--V */
+#define NV_MMU_VER3_PDE_PCF_VALID_UNCACHED_ATS_ALLOWED__OR__SPARSE_ATS_ALLOWED              0x00000001 /* RW--V */
+#define NV_MMU_VER3_PDE_PCF_VALID_UNCACHED_ATS_ALLOWED                                      0x00000001 /* RW--V */
+#define NV_MMU_VER3_PDE_PCF_SPARSE_ATS_ALLOWED                                              0x00000001 /* RW--V */
+#define NV_MMU_VER3_PDE_PCF_VALID_CACHED_ATS_NOT_ALLOWED__OR__INVALID_ATS_NOT_ALLOWED       0x00000002 /* RW--V */
+#define NV_MMU_VER3_PDE_PCF_VALID_CACHED_ATS_NOT_ALLOWED                                    0x00000002 /* RW--V */
+#define NV_MMU_VER3_PDE_PCF_INVALID_ATS_NOT_ALLOWED                                         0x00000002 /* RW--V */
+#define NV_MMU_VER3_PDE_PCF_VALID_UNCACHED_ATS_NOT_ALLOWED__OR__SPARSE_ATS_NOT_ALLOWED      0x00000003 /* RW--V */
+#define NV_MMU_VER3_PDE_PCF_VALID_UNCACHED_ATS_NOT_ALLOWED                                  0x00000003 /* RW--V */
+#define NV_MMU_VER3_PDE_PCF_SPARSE_ATS_NOT_ALLOWED                                          0x00000003 /* RW--V */
+#define NV_MMU_VER3_PDE_ADDRESS                                             51:12 /* RWXVF */
+#define NV_MMU_VER3_PDE_ADDRESS_SHIFT                                  0x0000000c /*       */
+#define NV_MMU_VER3_PDE__SIZE                                              8
+#define NV_MMU_VER3_DUAL_PDE                                                      /* ----G */
+#define NV_MMU_VER3_DUAL_PDE_IS_PTE                                           0:0 /* RWXVF */
+#define NV_MMU_VER3_DUAL_PDE_IS_PTE_TRUE                                      0x1 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_IS_PTE_FALSE                                     0x0 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_VALID                                            0:0 /* RWXVF */
+#define NV_MMU_VER3_DUAL_PDE_VALID_TRUE                                       0x1 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_VALID_FALSE                                      0x0 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_APERTURE_BIG                                     2:1 /* RWXVF */
+#define NV_MMU_VER3_DUAL_PDE_APERTURE_BIG_INVALID                      0x00000000 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_APERTURE_BIG_VIDEO_MEMORY                 0x00000001 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_APERTURE_BIG_SYSTEM_COHERENT_MEMORY       0x00000002 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_APERTURE_BIG_SYSTEM_NON_COHERENT_MEMORY   0x00000003 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_BIG                                                                        5:3 /* RWXVF */
+#define NV_MMU_VER3_DUAL_PDE_PCF_BIG_VALID_CACHED_ATS_ALLOWED__OR__INVALID_ATS_ALLOWED               0x00000000 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_BIG_VALID_CACHED_ATS_ALLOWED                                        0x00000000 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_BIG_INVALID_ATS_ALLOWED                                             0x00000000 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_BIG_VALID_UNCACHED_ATS_ALLOWED__OR__SPARSE_ATS_ALLOWED              0x00000001 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_BIG_VALID_UNCACHED_ATS_ALLOWED                                      0x00000001 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_BIG_SPARSE_ATS_ALLOWED                                              0x00000001 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_BIG_VALID_CACHED_ATS_NOT_ALLOWED__OR__INVALID_ATS_NOT_ALLOWED       0x00000002 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_BIG_VALID_CACHED_ATS_NOT_ALLOWED                                    0x00000002 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_BIG_INVALID_ATS_NOT_ALLOWED                                         0x00000002 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_BIG_VALID_UNCACHED_ATS_NOT_ALLOWED__OR__SPARSE_ATS_NOT_ALLOWED      0x00000003 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_BIG_VALID_UNCACHED_ATS_NOT_ALLOWED                                  0x00000003 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_BIG_SPARSE_ATS_NOT_ALLOWED                                          0x00000003 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_ADDRESS_BIG                                     51:8 /* RWXVF */
+#define NV_MMU_VER3_DUAL_PDE_APERTURE_SMALL                                 66:65 /* RWXVF */
+#define NV_MMU_VER3_DUAL_PDE_APERTURE_SMALL_INVALID                    0x00000000 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_APERTURE_SMALL_VIDEO_MEMORY               0x00000001 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_APERTURE_SMALL_SYSTEM_COHERENT_MEMORY     0x00000002 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_APERTURE_SMALL_SYSTEM_NON_COHERENT_MEMORY 0x00000003 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_SMALL                                                                      69:67 /* RWXVF */
+#define NV_MMU_VER3_DUAL_PDE_PCF_SMALL_VALID_CACHED_ATS_ALLOWED__OR__INVALID_ATS_ALLOWED               0x00000000 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_SMALL_VALID_CACHED_ATS_ALLOWED                                        0x00000000 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_SMALL_INVALID_ATS_ALLOWED                                             0x00000000 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_SMALL_VALID_UNCACHED_ATS_ALLOWED__OR__SPARSE_ATS_ALLOWED              0x00000001 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_SMALL_VALID_UNCACHED_ATS_ALLOWED                                      0x00000001 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_SMALL_SPARSE_ATS_ALLOWED                                              0x00000001 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_SMALL_VALID_CACHED_ATS_NOT_ALLOWED__OR__INVALID_ATS_NOT_ALLOWED       0x00000002 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_SMALL_VALID_CACHED_ATS_NOT_ALLOWED                                    0x00000002 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_SMALL_INVALID_ATS_NOT_ALLOWED                                         0x00000002 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_SMALL_VALID_UNCACHED_ATS_NOT_ALLOWED__OR__SPARSE_ATS_NOT_ALLOWED      0x00000003 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_SMALL_VALID_UNCACHED_ATS_NOT_ALLOWED                                  0x00000003 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_PCF_SMALL_SPARSE_ATS_NOT_ALLOWED                                          0x00000003 /* RW--V */
+#define NV_MMU_VER3_DUAL_PDE_ADDRESS_SMALL                                 115:76 /* RWXVF */
+#define NV_MMU_VER3_DUAL_PDE_ADDRESS_SHIFT                             0x0000000c /*       */
+#define NV_MMU_VER3_DUAL_PDE_ADDRESS_BIG_SHIFT 8 /*       */
+#define NV_MMU_VER3_DUAL_PDE__SIZE                                             16
+#define NV_MMU_VER3_PTE                                                      /* ----G */
+#define NV_MMU_VER3_PTE_VALID                                            0:0 /* RWXVF */
+#define NV_MMU_VER3_PTE_VALID_TRUE                                       0x1 /* RW--V */
+#define NV_MMU_VER3_PTE_VALID_FALSE                                      0x0 /* RW--V */
+#define NV_MMU_VER3_PTE_APERTURE                                         2:1 /* RWXVF */
+#define NV_MMU_VER3_PTE_APERTURE_VIDEO_MEMORY                     0x00000000 /* RW--V */
+#define NV_MMU_VER3_PTE_APERTURE_PEER_MEMORY                      0x00000001 /* RW--V */
+#define NV_MMU_VER3_PTE_APERTURE_SYSTEM_COHERENT_MEMORY           0x00000002 /* RW--V */
+#define NV_MMU_VER3_PTE_APERTURE_SYSTEM_NON_COHERENT_MEMORY       0x00000003 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF                                                                        7:3 /* RWXVF */
+#define NV_MMU_VER3_PTE_PCF_INVALID                                                         0x00000000 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_SPARSE                                                          0x00000001 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_MAPPING_NOWHERE                                                 0x00000002 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_NO_VALID_4KB_PAGE                                               0x00000003 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_REGULAR_RW_ATOMIC_CACHED_ACE                                    0x00000000 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_REGULAR_RW_ATOMIC_UNCACHED_ACE                                  0x00000001 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_PRIVILEGE_RW_ATOMIC_CACHED_ACE                                  0x00000002 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_PRIVILEGE_RW_ATOMIC_UNCACHED_ACE                                0x00000003 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_REGULAR_RO_ATOMIC_CACHED_ACE                                    0x00000004 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_REGULAR_RO_ATOMIC_UNCACHED_ACE                                  0x00000005 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_PRIVILEGE_RO_ATOMIC_CACHED_ACE                                  0x00000006 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_PRIVILEGE_RO_ATOMIC_UNCACHED_ACE                                0x00000007 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_REGULAR_RW_NO_ATOMIC_CACHED_ACE                                 0x00000008 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_REGULAR_RW_NO_ATOMIC_UNCACHED_ACE                               0x00000009 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_PRIVILEGE_RW_NO_ATOMIC_CACHED_ACE                               0x0000000A /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_PRIVILEGE_RW_NO_ATOMIC_UNCACHED_ACE                             0x0000000B /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_REGULAR_RO_NO_ATOMIC_CACHED_ACE                                 0x0000000C /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_REGULAR_RO_NO_ATOMIC_UNCACHED_ACE                               0x0000000D /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_PRIVILEGE_RO_NO_ATOMIC_CACHED_ACE                               0x0000000E /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_PRIVILEGE_RO_NO_ATOMIC_UNCACHED_ACE                             0x0000000F /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_REGULAR_RW_ATOMIC_CACHED_ACD                                    0x00000010 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_REGULAR_RW_ATOMIC_UNCACHED_ACD                                  0x00000011 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_PRIVILEGE_RW_ATOMIC_CACHED_ACD                                  0x00000012 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_PRIVILEGE_RW_ATOMIC_UNCACHED_ACD                                0x00000013 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_REGULAR_RO_ATOMIC_CACHED_ACD                                    0x00000014 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_REGULAR_RO_ATOMIC_UNCACHED_ACD                                  0x00000015 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_PRIVILEGE_RO_ATOMIC_CACHED_ACD                                  0x00000016 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_PRIVILEGE_RO_ATOMIC_UNCACHED_ACD                                0x00000017 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_REGULAR_RW_NO_ATOMIC_CACHED_ACD                                 0x00000018 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_REGULAR_RW_NO_ATOMIC_UNCACHED_ACD                               0x00000019 /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_PRIVILEGE_RW_NO_ATOMIC_CACHED_ACD                               0x0000001A /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_PRIVILEGE_RW_NO_ATOMIC_UNCACHED_ACD                             0x0000001B /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_REGULAR_RO_NO_ATOMIC_CACHED_ACD                                 0x0000001C /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_REGULAR_RO_NO_ATOMIC_UNCACHED_ACD                               0x0000001D /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_PRIVILEGE_RO_NO_ATOMIC_CACHED_ACD                               0x0000001E /* RW--V */
+#define NV_MMU_VER3_PTE_PCF_PRIVILEGE_RO_NO_ATOMIC_UNCACHED_ACD                             0x0000001F /* RW--V */
+#define NV_MMU_VER3_PTE_KIND                                           11:8 /* RWXVF */
+#define NV_MMU_VER3_PTE_ADDRESS                                         51:12 /* RWXVF */
+#define NV_MMU_VER3_PTE_ADDRESS_SYS                                     51:12 /* RWXVF */
+#define NV_MMU_VER3_PTE_ADDRESS_PEER                                    51:12 /* RWXVF */
+#define NV_MMU_VER3_PTE_ADDRESS_VID                                     39:12 /* RWXVF */
+#define NV_MMU_VER3_PTE_PEER_ID                63:(64-3) /* RWXVF */
+#define NV_MMU_VER3_PTE_PEER_ID_0                                 0x00000000 /* RW--V */
+#define NV_MMU_VER3_PTE_PEER_ID_1                                 0x00000001 /* RW--V */
+#define NV_MMU_VER3_PTE_PEER_ID_2                                 0x00000002 /* RW--V */
+#define NV_MMU_VER3_PTE_PEER_ID_3                                 0x00000003 /* RW--V */
+#define NV_MMU_VER3_PTE_PEER_ID_4                                 0x00000004 /* RW--V */
+#define NV_MMU_VER3_PTE_PEER_ID_5                                 0x00000005 /* RW--V */
+#define NV_MMU_VER3_PTE_PEER_ID_6                                 0x00000006 /* RW--V */
+#define NV_MMU_VER3_PTE_PEER_ID_7                                 0x00000007 /* RW--V */
+#define NV_MMU_VER3_PTE_ADDRESS_SHIFT                             0x0000000c /*       */
+#define NV_MMU_VER3_PTE__SIZE                                              8
+#define NV_MMU_CLIENT                                             /* ----G */
+#define NV_MMU_CLIENT_KIND                                    2:0 /* RWXVF */
+#define NV_MMU_CLIENT_KIND_Z16                                0x1 /* R---V */
+#define NV_MMU_CLIENT_KIND_S8                                 0x2 /* R---V */
+#define NV_MMU_CLIENT_KIND_S8Z24                              0x3 /* R---V */
+#define NV_MMU_CLIENT_KIND_ZF32_X24S8                         0x4 /* R---V */
+#define NV_MMU_CLIENT_KIND_Z24S8                              0x5 /* R---V */
+#define NV_MMU_CLIENT_KIND_GENERIC_MEMORY                     0x6 /* R---V */
+#define NV_MMU_CLIENT_KIND_INVALID                            0x7 /* R---V */
+#endif // __gb100_dev_mmu_h__
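Review note: the NV_MMU_VER3_PTE defines above describe each field as a high:low bit range within an 8-byte entry (NV_MMU_VER3_PTE__SIZE). The following is a minimal sketch of how those ranges could be decoded by hand. It does not use the driver's own field-access macros; `pte_bits`, `struct ver3_pte_fields` and `decode_ver3_pte` are hypothetical names introduced only for illustration. It also assumes that NV_MMU_VER3_PTE_ADDRESS_SHIFT (0xc) means the address field stores the byte address shifted right by 12, and that the video-memory aperture uses the narrower ADDRESS_VID range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a driver API): return bits hi..lo of a 64-bit word. */
static uint64_t pte_bits(uint64_t pte, unsigned hi, unsigned lo)
{
    assert(lo <= hi && hi < 64);
    unsigned width = hi - lo + 1;
    uint64_t mask = (width == 64) ? ~UINT64_C(0) : ((UINT64_C(1) << width) - 1);
    return (pte >> lo) & mask;
}

/* Hypothetical container for the decoded fields. */
struct ver3_pte_fields {
    unsigned valid;     /* NV_MMU_VER3_PTE_VALID     0:0   */
    unsigned aperture;  /* NV_MMU_VER3_PTE_APERTURE  2:1   */
    unsigned pcf;       /* NV_MMU_VER3_PTE_PCF       7:3   */
    unsigned kind;      /* NV_MMU_VER3_PTE_KIND      11:8  */
    uint64_t address;   /* byte address (assumed: field value << 12) */
    unsigned peer_id;   /* NV_MMU_VER3_PTE_PEER_ID   63:61 */
};

static struct ver3_pte_fields decode_ver3_pte(uint64_t pte)
{
    struct ver3_pte_fields f;

    f.valid    = (unsigned)pte_bits(pte, 0, 0);
    f.aperture = (unsigned)pte_bits(pte, 2, 1);
    f.pcf      = (unsigned)pte_bits(pte, 7, 3);
    f.kind     = (unsigned)pte_bits(pte, 11, 8);

    /* Assumption: aperture 0 (VIDEO_MEMORY) uses ADDRESS_VID (39:12),
     * every other aperture uses the 51:12 range. */
    unsigned addr_hi = (f.aperture == 0) ? 39 : 51;
    f.address = pte_bits(pte, addr_hi, 12) << 12;

    f.peer_id  = (unsigned)pte_bits(pte, 63, 61);
    return f;
}
```

Note that the PCF value names overlap numerically (for example, INVALID and REGULAR_RW_ATOMIC_CACHED_ACE are both 0x0), so a decoded `pcf` presumably has to be interpreted together with `valid`; the sketch returns the raw value and leaves that interpretation to the caller.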
--- a/kernel-open/nvidia-uvm/nv-kthread-q.c
+++ b/kernel-open/nvidia-uvm/nv-kthread-q.c
@@ -1,5 +1,5 @@
 /*
- * SPDX-FileCopyrightText: Copyright (c) 2016 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2016-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
@@ -176,7 +176,7 @@ static struct task_struct *thread_create_on_node(int (*threadfn)(void *data),
 {

    unsigned i, j;
-    const static unsigned attempts = 3;
+    static const unsigned attempts = 3;
    struct task_struct *thread[3];

    for (i = 0;; i++) {
--- a/kernel-open/nvidia-uvm/nvidia-uvm-sources.Kbuild
+++ b/kernel-open/nvidia-uvm/nvidia-uvm-sources.Kbuild
@@ -6,6 +6,10 @@ NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_conf_computing.c
 NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_sec2_test.c
 NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_maxwell_sec2.c
 NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_hopper_sec2.c
+NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_blackwell.c
+NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_blackwell_fault_buffer.c
+NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_blackwell_mmu.c
+NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_blackwell_host.c
 NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_common.c
 NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_linux.c
 NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_debug_optimized.c
@@ -72,6 +76,7 @@ NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_turing_host.c
 NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_ampere.c
 NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_ampere_ce.c
 NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_ampere_host.c
+NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_ampere_fault_buffer.c
 NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_ampere_mmu.c
 NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_hopper.c
 NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_hopper_fault_buffer.c
--- a/kernel-open/nvidia-uvm/nvidia-uvm.Kbuild
+++ b/kernel-open/nvidia-uvm/nvidia-uvm.Kbuild
@@ -82,12 +82,12 @@ NV_CONFTEST_FUNCTION_COMPILE_TESTS += set_pages_uc
 NV_CONFTEST_FUNCTION_COMPILE_TESTS += ktime_get_raw_ts64
 NV_CONFTEST_FUNCTION_COMPILE_TESTS += ioasid_get
 NV_CONFTEST_FUNCTION_COMPILE_TESTS += mm_pasid_drop
-NV_CONFTEST_FUNCTION_COMPILE_TESTS += migrate_vma_setup
 NV_CONFTEST_FUNCTION_COMPILE_TESTS += mmget_not_zero
 NV_CONFTEST_FUNCTION_COMPILE_TESTS += mmgrab
 NV_CONFTEST_FUNCTION_COMPILE_TESTS += iommu_sva_bind_device_has_drvdata_arg
 NV_CONFTEST_FUNCTION_COMPILE_TESTS += vm_fault_to_errno
 NV_CONFTEST_FUNCTION_COMPILE_TESTS += find_next_bit_wrap
+NV_CONFTEST_FUNCTION_COMPILE_TESTS += iommu_is_dma_domain

 NV_CONFTEST_TYPE_COMPILE_TESTS += backing_dev_info
 NV_CONFTEST_TYPE_COMPILE_TESTS += mm_context_t
@@ -114,5 +114,7 @@ NV_CONFTEST_TYPE_COMPILE_TESTS += mempolicy_has_unified_nodes
 NV_CONFTEST_TYPE_COMPILE_TESTS += mempolicy_has_home_node
 NV_CONFTEST_TYPE_COMPILE_TESTS += mpol_preferred_many_present
 NV_CONFTEST_TYPE_COMPILE_TESTS += mmu_interval_notifier
+NV_CONFTEST_TYPE_COMPILE_TESTS += fault_flag_remote_present

 NV_CONFTEST_SYMBOL_COMPILE_TESTS += is_export_symbol_present_int_active_memcg
+NV_CONFTEST_SYMBOL_COMPILE_TESTS += is_export_symbol_present_migrate_vma_setup
--- a/kernel-open/nvidia-uvm/nvstatus.c
+++ b/kernel-open/nvidia-uvm/nvstatus.c
@@ -25,7 +25,8 @@

 #if !defined(NV_PRINTF_STRING_SECTION)
 #if defined(NVRM) && NVOS_IS_LIBOS
-#define NV_PRINTF_STRING_SECTION         __attribute__ ((section (".logging")))
+#include "libos_log.h"
+#define NV_PRINTF_STRING_SECTION LIBOS_SECTION_LOGGING
 #else // defined(NVRM) && NVOS_IS_LIBOS
 #define NV_PRINTF_STRING_SECTION
 #endif // defined(NVRM) && NVOS_IS_LIBOS
@@ -33,7 +34,7 @@

 /*
 * Include nvstatuscodes.h twice.  Once for creating constant strings in the
- * the NV_PRINTF_STRING_SECTION section of the ececutable, and once to build
+ * NV_PRINTF_STRING_SECTION section of the executable, and once to build
 * the g_StatusCodeList table.
 */
 #undef NV_STATUS_CODE
--- a/kernel-open/nvidia-uvm/uvm.c
+++ b/kernel-open/nvidia-uvm/uvm.c
@@ -1,5 +1,5 @@
 /*******************************************************************************
-    Copyright (c) 2015-2022 NVIDIA Corporation
+    Copyright (c) 2015-2023 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -1053,7 +1053,7 @@ NV_STATUS uvm_test_register_unload_state_buffer(UVM_TEST_REGISTER_UNLOAD_STATE_B
    // are not used because unload_state_buf may be a managed memory pointer and
    // therefore a locking assertion from the CPU fault handler could be fired.
    nv_mmap_read_lock(current->mm);
-    ret = NV_PIN_USER_PAGES(params->unload_state_buf, 1, FOLL_WRITE, &page, NULL);
+    ret = NV_PIN_USER_PAGES(params->unload_state_buf, 1, FOLL_WRITE, &page);
    nv_mmap_read_unlock(current->mm);

    if (ret < 0)
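Review note: the uvm.h changes that follow raise UVM_API_LATEST_REVISION to 12 and gate the old one-argument UvmRegisterGpu prototype behind UVM_API_REV_IS_AT_MOST(8). The sketch below shows how a caller could compile against either revision. It is an illustration under stated assumptions, not the driver's code: the `sketch_*` types and `stub_register_gpu` are hypothetical stand-ins, the definition of UVM_API_REV_IS_AT_MOST is assumed, and the sketch defaults UVM_API_REVISION where the real header emits an #error instead.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types; the real definitions live in the UVM headers. */
typedef int sketch_status;
typedef struct { unsigned char bytes[16]; } sketch_uuid;
typedef struct { int unused; } sketch_platform_params;

#define UVM_API_LATEST_REVISION 12

/* The real header refuses to compile if UVM_API_REVISION is undefined;
 * this sketch defaults to the latest revision for convenience. */
#ifndef UVM_API_REVISION
#define UVM_API_REVISION UVM_API_LATEST_REVISION
#endif

/* Assumed definition of the gating macro. */
#define UVM_API_REV_IS_AT_MOST(rev) (UVM_API_REVISION <= (rev))

static int stub_calls; /* counts calls so the behaviour can be checked */

#if UVM_API_REV_IS_AT_MOST(8)
/* Old shape: physical GPU UUID only. */
static sketch_status stub_register_gpu(const sketch_uuid *gpuUuid)
{
    assert(gpuUuid != NULL);
    stub_calls++;
    return 0;
}
#else
/* New shape: platformParams identifies the partition, or is NULL
 * when the GPU is not partitioned. */
static sketch_status stub_register_gpu(const sketch_uuid *gpuUuid,
                                       const sketch_platform_params *platformParams)
{
    assert(gpuUuid != NULL);
    (void)platformParams;
    stub_calls++;
    return 0;
}
#endif

/* Caller that builds against either revision of the API. */
static sketch_status register_whole_gpu(const sketch_uuid *gpuUuid)
{
#if UVM_API_REV_IS_AT_MOST(8)
    return stub_register_gpu(gpuUuid);
#else
    return stub_register_gpu(gpuUuid, NULL);
#endif
}
```

Building with `-DUVM_API_REVISION=8` selects the one-argument path; leaving it undefined selects the two-argument path in this sketch.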
--- a/kernel-open/nvidia-uvm/uvm.h
+++ b/kernel-open/nvidia-uvm/uvm.h
@@ -1,5 +1,5 @@
 /*******************************************************************************
-    Copyright (c) 2013-2022 NVIDIA Corporation
+    Copyright (c) 2013-2024 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -45,16 +45,20 @@
 //     #endif
 // 3) Do the same thing for the function definition, and for any structs that
 //    are taken as arguments to these functions.
-// 4) Let this change propagate over to cuda_a, so that the CUDA driver can
-//    start using the new API by bumping up the API version number its using.
-//    This can be found in gpgpu/cuda/cuda.nvmk.
-// 5) Once the cuda_a changes have made it back into chips_a, remove the old API
-//    declaration, definition, and any old structs that were in use.
+// 4) Let this change propagate over to cuda_a and dev_a, so that the CUDA and
+//    nvidia-cfg libraries can start using the new API by bumping up the API
+//    version number they're using.
+//    Places where UVM_API_REVISION is defined are:
+//      drivers/gpgpu/cuda/cuda.nvmk (cuda_a)
+//      drivers/setup/linux/nvidia-cfg/makefile.nvmk (dev_a)
+// 5) Once the dev_a and cuda_a changes have made it back into chips_a,
+//    remove the old API declaration, definition, and any old structs that were
+//    in use.

 #ifndef _UVM_H_
 #define _UVM_H_

-#define UVM_API_LATEST_REVISION 8
+#define UVM_API_LATEST_REVISION 12

 #if !defined(UVM_API_REVISION)
 #error "please define UVM_API_REVISION macro to a desired version number or UVM_API_LATEST_REVISION macro"
@@ -163,7 +167,7 @@ NV_STATUS UvmSetDriverVersion(NvU32 major, NvU32 changelist);
 //
 // Error codes:
 //     NV_ERR_NOT_SUPPORTED:
-//         The Linux kernel is not able to support UVM. This could be because
+//         The kernel is not able to support UVM. This could be because
 //         the kernel is too old, or because it lacks a feature that UVM
 //         requires. The kernel log will have details.
 //
@@ -180,12 +184,8 @@ NV_STATUS UvmSetDriverVersion(NvU32 major, NvU32 changelist);
 //         because it is not very informative.
 //
 //------------------------------------------------------------------------------
-#if UVM_API_REV_IS_AT_MOST(4)
-NV_STATUS UvmInitialize(UvmFileDescriptor fd);
-#else
 NV_STATUS UvmInitialize(UvmFileDescriptor fd,
                        NvU64             flags);
-#endif

 //------------------------------------------------------------------------------
 // UvmDeinitialize
@@ -297,7 +297,9 @@ NV_STATUS UvmIsPageableMemoryAccessSupported(NvBool *pageableMemAccess);
 //
 // Arguments:
 //     gpuUuid: (INPUT)
-//         UUID of the GPU for which pageable memory access support is queried.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, or the GPU instance UUID of the partition for which
+//         pageable memory access support is queried.
 //
 //     pageableMemAccess: (OUTPUT)
 //         Returns true (non-zero) if the GPU represented by gpuUuid supports
@@ -327,9 +329,19 @@ NV_STATUS UvmIsPageableMemoryAccessSupportedOnGpu(const NvProcessorUuid *gpuUuid
 // usage. Calling UvmRegisterGpu multiple times on the same GPU from the same
 // process results in an error.
 //
+// After successfully registering a GPU partition, all subsequent API calls
+// which take a NvProcessorUuid argument (including UvmGpuMappingAttributes),
+// must use the GI partition UUID which can be obtained with
+// NvRmControl(NVC637_CTRL_CMD_GET_UUID). Otherwise, if the GPU is not SMC
+// capable or SMC enabled, the physical GPU UUID must be used.
+//
 // Arguments:
 //     gpuUuid: (INPUT)
-//         UUID of the GPU to register.
+//         UUID of the physical GPU to register.
+//
+//     platformParams: (INPUT)
+//         User handles identifying the GPU partition to register.
+//         This should be NULL if the GPU is not SMC capable or SMC enabled.
 //
 // Error codes:
 //     NV_ERR_NO_MEMORY:
@@ -364,27 +376,31 @@ NV_STATUS UvmIsPageableMemoryAccessSupportedOnGpu(const NvProcessorUuid *gpuUuid
 //         OS state required to register the GPU is not present.
 //
 //     NV_ERR_INVALID_STATE:
-//         OS state required to register the GPU is malformed.
+//         OS state required to register the GPU is malformed, or the partition
+//         identified by the user handles or its configuration changed.
 //
 //     NV_ERR_GENERIC:
 //         Unexpected error. We try hard to avoid returning this error code,
 //         because it is not very informative.
 //
 //------------------------------------------------------------------------------
+#if UVM_API_REV_IS_AT_MOST(8)
 NV_STATUS UvmRegisterGpu(const NvProcessorUuid *gpuUuid);
+#else
+NV_STATUS UvmRegisterGpu(const NvProcessorUuid *gpuUuid,
+                         const UvmGpuPlatformParams *platformParams);
+#endif

+#if UVM_API_REV_IS_AT_MOST(8)
 //------------------------------------------------------------------------------
 // UvmRegisterGpuSmc
 //
 // The same as UvmRegisterGpu, but takes additional parameters to specify the
 // GPU partition being registered if SMC is enabled.
 //
-// TODO: Bug 2844714: Merge UvmRegisterGpuSmc() with UvmRegisterGpu() once
-//       the initial SMC support is in place.
-//
 // Arguments:
 //     gpuUuid: (INPUT)
-//         UUID of the parent GPU of the SMC partition to register.
+//         UUID of the physical GPU of the SMC partition to register.
 //
 //     platformParams: (INPUT)
 //         User handles identifying the partition to register.
@@ -397,6 +413,7 @@ NV_STATUS UvmRegisterGpu(const NvProcessorUuid *gpuUuid);
 //
 NV_STATUS UvmRegisterGpuSmc(const NvProcessorUuid *gpuUuid,
                            const UvmGpuPlatformParams *platformParams);
+#endif

 //------------------------------------------------------------------------------
 // UvmUnregisterGpu
@@ -422,7 +439,8 @@ NV_STATUS UvmRegisterGpuSmc(const NvProcessorUuid *gpuUuid,
 //
 // Arguments:
 //     gpuUuid: (INPUT)
-//         UUID of the GPU to unregister.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, or the GPU instance UUID of the partition to unregister.
 //
 // Error codes:
 //     NV_ERR_INVALID_DEVICE:
@@ -480,7 +498,8 @@ NV_STATUS UvmUnregisterGpu(const NvProcessorUuid *gpuUuid);
 //
 // Arguments:
 //     gpuUuid: (INPUT)
-//         UUID of the GPU to register.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, or the GPU instance UUID of the partition to register.
 //
 //     platformParams: (INPUT)
 //         On Linux: RM ctrl fd, hClient and hVaSpace.
@@ -551,7 +570,9 @@ NV_STATUS UvmRegisterGpuVaSpace(const NvProcessorUuid             *gpuUuid,
 //
 // Arguments:
 //     gpuUuid: (INPUT)
-//         UUID of the GPU whose VA space should be unregistered.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, or the GPU instance UUID of the partition whose VA space
+//         should be unregistered.
 //
 // Error codes:
 //     NV_ERR_INVALID_DEVICE:
@@ -581,7 +602,7 @@ NV_STATUS UvmUnregisterGpuVaSpace(const NvProcessorUuid *gpuUuid);
 //
 // The two GPUs must be connected via PCIe. An error is returned if the GPUs are
 // not connected or are connected over an interconnect different than PCIe
-// (NVLink, for example).
+// (NVLink or SMC partitions, for example).
 //
 // If both GPUs have GPU VA spaces registered for them, the two GPU VA spaces
 // must support the same set of page sizes for GPU mappings.
@@ -594,10 +615,12 @@ NV_STATUS UvmUnregisterGpuVaSpace(const NvProcessorUuid *gpuUuid);
 //
 // Arguments:
 //     gpuUuidA: (INPUT)
-//         UUID of GPU A.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, or the GPU instance UUID of the partition A.
 //
 //     gpuUuidB: (INPUT)
-//         UUID of GPU B.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, or the GPU instance UUID of the partition B.
 //
 // Error codes:
 //     NV_ERR_NO_MEMORY:
@@ -643,10 +666,12 @@ NV_STATUS UvmEnablePeerAccess(const NvProcessorUuid *gpuUuidA,
 //
 // Arguments:
 //     gpuUuidA: (INPUT)
-//         UUID of GPU A.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, or the GPU instance UUID of the partition A.
 //
 //     gpuUuidB: (INPUT)
-//         UUID of GPU B.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, or the GPU instance UUID of the partition B.
 //
 // Error codes:
 //     NV_ERR_INVALID_DEVICE:
@@ -691,7 +716,9 @@ NV_STATUS UvmDisablePeerAccess(const NvProcessorUuid *gpuUuidA,
 //
 // Arguments:
 //     gpuUuid: (INPUT)
-//        UUID of the GPU that the channel is associated with.
+//        UUID of the physical GPU if the GPU is not SMC capable or SMC
+//        enabled, or the GPU instance UUID of the partition that the channel is
+//        associated with.
 //
 //     platformParams: (INPUT)
 //         On Linux: RM ctrl fd, hClient and hChannel.
@@ -1130,11 +1157,14 @@ NV_STATUS UvmAllowMigrationRangeGroups(const NvU64 *rangeGroupIds,
 //         Length, in bytes, of the range.
 //
 //     preferredLocationUuid: (INPUT)
-//         UUID of the preferred location for this VA range.
+//         UUID of the CPU, UUID of the physical GPU if the GPU is not SMC
+//         capable or SMC enabled, or the GPU instance UUID of the partition of
+//         the preferred location for this VA range.
 //
 //     accessedByUuids: (INPUT)
-//         UUIDs of all processors that should have persistent mappings to this
-//         VA range.
+//         UUID of the CPU, UUID of the physical GPUs if the GPUs are not SMC
+//         capable or SMC enabled, or the GPU instance UUID of the partitions
+//         that should have persistent mappings to this VA range.
 //
 //     accessedByCount: (INPUT)
 //         Number of elements in the accessedByUuids array.
@@ -1412,12 +1442,15 @@ NV_STATUS UvmAllocSemaphorePool(void                          *base,
 //         Length, in bytes, of the range.
 //
 //     destinationUuid: (INPUT)
-//         UUID of the destination processor to migrate pages to.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, the GPU instance UUID of the partition, or the CPU UUID to
+//         migrate pages to.
 //
 //     preferredCpuMemoryNode: (INPUT)
 //         Preferred CPU NUMA memory node used if the destination processor is
-//         the CPU. This argument is ignored if the given virtual address range
-//         corresponds to managed memory.
+//         the CPU. -1 indicates no preference, in which case the pages used
+//         can be on any of the available CPU NUMA nodes. If NUMA is disabled,
+//         only 0 and -1 are allowed.
 //
 // Error codes:
 //     NV_ERR_INVALID_ADDRESS:
@@ -1431,6 +1464,11 @@ NV_STATUS UvmAllocSemaphorePool(void                          *base,
 //         The VA range exceeds the largest virtual address supported by the
 //         destination processor.
 //
+//     NV_ERR_INVALID_ARGUMENT:
+//         preferredCpuMemoryNode is not a valid CPU NUMA node or it corresponds
+//         to a NUMA node ID for a registered GPU. If NUMA is disabled, it
+//         indicates that preferredCpuMemoryNode was neither 0 nor -1.
+//
 //     NV_ERR_INVALID_DEVICE:
 //         destinationUuid does not represent a valid processor such as a CPU or
 //         a GPU with a GPU VA space registered for it. Or destinationUuid is a
@@ -1456,16 +1494,10 @@ NV_STATUS UvmAllocSemaphorePool(void                          *base,
 //         pages were associated with a non-migratable range group.
 //
 //------------------------------------------------------------------------------
-#if UVM_API_REV_IS_AT_MOST(5)
-NV_STATUS UvmMigrate(void                  *base,
-                     NvLength               length,
-                     const NvProcessorUuid *destinationUuid);
-#else
 NV_STATUS UvmMigrate(void                  *base,
                     NvLength               length,
                     const NvProcessorUuid *destinationUuid,
                     NvS32                  preferredCpuMemoryNode);
-#endif

 //------------------------------------------------------------------------------
 // UvmMigrateAsync
@@ -1497,12 +1529,15 @@ NV_STATUS UvmMigrate(void                  *base,
 //         Length, in bytes, of the range.
 //
 //     destinationUuid: (INPUT)
-//         UUID of the destination processor to migrate pages to.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, the GPU instance UUID of the partition, or the CPU UUID to
+//         migrate pages to.
 //
 //     preferredCpuMemoryNode: (INPUT)
 //         Preferred CPU NUMA memory node used if the destination processor is
-//         the CPU. This argument is ignored if the given virtual address range
-//         corresponds to managed memory.
+//         the CPU. -1 indicates no preference, in which case the pages used
+//         can be on any of the available CPU NUMA nodes. If NUMA is disabled,
+//         only 0 and -1 are allowed.
 //
 //     semaphoreAddress: (INPUT)
 //         Base address of the semaphore.
@@ -1547,30 +1582,20 @@ NV_STATUS UvmMigrate(void                  *base,
 //         pages were associated with a non-migratable range group.
 //
 //------------------------------------------------------------------------------
-#if UVM_API_REV_IS_AT_MOST(5)
-NV_STATUS UvmMigrateAsync(void                  *base,
-                          NvLength               length,
-                          const NvProcessorUuid *destinationUuid,
-                          void                  *semaphoreAddress,
-                          NvU32                  semaphorePayload);
-#else
 NV_STATUS UvmMigrateAsync(void                  *base,
                          NvLength               length,
                          const NvProcessorUuid *destinationUuid,
                          NvS32                  preferredCpuMemoryNode,
                          void                  *semaphoreAddress,
                          NvU32                  semaphorePayload);
-#endif

 //------------------------------------------------------------------------------
 // UvmMigrateRangeGroup
 //
 // Migrates the backing of all virtual address ranges associated with the given
 // range group to the specified destination processor. The behavior of this API
-// is equivalent to calling UvmMigrate on each VA range associated with this
-// range group. The value for the preferredCpuMemoryNode is irrelevant in this
-// case as it only applies to migrations of pageable address, which cannot be
-// used to create range groups.
+// is equivalent to calling UvmMigrate with preferredCpuMemoryNode = -1 on each
+// VA range associated with this range group.
 //
 // Any errors encountered during migration are returned immediately. No attempt
 // is made to migrate the remaining unmigrated ranges and the ranges that are
@@ -1584,7 +1609,9 @@ NV_STATUS UvmMigrateAsync(void                  *base,
 //         Id of the range group whose associated VA ranges have to be migrated.
 //
 //     destinationUuid: (INPUT)
-//         UUID of the destination processor to migrate pages to.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, the GPU instance UUID of the partition, or the CPU UUID to
+//         migrate pages to.
 //
 // Error codes:
 //     NV_ERR_OBJECT_NOT_FOUND:
@@ -1946,7 +1973,9 @@ NV_STATUS UvmMapExternalAllocation(void                              *base,
 //
 //
 //     gpuUuid: (INPUT)
-//         UUID of the GPU to map the sparse region on.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, or the GPU instance UUID of the partition to map the sparse
+//         region on.
 //
 // Errors:
 //     NV_ERR_INVALID_ADDRESS:
@@ -2003,7 +2032,9 @@ NV_STATUS UvmMapExternalSparse(void                  *base,
 //         The length of the virtual address range.
 //
 //     gpuUuid: (INPUT)
-//         UUID of the GPU to unmap the VA range from.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, or the GPU instance UUID of the partition to unmap the VA
+//         range from.
 //
 // Errors:
 //     NV_ERR_INVALID_ADDRESS:
@@ -2070,7 +2101,9 @@ NV_STATUS UvmUnmapExternalAllocation(void                  *base,
 //         supported by the GPU.
 //
 //     gpuUuid: (INPUT)
-//         UUID of the GPU to map the dynamic parallelism region on.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, or the GPU instance UUID of the partition to map the
+//         dynamic parallelism region on.
 //
 // Errors:
 //     NV_ERR_UVM_ADDRESS_IN_USE:
@@ -2144,7 +2177,8 @@ NV_STATUS UvmMapDynamicParallelismRegion(void                  *base,
 //
 // If any page in the VA range has a preferred location, then the migration and
 // mapping policies associated with this API take precedence over those related
-// to the preferred location.
+// to the preferred location. If the preferred location is a specific CPU NUMA
+// node, that NUMA node will be used for a CPU-resident copy of the page.
 //
 // If any pages in this VA range have any processors present in their
 // accessed-by list, the migration and mapping policies associated with this
@@ -2275,7 +2309,7 @@ NV_STATUS UvmDisableReadDuplication(void     *base,
 // UvmPreventMigrationRangeGroups has not been called on the range group that
 // those pages are associated with, then the migration and mapping policies
 // associated with UvmEnableReadDuplication override the policies outlined
-// above. Note that enabling read duplication on on any pages in this VA range
+// above. Note that enabling read duplication on any pages in this VA range
 // does not clear the state set by this API for those pages. It merely overrides
 // the policies associated with this state until read duplication is disabled
 // for those pages.
@@ -2301,15 +2335,15 @@ NV_STATUS UvmDisableReadDuplication(void     *base,
 //         Length, in bytes, of the range.
 //
 //     preferredLocationUuid: (INPUT)
-//         UUID of the preferred location.
+//         UUID of the preferred location: the UUID of the physical GPU if
+//         the GPU is not SMC capable or SMC enabled, the GPU instance UUID of
+//         the partition, or the CPU UUID.
 //
-//     preferredCpuNumaNode: (INPUT)
+//     preferredCpuMemoryNode: (INPUT)
 //         Preferred CPU NUMA memory node used if preferredLocationUuid is the
 //         UUID of the CPU. -1 is a special value which indicates all CPU nodes
-//         allowed by the global and thread memory policies. This argument is
-//         ignored if preferredLocationUuid refers to a GPU or the given virtual
-//         address range corresponds to managed memory. If NUMA is not enabled,
-//         only 0 or -1 is allowed.
+//         allowed by the global and thread memory policies. If NUMA is disabled,
+//         only 0 and -1 are allowed.
 //
 // Errors:
 //     NV_ERR_INVALID_ADDRESS:
@@ -2339,10 +2373,11 @@ NV_STATUS UvmDisableReadDuplication(void     *base,
 //
 //      NV_ERR_INVALID_ARGUMENT:
 //         One of the following occured:
-//         - preferredLocationUuid is the UUID of a CPU and preferredCpuNumaNode
-//           refers to a registered GPU.
-//         - preferredCpuNumaNode is invalid and preferredLocationUuid is the
-//           UUID of the CPU.
+//         - preferredLocationUuid is the UUID of the CPU and
+//           preferredCpuMemoryNode is either:
+//              - not a valid NUMA node,
+//              - not a possible NUMA node, or
+//              - a NUMA node ID corresponding to a registered GPU.
 //
 //     NV_ERR_NOT_SUPPORTED:
 //         The UVM file descriptor is associated with another process and the
@@ -2353,16 +2388,10 @@ NV_STATUS UvmDisableReadDuplication(void     *base,
 //         because it is not very informative.
 //
 //------------------------------------------------------------------------------
-#if UVM_API_REV_IS_AT_MOST(7)
-NV_STATUS UvmSetPreferredLocation(void                  *base,
-                                  NvLength               length,
-                                  const NvProcessorUuid *preferredLocationUuid);
-#else
 NV_STATUS UvmSetPreferredLocation(void                  *base,
                                  NvLength               length,
                                  const NvProcessorUuid *preferredLocationUuid,
-                                  NvS32                  preferredCpuNumaNode);
-#endif
+                                  NvS32                  preferredCpuMemoryNode);

 //------------------------------------------------------------------------------
 // UvmUnsetPreferredLocation
@@ -2485,8 +2514,9 @@ NV_STATUS UvmUnsetPreferredLocation(void     *base,
 //         Length, in bytes, of the range.
 //
 //     accessedByUuid: (INPUT)
-//         UUID of the processor that should have pages in the the VA range
-//         mapped when possible.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, the GPU instance UUID of the partition, or the CPU UUID
+//         that should have pages in the VA range mapped when possible.
 //
 // Errors:
 //     NV_ERR_INVALID_ADDRESS:
@@ -2554,8 +2584,10 @@ NV_STATUS UvmSetAccessedBy(void                  *base,
 //         Length, in bytes, of the range.
 //
 //     accessedByUuid: (INPUT)
-//         UUID of the processor from which any policies set by
-//         UvmSetAccessedBy should be revoked for the given VA range.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, the GPU instance UUID of the partition, or the CPU UUID
+//         from which any policies set by UvmSetAccessedBy should be revoked
+//         for the given VA range.
 //
 // Errors:
 //     NV_ERR_INVALID_ADDRESS:
@@ -2613,7 +2645,9 @@ NV_STATUS UvmUnsetAccessedBy(void                  *base,
 //
 // Arguments:
 //     gpuUuid: (INPUT)
-//         UUID of the GPU to enable software-assisted system-wide atomics on.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, or the GPU instance UUID of the partition to enable
+//         software-assisted system-wide atomics on.
 //
 // Error codes:
 //     NV_ERR_NO_MEMORY:
@@ -2649,7 +2683,9 @@ NV_STATUS UvmEnableSystemWideAtomics(const NvProcessorUuid *gpuUuid);
 //
 // Arguments:
 //     gpuUuid: (INPUT)
-//         UUID of the GPU to disable software-assisted system-wide atomics on.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, or the GPU instance UUID of the partition to disable
+//         software-assisted system-wide atomics on.
 //
 // Error codes:
 //     NV_ERR_INVALID_DEVICE:
@@ -2878,7 +2914,9 @@ NV_STATUS UvmDebugCountersEnable(UvmDebugSession   session,
 //         Name of the counter in that scope.
 //
 //     gpu: (INPUT)
-//         Gpuid of the scoped GPU. This parameter is ignored in AllGpu scopes.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, or the GPU instance UUID of the partition of the scoped GPU.
+//         This parameter is ignored in AllGpu scopes.
 //
 //     pCounterHandle: (OUTPUT)
 //         Handle to the counter address.
@@ -2932,7 +2970,7 @@ NV_STATUS UvmDebugGetCounterVal(UvmDebugSession     session,
 // UvmEventQueueCreate
 //
 // This call creates an event queue of the given size.
-// No events are added in the queue till they are enabled by the user.
+// No events are added in the queue until they are enabled by the user.
 // Event queue data is visible to the user even after the target process dies
 // if the session is active and queue is not freed.
 //
@@ -2983,7 +3021,7 @@ NV_STATUS UvmEventQueueCreate(UvmDebugSession        sessionHandle,
 // UvmEventQueueDestroy
 //
 // This call frees all interal resources associated with the queue, including
-// upinning of the memory associated with that queue. Freeing user buffer is
+// unpinning of the memory associated with that queue. Freeing user buffer is
 // responsibility of a caller. Event queue might be also destroyed as a side
 // effect of destroying a session associated with this queue.
 //
@@ -3167,9 +3205,9 @@ NV_STATUS UvmEventGetNotificationHandles(UvmEventQueueHandle  *queueHandleArray,
 // UvmEventGetGpuUuidTable
 //
 // Each migration event entry contains the gpu index to/from where data is
-// migrated. This index maps to a corresponding gpu UUID in the gpuUuidTable.
-// Using indices saves on the size of each event entry. This API provides the
-// gpuIndex to gpuUuid relation to the user.
+// migrated. This index maps to a corresponding physical gpu UUID in the
+// gpuUuidTable. Using indices saves on the size of each event entry. This API
+// provides the gpuIndex to gpuUuid relation to the user.
 //
 // This API does not access the queue state maintained in the user
 // library and so the user doesn't need to acquire a lock to protect the
@@ -3177,9 +3215,9 @@ NV_STATUS UvmEventGetNotificationHandles(UvmEventQueueHandle  *queueHandleArray,
 //
 // Arguments:
 //     gpuUuidTable: (OUTPUT)
-//         The return value is an array of UUIDs. The array index is the
-//         corresponding gpuIndex. There can be at max 32 gpus associated with
-//         UVM, so array size is 32.
+//         The return value is an array of physical GPU UUIDs. The array index
+//         is the corresponding gpuIndex. There can be at max 32 GPUs
+//         associated with UVM, so array size is 32.
 //
 //     validCount: (OUTPUT)
 //         The system doesn't normally contain 32 GPUs. This field gives the
@@ -3238,7 +3276,7 @@ NV_STATUS UvmEventGetGpuUuidTable(NvProcessorUuid *gpuUuidTable,
 //------------------------------------------------------------------------------
 NV_STATUS UvmEventFetch(UvmDebugSession      sessionHandle,
                        UvmEventQueueHandle  queueHandle,
-                        UvmEventEntry       *pBuffer,
+                        UvmEventEntry_V1    *pBuffer,
                        NvU64               *nEntries);

 //------------------------------------------------------------------------------
@@ -3434,10 +3472,14 @@ NV_STATUS UvmToolsDestroySession(UvmToolsSessionHandle session);
 // 4. Destroy event Queue using UvmToolsDestroyEventQueue
 //

-
+#if UVM_API_REV_IS_AT_MOST(10)
+// This is deprecated and replaced by sizeof(UvmToolsEventControlData).
 NvLength UvmToolsGetEventControlSize(void);

+// This is deprecated and replaced by sizeof(UvmEventEntry_V1) or
+// sizeof(UvmEventEntry_V2).
 NvLength UvmToolsGetEventEntrySize(void);
+#endif

 NvLength UvmToolsGetNumberOfCounters(void);

@@ -3452,6 +3494,10 @@ NvLength UvmToolsGetNumberOfCounters(void);
 //     session: (INPUT)
 //         Handle to the tools session.
 //
+//     version: (INPUT)
+//         Requested version for events or counters.
+//         See UvmToolsEventQueueVersion.
+//
 //     event_buffer: (INPUT)
 //         User allocated buffer. Must be page-aligned. Must be large enough to
 //         hold at least event_buffer_size events. Gets pinned until queue is
@@ -3464,9 +3510,7 @@ NvLength UvmToolsGetNumberOfCounters(void);
 //     event_control (INPUT)
 //         User allocated buffer. Must be page-aligned. Must be large enough to
 //         hold UvmToolsEventControlData (although single page-size allocation
-//         should be more than enough). One could call
-//         UvmToolsGetEventControlSize() function to find out current size of
-//         UvmToolsEventControlData. Gets pinned until queue is destroyed.
+//         should be more than enough). Gets pinned until queue is destroyed.
 //
 //     queue: (OUTPUT)
 //         Handle to the created queue.
@@ -3476,22 +3520,38 @@ NvLength UvmToolsGetNumberOfCounters(void);
 //         Session handle does not refer to a valid session
 //
 //     NV_ERR_INVALID_ARGUMENT:
+//         The version is not UvmToolsEventQueueVersion_V1 or
+//         UvmToolsEventQueueVersion_V2.
 //         One of the parameters: event_buffer, event_buffer_size, event_control
 //         is not valid
 //
+//     NV_ERR_NOT_SUPPORTED:
+//         A queue of the requested version could not be created
+//         (i.e., the UVM kernel driver is older and doesn't support
+//         UvmToolsEventQueueVersion_V2).
+//
 //     NV_ERR_INSUFFICIENT_RESOURCES:
-//         There could be multiple reasons for this error. One would be that it's
-//         not possible to allocate a queue of requested size. Another would be
-//         that either event_buffer or event_control memory couldn't be pinned
-//         (e.g. because of OS limitation of pinnable memory). Also it could not
-//         have been possible to create UvmToolsEventQueueDescriptor.
+//         There could be multiple reasons for this error. One would be that
+//         it's not possible to allocate a queue of requested size. Another
+//         would be that either event_buffer or event_control memory couldn't
+//         be pinned (e.g. because of OS limitation of pinnable memory). It may
+//         also have been impossible to create UvmToolsEventQueueDescriptor.
 //
 //------------------------------------------------------------------------------
+#if UVM_API_REV_IS_AT_MOST(10)
 NV_STATUS UvmToolsCreateEventQueue(UvmToolsSessionHandle     session,
                                   void                     *event_buffer,
                                   NvLength                  event_buffer_size,
                                   void                     *event_control,
                                   UvmToolsEventQueueHandle *queue);
+#else
+NV_STATUS UvmToolsCreateEventQueue(UvmToolsSessionHandle        session,
+                                   UvmToolsEventQueueVersion    version,
+                                   void                        *event_buffer,
+                                   NvLength                     event_buffer_size,
+                                   void                        *event_control,
+                                   UvmToolsEventQueueHandle    *queue);
+#endif

 UvmToolsEventQueueDescriptor UvmToolsGetEventQueueDescriptor(UvmToolsEventQueueHandle queue);

@@ -3528,7 +3588,7 @@ NV_STATUS UvmToolsSetNotificationThreshold(UvmToolsEventQueueHandle queue,
 //------------------------------------------------------------------------------
 // UvmToolsDestroyEventQueue
 //
-// Destroys all internal resources associated with the queue. It unpinns the
+// Destroys all internal resources associated with the queue. It unpins the
 // buffers provided in UvmToolsCreateEventQueue. Event Queue is also auto
 // destroyed when corresponding session gets destroyed.
 //
@@ -3550,7 +3610,7 @@ NV_STATUS UvmToolsDestroyEventQueue(UvmToolsEventQueueHandle queue);
 // UvmEventQueueEnableEvents
 //
 // This call enables a particular event type in the event queue. All events are
-// disabled by default. Any event type is considered listed if and only if it's
+// disabled by default. Any event type is considered listed if and only if its
 // corresponding value is equal to 1 (in other words, bit is set). Disabled
 // events listed in eventTypeFlags are going to be enabled. Enabled events and
 // events not listed in eventTypeFlags are not affected by this call.
@@ -3583,7 +3643,7 @@ NV_STATUS UvmToolsEventQueueEnableEvents(UvmToolsEventQueueHandle queue,
 // UvmToolsEventQueueDisableEvents
 //
 // This call disables a particular event type in the event queue. Any event type
-// is considered listed if and only if it's corresponding value is equal to 1
+// is considered listed if and only if its corresponding value is equal to 1
 // (in other words, bit is set). Enabled events listed in eventTypeFlags are
 // going to be disabled. Disabled events and events not listed in eventTypeFlags
 // are not affected by this call.
@@ -3621,7 +3681,7 @@ NV_STATUS UvmToolsEventQueueDisableEvents(UvmToolsEventQueueHandle queue,
 //
 // Counters position follows the layout of the memory that UVM driver decides to
 // use. To obtain particular counter value, user should perform consecutive
-// atomic reads at a a given buffer + offset address.
+// atomic reads at a given buffer + offset address.
 //
 // It is not defined what is the initial value of a counter. User should rely on
 // a difference between each snapshot.
@@ -3644,9 +3704,9 @@ NV_STATUS UvmToolsEventQueueDisableEvents(UvmToolsEventQueueHandle queue,
 //         Provided session is not valid
 //
 //     NV_ERR_INSUFFICIENT_RESOURCES
-//         There could be multiple reasons for this error. One would be that it's
-//         not possible to allocate counters structure. Another would be that
-//         either event_buffer or event_control memory couldn't be pinned
+//         There could be multiple reasons for this error. One would be that
+//         it's not possible to allocate counters structure. Another would be
+//         that either event_buffer or event_control memory couldn't be pinned
 //         (e.g. because of OS limitation of pinnable memory)
 //
 //------------------------------------------------------------------------------
@@ -3657,12 +3717,12 @@ NV_STATUS UvmToolsCreateProcessAggregateCounters(UvmToolsSessionHandle   session
 //------------------------------------------------------------------------------
 // UvmToolsCreateProcessorCounters
 //
-// Creates the counters structure for tracking per-process counters.
+// Creates the counters structure for tracking per-processor counters.
 // These counters are disabled by default.
 //
 // Counters position follows the layout of the memory that UVM driver decides to
 // use. To obtain particular counter value, user should perform consecutive
-// atomic reads at a a given buffer + offset address.
+// atomic reads at a given buffer + offset address.
 //
 // It is not defined what is the initial value of a counter. User should rely on
 // a difference between each snapshot.
@@ -3678,7 +3738,9 @@ NV_STATUS UvmToolsCreateProcessAggregateCounters(UvmToolsSessionHandle   session
 //         counters are destroyed.
 //
 //     processorUuid: (INPUT)
-//        UUID of the resource, for which counters will provide statistic data.
+//         UUID of the physical GPU if the GPU is not SMC capable or SMC
+//         enabled, the GPU instance UUID of the partition, or the CPU UUID of
+//         the resource, for which counters will provide statistic data.
 //
 //     counters: (OUTPUT)
 //         Handle to the created counters.
@@ -3688,9 +3750,9 @@ NV_STATUS UvmToolsCreateProcessAggregateCounters(UvmToolsSessionHandle   session
 //         session handle does not refer to a valid tools session
 //
 //     NV_ERR_INSUFFICIENT_RESOURCES
-//         There could be multiple reasons for this error. One would be that it's
-//         not possible to allocate counters structure. Another would be that
-//         either event_buffer or event_control memory couldn't be pinned
+//         There could be multiple reasons for this error. One would be that
+//         it's not possible to allocate counters structure. Another would be
+//         that either event_buffer or event_control memory couldn't be pinned
 //         (e.g. because of OS limitation of pinnable memory)
 //
 //     NV_ERR_INVALID_ARGUMENT
@@ -3706,7 +3768,7 @@ NV_STATUS UvmToolsCreateProcessorCounters(UvmToolsSessionHandle   session,
 // UvmToolsDestroyCounters
 //
 // Destroys all internal resources associated with this counters structure.
-// It unpinns the buffer provided in UvmToolsCreate*Counters. Counters structure
+// It unpins the buffer provided in UvmToolsCreate*Counters. Counters structure
 // also gest destroyed when corresponding session is destroyed.
 //
 // Arguments:
@@ -3727,7 +3789,7 @@ NV_STATUS UvmToolsDestroyCounters(UvmToolsCountersHandle counters);
 // UvmToolsEnableCounters
 //
 // This call enables certain counter types in the counters structure. Any
-// counter type is considered listed if and only if it's corresponding value is
+// counter type is considered listed if and only if its corresponding value is
 // equal to 1 (in other words, bit is set). Disabled counter types listed in
 // counterTypeFlags are going to be enabled. Already enabled counter types and
 // counter types not listed in counterTypeFlags are not affected by this call.
@@ -3761,7 +3823,7 @@ NV_STATUS UvmToolsEnableCounters(UvmToolsCountersHandle counters,
 // UvmToolsDisableCounters
 //
 // This call disables certain counter types in the counters structure. Any
-// counter type is considered listed if and only if it's corresponding value is
+// counter type is considered listed if and only if its corresponding value is
 // equal to 1 (in other words, bit is set). Enabled counter types listed in
 // counterTypeFlags are going to be disabled. Already disabled counter types and
 // counter types not listed in counterTypeFlags are not affected by this call.
@@ -3906,32 +3968,66 @@ NV_STATUS UvmToolsWriteProcessMemory(UvmToolsSessionHandle  session,
 // UvmToolsGetProcessorUuidTable
 //
 // Populate a table with the UUIDs of all the currently registered processors
-// in the target process.  When a GPU is registered, it is added to the table.
-// When a GPU is unregistered, it is removed.  As long as a GPU remains registered,
-// its index in the table does not change.  New registrations obtain the first
-// unused index.
+// in the target process. When a GPU is registered, it is added to the table.
+// When a GPU is unregistered, it is removed. As long as a GPU remains
+// registered, its index in the table does not change.
+// Note that the index in the table corresponds to the processor ID reported
+// in UvmEventEntry event records and that the table is not contiguously packed
+// with non-zero UUIDs even with no GPU unregistrations.
 //
 // Arguments:
 //     session: (INPUT)
 //         Handle to the tools session.
 //
+//     version: (INPUT)
+//         Requested version for the UUID table returned. The version must
+//         match the requested version of the event queue created with
+//         UvmToolsCreateEventQueue(). See UvmToolsEventQueueVersion.
+//         If the version of the event queue does not match the version of the
+//         UUID table, the behavior is undefined.
+//
 //     table: (OUTPUT)
 //         Array of processor UUIDs, including the CPU's UUID which is always
-//         at index zero.  The srcIndex and dstIndex fields of the
-//         UvmEventMigrationInfo struct index this array.  Unused indices will
-//         have a UUID of zero.
-//
-//     count: (OUTPUT)
-//         Set by UVM to the number of UUIDs written, including any gaps in
-//         the table due to unregistered GPUs.
+//         at index zero. The number of elements in the array must be greater
+//         than or equal to UVM_MAX_PROCESSORS_V1 if the version is
+//         UvmToolsEventQueueVersion_V1 and UVM_MAX_PROCESSORS if the version is
+//         UvmToolsEventQueueVersion_V2.
+//         The srcIndex and dstIndex fields of the UvmEventMigrationInfo struct
+//         index this array. Unused indices will have a UUID of zero.
+//         If version is UvmToolsEventQueueVersion_V1 then the reported UUID
+//         will be that of the corresponding physical GPU, even if multiple SMC
+//         partitions are registered under that physical GPU. If version is
+//         UvmToolsEventQueueVersion_V2 then the reported UUID will be the GPU
+//         instance UUID if SMC is enabled, otherwise it will be the UUID of
+//         the physical GPU.
 //
 // Error codes:
 //     NV_ERR_INVALID_ADDRESS:
 //         writing to table failed.
+//
+//     NV_ERR_INVALID_ARGUMENT:
+//         The version is not UvmToolsEventQueueVersion_V1 or
+//         UvmToolsEventQueueVersion_V2.
+//
+//     NV_ERR_NOT_SUPPORTED:
+//         The kernel is not able to support the requested version
+//         (i.e., the UVM kernel driver is older and doesn't support
+//         UvmToolsEventQueueVersion_V2).
+//
+//     NV_ERR_NO_MEMORY:
+//         Internal memory allocation failed.
 //------------------------------------------------------------------------------
-NV_STATUS UvmToolsGetProcessorUuidTable(UvmToolsSessionHandle  session,
-                                        NvProcessorUuid       *table,
-                                        NvLength              *count);
+#if UVM_API_REV_IS_AT_MOST(11)
+NV_STATUS UvmToolsGetProcessorUuidTable(UvmToolsSessionHandle      session,
+                                        UvmToolsEventQueueVersion  version,
+                                        NvProcessorUuid           *table,
+                                        NvLength                   table_size,
+                                        NvLength                  *count);
+#else
+NV_STATUS UvmToolsGetProcessorUuidTable(UvmToolsSessionHandle     session,
+                                        UvmToolsEventQueueVersion version,
+                                        NvProcessorUuid          *table);
+#endif

 //------------------------------------------------------------------------------
 // UvmToolsFlushEvents
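
The sizing rule documented above for UvmToolsGetProcessorUuidTable (at least UVM_MAX_PROCESSORS_V1 entries for a V1 queue, UVM_MAX_PROCESSORS entries for a V2 queue) could be checked on the caller side roughly as follows. This is a minimal sketch, not the library's code: the demo_ enum, the DEMO_ constants, and their numeric values are hypothetical stand-ins, and real code should use the definitions from the UVM headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for UvmToolsEventQueueVersion. */
typedef enum {
    DEMO_QUEUE_VERSION_V1 = 1,
    DEMO_QUEUE_VERSION_V2 = 2
} demo_queue_version_t;

/* Placeholder sizes, assumed for illustration only. The real limits are
 * UVM_MAX_PROCESSORS_V1 and UVM_MAX_PROCESSORS from the UVM headers. */
#define DEMO_MAX_PROCESSORS_V1 33u
#define DEMO_MAX_PROCESSORS    769u

/* Returns the minimum number of UUID entries the caller's table must hold
 * for the given queue version, or 0 if the version is not recognized
 * (the case the header maps to NV_ERR_INVALID_ARGUMENT). */
static size_t demo_uuid_table_entries(demo_queue_version_t version)
{
    switch (version) {
        case DEMO_QUEUE_VERSION_V1:
            return DEMO_MAX_PROCESSORS_V1;
        case DEMO_QUEUE_VERSION_V2:
            return DEMO_MAX_PROCESSORS;
    }

    return 0;
}
```

A caller would allocate demo_uuid_table_entries(version) elements before requesting the table, using the same version it passed when creating the event queue, since the header states that mismatched versions give undefined behavior.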
--- a/kernel-open/nvidia-uvm/uvm_ada.c
+++ b/kernel-open/nvidia-uvm/uvm_ada.c
@@ -1,5 +1,5 @@
 /*******************************************************************************
-    Copyright (c) 2021 NVIDIA Corporation
+    Copyright (c) 2021-2023 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -79,6 +79,8 @@ void uvm_hal_ada_arch_init_properties(uvm_parent_gpu_t *parent_gpu)

    parent_gpu->access_counters_supported = true;

+    parent_gpu->access_counters_can_use_physical_addresses = false;
+
    parent_gpu->fault_cancel_va_supported = true;

    parent_gpu->scoped_atomics_supported = true;
@@ -94,4 +96,6 @@ void uvm_hal_ada_arch_init_properties(uvm_parent_gpu_t *parent_gpu)
    parent_gpu->map_remap_larger_page_promotion = false;

    parent_gpu->plc_supported = true;
+
+    parent_gpu->no_ats_range_required = false;
 }
--- a/kernel-open/nvidia-uvm/uvm_ampere.c
+++ b/kernel-open/nvidia-uvm/uvm_ampere.c
@@ -1,5 +1,5 @@
 /*******************************************************************************
-    Copyright (c) 2018-20221 NVIDIA Corporation
+    Copyright (c) 2018-2023 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -38,10 +38,12 @@ void uvm_hal_ampere_arch_init_properties(uvm_parent_gpu_t *parent_gpu)

    parent_gpu->utlb_per_gpc_count = uvm_ampere_get_utlbs_per_gpc(parent_gpu);

-    parent_gpu->fault_buffer_info.replayable.utlb_count = parent_gpu->rm_info.maxGpcCount * parent_gpu->utlb_per_gpc_count;
+    parent_gpu->fault_buffer_info.replayable.utlb_count = parent_gpu->rm_info.maxGpcCount *
+                                                          parent_gpu->utlb_per_gpc_count;
    {
        uvm_fault_buffer_entry_t *dummy;
-        UVM_ASSERT(parent_gpu->fault_buffer_info.replayable.utlb_count <= (1 << (sizeof(dummy->fault_source.utlb_id) * 8)));
+        UVM_ASSERT(parent_gpu->fault_buffer_info.replayable.utlb_count <=
+                   (1 << (sizeof(dummy->fault_source.utlb_id) * 8)));
    }

    // A single top level PDE on Ampere covers 128 TB and that's the minimum
@@ -53,7 +55,7 @@ void uvm_hal_ampere_arch_init_properties(uvm_parent_gpu_t *parent_gpu)
    parent_gpu->uvm_mem_va_size = UVM_MEM_VA_SIZE;

    // See uvm_mmu.h for mapping placement
-    parent_gpu->flat_vidmem_va_base = 136 * UVM_SIZE_1TB;
+    parent_gpu->flat_vidmem_va_base = 160 * UVM_SIZE_1TB;
    parent_gpu->flat_sysmem_va_base = 256 * UVM_SIZE_1TB;

    parent_gpu->ce_phys_vidmem_write_supported = true;
@@ -81,6 +83,8 @@ void uvm_hal_ampere_arch_init_properties(uvm_parent_gpu_t *parent_gpu)

    parent_gpu->access_counters_supported = true;

+    parent_gpu->access_counters_can_use_physical_addresses = false;
+
    parent_gpu->fault_cancel_va_supported = true;

    parent_gpu->scoped_atomics_supported = true;
@@ -101,4 +105,6 @@ void uvm_hal_ampere_arch_init_properties(uvm_parent_gpu_t *parent_gpu)
        parent_gpu->map_remap_larger_page_promotion = false;

    parent_gpu->plc_supported = true;
+
+    parent_gpu->no_ats_range_required = false;
 }
--- a/kernel-open/nvidia-uvm/uvm_ampere_ce.c
+++ b/kernel-open/nvidia-uvm/uvm_ampere_ce.c
@@ -1,5 +1,5 @@
 /*******************************************************************************
-    Copyright (c) 2018-2022 NVIDIA Corporation
+    Copyright (c) 2018-2023 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -117,7 +117,7 @@ bool uvm_hal_ampere_ce_memcopy_is_valid_c6b5(uvm_push_t *push, uvm_gpu_address_t
    NvU64 push_begin_gpu_va;
    uvm_gpu_t *gpu = uvm_push_get_gpu(push);

-    if (!uvm_gpu_is_virt_mode_sriov_heavy(gpu))
+    if (!uvm_parent_gpu_is_virt_mode_sriov_heavy(gpu->parent))
        return true;

    if (uvm_channel_is_proxy(push->channel)) {
@@ -196,7 +196,7 @@ bool uvm_hal_ampere_ce_memset_is_valid_c6b5(uvm_push_t *push,
 {
    uvm_gpu_t *gpu = uvm_push_get_gpu(push);

-    if (!uvm_gpu_is_virt_mode_sriov_heavy(gpu))
+    if (!uvm_parent_gpu_is_virt_mode_sriov_heavy(gpu->parent))
        return true;

    if (uvm_channel_is_proxy(push->channel)) {
--- a/kernel-open/nvidia-uvm/uvm_ampere_fault_buffer.c
+++ b/kernel-open/nvidia-uvm/uvm_ampere_fault_buffer.c
@@ -0,0 +1,75 @@
+/*******************************************************************************
+    Copyright (c) 2024 NVIDIA Corporation
+
+    Permission is hereby granted, free of charge, to any person obtaining a copy
+    of this software and associated documentation files (the "Software"), to
+    deal in the Software without restriction, including without limitation the
+    rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+    sell copies of the Software, and to permit persons to whom the Software is
+    furnished to do so, subject to the following conditions:
+
+        The above copyright notice and this permission notice shall be
+        included in all copies or substantial portions of the Software.
+
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+    THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+    FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+    DEALINGS IN THE SOFTWARE.
+
+*******************************************************************************/
+
+#include "uvm_linux.h"
+#include "uvm_global.h"
+#include "uvm_gpu.h"
+#include "uvm_hal.h"
+#include "hwref/ampere/ga100/dev_fault.h"
+
+static bool client_id_ce(NvU16 client_id)
+{
+    if (client_id >= NV_PFAULT_CLIENT_HUB_HSCE0 && client_id <= NV_PFAULT_CLIENT_HUB_HSCE9)
+        return true;
+
+    if (client_id >= NV_PFAULT_CLIENT_HUB_HSCE10 && client_id <= NV_PFAULT_CLIENT_HUB_HSCE15)
+        return true;
+
+    switch (client_id) {
+        case NV_PFAULT_CLIENT_HUB_CE0:
+        case NV_PFAULT_CLIENT_HUB_CE1:
+        case NV_PFAULT_CLIENT_HUB_CE2:
+            return true;
+    }
+
+    return false;
+}
+
+uvm_mmu_engine_type_t uvm_hal_ampere_fault_buffer_get_mmu_engine_type(NvU16 mmu_engine_id,
+                                                                      uvm_fault_client_type_t client_type,
+                                                                      NvU16 client_id)
+{
+    // Servicing CE and Host (HUB clients) faults.
+    if (client_type == UVM_FAULT_CLIENT_TYPE_HUB) {
+        if (client_id_ce(client_id)) {
+            UVM_ASSERT(mmu_engine_id >= NV_PFAULT_MMU_ENG_ID_CE0 && mmu_engine_id <= NV_PFAULT_MMU_ENG_ID_CE9);
+
+            return UVM_MMU_ENGINE_TYPE_CE;
+        }
+
+        if (client_id == NV_PFAULT_CLIENT_HUB_HOST || client_id == NV_PFAULT_CLIENT_HUB_ESC) {
+            UVM_ASSERT(mmu_engine_id >= NV_PFAULT_MMU_ENG_ID_HOST0 && mmu_engine_id <= NV_PFAULT_MMU_ENG_ID_HOST31);
+
+            return UVM_MMU_ENGINE_TYPE_HOST;
+        }
+    }
+
+    // We shouldn't be servicing faults from any engines other than GR.
+    UVM_ASSERT_MSG(client_id <= NV_PFAULT_CLIENT_GPC_ROP_3, "Unexpected client ID: 0x%x\n", client_id);
+    UVM_ASSERT_MSG(mmu_engine_id >= NV_PFAULT_MMU_ENG_ID_GRAPHICS && mmu_engine_id < NV_PFAULT_MMU_ENG_ID_BAR1,
+                   "Unexpected engine ID: 0x%x\n",
+                   mmu_engine_id);
+    UVM_ASSERT(client_type == UVM_FAULT_CLIENT_TYPE_GPC);
+
+    return UVM_MMU_ENGINE_TYPE_GRAPHICS;
+}
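
The classification order in the new file above (HUB clients are checked for CE first, then Host, and everything else is treated as graphics) can be illustrated in isolation. This is a simplified sketch under stated assumptions: every DEMO_ identifier and numeric value below is hypothetical, the real client IDs come from hwref/ampere/ga100/dev_fault.h, and the assertions on engine IDs in the original are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical client IDs, for illustration only. */
enum {
    DEMO_CLIENT_HSCE_FIRST = 0x10,
    DEMO_CLIENT_HSCE_LAST  = 0x19,
    DEMO_CLIENT_CE0        = 0x30,
    DEMO_CLIENT_CE1        = 0x31,
    DEMO_CLIENT_HOST       = 0x40
};

typedef enum {
    DEMO_ENGINE_CE,
    DEMO_ENGINE_HOST,
    DEMO_ENGINE_GRAPHICS
} demo_engine_t;

/* A client is a copy engine if it falls in the contiguous HSCE range or
 * matches one of the individually listed CE IDs. */
static bool demo_client_is_ce(uint16_t client_id)
{
    if (client_id >= DEMO_CLIENT_HSCE_FIRST && client_id <= DEMO_CLIENT_HSCE_LAST)
        return true;

    switch (client_id) {
        case DEMO_CLIENT_CE0:
        case DEMO_CLIENT_CE1:
            return true;
    }

    return false;
}

/* HUB clients map to CE or Host; anything else falls through to graphics. */
static demo_engine_t demo_engine_type(bool is_hub_client, uint16_t client_id)
{
    if (is_hub_client) {
        if (demo_client_is_ce(client_id))
            return DEMO_ENGINE_CE;

        if (client_id == DEMO_CLIENT_HOST)
            return DEMO_ENGINE_HOST;
    }

    return DEMO_ENGINE_GRAPHICS;
}
```

Combining a range check with a switch keeps the contiguous IDs compact while still covering IDs that sit outside the range, which is the same split the original helper uses.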
--- a/kernel-open/nvidia-uvm/uvm_ampere_host.c
+++ b/kernel-open/nvidia-uvm/uvm_ampere_host.c
@@ -1,5 +1,5 @@
 /*******************************************************************************
-    Copyright (c) 2018-2022 NVIDIA Corporation
+    Copyright (c) 2018-2024 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -33,7 +33,7 @@ bool uvm_hal_ampere_host_method_is_valid(uvm_push_t *push, NvU32 method_address,
 {
    uvm_gpu_t *gpu = uvm_push_get_gpu(push);

-    if (!uvm_gpu_is_virt_mode_sriov_heavy(gpu))
+    if (!uvm_parent_gpu_is_virt_mode_sriov_heavy(gpu->parent))
        return true;

    if (uvm_channel_is_privileged(push->channel)) {
@@ -205,17 +205,18 @@ void uvm_hal_ampere_host_clear_faulted_channel_sw_method(uvm_push_t *push,
                     CLEAR_FAULTED_B, HWVALUE(C076, CLEAR_FAULTED_B, INST_HI, instance_ptr_hi));
 }

-// Copy from Pascal, this version sets TLB_INVALIDATE_INVAL_SCOPE.
+// Copy from Turing, this version sets TLB_INVALIDATE_INVAL_SCOPE.
 void uvm_hal_ampere_host_tlb_invalidate_all(uvm_push_t *push,
-                                            uvm_gpu_phys_address_t pdb,
-                                            NvU32 depth,
-                                            uvm_membar_t membar)
+                                           uvm_gpu_phys_address_t pdb,
+                                           NvU32 depth,
+                                           uvm_membar_t membar)
 {
    NvU32 aperture_value;
    NvU32 page_table_level;
    NvU32 pdb_lo;
    NvU32 pdb_hi;
    NvU32 ack_value = 0;
+    NvU32 sysmembar_value = 0;

    UVM_ASSERT_MSG(pdb.aperture == UVM_APERTURE_VID || pdb.aperture == UVM_APERTURE_SYS, "aperture: %u", pdb.aperture);

@@ -230,8 +231,8 @@ void uvm_hal_ampere_host_tlb_invalidate_all(uvm_push_t *push,
    pdb_lo = pdb.address & HWMASK(C56F, MEM_OP_C, TLB_INVALIDATE_PDB_ADDR_LO);
    pdb_hi = pdb.address >> HWSIZE(C56F, MEM_OP_C, TLB_INVALIDATE_PDB_ADDR_LO);

-    // PDE3 is the highest level on Pascal, see the comment in uvm_pascal_mmu.c
-    // for details.
+    // PDE3 is the highest level on Pascal-Ampere, see the comment in
+    // uvm_pascal_mmu.c for details.
    UVM_ASSERT_MSG(depth < NVC56F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE3, "depth %u", depth);
    page_table_level = NVC56F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE3 - depth;

@@ -242,7 +243,12 @@ void uvm_hal_ampere_host_tlb_invalidate_all(uvm_push_t *push,
        ack_value = HWCONST(C56F, MEM_OP_C, TLB_INVALIDATE_ACK_TYPE, GLOBALLY);
    }

-    NV_PUSH_4U(C56F, MEM_OP_A, HWCONST(C56F, MEM_OP_A, TLB_INVALIDATE_SYSMEMBAR, DIS) |
+    if (membar == UVM_MEMBAR_SYS)
+        sysmembar_value = HWCONST(C56F, MEM_OP_A, TLB_INVALIDATE_SYSMEMBAR, EN);
+    else
+        sysmembar_value = HWCONST(C56F, MEM_OP_A, TLB_INVALIDATE_SYSMEMBAR, DIS);
+
+    NV_PUSH_4U(C56F, MEM_OP_A, sysmembar_value |
                               HWCONST(C56F, MEM_OP_A, TLB_INVALIDATE_INVAL_SCOPE, NON_LINK_TLBS),
                     MEM_OP_B, 0,
                     MEM_OP_C, HWCONST(C56F, MEM_OP_C, TLB_INVALIDATE_PDB, ONE) |
@@ -255,16 +261,18 @@ void uvm_hal_ampere_host_tlb_invalidate_all(uvm_push_t *push,
                     MEM_OP_D, HWCONST(C56F, MEM_OP_D, OPERATION, MMU_TLB_INVALIDATE) |
                               HWVALUE(C56F, MEM_OP_D, TLB_INVALIDATE_PDB_ADDR_HI, pdb_hi));

-    uvm_hal_tlb_invalidate_membar(push, membar);
+    // GPU membar still requires an explicit membar method.
+    if (membar == UVM_MEMBAR_GPU)
+        uvm_push_get_gpu(push)->parent->host_hal->membar_gpu(push);
 }

-// Copy from Volta, this version sets TLB_INVALIDATE_INVAL_SCOPE.
+// Copy from Turing, this version sets TLB_INVALIDATE_INVAL_SCOPE.
 void uvm_hal_ampere_host_tlb_invalidate_va(uvm_push_t *push,
                                           uvm_gpu_phys_address_t pdb,
                                           NvU32 depth,
                                           NvU64 base,
                                           NvU64 size,
-                                           NvU32 page_size,
+                                           NvU64 page_size,
                                           uvm_membar_t membar)
 {
    NvU32 aperture_value;
@@ -272,6 +280,7 @@ void uvm_hal_ampere_host_tlb_invalidate_va(uvm_push_t *push,
    NvU32 pdb_lo;
    NvU32 pdb_hi;
    NvU32 ack_value = 0;
+    NvU32 sysmembar_value = 0;
    NvU32 va_lo;
    NvU32 va_hi;
    NvU64 end;
@@ -281,9 +290,9 @@ void uvm_hal_ampere_host_tlb_invalidate_va(uvm_push_t *push,
    NvU32 log2_invalidation_size;
    uvm_gpu_t *gpu = uvm_push_get_gpu(push);

-    UVM_ASSERT_MSG(IS_ALIGNED(page_size, 1 << 12), "page_size 0x%x\n", page_size);
-    UVM_ASSERT_MSG(IS_ALIGNED(base, page_size), "base 0x%llx page_size 0x%x\n", base, page_size);
-    UVM_ASSERT_MSG(IS_ALIGNED(size, page_size), "size 0x%llx page_size 0x%x\n", size, page_size);
+    UVM_ASSERT_MSG(IS_ALIGNED(page_size, 1 << 12), "page_size 0x%llx\n", page_size);
+    UVM_ASSERT_MSG(IS_ALIGNED(base, page_size), "base 0x%llx page_size 0x%llx\n", base, page_size);
+    UVM_ASSERT_MSG(IS_ALIGNED(size, page_size), "size 0x%llx page_size 0x%llx\n", size, page_size);
    UVM_ASSERT_MSG(size > 0, "size 0x%llx\n", size);

    // The invalidation size must be a power-of-two number of pages containing
@@ -325,7 +334,7 @@ void uvm_hal_ampere_host_tlb_invalidate_va(uvm_push_t *push,
    pdb_lo = pdb.address & HWMASK(C56F, MEM_OP_C, TLB_INVALIDATE_PDB_ADDR_LO);
    pdb_hi = pdb.address >> HWSIZE(C56F, MEM_OP_C, TLB_INVALIDATE_PDB_ADDR_LO);

-    // PDE3 is the highest level on Pascal-Ampere , see the comment in
+    // PDE3 is the highest level on Pascal-Ampere, see the comment in
    // uvm_pascal_mmu.c for details.
    UVM_ASSERT_MSG(depth < NVC56F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE3, "depth %u", depth);
    page_table_level = NVC56F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE3 - depth;
@@ -337,10 +346,15 @@ void uvm_hal_ampere_host_tlb_invalidate_va(uvm_push_t *push,
        ack_value = HWCONST(C56F, MEM_OP_C, TLB_INVALIDATE_ACK_TYPE, GLOBALLY);
    }

+    if (membar == UVM_MEMBAR_SYS)
+        sysmembar_value = HWCONST(C56F, MEM_OP_A, TLB_INVALIDATE_SYSMEMBAR, EN);
+    else
+        sysmembar_value = HWCONST(C56F, MEM_OP_A, TLB_INVALIDATE_SYSMEMBAR, DIS);
+
    NV_PUSH_4U(C56F, MEM_OP_A, HWVALUE(C56F, MEM_OP_A, TLB_INVALIDATE_INVALIDATION_SIZE, log2_invalidation_size) |
-                               HWCONST(C56F, MEM_OP_A, TLB_INVALIDATE_SYSMEMBAR, DIS) |
-                               HWVALUE(C56F, MEM_OP_A, TLB_INVALIDATE_TARGET_ADDR_LO, va_lo) |
-                               HWCONST(C56F, MEM_OP_A, TLB_INVALIDATE_INVAL_SCOPE, NON_LINK_TLBS),
+                               HWCONST(C56F, MEM_OP_A, TLB_INVALIDATE_INVAL_SCOPE, NON_LINK_TLBS) |
+                               sysmembar_value |
+                               HWVALUE(C56F, MEM_OP_A, TLB_INVALIDATE_TARGET_ADDR_LO, va_lo),
                     MEM_OP_B, HWVALUE(C56F, MEM_OP_B, TLB_INVALIDATE_TARGET_ADDR_HI, va_hi),
                     MEM_OP_C, HWCONST(C56F, MEM_OP_C, TLB_INVALIDATE_PDB, ONE) |
                               HWVALUE(C56F, MEM_OP_C, TLB_INVALIDATE_PDB_ADDR_LO, pdb_lo) |
@@ -352,21 +366,23 @@ void uvm_hal_ampere_host_tlb_invalidate_va(uvm_push_t *push,
                     MEM_OP_D, HWCONST(C56F, MEM_OP_D, OPERATION, MMU_TLB_INVALIDATE_TARGETED) |
                               HWVALUE(C56F, MEM_OP_D, TLB_INVALIDATE_PDB_ADDR_HI, pdb_hi));

-    uvm_hal_tlb_invalidate_membar(push, membar);
+    // GPU membar still requires an explicit membar method.
+    if (membar == UVM_MEMBAR_GPU)
+        gpu->parent->host_hal->membar_gpu(push);
 }

-// Copy from Pascal, this version sets TLB_INVALIDATE_INVAL_SCOPE.
+// Copy from Turing, this version sets TLB_INVALIDATE_INVAL_SCOPE.
 void uvm_hal_ampere_host_tlb_invalidate_test(uvm_push_t *push,
                                             uvm_gpu_phys_address_t pdb,
                                             UVM_TEST_INVALIDATE_TLB_PARAMS *params)
 {
    NvU32 ack_value = 0;
+    NvU32 sysmembar_value = 0;
    NvU32 invalidate_gpc_value = 0;
    NvU32 aperture_value = 0;
    NvU32 pdb_lo = 0;
    NvU32 pdb_hi = 0;
    NvU32 page_table_level = 0;
-    uvm_membar_t membar;

    UVM_ASSERT_MSG(pdb.aperture == UVM_APERTURE_VID || pdb.aperture == UVM_APERTURE_SYS, "aperture: %u", pdb.aperture);
    if (pdb.aperture == UVM_APERTURE_VID)
@@ -381,7 +397,7 @@ void uvm_hal_ampere_host_tlb_invalidate_test(uvm_push_t *push,
    pdb_hi = pdb.address >> HWSIZE(C56F, MEM_OP_C, TLB_INVALIDATE_PDB_ADDR_LO);

    if (params->page_table_level != UvmInvalidatePageTableLevelAll) {
-        // PDE3 is the highest level on Pascal, see the comment in
+        // PDE3 is the highest level on Pascal-Ampere, see the comment in
        // uvm_pascal_mmu.c for details.
        page_table_level = min((NvU32)UvmInvalidatePageTableLevelPde3, params->page_table_level) - 1;
    }
@@ -393,6 +409,11 @@ void uvm_hal_ampere_host_tlb_invalidate_test(uvm_push_t *push,
        ack_value = HWCONST(C56F, MEM_OP_C, TLB_INVALIDATE_ACK_TYPE, GLOBALLY);
    }

+    if (params->membar == UvmInvalidateTlbMemBarSys)
+        sysmembar_value = HWCONST(C56F, MEM_OP_A, TLB_INVALIDATE_SYSMEMBAR, EN);
+    else
+        sysmembar_value = HWCONST(C56F, MEM_OP_A, TLB_INVALIDATE_SYSMEMBAR, DIS);
+
    if (params->disable_gpc_invalidate)
        invalidate_gpc_value = HWCONST(C56F, MEM_OP_C, TLB_INVALIDATE_GPC, DISABLE);
    else
@@ -403,9 +424,9 @@ void uvm_hal_ampere_host_tlb_invalidate_test(uvm_push_t *push,

        NvU32 va_lo = va & HWMASK(C56F, MEM_OP_A, TLB_INVALIDATE_TARGET_ADDR_LO);
        NvU32 va_hi = va >> HWSIZE(C56F, MEM_OP_A, TLB_INVALIDATE_TARGET_ADDR_LO);
-        NV_PUSH_4U(C56F, MEM_OP_A, HWCONST(C56F, MEM_OP_A, TLB_INVALIDATE_SYSMEMBAR, DIS) |
-                                   HWVALUE(C56F, MEM_OP_A, TLB_INVALIDATE_TARGET_ADDR_LO, va_lo) |
-                                   HWCONST(C56F, MEM_OP_A, TLB_INVALIDATE_INVAL_SCOPE, NON_LINK_TLBS),
+        NV_PUSH_4U(C56F, MEM_OP_A, sysmembar_value |
+                                   HWCONST(C56F, MEM_OP_A, TLB_INVALIDATE_INVAL_SCOPE, NON_LINK_TLBS) |
+                                   HWVALUE(C56F, MEM_OP_A, TLB_INVALIDATE_TARGET_ADDR_LO, va_lo),
                         MEM_OP_B, HWVALUE(C56F, MEM_OP_B, TLB_INVALIDATE_TARGET_ADDR_HI, va_hi),
                         MEM_OP_C, HWCONST(C56F, MEM_OP_C, TLB_INVALIDATE_REPLAY, NONE) |
                                   HWVALUE(C56F, MEM_OP_C, TLB_INVALIDATE_PAGE_TABLE_LEVEL, page_table_level) |
@@ -418,7 +439,7 @@ void uvm_hal_ampere_host_tlb_invalidate_test(uvm_push_t *push,
                                   HWVALUE(C56F, MEM_OP_D, TLB_INVALIDATE_PDB_ADDR_HI, pdb_hi));
    }
    else {
-        NV_PUSH_4U(C56F, MEM_OP_A, HWCONST(C56F, MEM_OP_A, TLB_INVALIDATE_SYSMEMBAR, DIS) |
+        NV_PUSH_4U(C56F, MEM_OP_A, sysmembar_value |
                                   HWCONST(C56F, MEM_OP_A, TLB_INVALIDATE_INVAL_SCOPE, NON_LINK_TLBS),
                         MEM_OP_B, 0,
                         MEM_OP_C, HWCONST(C56F, MEM_OP_C, TLB_INVALIDATE_REPLAY, NONE) |
@@ -432,12 +453,7 @@ void uvm_hal_ampere_host_tlb_invalidate_test(uvm_push_t *push,
                                   HWVALUE(C56F, MEM_OP_D, TLB_INVALIDATE_PDB_ADDR_HI, pdb_hi));
    }

-    if (params->membar == UvmInvalidateTlbMemBarSys)
-        membar = UVM_MEMBAR_SYS;
-    else if (params->membar == UvmInvalidateTlbMemBarLocal)
-        membar = UVM_MEMBAR_GPU;
-    else
-        membar = UVM_MEMBAR_NONE;
-
-    uvm_hal_tlb_invalidate_membar(push, membar);
+    // GPU membar still requires an explicit membar method.
+    if (params->membar == UvmInvalidateTlbMemBarLocal)
+        uvm_push_get_gpu(push)->parent->host_hal->membar_gpu(push);
 }
--- a/kernel-open/nvidia-uvm/uvm_ampere_mmu.c
+++ b/kernel-open/nvidia-uvm/uvm_ampere_mmu.c
@@ -1,5 +1,5 @@
 /*******************************************************************************
-    Copyright (c) 2018-2020 NVIDIA Corporation
+    Copyright (c) 2018-2024 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -36,22 +36,7 @@
 #include "uvm_ampere_fault_buffer.h"
 #include "hwref/ampere/ga100/dev_fault.h"

-uvm_mmu_engine_type_t uvm_hal_ampere_mmu_engine_id_to_type(NvU16 mmu_engine_id)
-{
-    if (mmu_engine_id >= NV_PFAULT_MMU_ENG_ID_HOST0 && mmu_engine_id <= NV_PFAULT_MMU_ENG_ID_HOST31)
-        return UVM_MMU_ENGINE_TYPE_HOST;
-
-    if (mmu_engine_id >= NV_PFAULT_MMU_ENG_ID_CE0 && mmu_engine_id <= NV_PFAULT_MMU_ENG_ID_CE9)
-        return UVM_MMU_ENGINE_TYPE_CE;
-
-    // We shouldn't be servicing faults from any other engines
-    UVM_ASSERT_MSG(mmu_engine_id >= NV_PFAULT_MMU_ENG_ID_GRAPHICS && mmu_engine_id < NV_PFAULT_MMU_ENG_ID_BAR1,
-                   "Unexpected engine ID: 0x%x\n", mmu_engine_id);
-
-    return UVM_MMU_ENGINE_TYPE_GRAPHICS;
-}
-
-static NvU32 page_table_depth_ampere(NvU32 page_size)
+static NvU32 page_table_depth_ampere(NvU64 page_size)
 {
    // The common-case is page_size == UVM_PAGE_SIZE_2M, hence the first check
    if (page_size == UVM_PAGE_SIZE_2M)
@@ -62,14 +47,14 @@ static NvU32 page_table_depth_ampere(NvU32 page_size)
        return 4;
 }

-static NvU32 page_sizes_ampere(void)
+static NvU64 page_sizes_ampere(void)
 {
    return UVM_PAGE_SIZE_512M | UVM_PAGE_SIZE_2M | UVM_PAGE_SIZE_64K | UVM_PAGE_SIZE_4K;
 }

 static uvm_mmu_mode_hal_t ampere_mmu_mode_hal;

-uvm_mmu_mode_hal_t *uvm_hal_mmu_mode_ampere(NvU32 big_page_size)
+uvm_mmu_mode_hal_t *uvm_hal_mmu_mode_ampere(NvU64 big_page_size)
 {
    static bool initialized = false;

--- a/kernel-open/nvidia-uvm/uvm_ats.c
+++ b/kernel-open/nvidia-uvm/uvm_ats.c
@@ -1,5 +1,5 @@
 /*******************************************************************************
-    Copyright (c) 2018-2021 NVIDIA Corporation
+    Copyright (c) 2018-2024 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
--- a/kernel-open/nvidia-uvm/uvm_ats.h
+++ b/kernel-open/nvidia-uvm/uvm_ats.h
@@ -1,5 +1,5 @@
 /*******************************************************************************
-    Copyright (c) 2018-2021 NVIDIA Corporation
+    Copyright (c) 2018-2024 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -29,20 +29,9 @@
 #include "uvm_ats_ibm.h"
 #include "nv_uvm_types.h"
 #include "uvm_lock.h"
+#include "uvm_ats_sva.h"

-    #include "uvm_ats_sva.h"
-
-    #define UVM_ATS_SUPPORTED() (UVM_ATS_IBM_SUPPORTED() || UVM_ATS_SVA_SUPPORTED())
-
-// ATS prefetcher uses hmm_range_fault() to query residency information.
-// hmm_range_fault() needs CONFIG_HMM_MIRROR. To detect racing CPU invalidates
-// of memory regions while hmm_range_fault() is being called, MMU interval
-// notifiers are needed.
-    #if defined(CONFIG_HMM_MIRROR) && defined(NV_MMU_INTERVAL_NOTIFIER)
-        #define UVM_ATS_PREFETCH_SUPPORTED() 1
-    #else
-        #define UVM_ATS_PREFETCH_SUPPORTED() 0
-    #endif
+#define UVM_ATS_SUPPORTED() (UVM_ATS_IBM_SUPPORTED() || UVM_ATS_SVA_SUPPORTED())

 typedef struct
 {
--- a/kernel-open/nvidia-uvm/uvm_ats_faults.c
+++ b/kernel-open/nvidia-uvm/uvm_ats_faults.c
@@ -30,23 +30,36 @@
 #include <linux/mempolicy.h>
 #include <linux/mmu_notifier.h>

-#if UVM_ATS_PREFETCH_SUPPORTED()
+#if UVM_HMM_RANGE_FAULT_SUPPORTED()
 #include <linux/hmm.h>
 #endif

-static NV_STATUS service_ats_faults(uvm_gpu_va_space_t *gpu_va_space,
-                                    struct vm_area_struct *vma,
-                                    NvU64 start,
-                                    size_t length,
-                                    uvm_fault_access_type_t access_type,
-                                    uvm_ats_fault_context_t *ats_context)
+typedef enum
+{
+    UVM_ATS_SERVICE_TYPE_FAULTS = 0,
+    UVM_ATS_SERVICE_TYPE_ACCESS_COUNTERS,
+    UVM_ATS_SERVICE_TYPE_COUNT
+} uvm_ats_service_type_t;
+
+static NV_STATUS service_ats_requests(uvm_gpu_va_space_t *gpu_va_space,
+                                      struct vm_area_struct *vma,
+                                      NvU64 start,
+                                      size_t length,
+                                      uvm_fault_access_type_t access_type,
+                                      uvm_ats_service_type_t service_type,
+                                      uvm_ats_fault_context_t *ats_context)
 {
    uvm_va_space_t *va_space = gpu_va_space->va_space;
    struct mm_struct *mm = va_space->va_space_mm.mm;
-    bool write = (access_type >= UVM_FAULT_ACCESS_TYPE_WRITE);
    NV_STATUS status;
    NvU64 user_space_start;
    NvU64 user_space_length;
+    bool write = (access_type >= UVM_FAULT_ACCESS_TYPE_WRITE);
+    bool fault_service_type = (service_type == UVM_ATS_SERVICE_TYPE_FAULTS);
+    uvm_populate_permissions_t populate_permissions = fault_service_type ?
+                                            (write ? UVM_POPULATE_PERMISSIONS_WRITE : UVM_POPULATE_PERMISSIONS_ANY) :
+                                            UVM_POPULATE_PERMISSIONS_INHERIT;
+

    // Request uvm_migrate_pageable() to touch the corresponding page after
    // population.
@@ -83,10 +96,10 @@ static NV_STATUS service_ats_faults(uvm_gpu_va_space_t *gpu_va_space,
        .dst_node_id                    = ats_context->residency_node,
        .start                          = start,
        .length                         = length,
-        .populate_permissions           = write ? UVM_POPULATE_PERMISSIONS_WRITE : UVM_POPULATE_PERMISSIONS_ANY,
-        .touch                          = true,
-        .skip_mapped                    = true,
-        .populate_on_cpu_alloc_failures = true,
+        .populate_permissions           = populate_permissions,
+        .touch                          = fault_service_type,
+        .skip_mapped                    = fault_service_type,
+        .populate_on_cpu_alloc_failures = fault_service_type,
        .user_space_start               = &user_space_start,
        .user_space_length              = &user_space_length,
    };
@@ -107,26 +120,24 @@ static NV_STATUS service_ats_faults(uvm_gpu_va_space_t *gpu_va_space,
    return status;
 }

-static void flush_tlb_write_faults(uvm_gpu_va_space_t *gpu_va_space,
-                                   NvU64 addr,
-                                   size_t size,
-                                   uvm_fault_client_type_t client_type)
+static void flush_tlb_va_region(uvm_gpu_va_space_t *gpu_va_space,
+                                NvU64 addr,
+                                size_t size,
+                                uvm_fault_client_type_t client_type)
 {
    uvm_ats_fault_invalidate_t *ats_invalidate;

-    uvm_ats_smmu_invalidate_tlbs(gpu_va_space, addr, size);
-
    if (client_type == UVM_FAULT_CLIENT_TYPE_GPC)
        ats_invalidate = &gpu_va_space->gpu->parent->fault_buffer_info.replayable.ats_invalidate;
    else
        ats_invalidate = &gpu_va_space->gpu->parent->fault_buffer_info.non_replayable.ats_invalidate;

-    if (!ats_invalidate->write_faults_in_batch) {
-        uvm_tlb_batch_begin(&gpu_va_space->page_tables, &ats_invalidate->write_faults_tlb_batch);
-        ats_invalidate->write_faults_in_batch = true;
+    if (!ats_invalidate->tlb_batch_pending) {
+        uvm_tlb_batch_begin(&gpu_va_space->page_tables, &ats_invalidate->tlb_batch);
+        ats_invalidate->tlb_batch_pending = true;
    }

-    uvm_tlb_batch_invalidate(&ats_invalidate->write_faults_tlb_batch, addr, size, PAGE_SIZE, UVM_MEMBAR_NONE);
+    uvm_tlb_batch_invalidate(&ats_invalidate->tlb_batch, addr, size, PAGE_SIZE, UVM_MEMBAR_NONE);
 }

 static void ats_batch_select_residency(uvm_gpu_va_space_t *gpu_va_space,
@@ -192,7 +203,7 @@ done:
    ats_context->prefetch_state.has_preferred_location = false;
 #endif

-    ats_context->residency_id = gpu ? gpu->parent->id : UVM_ID_CPU;
+    ats_context->residency_id = gpu ? gpu->id : UVM_ID_CPU;
    ats_context->residency_node = residency;
 }

@@ -235,7 +246,7 @@ static uvm_va_block_region_t uvm_ats_region_from_vma(struct vm_area_struct *vma,
    return uvm_ats_region_from_start_end(start, end);
 }

-#if UVM_ATS_PREFETCH_SUPPORTED()
+#if UVM_HMM_RANGE_FAULT_SUPPORTED()

 static bool uvm_ats_invalidate_notifier(struct mmu_interval_notifier *mni, unsigned long cur_seq)
 {
@@ -273,12 +284,12 @@ static NV_STATUS ats_compute_residency_mask(uvm_gpu_va_space_t *gpu_va_space,
                                            uvm_ats_fault_context_t *ats_context)
 {
    NV_STATUS status = NV_OK;
+    uvm_page_mask_t *residency_mask = &ats_context->prefetch_state.residency_mask;

-#if UVM_ATS_PREFETCH_SUPPORTED()
+#if UVM_HMM_RANGE_FAULT_SUPPORTED()
    int ret;
    NvU64 start;
    NvU64 end;
-    uvm_page_mask_t *residency_mask = &ats_context->prefetch_state.residency_mask;
    struct hmm_range range;
    uvm_page_index_t page_index;
    uvm_va_block_region_t vma_region;
@@ -359,78 +370,83 @@ static NV_STATUS ats_compute_residency_mask(uvm_gpu_va_space_t *gpu_va_space,

    mmu_interval_notifier_remove(range.notifier);

+#else
+    uvm_page_mask_zero(residency_mask);
 #endif

    return status;
 }

-static void ats_expand_fault_region(uvm_gpu_va_space_t *gpu_va_space,
-                                    struct vm_area_struct *vma,
-                                    uvm_ats_fault_context_t *ats_context,
-                                    uvm_va_block_region_t max_prefetch_region,
-                                    uvm_page_mask_t *faulted_mask)
+static void ats_compute_prefetch_mask(uvm_gpu_va_space_t *gpu_va_space,
+                                      struct vm_area_struct *vma,
+                                      uvm_ats_fault_context_t *ats_context,
+                                      uvm_va_block_region_t max_prefetch_region)
 {
-    uvm_page_mask_t *read_fault_mask = &ats_context->read_fault_mask;
-    uvm_page_mask_t *write_fault_mask = &ats_context->write_fault_mask;
+    uvm_page_mask_t *accessed_mask = &ats_context->accessed_mask;
    uvm_page_mask_t *residency_mask = &ats_context->prefetch_state.residency_mask;
    uvm_page_mask_t *prefetch_mask = &ats_context->prefetch_state.prefetch_pages_mask;
    uvm_perf_prefetch_bitmap_tree_t *bitmap_tree = &ats_context->prefetch_state.bitmap_tree;

-    if (uvm_page_mask_empty(faulted_mask))
+    if (uvm_page_mask_empty(accessed_mask))
        return;

    uvm_perf_prefetch_compute_ats(gpu_va_space->va_space,
-                                  faulted_mask,
-                                  uvm_va_block_region_from_mask(NULL, faulted_mask),
+                                  accessed_mask,
+                                  uvm_va_block_region_from_mask(NULL, accessed_mask),
                                  max_prefetch_region,
                                  residency_mask,
                                  bitmap_tree,
                                  prefetch_mask);
-
-    uvm_page_mask_or(read_fault_mask, read_fault_mask, prefetch_mask);
-
-    if (vma->vm_flags & VM_WRITE)
-        uvm_page_mask_or(write_fault_mask, write_fault_mask, prefetch_mask);
 }

-static NV_STATUS ats_fault_prefetch(uvm_gpu_va_space_t *gpu_va_space,
-                                    struct vm_area_struct *vma,
-                                    NvU64 base,
-                                    uvm_ats_fault_context_t *ats_context)
+static NV_STATUS ats_compute_prefetch(uvm_gpu_va_space_t *gpu_va_space,
+                                      struct vm_area_struct *vma,
+                                      NvU64 base,
+                                      uvm_ats_service_type_t service_type,
+                                      uvm_ats_fault_context_t *ats_context)
 {
-    NV_STATUS status = NV_OK;
-    uvm_page_mask_t *read_fault_mask = &ats_context->read_fault_mask;
-    uvm_page_mask_t *write_fault_mask = &ats_context->write_fault_mask;
-    uvm_page_mask_t *faulted_mask = &ats_context->faulted_mask;
+    NV_STATUS status;
+    uvm_page_mask_t *accessed_mask = &ats_context->accessed_mask;
    uvm_page_mask_t *prefetch_mask = &ats_context->prefetch_state.prefetch_pages_mask;
    uvm_va_block_region_t max_prefetch_region = uvm_ats_region_from_vma(vma, base);

+    // Residency mask needs to be computed even if prefetching is disabled since
+    // the residency information is also needed by access counters servicing in
+    // uvm_ats_service_access_counters().
+    status = ats_compute_residency_mask(gpu_va_space, vma, base, ats_context);
+    if (status != NV_OK)
+        return status;
+
    if (!uvm_perf_prefetch_enabled(gpu_va_space->va_space))
        return status;

-    if (uvm_page_mask_empty(faulted_mask))
-        return status;
-
-    status = ats_compute_residency_mask(gpu_va_space, vma, base, ats_context);
-    if (status != NV_OK)
+    if (uvm_page_mask_empty(accessed_mask))
        return status;

    // Prefetch the entire region if none of the pages are resident on any node
    // and if preferred_location is the faulting GPU.
    if (ats_context->prefetch_state.has_preferred_location &&
-        ats_context->prefetch_state.first_touch &&
-        uvm_id_equal(ats_context->residency_id, gpu_va_space->gpu->parent->id)) {
+        (ats_context->prefetch_state.first_touch || (service_type == UVM_ATS_SERVICE_TYPE_ACCESS_COUNTERS)) &&
+        uvm_id_equal(ats_context->residency_id, gpu_va_space->gpu->id)) {

        uvm_page_mask_init_from_region(prefetch_mask, max_prefetch_region, NULL);
+    }
+    else {
+        ats_compute_prefetch_mask(gpu_va_space, vma, ats_context, max_prefetch_region);
+    }
+
+    if (service_type == UVM_ATS_SERVICE_TYPE_FAULTS) {
+        uvm_page_mask_t *read_fault_mask = &ats_context->read_fault_mask;
+        uvm_page_mask_t *write_fault_mask = &ats_context->write_fault_mask;
+
        uvm_page_mask_or(read_fault_mask, read_fault_mask, prefetch_mask);

        if (vma->vm_flags & VM_WRITE)
            uvm_page_mask_or(write_fault_mask, write_fault_mask, prefetch_mask);
-
-        return status;
    }
-
-    ats_expand_fault_region(gpu_va_space, vma, ats_context, max_prefetch_region, faulted_mask);
+    else {
+        uvm_page_mask_or(accessed_mask, accessed_mask, prefetch_mask);
+    }

    return status;
 }
@@ -448,6 +464,7 @@ NV_STATUS uvm_ats_service_faults(uvm_gpu_va_space_t *gpu_va_space,
    uvm_page_mask_t *faults_serviced_mask = &ats_context->faults_serviced_mask;
    uvm_page_mask_t *reads_serviced_mask = &ats_context->reads_serviced_mask;
    uvm_fault_client_type_t client_type = ats_context->client_type;
+    uvm_ats_service_type_t service_type = UVM_ATS_SERVICE_TYPE_FAULTS;

    UVM_ASSERT(vma);
    UVM_ASSERT(IS_ALIGNED(base, UVM_VA_BLOCK_SIZE));
@@ -456,6 +473,9 @@ NV_STATUS uvm_ats_service_faults(uvm_gpu_va_space_t *gpu_va_space,
    UVM_ASSERT(gpu_va_space->ats.enabled);
    UVM_ASSERT(uvm_gpu_va_space_state(gpu_va_space) == UVM_GPU_VA_SPACE_STATE_ACTIVE);

+    uvm_assert_mmap_lock_locked(vma->vm_mm);
+    uvm_assert_rwsem_locked(&gpu_va_space->va_space->lock);
+
    uvm_page_mask_zero(faults_serviced_mask);
    uvm_page_mask_zero(reads_serviced_mask);

@@ -481,7 +501,7 @@ NV_STATUS uvm_ats_service_faults(uvm_gpu_va_space_t *gpu_va_space,

    ats_batch_select_residency(gpu_va_space, vma, ats_context);

-    ats_fault_prefetch(gpu_va_space, vma, base, ats_context);
+    ats_compute_prefetch(gpu_va_space, vma, base, service_type, ats_context);

    for_each_va_block_subregion_in_mask(subregion, write_fault_mask, region) {
        NvU64 start = base + (subregion.first * PAGE_SIZE);
@@ -493,12 +513,13 @@ NV_STATUS uvm_ats_service_faults(uvm_gpu_va_space_t *gpu_va_space,
        UVM_ASSERT(start >= vma->vm_start);
        UVM_ASSERT((start + length) <= vma->vm_end);

-        status = service_ats_faults(gpu_va_space, vma, start, length, access_type, ats_context);
+        status = service_ats_requests(gpu_va_space, vma, start, length, access_type, service_type, ats_context);
        if (status != NV_OK)
            return status;

        if (vma->vm_flags & VM_WRITE) {
            uvm_page_mask_region_fill(faults_serviced_mask, subregion);
+            uvm_ats_smmu_invalidate_tlbs(gpu_va_space, start, length);

            // The Linux kernel never invalidates TLB entries on mapping
            // permission upgrade. This is a problem if the GPU has cached
@@ -509,7 +530,7 @@ NV_STATUS uvm_ats_service_faults(uvm_gpu_va_space_t *gpu_va_space,
            // infinite loop because we just forward the fault to the Linux
            // kernel and it will see that the permissions in the page table are
            // correct. Therefore, we flush TLB entries on ATS write faults.
-            flush_tlb_write_faults(gpu_va_space, start, length, client_type);
+            flush_tlb_va_region(gpu_va_space, start, length, client_type);
        }
        else {
            uvm_page_mask_region_fill(reads_serviced_mask, subregion);
@@ -527,11 +548,20 @@ NV_STATUS uvm_ats_service_faults(uvm_gpu_va_space_t *gpu_va_space,
        UVM_ASSERT(start >= vma->vm_start);
        UVM_ASSERT((start + length) <= vma->vm_end);

-        status = service_ats_faults(gpu_va_space, vma, start, length, access_type, ats_context);
+        status = service_ats_requests(gpu_va_space, vma, start, length, access_type, service_type, ats_context);
        if (status != NV_OK)
            return status;

        uvm_page_mask_region_fill(faults_serviced_mask, subregion);
+
+        // As in the permission upgrade scenario discussed above, the GPU
+        // will not re-fetch the entry if the PTE is invalid and the page size
+        // is 4K. To avoid an infinite faulting loop, explicitly invalidate the
+        // TLB for every new translation written, as is done for permission
+        // upgrades.
+        if (PAGE_SIZE == UVM_PAGE_SIZE_4K)
+            flush_tlb_va_region(gpu_va_space, start, length, client_type);
+
    }

    return status;
@@ -566,7 +596,7 @@ NV_STATUS uvm_ats_invalidate_tlbs(uvm_gpu_va_space_t *gpu_va_space,
    NV_STATUS status;
    uvm_push_t push;

-    if (!ats_invalidate->write_faults_in_batch)
+    if (!ats_invalidate->tlb_batch_pending)
        return NV_OK;

    UVM_ASSERT(gpu_va_space);
@@ -578,7 +608,7 @@ NV_STATUS uvm_ats_invalidate_tlbs(uvm_gpu_va_space_t *gpu_va_space,
                            "Invalidate ATS entries");

    if (status == NV_OK) {
-        uvm_tlb_batch_end(&ats_invalidate->write_faults_tlb_batch, &push, UVM_MEMBAR_NONE);
+        uvm_tlb_batch_end(&ats_invalidate->tlb_batch, &push, UVM_MEMBAR_NONE);
        uvm_push_end(&push);

        // Add this push to the GPU's tracker so that fault replays/clears can
@@ -586,7 +616,57 @@ NV_STATUS uvm_ats_invalidate_tlbs(uvm_gpu_va_space_t *gpu_va_space,
        status = uvm_tracker_add_push_safe(out_tracker, &push);
    }

-    ats_invalidate->write_faults_in_batch = false;
+    ats_invalidate->tlb_batch_pending = false;

    return status;
 }
+
+NV_STATUS uvm_ats_service_access_counters(uvm_gpu_va_space_t *gpu_va_space,
+                                          struct vm_area_struct *vma,
+                                          NvU64 base,
+                                          uvm_ats_fault_context_t *ats_context)
+{
+    uvm_va_block_region_t subregion;
+    uvm_va_block_region_t region = uvm_va_block_region(0, PAGES_PER_UVM_VA_BLOCK);
+    uvm_ats_service_type_t service_type = UVM_ATS_SERVICE_TYPE_ACCESS_COUNTERS;
+
+    UVM_ASSERT(vma);
+    UVM_ASSERT(IS_ALIGNED(base, UVM_VA_BLOCK_SIZE));
+    UVM_ASSERT(g_uvm_global.ats.enabled);
+    UVM_ASSERT(gpu_va_space);
+    UVM_ASSERT(gpu_va_space->ats.enabled);
+    UVM_ASSERT(uvm_gpu_va_space_state(gpu_va_space) == UVM_GPU_VA_SPACE_STATE_ACTIVE);
+
+    uvm_assert_mmap_lock_locked(vma->vm_mm);
+    uvm_assert_rwsem_locked(&gpu_va_space->va_space->lock);
+
+    ats_batch_select_residency(gpu_va_space, vma, ats_context);
+
+    // Ignoring the return value of ats_compute_prefetch is ok since prefetching
+    // is just an optimization and servicing access counter migrations is still
+    // worthwhile even without any prefetching added. So, let servicing continue
+    // instead of returning early even if the prefetch computation fails.
+    ats_compute_prefetch(gpu_va_space, vma, base, service_type, ats_context);
+
+    // Remove pages which are already resident at the intended destination from
+    // the accessed_mask.
+    uvm_page_mask_andnot(&ats_context->accessed_mask,
+                         &ats_context->accessed_mask,
+                         &ats_context->prefetch_state.residency_mask);
+
+    for_each_va_block_subregion_in_mask(subregion, &ats_context->accessed_mask, region) {
+        NV_STATUS status;
+        NvU64 start = base + (subregion.first * PAGE_SIZE);
+        size_t length = uvm_va_block_region_num_pages(subregion) * PAGE_SIZE;
+        uvm_fault_access_type_t access_type = UVM_FAULT_ACCESS_TYPE_COUNT;
+
+        UVM_ASSERT(start >= vma->vm_start);
+        UVM_ASSERT((start + length) <= vma->vm_end);
+
+        status = service_ats_requests(gpu_va_space, vma, start, length, access_type, service_type, ats_context);
+        if (status != NV_OK)
+            return status;
+    }
+
+    return NV_OK;
+}
--- a/kernel-open/nvidia-uvm/uvm_ats_faults.h
+++ b/kernel-open/nvidia-uvm/uvm_ats_faults.h
@@ -42,17 +42,37 @@
 // corresponding bit in read_fault_mask. These returned masks are only valid if
 // the return status is NV_OK. Status other than NV_OK indicate system global
 // fault servicing failures.
+//
+// LOCKING: The caller must retain and hold the mmap_lock and hold the va_space
+// lock.
 NV_STATUS uvm_ats_service_faults(uvm_gpu_va_space_t *gpu_va_space,
                                 struct vm_area_struct *vma,
                                 NvU64 base,
                                 uvm_ats_fault_context_t *ats_context);

+// Service access counter notifications on ATS regions in the range [base, base
+// + UVM_VA_BLOCK_SIZE) for the individual pages requested in
+// ats_context->accessed_mask. base must be aligned to UVM_VA_BLOCK_SIZE.
+// The caller is responsible for ensuring that the addresses in the
+// accessed_mask are completely covered by the VMA. The caller is also
+// responsible for handling any errors returned by this function.
+//
+// Returns NV_OK if servicing was successful. Any other error indicates an error
+// while servicing the range.
+//
+// LOCKING: The caller must retain and hold the mmap_lock and hold the va_space
+// lock.
+NV_STATUS uvm_ats_service_access_counters(uvm_gpu_va_space_t *gpu_va_space,
+                                          struct vm_area_struct *vma,
+                                          NvU64 base,
+                                          uvm_ats_fault_context_t *ats_context);
+
 // Return whether there are any VA ranges (and thus GMMU mappings) within the
 // UVM_GMMU_ATS_GRANULARITY-aligned region containing address.
 bool uvm_ats_check_in_gmmu_region(uvm_va_space_t *va_space, NvU64 address, uvm_va_range_t *next);

 // This function performs pending TLB invalidations for ATS and clears the
-// ats_invalidate->write_faults_in_batch flag
+// ats_invalidate->tlb_batch_pending flag
 NV_STATUS uvm_ats_invalidate_tlbs(uvm_gpu_va_space_t *gpu_va_space,
                                  uvm_ats_fault_invalidate_t *ats_invalidate,
                                  uvm_tracker_t *out_tracker);
--- a/kernel-open/nvidia-uvm/uvm_ats_sva.c
+++ b/kernel-open/nvidia-uvm/uvm_ats_sva.c
@@ -30,6 +30,7 @@
 #include "uvm_va_space_mm.h"

 #include <asm/io.h>
+#include <linux/log2.h>
 #include <linux/iommu.h>
 #include <linux/mm_types.h>
 #include <linux/acpi.h>
@@ -50,6 +51,12 @@
 #define UVM_IOMMU_SVA_BIND_DEVICE(dev, mm) iommu_sva_bind_device(dev, mm)
 #endif

+// Type to represent a 128-bit SMMU command queue command.
+struct smmu_cmd {
+    NvU64 low;
+    NvU64 high;
+};
+
 // Base address of SMMU CMDQ-V for GSMMU0.
 #define SMMU_CMDQV_BASE_ADDR(smmu_base) (smmu_base + 0x200000)
 #define SMMU_CMDQV_BASE_LEN 0x00830000
@@ -101,9 +108,9 @@
 // Base address offset for the VCMDQ registers.
 #define SMMU_VCMDQ_CMDQ_BASE 0x10000

-// Size of the command queue. Each command is 8 bytes and we can't
-// have a command queue greater than one page.
-#define SMMU_VCMDQ_CMDQ_BASE_LOG2SIZE 9
+// Size of the command queue. Each command is 16 bytes and we can't
+// have a command queue greater than one page in size.
+#define SMMU_VCMDQ_CMDQ_BASE_LOG2SIZE (PAGE_SHIFT - ilog2(sizeof(struct smmu_cmd)))
 #define SMMU_VCMDQ_CMDQ_ENTRIES (1UL << SMMU_VCMDQ_CMDQ_BASE_LOG2SIZE)

 // We always use VINTF63 for the WAR
@@ -175,7 +182,6 @@ static NV_STATUS uvm_ats_smmu_war_init(uvm_parent_gpu_t *parent_gpu)
    iowrite32((VINTF << SMMU_CMDQV_CMDQ_ALLOC_MAP_VIRT_INTF_INDX_SHIFT) | SMMU_CMDQV_CMDQ_ALLOC_MAP_ALLOC,
              smmu_cmdqv_base + SMMU_CMDQV_CMDQ_ALLOC_MAP(VCMDQ));

-    BUILD_BUG_ON((SMMU_VCMDQ_CMDQ_BASE_LOG2SIZE + 3) > PAGE_SHIFT);
    smmu_vcmdq_write64(smmu_cmdqv_base, SMMU_VCMDQ_CMDQ_BASE,
                       page_to_phys(parent_gpu->smmu_war.smmu_cmdq) | SMMU_VCMDQ_CMDQ_BASE_LOG2SIZE);
    smmu_vcmdq_write32(smmu_cmdqv_base, SMMU_VCMDQ_CONS, 0);
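The queue-size change above can be checked with a little arithmetic: with 16-byte commands and a one-page queue, the log2 entry count is the page shift minus log2 of the command size. The sketch below reproduces that calculation outside the kernel. It assumes 4 KiB pages and substitutes a hypothetical `sketch_ilog2()` for the kernel's `ilog2()`; all `sketch_*` names are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel supplies PAGE_SHIFT per architecture. */
#define SKETCH_PAGE_SHIFT 12

/* Mirrors the 128-bit command layout from the patch (two 64-bit words). */
struct sketch_smmu_cmd {
    uint64_t low;
    uint64_t high;
};

/* Integer log2 of a non-zero value; a stand-in for the kernel's ilog2(). */
static unsigned sketch_ilog2(size_t v)
{
    unsigned r = 0;

    while (v >>= 1)
        r++;

    return r;
}

/* log2 of the number of commands that fit in exactly one page. */
static unsigned sketch_cmdq_log2size(void)
{
    return SKETCH_PAGE_SHIFT - sketch_ilog2(sizeof(struct sketch_smmu_cmd));
}

/* Number of command slots in the one-page queue. */
static unsigned long sketch_cmdq_entries(void)
{
    return 1UL << sketch_cmdq_log2size();
}
```

Under these assumptions the result is 8, that is 256 entries of 16 bytes filling one 4 KiB page. The previous constant of 9 would describe 512 entries, which at 16 bytes each would span two pages. Deriving the value from `sizeof` also appears to make the removed `BUILD_BUG_ON` unnecessary, since the size can no longer exceed a page by construction.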
--- a/kernel-open/nvidia-uvm/uvm_ats_sva.h
+++ b/kernel-open/nvidia-uvm/uvm_ats_sva.h
@@ -53,10 +53,11 @@
        #define UVM_ATS_SVA_SUPPORTED() 0
    #endif

-// If NV_ARCH_INVALIDATE_SECONDARY_TLBS is defined it means the upstream fix is
-// in place so no need for the WAR from Bug 4130089: [GH180][r535] WAR for
-// kernel not issuing SMMU TLB invalidates on read-only
-#if defined(NV_ARCH_INVALIDATE_SECONDARY_TLBS)
+// If NV_MMU_NOTIFIER_OPS_HAS_ARCH_INVALIDATE_SECONDARY_TLBS is defined it
+// means the upstream fix is in place so no need for the WAR from
+// Bug 4130089: [GH180][r535] WAR for kernel not issuing SMMU TLB
+// invalidates on read-only
+#if defined(NV_MMU_NOTIFIER_OPS_HAS_ARCH_INVALIDATE_SECONDARY_TLBS)
    #define UVM_ATS_SMMU_WAR_REQUIRED() 0
 #elif NVCPU_IS_AARCH64
    #define UVM_ATS_SMMU_WAR_REQUIRED() 1
--- a/kernel-open/nvidia-uvm/uvm_blackwell.c
+++ b/kernel-open/nvidia-uvm/uvm_blackwell.c
@@ -0,0 +1,105 @@
+/*******************************************************************************
+    Copyright (c) 2022-2023 NVIDIA Corporation
+
+    Permission is hereby granted, free of charge, to any person obtaining a copy
+    of this software and associated documentation files (the "Software"), to
+    deal in the Software without restriction, including without limitation the
+    rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+    sell copies of the Software, and to permit persons to whom the Software is
+    furnished to do so, subject to the following conditions:
+
+        The above copyright notice and this permission notice shall be
+        included in all copies or substantial portions of the Software.
+
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+    THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+    FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+    DEALINGS IN THE SOFTWARE.
+
+*******************************************************************************/
+
+#include "uvm_global.h"
+#include "uvm_hal.h"
+#include "uvm_gpu.h"
+#include "uvm_mem.h"
+#include "uvm_blackwell_fault_buffer.h"
+
+void uvm_hal_blackwell_arch_init_properties(uvm_parent_gpu_t *parent_gpu)
+{
+    parent_gpu->tlb_batch.va_invalidate_supported = true;
+
+    parent_gpu->tlb_batch.va_range_invalidate_supported = true;
+
+    // TODO: Bug 1767241: Run benchmarks to figure out a good number
+    parent_gpu->tlb_batch.max_ranges = 8;
+
+    parent_gpu->utlb_per_gpc_count = uvm_blackwell_get_utlbs_per_gpc(parent_gpu);
+
+    parent_gpu->fault_buffer_info.replayable.utlb_count = parent_gpu->rm_info.maxGpcCount *
+                                                          parent_gpu->utlb_per_gpc_count;
+    {
+        uvm_fault_buffer_entry_t *dummy;
+        UVM_ASSERT(parent_gpu->fault_buffer_info.replayable.utlb_count <= (1 <<
+                                                                           (sizeof(dummy->fault_source.utlb_id) * 8)));
+    }
+
+    // A single top level PDE on Blackwell covers 64 PB and that's the minimum
+    // size that can be used.
+    parent_gpu->rm_va_base = 0;
+    parent_gpu->rm_va_size = 64 * UVM_SIZE_1PB;
+
+    parent_gpu->uvm_mem_va_base = parent_gpu->rm_va_size + 384 * UVM_SIZE_1TB;
+    parent_gpu->uvm_mem_va_size = UVM_MEM_VA_SIZE;
+
+    // See uvm_mmu.h for mapping placement
+    parent_gpu->flat_vidmem_va_base = (64 * UVM_SIZE_1PB) + (32 * UVM_SIZE_1TB);
+
+    // TODO: Bug 3953852: Set this to true pending Blackwell changes
+    parent_gpu->ce_phys_vidmem_write_supported = !uvm_parent_gpu_is_coherent(parent_gpu);
+
+    parent_gpu->peer_copy_mode = g_uvm_global.peer_copy_mode;
+
+    // All GR context buffers may be mapped to 57b wide VAs. All "compute" units
+    // accessing GR context buffers support the 57-bit VA range.
+    parent_gpu->max_channel_va = 1ull << 57;
+
+    parent_gpu->max_host_va = 1ull << 57;
+
+    // Blackwell can map sysmem with any page size
+    parent_gpu->can_map_sysmem_with_large_pages = true;
+
+    // Prefetch instructions will generate faults
+    parent_gpu->prefetch_fault_supported = true;
+
+    // Blackwell can place GPFIFO in vidmem
+    parent_gpu->gpfifo_in_vidmem_supported = true;
+
+    parent_gpu->replayable_faults_supported = true;
+
+    parent_gpu->non_replayable_faults_supported = true;
+
+    parent_gpu->access_counters_supported = true;
+
+    parent_gpu->access_counters_can_use_physical_addresses = false;
+
+    parent_gpu->fault_cancel_va_supported = true;
+
+    parent_gpu->scoped_atomics_supported = true;
+
+    parent_gpu->has_clear_faulted_channel_sw_method = true;
+
+    parent_gpu->has_clear_faulted_channel_method = false;
+
+    parent_gpu->smc.supported = true;
+
+    parent_gpu->sparse_mappings_supported = true;
+
+    parent_gpu->map_remap_larger_page_promotion = false;
+
+    parent_gpu->plc_supported = true;
+
+    parent_gpu->no_ats_range_required = true;
+}
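The address-space constants set in `uvm_hal_blackwell_arch_init_properties()` are easier to follow as plain numbers. The sketch below recomputes them, assuming that `UVM_SIZE_1TB` is 2^40 bytes and `UVM_SIZE_1PB` is 2^50 bytes (their definitions are not part of this diff). The `sketch_*` names are hypothetical.

```c
#include <stdint.h>

/* Assumed unit sizes; the driver's own UVM_SIZE_* macros are not shown here. */
#define SKETCH_SIZE_1TB (1ULL << 40)
#define SKETCH_SIZE_1PB (1ULL << 50)

/* RM owns the first top-level PDE: 64 PB starting at address 0. */
static uint64_t sketch_rm_va_size(void)
{
    return 64 * SKETCH_SIZE_1PB;
}

/* UVM's internal memory range starts 384 TB above the RM range. */
static uint64_t sketch_uvm_mem_va_base(void)
{
    return sketch_rm_va_size() + 384 * SKETCH_SIZE_1TB;
}

/* The flat vidmem mapping starts 32 TB above the RM range. */
static uint64_t sketch_flat_vidmem_va_base(void)
{
    return 64 * SKETCH_SIZE_1PB + 32 * SKETCH_SIZE_1TB;
}

/* Upper bound of the 57-bit channel and host VA range. */
static uint64_t sketch_max_va(void)
{
    return 1ULL << 57;
}
```

Under these assumptions 64 PB equals 2^56 bytes, which is half of the 57-bit space, so both UVM bases land in the upper half and stay below `1ull << 57`.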
--- a/kernel-open/nvidia-uvm/uvm_blackwell_fault_buffer.c
+++ b/kernel-open/nvidia-uvm/uvm_blackwell_fault_buffer.c
@@ -0,0 +1,122 @@
+/*******************************************************************************
+    Copyright (c) 2023-2024 NVIDIA Corporation
+
+    Permission is hereby granted, free of charge, to any person obtaining a copy
+    of this software and associated documentation files (the "Software"), to
+    deal in the Software without restriction, including without limitation the
+    rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+    sell copies of the Software, and to permit persons to whom the Software is
+    furnished to do so, subject to the following conditions:
+
+        The above copyright notice and this permission notice shall be
+        included in all copies or substantial portions of the Software.
+
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+    THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+    FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+    DEALINGS IN THE SOFTWARE.
+
+*******************************************************************************/
+
+#include "uvm_linux.h"
+#include "uvm_global.h"
+#include "uvm_gpu.h"
+#include "uvm_hal.h"
+#include "uvm_hal_types.h"
+#include "hwref/blackwell/gb100/dev_fault.h"
+#include "clc369.h"
+
+// NV_PFAULT_FAULT_TYPE_COMPRESSION_FAILURE fault type is deprecated on
+// Blackwell.
+uvm_fault_type_t uvm_hal_blackwell_fault_buffer_get_fault_type(const NvU32 *fault_entry)
+{
+    NvU32 hw_fault_type_value = READ_HWVALUE_MW(fault_entry, C369, BUF_ENTRY, FAULT_TYPE);
+
+    switch (hw_fault_type_value) {
+        case NV_PFAULT_FAULT_TYPE_PDE:
+            return UVM_FAULT_TYPE_INVALID_PDE;
+        case NV_PFAULT_FAULT_TYPE_PTE:
+            return UVM_FAULT_TYPE_INVALID_PTE;
+        case NV_PFAULT_FAULT_TYPE_RO_VIOLATION:
+            return UVM_FAULT_TYPE_WRITE;
+        case NV_PFAULT_FAULT_TYPE_ATOMIC_VIOLATION:
+            return UVM_FAULT_TYPE_ATOMIC;
+        case NV_PFAULT_FAULT_TYPE_WO_VIOLATION:
+            return UVM_FAULT_TYPE_READ;
+
+        case NV_PFAULT_FAULT_TYPE_PDE_SIZE:
+            return UVM_FAULT_TYPE_PDE_SIZE;
+        case NV_PFAULT_FAULT_TYPE_VA_LIMIT_VIOLATION:
+            return UVM_FAULT_TYPE_VA_LIMIT_VIOLATION;
+        case NV_PFAULT_FAULT_TYPE_UNBOUND_INST_BLOCK:
+            return UVM_FAULT_TYPE_UNBOUND_INST_BLOCK;
+        case NV_PFAULT_FAULT_TYPE_PRIV_VIOLATION:
+            return UVM_FAULT_TYPE_PRIV_VIOLATION;
+        case NV_PFAULT_FAULT_TYPE_PITCH_MASK_VIOLATION:
+            return UVM_FAULT_TYPE_PITCH_MASK_VIOLATION;
+        case NV_PFAULT_FAULT_TYPE_WORK_CREATION:
+            return UVM_FAULT_TYPE_WORK_CREATION;
+        case NV_PFAULT_FAULT_TYPE_UNSUPPORTED_APERTURE:
+            return UVM_FAULT_TYPE_UNSUPPORTED_APERTURE;
+        case NV_PFAULT_FAULT_TYPE_CC_VIOLATION:
+            return UVM_FAULT_TYPE_CC_VIOLATION;
+        case NV_PFAULT_FAULT_TYPE_UNSUPPORTED_KIND:
+            return UVM_FAULT_TYPE_UNSUPPORTED_KIND;
+        case NV_PFAULT_FAULT_TYPE_REGION_VIOLATION:
+            return UVM_FAULT_TYPE_REGION_VIOLATION;
+        case NV_PFAULT_FAULT_TYPE_POISONED:
+            return UVM_FAULT_TYPE_POISONED;
+    }
+
+    UVM_ASSERT_MSG(false, "Invalid fault type value: %u\n", hw_fault_type_value);
+
+    return UVM_FAULT_TYPE_COUNT;
+}
+
+static bool client_id_ce(NvU16 client_id)
+{
+    if (client_id >= NV_PFAULT_CLIENT_HUB_HSCE0 && client_id <= NV_PFAULT_CLIENT_HUB_HSCE7)
+        return true;
+
+    switch (client_id) {
+        case NV_PFAULT_CLIENT_HUB_CE0:
+        case NV_PFAULT_CLIENT_HUB_CE1:
+        case NV_PFAULT_CLIENT_HUB_CE2:
+        case NV_PFAULT_CLIENT_HUB_CE3:
+            return true;
+    }
+
+    return false;
+}
+
+uvm_mmu_engine_type_t uvm_hal_blackwell_fault_buffer_get_mmu_engine_type(NvU16 mmu_engine_id,
+                                                                         uvm_fault_client_type_t client_type,
+                                                                         NvU16 client_id)
+{
+    // Servicing CE and Host (HUB clients) faults.
+    if (client_type == UVM_FAULT_CLIENT_TYPE_HUB) {
+        if (client_id_ce(client_id)) {
+            UVM_ASSERT(mmu_engine_id >= NV_PFAULT_MMU_ENG_ID_CE0 && mmu_engine_id <= NV_PFAULT_MMU_ENG_ID_CE19);
+
+            return UVM_MMU_ENGINE_TYPE_CE;
+        }
+
+        if (client_id == NV_PFAULT_CLIENT_HUB_HOST ||
+            (client_id >= NV_PFAULT_CLIENT_HUB_ESC0 && client_id <= NV_PFAULT_CLIENT_HUB_ESC11)) {
+            UVM_ASSERT((mmu_engine_id >= NV_PFAULT_MMU_ENG_ID_HOST0 && mmu_engine_id <= NV_PFAULT_MMU_ENG_ID_HOST44) ||
+                       (mmu_engine_id >= NV_PFAULT_MMU_ENG_ID_GRAPHICS));
+
+            return UVM_MMU_ENGINE_TYPE_HOST;
+        }
+    }
+
+    // We shouldn't be servicing faults from any engines other than GR.
+    UVM_ASSERT_MSG(client_id <= NV_PFAULT_CLIENT_GPC_ROP_3, "Unexpected client ID: 0x%x\n", client_id);
+    UVM_ASSERT_MSG(mmu_engine_id >= NV_PFAULT_MMU_ENG_ID_GRAPHICS, "Unexpected engine ID: 0x%x\n", mmu_engine_id);
+    UVM_ASSERT(client_type == UVM_FAULT_CLIENT_TYPE_GPC);
+
+    return UVM_MMU_ENGINE_TYPE_GRAPHICS;
+}
--- a/kernel-open/nvidia-uvm/uvm_blackwell_fault_buffer.h
+++ b/kernel-open/nvidia-uvm/uvm_blackwell_fault_buffer.h
@@ -0,0 +1,92 @@
+/*******************************************************************************
+    Copyright (c) 2022 NVIDIA Corporation
+
+    Permission is hereby granted, free of charge, to any person obtaining a copy
+    of this software and associated documentation files (the "Software"), to
+    deal in the Software without restriction, including without limitation the
+    rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+    sell copies of the Software, and to permit persons to whom the Software is
+    furnished to do so, subject to the following conditions:
+
+        The above copyright notice and this permission notice shall be
+        included in all copies or substantial portions of the Software.
+
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+    THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+    FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+    DEALINGS IN THE SOFTWARE.
+
+*******************************************************************************/
+
+#ifndef __UVM_HAL_BLACKWELL_FAULT_BUFFER_H__
+#define __UVM_HAL_BLACKWELL_FAULT_BUFFER_H__
+
+#include "nvtypes.h"
+#include "uvm_common.h"
+#include "uvm_gpu.h"
+
+// There are up to 10 TPCs per GPC in Blackwell, and there are 2 LTP uTLBs per
+// TPC. Besides, there is one active RGG uTLB per GPC. Each TPC has a number of
+// clients that can make requests to its uTLBs: 1xTPCCS, 1xPE, 2xT1. Requests
+// from these units are routed as follows to the 2 LTP uTLBs:
+//
+// --------                    ---------
+// | T1_0 | -----------------> | uTLB0 |
+// --------                    ---------
+//
+// --------                    ---------
+// | T1_1 | -----------------> | uTLB1 |
+// --------          --------> ---------
+//                   |             ^
+// -------           |             |
+// | PE  | -----------             |
+// -------                         |
+//                                 |
+// ---------                       |
+// | TPCCS | -----------------------
+// ---------
+//
+//
+// The client ids are local to their GPC and the id mapping is linear across
+// TPCs: TPC_n has TPCCS_n, PE_n, T1_p, and T1_q, where p=2*n and q=p+1.
+//
+// NV_PFAULT_CLIENT_GPC_LTP_UTLB_n and NV_PFAULT_CLIENT_GPC_RGG_UTLB enums can
+// be ignored. These will never be reported in a fault message, and should
+// never be used in an invalidate. Therefore, we define our own values.
+typedef enum {
+    UVM_BLACKWELL_GPC_UTLB_ID_RGG = 0,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP0 = 1,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP1 = 2,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP2 = 3,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP3 = 4,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP4 = 5,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP5 = 6,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP6 = 7,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP7 = 8,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP8 = 9,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP9 = 10,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP10 = 11,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP11 = 12,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP12 = 13,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP13 = 14,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP14 = 15,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP15 = 16,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP16 = 17,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP17 = 18,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP18 = 19,
+    UVM_BLACKWELL_GPC_UTLB_ID_LTP19 = 20,
+
+    UVM_BLACKWELL_GPC_UTLB_COUNT,
+} uvm_blackwell_gpc_utlb_id_t;
+
+static NvU32 uvm_blackwell_get_utlbs_per_gpc(uvm_parent_gpu_t *parent_gpu)
+{
+    NvU32 utlbs = parent_gpu->rm_info.maxTpcPerGpcCount * 2 + 1;
+    UVM_ASSERT(utlbs <= UVM_BLACKWELL_GPC_UTLB_COUNT);
+    return utlbs;
+}
+
+#endif
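The counting rules in the header comment reduce to two small formulas: each GPC has two LTP uTLBs per TPC plus one RGG uTLB, and TPC_n owns the T1 clients with indices 2n and 2n + 1. The sketch below restates them with hypothetical `sketch_*` names; the maximum of 10 TPCs per GPC is taken from the comment, not queried from RM as the driver does.

```c
#include <stdint.h>

/* Values taken from the header comment above; names are illustrative. */
#define SKETCH_MAX_TPC_PER_GPC   10
#define SKETCH_LTP_UTLBS_PER_TPC 2
#define SKETCH_RGG_UTLBS_PER_GPC 1

/* Total uTLBs in one GPC for a given TPC count. */
static uint32_t sketch_utlbs_per_gpc(uint32_t tpc_count)
{
    return tpc_count * SKETCH_LTP_UTLBS_PER_TPC + SKETCH_RGG_UTLBS_PER_GPC;
}

/* GPC-local indices of the two T1 clients of TPC n: p = 2n and q = p + 1. */
static void sketch_t1_clients(uint32_t tpc, uint32_t *p, uint32_t *q)
{
    *p = 2 * tpc;
    *q = *p + 1;
}
```

With the maximum of 10 TPCs this gives 21 uTLBs, which matches the number of entries in the enum (RGG plus LTP0 through LTP19) and therefore the bound asserted in `uvm_blackwell_get_utlbs_per_gpc()`.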
--- a/kernel-open/nvidia-uvm/uvm_blackwell_host.c
+++ b/kernel-open/nvidia-uvm/uvm_blackwell_host.c
@@ -0,0 +1,256 @@
+/*******************************************************************************
+    Copyright (c) 2024 NVIDIA Corporation
+
+    Permission is hereby granted, free of charge, to any person obtaining a copy
+    of this software and associated documentation files (the "Software"), to
+    deal in the Software without restriction, including without limitation the
+    rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+    sell copies of the Software, and to permit persons to whom the Software is
+    furnished to do so, subject to the following conditions:
+
+        The above copyright notice and this permission notice shall be
+        included in all copies or substantial portions of the Software.
+
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+    THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+    FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+    DEALINGS IN THE SOFTWARE.
+
+*******************************************************************************/
+
+#include "uvm_hal.h"
+#include "uvm_push.h"
+#include "uvm_push_macros.h"
+#include "clc96f.h"
+
+// TODO: Bug 3210931: Rename HOST references and files to ESCHED.
+
+void uvm_hal_blackwell_host_tlb_invalidate_all(uvm_push_t *push,
+                                               uvm_gpu_phys_address_t pdb,
+                                               NvU32 depth,
+                                               uvm_membar_t membar)
+{
+    NvU32 aperture_value;
+    NvU32 page_table_level;
+    NvU32 pdb_lo;
+    NvU32 pdb_hi;
+    NvU32 ack_value = 0;
+    NvU32 sysmembar_value = 0;
+
+    UVM_ASSERT_MSG(pdb.aperture == UVM_APERTURE_VID || pdb.aperture == UVM_APERTURE_SYS, "aperture: %u", pdb.aperture);
+
+    if (pdb.aperture == UVM_APERTURE_VID)
+        aperture_value = HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_PDB_APERTURE, VID_MEM);
+    else
+        aperture_value = HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_PDB_APERTURE, SYS_MEM_COHERENT);
+
+    UVM_ASSERT_MSG(IS_ALIGNED(pdb.address, 1 << 12), "pdb 0x%llx\n", pdb.address);
+    pdb.address >>= 12;
+
+    pdb_lo = pdb.address & HWMASK(C96F, MEM_OP_C, TLB_INVALIDATE_PDB_ADDR_LO);
+    pdb_hi = pdb.address >> HWSIZE(C96F, MEM_OP_C, TLB_INVALIDATE_PDB_ADDR_LO);
+
+    // PDE4 is the highest level on Blackwell, see the comment in
+    // uvm_blackwell_mmu.c for details.
+    UVM_ASSERT_MSG(depth < NVC96F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE4, "depth %u", depth);
+    page_table_level = NVC96F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE4 - depth;
+
+    if (membar != UVM_MEMBAR_NONE)
+        ack_value = HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_ACK_TYPE, GLOBALLY);
+
+    if (membar == UVM_MEMBAR_SYS)
+        sysmembar_value = HWCONST(C96F, MEM_OP_A, TLB_INVALIDATE_SYSMEMBAR, EN);
+    else
+        sysmembar_value = HWCONST(C96F, MEM_OP_A, TLB_INVALIDATE_SYSMEMBAR, DIS);
+
+    NV_PUSH_4U(C96F, MEM_OP_A, sysmembar_value |
+                               HWCONST(C96F, MEM_OP_A, TLB_INVALIDATE_INVAL_SCOPE, NON_LINK_TLBS),
+                     MEM_OP_B, 0,
+                     MEM_OP_C, HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_PDB, ONE) |
+                               HWVALUE(C96F, MEM_OP_C, TLB_INVALIDATE_PDB_ADDR_LO, pdb_lo) |
+                               HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_GPC, ENABLE) |
+                               HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_REPLAY, NONE) |
+                               HWVALUE(C96F, MEM_OP_C, TLB_INVALIDATE_PAGE_TABLE_LEVEL, page_table_level) |
+                               aperture_value |
+                               ack_value,
+                     MEM_OP_D, HWCONST(C96F, MEM_OP_D, OPERATION, MMU_TLB_INVALIDATE) |
+                               HWVALUE(C96F, MEM_OP_D, TLB_INVALIDATE_PDB_ADDR_HI, pdb_hi));
+}
+
+void uvm_hal_blackwell_host_tlb_invalidate_va(uvm_push_t *push,
+                                              uvm_gpu_phys_address_t pdb,
+                                              NvU32 depth,
+                                              NvU64 base,
+                                              NvU64 size,
+                                              NvU64 page_size,
+                                              uvm_membar_t membar)
+{
+    NvU32 aperture_value;
+    NvU32 page_table_level;
+    NvU32 pdb_lo;
+    NvU32 pdb_hi;
+    NvU32 ack_value = 0;
+    NvU32 sysmembar_value = 0;
+    NvU32 va_lo;
+    NvU32 va_hi;
+    NvU64 end;
+    NvU64 actual_base;
+    NvU64 actual_size;
+    NvU64 actual_end;
+    NvU32 log2_invalidation_size;
+    uvm_gpu_t *gpu = uvm_push_get_gpu(push);
+
+    UVM_ASSERT_MSG(IS_ALIGNED(page_size, 1 << 12), "page_size 0x%llx\n", page_size);
+    UVM_ASSERT_MSG(IS_ALIGNED(base, page_size), "base 0x%llx page_size 0x%llx\n", base, page_size);
+    UVM_ASSERT_MSG(IS_ALIGNED(size, page_size), "size 0x%llx page_size 0x%llx\n", size, page_size);
+    UVM_ASSERT_MSG(size > 0, "size 0x%llx\n", size);
+
+    // The invalidation size must be a power-of-two number of pages containing
+    // the passed interval
+    end = base + size - 1;
+    log2_invalidation_size = __fls((unsigned long)(end ^ base)) + 1;
+
+    if (log2_invalidation_size == 64) {
+        // Invalidate everything
+        gpu->parent->host_hal->tlb_invalidate_all(push, pdb, depth, membar);
+        return;
+    }
+
+    // The hardware aligns the target address down to the invalidation size.
+    actual_size = 1ULL << log2_invalidation_size;
+    actual_base = UVM_ALIGN_DOWN(base, actual_size);
+    actual_end = actual_base + actual_size - 1;
+    UVM_ASSERT(actual_end >= end);
+
+    // The invalidation size field expects log2(invalidation size in 4K), not
+    // log2(invalidation size in bytes)
+    log2_invalidation_size -= 12;
+
+    // Address to invalidate, as a multiple of 4K.
+    base >>= 12;
+    va_lo = base & HWMASK(C96F, MEM_OP_A, TLB_INVALIDATE_TARGET_ADDR_LO);
+    va_hi = base >> HWSIZE(C96F, MEM_OP_A, TLB_INVALIDATE_TARGET_ADDR_LO);
+
+    UVM_ASSERT_MSG(pdb.aperture == UVM_APERTURE_VID || pdb.aperture == UVM_APERTURE_SYS, "aperture: %u", pdb.aperture);
+
+    if (pdb.aperture == UVM_APERTURE_VID)
+        aperture_value = HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_PDB_APERTURE, VID_MEM);
+    else
+        aperture_value = HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_PDB_APERTURE, SYS_MEM_COHERENT);
+
+    UVM_ASSERT_MSG(IS_ALIGNED(pdb.address, 1 << 12), "pdb 0x%llx\n", pdb.address);
+    pdb.address >>= 12;
+
+    pdb_lo = pdb.address & HWMASK(C96F, MEM_OP_C, TLB_INVALIDATE_PDB_ADDR_LO);
+    pdb_hi = pdb.address >> HWSIZE(C96F, MEM_OP_C, TLB_INVALIDATE_PDB_ADDR_LO);
+
+    // PDE4 is the highest level on Blackwell, see the comment in
+    // uvm_blackwell_mmu.c for details.
+    UVM_ASSERT_MSG(depth < NVC96F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE4, "depth %u", depth);
+    page_table_level = NVC96F_MEM_OP_C_TLB_INVALIDATE_PAGE_TABLE_LEVEL_UP_TO_PDE4 - depth;
+
+    if (membar != UVM_MEMBAR_NONE)
+        ack_value = HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_ACK_TYPE, GLOBALLY);
+
+    if (membar == UVM_MEMBAR_SYS)
+        sysmembar_value = HWCONST(C96F, MEM_OP_A, TLB_INVALIDATE_SYSMEMBAR, EN);
+    else
+        sysmembar_value = HWCONST(C96F, MEM_OP_A, TLB_INVALIDATE_SYSMEMBAR, DIS);
+
+    NV_PUSH_4U(C96F, MEM_OP_A, HWVALUE(C96F, MEM_OP_A, TLB_INVALIDATE_INVALIDATION_SIZE, log2_invalidation_size) |
+                               sysmembar_value |
+                               HWCONST(C96F, MEM_OP_A, TLB_INVALIDATE_INVAL_SCOPE, NON_LINK_TLBS) |
+                               HWVALUE(C96F, MEM_OP_A, TLB_INVALIDATE_TARGET_ADDR_LO, va_lo),
+                     MEM_OP_B, HWVALUE(C96F, MEM_OP_B, TLB_INVALIDATE_TARGET_ADDR_HI, va_hi),
+                     MEM_OP_C, HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_PDB, ONE) |
+                               HWVALUE(C96F, MEM_OP_C, TLB_INVALIDATE_PDB_ADDR_LO, pdb_lo) |
+                               HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_GPC, ENABLE) |
+                               HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_REPLAY, NONE) |
+                               HWVALUE(C96F, MEM_OP_C, TLB_INVALIDATE_PAGE_TABLE_LEVEL, page_table_level) |
+                               aperture_value |
+                               ack_value,
+                     MEM_OP_D, HWCONST(C96F, MEM_OP_D, OPERATION, MMU_TLB_INVALIDATE_TARGETED) |
+                               HWVALUE(C96F, MEM_OP_D, TLB_INVALIDATE_PDB_ADDR_HI, pdb_hi));
+}
+
+void uvm_hal_blackwell_host_tlb_invalidate_test(uvm_push_t *push,
+                                                uvm_gpu_phys_address_t pdb,
+                                                UVM_TEST_INVALIDATE_TLB_PARAMS *params)
+{
+    NvU32 ack_value = 0;
+    NvU32 sysmembar_value = 0;
+    NvU32 invalidate_gpc_value = 0;
+    NvU32 aperture_value = 0;
+    NvU32 pdb_lo = 0;
+    NvU32 pdb_hi = 0;
+    NvU32 page_table_level = 0;
+
+    UVM_ASSERT_MSG(pdb.aperture == UVM_APERTURE_VID || pdb.aperture == UVM_APERTURE_SYS, "aperture: %u", pdb.aperture);
+    if (pdb.aperture == UVM_APERTURE_VID)
+        aperture_value = HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_PDB_APERTURE, VID_MEM);
+    else
+        aperture_value = HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_PDB_APERTURE, SYS_MEM_COHERENT);
+
+    UVM_ASSERT_MSG(IS_ALIGNED(pdb.address, 1 << 12), "pdb 0x%llx\n", pdb.address);
+    pdb.address >>= 12;
+
+    pdb_lo = pdb.address & HWMASK(C96F, MEM_OP_C, TLB_INVALIDATE_PDB_ADDR_LO);
+    pdb_hi = pdb.address >> HWSIZE(C96F, MEM_OP_C, TLB_INVALIDATE_PDB_ADDR_LO);
+
+    if (params->page_table_level != UvmInvalidatePageTableLevelAll) {
+        // PDE4 is the highest level on Blackwell, see the comment in
+        // uvm_blackwell_mmu.c for details.
+        page_table_level = min((NvU32)UvmInvalidatePageTableLevelPde4, params->page_table_level) - 1;
+    }
+
+    if (params->membar != UvmInvalidateTlbMemBarNone)
+        ack_value = HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_ACK_TYPE, GLOBALLY);
+
+    if (params->membar == UvmInvalidateTlbMemBarSys)
+        sysmembar_value = HWCONST(C96F, MEM_OP_A, TLB_INVALIDATE_SYSMEMBAR, EN);
+    else
+        sysmembar_value = HWCONST(C96F, MEM_OP_A, TLB_INVALIDATE_SYSMEMBAR, DIS);
+
+    if (params->disable_gpc_invalidate)
+        invalidate_gpc_value = HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_GPC, DISABLE);
+    else
+        invalidate_gpc_value = HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_GPC, ENABLE);
+
+    if (params->target_va_mode == UvmTargetVaModeTargeted) {
+        NvU64 va = params->va >> 12;
+
+        NvU32 va_lo = va & HWMASK(C96F, MEM_OP_A, TLB_INVALIDATE_TARGET_ADDR_LO);
+        NvU32 va_hi = va >> HWSIZE(C96F, MEM_OP_A, TLB_INVALIDATE_TARGET_ADDR_LO);
+
+        NV_PUSH_4U(C96F, MEM_OP_A, sysmembar_value |
+                                   HWCONST(C96F, MEM_OP_A, TLB_INVALIDATE_INVAL_SCOPE, NON_LINK_TLBS) |
+                                   HWVALUE(C96F, MEM_OP_A, TLB_INVALIDATE_TARGET_ADDR_LO, va_lo),
+                         MEM_OP_B, HWVALUE(C96F, MEM_OP_B, TLB_INVALIDATE_TARGET_ADDR_HI, va_hi),
+                         MEM_OP_C, HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_REPLAY, NONE) |
+                                   HWVALUE(C96F, MEM_OP_C, TLB_INVALIDATE_PAGE_TABLE_LEVEL, page_table_level) |
+                                   HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_PDB, ONE) |
+                                   HWVALUE(C96F, MEM_OP_C, TLB_INVALIDATE_PDB_ADDR_LO, pdb_lo) |
+                                   invalidate_gpc_value |
+                                   aperture_value |
+                                   ack_value,
+                         MEM_OP_D, HWCONST(C96F, MEM_OP_D, OPERATION, MMU_TLB_INVALIDATE_TARGETED) |
+                                   HWVALUE(C96F, MEM_OP_D, TLB_INVALIDATE_PDB_ADDR_HI, pdb_hi));
+    }
+    else {
+        NV_PUSH_4U(C96F, MEM_OP_A, sysmembar_value |
+                                   HWCONST(C96F, MEM_OP_A, TLB_INVALIDATE_INVAL_SCOPE, NON_LINK_TLBS),
+                         MEM_OP_B, 0,
+                         MEM_OP_C, HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_REPLAY, NONE) |
+                                   HWVALUE(C96F, MEM_OP_C, TLB_INVALIDATE_PAGE_TABLE_LEVEL, page_table_level) |
+                                   HWCONST(C96F, MEM_OP_C, TLB_INVALIDATE_PDB, ONE) |
+                                   HWVALUE(C96F, MEM_OP_C, TLB_INVALIDATE_PDB_ADDR_LO, pdb_lo) |
+                                   invalidate_gpc_value |
+                                   aperture_value |
+                                   ack_value,
+                         MEM_OP_D, HWCONST(C96F, MEM_OP_D, OPERATION, MMU_TLB_INVALIDATE) |
+                                   HWVALUE(C96F, MEM_OP_D, TLB_INVALIDATE_PDB_ADDR_HI, pdb_hi));
+    }
+}
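The targeted invalidate in `uvm_hal_blackwell_host_tlb_invalidate_va()` rounds the requested interval up to a naturally aligned power-of-two region by looking at the highest bit in which the first and last byte addresses differ. The sketch below isolates that computation. It is a user-space approximation: `sketch_fls64()` is a hypothetical stand-in for the kernel's `__fls()`, and a log2 size of 64 is used to signal the "invalidate everything" fallback.

```c
#include <stdint.h>

/* Result of rounding an interval up to an aligned power-of-two region. */
struct sketch_inval_range {
    unsigned log2_size; /* log2 of the region size in bytes; 64 means "all" */
    uint64_t base;      /* first byte of the region actually invalidated */
    uint64_t end;       /* last byte of that region (inclusive) */
};

/* Index of the most significant set bit; v must be non-zero, as for __fls(). */
static unsigned sketch_fls64(uint64_t v)
{
    unsigned bit = 0;

    while (v >>= 1)
        bit++;

    return bit;
}

/*
 * Precondition (asserted by the driver): size is a non-zero multiple of the
 * page size, so base and the last byte address always differ.
 */
static struct sketch_inval_range sketch_compute_inval_range(uint64_t base, uint64_t size)
{
    struct sketch_inval_range r;
    uint64_t end = base + size - 1;
    uint64_t actual_size;

    /* Smallest aligned power-of-two region containing both base and end. */
    r.log2_size = sketch_fls64(end ^ base) + 1;

    if (r.log2_size == 64) {
        /* The region would span the whole address space. */
        r.base = 0;
        r.end = UINT64_MAX;
        return r;
    }

    actual_size = 1ULL << r.log2_size;
    r.base = base & ~(actual_size - 1); /* align down, as the hardware does */
    r.end = r.base + actual_size - 1;

    return r;
}
```

For example, an 8 KiB request at 0x3000 straddles the 16 KiB boundary at 0x4000, so the sketch widens it to the 32 KiB region starting at 0. The over-invalidation is the price of expressing the range with a single size field; the driver then subtracts 12 from the log2 value because the hardware field counts in 4K units.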
--- a/kernel-open/nvidia-uvm/uvm_blackwell_mmu.c
+++ b/kernel-open/nvidia-uvm/uvm_blackwell_mmu.c
@@ -0,0 +1,165 @@
+/*******************************************************************************
+    Copyright (c) 2022-2024 NVIDIA Corporation
+
+    Permission is hereby granted, free of charge, to any person obtaining a copy
+    of this software and associated documentation files (the "Software"), to
+    deal in the Software without restriction, including without limitation the
+    rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+    sell copies of the Software, and to permit persons to whom the Software is
+    furnished to do so, subject to the following conditions:
+
+        The above copyright notice and this permission notice shall be
+        included in all copies or substantial portions of the Software.
+
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+    THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+    FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+    DEALINGS IN THE SOFTWARE.
+
+*******************************************************************************/
+
+// On Blackwell, the UVM page tree 'depth' maps to hardware as follows:
+//
+// UVM depth   HW level                            VA bits
+// 0           PDE4                                56:56
+// 1           PDE3                                55:47
+// 2           PDE2 (or 256G PTE)                  46:38
+// 3           PDE1 (or 512M PTE)                  37:29
+// 4           PDE0 (dual 64K/4K PDE, or 2M PTE)   28:21
+// 5           PTE_64K / PTE_4K                    20:16 / 20:12
+
+#include "uvm_types.h"
+#include "uvm_global.h"
+#include "uvm_hal.h"
+#include "uvm_hal_types.h"
+#include "uvm_blackwell_fault_buffer.h"
+#include "hwref/blackwell/gb100/dev_fault.h"
+#include "hwref/blackwell/gb100/dev_mmu.h"
+
+static uvm_mmu_mode_hal_t blackwell_mmu_mode_hal;
+
+static NvU32 page_table_depth_blackwell(NvU64 page_size)
+{
+    switch (page_size) {
+        case UVM_PAGE_SIZE_2M:
+            return 4;
+        case UVM_PAGE_SIZE_512M:
+            return 3;
+        case UVM_PAGE_SIZE_256G:
+            return 2;
+        default:
+            return 5;
+    }
+}
+
+static NvU64 page_sizes_blackwell(void)
+{
+    return UVM_PAGE_SIZE_256G | UVM_PAGE_SIZE_512M | UVM_PAGE_SIZE_2M | UVM_PAGE_SIZE_64K | UVM_PAGE_SIZE_4K;
+}
+
+uvm_mmu_mode_hal_t *uvm_hal_mmu_mode_blackwell(NvU64 big_page_size)
+{
+    static bool initialized = false;
+
+    UVM_ASSERT(big_page_size == UVM_PAGE_SIZE_64K || big_page_size == UVM_PAGE_SIZE_128K);
+
+    // TODO: Bug 1789555: RM should reject the creation of GPU VA spaces with
+    // 128K big page size for Pascal+ GPUs
+    if (big_page_size == UVM_PAGE_SIZE_128K)
+        return NULL;
+
+    if (!initialized) {
+        uvm_mmu_mode_hal_t *hopper_mmu_mode_hal = uvm_hal_mmu_mode_hopper(big_page_size);
+        UVM_ASSERT(hopper_mmu_mode_hal);
+
+        // The assumption made is that arch_hal->mmu_mode_hal() will be called
+        // under the global lock the first time, so check it here.
+        uvm_assert_mutex_locked(&g_uvm_global.global_lock);
+
+        blackwell_mmu_mode_hal = *hopper_mmu_mode_hal;
+        blackwell_mmu_mode_hal.page_table_depth = page_table_depth_blackwell;
+        blackwell_mmu_mode_hal.page_sizes = page_sizes_blackwell;
+
+        initialized = true;
+    }
+
+    return &blackwell_mmu_mode_hal;
+}
+
+NvU16 uvm_hal_blackwell_mmu_client_id_to_utlb_id(NvU16 client_id)
+{
+    switch (client_id) {
+        case NV_PFAULT_CLIENT_GPC_RAST:
+        case NV_PFAULT_CLIENT_GPC_GCC:
+        case NV_PFAULT_CLIENT_GPC_GPCCS:
+            return UVM_BLACKWELL_GPC_UTLB_ID_RGG;
+        case NV_PFAULT_CLIENT_GPC_T1_0:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP0;
+        case NV_PFAULT_CLIENT_GPC_T1_1:
+        case NV_PFAULT_CLIENT_GPC_PE_0:
+        case NV_PFAULT_CLIENT_GPC_TPCCS_0:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP1;
+        case NV_PFAULT_CLIENT_GPC_T1_2:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP2;
+        case NV_PFAULT_CLIENT_GPC_T1_3:
+        case NV_PFAULT_CLIENT_GPC_PE_1:
+        case NV_PFAULT_CLIENT_GPC_TPCCS_1:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP3;
+        case NV_PFAULT_CLIENT_GPC_T1_4:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP4;
+        case NV_PFAULT_CLIENT_GPC_T1_5:
+        case NV_PFAULT_CLIENT_GPC_PE_2:
+        case NV_PFAULT_CLIENT_GPC_TPCCS_2:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP5;
+        case NV_PFAULT_CLIENT_GPC_T1_6:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP6;
+        case NV_PFAULT_CLIENT_GPC_T1_7:
+        case NV_PFAULT_CLIENT_GPC_PE_3:
+        case NV_PFAULT_CLIENT_GPC_TPCCS_3:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP7;
+        case NV_PFAULT_CLIENT_GPC_T1_8:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP8;
+        case NV_PFAULT_CLIENT_GPC_T1_9:
+        case NV_PFAULT_CLIENT_GPC_PE_4:
+        case NV_PFAULT_CLIENT_GPC_TPCCS_4:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP9;
+        case NV_PFAULT_CLIENT_GPC_T1_10:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP10;
+        case NV_PFAULT_CLIENT_GPC_T1_11:
+        case NV_PFAULT_CLIENT_GPC_PE_5:
+        case NV_PFAULT_CLIENT_GPC_TPCCS_5:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP11;
+        case NV_PFAULT_CLIENT_GPC_T1_12:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP12;
+        case NV_PFAULT_CLIENT_GPC_T1_13:
+        case NV_PFAULT_CLIENT_GPC_PE_6:
+        case NV_PFAULT_CLIENT_GPC_TPCCS_6:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP13;
+        case NV_PFAULT_CLIENT_GPC_T1_14:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP14;
+        case NV_PFAULT_CLIENT_GPC_T1_15:
+        case NV_PFAULT_CLIENT_GPC_PE_7:
+        case NV_PFAULT_CLIENT_GPC_TPCCS_7:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP15;
+        case NV_PFAULT_CLIENT_GPC_T1_16:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP16;
+        case NV_PFAULT_CLIENT_GPC_T1_17:
+        case NV_PFAULT_CLIENT_GPC_PE_8:
+        case NV_PFAULT_CLIENT_GPC_TPCCS_8:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP17;
+        case NV_PFAULT_CLIENT_GPC_T1_18:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP18;
+        case NV_PFAULT_CLIENT_GPC_T1_19:
+        case NV_PFAULT_CLIENT_GPC_PE_9:
+        case NV_PFAULT_CLIENT_GPC_TPCCS_9:
+            return UVM_BLACKWELL_GPC_UTLB_ID_LTP19;
+
+        default:
+            UVM_ASSERT_MSG(false, "Invalid client value: 0x%x\n", client_id);
+    }
+
+    return 0;
+}
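The depth table in the header comment of uvm_blackwell_mmu.c can be checked with a small standalone sketch. Everything below is hypothetical and not part of UVM: the names va_bit_range_t, va_index_at_depth and leaf_page_size_at_depth are made up for illustration, and the bit ranges are transcribed from the comment using the 4K column for depth 5. The sketch assumes a leaf PTE at a given depth maps 2^lo bytes, which agrees with the 2M, 512M and 256G cases returned by page_table_depth_blackwell().

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical type: inclusive [hi:lo] VA bit range decoded at one depth.
typedef struct {
    unsigned hi;
    unsigned lo;
} va_bit_range_t;

// Transcribed from the comment table (depth 5 uses the PTE_4K column).
static const va_bit_range_t blackwell_va_bits_4k[6] = {
    {56, 56}, // depth 0: PDE4
    {55, 47}, // depth 1: PDE3
    {46, 38}, // depth 2: PDE2 (or 256G PTE)
    {37, 29}, // depth 3: PDE1 (or 512M PTE)
    {28, 21}, // depth 4: PDE0 (or 2M PTE)
    {20, 12}, // depth 5: PTE_4K
};

// Index into the page directory/table at the given depth for this VA.
uint64_t va_index_at_depth(uint64_t va, unsigned depth)
{
    assert(depth < 6);

    const va_bit_range_t r = blackwell_va_bits_4k[depth];
    const unsigned width = r.hi - r.lo + 1;

    return (va >> r.lo) & ((1ULL << width) - 1);
}

// Bytes mapped by a leaf PTE placed at the given depth: all VA bits below
// the range's low bit are page offset.
uint64_t leaf_page_size_at_depth(unsigned depth)
{
    assert(depth < 6);

    return 1ULL << blackwell_va_bits_4k[depth].lo;
}
```

Under these assumptions, leaf_page_size_at_depth(4) is 2M, leaf_page_size_at_depth(3) is 512M and leaf_page_size_at_depth(2) is 256G, matching the switch in page_table_depth_blackwell().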
--- a/kernel-open/nvidia-uvm/uvm_ce_test.c
+++ b/kernel-open/nvidia-uvm/uvm_ce_test.c
@@ -56,7 +56,7 @@ static NV_STATUS test_non_pipelined(uvm_gpu_t *gpu)

    // TODO: Bug 3839176: the test is waived on Confidential Computing because
    // it assumes that GPU can access system memory without using encryption.
-    if (uvm_conf_computing_mode_enabled(gpu))
+    if (g_uvm_global.conf_computing_enabled)
        return NV_OK;

    status = uvm_rm_mem_alloc_and_map_cpu(gpu, UVM_RM_MEM_TYPE_SYS, CE_TEST_MEM_SIZE, 0, &host_mem);
@@ -176,7 +176,7 @@ static NV_STATUS test_membar(uvm_gpu_t *gpu)

    // TODO: Bug 3839176: the test is waived on Confidential Computing because
    // it assumes that GPU can access system memory without using encryption.
-    if (uvm_conf_computing_mode_enabled(gpu))
+    if (g_uvm_global.conf_computing_enabled)
        return NV_OK;

    status = uvm_rm_mem_alloc_and_map_cpu(gpu, UVM_RM_MEM_TYPE_SYS, sizeof(NvU32), 0, &host_mem);
@@ -411,10 +411,11 @@ static NV_STATUS test_memcpy_and_memset(uvm_gpu_t *gpu)
    size_t i, j, k, s;
    uvm_mem_alloc_params_t mem_params = {0};

-    if (uvm_conf_computing_mode_enabled(gpu))
+    if (g_uvm_global.conf_computing_enabled)
        TEST_NV_CHECK_GOTO(uvm_mem_alloc_sysmem_dma_and_map_cpu_kernel(size, gpu, current->mm, &verif_mem), done);
    else
        TEST_NV_CHECK_GOTO(uvm_mem_alloc_sysmem_and_map_cpu_kernel(size, current->mm, &verif_mem), done);
+
    TEST_NV_CHECK_GOTO(uvm_mem_map_gpu_kernel(verif_mem, gpu), done);

    gpu_verif_addr = uvm_mem_gpu_address_virtual_kernel(verif_mem, gpu);
@@ -436,7 +437,7 @@ static NV_STATUS test_memcpy_and_memset(uvm_gpu_t *gpu)
    TEST_NV_CHECK_GOTO(uvm_rm_mem_alloc(gpu, UVM_RM_MEM_TYPE_SYS, size, 0, &sys_rm_mem), done);
    gpu_addresses[0] = uvm_rm_mem_get_gpu_va(sys_rm_mem, gpu, is_proxy_va_space);

-    if (uvm_conf_computing_mode_enabled(gpu)) {
+    if (g_uvm_global.conf_computing_enabled) {
        for (i = 0; i < iterations; ++i) {
            for (s = 0; s < ARRAY_SIZE(element_sizes); s++) {
                TEST_NV_CHECK_GOTO(test_memcpy_and_memset_inner(gpu,
@@ -559,7 +560,7 @@ static NV_STATUS test_semaphore_reduction_inc(uvm_gpu_t *gpu)

    // TODO: Bug 3839176: the test is waived on Confidential Computing because
    // it assumes that GPU can access system memory without using encryption.
-    if (uvm_conf_computing_mode_enabled(gpu))
+    if (g_uvm_global.conf_computing_enabled)
        return NV_OK;

    status = test_semaphore_alloc_sem(gpu, size, &mem);
@@ -611,7 +612,7 @@ static NV_STATUS test_semaphore_release(uvm_gpu_t *gpu)

    // TODO: Bug 3839176: the test is waived on Confidential Computing because
    // it assumes that GPU can access system memory without using encryption.
-    if (uvm_conf_computing_mode_enabled(gpu))
+    if (g_uvm_global.conf_computing_enabled)
        return NV_OK;

    status = test_semaphore_alloc_sem(gpu, size, &mem);
@@ -665,7 +666,7 @@ static NV_STATUS test_semaphore_timestamp(uvm_gpu_t *gpu)

    // TODO: Bug 3839176: the test is waived on Confidential Computing because
    // it assumes that GPU can access system memory without using encryption.
-    if (uvm_conf_computing_mode_enabled(gpu))
+    if (g_uvm_global.conf_computing_enabled)
        return NV_OK;

    status = test_semaphore_alloc_sem(gpu, size, &mem);
@@ -1153,7 +1154,7 @@ static NV_STATUS test_encryption_decryption(uvm_gpu_t *gpu,
    } small_sizes[] = {{1, 1}, {3, 1}, {8, 1}, {2, 2}, {8, 4}, {UVM_PAGE_SIZE_4K - 8, 8}, {UVM_PAGE_SIZE_4K + 8, 8}};

    // Only Confidential Computing uses CE encryption/decryption
-    if (!uvm_conf_computing_mode_enabled(gpu))
+    if (!g_uvm_global.conf_computing_enabled)
        return NV_OK;

    // Use a size, and copy size, that are not a multiple of common page sizes.
--- a/kernel-open/nvidia-uvm/uvm_channel.c
+++ b/kernel-open/nvidia-uvm/uvm_channel.c
--- a/kernel-open/nvidia-uvm/uvm_channel.h
+++ b/kernel-open/nvidia-uvm/uvm_channel.h
@@ -418,7 +418,7 @@ struct uvm_channel_manager_struct
    unsigned num_channel_pools;

    // Mask containing the indexes of the usable Copy Engines. Each usable CE
-    // has at least one pool associated with it.
+    // has at least one pool of type UVM_CHANNEL_POOL_TYPE_CE associated with it.
    DECLARE_BITMAP(ce_mask, UVM_COPY_ENGINE_COUNT_MAX);

    struct
@@ -497,6 +497,10 @@ static bool uvm_channel_is_lcic(uvm_channel_t *channel)
    return uvm_channel_pool_is_lcic(channel->pool);
 }

+uvm_channel_t *uvm_channel_lcic_get_paired_wlc(uvm_channel_t *lcic_channel);
+
+uvm_channel_t *uvm_channel_wlc_get_paired_lcic(uvm_channel_t *wlc_channel);
+
 static bool uvm_channel_pool_is_proxy(uvm_channel_pool_t *pool)
 {
    UVM_ASSERT(uvm_pool_type_is_valid(pool->pool_type));
@@ -603,6 +607,11 @@ bool uvm_channel_is_value_completed(uvm_channel_t *channel, NvU64 value);
 // Update and get the latest completed value by the channel
 NvU64 uvm_channel_update_completed_value(uvm_channel_t *channel);

+// Wait for the channel to idle
+// It waits for anything that is running, but doesn't prevent new work from
+// beginning.
+NV_STATUS uvm_channel_wait(uvm_channel_t *channel);
+
 // Select and reserve a channel with the specified type for a push
 NV_STATUS uvm_channel_reserve_type(uvm_channel_manager_t *manager,
                                   uvm_channel_type_t type,
@@ -617,6 +626,9 @@ NV_STATUS uvm_channel_reserve_gpu_to_gpu(uvm_channel_manager_t *channel_manager,
 // Reserve a specific channel for a push or for a control GPFIFO entry.
 NV_STATUS uvm_channel_reserve(uvm_channel_t *channel, NvU32 num_gpfifo_entries);

+// Release reservation on a specific channel
+void uvm_channel_release(uvm_channel_t *channel, NvU32 num_gpfifo_entries);
+
 // Set optimal CE for P2P transfers between manager->gpu and peer
 void uvm_channel_manager_set_p2p_ce(uvm_channel_manager_t *manager, uvm_gpu_t *peer, NvU32 optimal_ce);

@@ -648,6 +660,8 @@ NvU32 uvm_channel_get_available_gpfifo_entries(uvm_channel_t *channel);

 void uvm_channel_print_pending_pushes(uvm_channel_t *channel);

+bool uvm_channel_is_locked_for_push(uvm_channel_t *channel);
+
 static uvm_gpu_t *uvm_channel_get_gpu(uvm_channel_t *channel)
 {
    return channel->pool->manager->gpu;
--- a/kernel-open/nvidia-uvm/uvm_channel_test.c
+++ b/kernel-open/nvidia-uvm/uvm_channel_test.c
@@ -1,5 +1,5 @@
 /*******************************************************************************
-    Copyright (c) 2015-2022 NVIDIA Corporation
+    Copyright (c) 2015-2023 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -24,6 +24,7 @@
 #include "uvm_global.h"
 #include "uvm_channel.h"
 #include "uvm_hal.h"
+#include "uvm_mem.h"
 #include "uvm_push.h"
 #include "uvm_test.h"
 #include "uvm_test_rng.h"
@@ -57,14 +58,14 @@ static NV_STATUS test_ordering(uvm_va_space_t *va_space)
    const NvU32 values_count = iters_per_channel_type_per_gpu;
    const size_t buffer_size = sizeof(NvU32) * values_count;

-    gpu = uvm_va_space_find_first_gpu(va_space);
-    TEST_CHECK_RET(gpu != NULL);
-
    // TODO: Bug 3839176: the test is waived on Confidential Computing because
    // it assumes that GPU can access system memory without using encryption.
-    if (uvm_conf_computing_mode_enabled(gpu))
+    if (g_uvm_global.conf_computing_enabled)
        return NV_OK;

+    gpu = uvm_va_space_find_first_gpu(va_space);
+    TEST_CHECK_RET(gpu != NULL);
+
    status = uvm_rm_mem_alloc_and_map_all(gpu, UVM_RM_MEM_TYPE_SYS, buffer_size, 0, &mem);
    TEST_CHECK_GOTO(status == NV_OK, done);

@@ -84,7 +85,7 @@ static NV_STATUS test_ordering(uvm_va_space_t *va_space)

    TEST_NV_CHECK_GOTO(uvm_tracker_add_push(&tracker, &push), done);

-    exclude_proxy_channel_type = uvm_gpu_uses_proxy_channel_pool(gpu);
+    exclude_proxy_channel_type = uvm_parent_gpu_needs_proxy_channel_pool(gpu->parent);

    for (i = 0; i < iters_per_channel_type_per_gpu; ++i) {
        for (j = 0; j < UVM_CHANNEL_TYPE_CE_COUNT; ++j) {
@@ -222,7 +223,7 @@ static NV_STATUS uvm_test_rc_for_gpu(uvm_gpu_t *gpu)
    // Check RC on a proxy channel (SR-IOV heavy) or internal channel (any other
    // mode). It is not allowed to use a virtual address in a memset pushed to
    // a proxy channel, so we use a physical address instead.
-    if (uvm_gpu_uses_proxy_channel_pool(gpu)) {
+    if (uvm_parent_gpu_needs_proxy_channel_pool(gpu->parent)) {
        uvm_gpu_address_t dst_address;

        // Save the line number the push that's supposed to fail was started on
@@ -314,6 +315,110 @@ static NV_STATUS test_rc(uvm_va_space_t *va_space)
    return NV_OK;
 }

+static NV_STATUS uvm_test_iommu_rc_for_gpu(uvm_gpu_t *gpu)
+{
+    NV_STATUS status = NV_OK;
+
+#if defined(NV_IOMMU_IS_DMA_DOMAIN_PRESENT) && defined(CONFIG_IOMMU_DEFAULT_DMA_STRICT)
+    // This test needs the DMA API to immediately invalidate IOMMU mappings on
+    // DMA unmap (as opposed to lazy invalidation). The policy can be changed
+    // on boot (e.g. iommu.strict=1), but there isn't a good way to check for
+    // the runtime setting. CONFIG_IOMMU_DEFAULT_DMA_STRICT checks for the
+    // default value.
+
+    uvm_push_t push;
+    uvm_mem_t *sysmem;
+    uvm_gpu_address_t sysmem_dma_addr;
+    char *cpu_ptr = NULL;
+    const size_t data_size = PAGE_SIZE;
+    size_t i;
+
+    struct device *dev = &gpu->parent->pci_dev->dev;
+    struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+
+    // Check that the IOMMU domain is controlled by the Linux DMA API
+    if (!domain || !iommu_is_dma_domain(domain))
+        return NV_OK;
+
+    // Only run if ATS is enabled with a 64kB base page size.
+    // Otherwise the CE doesn't get a response when writing to an unmapped location.
+    if (!g_uvm_global.ats.enabled || PAGE_SIZE != UVM_PAGE_SIZE_64K)
+        return NV_OK;
+
+    status = uvm_mem_alloc_sysmem_and_map_cpu_kernel(data_size, NULL, &sysmem);
+    TEST_NV_CHECK_RET(status);
+
+    status = uvm_mem_map_gpu_phys(sysmem, gpu);
+    TEST_NV_CHECK_GOTO(status, done);
+
+    cpu_ptr = uvm_mem_get_cpu_addr_kernel(sysmem);
+    sysmem_dma_addr = uvm_mem_gpu_address_physical(sysmem, gpu, 0, data_size);
+
+    status = uvm_push_begin(gpu->channel_manager, UVM_CHANNEL_TYPE_GPU_TO_CPU, &push, "Test memset to IOMMU mapped sysmem");
+    TEST_NV_CHECK_GOTO(status, done);
+
+    gpu->parent->ce_hal->memset_8(&push, sysmem_dma_addr, 0, data_size);
+
+    status = uvm_push_end_and_wait(&push);
+    TEST_NV_CHECK_GOTO(status, done);
+
+    // Check that we have zeroed the memory
+    for (i = 0; i < data_size; ++i)
+        TEST_CHECK_GOTO(cpu_ptr[i] == 0, done);
+
+    // Unmap the buffer and try write again to the same address
+    uvm_mem_unmap_gpu_phys(sysmem, gpu);
+
+    status = uvm_push_begin(gpu->channel_manager, UVM_CHANNEL_TYPE_GPU_TO_CPU, &push, "Test memset after IOMMU unmap");
+    TEST_NV_CHECK_GOTO(status, done);
+
+    gpu->parent->ce_hal->memset_4(&push, sysmem_dma_addr, 0xffffffff, data_size);
+
+    status = uvm_push_end_and_wait(&push);
+
+    TEST_CHECK_GOTO(status == NV_ERR_RC_ERROR, done);
+    TEST_CHECK_GOTO(uvm_channel_get_status(push.channel) == NV_ERR_RC_ERROR, done);
+    TEST_CHECK_GOTO(uvm_global_reset_fatal_error() == NV_ERR_RC_ERROR, done);
+
+    // Check that writes after unmap did not succeed
+    for (i = 0; i < data_size; ++i)
+        TEST_CHECK_GOTO(cpu_ptr[i] == 0, done);
+
+    status = NV_OK;
+
+done:
+    uvm_mem_free(sysmem);
+#endif
+    return status;
+}
+
+static NV_STATUS test_iommu(uvm_va_space_t *va_space)
+{
+    uvm_gpu_t *gpu;
+
+    uvm_assert_mutex_locked(&g_uvm_global.global_lock);
+
+    for_each_va_space_gpu(gpu, va_space) {
+        NV_STATUS test_status, create_status;
+
+        // The GPU channel manager is destroyed and then re-created after
+        // testing ATS RC fault, so this test requires exclusive access to the GPU.
+        TEST_CHECK_RET(uvm_gpu_retained_count(gpu) == 1);
+
+        g_uvm_global.disable_fatal_error_assert = true;
+        test_status = uvm_test_iommu_rc_for_gpu(gpu);
+        g_uvm_global.disable_fatal_error_assert = false;
+
+        uvm_channel_manager_destroy(gpu->channel_manager);
+        create_status = uvm_channel_manager_create(gpu, &gpu->channel_manager);
+
+        TEST_NV_CHECK_RET(test_status);
+        TEST_NV_CHECK_RET(create_status);
+    }
+
+    return NV_OK;
+}
+
 typedef struct
 {
    uvm_push_t push;
@@ -403,7 +508,7 @@ static uvm_channel_type_t random_ce_channel_type_except(uvm_test_rng_t *rng, uvm

 static uvm_channel_type_t gpu_random_internal_ce_channel_type(uvm_gpu_t *gpu, uvm_test_rng_t *rng)
 {
-    if (uvm_gpu_uses_proxy_channel_pool(gpu))
+    if (uvm_parent_gpu_needs_proxy_channel_pool(gpu->parent))
        return random_ce_channel_type_except(rng, uvm_channel_proxy_channel_type());

    return random_ce_channel_type(rng);
@@ -586,12 +691,16 @@ static NV_STATUS stress_test_all_gpus_in_va(uvm_va_space_t *va_space,
            if (uvm_test_rng_range_32(&rng, 0, 1) == 0) {
                NvU32 random_stream_index = uvm_test_rng_range_32(&rng, 0, num_streams - 1);
                uvm_test_stream_t *random_stream = &streams[random_stream_index];
-                uvm_push_acquire_tracker(&stream->push, &random_stream->tracker);
-                snapshot_counter(&stream->push,
-                                 random_stream->counter_mem,
-                                 stream->other_stream_counter_snapshots_mem,
-                                 i,
-                                 random_stream->queued_counter_repeat);
+
+                if ((random_stream->push.gpu == gpu) || uvm_push_allow_dependencies_across_gpus()) {
+                    uvm_push_acquire_tracker(&stream->push, &random_stream->tracker);
+
+                    snapshot_counter(&stream->push,
+                                     random_stream->counter_mem,
+                                     stream->other_stream_counter_snapshots_mem,
+                                     i,
+                                     random_stream->queued_counter_repeat);
+                }
            }

            uvm_push_end(&stream->push);
@@ -684,7 +793,7 @@ done:
 // This test verifies that concurrent pushes using the same channel pool
 // select different channels, when the Confidential Computing feature is
 // enabled.
-NV_STATUS test_conf_computing_channel_selection(uvm_va_space_t *va_space)
+static NV_STATUS test_conf_computing_channel_selection(uvm_va_space_t *va_space)
 {
    NV_STATUS status = NV_OK;
    uvm_channel_pool_t *pool;
@@ -693,9 +802,7 @@ NV_STATUS test_conf_computing_channel_selection(uvm_va_space_t *va_space)
    NvU32 i;
    NvU32 num_pushes;

-    gpu = uvm_va_space_find_first_gpu(va_space);
-
-    if (!uvm_conf_computing_mode_enabled(gpu))
+    if (!g_uvm_global.conf_computing_enabled)
        return NV_OK;

    uvm_thread_context_lock_disable_tracking();
@@ -746,7 +853,102 @@ error:
    return status;
 }

-NV_STATUS test_write_ctrl_gpfifo_noop(uvm_va_space_t *va_space)
+static NV_STATUS test_channel_iv_rotation(uvm_va_space_t *va_space)
+{
+    uvm_gpu_t *gpu;
+
+    if (!g_uvm_global.conf_computing_enabled)
+        return NV_OK;
+
+    for_each_va_space_gpu(gpu, va_space) {
+        uvm_channel_pool_t *pool;
+
+        uvm_for_each_pool(pool, gpu->channel_manager) {
+            NvU64 before_rotation_enc, before_rotation_dec, after_rotation_enc, after_rotation_dec;
+            NV_STATUS status = NV_OK;
+
+            // Check one (the first) channel per pool
+            uvm_channel_t *channel = pool->channels;
+
+            // Create a dummy encrypt/decrypt push to use a few IVs.
+            // SEC2 already used encryption during initialization, so it does
+            // not need a dummy push.
+            if (!uvm_channel_is_sec2(channel)) {
+                uvm_push_t push;
+                size_t data_size;
+                uvm_conf_computing_dma_buffer_t *cipher_text;
+                void *cipher_cpu_va, *plain_cpu_va, *tag_cpu_va;
+                uvm_gpu_address_t cipher_gpu_address, plain_gpu_address, tag_gpu_address;
+                uvm_channel_t *work_channel = uvm_channel_is_lcic(channel) ? uvm_channel_lcic_get_paired_wlc(channel) : channel;
+
+                plain_cpu_va = &status;
+                data_size = sizeof(status);
+
+                TEST_NV_CHECK_RET(uvm_conf_computing_dma_buffer_alloc(&gpu->conf_computing.dma_buffer_pool,
+                                                                      &cipher_text,
+                                                                      NULL));
+                cipher_cpu_va = uvm_mem_get_cpu_addr_kernel(cipher_text->alloc);
+                tag_cpu_va = uvm_mem_get_cpu_addr_kernel(cipher_text->auth_tag);
+
+                cipher_gpu_address = uvm_mem_gpu_address_virtual_kernel(cipher_text->alloc, gpu);
+                tag_gpu_address = uvm_mem_gpu_address_virtual_kernel(cipher_text->auth_tag, gpu);
+
+                TEST_NV_CHECK_GOTO(uvm_push_begin_on_channel(work_channel, &push, "Dummy push for IV rotation"), free);
+
+                (void)uvm_push_get_single_inline_buffer(&push,
+                                                        data_size,
+                                                        UVM_CONF_COMPUTING_BUF_ALIGNMENT,
+                                                        &plain_gpu_address);
+
+                uvm_conf_computing_cpu_encrypt(work_channel, cipher_cpu_va, plain_cpu_va, NULL, data_size, tag_cpu_va);
+                gpu->parent->ce_hal->decrypt(&push, plain_gpu_address, cipher_gpu_address, data_size, tag_gpu_address);
+
+                TEST_NV_CHECK_GOTO(uvm_push_end_and_wait(&push), free);
+
+free:
+                uvm_conf_computing_dma_buffer_free(&gpu->conf_computing.dma_buffer_pool, cipher_text, NULL);
+
+                if (status != NV_OK)
+                    return status;
+            }
+
+            // Reserve a channel to hold the push lock during rotation
+            if (!uvm_channel_is_lcic(channel))
+                TEST_NV_CHECK_RET(uvm_channel_reserve(channel, 1));
+
+            uvm_conf_computing_query_message_pools(channel, &before_rotation_enc, &before_rotation_dec);
+            TEST_NV_CHECK_GOTO(uvm_conf_computing_rotate_channel_ivs_below_limit(channel, -1, true), release);
+            uvm_conf_computing_query_message_pools(channel, &after_rotation_enc, &after_rotation_dec);
+
+release:
+            if (!uvm_channel_is_lcic(channel))
+                uvm_channel_release(channel, 1);
+
+            if (status != NV_OK)
+                return status;
+
+            // All channels except SEC2 used at least a single IV to release tracking.
+            // SEC2 doesn't support decrypt direction.
+            if (uvm_channel_is_sec2(channel))
+                TEST_CHECK_RET(before_rotation_dec == after_rotation_dec);
+            else
+                TEST_CHECK_RET(before_rotation_dec < after_rotation_dec);
+
+            // All channels used one CPU encrypt/GPU decrypt, either during
+            // initialization or in the push above, with the exception of LCIC.
+            // LCIC is used in tandem with WLC, but it never uses CPU encrypt/
+            // GPU decrypt ops.
+            if (uvm_channel_is_lcic(channel))
+                TEST_CHECK_RET(before_rotation_enc == after_rotation_enc);
+            else
+                TEST_CHECK_RET(before_rotation_enc < after_rotation_enc);
+        }
+    }
+
+    return NV_OK;
+}
+
+static NV_STATUS test_write_ctrl_gpfifo_noop(uvm_va_space_t *va_space)
 {
    uvm_gpu_t *gpu;

@@ -785,7 +987,7 @@ NV_STATUS test_write_ctrl_gpfifo_noop(uvm_va_space_t *va_space)
    return NV_OK;
 }

-NV_STATUS test_write_ctrl_gpfifo_and_pushes(uvm_va_space_t *va_space)
+static NV_STATUS test_write_ctrl_gpfifo_and_pushes(uvm_va_space_t *va_space)
 {
    uvm_gpu_t *gpu;

@@ -833,7 +1035,7 @@ NV_STATUS test_write_ctrl_gpfifo_and_pushes(uvm_va_space_t *va_space)
    return NV_OK;
 }

-NV_STATUS test_write_ctrl_gpfifo_tight(uvm_va_space_t *va_space)
+static NV_STATUS test_write_ctrl_gpfifo_tight(uvm_va_space_t *va_space)
 {
    NV_STATUS status = NV_OK;
    uvm_gpu_t *gpu;
@@ -845,11 +1047,9 @@ NV_STATUS test_write_ctrl_gpfifo_tight(uvm_va_space_t *va_space)
    NvU64 entry;
    uvm_push_t push;

-    gpu = uvm_va_space_find_first_gpu(va_space);
-
    // TODO: Bug 3839176: the test is waived on Confidential Computing because
    // it assumes that GPU can access system memory without using encryption.
-    if (uvm_conf_computing_mode_enabled(gpu))
+    if (g_uvm_global.conf_computing_enabled)
        return NV_OK;

    for_each_va_space_gpu(gpu, va_space) {
@@ -924,7 +1124,7 @@ static NV_STATUS test_channel_pushbuffer_extension_base(uvm_va_space_t *va_space
        uvm_channel_manager_t *manager;
        uvm_channel_pool_t *pool;

-        if (!uvm_gpu_has_pushbuffer_segments(gpu))
+        if (!uvm_parent_gpu_needs_pushbuffer_segments(gpu->parent))
            continue;

        // The GPU channel manager pushbuffer is destroyed and then re-created
@@ -999,6 +1199,10 @@ NV_STATUS uvm_test_channel_sanity(UVM_TEST_CHANNEL_SANITY_PARAMS *params, struct
    if (status != NV_OK)
        goto done;

+    status = test_channel_iv_rotation(va_space);
+    if (status != NV_OK)
+        goto done;
+
    // The following tests have side effects, they reset the GPU's
    // channel_manager.
    status = test_channel_pushbuffer_extension_base(va_space);
@@ -1019,6 +1223,10 @@ NV_STATUS uvm_test_channel_sanity(UVM_TEST_CHANNEL_SANITY_PARAMS *params, struct
            goto done;
    }

+    status = test_iommu(va_space);
+    if (status != NV_OK)
+        goto done;
+
 done:
    uvm_va_space_up_read_rm(va_space);
    uvm_mutex_unlock(&g_uvm_global.global_lock);
@@ -1034,23 +1242,22 @@ static NV_STATUS uvm_test_channel_stress_stream(uvm_va_space_t *va_space,
    if (params->iterations == 0 || params->num_streams == 0)
        return NV_ERR_INVALID_PARAMETER;

+    // TODO: Bug 3839176: the test is waived on Confidential Computing because
+    // it assumes that GPU can access system memory without using encryption.
+    if (g_uvm_global.conf_computing_enabled)
+        return NV_OK;
+
    // TODO: Bug 1764963: Rework the test to not rely on the global lock as that
    // serializes all the threads calling this at the same time.
    uvm_mutex_lock(&g_uvm_global.global_lock);
    uvm_va_space_down_read_rm(va_space);

-    // TODO: Bug 3839176: the test is waived on Confidential Computing because
-    // it assumes that GPU can access system memory without using encryption.
-    if (uvm_conf_computing_mode_enabled(uvm_va_space_find_first_gpu(va_space)))
-        goto done;
-
    status = stress_test_all_gpus_in_va(va_space,
                                        params->num_streams,
                                        params->iterations,
                                        params->seed,
                                        params->verbose);

-done:
    uvm_va_space_up_read_rm(va_space);
    uvm_mutex_unlock(&g_uvm_global.global_lock);

--- a/kernel-open/nvidia-uvm/uvm_common.c
+++ b/kernel-open/nvidia-uvm/uvm_common.c
@@ -318,10 +318,11 @@ int format_uuid_to_buffer(char *buffer, unsigned bufferLength, const NvProcessor
    unsigned i;
    unsigned dashMask = 1 << 4 | 1 << 6 | 1 << 8 | 1 << 10;

-    memcpy(buffer, "UVM-GPU-", 8);
    if (bufferLength < (8 /*prefix*/+ 16 * 2 /*digits*/ + 4 * 1 /*dashes*/ + 1 /*null*/))
        return *buffer = 0;

+    memcpy(buffer, "UVM-GPU-", 8);
+
    for (i = 0; i < 16; i++) {
        *str++ = uvm_digit_to_hex(pUuidStruct->uuid[i] >> 4);
        *str++ = uvm_digit_to_hex(pUuidStruct->uuid[i] & 0xF);
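The uvm_common.c hunk above moves the prefix memcpy() after the length check, so an undersized buffer only ever receives the terminating NUL. A minimal standalone sketch of the same ordering follows. It is not the UVM function: format_uuid and digit_to_hex are hypothetical names, uppercase hex digits are an assumption, and the dash positions (after the 4th, 6th, 8th and 10th byte) are inferred from the dashMask constant in the context lines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define UUID_BYTES 16
// "UVM-GPU-" prefix + 32 hex digits + 4 dashes + NUL terminator
#define UUID_STR_SIZE (8 + UUID_BYTES * 2 + 4 + 1)

// Hypothetical helper; uppercase output is an assumption.
char digit_to_hex(unsigned value)
{
    return (char)(value < 10 ? '0' + value : 'A' + (value - 10));
}

// Returns the string length written, or 0 if the buffer is too small.
size_t format_uuid(char *buffer, size_t buffer_len, const unsigned char uuid[UUID_BYTES])
{
    char *str;
    unsigned i;

    assert(buffer && buffer_len > 0);

    // Validate the size before touching anything past buffer[0].
    if (buffer_len < UUID_STR_SIZE) {
        buffer[0] = '\0';
        return 0;
    }

    memcpy(buffer, "UVM-GPU-", 8);
    str = buffer + 8;

    for (i = 0; i < UUID_BYTES; i++) {
        *str++ = digit_to_hex(uuid[i] >> 4);
        *str++ = digit_to_hex(uuid[i] & 0xF);

        // 8-4-4-4-12 grouping: dash after bytes 4, 6, 8 and 10.
        if (i == 3 || i == 5 || i == 7 || i == 9)
            *str++ = '-';
    }

    *str = '\0';
    return (size_t)(str - buffer);
}
```

With the pre-patch ordering, an 8-byte buffer would have been filled with the prefix and then truncated at byte 0. With the check first, bytes 1..7 of a too-small buffer are left untouched.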
--- a/kernel-open/nvidia-uvm/uvm_common.h
+++ b/kernel-open/nvidia-uvm/uvm_common.h
@@ -21,8 +21,8 @@

 *******************************************************************************/

-#ifndef _UVM_COMMON_H
-#define _UVM_COMMON_H
+#ifndef __UVM_COMMON_H__
+#define __UVM_COMMON_H__

 #ifdef DEBUG
    #define UVM_IS_DEBUG() 1
@@ -204,13 +204,6 @@ extern bool uvm_release_asserts_set_global_error_for_tests;
 #define UVM_ASSERT_MSG_RELEASE(expr, fmt, ...)  _UVM_ASSERT_MSG_RELEASE(expr, #expr, ": " fmt, ##__VA_ARGS__)
 #define UVM_ASSERT_RELEASE(expr)                _UVM_ASSERT_MSG_RELEASE(expr, #expr, "\n")

-// Provide a short form of UUID's, typically for use in debug printing:
-#define ABBREV_UUID(uuid) (unsigned)(uuid)
-
-static inline NvBool uvm_uuid_is_cpu(const NvProcessorUuid *uuid)
-{
-    return memcmp(uuid, &NV_PROCESSOR_UUID_CPU_DEFAULT, sizeof(*uuid)) == 0;
-}
 #define UVM_SIZE_1KB (1024ULL)
 #define UVM_SIZE_1MB (1024 * UVM_SIZE_1KB)
 #define UVM_SIZE_1GB (1024 * UVM_SIZE_1MB)
@@ -409,4 +402,40 @@ static inline void uvm_touch_page(struct page *page)
 // Return true if the VMA is one used by UVM managed allocations.
 bool uvm_vma_is_managed(struct vm_area_struct *vma);

-#endif /* _UVM_COMMON_H */
+static bool uvm_platform_uses_canonical_form_address(void)
+{
+    if (NVCPU_IS_PPC64LE)
+        return false;
+
+    return true;
+}
+
+// Similar to the GPU MMU HAL num_va_bits(), it returns the CPU's num_va_bits().
+static NvU32 uvm_cpu_num_va_bits(void)
+{
+    return fls64(TASK_SIZE - 1) + 1;
+}
+
+// Return the unaddressable range in a num_va_bits-wide VA space, [first, outer)
+static void uvm_get_unaddressable_range(NvU32 num_va_bits, NvU64 *first, NvU64 *outer)
+{
+    UVM_ASSERT(num_va_bits < 64);
+    UVM_ASSERT(first);
+    UVM_ASSERT(outer);
+
+    if (uvm_platform_uses_canonical_form_address()) {
+        *first = 1ULL << (num_va_bits - 1);
+        *outer = (NvU64)((NvS64)(1ULL << 63) >> (64 - num_va_bits));
+    }
+    else {
+        *first = 1ULL << num_va_bits;
+        *outer = ~0Ull;
+    }
+}
+
+static void uvm_cpu_get_unaddressable_range(NvU64 *first, NvU64 *outer)
+{
+    uvm_get_unaddressable_range(uvm_cpu_num_va_bits(), first, outer);
+}
+
+#endif /* __UVM_COMMON_H__ */
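The canonical-form arithmetic in uvm_get_unaddressable_range() can be exercised outside the kernel. The sketch below is an illustration, not the UVM code: get_unaddressable_range is a hypothetical name, the platform check is replaced by a plain bool parameter, and the upper bound is computed as ~0ULL << (num_va_bits - 1) rather than with the signed right shift used in the patch. Both forms set every bit from (num_va_bits - 1) upward. The unsigned form is used here only because right-shifting a negative signed value is implementation-defined in ISO C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Computes the unaddressable hole [first, outer) of a num_va_bits-wide VA
// space. 'canonical' stands in for the platform check in the patch.
void get_unaddressable_range(unsigned num_va_bits, bool canonical, uint64_t *first, uint64_t *outer)
{
    assert(num_va_bits > 0 && num_va_bits < 64);
    assert(first && outer);

    if (canonical) {
        // Lower half ends where bit (num_va_bits - 1) becomes set.
        *first = 1ULL << (num_va_bits - 1);

        // Upper half starts at the sign extension of that same bit.
        *outer = ~0ULL << (num_va_bits - 1);
    }
    else {
        // Non-canonical: everything above the addressable bits is a hole.
        *first = 1ULL << num_va_bits;
        *outer = ~0ULL;
    }
}
```

For a 48-bit canonical space this yields first = 0x0000800000000000 and outer = 0xFFFF800000000000, the familiar boundaries of the lower and upper halves.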
--- a/kernel-open/nvidia-uvm/uvm_conf_computing.c
+++ b/kernel-open/nvidia-uvm/uvm_conf_computing.c
@@ -1,5 +1,5 @@
 /*******************************************************************************
-    Copyright (c) 2021 NVIDIA Corporation
+    Copyright (c) 2021-2023 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -33,44 +33,55 @@
 #include "nv_uvm_interface.h"
 #include "uvm_va_block.h"

+// The maximum number of secure operations per push is:
+// UVM_MAX_PUSH_SIZE / min(CE encryption size, CE decryption size)
+// + 1 (tracking semaphore) = ceil(128 * 1024 / 56) + 1 = 2342
+#define UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MIN 2342lu
+
+// Channels use 32-bit counters so the value after rotation is 0xffffffff.
+// Setting the limit to this value (or higher) will result in rotation
+// on every check. However, pre-emptive rotation when submitting control
+// GPFIFO entries relies on the fact that multiple successive checks after
+// rotation do not trigger more rotations if there was no IV used in between.
+#define UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MAX 0xfffffffelu
+
+// Attempt rotation when two billion IVs are left. An IV rotation call can fail
+// if the necessary locks are not available, so multiple attempts may be needed
+// for IV rotation to succeed.
+#define UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_DEFAULT (1lu << 31)
+
+// Start rotating after 500 encryption/decryptions when running tests.
+#define UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_TESTS ((1lu << 32) - 500lu)
+static ulong uvm_conf_computing_channel_iv_rotation_limit = UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_DEFAULT;
+
+module_param(uvm_conf_computing_channel_iv_rotation_limit, ulong, S_IRUGO);

 static UvmGpuConfComputeMode uvm_conf_computing_get_mode(const uvm_parent_gpu_t *parent)
 {
    return parent->rm_info.gpuConfComputeCaps.mode;
 }

-bool uvm_conf_computing_mode_enabled_parent(const uvm_parent_gpu_t *parent)
-{
-    return uvm_conf_computing_get_mode(parent) != UVM_GPU_CONF_COMPUTE_MODE_NONE;
-}
-
-bool uvm_conf_computing_mode_enabled(const uvm_gpu_t *gpu)
-{
-    return uvm_conf_computing_mode_enabled_parent(gpu->parent);
-}
-
 bool uvm_conf_computing_mode_is_hcc(const uvm_gpu_t *gpu)
 {
    return uvm_conf_computing_get_mode(gpu->parent) == UVM_GPU_CONF_COMPUTE_MODE_HCC;
 }

-NV_STATUS uvm_conf_computing_init_parent_gpu(const uvm_parent_gpu_t *parent)
+void uvm_conf_computing_check_parent_gpu(const uvm_parent_gpu_t *parent)
 {
-    UvmGpuConfComputeMode cc, sys_cc;
-    uvm_gpu_t *first;
+    uvm_parent_gpu_t *other_parent;
+    UvmGpuConfComputeMode parent_mode = uvm_conf_computing_get_mode(parent);

    uvm_assert_mutex_locked(&g_uvm_global.global_lock);

-    // TODO: Bug 2844714: since we have no routine to traverse parent GPUs,
-    // find first child GPU and get its parent.
-    first = uvm_global_processor_mask_find_first_gpu(&g_uvm_global.retained_gpus);
-    if (!first)
-        return NV_OK;
+    // The Confidential Computing state of the GPU should match that of the
+    // system.
+    UVM_ASSERT((parent_mode != UVM_GPU_CONF_COMPUTE_MODE_NONE) == g_uvm_global.conf_computing_enabled);

-    sys_cc = uvm_conf_computing_get_mode(first->parent);
-    cc = uvm_conf_computing_get_mode(parent);
-
-    return cc == sys_cc ? NV_OK : NV_ERR_NOT_SUPPORTED;
+    // All GPUs derive Confidential Computing status from their parent. By
+    // current policy all parent GPUs have identical Confidential Computing
+    // status.
+    for_each_parent_gpu(other_parent)
+        UVM_ASSERT(parent_mode == uvm_conf_computing_get_mode(other_parent));
 }

 static void dma_buffer_destroy_locked(uvm_conf_computing_dma_buffer_pool_t *dma_buffer_pool,
@@ -184,15 +195,11 @@ static void dma_buffer_pool_add(uvm_conf_computing_dma_buffer_pool_t *dma_buffer
 static NV_STATUS conf_computing_dma_buffer_pool_init(uvm_conf_computing_dma_buffer_pool_t *dma_buffer_pool)
 {
    size_t i;
-    uvm_gpu_t *gpu;
    size_t num_dma_buffers = 32;
    NV_STATUS status = NV_OK;

    UVM_ASSERT(dma_buffer_pool->num_dma_buffers == 0);
-
-    gpu = dma_buffer_pool_to_gpu(dma_buffer_pool);
-
-    UVM_ASSERT(uvm_conf_computing_mode_enabled(gpu));
+    UVM_ASSERT(g_uvm_global.conf_computing_enabled);

    INIT_LIST_HEAD(&dma_buffer_pool->free_dma_buffers);
    uvm_mutex_init(&dma_buffer_pool->lock, UVM_LOCK_ORDER_CONF_COMPUTING_DMA_BUFFER_POOL);
@@ -349,7 +356,7 @@ NV_STATUS uvm_conf_computing_gpu_init(uvm_gpu_t *gpu)
 {
    NV_STATUS status;

-    if (!uvm_conf_computing_mode_enabled(gpu))
+    if (!g_uvm_global.conf_computing_enabled)
        return NV_OK;

    status = conf_computing_dma_buffer_pool_init(&gpu->conf_computing.dma_buffer_pool);
@@ -360,6 +367,20 @@ NV_STATUS uvm_conf_computing_gpu_init(uvm_gpu_t *gpu)
    if (status != NV_OK)
        goto error;

+    if (uvm_enable_builtin_tests && uvm_conf_computing_channel_iv_rotation_limit == UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_DEFAULT)
+        uvm_conf_computing_channel_iv_rotation_limit = UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_TESTS;
+
+    if (uvm_conf_computing_channel_iv_rotation_limit < UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MIN ||
+        uvm_conf_computing_channel_iv_rotation_limit > UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MAX) {
+        UVM_ERR_PRINT("Value of uvm_conf_computing_channel_iv_rotation_limit: %lu is outside of the safe "
+                      "range: <%lu, %lu>. Using the default value instead (%lu)\n",
+                      uvm_conf_computing_channel_iv_rotation_limit,
+                      UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MIN,
+                      UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MAX,
+                      UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_DEFAULT);
+        uvm_conf_computing_channel_iv_rotation_limit = UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_DEFAULT;
+    }
+
    return NV_OK;

 error:
@@ -381,9 +402,8 @@ void uvm_conf_computing_log_gpu_encryption(uvm_channel_t *channel, UvmCslIv *iv)
    status = nvUvmInterfaceCslIncrementIv(&channel->csl.ctx, UVM_CSL_OPERATION_DECRYPT, 1, iv);
    uvm_mutex_unlock(&channel->csl.ctx_lock);

-    // TODO: Bug 4014720: If nvUvmInterfaceCslIncrementIv returns with
-    // NV_ERR_INSUFFICIENT_RESOURCES then the IV needs to be rotated via
-    // nvUvmInterfaceCslRotateIv.
+    // IV rotation is done preemptively as needed, so the above
+    // call cannot return failure.
    UVM_ASSERT(status == NV_OK);
 }

@@ -395,9 +415,8 @@ void uvm_conf_computing_acquire_encryption_iv(uvm_channel_t *channel, UvmCslIv *
    status = nvUvmInterfaceCslIncrementIv(&channel->csl.ctx, UVM_CSL_OPERATION_ENCRYPT, 1, iv);
    uvm_mutex_unlock(&channel->csl.ctx_lock);

-    // TODO: Bug 4014720: If nvUvmInterfaceCslIncrementIv returns with
-    // NV_ERR_INSUFFICIENT_RESOURCES then the IV needs to be rotated via
-    // nvUvmInterfaceCslRotateIv.
+    // IV rotation is done preemptively as needed, so the above
+    // call cannot return failure.
    UVM_ASSERT(status == NV_OK);
 }

@@ -421,8 +440,8 @@ void uvm_conf_computing_cpu_encrypt(uvm_channel_t *channel,
                                      (NvU8 *) auth_tag_buffer);
    uvm_mutex_unlock(&channel->csl.ctx_lock);

-    // nvUvmInterfaceCslEncrypt fails when a 64-bit encryption counter
-    // overflows. This is not supposed to happen on CC.
+    // IV rotation is done preemptively as needed, so the above
+    // call cannot return failure.
    UVM_ASSERT(status == NV_OK);
 }

@@ -435,11 +454,22 @@ NV_STATUS uvm_conf_computing_cpu_decrypt(uvm_channel_t *channel,
 {
    NV_STATUS status;

+    // The CSL context associated with a channel can be used by multiple
+    // threads. The IV sequence is thus guaranteed only while the channel is
+    // "locked for push". The channel/push lock is released in
+    // "uvm_channel_end_push", and at that time the GPU encryption operations
+    // have not executed yet. Therefore the caller has to use
+    // "uvm_conf_computing_log_gpu_encryption" to explicitly store IVs needed
+    // to perform CPU decryption and pass those IVs to this function after the
+    // push that did the encryption completes.
+    UVM_ASSERT(src_iv);
+
    uvm_mutex_lock(&channel->csl.ctx_lock);
    status = nvUvmInterfaceCslDecrypt(&channel->csl.ctx,
                                      size,
                                      (const NvU8 *) src_cipher,
                                      src_iv,
+                                      NV_U32_MAX,
                                      (NvU8 *) dst_plain,
                                      NULL,
                                      0,
@@ -456,6 +486,8 @@ NV_STATUS uvm_conf_computing_fault_decrypt(uvm_parent_gpu_t *parent_gpu,
                                           NvU8 valid)
 {
    NV_STATUS status;
+    NvU32 fault_entry_size = parent_gpu->fault_buffer_hal->entry_size(parent_gpu);
+    UvmCslContext *csl_context = &parent_gpu->fault_buffer_info.rm_info.replayable.cslCtx;

    // There is no dedicated lock for the CSL context associated with replayable
    // faults. The mutual exclusion required by the RM CSL API is enforced by
@@ -463,36 +495,148 @@ NV_STATUS uvm_conf_computing_fault_decrypt(uvm_parent_gpu_t *parent_gpu,
    // decryption is invoked as part of fault servicing.
    UVM_ASSERT(uvm_sem_is_locked(&parent_gpu->isr.replayable_faults.service_lock));

-    UVM_ASSERT(!uvm_parent_gpu_replayable_fault_buffer_is_uvm_owned(parent_gpu));
+    UVM_ASSERT(g_uvm_global.conf_computing_enabled);

-    status = nvUvmInterfaceCslDecrypt(&parent_gpu->fault_buffer_info.rm_info.replayable.cslCtx,
-                                      parent_gpu->fault_buffer_hal->entry_size(parent_gpu),
+    status = nvUvmInterfaceCslLogEncryption(csl_context, UVM_CSL_OPERATION_DECRYPT, fault_entry_size);
+
+    // Informing RM of an encryption/decryption should not fail
+    UVM_ASSERT(status == NV_OK);
+
+    status = nvUvmInterfaceCslDecrypt(csl_context,
+                                      fault_entry_size,
                                      (const NvU8 *) src_cipher,
                                      NULL,
+                                      NV_U32_MAX,
                                      (NvU8 *) dst_plain,
                                      &valid,
                                      sizeof(valid),
                                      (const NvU8 *) auth_tag_buffer);

-    if (status != NV_OK)
-        UVM_ERR_PRINT("nvUvmInterfaceCslDecrypt() failed: %s, GPU %s\n", nvstatusToString(status), parent_gpu->name);
+    if (status != NV_OK) {
+        UVM_ERR_PRINT("nvUvmInterfaceCslDecrypt() failed: %s, GPU %s\n",
+                      nvstatusToString(status),
+                      uvm_parent_gpu_name(parent_gpu));
+
+    }

    return status;
 }

-void uvm_conf_computing_fault_increment_decrypt_iv(uvm_parent_gpu_t *parent_gpu, NvU64 increment)
+void uvm_conf_computing_fault_increment_decrypt_iv(uvm_parent_gpu_t *parent_gpu)
 {
    NV_STATUS status;
+    NvU32 fault_entry_size = parent_gpu->fault_buffer_hal->entry_size(parent_gpu);
+    UvmCslContext *csl_context = &parent_gpu->fault_buffer_info.rm_info.replayable.cslCtx;

    // See comment in uvm_conf_computing_fault_decrypt
    UVM_ASSERT(uvm_sem_is_locked(&parent_gpu->isr.replayable_faults.service_lock));

-    UVM_ASSERT(!uvm_parent_gpu_replayable_fault_buffer_is_uvm_owned(parent_gpu));
+    UVM_ASSERT(g_uvm_global.conf_computing_enabled);

-    status = nvUvmInterfaceCslIncrementIv(&parent_gpu->fault_buffer_info.rm_info.replayable.cslCtx,
-                                          UVM_CSL_OPERATION_DECRYPT,
-                                          increment,
-                                          NULL);
+    status = nvUvmInterfaceCslLogEncryption(csl_context, UVM_CSL_OPERATION_DECRYPT, fault_entry_size);
+
+    // Informing RM of an encryption/decryption should not fail
+    UVM_ASSERT(status == NV_OK);
+
+    status = nvUvmInterfaceCslIncrementIv(csl_context, UVM_CSL_OPERATION_DECRYPT, 1, NULL);

    UVM_ASSERT(status == NV_OK);
 }
+
+void uvm_conf_computing_query_message_pools(uvm_channel_t *channel,
+                                            NvU64 *remaining_encryptions,
+                                            NvU64 *remaining_decryptions)
+{
+    NV_STATUS status;
+
+    UVM_ASSERT(channel);
+    UVM_ASSERT(remaining_encryptions);
+    UVM_ASSERT(remaining_decryptions);
+
+    uvm_mutex_lock(&channel->csl.ctx_lock);
+    status = nvUvmInterfaceCslQueryMessagePool(&channel->csl.ctx, UVM_CSL_OPERATION_ENCRYPT, remaining_encryptions);
+    UVM_ASSERT(status == NV_OK);
+    UVM_ASSERT(*remaining_encryptions <= NV_U32_MAX);
+
+    status = nvUvmInterfaceCslQueryMessagePool(&channel->csl.ctx, UVM_CSL_OPERATION_DECRYPT, remaining_decryptions);
+    UVM_ASSERT(status == NV_OK);
+    UVM_ASSERT(*remaining_decryptions <= NV_U32_MAX);
+
+    // LCIC channels never use CPU encrypt/GPU decrypt
+    if (uvm_channel_is_lcic(channel))
+        UVM_ASSERT(*remaining_encryptions == NV_U32_MAX);
+
+    uvm_mutex_unlock(&channel->csl.ctx_lock);
+}
+
+static NV_STATUS uvm_conf_computing_rotate_channel_ivs_below_limit_internal(uvm_channel_t *channel, NvU64 limit)
+{
+    NV_STATUS status = NV_OK;
+    NvU64 remaining_encryptions, remaining_decryptions;
+    bool rotate_encryption_iv, rotate_decryption_iv;
+
+    UVM_ASSERT(uvm_channel_is_locked_for_push(channel) ||
+               (uvm_channel_is_lcic(channel) && uvm_channel_manager_is_wlc_ready(channel->pool->manager)));
+
+    uvm_conf_computing_query_message_pools(channel, &remaining_encryptions, &remaining_decryptions);
+
+    // Ignore decryption limit for SEC2, only CE channels support
+    // GPU encrypt/CPU decrypt. However, RM reports _some_ decrementing
+    // value for SEC2 decryption counter.
+    rotate_decryption_iv = (remaining_decryptions <= limit) && uvm_channel_is_ce(channel);
+    rotate_encryption_iv = remaining_encryptions <= limit;
+
+    if (!rotate_encryption_iv && !rotate_decryption_iv)
+        return NV_OK;
+
+    // Wait for all in-flight pushes. The caller needs to guarantee that there
+    // are no concurrent pushes created, e.g. by only calling rotate after
+    // a channel is locked_for_push.
+    status = uvm_channel_wait(channel);
+    if (status != NV_OK)
+        return status;
+
+    uvm_mutex_lock(&channel->csl.ctx_lock);
+
+    if (rotate_encryption_iv)
+        status = nvUvmInterfaceCslRotateIv(&channel->csl.ctx, UVM_CSL_OPERATION_ENCRYPT);
+
+    if (status == NV_OK && rotate_decryption_iv)
+        status = nvUvmInterfaceCslRotateIv(&channel->csl.ctx, UVM_CSL_OPERATION_DECRYPT);
+
+    uvm_mutex_unlock(&channel->csl.ctx_lock);
+
+    // Change the error to out of resources if the available IVs are running
+    // too low
+    if (status == NV_ERR_STATE_IN_USE &&
+        (remaining_encryptions < UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MIN ||
+         remaining_decryptions < UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MIN))
+        return NV_ERR_INSUFFICIENT_RESOURCES;
+
+    return status;
+}
+
+NV_STATUS uvm_conf_computing_rotate_channel_ivs_below_limit(uvm_channel_t *channel, NvU64 limit, bool retry_if_busy)
+{
+    NV_STATUS status;
+
+    do {
+        status = uvm_conf_computing_rotate_channel_ivs_below_limit_internal(channel, limit);
+    } while (retry_if_busy && status == NV_ERR_STATE_IN_USE);
+
+    // Hide "busy" error. The rotation will be retried at the next opportunity.
+    if (!retry_if_busy && status == NV_ERR_STATE_IN_USE)
+        status = NV_OK;
+
+    return status;
+}
+
+NV_STATUS uvm_conf_computing_maybe_rotate_channel_ivs(uvm_channel_t *channel)
+{
+    return uvm_conf_computing_rotate_channel_ivs_below_limit(channel, uvm_conf_computing_channel_iv_rotation_limit, false);
+}
+
+NV_STATUS uvm_conf_computing_maybe_rotate_channel_ivs_retry_busy(uvm_channel_t *channel)
+{
+    return uvm_conf_computing_rotate_channel_ivs_below_limit(channel, uvm_conf_computing_channel_iv_rotation_limit, true);
+}
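
The limit handling and the busy-retry wrapper introduced in uvm_conf_computing.c can be summarized with a small standalone sketch. It assumes nothing about RM: `iv_limit_sanitize`, `rotate_below_limit_sketch`, `busy_n_times`, and the `ST_*` status values are hypothetical names used only for illustration, and the single rotation attempt is modeled as a caller-supplied callback.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same values as the UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_* macros. */
#define IV_LIMIT_MIN     2342ULL
#define IV_LIMIT_MAX     0xfffffffeULL
#define IV_LIMIT_DEFAULT (1ULL << 31)
#define IV_LIMIT_TESTS   ((1ULL << 32) - 500ULL)

/* Mirrors the range check in uvm_conf_computing_gpu_init(): the test limit
 * replaces an untouched default, and out-of-range values fall back to the
 * default. */
uint64_t iv_limit_sanitize(uint64_t requested, bool builtin_tests)
{
    if (builtin_tests && requested == IV_LIMIT_DEFAULT)
        requested = IV_LIMIT_TESTS;

    if (requested < IV_LIMIT_MIN || requested > IV_LIMIT_MAX)
        return IV_LIMIT_DEFAULT;

    return requested;
}

/* Hypothetical status codes standing in for NV_OK, NV_ERR_STATE_IN_USE and
 * any other error. */
typedef enum { ST_OK, ST_BUSY, ST_ERROR } status_sketch_t;

typedef status_sketch_t (*rotate_attempt_fn)(void *ctx);

/* Mirrors uvm_conf_computing_rotate_channel_ivs_below_limit(): either spin
 * until the attempt stops reporting "busy", or hide a single "busy" result
 * so the rotation is retried at the next opportunity. */
status_sketch_t rotate_below_limit_sketch(rotate_attempt_fn attempt, void *ctx, bool retry_if_busy)
{
    status_sketch_t status;

    do {
        status = attempt(ctx);
    } while (retry_if_busy && status == ST_BUSY);

    if (!retry_if_busy && status == ST_BUSY)
        status = ST_OK;

    return status;
}

/* Demo attempt: reports busy until the int counter behind ctx reaches 0. */
status_sketch_t busy_n_times(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return ST_BUSY;
    }

    return ST_OK;
}
```

In this model the non-retrying variant never surfaces a busy status to its caller, which matches the "maybe rotate" entry point; only the retrying variant can block until rotation succeeds or a different error is returned.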
--- a/kernel-open/nvidia-uvm/uvm_conf_computing.h
+++ b/kernel-open/nvidia-uvm/uvm_conf_computing.h
@@ -60,12 +60,8 @@
 // UVM_METHOD_SIZE * 2 * 10 = 80.
 #define UVM_CONF_COMPUTING_SIGN_BUF_MAX_SIZE 80

-// All GPUs derive confidential computing status from their parent.
-// By current policy all parent GPUs have identical confidential
-// computing status.
-NV_STATUS uvm_conf_computing_init_parent_gpu(const uvm_parent_gpu_t *parent);
-bool uvm_conf_computing_mode_enabled_parent(const uvm_parent_gpu_t *parent);
-bool uvm_conf_computing_mode_enabled(const uvm_gpu_t *gpu);
+void uvm_conf_computing_check_parent_gpu(const uvm_parent_gpu_t *parent);
+
 bool uvm_conf_computing_mode_is_hcc(const uvm_gpu_t *gpu);

 typedef struct
@@ -195,10 +191,27 @@ NV_STATUS uvm_conf_computing_fault_decrypt(uvm_parent_gpu_t *parent_gpu,
                                           NvU8 valid);

 // Increment the CPU-side decrypt IV of the CSL context associated with
-// replayable faults. The function is a no-op if the given increment is zero.
+// replayable faults.
 //
 // The IV associated with a fault CSL context is a 64-bit counter.
 //
 // Locking: this function must be invoked while holding the replayable ISR lock.
-void uvm_conf_computing_fault_increment_decrypt_iv(uvm_parent_gpu_t *parent_gpu, NvU64 increment);
+void uvm_conf_computing_fault_increment_decrypt_iv(uvm_parent_gpu_t *parent_gpu);
+
+// Query the number of remaining messages before IV needs to be rotated.
+void uvm_conf_computing_query_message_pools(uvm_channel_t *channel,
+                                            NvU64 *remaining_encryptions,
+                                            NvU64 *remaining_decryptions);
+
+// Check if there are more than uvm_conf_computing_channel_iv_rotation_limit
+// messages available in the channel and try to rotate if not.
+NV_STATUS uvm_conf_computing_maybe_rotate_channel_ivs(uvm_channel_t *channel);
+
+// Check if there are more than uvm_conf_computing_channel_iv_rotation_limit
+// messages available in the channel and rotate if not.
+NV_STATUS uvm_conf_computing_maybe_rotate_channel_ivs_retry_busy(uvm_channel_t *channel);
+
+// Check if there are fewer than 'limit' messages available in either direction
+// and rotate if so.
+NV_STATUS uvm_conf_computing_rotate_channel_ivs_below_limit(uvm_channel_t *channel, NvU64 limit, bool retry_if_busy);
 #endif // __UVM_CONF_COMPUTING_H__
--- a/kernel-open/nvidia-uvm/uvm_fault_buffer_flush_test.c
+++ b/kernel-open/nvidia-uvm/uvm_fault_buffer_flush_test.c
@@ -1,5 +1,5 @@
 /*******************************************************************************
-    Copyright (c) 2016-2019 NVIDIA Corporation
+    Copyright (c) 2016-2023 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -34,24 +34,27 @@ NV_STATUS uvm_test_fault_buffer_flush(UVM_TEST_FAULT_BUFFER_FLUSH_PARAMS *params
    NV_STATUS status = NV_OK;
    uvm_va_space_t *va_space = uvm_va_space_get(filp);
    uvm_gpu_t *gpu;
-    uvm_global_processor_mask_t retained_gpus;
+    uvm_processor_mask_t *retained_gpus;
    NvU64 i;

-    uvm_global_processor_mask_zero(&retained_gpus);
+    retained_gpus = uvm_processor_mask_cache_alloc();
+    if (!retained_gpus)
+        return NV_ERR_NO_MEMORY;
+
+    uvm_processor_mask_zero(retained_gpus);

    uvm_va_space_down_read(va_space);

-    for_each_va_space_gpu(gpu, va_space) {
-        if (gpu->parent->replayable_faults_supported)
-            uvm_global_processor_mask_set(&retained_gpus, gpu->global_id);
-    }
+    uvm_processor_mask_and(retained_gpus, &va_space->faultable_processors, &va_space->registered_gpus);

-    uvm_global_mask_retain(&retained_gpus);
+    uvm_global_gpu_retain(retained_gpus);

    uvm_va_space_up_read(va_space);

-    if (uvm_global_processor_mask_empty(&retained_gpus))
-        return NV_ERR_INVALID_DEVICE;
+    if (uvm_processor_mask_empty(retained_gpus)) {
+        status = NV_ERR_INVALID_DEVICE;
+        goto out;
+    }

    for (i = 0; i < params->iterations; i++) {
        if (fatal_signal_pending(current)) {
@@ -59,11 +62,12 @@ NV_STATUS uvm_test_fault_buffer_flush(UVM_TEST_FAULT_BUFFER_FLUSH_PARAMS *params
            break;
        }

-        for_each_global_gpu_in_mask(gpu, &retained_gpus)
+        for_each_gpu_in_mask(gpu, retained_gpus)
            TEST_CHECK_GOTO(uvm_gpu_fault_buffer_flush(gpu) == NV_OK, out);
    }

 out:
-    uvm_global_mask_release(&retained_gpus);
+    uvm_global_gpu_release(retained_gpus);
+    uvm_processor_mask_cache_free(retained_gpus);
    return status;
 }
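
The reworked test above replaces an on-stack mask with a heap-allocated one, which is why the early return on an empty mask becomes a `goto out`. A minimal sketch of that control flow follows, under the assumption that a processor mask can be modeled as a 64-bit bitmap; `mask_sketch_t`, `flush_faultable_gpus_sketch`, and the `STATUS_*` values are hypothetical and `malloc`/`free` stand in for the mask cache.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for uvm_processor_mask_t. */
typedef struct {
    uint64_t bits;
} mask_sketch_t;

typedef enum { STATUS_OK, STATUS_NO_MEMORY, STATUS_INVALID_DEVICE } flush_status_t;

/* Counts the GPUs that would be flushed: those that are both faultable and
 * registered. Every exit after the allocation goes through 'out' so the
 * mask is always freed. */
flush_status_t flush_faultable_gpus_sketch(uint64_t faultable, uint64_t registered, unsigned *flushed)
{
    flush_status_t status = STATUS_OK;
    mask_sketch_t *retained = malloc(sizeof(*retained));
    unsigned id;

    *flushed = 0;

    /* Nothing has been allocated yet, so a direct return is safe here. */
    if (!retained)
        return STATUS_NO_MEMORY;

    retained->bits = faultable & registered;

    if (retained->bits == 0) {
        status = STATUS_INVALID_DEVICE;
        goto out;
    }

    for (id = 0; id < 64; id++) {
        if (retained->bits & (1ULL << id))
            (*flushed)++;
    }

out:
    free(retained);
    return status;
}
```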
--- a/kernel-open/nvidia-uvm/uvm_get_rm_ptes_test.c
+++ b/kernel-open/nvidia-uvm/uvm_get_rm_ptes_test.c
@@ -1,5 +1,5 @@
 /*******************************************************************************
-    Copyright (c) 2016-2021 NVidia Corporation
+    Copyright (c) 2016-2024 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -119,10 +119,6 @@ static NV_STATUS verify_mapping_info(uvm_va_space_t *va_space,
    if (memory_owning_gpu == NULL)
        return NV_ERR_INVALID_DEVICE;

-    // TODO: Bug 1903234: Once RM supports indirect peer mappings, we'll need to
-    //       update this test since the aperture will be SYS. Depending on how
-    //       RM implements things, we might not be able to compare the physical
-    //       addresses either.
    aperture = get_aperture(va_space, memory_owning_gpu, memory_mapping_gpu, memory_info, sli_supported);

    if (is_cacheable(ext_mapping_info, aperture))
@@ -168,7 +164,8 @@ static NV_STATUS test_get_rm_ptes_single_gpu(uvm_va_space_t *va_space, UVM_TEST_
    client = params->hClient;
    memory = params->hMemory;

-    // Note: This check is safe as single GPU test does not run on SLI enabled devices.
+    // Note: This check is safe as single GPU test does not run on SLI enabled
+    // devices.
    memory_mapping_gpu = uvm_va_space_get_gpu_by_uuid_with_gpu_va_space(va_space, &params->gpu_uuid);
    if (!memory_mapping_gpu)
        return NV_ERR_INVALID_DEVICE;
@@ -180,7 +177,7 @@ static NV_STATUS test_get_rm_ptes_single_gpu(uvm_va_space_t *va_space, UVM_TEST_
    if (status != NV_OK)
        return status;

-    TEST_CHECK_GOTO(uvm_processor_uuid_eq(&memory_info.uuid, &params->gpu_uuid), done);
+    TEST_CHECK_GOTO(uvm_uuid_eq(&memory_info.uuid, &params->gpu_uuid), done);

    TEST_CHECK_GOTO((memory_info.size == params->size), done);

--- a/kernel-open/nvidia-uvm/uvm_global.c
+++ b/kernel-open/nvidia-uvm/uvm_global.c
@@ -1,5 +1,5 @@
 /*******************************************************************************
-    Copyright (c) 2015-2022 NVIDIA Corporation
+    Copyright (c) 2015-2024 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -27,6 +27,7 @@
 #include "uvm_gpu_replayable_faults.h"
 #include "uvm_mem.h"
 #include "uvm_perf_events.h"
+#include "uvm_processors.h"
 #include "uvm_procfs.h"
 #include "uvm_thread_context.h"
 #include "uvm_va_range.h"
@@ -71,11 +72,6 @@ static void uvm_unregister_callbacks(void)
    }
 }

-static void sev_init(const UvmPlatformInfo *platform_info)
-{
-    g_uvm_global.sev_enabled = platform_info->sevEnabled;
-}
-
 NV_STATUS uvm_global_init(void)
 {
    NV_STATUS status;
@@ -124,8 +120,13 @@ NV_STATUS uvm_global_init(void)

    uvm_ats_init(&platform_info);
    g_uvm_global.num_simulated_devices = 0;
+    g_uvm_global.conf_computing_enabled = platform_info.confComputingEnabled;

-    sev_init(&platform_info);
+    status = uvm_processor_mask_cache_init();
+    if (status != NV_OK) {
+        UVM_ERR_PRINT("uvm_processor_mask_cache_init() failed: %s\n", nvstatusToString(status));
+        goto error;
+    }

    status = uvm_gpu_init();
    if (status != NV_OK) {
@@ -229,6 +230,7 @@ void uvm_global_exit(void)
    uvm_mem_global_exit();
    uvm_pmm_sysmem_exit();
    uvm_gpu_exit();
+    uvm_processor_mask_cache_exit();

    if (g_uvm_global.rm_session_handle != 0)
        uvm_rm_locked_call_void(nvUvmInterfaceSessionDestroy(g_uvm_global.rm_session_handle));
@@ -247,19 +249,19 @@ void uvm_global_exit(void)

 // Signal to the top-half ISR whether calls from the RM's top-half ISR are to
 // be completed without processing.
-static void uvm_gpu_set_isr_suspended(uvm_gpu_t *gpu, bool is_suspended)
+static void uvm_parent_gpu_set_isr_suspended(uvm_parent_gpu_t *parent_gpu, bool is_suspended)
 {
-    uvm_spin_lock_irqsave(&gpu->parent->isr.interrupts_lock);
+    uvm_spin_lock_irqsave(&parent_gpu->isr.interrupts_lock);

-    gpu->parent->isr.is_suspended = is_suspended;
+    parent_gpu->isr.is_suspended = is_suspended;

-    uvm_spin_unlock_irqrestore(&gpu->parent->isr.interrupts_lock);
+    uvm_spin_unlock_irqrestore(&parent_gpu->isr.interrupts_lock);
 }

 static NV_STATUS uvm_suspend(void)
 {
    uvm_va_space_t *va_space = NULL;
-    uvm_global_gpu_id_t gpu_id;
+    uvm_gpu_id_t gpu_id;
    uvm_gpu_t *gpu;

    // Upon entry into this function, the following is true:
@@ -293,7 +295,7 @@ static NV_STATUS uvm_suspend(void)
    // Though global_lock isn't held here, pm.lock indirectly prevents the
    // addition and removal of GPUs, since these operations can currently
    // only occur in response to ioctl() calls.
-    for_each_global_gpu_id_in_mask(gpu_id, &g_uvm_global.retained_gpus) {
+    for_each_gpu_id_in_mask(gpu_id, &g_uvm_global.retained_gpus) {
        gpu = uvm_gpu_get(gpu_id);

        // Since fault buffer state may be lost across sleep cycles, UVM must
@@ -312,9 +314,9 @@ static NV_STATUS uvm_suspend(void)
        // interrupts in the bottom half in the future, the bottom half flush
        // below will no longer be able to guarantee that all outstanding
        // notifications have been handled.
-        uvm_gpu_access_counters_set_ignore(gpu, true);
+        uvm_parent_gpu_access_counters_set_ignore(gpu->parent, true);

-        uvm_gpu_set_isr_suspended(gpu, true);
+        uvm_parent_gpu_set_isr_suspended(gpu->parent, true);

        nv_kthread_q_flush(&gpu->parent->isr.bottom_half_q);

@@ -347,7 +349,7 @@ NV_STATUS uvm_suspend_entry(void)
 static NV_STATUS uvm_resume(void)
 {
    uvm_va_space_t *va_space = NULL;
-    uvm_global_gpu_id_t gpu_id;
+    uvm_gpu_id_t gpu_id;
    uvm_gpu_t *gpu;

    g_uvm_global.pm.is_suspended = false;
@@ -366,18 +368,18 @@ static NV_STATUS uvm_resume(void)
    uvm_mutex_unlock(&g_uvm_global.va_spaces.lock);

    // pm.lock is held in lieu of global_lock to prevent GPU addition/removal
-    for_each_global_gpu_id_in_mask(gpu_id, &g_uvm_global.retained_gpus) {
+    for_each_gpu_id_in_mask(gpu_id, &g_uvm_global.retained_gpus) {
        gpu = uvm_gpu_get(gpu_id);

        // Bring the fault buffer software state back in sync with the
        // hardware state.
-        uvm_gpu_fault_buffer_resume(gpu->parent);
+        uvm_parent_gpu_fault_buffer_resume(gpu->parent);

-        uvm_gpu_set_isr_suspended(gpu, false);
+        uvm_parent_gpu_set_isr_suspended(gpu->parent, false);

        // Reenable access counter interrupt processing unless notifications
        // have been set to be suppressed.
-        uvm_gpu_access_counters_set_ignore(gpu, false);
+        uvm_parent_gpu_access_counters_set_ignore(gpu->parent, false);
    }

    uvm_up_write(&g_uvm_global.pm.lock);
@@ -431,35 +433,36 @@ NV_STATUS uvm_global_reset_fatal_error(void)
    return nv_atomic_xchg(&g_uvm_global.fatal_error, NV_OK);
 }

-void uvm_global_mask_retain(const uvm_global_processor_mask_t *mask)
+void uvm_global_gpu_retain(const uvm_processor_mask_t *mask)
 {
    uvm_gpu_t *gpu;
-    for_each_global_gpu_in_mask(gpu, mask)
+
+    for_each_gpu_in_mask(gpu, mask)
        uvm_gpu_retain(gpu);
 }

-void uvm_global_mask_release(const uvm_global_processor_mask_t *mask)
+void uvm_global_gpu_release(const uvm_processor_mask_t *mask)
 {
-    uvm_global_gpu_id_t gpu_id;
+    uvm_gpu_id_t gpu_id;

-    if (uvm_global_processor_mask_empty(mask))
+    if (uvm_processor_mask_empty(mask))
        return;

    uvm_mutex_lock(&g_uvm_global.global_lock);

-    // Do not use for_each_global_gpu_in_mask as it reads the GPU state and it
-    // might get destroyed
-    for_each_global_gpu_id_in_mask(gpu_id, mask)
+    // Do not use for_each_gpu_in_mask as it reads the GPU state and it
+    // might get destroyed.
+    for_each_gpu_id_in_mask(gpu_id, mask)
        uvm_gpu_release_locked(uvm_gpu_get(gpu_id));

    uvm_mutex_unlock(&g_uvm_global.global_lock);
 }

-NV_STATUS uvm_global_mask_check_ecc_error(uvm_global_processor_mask_t *gpus)
+NV_STATUS uvm_global_gpu_check_ecc_error(uvm_processor_mask_t *gpus)
 {
    uvm_gpu_t *gpu;

-    for_each_global_gpu_in_mask(gpu, gpus) {
+    for_each_gpu_in_mask(gpu, gpus) {
        NV_STATUS status = uvm_gpu_check_ecc_error(gpu);
        if (status != NV_OK)
            return status;
--- a/kernel-open/nvidia-uvm/uvm_global.h
+++ b/kernel-open/nvidia-uvm/uvm_global.h
@@ -1,5 +1,5 @@
 /*******************************************************************************
-    Copyright (c) 2015-2021 NVIDIA Corporation
+    Copyright (c) 2015-2023 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -40,13 +40,13 @@ struct uvm_global_struct
    // Note that GPUs are added to this mask as the last step of add_gpu() and
    // removed from it as the first step of remove_gpu() implying that a GPU
    // that's being initialized or deinitialized will not be in it.
-    uvm_global_processor_mask_t retained_gpus;
+    uvm_processor_mask_t retained_gpus;

    // Array of the parent GPUs registered with UVM. Note that GPUs will have
-    // ids offset by 1 to accomodate the UVM_GLOBAL_ID_CPU so e.g.
-    // parent_gpus[0] will have GPU id = 1. A GPU entry is unused iff it does
-    // not exist (is a NULL pointer) in this table.
-    uvm_parent_gpu_t *parent_gpus[UVM_MAX_GPUS];
+    // ids offset by 1 to accommodate the UVM_ID_CPU so e.g., parent_gpus[0]
+    // will have GPU id = 1. A GPU entry is unused iff it does not exist
+    // (is a NULL pointer) in this table.
+    uvm_parent_gpu_t *parent_gpus[UVM_PARENT_ID_MAX_GPUS];

    // A global RM session (RM client)
    // Created on module load and destroyed on module unload
@@ -143,11 +143,16 @@ struct uvm_global_struct
        struct page *page;
    } unload_state;

-    // AMD Secure Encrypted Virtualization (SEV) status. True if VM has SEV
-    // enabled. This field is set once during global initialization
-    // (uvm_global_init), and can be read afterwards without acquiring any
-    // locks.
-    bool sev_enabled;
+    // True if the VM has AMD's SEV, or equivalent HW security extensions such
+    // as Intel's TDX, enabled. The flag is always false on the host.
+    //
+    // This value moves in tandem with that of Confidential Computing in the
+    // GPU(s) in all supported configurations, so it is used as a proxy for the
+    // Confidential Computing state.
+    //
+    // This field is set once during global initialization (uvm_global_init),
+    // and can be read afterwards without acquiring any locks.
+    bool conf_computing_enabled;
 };

 // Initialize global uvm state
@@ -167,7 +172,7 @@ NV_STATUS uvm_resume_entry(void);
 // LOCKING: requires that you hold the global lock and gpu_table_lock
 static void uvm_global_add_parent_gpu(uvm_parent_gpu_t *parent_gpu)
 {
-    NvU32 gpu_index = uvm_id_gpu_index(parent_gpu->id);
+    NvU32 gpu_index = uvm_parent_id_gpu_index(parent_gpu->id);

    uvm_assert_mutex_locked(&g_uvm_global.global_lock);
    uvm_assert_spinlock_locked(&g_uvm_global.gpu_table_lock);
@@ -181,7 +186,7 @@ static void uvm_global_add_parent_gpu(uvm_parent_gpu_t *parent_gpu)
 // LOCKING: requires that you hold the global lock and gpu_table_lock
 static void uvm_global_remove_parent_gpu(uvm_parent_gpu_t *parent_gpu)
 {
-    NvU32 gpu_index = uvm_id_gpu_index(parent_gpu->id);
+    NvU32 gpu_index = uvm_parent_id_gpu_index(parent_gpu->id);

    uvm_assert_mutex_locked(&g_uvm_global.global_lock);
    uvm_assert_spinlock_locked(&g_uvm_global.gpu_table_lock);
@@ -196,41 +201,25 @@ static void uvm_global_remove_parent_gpu(uvm_parent_gpu_t *parent_gpu)
 //
 // LOCKING: requires that you hold the gpu_table_lock, the global lock, or have
 // retained at least one of the child GPUs.
-static uvm_parent_gpu_t *uvm_parent_gpu_get(uvm_gpu_id_t id)
+static uvm_parent_gpu_t *uvm_parent_gpu_get(uvm_parent_gpu_id_t id)
 {
-    return g_uvm_global.parent_gpus[uvm_id_gpu_index(id)];
+    return g_uvm_global.parent_gpus[uvm_parent_id_gpu_index(id)];
 }

-// Get a gpu by its global id.
+// Get a gpu by its GPU id.
 // Returns a pointer to the GPU object, or NULL if not found.
 //
 // LOCKING: requires that you hold the gpu_table_lock, the global_lock, or have
 // retained the gpu.
-static uvm_gpu_t *uvm_gpu_get(uvm_global_gpu_id_t global_gpu_id)
+static uvm_gpu_t *uvm_gpu_get(uvm_gpu_id_t gpu_id)
 {
    uvm_parent_gpu_t *parent_gpu;

-    parent_gpu = g_uvm_global.parent_gpus[uvm_id_gpu_index_from_global_gpu_id(global_gpu_id)];
+    parent_gpu = g_uvm_global.parent_gpus[uvm_parent_id_gpu_index_from_gpu_id(gpu_id)];
    if (!parent_gpu)
        return NULL;

-    return parent_gpu->gpus[uvm_global_id_sub_processor_index(global_gpu_id)];
-}
-
-// Get a gpu by its processor id.
-// Returns a pointer to the GPU object, or NULL if not found.
-//
-// LOCKING: requires that you hold the gpu_table_lock, the global_lock, or have
-// retained the gpu.
-static uvm_gpu_t *uvm_gpu_get_by_processor_id(uvm_processor_id_t id)
-{
-    uvm_global_gpu_id_t global_id = uvm_global_gpu_id_from_gpu_id(id);
-    uvm_gpu_t *gpu = uvm_gpu_get(global_id);
-
-    if (gpu)
-        UVM_ASSERT(!gpu->parent->smc.enabled);
-
-    return gpu;
+    return parent_gpu->gpus[uvm_id_sub_processor_index(gpu_id)];
 }

 static uvmGpuSessionHandle uvm_global_session_handle(void)
@@ -287,56 +276,57 @@ static NV_STATUS uvm_global_get_status(void)
 // reset call was made.
 NV_STATUS uvm_global_reset_fatal_error(void);

-static uvm_gpu_t *uvm_global_processor_mask_find_first_gpu(const uvm_global_processor_mask_t *global_gpus)
+static uvm_gpu_t *uvm_processor_mask_find_first_gpu(const uvm_processor_mask_t *gpus)
 {
    uvm_gpu_t *gpu;
-    uvm_global_gpu_id_t gpu_id = uvm_global_processor_mask_find_first_gpu_id(global_gpus);
+    uvm_gpu_id_t gpu_id = uvm_processor_mask_find_first_gpu_id(gpus);

-    if (UVM_GLOBAL_ID_IS_INVALID(gpu_id))
+    if (UVM_ID_IS_INVALID(gpu_id))
        return NULL;

    gpu = uvm_gpu_get(gpu_id);

    // If there is valid GPU id in the mask, assert that the corresponding
    // uvm_gpu_t is present. Otherwise it would stop a
-    // for_each_global_gpu_in_mask() loop pre-maturely. Today, this could only
+    // for_each_gpu_in_mask() loop prematurely. Today, this could only
    // happen in remove_gpu() because the GPU being removed is deleted from the
    // global table very early.
-    UVM_ASSERT_MSG(gpu, "gpu_id %u\n", uvm_global_id_value(gpu_id));
+    UVM_ASSERT_MSG(gpu, "gpu_id %u\n", uvm_id_value(gpu_id));

    return gpu;
 }

-static uvm_gpu_t *__uvm_global_processor_mask_find_next_gpu(const uvm_global_processor_mask_t *global_gpus, uvm_gpu_t *gpu)
+static uvm_gpu_t *__uvm_processor_mask_find_next_gpu(const uvm_processor_mask_t *gpus, uvm_gpu_t *gpu)
 {
-    uvm_global_gpu_id_t gpu_id;
+    uvm_gpu_id_t gpu_id;

    UVM_ASSERT(gpu);

-    gpu_id = uvm_global_processor_mask_find_next_id(global_gpus, uvm_global_gpu_id_next(gpu->global_id));
-    if (UVM_GLOBAL_ID_IS_INVALID(gpu_id))
+    gpu_id = uvm_processor_mask_find_next_id(gpus, uvm_gpu_id_next(gpu->id));
+    if (UVM_ID_IS_INVALID(gpu_id))
        return NULL;

    gpu = uvm_gpu_get(gpu_id);

-    // See comment in uvm_global_processor_mask_find_first_gpu().
-    UVM_ASSERT_MSG(gpu, "gpu_id %u\n", uvm_global_id_value(gpu_id));
+    // See comment in uvm_processor_mask_find_first_gpu().
+    UVM_ASSERT_MSG(gpu, "gpu_id %u\n", uvm_id_value(gpu_id));

    return gpu;
 }

 // Helper to iterate over all GPUs in the input mask
-#define for_each_global_gpu_in_mask(gpu, global_mask)                                         \
-    for (gpu = uvm_global_processor_mask_find_first_gpu(global_mask);                         \
-         gpu != NULL;                                                                         \
-         gpu = __uvm_global_processor_mask_find_next_gpu(global_mask, gpu))
+#define for_each_gpu_in_mask(gpu, mask)                         \
+    for (gpu = uvm_processor_mask_find_first_gpu(mask);         \
+         gpu != NULL;                                           \
+         gpu = __uvm_processor_mask_find_next_gpu(mask, gpu))

-// Helper to iterate over all GPUs retained by the UVM driver (across all va spaces)
-#define for_each_global_gpu(gpu)                                                              \
-    for (({uvm_assert_mutex_locked(&g_uvm_global.global_lock);                                \
-           gpu = uvm_global_processor_mask_find_first_gpu(&g_uvm_global.retained_gpus);});    \
-           gpu != NULL;                                                                       \
-           gpu = __uvm_global_processor_mask_find_next_gpu(&g_uvm_global.retained_gpus, gpu))
+// Helper to iterate over all GPUs retained by the UVM driver
+// (across all va spaces).
+#define for_each_gpu(gpu)                                                              \
+    for (({uvm_assert_mutex_locked(&g_uvm_global.global_lock);                         \
+           gpu = uvm_processor_mask_find_first_gpu(&g_uvm_global.retained_gpus);});    \
+           gpu != NULL;                                                                \
+           gpu = __uvm_processor_mask_find_next_gpu(&g_uvm_global.retained_gpus, gpu))

 // LOCKING: Must hold either the global_lock or the gpu_table_lock
 static uvm_parent_gpu_t *uvm_global_find_next_parent_gpu(uvm_parent_gpu_t *parent_gpu)
@@ -344,7 +334,7 @@ static uvm_parent_gpu_t *uvm_global_find_next_parent_gpu(uvm_parent_gpu_t *paren
    NvU32 i;

    if (parent_gpu) {
-        NvU32 gpu_index = uvm_id_gpu_index(parent_gpu->id);
+        NvU32 gpu_index = uvm_parent_id_gpu_index(parent_gpu->id);
        i = gpu_index + 1;
    }
    else {
@@ -353,7 +343,7 @@ static uvm_parent_gpu_t *uvm_global_find_next_parent_gpu(uvm_parent_gpu_t *paren

    parent_gpu = NULL;

-    while (i < UVM_MAX_GPUS) {
+    while (i < UVM_PARENT_ID_MAX_GPUS) {
        if (g_uvm_global.parent_gpus[i]) {
            parent_gpu = g_uvm_global.parent_gpus[i];
            break;
@@ -369,18 +359,18 @@ static uvm_parent_gpu_t *uvm_global_find_next_parent_gpu(uvm_parent_gpu_t *paren
 static uvm_gpu_t *uvm_gpu_find_next_valid_gpu_in_parent(uvm_parent_gpu_t *parent_gpu, uvm_gpu_t *cur_gpu)
 {
    uvm_gpu_t *gpu = NULL;
-    uvm_global_gpu_id_t global_gpu_id;
+    uvm_gpu_id_t gpu_id;
    NvU32 sub_processor_index;
    NvU32 cur_sub_processor_index;

    UVM_ASSERT(parent_gpu);

-    global_gpu_id = uvm_global_gpu_id_from_gpu_id(parent_gpu->id);
-    cur_sub_processor_index = cur_gpu ? uvm_global_id_sub_processor_index(cur_gpu->global_id) : -1;
+    gpu_id = uvm_gpu_id_from_parent_gpu_id(parent_gpu->id);
+    cur_sub_processor_index = cur_gpu ? uvm_id_sub_processor_index(cur_gpu->id) : -1;

-    sub_processor_index = find_next_bit(parent_gpu->valid_gpus, UVM_ID_MAX_SUB_PROCESSORS, cur_sub_processor_index + 1);
-    if (sub_processor_index < UVM_ID_MAX_SUB_PROCESSORS) {
-        gpu = uvm_gpu_get(uvm_global_id_from_value(uvm_global_id_value(global_gpu_id) + sub_processor_index));
+    sub_processor_index = find_next_bit(parent_gpu->valid_gpus, UVM_PARENT_ID_MAX_SUB_PROCESSORS, cur_sub_processor_index + 1);
+    if (sub_processor_index < UVM_PARENT_ID_MAX_SUB_PROCESSORS) {
+        gpu = uvm_gpu_get(uvm_id_from_value(uvm_id_value(gpu_id) + sub_processor_index));
        UVM_ASSERT(gpu != NULL);
    }

@@ -400,18 +390,18 @@ static uvm_gpu_t *uvm_gpu_find_next_valid_gpu_in_parent(uvm_parent_gpu_t *parent
         (gpu) != NULL;                                                                         \
         (gpu) = uvm_gpu_find_next_valid_gpu_in_parent((parent_gpu), (gpu)))

-// Helper which calls uvm_gpu_retain on each GPU in mask
-void uvm_global_mask_retain(const uvm_global_processor_mask_t *mask);
+// Helper which calls uvm_gpu_retain() on each GPU in mask.
+void uvm_global_gpu_retain(const uvm_processor_mask_t *mask);

 // Helper which calls uvm_gpu_release_locked on each GPU in mask.
 //
 // LOCKING: this function takes and releases the global lock if the input mask
 //          is not empty
-void uvm_global_mask_release(const uvm_global_processor_mask_t *mask);
+void uvm_global_gpu_release(const uvm_processor_mask_t *mask);

 // Check for ECC errors for all GPUs in a mask
 // Notably this check cannot be performed where it's not safe to call into RM.
-NV_STATUS uvm_global_mask_check_ecc_error(uvm_global_processor_mask_t *gpus);
+NV_STATUS uvm_global_gpu_check_ecc_error(uvm_processor_mask_t *gpus);

 // Pre-allocate fault service contexts.
 NV_STATUS uvm_service_block_context_init(void);
@@ -419,4 +409,10 @@ NV_STATUS uvm_service_block_context_init(void);
 // Release fault service contexts if any exist.
 void uvm_service_block_context_exit(void);

+// Allocate a service block context
+uvm_service_block_context_t *uvm_service_block_context_alloc(struct mm_struct *mm);
+
+// Free a service block context
+void uvm_service_block_context_free(uvm_service_block_context_t *service_context);
+
 #endif // __UVM_GLOBAL_H__
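
The header changes above combine two ideas: a parent-GPU table whose slots are offset by one because id 0 is reserved for the CPU, and a mask iterator built from a find-first/find-next pair. The following is a minimal, self-contained sketch of that pattern, not the driver's code. Every `sketch_` and `SKETCH_` name is hypothetical, the table sizes are made up, and the divide/modulo split of a GPU index into parent slot and sub-processor slot is an assumption; the diff only shows that sub-processor ids are contiguous after the parent's first GPU id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes chosen for this sketch; not the driver's values. */
#define SKETCH_MAX_PARENTS        4
#define SKETCH_MAX_SUB_PROCESSORS 2
#define SKETCH_MAX_GPUS (SKETCH_MAX_PARENTS * SKETCH_MAX_SUB_PROCESSORS)

typedef struct { uint32_t id; } sketch_gpu_t;

typedef struct {
    sketch_gpu_t *gpus[SKETCH_MAX_SUB_PROCESSORS];
} sketch_parent_gpu_t;

/* A NULL slot means the parent GPU does not exist. */
static sketch_parent_gpu_t *sketch_parents[SKETCH_MAX_PARENTS];

/* Id 0 is reserved for the CPU, so GPU id N maps to flat index N - 1. */
static uint32_t sketch_gpu_index(uint32_t gpu_id)
{
    assert(gpu_id >= 1 && gpu_id <= SKETCH_MAX_GPUS);
    return gpu_id - 1;
}

/* Two-level lookup: parent slot first, then the sub-processor slot.
 * The divide/modulo split is an assumption made for illustration. */
static sketch_gpu_t *sketch_gpu_get(uint32_t gpu_id)
{
    uint32_t index = sketch_gpu_index(gpu_id);
    sketch_parent_gpu_t *parent = sketch_parents[index / SKETCH_MAX_SUB_PROCESSORS];

    if (!parent)
        return NULL;

    return parent->gpus[index % SKETCH_MAX_SUB_PROCESSORS];
}

/* Return the GPU for the first id bit set in mask at or after start_id,
 * or NULL if no such bit exists. Bit 0 (the CPU) is never visited.
 * If a bit is set but the GPU is missing from the table, this also returns
 * NULL and the loop below ends early, which is the situation the assertion
 * in the original header guards against. */
static sketch_gpu_t *sketch_mask_find_gpu(uint32_t mask, uint32_t start_id)
{
    uint32_t id;

    for (id = start_id < 1 ? 1 : start_id; id <= SKETCH_MAX_GPUS; id++) {
        if (mask & (1u << id))
            return sketch_gpu_get(id);
    }

    return NULL;
}

/* Iterate over every GPU whose id bit is set in mask. */
#define sketch_for_each_gpu_in_mask(gpu, mask)                  \
    for ((gpu) = sketch_mask_find_gpu((mask), 1);               \
         (gpu) != NULL;                                         \
         (gpu) = sketch_mask_find_gpu((mask), (gpu)->id + 1))

/* Example consumer: count the GPUs reachable through the mask. */
static unsigned sketch_count_gpus_in_mask(uint32_t mask)
{
    sketch_gpu_t *gpu;
    unsigned count = 0;

    sketch_for_each_gpu_in_mask(gpu, mask)
        count++;

    return count;
}
```

Because the loop terminates on the first NULL, a set mask bit without a matching table entry silently truncates the iteration. That is presumably why the real helpers assert on a missing GPU rather than returning NULL quietly.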
Author	SHA1	Message	Date
Gaurav Juvekar	448d5cc656	560.28.03	2024-07-19 15:45:15 -07:00
Bernhard Stoeckner	5fdf5032fb	555.58.02 (cherry picked from commit `1795a8bb20`)	2024-07-19 15:38:11 -07:00
Milos Tijanic	171c735e57	555.58 (cherry picked from commit `af77e083a2`)	2024-07-19 15:38:08 -07:00
Bernhard Stoeckner	74ee05e160	555.52.04 (cherry picked from commit `78d807e001`)	2024-07-19 15:38:04 -07:00
Bernhard Stoeckner	3084c04453	555.42.02 (cherry picked from commit `5a1c474040`)	2024-07-19 15:38:00 -07:00
Bernhard Stoeckner	caa2dd11a0	550.100	2024-07-09 15:49:19 +02:00
Bernhard Stoeckner	e45d91de02	550.90.07	2024-06-04 13:48:03 +02:00
Bernhard Stoeckner	083cd9cf17	550.78	2024-04-25 16:24:58 +02:00
Bernhard Stoeckner	ea4c27fad6	550.76	2024-04-17 17:23:37 +02:00
Bernhard Stoeckner	3bf16b890c	550.67	2024-03-19 16:56:28 +01:00
Bernhard Stoeckner	12933b2d3c	550.54.15	2024-03-18 17:52:11 +01:00
Bernhard Stoeckner	476bd34534	550.54.14	2024-02-23 16:37:56 +01:00
Bernhard Stoeckner	91676d6628	550.40.07	2024-01-24 18:28:48 +01:00
Bernhard Stöckner	bb2dac1f20	Update 20_build_bug.yml	2024-01-23 15:30:14 +01:00
Maneet Singh	4c29105335	545.29.06	2023-11-21 13:38:23 -08:00
Andy Ritger	be3cd9abcb	545.29.02	2023-10-31 16:31:08 -07:00
Andy Ritger	a2f89d6b59	Revert "545.29.03" This reverts commit `f364378a65`. 545.29.03 and 545.29.02 are functionally the same for purposes of open-gpu-kernel-modules, but there was poor NVIDIA-internal communication about which driver would actually be released. Revert 545.29.03 so that a subsequent commit can provide 545.29.02 cleanly.	2023-10-31 16:28:17 -07:00
Maneet Singh	f364378a65	545.29.03	2023-10-31 09:44:03 -07:00