Mirror of https://github.com/NVIDIA/open-gpu-kernel-modules.git, synced 2026-01-27 19:49:47 +00:00

Compare commits: 550.40.81...535.183.06 (9 commits)
| Author | SHA1 | Date | |
|---|---|---|---|
| | c588c3877f | | |
| | 4459285b60 | | |
| | f4bdce9a0a | | |
| | c042c7903d | | |
| | 044f70bbb8 | | |
| | 6d33efe502 | | |
| | ee55481a49 | | |
| | 7165299dee | | |
| | e573018659 | | |
8
.github/ISSUE_TEMPLATE/20_build_bug.yml
vendored
@@ -32,14 +32,6 @@ body:
description: "Which kernel are you running? (output of `uname -a`, say if you built it yourself)."
validations:
required: true
- type: checkboxes
id: sw_host_kernel_stable
attributes:
label: "Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels."
options:
- label: "I am running on a stable kernel release."
validations:
required: true
- type: textarea
id: bug_description
attributes:

71
CHANGELOG.md
@@ -1,62 +1,23 @@
# Changelog

## Release 550 Entries

### [550.40.80] 2024-11-20

### [550.40.79] 2024-10-24

### [550.40.76] 2024-10-06

### [550.40.75] 2024-09-25

### [550.40.71] 2024-08-30

### [550.40.70] 2024-08-22

### [550.40.67] 2024-08-06

### [550.40.65] 2024-06-28

### [550.40.63] 2024-05-31

### [550.40.61] 2024-04-23

### [550.40.59] 2024-04-01

### [550.40.55] 2024-03-07

### [550.40.53] 2024-02-28

#### Added

- Added vGPU Host and vGPU Guest support. For vGPU Host, please refer to the README.vgpu packaged in the vGPU Host Package for more details.

### [550.40.07] 2024-01-24

#### Fixed

- Set INSTALL_MOD_DIR only if it's not defined, [#570](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/570) by @keelung-yang
## Release 545 Entries

#### Fixed

- The brightness control of NVIDIA seems to be broken, [#573](https://github.com/NVIDIA/open-gpu-kernel-modules/issues/573)

### [545.29.02] 2023-10-31

### [545.23.06] 2023-10-17

#### Fixed

- Fix always-false conditional, [#493](https://github.com/NVIDIA/open-gpu-kernel-modules/pull/493) by @meme8383

#### Added

- Added beta-quality support for GeForce and Workstation GPUs. Please see the "Open Linux Kernel Modules" chapter in the NVIDIA GPU driver end user README for details.

## Release 535 Entries

### [535.183.06] 2024-07-09

### [535.183.01] 2024-06-04

### [535.179] 2024-05-09

### [535.171.04] 2024-03-21

### [535.161.08] 2024-03-18

### [535.161.07] 2024-02-22

### [535.154.05] 2024-01-16

### [535.146.02] 2023-12-07

### [535.129.03] 2023-10-31

### [535.113.01] 2023-09-21

55
README.md
@@ -1,7 +1,7 @@
# NVIDIA Linux Open GPU Kernel Module Source

This is the source release of the NVIDIA Linux open GPU kernel modules,
version 550.40.81.
version 535.183.06.


## How to Build
@@ -17,7 +17,7 @@ as root:

Note that the kernel modules built here must be used with GSP
firmware and user-space NVIDIA GPU driver components from a corresponding
550.40.81 driver release. This can be achieved by installing
535.183.06 driver release. This can be achieved by installing
the NVIDIA GPU driver from the .run file using the `--no-kernel-modules`
option. E.g.,

@@ -179,19 +179,16 @@ software applications.

## Compatible GPUs

The NVIDIA open kernel modules can be used on any Turing or later GPU
(see the table below). However, in the __DRIVER_VERION__ release, GeForce and
Workstation support is considered to be Beta quality. The open kernel modules
are suitable for broad usage, and NVIDIA requests feedback on any issues
encountered specific to them.
The open-gpu-kernel-modules can be used on any Turing or later GPU
(see the table below). However, in the 535.183.06 release,
GeForce and Workstation support is still considered alpha-quality.

For details on feature support and limitations, see the NVIDIA GPU driver
end user README here:
To enable use of the open kernel modules on GeForce and Workstation GPUs,
set the "NVreg_OpenRmEnableUnsupportedGpus" nvidia.ko kernel module
parameter to 1. For more details, see the NVIDIA GPU driver end user
README here:

https://us.download.nvidia.com/XFree86/Linux-x86_64/550.40.81/README/kernel_open.html

For vGPU support, please refer to the README.vgpu packaged in the vGPU Host
Package for more details.
https://us.download.nvidia.com/XFree86/Linux-x86_64/535.183.06/README/kernel_open.html

In the below table, if three IDs are listed, the first is the PCI Device
ID, the second is the PCI Subsystem Vendor ID, and the third is the PCI
@@ -654,9 +651,7 @@ Subsystem Device ID.
| NVIDIA T400E | 1FF2 103C 18FF |
| NVIDIA T400 4GB | 1FF2 103C 8A80 |
| NVIDIA T400 4GB | 1FF2 10DE 1613 |
| NVIDIA T400E | 1FF2 10DE 18FF |
| NVIDIA T400 4GB | 1FF2 17AA 1613 |
| NVIDIA T400E | 1FF2 17AA 18FF |
| Quadro T1000 | 1FF9 |
| NVIDIA A100-SXM4-40GB | 20B0 |
| NVIDIA A100-PG509-200 | 20B0 10DE 1450 |
@@ -754,13 +749,9 @@ Subsystem Device ID.
| NVIDIA H800 | 2324 10DE 17A8 |
| NVIDIA H20 | 2329 10DE 198B |
| NVIDIA H20 | 2329 10DE 198C |
| NVIDIA H20-3e | 232C 10DE 2063 |
| NVIDIA H20-3e | 232C 10DE 2064 |
| NVIDIA H100 80GB HBM3 | 2330 10DE 16C0 |
| NVIDIA H100 80GB HBM3 | 2330 10DE 16C1 |
| NVIDIA H100 PCIe | 2331 10DE 1626 |
| NVIDIA H200 | 2335 10DE 18BE |
| NVIDIA H200 | 2335 10DE 18BF |
| NVIDIA H100 | 2339 10DE 17FC |
| NVIDIA H800 NVL | 233A 10DE 183A |
| NVIDIA GH200 120GB | 2342 10DE 16EB |
@@ -834,14 +825,6 @@ Subsystem Device ID.
| NVIDIA GeForce RTX 3050 4GB Laptop GPU | 25AB |
| NVIDIA GeForce RTX 3050 6GB Laptop GPU | 25AC |
| NVIDIA GeForce RTX 2050 | 25AD |
| NVIDIA RTX A1000 | 25B0 1028 1878 |
| NVIDIA RTX A1000 | 25B0 103C 1878 |
| NVIDIA RTX A1000 | 25B0 10DE 1878 |
| NVIDIA RTX A1000 | 25B0 17AA 1878 |
| NVIDIA RTX A400 | 25B2 1028 1879 |
| NVIDIA RTX A400 | 25B2 103C 1879 |
| NVIDIA RTX A400 | 25B2 10DE 1879 |
| NVIDIA RTX A400 | 25B2 17AA 1879 |
| NVIDIA A16 | 25B6 10DE 14A9 |
| NVIDIA A2 | 25B6 10DE 157E |
| NVIDIA RTX A2000 Laptop GPU | 25B8 |
@@ -859,7 +842,6 @@ Subsystem Device ID.
| NVIDIA RTX A2000 Embedded GPU | 25FA |
| NVIDIA RTX A500 Embedded GPU | 25FB |
| NVIDIA GeForce RTX 4090 | 2684 |
| NVIDIA GeForce RTX 4090 D | 2685 |
| NVIDIA RTX 6000 Ada Generation | 26B1 1028 16A1 |
| NVIDIA RTX 6000 Ada Generation | 26B1 103C 16A1 |
| NVIDIA RTX 6000 Ada Generation | 26B1 10DE 16A1 |
@@ -868,28 +850,20 @@ Subsystem Device ID.
| NVIDIA RTX 5000 Ada Generation | 26B2 103C 17FA |
| NVIDIA RTX 5000 Ada Generation | 26B2 10DE 17FA |
| NVIDIA RTX 5000 Ada Generation | 26B2 17AA 17FA |
| NVIDIA RTX 5880 Ada Generation | 26B3 1028 1934 |
| NVIDIA RTX 5880 Ada Generation | 26B3 103C 1934 |
| NVIDIA RTX 5880 Ada Generation | 26B3 10DE 1934 |
| NVIDIA RTX 5880 Ada Generation | 26B3 17AA 1934 |
| NVIDIA L40 | 26B5 10DE 169D |
| NVIDIA L40 | 26B5 10DE 17DA |
| NVIDIA L40S | 26B9 10DE 1851 |
| NVIDIA L40S | 26B9 10DE 18CF |
| NVIDIA L20 | 26BA 10DE 1957 |
| NVIDIA L20 | 26BA 10DE 1990 |
| NVIDIA GeForce RTX 4080 SUPER | 2702 |
| NVIDIA GeForce RTX 4080 | 2704 |
| NVIDIA GeForce RTX 4070 Ti SUPER | 2705 |
| NVIDIA GeForce RTX 4070 | 2709 |
| NVIDIA GeForce RTX 4090 Laptop GPU | 2717 |
| NVIDIA RTX 5000 Ada Generation Laptop GPU | 2730 |
| NVIDIA GeForce RTX 4090 Laptop GPU | 2757 |
| NVIDIA RTX 5000 Ada Generation Embedded GPU | 2770 |
| NVIDIA GeForce RTX 4070 Ti | 2782 |
| NVIDIA GeForce RTX 4070 SUPER | 2783 |
| NVIDIA GeForce RTX 4070 | 2786 |
| NVIDIA GeForce RTX 4060 Ti | 2788 |
| NVIDIA GeForce RTX 4080 Laptop GPU | 27A0 |
| NVIDIA RTX 4000 SFF Ada Generation | 27B0 1028 16FA |
| NVIDIA RTX 4000 SFF Ada Generation | 27B0 103C 16FA |
@@ -912,21 +886,12 @@ Subsystem Device ID.
| NVIDIA RTX 3500 Ada Generation Embedded GPU | 27FB |
| NVIDIA GeForce RTX 4060 Ti | 2803 |
| NVIDIA GeForce RTX 4060 Ti | 2805 |
| NVIDIA GeForce RTX 4060 | 2808 |
| NVIDIA GeForce RTX 4070 Laptop GPU | 2820 |
| NVIDIA GeForce RTX 3050 A Laptop GPU | 2822 |
| NVIDIA RTX 3000 Ada Generation Laptop GPU | 2838 |
| NVIDIA GeForce RTX 4070 Laptop GPU | 2860 |
| NVIDIA GeForce RTX 4060 | 2882 |
| NVIDIA GeForce RTX 4060 Laptop GPU | 28A0 |
| NVIDIA GeForce RTX 4050 Laptop GPU | 28A1 |
| NVIDIA RTX 2000 Ada Generation | 28B0 1028 1870 |
| NVIDIA RTX 2000 Ada Generation | 28B0 103C 1870 |
| NVIDIA RTX 2000E Ada Generation | 28B0 103C 1871 |
| NVIDIA RTX 2000 Ada Generation | 28B0 10DE 1870 |
| NVIDIA RTX 2000E Ada Generation | 28B0 10DE 1871 |
| NVIDIA RTX 2000 Ada Generation | 28B0 17AA 1870 |
| NVIDIA RTX 2000E Ada Generation | 28B0 17AA 1871 |
| NVIDIA RTX 2000 Ada Generation Laptop GPU | 28B8 |
| NVIDIA RTX 1000 Ada Generation Laptop GPU | 28B9 |
| NVIDIA RTX 500 Ada Generation Laptop GPU | 28BA |

@@ -70,26 +70,14 @@ $(foreach _module, $(NV_KERNEL_MODULES), \

EXTRA_CFLAGS += -I$(src)/common/inc
EXTRA_CFLAGS += -I$(src)
EXTRA_CFLAGS += -Wall $(DEFINES) $(INCLUDES) -Wno-cast-qual -Wno-format-extra-args
EXTRA_CFLAGS += -Wall $(DEFINES) $(INCLUDES) -Wno-cast-qual -Wno-error -Wno-format-extra-args
EXTRA_CFLAGS += -D__KERNEL__ -DMODULE -DNVRM
EXTRA_CFLAGS += -DNV_VERSION_STRING=\"550.40.81\"
EXTRA_CFLAGS += -DNV_VERSION_STRING=\"535.183.06\"

ifneq ($(SYSSRCHOST1X),)
EXTRA_CFLAGS += -I$(SYSSRCHOST1X)
endif

# Some Android kernels prohibit driver use of filesystem functions like
# filp_open() and kernel_read(). Disable the NV_FILESYSTEM_ACCESS_AVAILABLE
# functionality that uses those functions when building for Android.

PLATFORM_IS_ANDROID ?= 0

ifeq ($(PLATFORM_IS_ANDROID),1)
EXTRA_CFLAGS += -DNV_FILESYSTEM_ACCESS_AVAILABLE=0
else
EXTRA_CFLAGS += -DNV_FILESYSTEM_ACCESS_AVAILABLE=1
endif

EXTRA_CFLAGS += -Wno-unused-function

ifneq ($(NV_BUILD_TYPE),debug)
@@ -104,6 +92,7 @@ endif

ifeq ($(NV_BUILD_TYPE),debug)
EXTRA_CFLAGS += -g
EXTRA_CFLAGS += $(call cc-option,-gsplit-dwarf,)
endif

EXTRA_CFLAGS += -ffreestanding
@@ -138,13 +127,6 @@ ifdef VGX_FORCE_VFIO_PCI_CORE
EXTRA_CFLAGS += -DNV_VGPU_FORCE_VFIO_PCI_CORE
endif

WARNINGS_AS_ERRORS ?=
ifeq ($(WARNINGS_AS_ERRORS),1)
ccflags-y += -Werror
else
ccflags-y += -Wno-error
endif

#
# The conftest.sh script tests various aspects of the target kernel.
# The per-module Kbuild files included above should:
@@ -172,7 +154,6 @@ NV_CFLAGS_FROM_CONFTEST := $(shell $(NV_CONFTEST_CMD) build_cflags)
NV_CONFTEST_CFLAGS = $(NV_CFLAGS_FROM_CONFTEST) $(EXTRA_CFLAGS) -fno-pie
NV_CONFTEST_CFLAGS += $(call cc-disable-warning,pointer-sign)
NV_CONFTEST_CFLAGS += $(call cc-option,-fshort-wchar,)
NV_CONFTEST_CFLAGS += -Wno-error

NV_CONFTEST_COMPILE_TEST_HEADERS := $(obj)/conftest/macros.h
NV_CONFTEST_COMPILE_TEST_HEADERS += $(obj)/conftest/functions.h
@@ -232,7 +213,102 @@ $(obj)/conftest/patches.h: $(NV_CONFTEST_SCRIPT)
@mkdir -p $(obj)/conftest
@$(NV_CONFTEST_CMD) patch_check > $@

include $(src)/header-presence-tests.mk

# Each of these headers is checked for presence with a test #include; a
# corresponding #define will be generated in conftest/headers.h.
NV_HEADER_PRESENCE_TESTS = \
asm/system.h \
drm/drmP.h \
drm/drm_auth.h \
drm/drm_gem.h \
drm/drm_crtc.h \
drm/drm_color_mgmt.h \
drm/drm_atomic.h \
drm/drm_atomic_helper.h \
drm/drm_atomic_state_helper.h \
drm/drm_encoder.h \
drm/drm_atomic_uapi.h \
drm/drm_drv.h \
drm/drm_framebuffer.h \
drm/drm_connector.h \
drm/drm_probe_helper.h \
drm/drm_blend.h \
drm/drm_fourcc.h \
drm/drm_prime.h \
drm/drm_plane.h \
drm/drm_vblank.h \
drm/drm_file.h \
drm/drm_ioctl.h \
drm/drm_device.h \
drm/drm_mode_config.h \
drm/drm_modeset_lock.h \
dt-bindings/interconnect/tegra_icc_id.h \
generated/autoconf.h \
generated/compile.h \
generated/utsrelease.h \
linux/efi.h \
linux/kconfig.h \
linux/platform/tegra/mc_utils.h \
linux/printk.h \
linux/ratelimit.h \
linux/prio_tree.h \
linux/log2.h \
linux/of.h \
linux/bug.h \
linux/sched.h \
linux/sched/mm.h \
linux/sched/signal.h \
linux/sched/task.h \
linux/sched/task_stack.h \
xen/ioemu.h \
linux/fence.h \
linux/dma-resv.h \
soc/tegra/chip-id.h \
soc/tegra/fuse.h \
soc/tegra/tegra_bpmp.h \
video/nv_internal.h \
linux/platform/tegra/dce/dce-client-ipc.h \
linux/nvhost.h \
linux/nvhost_t194.h \
linux/host1x-next.h \
asm/book3s/64/hash-64k.h \
asm/set_memory.h \
asm/prom.h \
asm/powernv.h \
linux/atomic.h \
asm/barrier.h \
asm/opal-api.h \
sound/hdaudio.h \
asm/pgtable_types.h \
asm/page.h \
linux/stringhash.h \
linux/dma-map-ops.h \
rdma/peer_mem.h \
sound/hda_codec.h \
linux/dma-buf.h \
linux/time.h \
linux/platform_device.h \
linux/mutex.h \
linux/reset.h \
linux/of_platform.h \
linux/of_device.h \
linux/of_gpio.h \
linux/gpio.h \
linux/gpio/consumer.h \
linux/interconnect.h \
linux/pm_runtime.h \
linux/clk.h \
linux/clk-provider.h \
linux/ioasid.h \
linux/stdarg.h \
linux/iosys-map.h \
asm/coco.h \
linux/vfio_pci_core.h \
linux/mdev.h \
soc/tegra/bpmp-abi.h \
soc/tegra/bpmp.h \
linux/cc_platform.h \
asm/cpufeature.h

# Filename to store the define for the header in $(1); this is only consumed by
# the rule below that concatenates all of these together.

@@ -57,15 +57,12 @@ else
-e 's/armv[0-7]\w\+/arm/' \
-e 's/aarch64/arm64/' \
-e 's/ppc64le/powerpc/' \
-e 's/riscv64/riscv/' \
)
endif

NV_KERNEL_MODULES ?= $(wildcard nvidia nvidia-uvm nvidia-vgpu-vfio nvidia-modeset nvidia-drm nvidia-peermem)
NV_KERNEL_MODULES := $(filter-out $(NV_EXCLUDE_KERNEL_MODULES), \
$(NV_KERNEL_MODULES))
INSTALL_MOD_DIR ?= kernel/drivers/video

NV_VERBOSE ?=
SPECTRE_V2_RETPOLINE ?= 0

@@ -77,7 +74,7 @@ else
KBUILD_PARAMS += NV_KERNEL_SOURCES=$(KERNEL_SOURCES)
KBUILD_PARAMS += NV_KERNEL_OUTPUT=$(KERNEL_OUTPUT)
KBUILD_PARAMS += NV_KERNEL_MODULES="$(NV_KERNEL_MODULES)"
KBUILD_PARAMS += INSTALL_MOD_DIR="$(INSTALL_MOD_DIR)"
KBUILD_PARAMS += INSTALL_MOD_DIR=kernel/drivers/video
KBUILD_PARAMS += NV_SPECTRE_V2=$(SPECTRE_V2_RETPOLINE)

.PHONY: modules module clean clean_conftest modules_install

@@ -1,43 +0,0 @@
/*
* SPDX-FileCopyrightText: Copyright (c) 2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
* SPDX-License-Identifier: MIT
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/


#ifndef _NV_CHARDEV_NUMBERS_H_
#define _NV_CHARDEV_NUMBERS_H_

// NVIDIA's reserved major character device number (Linux).
#define NV_MAJOR_DEVICE_NUMBER 195

// Minor numbers 0 to 247 reserved for regular devices
#define NV_MINOR_DEVICE_NUMBER_REGULAR_MAX 247

// Minor numbers 248 to 253 currently unused

// Minor number 254 reserved for the modeset device (provided by NVKMS)
#define NV_MINOR_DEVICE_NUMBER_MODESET_DEVICE 254

// Minor number 255 reserved for the control device
#define NV_MINOR_DEVICE_NUMBER_CONTROL_DEVICE 255

#endif // _NV_CHARDEV_NUMBERS_H_

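The minor-number layout documented in the header above (0 to 247 regular, 248 to 253 unused, 254 modeset, 255 control) can be sketched as a small classifier. This is only an illustration: the four `#define` values are copied from the header, while `nv_minor_class_t` and `nv_classify_minor()` are hypothetical names that do not exist in the driver.

```c
#include <assert.h>

/* Values copied from the header shown in the hunk above. */
#define NV_MAJOR_DEVICE_NUMBER                195
#define NV_MINOR_DEVICE_NUMBER_REGULAR_MAX    247
#define NV_MINOR_DEVICE_NUMBER_MODESET_DEVICE 254
#define NV_MINOR_DEVICE_NUMBER_CONTROL_DEVICE 255

/* Hypothetical classification type, for illustration only. */
typedef enum {
    NV_MINOR_CLASS_REGULAR,  /* 0..247: regular GPU device nodes */
    NV_MINOR_CLASS_UNUSED,   /* 248..253: currently unused */
    NV_MINOR_CLASS_MODESET,  /* 254: modeset device (NVKMS) */
    NV_MINOR_CLASS_CONTROL   /* 255: control device */
} nv_minor_class_t;

/* Hypothetical helper: map a minor number (0..255) to its class. */
static nv_minor_class_t nv_classify_minor(unsigned int minor)
{
    assert(minor <= 255);

    if (minor <= NV_MINOR_DEVICE_NUMBER_REGULAR_MAX)
        return NV_MINOR_CLASS_REGULAR;
    if (minor == NV_MINOR_DEVICE_NUMBER_MODESET_DEVICE)
        return NV_MINOR_CLASS_MODESET;
    if (minor == NV_MINOR_DEVICE_NUMBER_CONTROL_DEVICE)
        return NV_MINOR_CLASS_CONTROL;
    return NV_MINOR_CLASS_UNUSED;
}
```

Under this sketch, `nv_classify_minor(255)` yields `NV_MINOR_CLASS_CONTROL`, which corresponds to the value the 535 side of this comparison hardcodes as `NV_CONTROL_DEVICE_MINOR`.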
@@ -37,11 +37,13 @@ typedef enum _HYPERVISOR_TYPE
OS_HYPERVISOR_UNKNOWN
} HYPERVISOR_TYPE;

#define CMD_VFIO_WAKE_REMOVE_GPU 1
#define CMD_VGPU_VFIO_PRESENT 2
#define CMD_VFIO_PCI_CORE_PRESENT 3
#define CMD_VGPU_VFIO_WAKE_WAIT_QUEUE 0
#define CMD_VGPU_VFIO_INJECT_INTERRUPT 1
#define CMD_VGPU_VFIO_REGISTER_MDEV 2
#define CMD_VGPU_VFIO_PRESENT 3
#define CMD_VFIO_PCI_CORE_PRESENT 4

#define MAX_VF_COUNT_PER_GPU 64
#define MAX_VF_COUNT_PER_GPU 64

typedef enum _VGPU_TYPE_INFO
{
@@ -52,11 +54,17 @@ typedef enum _VGPU_TYPE_INFO

typedef struct
{
void *vgpuVfioRef;
void *waitQueue;
void *nv;
NvU32 domain;
NvU32 bus;
NvU32 device;
NvU32 return_status;
NvU32 *vgpuTypeIds;
NvU8 **vgpuNames;
NvU32 numVgpuTypes;
NvU32 domain;
NvU8 bus;
NvU8 slot;
NvU8 function;
NvBool is_virtfn;
} vgpu_vfio_info;

typedef struct

@@ -25,12 +25,14 @@
#ifndef NV_IOCTL_NUMA_H
#define NV_IOCTL_NUMA_H

#if defined(NV_LINUX)

#include <nv-ioctl-numbers.h>

#if defined(NV_KERNEL_INTERFACE_LAYER) && defined(NV_LINUX)
#if defined(NV_KERNEL_INTERFACE_LAYER)

#include <linux/types.h>
#elif defined (NV_KERNEL_INTERFACE_LAYER) && defined(NV_BSD)
#include <sys/stdint.h>

#else

#include <stdint.h>
@@ -79,3 +81,5 @@ typedef struct nv_ioctl_set_numa_status
#define NV_IOCTL_NUMA_STATUS_OFFLINE_FAILED 6

#endif

#endif

@@ -1,5 +1,5 @@
/*
* SPDX-FileCopyrightText: Copyright (c) 2020-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
* SPDX-FileCopyrightText: Copyright (c) 2020-2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
* SPDX-License-Identifier: MIT
*
* Permission is hereby granted, free of charge, to any person obtaining a
@@ -39,6 +39,5 @@
#define NV_ESC_QUERY_DEVICE_INTR (NV_IOCTL_BASE + 13)
#define NV_ESC_SYS_PARAMS (NV_IOCTL_BASE + 14)
#define NV_ESC_EXPORT_TO_DMABUF_FD (NV_IOCTL_BASE + 17)
#define NV_ESC_WAIT_OPEN_COMPLETE (NV_IOCTL_BASE + 18)

#endif

@@ -1,5 +1,5 @@
/*
* SPDX-FileCopyrightText: Copyright (c) 2020-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
* SPDX-FileCopyrightText: Copyright (c) 2020-2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
* SPDX-License-Identifier: MIT
*
* Permission is hereby granted, free of charge, to any person obtaining a
@@ -142,10 +142,4 @@ typedef struct nv_ioctl_export_to_dma_buf_fd
NvU32 status;
} nv_ioctl_export_to_dma_buf_fd_t;

typedef struct nv_ioctl_wait_open_complete
{
int rc;
NvU32 adapterStatus;
} nv_ioctl_wait_open_complete_t;

#endif

@@ -1,62 +0,0 @@
/*
* SPDX-FileCopyrightText: Copyright (c) 2016 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
* SPDX-License-Identifier: MIT
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/

#ifndef __NV_KTHREAD_QUEUE_OS_H__
#define __NV_KTHREAD_QUEUE_OS_H__

#include <linux/types.h> // atomic_t
#include <linux/list.h> // list
#include <linux/sched.h> // task_struct
#include <linux/numa.h> // NUMA_NO_NODE
#include <linux/semaphore.h>

#include "conftest.h"

struct nv_kthread_q
{
struct list_head q_list_head;
spinlock_t q_lock;

// This is a counting semaphore. It gets incremented and decremented
// exactly once for each item that is added to the queue.
struct semaphore q_sem;
atomic_t main_loop_should_exit;

struct task_struct *q_kthread;
};

struct nv_kthread_q_item
{
struct list_head q_list_node;
nv_q_func_t function_to_run;
void *function_args;
};


#ifndef NUMA_NO_NODE
#define NUMA_NO_NODE (-1)
#endif

#define NV_KTHREAD_NO_NODE NUMA_NO_NODE

#endif
@@ -24,14 +24,13 @@
#ifndef __NV_KTHREAD_QUEUE_H__
#define __NV_KTHREAD_QUEUE_H__

struct nv_kthread_q;
struct nv_kthread_q_item;
typedef struct nv_kthread_q nv_kthread_q_t;
typedef struct nv_kthread_q_item nv_kthread_q_item_t;
#include <linux/types.h> // atomic_t
#include <linux/list.h> // list
#include <linux/sched.h> // task_struct
#include <linux/numa.h> // NUMA_NO_NODE
#include <linux/semaphore.h>

typedef void (*nv_q_func_t)(void *args);

#include "nv-kthread-q-os.h"
#include "conftest.h"

////////////////////////////////////////////////////////////////////////////////
// nv_kthread_q:
@@ -86,6 +85,38 @@ typedef void (*nv_q_func_t)(void *args);
//
////////////////////////////////////////////////////////////////////////////////

typedef struct nv_kthread_q nv_kthread_q_t;
typedef struct nv_kthread_q_item nv_kthread_q_item_t;

typedef void (*nv_q_func_t)(void *args);

struct nv_kthread_q
{
struct list_head q_list_head;
spinlock_t q_lock;

// This is a counting semaphore. It gets incremented and decremented
// exactly once for each item that is added to the queue.
struct semaphore q_sem;
atomic_t main_loop_should_exit;

struct task_struct *q_kthread;
};

struct nv_kthread_q_item
{
struct list_head q_list_node;
nv_q_func_t function_to_run;
void *function_args;
};


#ifndef NUMA_NO_NODE
#define NUMA_NO_NODE (-1)
#endif

#define NV_KTHREAD_NO_NODE NUMA_NO_NODE

//
// The queue must not be used before calling this routine.
//
@@ -124,7 +155,10 @@ int nv_kthread_q_init_on_node(nv_kthread_q_t *q,
// This routine is the same as nv_kthread_q_init_on_node() with the exception
// that the queue stack will be allocated on the NUMA node of the caller.
//
int nv_kthread_q_init(nv_kthread_q_t *q, const char *qname);
static inline int nv_kthread_q_init(nv_kthread_q_t *q, const char *qname)
{
return nv_kthread_q_init_on_node(q, qname, NV_KTHREAD_NO_NODE);
}

//
// The caller is responsible for stopping all queues, by calling this routine

@@ -35,7 +35,6 @@
#include "os-interface.h"
#include "nv-timer.h"
#include "nv-time.h"
#include "nv-chardev-numbers.h"

#define NV_KERNEL_NAME "Linux"

@@ -249,7 +248,7 @@ NV_STATUS nvos_forward_error_to_cray(struct pci_dev *, NvU32,
#undef NV_SET_PAGES_UC_PRESENT
#endif

#if !defined(NVCPU_AARCH64) && !defined(NVCPU_PPC64LE) && !defined(NVCPU_RISCV64)
#if !defined(NVCPU_AARCH64) && !defined(NVCPU_PPC64LE)
#if !defined(NV_SET_MEMORY_UC_PRESENT) && !defined(NV_SET_PAGES_UC_PRESENT)
#error "This driver requires the ability to change memory types!"
#endif
@@ -407,6 +406,32 @@ extern int nv_pat_mode;
#define NV_GFP_DMA32 (NV_GFP_KERNEL)
#endif

extern NvBool nvos_is_chipset_io_coherent(void);

#if defined(NVCPU_X86_64)
#define CACHE_FLUSH() asm volatile("wbinvd":::"memory")
#define WRITE_COMBINE_FLUSH() asm volatile("sfence":::"memory")
#elif defined(NVCPU_AARCH64)
static inline void nv_flush_cache_cpu(void *info)
{
if (!nvos_is_chipset_io_coherent())
{
#if defined(NV_FLUSH_CACHE_ALL_PRESENT)
flush_cache_all();
#else
WARN_ONCE(0, "NVRM: kernel does not support flush_cache_all()\n");
#endif
}
}
#define CACHE_FLUSH() nv_flush_cache_cpu(NULL)
#define CACHE_FLUSH_ALL() on_each_cpu(nv_flush_cache_cpu, NULL, 1)
#define WRITE_COMBINE_FLUSH() mb()
#elif defined(NVCPU_PPC64LE)
#define CACHE_FLUSH() asm volatile("sync; \n" \
"isync; \n" ::: "memory")
#define WRITE_COMBINE_FLUSH() CACHE_FLUSH()
#endif

typedef enum
{
NV_MEMORY_TYPE_SYSTEM, /* Memory mapped for ROM, SBIOS and physical RAM. */
@@ -415,7 +440,7 @@ typedef enum
NV_MEMORY_TYPE_DEVICE_MMIO, /* All kinds of MMIO referred by NVRM e.g. BARs and MCFG of device */
} nv_memory_type_t;

#if defined(NVCPU_AARCH64) || defined(NVCPU_PPC64LE) || defined(NVCPU_RISCV64)
#if defined(NVCPU_AARCH64) || defined(NVCPU_PPC64LE)
#define NV_ALLOW_WRITE_COMBINING(mt) 1
#elif defined(NVCPU_X86_64)
#if defined(NV_ENABLE_PAT_SUPPORT)
@@ -728,6 +753,7 @@ static inline dma_addr_t nv_phys_to_dma(struct device *dev, NvU64 pa)
#define NV_VMA_FILE(vma) ((vma)->vm_file)

#define NV_DEVICE_MINOR_NUMBER(x) minor((x)->i_rdev)
#define NV_CONTROL_DEVICE_MINOR 255

#define NV_PCI_DISABLE_DEVICE(pci_dev) \
{ \
@@ -1350,19 +1376,7 @@ typedef struct nv_dma_map_s {
i < dm->mapping.discontig.submap_count; \
i++, sm = &dm->mapping.discontig.submaps[i])

/*
* On 4K ARM kernels, use max submap size a multiple of 64K to keep nv-p2p happy.
* Despite 4K OS pages, we still use 64K P2P pages due to dependent modules still using 64K.
* Instead of using (4G-4K), use max submap size as (4G-64K) since the mapped IOVA range
* must be aligned at 64K boundary.
*/
#if defined(CONFIG_ARM64_4K_PAGES)
#define NV_DMA_U32_MAX_4K_PAGES ((NvU32)((NV_U32_MAX >> PAGE_SHIFT) + 1))
#define NV_DMA_SUBMAP_MAX_PAGES ((NvU32)(NV_DMA_U32_MAX_4K_PAGES - 16))
#else
#define NV_DMA_SUBMAP_MAX_PAGES ((NvU32)(NV_U32_MAX >> PAGE_SHIFT))
#endif

#define NV_DMA_SUBMAP_IDX_TO_PAGE_IDX(s) (s * NV_DMA_SUBMAP_MAX_PAGES)

/*
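The submap-size comment in the hunk above ("instead of using (4G-4K), use max submap size as (4G-64K)") can be checked with a little arithmetic. The sketch below is a userspace restatement, not driver code: it assumes `NV_U32_MAX` equals `0xFFFFFFFF` and passes the page shift as a parameter instead of using the kernel's `PAGE_SHIFT`; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: NV_U32_MAX is the maximum 32-bit unsigned value. */
#define NV_U32_MAX UINT32_MAX

/* Mirrors the generic definition: NV_U32_MAX >> PAGE_SHIFT. */
static uint32_t submap_max_pages_default(unsigned int page_shift)
{
    assert(page_shift < 32);
    return (uint32_t)(NV_U32_MAX >> page_shift);
}

/* Mirrors the CONFIG_ARM64_4K_PAGES branch:
 * ((NV_U32_MAX >> PAGE_SHIFT) + 1) - 16. */
static uint32_t submap_max_pages_arm64_4k(unsigned int page_shift)
{
    assert(page_shift < 32);
    uint32_t u32_max_4k_pages = (uint32_t)((NV_U32_MAX >> page_shift) + 1);
    return u32_max_4k_pages - 16;
}

/* Convert a page count to a byte count for a given page shift. */
static uint64_t pages_to_bytes(uint32_t pages, unsigned int page_shift)
{
    return (uint64_t)pages << page_shift;
}
```

With a page shift of 12 (4K pages), the generic definition gives 1048575 pages, which is 4G-4K bytes, while the ARM64 4K branch gives 1048576 - 16 = 1048560 pages, which is 4G-64K bytes and therefore a multiple of 64K, as the comment requires.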
@@ -1442,11 +1456,6 @@ typedef struct coherent_link_info_s {
* baremetal OS environment it is System Physical Address(SPA) and in the case
* of virutalized OS environment it is Intermediate Physical Address(IPA) */
NvU64 gpu_mem_pa;

/* Physical address of the reserved portion of the GPU memory, applicable
* only in Grace Hopper self hosted passthrough virtualizatioan platform. */
NvU64 rsvd_mem_pa;

/* Bitmap of NUMA node ids, corresponding to the reserved PXMs,
* available for adding GPU memory to the kernel as system RAM */
DECLARE_BITMAP(free_node_bitmap, MAX_NUMNODES);
@@ -1594,30 +1603,6 @@ typedef struct nv_linux_state_s {

struct nv_dma_device dma_dev;
struct nv_dma_device niso_dma_dev;

/*
* Background kthread for handling deferred open operations
* (e.g. from O_NONBLOCK).
*
* Adding to open_q and reading/writing is_accepting_opens
* are protected by nvl->open_q_lock (not nvl->ldata_lock).
* This allows new deferred open operations to be enqueued without
* blocking behind previous ones (which hold nvl->ldata_lock).
*
* Adding to open_q is only safe if is_accepting_opens is true.
* This prevents open operations from racing with device removal.
*
* Stopping open_q is only safe after setting is_accepting_opens to false.
* This ensures that the open_q (and the larger nvl structure) will
* outlive any of the open operations enqueued.
*/
nv_kthread_q_t open_q;
NvBool is_accepting_opens;
struct semaphore open_q_lock;
#if defined(NV_VGPU_KVM_BUILD)
wait_queue_head_t wait;
NvS32 return_status;
#endif
} nv_linux_state_t;

extern nv_linux_state_t *nv_linux_devices;
@@ -1661,13 +1646,22 @@ typedef struct nvidia_event
|
||||
nv_event_t event;
|
||||
} nvidia_event_t;
|
||||
|
||||
typedef enum
|
||||
{
|
||||
NV_FOPS_STACK_INDEX_MMAP,
|
||||
NV_FOPS_STACK_INDEX_IOCTL,
|
||||
NV_FOPS_STACK_INDEX_COUNT
|
||||
} nvidia_entry_point_index_t;
|
||||
|
||||
typedef struct
|
||||
{
|
||||
nv_file_private_t nvfp;
|
||||
|
||||
nvidia_stack_t *sp;
|
||||
nvidia_stack_t *fops_sp[NV_FOPS_STACK_INDEX_COUNT];
|
||||
struct semaphore fops_sp_lock[NV_FOPS_STACK_INDEX_COUNT];
|
||||
nv_alloc_t *free_list;
|
||||
nv_linux_state_t *nvptr;
|
||||
void *nvptr;
|
||||
nvidia_event_t *event_data_head, *event_data_tail;
|
||||
NvBool dataless_event_pending;
|
||||
nv_spinlock_t fp_lock;
|
||||
@@ -1678,12 +1672,6 @@ typedef struct
|
||||
nv_alloc_mapping_context_t mmap_context;
|
||||
struct address_space mapping;
|
||||
|
||||
nv_kthread_q_item_t open_q_item;
|
||||
struct completion open_complete;
|
||||
nv_linux_state_t *deferred_open_nvl;
|
||||
int open_rc;
|
||||
NV_STATUS adapter_status;
|
||||
|
||||
struct list_head entry;
|
||||
} nv_linux_file_private_t;
|
||||
|
||||
@@ -1692,21 +1680,6 @@ static inline nv_linux_file_private_t *nv_get_nvlfp_from_nvfp(nv_file_private_t
|
||||
return container_of(nvfp, nv_linux_file_private_t, nvfp);
|
||||
}
|
||||
|
||||
static inline int nv_wait_open_complete_interruptible(nv_linux_file_private_t *nvlfp)
|
||||
{
|
||||
return wait_for_completion_interruptible(&nvlfp->open_complete);
|
||||
}
|
||||
|
||||
static inline void nv_wait_open_complete(nv_linux_file_private_t *nvlfp)
|
||||
{
|
||||
wait_for_completion(&nvlfp->open_complete);
|
||||
}
|
||||
|
||||
static inline NvBool nv_is_open_complete(nv_linux_file_private_t *nvlfp)
|
||||
{
|
||||
return completion_done(&nvlfp->open_complete);
|
||||
}
|
||||
|
||||
#define NV_SET_FILE_PRIVATE(filep,data) ((filep)->private_data = (data))
|
||||
#define NV_GET_LINUX_FILE_PRIVATE(filep) ((nv_linux_file_private_t *)(filep)->private_data)
|
||||
|
||||
@@ -1716,6 +1689,28 @@ static inline NvBool nv_is_open_complete(nv_linux_file_private_t *nvlfp)
|
||||
|
||||
#define NV_STATE_PTR(nvl) &(((nv_linux_state_t *)(nvl))->nv_state)
|
||||
|
||||
static inline nvidia_stack_t *nv_nvlfp_get_sp(nv_linux_file_private_t *nvlfp, nvidia_entry_point_index_t which)
|
||||
{
|
||||
#if defined(NVCPU_X86_64)
|
||||
if (rm_is_altstack_in_use())
|
||||
{
|
||||
down(&nvlfp->fops_sp_lock[which]);
|
||||
return nvlfp->fops_sp[which];
|
||||
}
|
||||
#endif
|
||||
return NULL;
|
||||
}
|
||||
|
||||
static inline void nv_nvlfp_put_sp(nv_linux_file_private_t *nvlfp, nvidia_entry_point_index_t which)
|
||||
{
|
||||
#if defined(NVCPU_X86_64)
|
||||
if (rm_is_altstack_in_use())
|
||||
{
|
||||
up(&nvlfp->fops_sp_lock[which]);
|
||||
}
|
||||
#endif
|
||||
}
|
||||
|
||||
#define NV_ATOMIC_READ(data) atomic_read(&(data))
|
||||
#define NV_ATOMIC_SET(data,val) atomic_set(&(data), (val))
|
||||
#define NV_ATOMIC_INC(data) atomic_inc(&(data))
|
||||
@@ -1788,18 +1783,12 @@ static inline NV_STATUS nv_check_gpu_state(nv_state_t *nv)
|
||||
extern NvU32 NVreg_EnableUserNUMAManagement;
|
||||
extern NvU32 NVreg_RegisterPCIDriver;
|
||||
extern NvU32 NVreg_EnableResizableBar;
|
||||
extern NvU32 NVreg_EnableNonblockingOpen;
|
||||
|
||||
extern NvU32 num_probed_nv_devices;
|
||||
extern NvU32 num_nv_devices;
|
||||
|
||||
#define NV_FILE_INODE(file) (file)->f_inode
|
||||
|
||||
static inline int nv_is_control_device(struct inode *inode)
|
||||
{
|
||||
return (minor((inode)->i_rdev) == NV_MINOR_DEVICE_NUMBER_CONTROL_DEVICE);
|
||||
}
|
||||
|
||||
#if defined(NV_DOM0_KERNEL_PRESENT) || defined(NV_VGPU_KVM_BUILD)
|
||||
#define NV_VGX_HYPER
|
||||
#if defined(NV_XEN_IOEMU_INJECT_MSI)
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*
|
||||
* SPDX-FileCopyrightText: Copyright (c) 2017-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-FileCopyrightText: Copyright (c) 2017 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-License-Identifier: MIT
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
@@ -37,7 +37,6 @@
|
||||
|
||||
#if defined(CONFIG_PREEMPT_RT) || defined(CONFIG_PREEMPT_RT_FULL)
|
||||
typedef raw_spinlock_t nv_spinlock_t;
|
||||
#define NV_DEFINE_SPINLOCK(lock) DEFINE_RAW_SPINLOCK(lock)
|
||||
#define NV_SPIN_LOCK_INIT(lock) raw_spin_lock_init(lock)
|
||||
#define NV_SPIN_LOCK_IRQ(lock) raw_spin_lock_irq(lock)
|
||||
#define NV_SPIN_UNLOCK_IRQ(lock) raw_spin_unlock_irq(lock)
|
||||
@@ -48,7 +47,6 @@ typedef raw_spinlock_t nv_spinlock_t;
|
||||
#define NV_SPIN_UNLOCK_WAIT(lock) raw_spin_unlock_wait(lock)
|
||||
#else
|
||||
typedef spinlock_t nv_spinlock_t;
|
||||
#define NV_DEFINE_SPINLOCK(lock) DEFINE_SPINLOCK(lock)
|
||||
#define NV_SPIN_LOCK_INIT(lock) spin_lock_init(lock)
|
||||
#define NV_SPIN_LOCK_IRQ(lock) spin_lock_irq(lock)
|
||||
#define NV_SPIN_UNLOCK_IRQ(lock) spin_unlock_irq(lock)
|
||||
|
||||
@@ -44,18 +44,12 @@ typedef int vm_fault_t;
|
||||
|
||||
#include <linux/mm.h>
|
||||
#include <linux/sched.h>
|
||||
|
||||
/*
|
||||
* FreeBSD's pin_user_pages's conftest breaks since pin_user_pages is an inline
|
||||
* function. Because it simply maps to get_user_pages, we can just replace
|
||||
* NV_PIN_USER_PAGES with NV_GET_USER_PAGES on FreeBSD
|
||||
*/
|
||||
#if defined(NV_PIN_USER_PAGES_PRESENT) && !defined(NV_BSD)
|
||||
#if defined(NV_PIN_USER_PAGES_PRESENT)
|
||||
#if defined(NV_PIN_USER_PAGES_HAS_ARGS_VMAS)
|
||||
#define NV_PIN_USER_PAGES(start, nr_pages, gup_flags, pages) \
|
||||
pin_user_pages(start, nr_pages, gup_flags, pages, NULL)
|
||||
#else
|
||||
#define NV_PIN_USER_PAGES pin_user_pages
|
||||
#else
|
||||
#define NV_PIN_USER_PAGES(start, nr_pages, gup_flags, pages, vmas) \
|
||||
pin_user_pages(start, nr_pages, gup_flags, pages)
|
||||
#endif // NV_PIN_USER_PAGES_HAS_ARGS_VMAS
|
||||
#define NV_UNPIN_USER_PAGE unpin_user_page
|
||||
#else
|
||||
@@ -86,28 +80,29 @@ typedef int vm_fault_t;
|
||||
*/
|
||||
|
||||
#if defined(NV_GET_USER_PAGES_HAS_ARGS_FLAGS)
|
||||
#define NV_GET_USER_PAGES get_user_pages
|
||||
#define NV_GET_USER_PAGES(start, nr_pages, flags, pages, vmas) \
|
||||
get_user_pages(start, nr_pages, flags, pages)
|
||||
#elif defined(NV_GET_USER_PAGES_HAS_ARGS_FLAGS_VMAS)
|
||||
#define NV_GET_USER_PAGES(start, nr_pages, flags, pages) \
|
||||
get_user_pages(start, nr_pages, flags, pages, NULL)
|
||||
#define NV_GET_USER_PAGES get_user_pages
|
||||
#elif defined(NV_GET_USER_PAGES_HAS_ARGS_TSK_FLAGS_VMAS)
|
||||
#define NV_GET_USER_PAGES(start, nr_pages, flags, pages) \
|
||||
get_user_pages(current, current->mm, start, nr_pages, flags, pages, NULL)
|
||||
#define NV_GET_USER_PAGES(start, nr_pages, flags, pages, vmas) \
|
||||
get_user_pages(current, current->mm, start, nr_pages, flags, pages, vmas)
|
||||
#else
|
||||
static inline long NV_GET_USER_PAGES(unsigned long start,
|
||||
unsigned long nr_pages,
|
||||
unsigned int flags,
|
||||
struct page **pages)
|
||||
struct page **pages,
|
||||
struct vm_area_struct **vmas)
|
||||
{
|
||||
int write = flags & FOLL_WRITE;
|
||||
int force = flags & FOLL_FORCE;
|
||||
|
||||
#if defined(NV_GET_USER_PAGES_HAS_ARGS_WRITE_FORCE_VMAS)
|
||||
return get_user_pages(start, nr_pages, write, force, pages, NULL);
|
||||
return get_user_pages(start, nr_pages, write, force, pages, vmas);
|
||||
#else
|
||||
// NV_GET_USER_PAGES_HAS_ARGS_TSK_WRITE_FORCE_VMAS
|
||||
return get_user_pages(current, current->mm, start, nr_pages, write,
|
||||
force, pages, NULL);
|
||||
force, pages, vmas);
|
||||
#endif // NV_GET_USER_PAGES_HAS_ARGS_WRITE_FORCE_VMAS
|
||||
}
|
||||
#endif // NV_GET_USER_PAGES_HAS_ARGS_FLAGS
|
||||
@@ -129,13 +124,13 @@ typedef int vm_fault_t;
|
||||
|
||||
#if defined(NV_PIN_USER_PAGES_REMOTE_PRESENT)
|
||||
#if defined(NV_PIN_USER_PAGES_REMOTE_HAS_ARGS_TSK_VMAS)
|
||||
#define NV_PIN_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, locked) \
|
||||
pin_user_pages_remote(NULL, mm, start, nr_pages, flags, pages, NULL, locked)
|
||||
#define NV_PIN_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, vmas, locked) \
|
||||
pin_user_pages_remote(NULL, mm, start, nr_pages, flags, pages, vmas, locked)
|
||||
#elif defined(NV_PIN_USER_PAGES_REMOTE_HAS_ARGS_VMAS)
|
||||
#define NV_PIN_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, locked) \
|
||||
pin_user_pages_remote(mm, start, nr_pages, flags, pages, NULL, locked)
|
||||
#else
|
||||
#define NV_PIN_USER_PAGES_REMOTE pin_user_pages_remote
|
||||
#else
|
||||
#define NV_PIN_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, vmas, locked) \
|
||||
pin_user_pages_remote(mm, start, nr_pages, flags, pages, locked)
|
||||
#endif // NV_PIN_USER_PAGES_REMOTE_HAS_ARGS_TSK_VMAS
|
||||
#else
|
||||
#define NV_PIN_USER_PAGES_REMOTE NV_GET_USER_PAGES_REMOTE
|
||||
@@ -171,19 +166,19 @@ typedef int vm_fault_t;
|
||||
|
||||
#if defined(NV_GET_USER_PAGES_REMOTE_PRESENT)
|
||||
#if defined(NV_GET_USER_PAGES_REMOTE_HAS_ARGS_FLAGS_LOCKED)
|
||||
#define NV_GET_USER_PAGES_REMOTE get_user_pages_remote
|
||||
#define NV_GET_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, vmas, locked) \
|
||||
get_user_pages_remote(mm, start, nr_pages, flags, pages, locked)
|
||||
|
||||
#elif defined(NV_GET_USER_PAGES_REMOTE_HAS_ARGS_FLAGS_LOCKED_VMAS)
|
||||
#define NV_GET_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, locked) \
|
||||
get_user_pages_remote(mm, start, nr_pages, flags, pages, NULL, locked)
|
||||
#define NV_GET_USER_PAGES_REMOTE get_user_pages_remote
|
||||
|
||||
#elif defined(NV_GET_USER_PAGES_REMOTE_HAS_ARGS_TSK_FLAGS_LOCKED_VMAS)
|
||||
#define NV_GET_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, locked) \
|
||||
get_user_pages_remote(NULL, mm, start, nr_pages, flags, pages, NULL, locked)
|
||||
#define NV_GET_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, vmas, locked) \
|
||||
get_user_pages_remote(NULL, mm, start, nr_pages, flags, pages, vmas, locked)
|
||||
|
||||
#elif defined(NV_GET_USER_PAGES_REMOTE_HAS_ARGS_TSK_FLAGS_VMAS)
|
||||
#define NV_GET_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, locked) \
|
||||
get_user_pages_remote(NULL, mm, start, nr_pages, flags, pages, NULL)
|
||||
#define NV_GET_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, vmas, locked) \
|
||||
get_user_pages_remote(NULL, mm, start, nr_pages, flags, pages, vmas)
|
||||
|
||||
#else
|
||||
// NV_GET_USER_PAGES_REMOTE_HAS_ARGS_TSK_WRITE_FORCE_VMAS
|
||||
@@ -192,13 +187,14 @@ typedef int vm_fault_t;
|
||||
unsigned long nr_pages,
|
||||
unsigned int flags,
|
||||
struct page **pages,
|
||||
struct vm_area_struct **vmas,
|
||||
int *locked)
|
||||
{
|
||||
int write = flags & FOLL_WRITE;
|
||||
int force = flags & FOLL_FORCE;
|
||||
|
||||
return get_user_pages_remote(NULL, mm, start, nr_pages, write, force,
|
||||
pages, NULL);
|
||||
pages, vmas);
|
||||
}
|
||||
#endif // NV_GET_USER_PAGES_REMOTE_HAS_ARGS_FLAGS_LOCKED
|
||||
#else
|
||||
@@ -208,17 +204,18 @@ typedef int vm_fault_t;
|
||||
unsigned long nr_pages,
|
||||
unsigned int flags,
|
||||
struct page **pages,
|
||||
struct vm_area_struct **vmas,
|
||||
int *locked)
|
||||
{
|
||||
int write = flags & FOLL_WRITE;
|
||||
int force = flags & FOLL_FORCE;
|
||||
|
||||
return get_user_pages(NULL, mm, start, nr_pages, write, force, pages, NULL);
|
||||
return get_user_pages(NULL, mm, start, nr_pages, write, force, pages, vmas);
|
||||
}
|
||||
|
||||
#else
|
||||
#define NV_GET_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, locked) \
|
||||
get_user_pages(NULL, mm, start, nr_pages, flags, pages, NULL)
|
||||
#define NV_GET_USER_PAGES_REMOTE(mm, start, nr_pages, flags, pages, vmas, locked) \
|
||||
get_user_pages(NULL, mm, start, nr_pages, flags, pages, vmas)
|
||||
#endif // NV_GET_USER_PAGES_HAS_ARGS_TSK_WRITE_FORCE_VMAS
|
||||
#endif // NV_GET_USER_PAGES_REMOTE_PRESENT
|
||||
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*
|
||||
* SPDX-FileCopyrightText: Copyright (c) 2015-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-FileCopyrightText: Copyright (c) 2015 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-License-Identifier: MIT
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
@@ -60,7 +60,6 @@ static inline pgprot_t pgprot_modify_writecombine(pgprot_t old_prot)
|
||||
#endif /* !defined(NV_VMWARE) */
|
||||
|
||||
#if defined(NVCPU_AARCH64)
|
||||
extern NvBool nvos_is_chipset_io_coherent(void);
|
||||
/*
|
||||
* Don't rely on the kernel's definition of pgprot_noncached(), as on 64-bit
|
||||
* ARM that's not for system memory, but device memory instead. For I/O cache
|
||||
@@ -120,13 +119,6 @@ extern NvBool nvos_is_chipset_io_coherent(void);
|
||||
#define NV_PGPROT_WRITE_COMBINED(old_prot) old_prot
|
||||
#define NV_PGPROT_READ_ONLY(old_prot) \
|
||||
__pgprot(pgprot_val((old_prot)) & ~NV_PAGE_RW)
|
||||
#elif defined(NVCPU_RISCV64)
|
||||
#define NV_PGPROT_WRITE_COMBINED_DEVICE(old_prot) \
|
||||
pgprot_writecombine(old_prot)
|
||||
/* Don't attempt to mark sysmem pages as write combined on riscv */
|
||||
#define NV_PGPROT_WRITE_COMBINED(old_prot) old_prot
|
||||
#define NV_PGPROT_READ_ONLY(old_prot) \
|
||||
__pgprot(pgprot_val((old_prot)) & ~_PAGE_WRITE)
|
||||
#else
|
||||
/* Writecombine is not supported */
|
||||
#undef NV_PGPROT_WRITE_COMBINED_DEVICE(old_prot)
|
||||
|
||||
@@ -92,24 +92,6 @@ typedef struct file_operations nv_proc_ops_t;
|
||||
#endif
|
||||
|
||||
#define NV_DEFINE_SINGLE_PROCFS_FILE_HELPER(name, lock) \
|
||||
static ssize_t nv_procfs_read_lock_##name( \
|
||||
struct file *file, \
|
||||
char __user *buf, \
|
||||
size_t size, \
|
||||
loff_t *ppos \
|
||||
) \
|
||||
{ \
|
||||
int ret; \
|
||||
ret = nv_down_read_interruptible(&lock); \
|
||||
if (ret < 0) \
|
||||
{ \
|
||||
return ret; \
|
||||
} \
|
||||
size = seq_read(file, buf, size, ppos); \
|
||||
up_read(&lock); \
|
||||
return size; \
|
||||
} \
|
||||
\
|
||||
static int nv_procfs_open_##name( \
|
||||
struct inode *inode, \
|
||||
struct file *filep \
|
||||
@@ -122,6 +104,11 @@ typedef struct file_operations nv_proc_ops_t;
|
||||
{ \
|
||||
return ret; \
|
||||
} \
|
||||
ret = nv_down_read_interruptible(&lock); \
|
||||
if (ret < 0) \
|
||||
{ \
|
||||
single_release(inode, filep); \
|
||||
} \
|
||||
return ret; \
|
||||
} \
|
||||
\
|
||||
@@ -130,6 +117,7 @@ typedef struct file_operations nv_proc_ops_t;
|
||||
struct file *filep \
|
||||
) \
|
||||
{ \
|
||||
up_read(&lock); \
|
||||
return single_release(inode, filep); \
|
||||
}
|
||||
|
||||
@@ -139,7 +127,46 @@ typedef struct file_operations nv_proc_ops_t;
|
||||
static const nv_proc_ops_t nv_procfs_##name##_fops = { \
|
||||
NV_PROC_OPS_SET_OWNER() \
|
||||
.NV_PROC_OPS_OPEN = nv_procfs_open_##name, \
|
||||
.NV_PROC_OPS_READ = nv_procfs_read_lock_##name, \
|
||||
.NV_PROC_OPS_READ = seq_read, \
|
||||
.NV_PROC_OPS_LSEEK = seq_lseek, \
|
||||
.NV_PROC_OPS_RELEASE = nv_procfs_release_##name, \
|
||||
};
|
||||
|
||||
|
||||
#define NV_DEFINE_SINGLE_PROCFS_FILE_READ_WRITE(name, lock, \
|
||||
write_callback) \
|
||||
NV_DEFINE_SINGLE_PROCFS_FILE_HELPER(name, lock) \
|
||||
\
|
||||
static ssize_t nv_procfs_write_##name( \
|
||||
struct file *file, \
|
||||
const char __user *buf, \
|
||||
size_t size, \
|
||||
loff_t *ppos \
|
||||
) \
|
||||
{ \
|
||||
ssize_t ret; \
|
||||
struct seq_file *s; \
|
||||
\
|
||||
s = file->private_data; \
|
||||
if (s == NULL) \
|
||||
{ \
|
||||
return -EIO; \
|
||||
} \
|
||||
\
|
||||
ret = write_callback(s, buf + *ppos, size - *ppos); \
|
||||
if (ret == 0) \
|
||||
{ \
|
||||
/* avoid infinite loop */ \
|
||||
ret = -EIO; \
|
||||
} \
|
||||
return ret; \
|
||||
} \
|
||||
\
|
||||
static const nv_proc_ops_t nv_procfs_##name##_fops = { \
|
||||
NV_PROC_OPS_SET_OWNER() \
|
||||
.NV_PROC_OPS_OPEN = nv_procfs_open_##name, \
|
||||
.NV_PROC_OPS_READ = seq_read, \
|
||||
.NV_PROC_OPS_WRITE = nv_procfs_write_##name, \
|
||||
.NV_PROC_OPS_LSEEK = seq_lseek, \
|
||||
.NV_PROC_OPS_RELEASE = nv_procfs_release_##name, \
|
||||
};
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*
|
||||
* SPDX-FileCopyrightText: Copyright (c) 1999-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-FileCopyrightText: Copyright (c) 1999-2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-License-Identifier: MIT
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
@@ -25,8 +25,10 @@
|
||||
#define _NV_PROTO_H_
|
||||
|
||||
#include "nv-pci.h"
|
||||
#include "nv-register-module.h"
|
||||
|
||||
extern const char *nv_device_name;
|
||||
extern nvidia_module_t nv_fops;
|
||||
|
||||
void nv_acpi_register_notifier (nv_linux_state_t *);
|
||||
void nv_acpi_unregister_notifier (nv_linux_state_t *);
|
||||
@@ -84,11 +86,8 @@ void nv_shutdown_adapter(nvidia_stack_t *, nv_state_t *, nv_linux_state
|
||||
void nv_dev_free_stacks(nv_linux_state_t *);
|
||||
NvBool nv_lock_init_locks(nvidia_stack_t *, nv_state_t *);
|
||||
void nv_lock_destroy_locks(nvidia_stack_t *, nv_state_t *);
|
||||
int nv_linux_add_device_locked(nv_linux_state_t *);
|
||||
void nv_linux_add_device_locked(nv_linux_state_t *);
|
||||
void nv_linux_remove_device_locked(nv_linux_state_t *);
|
||||
NvBool nv_acpi_power_resource_method_present(struct pci_dev *);
|
||||
|
||||
int nv_linux_init_open_q(nv_linux_state_t *);
|
||||
void nv_linux_stop_open_q(nv_linux_state_t *);
|
||||
|
||||
#endif /* _NV_PROTO_H_ */
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*
|
||||
* SPDX-FileCopyrightText: Copyright (c) 2021-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-FileCopyrightText: Copyright (c) 2012-2013 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-License-Identifier: MIT
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
@@ -21,24 +21,35 @@
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
/*
|
||||
* This file defines macros to place tracepoints for RATS (RM All-around Trace
|
||||
* System). The names of the functions and variables associated with this are
|
||||
* temporary as we begin to unify all RM tracing tools under one system.
|
||||
*/
|
||||
|
||||
#ifndef GSP_TRACE_RATS_MACRO_H
|
||||
#define GSP_TRACE_RATS_MACRO_H
|
||||
#ifndef _NV_REGISTER_MODULE_H_
|
||||
#define _NV_REGISTER_MODULE_H_
|
||||
|
||||
#include "core/core.h"
|
||||
#include <linux/module.h>
|
||||
#include <linux/fs.h>
|
||||
#include <linux/poll.h>
|
||||
|
||||
#define GSP_TRACING_RATS_ENABLED 0
|
||||
#define GSP_TRACE_RATS_ADD_RECORD(recordIdentifier, pGpu, info) (void) 0
|
||||
#include "nvtypes.h"
|
||||
|
||||
#define KERNEL_GSP_TRACING_RATS_ENABLED 0
|
||||
typedef struct nvidia_module_s {
|
||||
struct module *owner;
|
||||
|
||||
#ifndef GET_RATS_TIMESTAMP_NS
|
||||
#define GET_RATS_TIMESTAMP_NS() NV_ASSERT(0)
|
||||
#endif
|
||||
/* nvidia0, nvidia1 ..*/
|
||||
const char *module_name;
|
||||
|
||||
/* module instance */
|
||||
NvU32 instance;
|
||||
|
||||
/* file operations */
|
||||
int (*open)(struct inode *, struct file *filp);
|
||||
int (*close)(struct inode *, struct file *filp);
|
||||
int (*mmap)(struct file *filp, struct vm_area_struct *vma);
|
||||
int (*ioctl)(struct inode *, struct file * file, unsigned int cmd, unsigned long arg);
|
||||
unsigned int (*poll)(struct file * file, poll_table *wait);
|
||||
|
||||
} nvidia_module_t;
|
||||
|
||||
int nvidia_register_module(nvidia_module_t *);
|
||||
int nvidia_unregister_module(nvidia_module_t *);
|
||||
|
||||
#endif
|
||||
@@ -1,5 +1,5 @@
|
||||
/*
|
||||
* SPDX-FileCopyrightText: Copyright (c) 1999-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-FileCopyrightText: Copyright (c) 1999-2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-License-Identifier: MIT
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
@@ -42,7 +42,6 @@
|
||||
#include <nv-caps.h>
|
||||
#include <nv-firmware.h>
|
||||
#include <nv-ioctl.h>
|
||||
#include <nv-ioctl-numa.h>
|
||||
#include <nvmisc.h>
|
||||
|
||||
extern nv_cap_t *nvidia_caps_root;
|
||||
@@ -51,6 +50,9 @@ extern const NvBool nv_is_rm_firmware_supported_os;
|
||||
|
||||
#include <nv-kernel-interface-api.h>
|
||||
|
||||
/* NVIDIA's reserved major character device number (Linux). */
|
||||
#define NV_MAJOR_DEVICE_NUMBER 195
|
||||
|
||||
#define GPU_UUID_LEN (16)
|
||||
|
||||
/*
|
||||
@@ -221,6 +223,7 @@ typedef struct
|
||||
#define NV_RM_PAGE_MASK (NV_RM_PAGE_SIZE - 1)
|
||||
|
||||
#define NV_RM_TO_OS_PAGE_SHIFT (os_page_shift - NV_RM_PAGE_SHIFT)
|
||||
#define NV_RM_PAGES_PER_OS_PAGE (1U << NV_RM_TO_OS_PAGE_SHIFT)
|
||||
#define NV_RM_PAGES_TO_OS_PAGES(count) \
|
||||
((((NvUPtr)(count)) >> NV_RM_TO_OS_PAGE_SHIFT) + \
|
||||
((((count) & ((1 << NV_RM_TO_OS_PAGE_SHIFT) - 1)) != 0) ? 1 : 0))
|
||||
@@ -466,9 +469,17 @@ typedef struct nv_state_t
|
||||
NvHandle hDisp;
|
||||
} rmapi;
|
||||
|
||||
/* Bool to check if ISO iommu enabled */
|
||||
NvBool iso_iommu_present;
|
||||
|
||||
/* Bool to check if NISO iommu enabled */
|
||||
NvBool niso_iommu_present;
|
||||
|
||||
/* Bool to check if dma-buf is supported */
|
||||
NvBool dma_buf_supported;
|
||||
|
||||
NvBool printed_openrm_enable_unsupported_gpus_error;
|
||||
|
||||
/* Check if NVPCF DSM function is implemented under NVPCF or GPU device scope */
|
||||
NvBool nvpcf_dsm_in_gpu_scope;
|
||||
|
||||
@@ -477,22 +488,6 @@ typedef struct nv_state_t
|
||||
|
||||
/* Bool to check if the GPU has a coherent sysmem link */
|
||||
NvBool coherent;
|
||||
|
||||
/*
|
||||
* NUMA node ID of the CPU to which the GPU is attached.
|
||||
* Holds NUMA_NO_NODE on platforms that don't support NUMA configuration.
|
||||
*/
|
||||
NvS32 cpu_numa_node_id;
|
||||
|
||||
struct {
|
||||
/* Bool to check if ISO iommu enabled */
|
||||
NvBool iso_iommu_present;
|
||||
/* Bool to check if NISO iommu enabled */
|
||||
NvBool niso_iommu_present;
|
||||
/* Display SMMU Stream IDs */
|
||||
NvU32 dispIsoStreamId;
|
||||
NvU32 dispNisoStreamId;
|
||||
} iommus;
|
||||
} nv_state_t;
|
||||
|
||||
// These define need to be in sync with defines in system.h
|
||||
@@ -510,7 +505,6 @@ struct nv_file_private_t
|
||||
NvHandle *handles;
|
||||
NvU16 maxHandles;
|
||||
NvU32 deviceInstance;
|
||||
NvU32 gpuInstanceId;
|
||||
NvU8 metadata[64];
|
||||
|
||||
nv_file_private_t *ctl_nvfp;
|
||||
@@ -630,10 +624,10 @@ typedef enum
|
||||
((addr) == ((nv)->bars[NV_GPU_BAR_INDEX_IMEM].cpu_address + 0x1000000)))
|
||||
|
||||
#define NV_SOC_IS_ISO_IOMMU_PRESENT(nv) \
|
||||
((nv)->iommus.iso_iommu_present)
|
||||
((nv)->iso_iommu_present)
|
||||
|
||||
#define NV_SOC_IS_NISO_IOMMU_PRESENT(nv) \
|
||||
((nv)->iommus.niso_iommu_present)
|
||||
((nv)->niso_iommu_present)
|
||||
/*
|
||||
* GPU add/remove events
|
||||
*/
|
||||
@@ -779,7 +773,7 @@ nv_state_t* NV_API_CALL nv_get_ctl_state (void);
|
||||
void NV_API_CALL nv_set_dma_address_size (nv_state_t *, NvU32 );
|
||||
|
||||
NV_STATUS NV_API_CALL nv_alias_pages (nv_state_t *, NvU32, NvU32, NvU32, NvU64, NvU64 *, void **);
|
||||
NV_STATUS NV_API_CALL nv_alloc_pages (nv_state_t *, NvU32, NvU64, NvBool, NvU32, NvBool, NvBool, NvS32, NvU64 *, void **);
|
||||
NV_STATUS NV_API_CALL nv_alloc_pages (nv_state_t *, NvU32, NvBool, NvU32, NvBool, NvBool, NvS32, NvU64 *, void **);
|
||||
NV_STATUS NV_API_CALL nv_free_pages (nv_state_t *, NvU32, NvBool, NvU32, void *);
|
||||
|
||||
NV_STATUS NV_API_CALL nv_register_user_pages (nv_state_t *, NvU64, NvU64 *, void *, void **);
|
||||
@@ -796,6 +790,8 @@ NV_STATUS NV_API_CALL nv_register_phys_pages (nv_state_t *, NvU64 *, NvU64,
|
||||
void NV_API_CALL nv_unregister_phys_pages (nv_state_t *, void *);
|
||||
|
||||
NV_STATUS NV_API_CALL nv_dma_map_sgt (nv_dma_device_t *, NvU64, NvU64 *, NvU32, void **);
|
||||
NV_STATUS NV_API_CALL nv_dma_map_pages (nv_dma_device_t *, NvU64, NvU64 *, NvBool, NvU32, void **);
|
||||
NV_STATUS NV_API_CALL nv_dma_unmap_pages (nv_dma_device_t *, NvU64, NvU64 *, void **);
|
||||
|
||||
NV_STATUS NV_API_CALL nv_dma_map_alloc (nv_dma_device_t *, NvU64, NvU64 *, NvBool, void **);
|
||||
NV_STATUS NV_API_CALL nv_dma_unmap_alloc (nv_dma_device_t *, NvU64, NvU64 *, void **);
|
||||
@@ -845,7 +841,7 @@ void NV_API_CALL nv_put_firmware(const void *);
|
||||
nv_file_private_t* NV_API_CALL nv_get_file_private(NvS32, NvBool, void **);
|
||||
void NV_API_CALL nv_put_file_private(void *);
|
||||
|
||||
NV_STATUS NV_API_CALL nv_get_device_memory_config(nv_state_t *, NvU64 *, NvU64 *, NvU64 *, NvU32 *, NvS32 *);
|
||||
NV_STATUS NV_API_CALL nv_get_device_memory_config(nv_state_t *, NvU64 *, NvU64 *, NvU32 *, NvS32 *);
|
||||
NV_STATUS NV_API_CALL nv_get_egm_info(nv_state_t *, NvU64 *, NvU64 *, NvS32 *);
|
||||
|
||||
NV_STATUS NV_API_CALL nv_get_ibmnpu_genreg_info(nv_state_t *, NvU64 *, NvU64 *, void**);
|
||||
@@ -886,7 +882,7 @@ NvBool NV_API_CALL nv_match_gpu_os_info(nv_state_t *, void *);
|
||||
NvU32 NV_API_CALL nv_get_os_type(void);
|
||||
|
||||
void NV_API_CALL nv_get_updated_emu_seg(NvU32 *start, NvU32 *end);
|
||||
void NV_API_CALL nv_get_screen_info(nv_state_t *, NvU64 *, NvU32 *, NvU32 *, NvU32 *, NvU32 *, NvU64 *);
|
||||
void NV_API_CALL nv_get_screen_info(nv_state_t *, NvU64 *, NvU16 *, NvU16 *, NvU16 *, NvU16 *, NvU64 *);
|
||||
|
||||
struct dma_buf;
|
||||
typedef struct nv_dma_buf nv_dma_buf_t;
|
||||
@@ -894,9 +890,9 @@ struct drm_gem_object;
|
||||
|
||||
NV_STATUS NV_API_CALL nv_dma_import_sgt (nv_dma_device_t *, struct sg_table *, struct drm_gem_object *);
|
||||
void NV_API_CALL nv_dma_release_sgt(struct sg_table *, struct drm_gem_object *);
|
||||
NV_STATUS NV_API_CALL nv_dma_import_dma_buf (nv_dma_device_t *, struct dma_buf *, NvU32 *, struct sg_table **, nv_dma_buf_t **);
|
||||
NV_STATUS NV_API_CALL nv_dma_import_from_fd (nv_dma_device_t *, NvS32, NvU32 *, struct sg_table **, nv_dma_buf_t **);
|
||||
void NV_API_CALL nv_dma_release_dma_buf (nv_dma_buf_t *);
|
||||
NV_STATUS NV_API_CALL nv_dma_import_dma_buf (nv_dma_device_t *, struct dma_buf *, NvU32 *, void **, struct sg_table **, nv_dma_buf_t **);
|
||||
NV_STATUS NV_API_CALL nv_dma_import_from_fd (nv_dma_device_t *, NvS32, NvU32 *, void **, struct sg_table **, nv_dma_buf_t **);
|
||||
void NV_API_CALL nv_dma_release_dma_buf (void *, nv_dma_buf_t *);
|
||||
|
||||
void NV_API_CALL nv_schedule_uvm_isr (nv_state_t *);
|
||||
|
||||
@@ -912,8 +908,6 @@ typedef void (*nvTegraDceClientIpcCallback)(NvU32, NvU32, NvU32, void *, void *)
|
||||
NV_STATUS NV_API_CALL nv_get_num_phys_pages (void *, NvU32 *);
|
||||
NV_STATUS NV_API_CALL nv_get_phys_pages (void *, void *, NvU32 *);
|
||||
|
||||
void NV_API_CALL nv_get_disp_smmu_stream_ids (nv_state_t *, NvU32 *, NvU32 *);
|
||||
|
||||
/*
|
||||
* ---------------------------------------------------------------------------
|
||||
*
|
||||
@@ -960,7 +954,6 @@ void NV_API_CALL rm_parse_option_string (nvidia_stack_t *, const char *
|
||||
char* NV_API_CALL rm_remove_spaces (const char *);
|
||||
char* NV_API_CALL rm_string_token (char **, const char);
|
||||
void NV_API_CALL rm_vgpu_vfio_set_driver_vm(nvidia_stack_t *, NvBool);
|
||||
NV_STATUS NV_API_CALL rm_get_adapter_status_external(nvidia_stack_t *, nv_state_t *);
|
||||
|
||||
NV_STATUS NV_API_CALL rm_run_rc_callback (nvidia_stack_t *, nv_state_t *);
|
||||
void NV_API_CALL rm_execute_work_item (nvidia_stack_t *, void *);
|
||||
@@ -999,7 +992,7 @@ NV_STATUS NV_API_CALL rm_dma_buf_dup_mem_handle (nvidia_stack_t *, nv_state_t
|
||||
void NV_API_CALL rm_dma_buf_undup_mem_handle(nvidia_stack_t *, nv_state_t *, NvHandle, NvHandle);
|
||||
NV_STATUS NV_API_CALL rm_dma_buf_map_mem_handle (nvidia_stack_t *, nv_state_t *, NvHandle, NvHandle, NvU64, NvU64, void *, nv_phys_addr_range_t **, NvU32 *);
|
||||
void NV_API_CALL rm_dma_buf_unmap_mem_handle(nvidia_stack_t *, nv_state_t *, NvHandle, NvHandle, NvU64, nv_phys_addr_range_t **, NvU32);
|
||||
NV_STATUS NV_API_CALL rm_dma_buf_get_client_and_device(nvidia_stack_t *, nv_state_t *, NvHandle, NvHandle, NvHandle *, NvHandle *, NvHandle *, void **, NvBool *);
|
||||
NV_STATUS NV_API_CALL rm_dma_buf_get_client_and_device(nvidia_stack_t *, nv_state_t *, NvHandle, NvHandle *, NvHandle *, NvHandle *, void **, NvBool *);
|
||||
void NV_API_CALL rm_dma_buf_put_client_and_device(nvidia_stack_t *, nv_state_t *, NvHandle, NvHandle, NvHandle, void *);
|
||||
NV_STATUS NV_API_CALL rm_log_gpu_crash (nv_stack_t *, nv_state_t *);
|
||||
|
||||
@@ -1011,7 +1004,7 @@ NvBool NV_API_CALL rm_gpu_need_4k_page_isolation(nv_state_t *);
|
||||
NvBool NV_API_CALL rm_is_chipset_io_coherent(nv_stack_t *);
|
||||
NvBool NV_API_CALL rm_init_event_locks(nvidia_stack_t *, nv_state_t *);
|
||||
void NV_API_CALL rm_destroy_event_locks(nvidia_stack_t *, nv_state_t *);
|
||||
NV_STATUS NV_API_CALL rm_get_gpu_numa_info(nvidia_stack_t *, nv_state_t *, nv_ioctl_numa_info_t *);
|
||||
NV_STATUS NV_API_CALL rm_get_gpu_numa_info(nvidia_stack_t *, nv_state_t *, NvS32 *, NvU64 *, NvU64 *, NvU64 *, NvU32 *);
|
||||
NV_STATUS NV_API_CALL rm_gpu_numa_online(nvidia_stack_t *, nv_state_t *);
|
||||
NV_STATUS NV_API_CALL rm_gpu_numa_offline(nvidia_stack_t *, nv_state_t *);
|
||||
NvBool NV_API_CALL rm_is_device_sequestered(nvidia_stack_t *, nv_state_t *);
|
||||
@@ -1026,7 +1019,7 @@ void NV_API_CALL rm_cleanup_dynamic_power_management(nvidia_stack_t *, nv_
|
||||
void NV_API_CALL rm_enable_dynamic_power_management(nvidia_stack_t *, nv_state_t *);
|
||||
NV_STATUS NV_API_CALL rm_ref_dynamic_power(nvidia_stack_t *, nv_state_t *, nv_dynamic_power_mode_t);
|
||||
void NV_API_CALL rm_unref_dynamic_power(nvidia_stack_t *, nv_state_t *, nv_dynamic_power_mode_t);
|
||||
NV_STATUS NV_API_CALL rm_transition_dynamic_power(nvidia_stack_t *, nv_state_t *, NvBool, NvBool *);
|
||||
NV_STATUS NV_API_CALL rm_transition_dynamic_power(nvidia_stack_t *, nv_state_t *, NvBool);
|
||||
const char* NV_API_CALL rm_get_vidmem_power_status(nvidia_stack_t *, nv_state_t *);
|
||||
const char* NV_API_CALL rm_get_dynamic_power_management_status(nvidia_stack_t *, nv_state_t *);
|
||||
const char* NV_API_CALL rm_get_gpu_gcx_support(nvidia_stack_t *, nv_state_t *, NvBool);
|
||||
@@ -1041,12 +1034,12 @@ NV_STATUS NV_API_CALL nv_vgpu_create_request(nvidia_stack_t *, nv_state_t *, c
|
||||
NV_STATUS NV_API_CALL nv_vgpu_delete(nvidia_stack_t *, const NvU8 *, NvU16);
|
||||
NV_STATUS NV_API_CALL nv_vgpu_get_type_ids(nvidia_stack_t *, nv_state_t *, NvU32 *, NvU32 *, NvBool, NvU8, NvBool);
|
||||
NV_STATUS NV_API_CALL nv_vgpu_get_type_info(nvidia_stack_t *, nv_state_t *, NvU32, char *, int, NvU8);
|
||||
NV_STATUS NV_API_CALL nv_vgpu_get_bar_info(nvidia_stack_t *, nv_state_t *, const NvU8 *, NvU64 *,
|
||||
NvU64 *, NvU64 *, NvU32 *, NvBool *, NvU8 *);
|
||||
NV_STATUS NV_API_CALL nv_vgpu_get_hbm_info(nvidia_stack_t *, nv_state_t *, const NvU8 *, NvU64 *, NvU64 *);
|
||||
NV_STATUS NV_API_CALL nv_vgpu_get_bar_info(nvidia_stack_t *, nv_state_t *, const NvU8 *, NvU64 *, NvU32, void *);
|
||||
NV_STATUS NV_API_CALL nv_vgpu_start(nvidia_stack_t *, const NvU8 *, void *, NvS32 *, NvU8 *, NvU32);
|
||||
NV_STATUS NV_API_CALL nv_vgpu_get_sparse_mmap(nvidia_stack_t *, nv_state_t *, const NvU8 *, NvU64 **, NvU64 **, NvU32 *);
|
||||
NV_STATUS NV_API_CALL nv_vgpu_process_vf_info(nvidia_stack_t *, nv_state_t *, NvU8, NvU32, NvU8, NvU8, NvU8, NvBool, void *);
|
||||
NV_STATUS NV_API_CALL nv_vgpu_update_request(nvidia_stack_t *, const NvU8 *, NvU32, NvU64 *, NvU64 *, const char *);
|
||||
NV_STATUS NV_API_CALL nv_gpu_bind_event(nvidia_stack_t *);
|
||||
NV_STATUS NV_API_CALL nv_gpu_unbind_event(nvidia_stack_t *, NvU32, NvBool *);
|
||||
|
||||
NV_STATUS NV_API_CALL nv_get_usermap_access_params(nv_state_t*, nv_usermap_access_params_t*);
|
||||
nv_soc_irq_type_t NV_API_CALL nv_get_current_irq_type(nv_state_t*);
|
||||
|
||||
@@ -86,7 +86,7 @@
|
||||
/* Not currently implemented for MSVC/ARM64. See bug 3366890. */
|
||||
# define nv_speculation_barrier()
|
||||
# define speculation_barrier() nv_speculation_barrier()
|
||||
#elif defined(NVCPU_IS_RISCV64)
|
||||
#elif defined(NVCPU_NVRISCV64) && NVOS_IS_LIBOS
|
||||
# define nv_speculation_barrier()
|
||||
#else
|
||||
#error "Unknown compiler/chip family"
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*
|
||||
* SPDX-FileCopyrightText: Copyright (c) 2013-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-FileCopyrightText: Copyright (c) 2013-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-License-Identifier: MIT
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
@@ -62,10 +62,10 @@ typedef struct
|
||||
/*******************************************************************************
|
||||
nvUvmInterfaceRegisterGpu
|
||||
|
||||
Registers the GPU with the provided physical UUID for use. A GPU must be
|
||||
registered before its UUID can be used with any other API. This call is
|
||||
ref-counted so every nvUvmInterfaceRegisterGpu must be paired with a
|
||||
corresponding nvUvmInterfaceUnregisterGpu.
|
||||
Registers the GPU with the provided UUID for use. A GPU must be registered
|
||||
before its UUID can be used with any other API. This call is ref-counted so
|
||||
every nvUvmInterfaceRegisterGpu must be paired with a corresponding
|
||||
nvUvmInterfaceUnregisterGpu.
|
||||
|
||||
You don't need to call nvUvmInterfaceSessionCreate before calling this.
|
||||
|
||||
@@ -79,13 +79,12 @@ NV_STATUS nvUvmInterfaceRegisterGpu(const NvProcessorUuid *gpuUuid, UvmGpuPlatfo
|
||||
/*******************************************************************************
|
||||
nvUvmInterfaceUnregisterGpu
|
||||
|
||||
Unregisters the GPU with the provided physical UUID. This drops the ref
|
||||
count from nvUvmInterfaceRegisterGpu. Once the reference count goes to 0
|
||||
the device may no longer be accessible until the next
|
||||
nvUvmInterfaceRegisterGpu call. No automatic resource freeing is performed,
|
||||
so only make the last unregister call after destroying all your allocations
|
||||
associated with that UUID (such as those from
|
||||
nvUvmInterfaceAddressSpaceCreate).
|
||||
Unregisters the GPU with the provided UUID. This drops the ref count from
|
||||
nvUvmInterfaceRegisterGpu. Once the reference count goes to 0 the device may
|
||||
no longer be accessible until the next nvUvmInterfaceRegisterGpu call. No
|
||||
automatic resource freeing is performed, so only make the last unregister
|
||||
call after destroying all your allocations associated with that UUID (such
|
||||
as those from nvUvmInterfaceAddressSpaceCreate).
|
||||
|
||||
If the UUID is not found, no operation is performed.
|
||||
*/
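The ref-counting contract described in the two comments above can be sketched in isolation. This is a minimal model, not the driver's implementation: `MockGpuEntry`, `mock_register_gpu` and `mock_unregister_gpu` are hypothetical names, and the real bookkeeping is keyed by UUID inside RM.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-GPU registration record; not a driver type. */
typedef struct {
    unsigned refCount;   /* number of outstanding register calls */
    bool     accessible; /* true while at least one registration is live */
} MockGpuEntry;

/* Models a register call: the first caller makes the GPU usable. */
static void mock_register_gpu(MockGpuEntry *gpu)
{
    if (gpu->refCount++ == 0)
        gpu->accessible = true;
}

/* Models an unregister call: the last caller makes the GPU unusable.
 * An entry with no live registration is left untouched, mirroring the
 * documented "no operation is performed" case for an unknown UUID. */
static void mock_unregister_gpu(MockGpuEntry *gpu)
{
    if (gpu->refCount == 0)
        return;
    if (--gpu->refCount == 0)
        gpu->accessible = false;
}
```

Because nothing is freed automatically when the count reaches zero, a caller following this pattern would tear down its own allocations before issuing the final unregister.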
|
||||
@@ -122,10 +121,10 @@ NV_STATUS nvUvmInterfaceSessionDestroy(uvmGpuSessionHandle session);
|
||||
nvUvmInterfaceDeviceCreate
|
||||
|
||||
Creates a device object under the given session for the GPU with the given
|
||||
physical UUID. Also creates a partition object for the device iff
|
||||
bCreateSmcPartition is true and pGpuInfo->smcEnabled is true.
|
||||
pGpuInfo->smcUserClientInfo will be used to determine the SMC partition in
|
||||
this case. A device handle is returned in the device output parameter.
|
||||
UUID. Also creates a partition object for the device iff bCreateSmcPartition
|
||||
is true and pGpuInfo->smcEnabled is true. pGpuInfo->smcUserClientInfo will
|
||||
be used to determine the SMC partition in this case. A device handle is
|
||||
returned in the device output parameter.
|
||||
|
||||
Error codes:
|
||||
NV_ERR_GENERIC
|
||||
@@ -162,7 +161,6 @@ void nvUvmInterfaceDeviceDestroy(uvmGpuDeviceHandle device);
|
||||
NV_STATUS nvUvmInterfaceAddressSpaceCreate(uvmGpuDeviceHandle device,
|
||||
unsigned long long vaBase,
|
||||
unsigned long long vaSize,
|
||||
NvBool enableAts,
|
||||
uvmGpuAddressSpaceHandle *vaSpace,
|
||||
UvmGpuAddressSpaceInfo *vaSpaceInfo);
|
||||
|
||||
@@ -424,6 +422,33 @@ NV_STATUS nvUvmInterfacePmaPinPages(void *pPma,
|
||||
NvU64 pageSize,
|
||||
NvU32 flags);
|
||||
|
||||
/*******************************************************************************
|
||||
nvUvmInterfacePmaUnpinPages
|
||||
|
||||
This function will unpin the physical memory allocated using PMA. The pages
|
||||
passed as input must be already pinned, else this function will return an
|
||||
error and roll back any change if any page is not previously marked "pinned".
|
||||
Behaviour is undefined if any blacklisted pages are unpinned.
|
||||
|
||||
Arguments:
|
||||
pPma[IN] - Pointer to PMA object.
|
||||
pPages[IN] - Array of pointers, containing the PA base
|
||||
address of each page to be unpinned.
|
||||
pageCount [IN] - Number of pages required to be unpinned.
|
||||
pageSize [IN] - Page size of each page to be unpinned.
|
||||
|
||||
Error codes:
|
||||
NV_ERR_INVALID_ARGUMENT - Invalid input arguments.
|
||||
NV_ERR_GENERIC - Unexpected error. We try hard to avoid
|
||||
returning this error code as it is not very
|
||||
informative.
|
||||
NV_ERR_NOT_SUPPORTED - Operation not supported on broken FB
|
||||
*/
|
||||
NV_STATUS nvUvmInterfacePmaUnpinPages(void *pPma,
|
||||
NvU64 *pPages,
|
||||
NvLength pageCount,
|
||||
NvU64 pageSize);
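The all-or-nothing behaviour described for nvUvmInterfacePmaUnpinPages (return an error and roll back if any page was not pinned) might look roughly like the sketch below. It assumes a hypothetical boolean page-state table indexed by page number; the real PMA bookkeeping is not part of this diff, and `mock_unpin_pages` is an illustrative name only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: pinned[] holds one flag per page, pages[] lists the
 * page numbers to unpin. Returns 0 on success and -1 on failure. */
static int mock_unpin_pages(bool *pinned, size_t tableSize,
                            const size_t *pages, size_t pageCount)
{
    size_t i;

    /* Unpin pages in order, stopping at the first one that is out of
     * range or not currently pinned. */
    for (i = 0; i < pageCount; i++) {
        if (pages[i] >= tableSize || !pinned[pages[i]])
            break;
        pinned[pages[i]] = false;
    }

    if (i == pageCount)
        return 0;

    /* Roll back: re-pin every page that was unpinned before the failure. */
    while (i-- > 0)
        pinned[pages[i]] = true;

    return -1;
}
```

On failure the table is left exactly as it was on entry, which is the property the comment above promises to callers.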
|
||||
|
||||
/*******************************************************************************
|
||||
nvUvmInterfaceMemoryFree
|
||||
|
||||
@@ -613,8 +638,6 @@ NV_STATUS nvUvmInterfaceQueryCopyEnginesCaps(uvmGpuDeviceHandle device,
|
||||
nvUvmInterfaceGetGpuInfo
|
||||
|
||||
Return various gpu info, refer to the UvmGpuInfo struct for details.
|
||||
The input UUID is for the physical GPU and the pGpuClientInfo identifies
|
||||
the SMC partition if SMC is enabled and the partition exists.
|
||||
If no gpu matching the uuid is found, an error will be returned.
|
||||
|
||||
On Ampere+ GPUs, pGpuClientInfo contains SMC information provided by the
|
||||
@@ -622,9 +645,6 @@ NV_STATUS nvUvmInterfaceQueryCopyEnginesCaps(uvmGpuDeviceHandle device,
|
||||
|
||||
Error codes:
|
||||
NV_ERR_GENERIC
|
||||
NV_ERR_NO_MEMORY
|
||||
NV_ERR_GPU_UUID_NOT_FOUND
|
||||
NV_ERR_INSUFFICIENT_PERMISSIONS
|
||||
NV_ERR_INSUFFICIENT_RESOURCES
|
||||
*/
|
||||
NV_STATUS nvUvmInterfaceGetGpuInfo(const NvProcessorUuid *gpuUuid,
|
||||
@@ -837,7 +857,7 @@ NV_STATUS nvUvmInterfaceGetEccInfo(uvmGpuDeviceHandle device,
|
||||
UVM GPU UNLOCK
|
||||
|
||||
Arguments:
|
||||
device[IN] - Device handle associated with the gpu
|
||||
gpuUuid[IN] - UUID of the GPU to operate on
|
||||
bOwnInterrupts - Set to NV_TRUE for UVM to take ownership of the
|
||||
replayable page fault interrupts. Set to NV_FALSE
|
||||
to return ownership of the page fault interrupts
|
||||
@@ -953,45 +973,14 @@ NV_STATUS nvUvmInterfaceGetNonReplayableFaults(UvmGpuFaultInfo *pFaultInfo,
|
||||
NOTES:
|
||||
- This function DOES NOT acquire the RM API or GPU locks. That is because
|
||||
it is called during fault servicing, which could produce deadlocks.
|
||||
- This function should not be called when interrupts are disabled.
|
||||
|
||||
Arguments:
|
||||
pFaultInfo[IN] - information provided by RM for fault handling.
|
||||
used for obtaining the device handle without locks.
|
||||
bCopyAndFlush[IN] - Instructs RM to perform the flush in the Copy+Flush mode.
|
||||
In this mode, RM will perform a copy of the packets from
|
||||
the HW buffer to UVM's SW buffer as part of performing
|
||||
the flush. This mode gives UVM the opportunity to observe
|
||||
the packets contained within the HW buffer at the time
|
||||
of issuing the call.
|
||||
device[IN] - Device handle associated with the gpu
|
||||
|
||||
Error codes:
|
||||
NV_ERR_INVALID_ARGUMENT
|
||||
*/
|
||||
NV_STATUS nvUvmInterfaceFlushReplayableFaultBuffer(UvmGpuFaultInfo *pFaultInfo,
|
||||
NvBool bCopyAndFlush);
|
||||
|
||||
/*******************************************************************************
|
||||
nvUvmInterfaceTogglePrefetchFaults
|
||||
|
||||
This function sends an RPC to GSP in order to toggle the prefetch fault PRI.
|
||||
|
||||
NOTES:
|
||||
- This function DOES NOT acquire the RM API or GPU locks. That is because
|
||||
it is called during fault servicing, which could produce deadlocks.
|
||||
- This function should not be called when interrupts are disabled.
|
||||
|
||||
Arguments:
|
||||
pFaultInfo[IN] - Information provided by RM for fault handling.
|
||||
Used for obtaining the device handle without locks.
|
||||
bEnable[IN] - Instructs RM whether to toggle generating faults on
|
||||
prefetch on/off.
|
||||
|
||||
Error codes:
|
||||
NV_ERR_INVALID_ARGUMENT
|
||||
*/
|
||||
NV_STATUS nvUvmInterfaceTogglePrefetchFaults(UvmGpuFaultInfo *pFaultInfo,
|
||||
NvBool bEnable);
|
||||
NV_STATUS nvUvmInterfaceFlushReplayableFaultBuffer(uvmGpuDeviceHandle device);
|
||||
|
||||
/*******************************************************************************
|
||||
nvUvmInterfaceInitAccessCntrInfo
|
||||
@@ -1098,8 +1087,7 @@ void nvUvmInterfaceDeRegisterUvmOps(void);
|
||||
|
||||
Error codes:
|
||||
NV_ERR_INVALID_ARGUMENT
|
||||
NV_ERR_OBJECT_NOT_FOUND : If device object associated with the device
|
||||
handles isn't found.
|
||||
NV_ERR_OBJECT_NOT_FOUND : If the device objects associated with the uuids aren't found.
|
||||
*/
|
||||
NV_STATUS nvUvmInterfaceP2pObjectCreate(uvmGpuDeviceHandle device1,
|
||||
uvmGpuDeviceHandle device2,
|
||||
@@ -1152,8 +1140,6 @@ void nvUvmInterfaceP2pObjectDestroy(uvmGpuSessionHandle session,
|
||||
NV_ERR_NOT_READY - Returned when querying the PTEs requires a deferred setup
|
||||
which has not yet completed. It is expected that the caller
|
||||
will reattempt the call until a different code is returned.
|
||||
As an example, multi-node systems which require querying
|
||||
PTEs from the Fabric Manager may return this code.
|
||||
*/
|
||||
NV_STATUS nvUvmInterfaceGetExternalAllocPtes(uvmGpuAddressSpaceHandle vaSpace,
|
||||
NvHandle hMemory,
|
||||
@@ -1463,7 +1449,18 @@ NV_STATUS nvUvmInterfacePagingChannelPushStream(UvmGpuPagingChannelHandle channe
|
||||
NvU32 methodStreamSize);
|
||||
|
||||
/*******************************************************************************
|
||||
Cryptography Services Library (CSL) Interface
|
||||
CSL Interface and Locking
|
||||
|
||||
The following functions do not acquire the RM API or GPU locks and must not be called
|
||||
concurrently with the same UvmCslContext parameter in different threads. The caller must
|
||||
guarantee this exclusion.
|
||||
|
||||
* nvUvmInterfaceCslRotateIv
|
||||
* nvUvmInterfaceCslEncrypt
|
||||
* nvUvmInterfaceCslDecrypt
|
||||
* nvUvmInterfaceCslSign
|
||||
* nvUvmInterfaceCslQueryMessagePool
|
||||
* nvUvmInterfaceCslIncrementIv
|
||||
*/
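Since the listed CSL functions take no RM API or GPU lock, the caller has to keep two threads from using the same context at once. One possible way to enforce that is a per-context busy flag, sketched below with C11 atomics. `GuardedCslContext` and both helpers are hypothetical; the header does not prescribe any particular mechanism.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wrapper pairing a CSL context with a busy flag. */
typedef struct {
    atomic_flag busy;        /* set while some thread is inside a CSL call */
    void       *cslContext;  /* would point at a UvmCslContext in real code */
} GuardedCslContext;

/* Returns true if the caller now owns the context, false if another
 * thread is already using it. */
static bool guarded_csl_try_enter(GuardedCslContext *g)
{
    return !atomic_flag_test_and_set_explicit(&g->busy, memory_order_acquire);
}

/* Releases ownership taken by a successful guarded_csl_try_enter(). */
static void guarded_csl_leave(GuardedCslContext *g)
{
    atomic_flag_clear_explicit(&g->busy, memory_order_release);
}
```

A caller would bracket each CSL call on a given context with `guarded_csl_try_enter` and `guarded_csl_leave`. A sleeping lock would serve equally well where blocking is acceptable.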
|
||||
|
||||
/*******************************************************************************
|
||||
@@ -1474,11 +1471,8 @@ NV_STATUS nvUvmInterfacePagingChannelPushStream(UvmGpuPagingChannelHandle channe
|
||||
The lifetime of the context is the same as the lifetime of the secure channel
|
||||
it is paired with.
|
||||
|
||||
Locking: This function acquires an API lock.
|
||||
Memory : This function dynamically allocates memory.
|
||||
|
||||
Arguments:
|
||||
uvmCslContext[IN/OUT] - The CSL context associated with a channel.
|
||||
uvmCslContext[IN/OUT] - The CSL context.
|
||||
channel[IN] - Handle to a secure channel.
|
||||
|
||||
Error codes:
|
||||
@@ -1496,62 +1490,30 @@ NV_STATUS nvUvmInterfaceCslInitContext(UvmCslContext *uvmCslContext,
|
||||
|
||||
If context is already deinitialized then function returns immediately.
|
||||
|
||||
Locking: This function does not acquire an API or GPU lock.
|
||||
Memory : This function may free memory.
|
||||
|
||||
Arguments:
|
||||
uvmCslContext[IN] - The CSL context associated with a channel.
|
||||
uvmCslContext[IN] - The CSL context.
|
||||
*/
|
||||
void nvUvmInterfaceDeinitCslContext(UvmCslContext *uvmCslContext);
|
||||
|
||||
/*******************************************************************************
|
||||
nvUvmInterfaceCslRotateKey
|
||||
|
||||
Disables channels and rotates keys.
|
||||
|
||||
This function disables channels and rotates associated keys. The channels
|
||||
associated with the given CSL contexts must be idled before this function is
|
||||
called. To trigger key rotation all allocated channels for a given key must
|
||||
be present in the list. If the function returns successfully then the CSL
|
||||
contexts have been updated with the new key.
|
||||
|
||||
Locking: This function attempts to acquire the GPU lock. In case of failure
|
||||
to acquire the return code is NV_ERR_STATE_IN_USE. The caller must
|
||||
guarantee that no CSL function, including this one, is invoked
|
||||
concurrently with the CSL contexts in contextList.
|
||||
Memory : This function dynamically allocates memory.
|
||||
|
||||
Arguments:
|
||||
contextList[IN/OUT] - An array of pointers to CSL contexts.
|
||||
contextListCount[IN] - Number of CSL contexts in contextList. Its value
|
||||
must be greater than 0.
|
||||
Error codes:
|
||||
NV_ERR_INVALID_ARGUMENT - contextList is NULL or contextListCount is 0.
|
||||
NV_ERR_STATE_IN_USE - Unable to acquire lock / resource. Caller
|
||||
can retry at a later time.
|
||||
NV_ERR_GENERIC - A failure other than _STATE_IN_USE occurred
|
||||
when attempting to acquire a lock.
|
||||
*/
|
||||
NV_STATUS nvUvmInterfaceCslRotateKey(UvmCslContext *contextList[],
|
||||
NvU32 contextListCount);
|
||||
|
||||
/*******************************************************************************
|
||||
nvUvmInterfaceCslRotateIv
|
||||
|
||||
Rotates the IV for a given channel and operation.
|
||||
|
||||
This function will rotate the IV on both the CPU and the GPU.
|
||||
For a given operation the channel must be idle before calling this function.
|
||||
This function can be called regardless of the value of the IV's message counter.
|
||||
Outstanding messages that have been encrypted by the GPU should first be
|
||||
decrypted before calling this function with operation equal to
|
||||
UVM_CSL_OPERATION_DECRYPT. Similarly, outstanding messages that have been
|
||||
encrypted by the CPU should first be decrypted before calling this function
|
||||
with operation equal to UVM_CSL_OPERATION_ENCRYPT. For a given operation
|
||||
the channel must be idle before calling this function. This function can be
|
||||
called regardless of the value of the IV's message counter.
|
||||
|
||||
Locking: This function attempts to acquire the GPU lock. In case of failure to
|
||||
acquire the return code is NV_ERR_STATE_IN_USE. The caller must guarantee
|
||||
that no CSL function, including this one, is invoked concurrently with
|
||||
the same CSL context.
|
||||
Memory : This function does not dynamically allocate memory.
|
||||
See "CSL Interface and Locking" for locking requirements.
|
||||
This function does not perform dynamic memory allocation.
|
||||
|
||||
Arguments:
|
||||
uvmCslContext[IN/OUT] - The CSL context associated with a channel.
|
||||
uvmCslContext[IN/OUT] - The CSL context.
|
||||
operation[IN] - Either
|
||||
- UVM_CSL_OPERATION_ENCRYPT
|
||||
- UVM_CSL_OPERATION_DECRYPT
|
||||
@@ -1559,11 +1521,7 @@ Arguments:
|
||||
Error codes:
|
||||
NV_ERR_INSUFFICIENT_RESOURCES - The rotate operation would cause a counter
|
||||
to overflow.
|
||||
NV_ERR_STATE_IN_USE - Unable to acquire lock / resource. Caller
|
||||
can retry at a later time.
|
||||
NV_ERR_INVALID_ARGUMENT - Invalid value for operation.
|
||||
NV_ERR_GENERIC - A failure other than _STATE_IN_USE occurred
|
||||
when attempting to acquire a lock.
|
||||
*/
|
||||
NV_STATUS nvUvmInterfaceCslRotateIv(UvmCslContext *uvmCslContext,
|
||||
UvmCslOperation operation);
|
||||
@@ -1580,13 +1538,11 @@ NV_STATUS nvUvmInterfaceCslRotateIv(UvmCslContext *uvmCslContext,
|
||||
The encryptIV can be obtained from nvUvmInterfaceCslIncrementIv.
|
||||
However, it is optional. If it is NULL, the next IV in line will be used.
|
||||
|
||||
Locking: This function does not acquire an API or GPU lock.
|
||||
The caller must guarantee that no CSL function, including this one,
|
||||
is invoked concurrently with the same CSL context.
|
||||
Memory : This function does not dynamically allocate memory.
|
||||
See "CSL Interface and Locking" for locking requirements.
|
||||
This function does not perform dynamic memory allocation.
|
||||
|
||||
Arguments:
|
||||
uvmCslContext[IN/OUT] - The CSL context associated with a channel.
|
||||
uvmCslContext[IN/OUT] - The CSL context.
|
||||
bufferSize[IN] - Size of the input and output buffers in
|
||||
units of bytes. Value can range from 1 byte
|
||||
to (2^32) - 1 bytes.
|
||||
@@ -1597,9 +1553,8 @@ Arguments:
|
||||
Its size is UVM_CSL_CRYPT_AUTH_TAG_SIZE_BYTES.
|
||||
|
||||
Error codes:
|
||||
NV_ERR_INVALID_ARGUMENT - The CSL context is not associated with a channel.
|
||||
- The size of the data is 0 bytes.
|
||||
- The encryptIv has already been used.
|
||||
NV_ERR_INVALID_ARGUMENT - The size of the data is 0 bytes.
|
||||
- The encryptIv has already been used.
|
||||
*/
|
||||
NV_STATUS nvUvmInterfaceCslEncrypt(UvmCslContext *uvmCslContext,
|
||||
NvU32 bufferSize,
|
||||
@@ -1618,15 +1573,8 @@ NV_STATUS nvUvmInterfaceCslEncrypt(UvmCslContext *uvmCslContext,
|
||||
maximized when the input and output buffers are 16-byte aligned. This is
|
||||
natural alignment for AES block.
|
||||
|
||||
During a key rotation event the previous key is stored in the CSL context.
|
||||
This allows data encrypted by the GPU to be decrypted with the previous key.
|
||||
The keyRotationId parameter identifies which key is used. The first key rotation
|
||||
ID has a value of 0 that increments by one for each key rotation event.
|
||||
|
||||
Locking: This function does not acquire an API or GPU lock.
|
||||
The caller must guarantee that no CSL function, including this one,
|
||||
is invoked concurrently with the same CSL context.
|
||||
Memory : This function does not dynamically allocate memory.
|
||||
See "CSL Interface and Locking" for locking requirements.
|
||||
This function does not perform dynamic memory allocation.
|
||||
|
||||
Arguments:
|
||||
uvmCslContext[IN/OUT] - The CSL context.
|
||||
@@ -1635,8 +1583,6 @@ NV_STATUS nvUvmInterfaceCslEncrypt(UvmCslContext *uvmCslContext,
|
||||
decryptIv[IN] - IV used to decrypt the ciphertext. Its value can either be given by
|
||||
nvUvmInterfaceCslIncrementIv, or, if NULL, the CSL context's
|
||||
internal counter is used.
|
||||
keyRotationId[IN] - Specifies the key that is used for decryption.
|
||||
A value of NV_U32_MAX specifies the current key.
|
||||
inputBuffer[IN] - Address of ciphertext input buffer.
|
||||
outputBuffer[OUT] - Address of plaintext output buffer.
|
||||
addAuthData[IN] - Address of the plaintext additional authenticated data used to
|
||||
@@ -1657,7 +1603,6 @@ NV_STATUS nvUvmInterfaceCslDecrypt(UvmCslContext *uvmCslContext,
|
||||
NvU32 bufferSize,
|
||||
NvU8 const *inputBuffer,
|
||||
UvmCslIv const *decryptIv,
|
||||
NvU32 keyRotationId,
|
||||
NvU8 *outputBuffer,
|
||||
NvU8 const *addAuthData,
|
||||
NvU32 addAuthDataSize,
|
||||
@@ -1671,13 +1616,11 @@ NV_STATUS nvUvmInterfaceCslDecrypt(UvmCslContext *uvmCslContext,
|
||||
Auth and input buffers must not overlap. If they do then calling this function produces
|
||||
undefined behavior.
|
||||
|
||||
Locking: This function does not acquire an API or GPU lock.
|
||||
The caller must guarantee that no CSL function, including this one,
|
||||
is invoked concurrently with the same CSL context.
|
||||
Memory : This function does not dynamically allocate memory.
|
||||
See "CSL Interface and Locking" for locking requirements.
|
||||
This function does not perform dynamic memory allocation.
|
||||
|
||||
Arguments:
|
||||
uvmCslContext[IN/OUT] - The CSL context associated with a channel.
|
||||
uvmCslContext[IN/OUT] - The CSL context.
|
||||
bufferSize[IN] - Size of the input buffer in units of bytes.
|
||||
Value can range from 1 byte to (2^32) - 1 bytes.
|
||||
inputBuffer[IN] - Address of plaintext input buffer.
|
||||
@@ -1686,8 +1629,7 @@ NV_STATUS nvUvmInterfaceCslDecrypt(UvmCslContext *uvmCslContext,
|
||||
|
||||
Error codes:
|
||||
NV_ERR_INSUFFICIENT_RESOURCES - The signing operation would cause a counter overflow to occur.
|
||||
NV_ERR_INVALID_ARGUMENT - The CSL context is not associated with a channel.
|
||||
- The size of the data is 0 bytes.
|
||||
NV_ERR_INVALID_ARGUMENT - The size of the data is 0 bytes.
|
||||
*/
|
||||
NV_STATUS nvUvmInterfaceCslSign(UvmCslContext *uvmCslContext,
|
||||
NvU32 bufferSize,
|
||||
@@ -1699,10 +1641,8 @@ NV_STATUS nvUvmInterfaceCslSign(UvmCslContext *uvmCslContext,
|
||||
|
||||
Returns the number of messages that can be encrypted before the message counter will overflow.
|
||||
|
||||
Locking: This function does not acquire an API or GPU lock.
|
||||
Memory : This function does not dynamically allocate memory.
|
||||
The caller must guarantee that no CSL function, including this one,
|
||||
is invoked concurrently with the same CSL context.
|
||||
See "CSL Interface and Locking" for locking requirements.
|
||||
This function does not perform dynamic memory allocation.
|
||||
|
||||
Arguments:
|
||||
uvmCslContext[IN/OUT] - The CSL context.
|
||||
@@ -1726,10 +1666,8 @@ NV_STATUS nvUvmInterfaceCslQueryMessagePool(UvmCslContext *uvmCslContext,
|
||||
can be used in nvUvmInterfaceCslEncrypt. If operation is UVM_CSL_OPERATION_DECRYPT then
|
||||
the returned IV can be used in nvUvmInterfaceCslDecrypt.
|
||||
|
||||
Locking: This function does not acquire an API or GPU lock.
|
||||
The caller must guarantee that no CSL function, including this one,
|
||||
is invoked concurrently with the same CSL context.
|
||||
Memory : This function does not dynamically allocate memory.
|
||||
See "CSL Interface and Locking" for locking requirements.
|
||||
This function does not perform dynamic memory allocation.
|
||||
|
||||
Arguments:
|
||||
uvmCslContext[IN/OUT] - The CSL context.
|
||||
@@ -1737,7 +1675,7 @@ Arguments:
|
||||
- UVM_CSL_OPERATION_ENCRYPT
|
||||
- UVM_CSL_OPERATION_DECRYPT
|
||||
increment[IN] - The amount by which the IV is incremented. Can be 0.
|
||||
iv[OUT] - If non-NULL, a buffer to store the incremented IV.
|
||||
iv[out] - If non-NULL, a buffer to store the incremented IV.
|
||||
|
||||
Error codes:
|
||||
NV_ERR_INVALID_ARGUMENT - The value of the operation parameter is illegal.
|
||||
@@ -1749,42 +1687,4 @@ NV_STATUS nvUvmInterfaceCslIncrementIv(UvmCslContext *uvmCslContext,
|
||||
NvU64 increment,
|
||||
UvmCslIv *iv);
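The relationship between an IV increment and the remaining message pool can be illustrated with a toy counter. This sketch assumes a single 64-bit message counter, which is an assumption for illustration only: the actual layout of UvmCslIv is not shown in this diff, and `MockIv` and both functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical IV model: a single message counter. */
typedef struct {
    uint64_t messageCounter;
} MockIv;

/* Number of further messages that fit before the counter would overflow. */
static uint64_t mock_messages_remaining(const MockIv *iv)
{
    return UINT64_MAX - iv->messageCounter;
}

/* Advances the counter by `increment` (which may be 0) and, if `out` is
 * non-NULL, copies the resulting IV there. Returns 0 on success or -1 if
 * the increment would overflow, in which case nothing is modified. */
static int mock_increment_iv(MockIv *iv, uint64_t increment, MockIv *out)
{
    if (increment > mock_messages_remaining(iv))
        return -1;

    iv->messageCounter += increment;
    if (out)
        *out = *iv;

    return 0;
}
```

Checking the increment against the remaining pool before adding avoids relying on wraparound to detect overflow.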
|
||||
|
||||
/*******************************************************************************
|
||||
nvUvmInterfaceCslLogEncryption
|
||||
|
||||
Checks and logs information about encryptions associated with the given
|
||||
CSL context.
|
||||
|
||||
For contexts associated with channels, this function does not modify elements of
|
||||
the UvmCslContext, and must be called for every CPU/GPU encryption.
|
||||
|
||||
For the context associated with fault buffers, bufferSize can encompass multiple
|
||||
encryption invocations, and the UvmCslContext will be updated following a key
|
||||
rotation event.
|
||||
|
||||
In either case the IV remains unmodified after this function is called.
|
||||
|
||||
Locking: This function does not acquire an API or GPU lock.
|
||||
Memory : This function does not dynamically allocate memory.
|
||||
The caller must guarantee that no CSL function, including this one,
|
||||
is invoked concurrently with the same CSL context.
|
||||
|
||||
Arguments:
|
||||
uvmCslContext[IN/OUT] - The CSL context.
|
||||
operation[IN] - If the CSL context is associated with a fault
|
||||
buffer, this argument is ignored. If it is
|
||||
associated with a channel, it must be either
|
||||
- UVM_CSL_OPERATION_ENCRYPT
|
||||
- UVM_CSL_OPERATION_DECRYPT
|
||||
bufferSize[IN] - The size of the buffer(s) encrypted by the
|
||||
external entity in units of bytes.
|
||||
|
||||
Error codes:
|
||||
NV_ERR_INSUFFICIENT_RESOURCES - The encryption would cause a counter
|
||||
to overflow.
|
||||
*/
|
||||
NV_STATUS nvUvmInterfaceCslLogEncryption(UvmCslContext *uvmCslContext,
|
||||
UvmCslOperation operation,
|
||||
NvU32 bufferSize);
|
||||
|
||||
#endif // _NV_UVM_INTERFACE_H_
|
||||
|
||||
@@ -104,10 +104,6 @@ typedef struct UvmGpuMemoryInfo_tag
|
||||
// Out: Set to TRUE, if the allocation is in sysmem.
|
||||
NvBool sysmem;
|
||||
|
||||
// Out: Set to TRUE, if this allocation is treated as EGM.
|
||||
// sysmem is also TRUE when egm is TRUE.
|
||||
NvBool egm;
|
||||
|
||||
// Out: Set to TRUE, if the allocation is constructed
|
||||
// under a Device or Subdevice.
|
||||
// All permutations of sysmem and deviceDescendant are valid.
|
||||
@@ -129,10 +125,6 @@ typedef struct UvmGpuMemoryInfo_tag
|
||||
|
||||
// Out: Uuid of the GPU to which the allocation belongs.
|
||||
// This is only valid if deviceDescendant is NV_TRUE.
|
||||
// When egm is NV_TRUE, this is also the UUID of the GPU
|
||||
// for which EGM is local.
|
||||
// If the GPU has SMC enabled, the UUID is the GI UUID.
|
||||
// Otherwise, it is the UUID for the physical GPU.
|
||||
// Note: If the allocation is owned by a device in
|
||||
// an SLI group and the allocation is broadcast
|
||||
// across the SLI group, this UUID will be any one
|
||||
@@ -267,7 +259,6 @@ typedef struct UvmGpuChannelInfo_tag
|
||||
|
||||
// The errorNotifier is filled out when the channel hits an RC error.
|
||||
NvNotification *errorNotifier;
|
||||
NvNotification *keyRotationNotifier;
|
||||
|
||||
NvU32 hwRunlistId;
|
||||
NvU32 hwChannelId;
|
||||
@@ -293,13 +284,13 @@ typedef struct UvmGpuChannelInfo_tag
|
||||
|
||||
// GPU VAs of both GPFIFO and GPPUT are needed in Confidential Computing
|
||||
// so a channel can be controlled via another channel (SEC2 or WLC/LCIC)
|
||||
NvU64 gpFifoGpuVa;
|
||||
NvU64 gpPutGpuVa;
|
||||
NvU64 gpGetGpuVa;
|
||||
NvU64 gpFifoGpuVa;
|
||||
NvU64 gpPutGpuVa;
|
||||
NvU64 gpGetGpuVa;
|
||||
// GPU VA of work submission offset is needed in Confidential Computing
|
||||
// so CE channels can ring doorbell of other channels as required for
|
||||
// WLC/LCIC work submission
|
||||
NvU64 workSubmissionOffsetGpuVa;
|
||||
NvU64 workSubmissionOffsetGpuVa;
|
||||
} UvmGpuChannelInfo;
|
||||
|
||||
typedef enum
|
||||
@@ -341,7 +332,7 @@ typedef struct UvmGpuPagingChannelAllocParams_tag
|
||||
|
||||
// The max number of Copy Engines supported by a GPU.
|
||||
// The gpu ops build has a static assert that this is the correct number.
|
||||
#define UVM_COPY_ENGINE_COUNT_MAX 64
|
||||
#define UVM_COPY_ENGINE_COUNT_MAX 10
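The comment above mentions a static assert that keeps this constant in step with the gpu ops build. That assert is not part of this diff; the sketch below only shows the general shape such a compile-time check could take. Both `MOCK_` macros and `MockCopyEngineCaps` are hypothetical.

```c
#include <assert.h>

/* Hypothetical names: neither macro below is taken from the driver. */
#define MOCK_RM_COPY_ENGINE_COUNT 10  /* value the other component reports */
#define MOCK_UVM_COPY_ENGINE_MAX  10  /* value the shared header carries */

/* Fails the build, rather than the run, if the two values drift apart. */
_Static_assert(MOCK_UVM_COPY_ENGINE_MAX == MOCK_RM_COPY_ENGINE_COUNT,
               "copy engine count mismatch between interface headers");

/* Arrays sized by the shared maximum then agree on both sides. */
typedef struct {
    unsigned char supported[MOCK_UVM_COPY_ENGINE_MAX];
} MockCopyEngineCaps;
```

A compile-time check is useful here because per-engine arrays are sized by this constant, so a mismatch would otherwise only surface as an out-of-bounds access at run time.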
|
||||
|
||||
typedef struct
|
||||
{
|
||||
@@ -547,10 +538,6 @@ typedef struct UvmGpuP2PCapsParams_tag
|
||||
// the GPUs are direct peers.
|
||||
NvU32 peerIds[2];
|
||||
|
||||
// Out: peerId[i] contains gpu[i]'s EGM peer id of gpu[1 - i]. Only defined
|
||||
// if the GPUs are direct peers and EGM enabled in the system.
|
||||
NvU32 egmPeerIds[2];
|
||||
|
||||
// Out: UVM_LINK_TYPE
|
||||
NvU32 p2pLink;
|
||||
|
||||
@@ -605,8 +592,6 @@ typedef struct UvmGpuConfComputeCaps_tag
|
||||
{
|
||||
// Out: GPU's confidential compute mode
|
||||
UvmGpuConfComputeMode mode;
|
||||
// Is key rotation enabled for UVM keys
|
||||
NvBool bKeyRotationEnabled;
|
||||
} UvmGpuConfComputeCaps;
|
||||
|
||||
#define UVM_GPU_NAME_LENGTH 0x40
|
||||
@@ -616,8 +601,7 @@ typedef struct UvmGpuInfo_tag
|
||||
// Printable gpu name
|
||||
char name[UVM_GPU_NAME_LENGTH];
|
||||
|
||||
// Uuid of the physical GPU or GI UUID if nvUvmInterfaceGetGpuInfo()
|
||||
// requested information for a valid SMC partition.
|
||||
// Uuid of this gpu
|
||||
NvProcessorUuid uuid;
|
||||
|
||||
// Gpu architecture; NV2080_CTRL_MC_ARCH_INFO_ARCHITECTURE_*
|
||||
@@ -699,16 +683,6 @@ typedef struct UvmGpuInfo_tag
|
||||
// to NVSwitch peers.
|
||||
NvBool connectedToSwitch;
|
||||
NvU64 nvswitchMemoryWindowStart;
|
||||
|
||||
// local EGM properties
|
||||
// NV_TRUE if EGM is enabled
|
||||
NvBool egmEnabled;
|
||||
|
||||
// Peer ID to reach local EGM when EGM is enabled
|
||||
NvU8 egmPeerId;
|
||||
|
||||
// EGM base address to offset in the GMMU PTE entry for EGM mappings
|
||||
NvU64 egmBaseAddr;
|
||||
} UvmGpuInfo;
|
||||
|
||||
typedef struct UvmGpuFbInfo_tag
|
||||
@@ -717,10 +691,9 @@ typedef struct UvmGpuFbInfo_tag
|
||||
// RM regions that are not registered with PMA either.
|
||||
NvU64 maxAllocatableAddress;
|
||||
|
||||
NvU32 heapSize; // RAM in KB available for user allocations
|
||||
NvU32 reservedHeapSize; // RAM in KB reserved for internal RM allocation
|
||||
NvBool bZeroFb; // Zero FB mode enabled.
|
||||
NvU64 maxVidmemPageSize; // Largest GPU page size to access vidmem.
|
||||
NvU32 heapSize; // RAM in KB available for user allocations
|
||||
NvU32 reservedHeapSize; // RAM in KB reserved for internal RM allocation
|
||||
NvBool bZeroFb; // Zero FB mode enabled.
|
||||
} UvmGpuFbInfo;
|
||||
|
||||
typedef struct UvmGpuEccInfo_tag
|
||||
@@ -798,14 +771,14 @@ typedef NV_STATUS (*uvmEventResume_t) (void);
|
||||
/*******************************************************************************
|
||||
uvmEventStartDevice
|
||||
This function will be called by the GPU driver once it has finished its
|
||||
initialization to tell the UVM driver that this physical GPU has come up.
|
||||
initialization to tell the UVM driver that this GPU has come up.
|
||||
*/
|
||||
typedef NV_STATUS (*uvmEventStartDevice_t) (const NvProcessorUuid *pGpuUuidStruct);
|
||||
|
||||
/*******************************************************************************
|
||||
uvmEventStopDevice
|
||||
This function will be called by the GPU driver to let UVM know that a
|
||||
physical GPU is going down.
|
||||
This function will be called by the GPU driver to let UVM know that a GPU
|
||||
is going down.
|
||||
*/
|
||||
typedef NV_STATUS (*uvmEventStopDevice_t) (const NvProcessorUuid *pGpuUuidStruct);
|
||||
|
||||
@@ -836,7 +809,7 @@ typedef NV_STATUS (*uvmEventServiceInterrupt_t) (void *pDeviceObject,
|
||||
/*******************************************************************************
|
||||
uvmEventIsrTopHalf_t
|
||||
This function will be called by the GPU driver to let UVM know
|
||||
that an interrupt has occurred on the given physical GPU.
|
||||
that an interrupt has occurred.
|
||||
|
||||
Returns:
|
||||
NV_OK if the UVM driver handled the interrupt
|
||||
@@ -943,6 +916,11 @@ typedef struct UvmGpuFaultInfo_tag
|
||||
// CSL context used for performing decryption of replayable faults when
|
||||
// Confidential Computing is enabled.
|
||||
UvmCslContext cslCtx;
|
||||
|
||||
// Indicates whether UVM owns the replayable fault buffer.
|
||||
// The value of this field is always NV_TRUE when Confidential Computing
|
||||
// is disabled.
|
||||
NvBool bUvmOwnsHwFaultBuffer;
|
||||
} replayable;
|
||||
struct
|
||||
{
|
||||
@@ -1089,21 +1067,4 @@ typedef enum UvmCslOperation
|
||||
UVM_CSL_OPERATION_DECRYPT
|
||||
} UvmCslOperation;
|
||||
|
||||
typedef enum UVM_KEY_ROTATION_STATUS {
|
||||
// Key rotation complete/not in progress
|
||||
UVM_KEY_ROTATION_STATUS_IDLE = 0,
|
||||
// RM is waiting for clients to report their channels are idle for key rotation
|
||||
UVM_KEY_ROTATION_STATUS_PENDING = 1,
|
||||
// Key rotation is in progress
|
||||
UVM_KEY_ROTATION_STATUS_IN_PROGRESS = 2,
|
||||
// Key rotation timeout failure, RM will RC non-idle channels.
|
||||
// UVM should never see this status value.
|
||||
UVM_KEY_ROTATION_STATUS_FAILED_TIMEOUT = 3,
|
||||
// Key rotation failed because upper threshold was crossed, RM will RC non-idle channels
|
||||
UVM_KEY_ROTATION_STATUS_FAILED_THRESHOLD = 4,
|
||||
// Internal RM failure while rotating keys for a certain channel, RM will RC the channel.
|
||||
UVM_KEY_ROTATION_STATUS_FAILED_ROTATION = 5,
|
||||
UVM_KEY_ROTATION_STATUS_MAX_COUNT = 6,
|
||||
} UVM_KEY_ROTATION_STATUS;
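The status values above fall into three groups: idle, busy (pending or in progress), and failed. A minimal sketch of how a caller might classify them follows. The enum values are copied from the hunk; the two helper names are hypothetical and do not appear in the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the UVM_KEY_ROTATION_STATUS enum in the hunk above. */
typedef enum UVM_KEY_ROTATION_STATUS {
    UVM_KEY_ROTATION_STATUS_IDLE             = 0,
    UVM_KEY_ROTATION_STATUS_PENDING          = 1,
    UVM_KEY_ROTATION_STATUS_IN_PROGRESS      = 2,
    UVM_KEY_ROTATION_STATUS_FAILED_TIMEOUT   = 3,
    UVM_KEY_ROTATION_STATUS_FAILED_THRESHOLD = 4,
    UVM_KEY_ROTATION_STATUS_FAILED_ROTATION  = 5,
    UVM_KEY_ROTATION_STATUS_MAX_COUNT        = 6,
} UVM_KEY_ROTATION_STATUS;

/* Hypothetical helper: true for the three FAILED_* values. */
static bool key_rotation_failed(UVM_KEY_ROTATION_STATUS status)
{
    assert((int)status >= 0 &&
           (int)status < (int)UVM_KEY_ROTATION_STATUS_MAX_COUNT);
    return status >= UVM_KEY_ROTATION_STATUS_FAILED_TIMEOUT;
}

/* Hypothetical helper: true while RM waits for idle channels or rotates keys. */
static bool key_rotation_busy(UVM_KEY_ROTATION_STATUS status)
{
    return status == UVM_KEY_ROTATION_STATUS_PENDING ||
           status == UVM_KEY_ROTATION_STATUS_IN_PROGRESS;
}
```

The range comparison in `key_rotation_failed()` relies on the FAILED_* values being contiguous and last before MAX_COUNT, which holds for the values shown here but is an assumption about any future additions.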
|
||||
|
||||
#endif // _NV_UVM_TYPES_H_
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*
|
||||
* SPDX-FileCopyrightText: Copyright (c) 2014-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-FileCopyrightText: Copyright (c) 2014-2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-License-Identifier: MIT
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
@@ -45,11 +45,6 @@
|
||||
|
||||
#define NVKMS_DEVICE_ID_TEGRA 0x0000ffff
|
||||
|
||||
#define NVKMS_MAX_SUPERFRAME_VIEWS 4
|
||||
|
||||
#define NVKMS_LOG2_LUT_ARRAY_SIZE 10
|
||||
#define NVKMS_LUT_ARRAY_SIZE (1 << NVKMS_LOG2_LUT_ARRAY_SIZE)
|
||||
|
||||
typedef NvU32 NvKmsDeviceHandle;
|
||||
typedef NvU32 NvKmsDispHandle;
|
||||
typedef NvU32 NvKmsConnectorHandle;
|
||||
@@ -58,7 +53,6 @@ typedef NvU32 NvKmsFrameLockHandle;
|
||||
typedef NvU32 NvKmsDeferredRequestFifoHandle;
|
||||
typedef NvU32 NvKmsSwapGroupHandle;
|
||||
typedef NvU32 NvKmsVblankSyncObjectHandle;
|
||||
typedef NvU32 NvKmsVblankSemControlHandle;
|
||||
|
||||
struct NvKmsSize {
|
||||
NvU16 width;
|
||||
@@ -185,14 +179,6 @@ enum NvKmsEventType {
|
||||
NVKMS_EVENT_TYPE_FLIP_OCCURRED,
|
||||
};
|
||||
|
||||
enum NvKmsFlipResult {
|
||||
NV_KMS_FLIP_RESULT_SUCCESS = 0, /* Success */
|
||||
NV_KMS_FLIP_RESULT_INVALID_PARAMS, /* Parameter validation failed */
|
||||
NV_KMS_FLIP_RESULT_IN_PROGRESS, /* Flip would fail because an outstanding
|
||||
flip containing changes that cannot be
|
||||
queued is in progress */
|
||||
};
|
||||
|
||||
typedef enum {
|
||||
NV_EVO_SCALER_1TAP = 0,
|
||||
NV_EVO_SCALER_2TAPS = 1,
|
||||
@@ -235,16 +221,6 @@ struct NvKmsUsageBounds {
|
||||
} layer[NVKMS_MAX_LAYERS_PER_HEAD];
|
||||
};
|
||||
|
||||
/*!
|
||||
* Per-component arrays of NvU16s describing the LUT; used for both the input
|
||||
* LUT and output LUT.
|
||||
*/
|
||||
struct NvKmsLutRamps {
|
||||
NvU16 red[NVKMS_LUT_ARRAY_SIZE]; /*! in */
|
||||
NvU16 green[NVKMS_LUT_ARRAY_SIZE]; /*! in */
|
||||
NvU16 blue[NVKMS_LUT_ARRAY_SIZE]; /*! in */
|
||||
};
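As an illustration of the LUT layout above, the sketch below fills all three per-component arrays with a linear ramp. The struct and size macros are copied from the hunk; `NvU16` is typedef'd locally as a stand-in, `fill_identity_ramps()` is a hypothetical helper, and scaling to the full 16-bit range is an assumption, since the diff does not say how the hardware interprets the entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t NvU16;  /* assumption: stand-in for the driver's NvU16 */

#define NVKMS_LOG2_LUT_ARRAY_SIZE 10
#define NVKMS_LUT_ARRAY_SIZE (1 << NVKMS_LOG2_LUT_ARRAY_SIZE)

struct NvKmsLutRamps {
    NvU16 red[NVKMS_LUT_ARRAY_SIZE];
    NvU16 green[NVKMS_LUT_ARRAY_SIZE];
    NvU16 blue[NVKMS_LUT_ARRAY_SIZE];
};

/* Hypothetical helper: entry i maps linearly onto 0..65535 in every channel. */
static void fill_identity_ramps(struct NvKmsLutRamps *ramps)
{
    const uint32_t last = NVKMS_LUT_ARRAY_SIZE - 1;

    assert(ramps != NULL);
    for (uint32_t i = 0; i <= last; i++) {
        NvU16 v = (NvU16)((i * 65535u) / last);
        ramps->red[i] = v;
        ramps->green[i] = v;
        ramps->blue[i] = v;
    }
}
```

Each ramp holds 1024 entries (1 << 10), so one `struct NvKmsLutRamps` occupies 6 KiB; callers would normally allocate it off the stack.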
|
||||
|
||||
/*
|
||||
* A 3x4 row-major colorspace conversion matrix.
|
||||
*
|
||||
@@ -555,18 +531,6 @@ typedef struct {
|
||||
NvBool noncoherent;
|
||||
} NvKmsDispIOCoherencyModes;
|
||||
|
||||
enum NvKmsInputColorRange {
|
||||
/*
|
||||
* If DEFAULT is provided, driver will assume full range for RGB formats
|
||||
* and limited range for YUV formats.
|
||||
*/
|
||||
NVKMS_INPUT_COLORRANGE_DEFAULT = 0,
|
||||
|
||||
NVKMS_INPUT_COLORRANGE_LIMITED = 1,
|
||||
|
||||
NVKMS_INPUT_COLORRANGE_FULL = 2,
|
||||
};
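The comment on NVKMS_INPUT_COLORRANGE_DEFAULT describes a fallback rule: full range for RGB, limited range for YUV. A minimal sketch of that rule is shown below. The enum is copied from the hunk; `resolve_input_color_range()` and its `format_is_yuv` parameter are hypothetical and only illustrate the documented behavior.

```c
#include <assert.h>
#include <stdbool.h>

/* Copied from the hunk above. */
enum NvKmsInputColorRange {
    NVKMS_INPUT_COLORRANGE_DEFAULT = 0,
    NVKMS_INPUT_COLORRANGE_LIMITED = 1,
    NVKMS_INPUT_COLORRANGE_FULL = 2,
};

/* Hypothetical helper: replace DEFAULT with the range the comment describes. */
static enum NvKmsInputColorRange
resolve_input_color_range(enum NvKmsInputColorRange requested,
                          bool format_is_yuv)
{
    assert(requested <= NVKMS_INPUT_COLORRANGE_FULL);

    if (requested != NVKMS_INPUT_COLORRANGE_DEFAULT) {
        return requested;  /* an explicit choice is kept as-is */
    }
    return format_is_yuv ? NVKMS_INPUT_COLORRANGE_LIMITED
                         : NVKMS_INPUT_COLORRANGE_FULL;
}
```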
|
||||
|
||||
enum NvKmsInputColorSpace {
|
||||
/* Unknown colorspace; no de-gamma will be applied */
|
||||
NVKMS_INPUT_COLORSPACE_NONE = 0,
|
||||
@@ -578,12 +542,6 @@ enum NvKmsInputColorSpace {
|
||||
NVKMS_INPUT_COLORSPACE_BT2100_PQ = 2,
|
||||
};
|
||||
|
||||
enum NvKmsOutputColorimetry {
|
||||
NVKMS_OUTPUT_COLORIMETRY_DEFAULT = 0,
|
||||
|
||||
NVKMS_OUTPUT_COLORIMETRY_BT2100 = 1,
|
||||
};
|
||||
|
||||
enum NvKmsOutputTf {
|
||||
/*
|
||||
* NVKMS itself won't apply any OETF (clients are still
|
||||
@@ -594,17 +552,6 @@ enum NvKmsOutputTf {
|
||||
NVKMS_OUTPUT_TF_PQ = 2,
|
||||
};
|
||||
|
||||
/*!
|
||||
* EOTF Data Byte 1 as per CTA-861-G spec.
|
||||
* This is expected to match exactly with the spec.
|
||||
*/
|
||||
enum NvKmsInfoFrameEOTF {
|
||||
NVKMS_INFOFRAME_EOTF_SDR_GAMMA = 0,
|
||||
NVKMS_INFOFRAME_EOTF_HDR_GAMMA = 1,
|
||||
NVKMS_INFOFRAME_EOTF_ST2084 = 2,
|
||||
NVKMS_INFOFRAME_EOTF_HLG = 3,
|
||||
};
|
||||
|
||||
/*!
|
||||
* HDR Static Metadata Type1 Descriptor as per CEA-861.3 spec.
|
||||
* This is expected to match exactly with the spec.
|
||||
@@ -658,29 +605,4 @@ struct NvKmsHDRStaticMetadata {
|
||||
NvU16 maxFALL;
|
||||
};
|
||||
|
||||
/*!
|
||||
* A superframe is made of two or more video streams that are combined in
|
||||
* a specific way. A DP serializer (an external device connected to a Tegra
|
||||
* ARM SOC over DP or HDMI) can receive a video stream comprising multiple
|
||||
* videos combined into a single frame and then split it into multiple
|
||||
* video streams. The following structure describes the number of views
|
||||
* and dimensions of each view inside a superframe.
|
||||
*/
|
||||
struct NvKmsSuperframeInfo {
|
||||
NvU8 numViews;
|
||||
struct {
|
||||
/* x offset inside superframe at which this view starts */
|
||||
NvU16 x;
|
||||
|
||||
/* y offset inside superframe at which this view starts */
|
||||
NvU16 y;
|
||||
|
||||
/* Horizontal active width in pixels for this view */
|
||||
NvU16 width;
|
||||
|
||||
/* Vertical active height in lines for this view */
|
||||
NvU16 height;
|
||||
} view[NVKMS_MAX_SUPERFRAME_VIEWS];
|
||||
};
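To make the superframe layout concrete, the sketch below builds a two-view, side-by-side description and checks that every view lies inside the combined frame. The struct is copied from the hunk; the `NvU8`/`NvU16` typedefs are local stand-ins, and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t  NvU8;   /* assumption: stand-ins for the driver's types */
typedef uint16_t NvU16;

#define NVKMS_MAX_SUPERFRAME_VIEWS 4

struct NvKmsSuperframeInfo {
    NvU8 numViews;
    struct {
        NvU16 x;       /* x offset inside superframe */
        NvU16 y;       /* y offset inside superframe */
        NvU16 width;   /* active width in pixels */
        NvU16 height;  /* active height in lines */
    } view[NVKMS_MAX_SUPERFRAME_VIEWS];
};

/* Hypothetical builder: two equally sized views placed left and right. */
static struct NvKmsSuperframeInfo make_side_by_side(NvU16 view_w, NvU16 view_h)
{
    struct NvKmsSuperframeInfo info = {0};

    info.numViews = 2;
    info.view[0].width = view_w;
    info.view[0].height = view_h;
    info.view[1].x = view_w;       /* second view starts where the first ends */
    info.view[1].width = view_w;
    info.view[1].height = view_h;
    return info;
}

/* Hypothetical check: every view fits in a frame_w x frame_h superframe. */
static bool superframe_views_fit(const struct NvKmsSuperframeInfo *info,
                                 uint32_t frame_w, uint32_t frame_h)
{
    assert(info != NULL);
    if (info->numViews > NVKMS_MAX_SUPERFRAME_VIEWS) {
        return false;
    }
    for (unsigned i = 0; i < info->numViews; i++) {
        uint32_t right  = (uint32_t)info->view[i].x + info->view[i].width;
        uint32_t bottom = (uint32_t)info->view[i].y + info->view[i].height;
        if (right > frame_w || bottom > frame_h) {
            return false;
        }
    }
    return true;
}
```

The sums are widened to 32 bits before comparison so that an offset plus a size near the `NvU16` limit cannot wrap.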
|
||||
|
||||
#endif /* NVKMS_API_TYPES_H */
|
||||
|
||||
@@ -49,8 +49,6 @@ struct NvKmsKapiDevice;
|
||||
struct NvKmsKapiMemory;
|
||||
struct NvKmsKapiSurface;
|
||||
struct NvKmsKapiChannelEvent;
|
||||
struct NvKmsKapiSemaphoreSurface;
|
||||
struct NvKmsKapiSemaphoreSurfaceCallback;
|
||||
|
||||
typedef NvU32 NvKmsKapiConnector;
|
||||
typedef NvU32 NvKmsKapiDisplay;
|
||||
@@ -69,14 +67,6 @@ typedef NvU32 NvKmsKapiDisplay;
|
||||
*/
|
||||
typedef void NvKmsChannelEventProc(void *dataPtr, NvU32 dataU32);
|
||||
|
||||
/*
|
||||
* Note: Same as above, this function must not call back into NVKMS-KAPI, nor
|
||||
* directly into RM. Doing so could cause deadlocks given the notification
|
||||
* function will most likely be called from within RM's interrupt handler
|
||||
* callchain.
|
||||
*/
|
||||
typedef void NvKmsSemaphoreSurfaceCallbackProc(void *pData);
|
||||
|
||||
/** @} */
|
||||
|
||||
/**
|
||||
@@ -136,11 +126,6 @@ struct NvKmsKapiDeviceResourcesInfo {
|
||||
NvU32 validCursorCompositionModes;
|
||||
NvU64 supportedCursorSurfaceMemoryFormats;
|
||||
|
||||
struct {
|
||||
NvU64 maxSubmittedOffset;
|
||||
NvU64 stride;
|
||||
} semsurf;
|
||||
|
||||
struct {
|
||||
NvU16 validRRTransforms;
|
||||
NvU32 validCompositionModes;
|
||||
@@ -233,10 +218,8 @@ struct NvKmsKapiLayerConfig {
|
||||
struct NvKmsRRParams rrParams;
|
||||
struct NvKmsKapiSyncpt syncptParams;
|
||||
|
||||
struct {
|
||||
struct NvKmsHDRStaticMetadata val;
|
||||
NvBool enabled;
|
||||
} hdrMetadata;
|
||||
struct NvKmsHDRStaticMetadata hdrMetadata;
|
||||
NvBool hdrMetadataSpecified;
|
||||
|
||||
enum NvKmsOutputTf tf;
|
||||
|
||||
@@ -250,21 +233,16 @@ struct NvKmsKapiLayerConfig {
|
||||
NvU16 dstWidth, dstHeight;
|
||||
|
||||
enum NvKmsInputColorSpace inputColorSpace;
|
||||
struct NvKmsCscMatrix csc;
|
||||
NvBool cscUseMain;
|
||||
};
|
||||
|
||||
struct NvKmsKapiLayerRequestedConfig {
|
||||
struct NvKmsKapiLayerConfig config;
|
||||
struct {
|
||||
NvBool surfaceChanged : 1;
|
||||
NvBool srcXYChanged : 1;
|
||||
NvBool srcWHChanged : 1;
|
||||
NvBool dstXYChanged : 1;
|
||||
NvBool dstWHChanged : 1;
|
||||
NvBool cscChanged : 1;
|
||||
NvBool tfChanged : 1;
|
||||
NvBool hdrMetadataChanged : 1;
|
||||
NvBool surfaceChanged : 1;
|
||||
NvBool srcXYChanged : 1;
|
||||
NvBool srcWHChanged : 1;
|
||||
NvBool dstXYChanged : 1;
|
||||
NvBool dstWHChanged : 1;
|
||||
} flags;
|
||||
};
|
||||
|
||||
@@ -308,41 +286,14 @@ struct NvKmsKapiHeadModeSetConfig {
|
||||
struct NvKmsKapiDisplayMode mode;
|
||||
|
||||
NvBool vrrEnabled;
|
||||
|
||||
struct {
|
||||
NvBool enabled;
|
||||
enum NvKmsInfoFrameEOTF eotf;
|
||||
struct NvKmsHDRStaticMetadata staticMetadata;
|
||||
} hdrInfoFrame;
|
||||
|
||||
enum NvKmsOutputColorimetry colorimetry;
|
||||
|
||||
struct {
|
||||
struct {
|
||||
NvBool specified;
|
||||
NvU32 depth;
|
||||
NvU32 start;
|
||||
NvU32 end;
|
||||
struct NvKmsLutRamps *pRamps;
|
||||
} input;
|
||||
|
||||
struct {
|
||||
NvBool specified;
|
||||
NvBool enabled;
|
||||
struct NvKmsLutRamps *pRamps;
|
||||
} output;
|
||||
} lut;
|
||||
};
|
||||
|
||||
struct NvKmsKapiHeadRequestedConfig {
|
||||
struct NvKmsKapiHeadModeSetConfig modeSetConfig;
|
||||
struct {
|
||||
NvBool activeChanged : 1;
|
||||
NvBool displaysChanged : 1;
|
||||
NvBool modeChanged : 1;
|
||||
NvBool hdrInfoFrameChanged : 1;
|
||||
NvBool colorimetryChanged : 1;
|
||||
NvBool lutChanged : 1;
|
||||
NvBool activeChanged : 1;
|
||||
NvBool displaysChanged : 1;
|
||||
NvBool modeChanged : 1;
|
||||
} flags;
|
||||
|
||||
struct NvKmsKapiCursorRequestedConfig cursorRequestedConfig;
|
||||
@@ -367,7 +318,6 @@ struct NvKmsKapiHeadReplyConfig {
|
||||
};
|
||||
|
||||
struct NvKmsKapiModeSetReplyConfig {
|
||||
enum NvKmsFlipResult flipResult;
|
||||
struct NvKmsKapiHeadReplyConfig
|
||||
headReplyConfig[NVKMS_KAPI_MAX_HEADS];
|
||||
};
|
||||
@@ -484,14 +434,6 @@ enum NvKmsKapiAllocationType {
|
||||
NVKMS_KAPI_ALLOCATION_TYPE_OFFSCREEN = 2,
|
||||
};
|
||||
|
||||
typedef enum NvKmsKapiRegisterWaiterResultRec {
|
||||
NVKMS_KAPI_REG_WAITER_FAILED,
|
||||
NVKMS_KAPI_REG_WAITER_SUCCESS,
|
||||
NVKMS_KAPI_REG_WAITER_ALREADY_SIGNALLED,
|
||||
} NvKmsKapiRegisterWaiterResult;
|
||||
|
||||
typedef void NvKmsKapiSuspendResumeCallbackFunc(NvBool suspend);
|
||||
|
||||
struct NvKmsKapiFunctionsTable {
|
||||
|
||||
/*!
|
||||
@@ -577,8 +519,8 @@ struct NvKmsKapiFunctionsTable {
|
||||
);
|
||||
|
||||
/*!
|
||||
* Revoke modeset permissions previously granted. Only one (dispIndex,
|
||||
* head, display) is currently supported.
|
||||
* Revoke permissions previously granted. Only one (dispIndex, head,
|
||||
* display) is currently supported.
|
||||
*
|
||||
* \param [in] device A device returned by allocateDevice().
|
||||
*
|
||||
@@ -595,34 +537,6 @@ struct NvKmsKapiFunctionsTable {
|
||||
NvKmsKapiDisplay display
|
||||
);
|
||||
|
||||
/*!
|
||||
* Grant modeset sub-owner permissions to fd. This is used by clients to
|
||||
* convert drm 'master' permissions into nvkms sub-owner permission.
|
||||
*
|
||||
* \param [in] fd fd from opening /dev/nvidia-modeset.
|
||||
*
|
||||
* \param [in] device A device returned by allocateDevice().
|
||||
*
|
||||
* \return NV_TRUE on success, NV_FALSE on failure.
|
||||
*/
|
||||
NvBool (*grantSubOwnership)
|
||||
(
|
||||
NvS32 fd,
|
||||
struct NvKmsKapiDevice *device
|
||||
);
|
||||
|
||||
/*!
|
||||
* Revoke sub-owner permissions previously granted.
|
||||
*
|
||||
* \param [in] device A device returned by allocateDevice().
|
||||
*
|
||||
* \return NV_TRUE on success, NV_FALSE on failure.
|
||||
*/
|
||||
NvBool (*revokeSubOwnership)
|
||||
(
|
||||
struct NvKmsKapiDevice *device
|
||||
);
|
||||
|
||||
/*!
|
||||
* Registers for notification, via
|
||||
* NvKmsKapiAllocateDeviceParams::eventCallback, of the events specified
|
||||
@@ -1208,208 +1122,6 @@ struct NvKmsKapiFunctionsTable {
|
||||
NvP64 dmaBuf,
|
||||
NvU32 limit);
|
||||
|
||||
/*!
|
||||
* Import a semaphore surface allocated elsewhere to NVKMS and return a
|
||||
* handle to the new object.
|
||||
*
|
||||
* \param [in] device A device allocated using allocateDevice().
|
||||
*
|
||||
* \param [in] nvKmsParamsUser Userspace pointer to driver-specific
|
||||
* parameters describing the semaphore
|
||||
* surface being imported.
|
||||
*
|
||||
* \param [in] nvKmsParamsSize Size of the driver-specific parameter
|
||||
* struct.
|
||||
*
|
||||
* \param [out] pSemaphoreMap Returns a CPU mapping of the semaphore
|
||||
* surface's semaphore memory to the client.
|
||||
*
|
||||
* \param [out] pMaxSubmittedMap Returns a CPU mapping of the semaphore
|
||||
* surface's max-submitted-value memory to the client.
|
||||
*
|
||||
* \return struct NvKmsKapiSemaphoreSurface* on success, NULL on failure.
|
||||
*/
|
||||
struct NvKmsKapiSemaphoreSurface* (*importSemaphoreSurface)
|
||||
(
|
||||
struct NvKmsKapiDevice *device,
|
||||
NvU64 nvKmsParamsUser,
|
||||
NvU64 nvKmsParamsSize,
|
||||
void **pSemaphoreMap,
|
||||
void **pMaxSubmittedMap
|
||||
);
|
||||
|
||||
/*!
|
||||
* Free an imported semaphore surface.
|
||||
*
|
||||
* \param [in] device The device passed to
|
||||
* importSemaphoreSurface() when creating
|
||||
* semaphoreSurface.
|
||||
*
|
||||
* \param [in] semaphoreSurface A semaphore surface returned by
|
||||
* importSemaphoreSurface().
|
||||
*/
|
||||
void (*freeSemaphoreSurface)
|
||||
(
|
||||
struct NvKmsKapiDevice *device,
|
||||
struct NvKmsKapiSemaphoreSurface *semaphoreSurface
|
||||
);
|
||||
|
||||
/*!
|
||||
* Register a callback to be called when a semaphore reaches a value.
|
||||
*
|
||||
* The callback will be called when the semaphore at index in
|
||||
* semaphoreSurface reaches the value wait_value. The callback will
|
||||
* be called at most once and is automatically unregistered when called.
|
||||
* It may also be unregistered (i.e., cancelled) explicitly using the
|
||||
* unregisterSemaphoreSurfaceCallback() function. To avoid leaking the
|
||||
* memory used to track the registered callback, callers must ensure one
|
||||
* of these methods of unregistration is used for every successful
|
||||
* callback registration that returns a non-NULL pCallbackHandle.
|
||||
*
|
||||
* \param [in] device The device passed to
|
||||
* importSemaphoreSurface() when creating
|
||||
* semaphoreSurface.
|
||||
*
|
||||
* \param [in] semaphoreSurface A semaphore surface returned by
|
||||
* importSemaphoreSurface().
|
||||
*
|
||||
* \param [in] pCallback A pointer to the function to call when
|
||||
* the specified value is reached. NULL
|
||||
* means no callback.
|
||||
*
|
||||
* \param [in] pData Arbitrary data to be passed back to the
|
||||
* callback as its sole parameter.
|
||||
*
|
||||
* \param [in] index The index of the semaphore within
|
||||
* semaphoreSurface.
|
||||
*
|
||||
* \param [in] wait_value The value the semaphore must reach or
|
||||
* exceed before the callback is called.
|
||||
*
|
||||
* \param [in] new_value The value the semaphore will be set to
|
||||
* when it reaches or exceeds <wait_value>.
|
||||
* 0 means do not update the value.
|
||||
*
|
||||
* \param [out] pCallbackHandle On success, the value pointed to will
|
||||
* contain an opaque handle to the
|
||||
* registered callback that may be used to
|
||||
* cancel it if needed. Unused if pCallback
|
||||
* is NULL.
|
||||
*
|
||||
* \return NVKMS_KAPI_REG_WAITER_SUCCESS if the waiter was registered or if
|
||||
* no callback was requested and the semaphore at <index> has
|
||||
* already reached or exceeded <wait_value>
|
||||
*
|
||||
* NVKMS_KAPI_REG_WAITER_ALREADY_SIGNALLED if a callback was
|
||||
* requested and the semaphore at <index> has already reached or
|
||||
* exceeded <wait_value>
|
||||
*
|
||||
* NVKMS_KAPI_REG_WAITER_FAILED if waiter registration failed.
|
||||
*/
|
||||
NvKmsKapiRegisterWaiterResult
|
||||
(*registerSemaphoreSurfaceCallback)
|
||||
(
|
||||
struct NvKmsKapiDevice *device,
|
||||
struct NvKmsKapiSemaphoreSurface *semaphoreSurface,
|
||||
NvKmsSemaphoreSurfaceCallbackProc *pCallback,
|
||||
void *pData,
|
||||
NvU64 index,
|
||||
NvU64 wait_value,
|
||||
NvU64 new_value,
|
||||
struct NvKmsKapiSemaphoreSurfaceCallback **pCallbackHandle
|
||||
);
|
||||
|
||||
/*!
|
||||
* Unregister a callback registered via registerSemaphoreSurfaceCallback()
|
||||
*
|
||||
* If the callback has not yet been called, this function will cancel the
|
||||
* callback and free its associated resources.
|
||||
*
|
||||
* Note this function treats the callback handle as a pointer. While this
|
||||
* function does not dereference that pointer itself, the underlying call
|
||||
* to RM does within a properly guarded critical section that first ensures
|
||||
* it is not in the process of being used within a callback. This means
|
||||
* the callstack must take into consideration that pointers are not in
|
||||
* general unique handles if they may have been freed, since a subsequent
|
||||
* malloc could return the same pointer value at that point. This callchain
|
||||
* avoids that by leveraging the behavior of the underlying RM APIs:
|
||||
*
|
||||
* 1) A callback handle is referenced relative to its corresponding
|
||||
* (semaphore surface, index, wait_value) tuple here and within RM. It
|
||||
* is not a valid handle outside of that scope.
|
||||
*
|
||||
* 2) A callback can not be registered against an already-reached value
|
||||
* for a given semaphore surface index.
|
||||
*
|
||||
* 3) A given callback handle can not be registered twice against the same
|
||||
* (semaphore surface, index, wait_value) tuple, so unregistration will
|
||||
* never race with registration at the RM level, and would only race at
|
||||
* a higher level if used incorrectly. Since this is kernel code, we
|
||||
* can safely assume there won't be malicious clients purposely misusing
|
||||
* the API, but the burden is placed on the caller to ensure its usage
|
||||
* does not lead to races at higher levels.
|
||||
*
|
||||
* These factors considered together ensure any valid registered handle is
|
||||
* either still in the relevant waiter list and refers to the same event/
|
||||
* callback as when it was registered, or has been removed from the list
|
||||
* as part of a critical section that also destroys the list itself and
|
||||
* makes future lookups in that list impossible, and hence eliminates the
|
||||
* chance of comparing a stale handle with a new handle of the same value
|
||||
* as part of a lookup.
|
||||
*
|
||||
* \param [in] device The device passed to
|
||||
* importSemaphoreSurface() when creating
|
||||
* semaphoreSurface.
|
||||
*
|
||||
* \param [in] semaphoreSurface The semaphore surface passed to
|
||||
* registerSemaphoreSurfaceCallback() when
|
||||
* registering the callback.
|
||||
*
|
||||
* \param [in] index The index passed to
|
||||
* registerSemaphoreSurfaceCallback() when
|
||||
* registering the callback.
|
||||
*
|
||||
* \param [in] wait_value The wait_value passed to
|
||||
* registerSemaphoreSurfaceCallback() when
|
||||
* registering the callback.
|
||||
*
|
||||
* \param [in] callbackHandle The callback handle returned by
|
||||
* registerSemaphoreSurfaceCallback().
|
||||
*/
|
||||
NvBool
|
||||
(*unregisterSemaphoreSurfaceCallback)
|
||||
(
|
||||
struct NvKmsKapiDevice *device,
|
||||
struct NvKmsKapiSemaphoreSurface *semaphoreSurface,
|
||||
NvU64 index,
|
||||
NvU64 wait_value,
|
||||
struct NvKmsKapiSemaphoreSurfaceCallback *callbackHandle
|
||||
);
|
||||
|
||||
/*!
|
||||
* Update the value of a semaphore surface from the CPU.
|
||||
*
|
||||
* Update the semaphore value at the specified index from the CPU, then
|
||||
* wake up any pending CPU waiters associated with that index that are
|
||||
* waiting on it reaching a value <= the new value.
|
||||
*/
|
||||
NvBool
|
||||
(*setSemaphoreSurfaceValue)
|
||||
(
|
||||
struct NvKmsKapiDevice *device,
|
||||
struct NvKmsKapiSemaphoreSurface *semaphoreSurface,
|
||||
NvU64 index,
|
||||
NvU64 new_value
|
||||
);
|
||||
|
||||
/*!
|
||||
* Set the callback function for suspending and resuming the display system.
|
||||
*/
|
||||
void
|
||||
(*setSuspendResumeCallback)
|
||||
(
|
||||
NvKmsKapiSuspendResumeCallbackFunc *function
|
||||
);
|
||||
};
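The return-value rules documented for registerSemaphoreSurfaceCallback() in the table above can be restated as a small decision function. This is a toy model of the comment only, not the driver's implementation: the enum is copied from the hunk, both function names are hypothetical, and NVKMS_KAPI_REG_WAITER_FAILED is deliberately not modeled because the diff does not describe its causes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Copied from the hunk that introduces the result type. */
typedef enum NvKmsKapiRegisterWaiterResultRec {
    NVKMS_KAPI_REG_WAITER_FAILED,
    NVKMS_KAPI_REG_WAITER_SUCCESS,
    NVKMS_KAPI_REG_WAITER_ALREADY_SIGNALLED,
} NvKmsKapiRegisterWaiterResult;

/*
 * Toy model of the documented results:
 *  - value already reached and a callback was requested -> ALREADY_SIGNALLED
 *  - value already reached and no callback requested    -> SUCCESS
 *  - value not yet reached                              -> SUCCESS (registered)
 */
static NvKmsKapiRegisterWaiterResult
model_register_result(uint64_t current_value, uint64_t wait_value,
                      bool callback_requested)
{
    bool reached = current_value >= wait_value;

    if (reached && callback_requested) {
        return NVKMS_KAPI_REG_WAITER_ALREADY_SIGNALLED;
    }
    return NVKMS_KAPI_REG_WAITER_SUCCESS;
}

/*
 * Per the comment, only a successful registration with a callback leaves a
 * tracked handle that must either fire or be unregistered explicitly.
 */
static bool model_callback_is_tracked(NvKmsKapiRegisterWaiterResult result,
                                      bool callback_requested)
{
    return result == NVKMS_KAPI_REG_WAITER_SUCCESS && callback_requested;
}
```

A caller following this model would treat ALREADY_SIGNALLED as "nothing was registered" and would have no handle to unregister in that case.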
|
||||
|
||||
/** @} */
|
||||
|
||||
@@ -919,9 +919,6 @@ static NV_FORCEINLINE void *NV_NVUPTR_TO_PTR(NvUPtr address)
|
||||
//
|
||||
#define NV_BIT_SET_128(b, lo, hi) { nvAssert( (b) < 128 ); if ( (b) < 64 ) (lo) |= NVBIT64(b); else (hi) |= NVBIT64( b & 0x3F ); }
|
||||
|
||||
// Get the number of elements in the specified fixed-size array
|
||||
#define NV_ARRAY_ELEMENTS(x) ((sizeof(x)/sizeof((x)[0])))
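A short sketch of how the two macros in this hunk behave together is given below. The macro bodies are copied from the hunk; `nvAssert`, `NVBIT64`, and `NvU64` are local stand-ins (assumptions, since their real definitions are outside this diff), and `set_bits_128()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t NvU64;                    /* assumption: stand-in type */
#define nvAssert(x) assert(x)              /* assumption: stand-in for nvAssert */
#define NVBIT64(b)  (((NvU64)1) << (b))    /* assumption: stand-in for NVBIT64 */

/* Copied from the hunk above. Note that the final use of b is not
 * parenthesized, so pass a simple expression rather than e.g. x | y. */
#define NV_BIT_SET_128(b, lo, hi) { nvAssert( (b) < 128 ); if ( (b) < 64 ) (lo) |= NVBIT64(b); else (hi) |= NVBIT64( b & 0x3F ); }
#define NV_ARRAY_ELEMENTS(x) ((sizeof(x)/sizeof((x)[0])))

/* Hypothetical helper: set each listed bit in a 128-bit mask held as lo/hi. */
static void set_bits_128(const unsigned *bits, unsigned count,
                         NvU64 *lo, NvU64 *hi)
{
    for (unsigned i = 0; i < count; i++) {
        NV_BIT_SET_128(bits[i], *lo, *hi);
    }
}
```

For example, with `static const unsigned bits[] = {3, 70};` a call to `set_bits_128(bits, (unsigned)NV_ARRAY_ELEMENTS(bits), &lo, &hi)` sets bit 3 of `lo` and bit 6 of `hi` (70 & 0x3F is 6). `NV_ARRAY_ELEMENTS` only gives the element count for true arrays; applied to a pointer it silently divides the pointer size instead.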
|
||||
|
||||
#ifdef __cplusplus
|
||||
}
|
||||
#endif //__cplusplus
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*
|
||||
* SPDX-FileCopyrightText: Copyright (c) 2014-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-FileCopyrightText: Copyright (c) 2014-2020 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-License-Identifier: MIT
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
@@ -150,8 +150,6 @@ NV_STATUS_CODE(NV_ERR_NVLINK_CONFIGURATION_ERROR, 0x00000078, "Nvlink Confi
|
||||
NV_STATUS_CODE(NV_ERR_RISCV_ERROR, 0x00000079, "Generic RISC-V assert or halt")
|
||||
NV_STATUS_CODE(NV_ERR_FABRIC_MANAGER_NOT_PRESENT, 0x0000007A, "Fabric Manager is not loaded")
|
||||
NV_STATUS_CODE(NV_ERR_ALREADY_SIGNALLED, 0x0000007B, "Semaphore Surface value already >= requested wait value")
|
||||
NV_STATUS_CODE(NV_ERR_QUEUE_TASK_SLOT_NOT_AVAILABLE, 0x0000007C, "PMU RPC error due to no queue slot available for this event")
|
||||
NV_STATUS_CODE(NV_ERR_KEY_ROTATION_IN_PROGRESS, 0x0000007D, "Operation not allowed as key rotation is in progress")
|
||||
|
||||
// Warnings:
|
||||
NV_STATUS_CODE(NV_WARN_HOT_SWITCH, 0x00010001, "WARNING Hot switch")
|
||||
|
||||
@@ -145,12 +145,7 @@ typedef signed short NvS16; /* -32768 to 32767 */
|
||||
#endif
|
||||
|
||||
// Macro to build an NvU32 from four bytes, listed from msb to lsb
|
||||
#define NvU32_BUILD(a, b, c, d) \
|
||||
((NvU32)( \
|
||||
(((NvU32)(a) & 0xff) << 24) | \
|
||||
(((NvU32)(b) & 0xff) << 16) | \
|
||||
(((NvU32)(c) & 0xff) << 8) | \
|
||||
(((NvU32)(d) & 0xff))))
|
||||
#define NvU32_BUILD(a, b, c, d) (((a) << 24) | ((b) << 16) | ((c) << 8) | (d))
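The two NvU32_BUILD definitions in this hunk differ only when an argument does not fit in one byte. The sketch below places them side by side under different names so the difference can be observed. The macro bodies are copied from the hunk; the `_MASKED`/`_UNMASKED` suffixes, the wrapper functions, and the local `NvU32` typedef are assumptions added for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t NvU32;  /* assumption: stand-in for the driver's NvU32 */

/* Multi-line form from the hunk: each argument is cast and masked to 8 bits. */
#define NvU32_BUILD_MASKED(a, b, c, d) \
    ((NvU32)(                          \
        (((NvU32)(a) & 0xff) << 24) |  \
        (((NvU32)(b) & 0xff) << 16) |  \
        (((NvU32)(c) & 0xff) << 8)  |  \
        (((NvU32)(d) & 0xff))))

/* Single-line form from the hunk: no cast and no mask. */
#define NvU32_BUILD_UNMASKED(a, b, c, d) (((a) << 24) | ((b) << 16) | ((c) << 8) | (d))

/* Wrappers take unsigned arguments so the unmasked shifts stay well defined. */
static NvU32 build_masked(NvU32 a, NvU32 b, NvU32 c, NvU32 d)
{
    return NvU32_BUILD_MASKED(a, b, c, d);
}

static NvU32 build_unmasked(NvU32 a, NvU32 b, NvU32 c, NvU32 d)
{
    return NvU32_BUILD_UNMASKED(a, b, c, d);
}
```

With in-range bytes both forms agree, for instance (0xDE, 0xAD, 0xBE, 0xEF) gives 0xDEADBEEF. With an out-of-range last argument such as 0x1FF, the masked form yields 0x123456FF for (0x12, 0x34, 0x56, 0x1FF), while the unmasked form lets the extra bit leak into the next byte and yields 0x123457FF. The unmasked form also shifts its arguments in their own type, so a plain `int` argument of 0x80 or more in the first position would overflow a signed shift.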
|
||||
|
||||
#if NVTYPES_USE_STDINT
|
||||
typedef uint32_t NvV32; /* "void": enumerated or multiple fields */
|
||||
|
||||
@@ -67,6 +67,7 @@ typedef struct os_wait_queue os_wait_queue;
|
||||
* ---------------------------------------------------------------------------
|
||||
*/
|
||||
|
||||
NvU64 NV_API_CALL os_get_num_phys_pages (void);
|
||||
NV_STATUS NV_API_CALL os_alloc_mem (void **, NvU64);
|
||||
void NV_API_CALL os_free_mem (void *);
|
||||
NV_STATUS NV_API_CALL os_get_current_time (NvU32 *, NvU32 *);
|
||||
@@ -104,6 +105,7 @@ void* NV_API_CALL os_map_kernel_space (NvU64, NvU64, NvU32);
|
||||
void NV_API_CALL os_unmap_kernel_space (void *, NvU64);
|
||||
void* NV_API_CALL os_map_user_space (NvU64, NvU64, NvU32, NvU32, void **);
|
||||
void NV_API_CALL os_unmap_user_space (void *, NvU64, void *);
|
||||
NV_STATUS NV_API_CALL os_flush_cpu_cache (void);
|
||||
NV_STATUS NV_API_CALL os_flush_cpu_cache_all (void);
|
||||
NV_STATUS NV_API_CALL os_flush_user_cache (void);
|
||||
void NV_API_CALL os_flush_cpu_write_combine_buffer(void);
|
||||
@@ -197,8 +199,6 @@ nv_cap_t* NV_API_CALL os_nv_cap_create_file_entry (nv_cap_t *, const char *,
|
||||
void NV_API_CALL os_nv_cap_destroy_entry (nv_cap_t *);
|
||||
int NV_API_CALL os_nv_cap_validate_and_dup_fd(const nv_cap_t *, int);
|
||||
void NV_API_CALL os_nv_cap_close_fd (int);
|
||||
NvS32 NV_API_CALL os_imex_channel_get (NvU64);
|
||||
NvS32 NV_API_CALL os_imex_channel_count (void);
|
||||
|
||||
enum os_pci_req_atomics_type {
|
||||
OS_INTF_PCIE_REQ_ATOMICS_32BIT,
|
||||
@@ -220,7 +220,6 @@ extern NvU8 os_page_shift;
|
||||
extern NvBool os_cc_enabled;
|
||||
extern NvBool os_cc_tdx_enabled;
|
||||
extern NvBool os_dma_buf_enabled;
|
||||
extern NvBool os_imex_channel_is_supported;
|
||||
|
||||
/*
|
||||
* ---------------------------------------------------------------------------
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*
|
||||
* SPDX-FileCopyrightText: Copyright (c) 1999-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-FileCopyrightText: Copyright (c) 1999-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-License-Identifier: MIT
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
@@ -37,7 +37,7 @@ NV_STATUS NV_API_CALL rm_gpu_ops_create_session (nvidia_stack_t *, nvgpuSessio
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_destroy_session (nvidia_stack_t *, nvgpuSessionHandle_t);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_device_create (nvidia_stack_t *, nvgpuSessionHandle_t, const nvgpuInfo_t *, const NvProcessorUuid *, nvgpuDeviceHandle_t *, NvBool);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_device_destroy (nvidia_stack_t *, nvgpuDeviceHandle_t);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_address_space_create(nvidia_stack_t *, nvgpuDeviceHandle_t, unsigned long long, unsigned long long, NvBool, nvgpuAddressSpaceHandle_t *, nvgpuAddressSpaceInfo_t);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_address_space_create(nvidia_stack_t *, nvgpuDeviceHandle_t, unsigned long long, unsigned long long, nvgpuAddressSpaceHandle_t *, nvgpuAddressSpaceInfo_t);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_dup_address_space(nvidia_stack_t *, nvgpuDeviceHandle_t, NvHandle, NvHandle, nvgpuAddressSpaceHandle_t *, nvgpuAddressSpaceInfo_t);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_address_space_destroy(nvidia_stack_t *, nvgpuAddressSpaceHandle_t);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_memory_alloc_fb(nvidia_stack_t *, nvgpuAddressSpaceHandle_t, NvLength, NvU64 *, nvgpuAllocInfo_t);
|
||||
@@ -45,6 +45,7 @@ NV_STATUS NV_API_CALL rm_gpu_ops_memory_alloc_fb(nvidia_stack_t *, nvgpuAddres
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_pma_alloc_pages(nvidia_stack_t *, void *, NvLength, NvU32 , nvgpuPmaAllocationOptions_t, NvU64 *);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_pma_free_pages(nvidia_stack_t *, void *, NvU64 *, NvLength , NvU32, NvU32);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_pma_pin_pages(nvidia_stack_t *, void *, NvU64 *, NvLength , NvU32, NvU32);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_pma_unpin_pages(nvidia_stack_t *, void *, NvU64 *, NvLength , NvU32);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_get_pma_object(nvidia_stack_t *, nvgpuDeviceHandle_t, void **, const nvgpuPmaStatistics_t *);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_pma_register_callbacks(nvidia_stack_t *sp, void *, nvPmaEvictPagesCallback, nvPmaEvictRangeCallback, void *);
|
||||
void NV_API_CALL rm_gpu_ops_pma_unregister_callbacks(nvidia_stack_t *sp, void *);
|
||||
@@ -75,8 +76,7 @@ NV_STATUS NV_API_CALL rm_gpu_ops_own_page_fault_intr(nvidia_stack_t *, nvgpuDevi
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_init_fault_info(nvidia_stack_t *, nvgpuDeviceHandle_t, nvgpuFaultInfo_t);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_destroy_fault_info(nvidia_stack_t *, nvgpuDeviceHandle_t, nvgpuFaultInfo_t);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_get_non_replayable_faults(nvidia_stack_t *, nvgpuFaultInfo_t, void *, NvU32 *);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_flush_replayable_fault_buffer(nvidia_stack_t *, nvgpuFaultInfo_t, NvBool);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_toggle_prefetch_faults(nvidia_stack_t *, nvgpuFaultInfo_t, NvBool);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_flush_replayable_fault_buffer(nvidia_stack_t *, nvgpuDeviceHandle_t);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_has_pending_non_replayable_faults(nvidia_stack_t *, nvgpuFaultInfo_t, NvBool *);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_init_access_cntr_info(nvidia_stack_t *, nvgpuDeviceHandle_t, nvgpuAccessCntrInfo_t, NvU32);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_destroy_access_cntr_info(nvidia_stack_t *, nvgpuDeviceHandle_t, nvgpuAccessCntrInfo_t);
|
||||
@@ -103,14 +103,12 @@ NV_STATUS NV_API_CALL rm_gpu_ops_paging_channel_push_stream(nvidia_stack_t *, n
|
||||
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_ccsl_context_init(nvidia_stack_t *, struct ccslContext_t **, nvgpuChannelHandle_t);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_ccsl_context_clear(nvidia_stack_t *, struct ccslContext_t *);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_ccsl_rotate_key(nvidia_stack_t *, UvmCslContext *[], NvU32);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_ccsl_rotate_iv(nvidia_stack_t *, struct ccslContext_t *, NvU8);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_ccsl_encrypt(nvidia_stack_t *, struct ccslContext_t *, NvU32, NvU8 const *, NvU8 *, NvU8 *);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_ccsl_encrypt_with_iv(nvidia_stack_t *, struct ccslContext_t *, NvU32, NvU8 const *, NvU8*, NvU8 *, NvU8 *);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_ccsl_decrypt(nvidia_stack_t *, struct ccslContext_t *, NvU32, NvU8 const *, NvU8 const *, NvU32, NvU8 *, NvU8 const *, NvU32, NvU8 const *);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_ccsl_decrypt(nvidia_stack_t *, struct ccslContext_t *, NvU32, NvU8 const *, NvU8 const *, NvU8 *, NvU8 const *, NvU32, NvU8 const *);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_ccsl_sign(nvidia_stack_t *, struct ccslContext_t *, NvU32, NvU8 const *, NvU8 *);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_ccsl_query_message_pool(nvidia_stack_t *, struct ccslContext_t *, NvU8, NvU64 *);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_ccsl_increment_iv(nvidia_stack_t *, struct ccslContext_t *, NvU8, NvU64, NvU8 *);
|
||||
NV_STATUS NV_API_CALL rm_gpu_ops_ccsl_log_encryption(nvidia_stack_t *, struct ccslContext_t *, NvU8, NvU32);
|
||||
|
||||
#endif
|
||||
|
||||
File diff suppressed because it is too large
@@ -1,101 +0,0 @@
|
||||
# Each of these headers is checked for presence with a test #include; a
|
||||
# corresponding #define will be generated in conftest/headers.h.
|
||||
NV_HEADER_PRESENCE_TESTS = \
|
||||
asm/system.h \
|
||||
drm/drmP.h \
|
||||
drm/drm_aperture.h \
|
||||
drm/drm_auth.h \
|
||||
drm/drm_gem.h \
|
||||
drm/drm_crtc.h \
|
||||
drm/drm_color_mgmt.h \
|
||||
drm/drm_atomic.h \
|
||||
drm/drm_atomic_helper.h \
|
||||
drm/drm_atomic_state_helper.h \
|
||||
drm/drm_encoder.h \
|
||||
drm/drm_atomic_uapi.h \
|
||||
drm/drm_drv.h \
|
||||
drm/drm_fbdev_generic.h \
|
||||
drm/drm_framebuffer.h \
|
||||
drm/drm_connector.h \
|
||||
drm/drm_probe_helper.h \
|
||||
drm/drm_blend.h \
|
||||
drm/drm_fourcc.h \
|
||||
drm/drm_prime.h \
|
||||
drm/drm_plane.h \
|
||||
drm/drm_vblank.h \
|
||||
drm/drm_file.h \
|
||||
drm/drm_ioctl.h \
|
||||
drm/drm_device.h \
|
||||
drm/drm_mode_config.h \
|
||||
drm/drm_modeset_lock.h \
|
||||
dt-bindings/interconnect/tegra_icc_id.h \
|
||||
generated/autoconf.h \
|
||||
generated/compile.h \
|
||||
generated/utsrelease.h \
|
||||
linux/efi.h \
|
||||
linux/kconfig.h \
|
||||
linux/platform/tegra/mc_utils.h \
|
||||
linux/printk.h \
|
||||
linux/ratelimit.h \
|
||||
linux/prio_tree.h \
|
||||
linux/log2.h \
|
||||
linux/of.h \
|
||||
linux/bug.h \
|
||||
linux/sched.h \
|
||||
linux/sched/mm.h \
|
||||
linux/sched/signal.h \
|
||||
linux/sched/task.h \
|
||||
linux/sched/task_stack.h \
|
||||
xen/ioemu.h \
|
||||
linux/fence.h \
|
||||
linux/dma-fence.h \
|
||||
linux/dma-resv.h \
|
||||
soc/tegra/chip-id.h \
|
||||
soc/tegra/fuse.h \
|
||||
soc/tegra/tegra_bpmp.h \
|
||||
video/nv_internal.h \
|
||||
linux/platform/tegra/dce/dce-client-ipc.h \
|
||||
linux/nvhost.h \
|
||||
linux/nvhost_t194.h \
|
||||
linux/host1x-next.h \
|
||||
asm/book3s/64/hash-64k.h \
|
||||
asm/set_memory.h \
|
||||
asm/prom.h \
|
||||
asm/powernv.h \
|
||||
linux/atomic.h \
|
||||
asm/barrier.h \
|
||||
asm/opal-api.h \
|
||||
sound/hdaudio.h \
|
||||
asm/pgtable_types.h \
|
||||
asm/page.h \
|
||||
linux/stringhash.h \
|
||||
linux/dma-map-ops.h \
|
||||
rdma/peer_mem.h \
|
||||
sound/hda_codec.h \
|
||||
linux/dma-buf.h \
|
||||
linux/time.h \
|
||||
linux/platform_device.h \
|
||||
linux/mutex.h \
|
||||
linux/reset.h \
|
||||
linux/of_platform.h \
|
||||
linux/of_device.h \
|
||||
linux/of_gpio.h \
|
||||
linux/gpio.h \
|
||||
linux/gpio/consumer.h \
|
||||
linux/interconnect.h \
|
||||
linux/pm_runtime.h \
|
||||
linux/clk.h \
|
||||
linux/clk-provider.h \
|
||||
linux/ioasid.h \
|
||||
linux/stdarg.h \
|
||||
linux/iosys-map.h \
|
||||
asm/coco.h \
|
||||
linux/vfio_pci_core.h \
|
||||
linux/mdev.h \
|
||||
soc/tegra/bpmp-abi.h \
|
||||
soc/tegra/bpmp.h \
|
||||
linux/sync_file.h \
|
||||
linux/cc_platform.h \
|
||||
asm/cpufeature.h \
|
||||
linux/mpi.h
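The list above says each header gets a corresponding #define in conftest/headers.h, and elsewhere in this diff `linux/bug.h` is tested via `NV_LINUX_BUG_H_PRESENT`. The sketch below derives a macro name from a header path in a way consistent with that one example: uppercase the path, turn every non-alphanumeric character into an underscore, and wrap it in `NV_` and `_PRESENT`. The exact rule used by the build scripts is an inference, and `header_presence_macro()` is a hypothetical helper.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "NV_<HEADER>_PRESENT" into out.
 * Returns 0 on success, -1 if out_size is too small.
 */
static int header_presence_macro(const char *header, char *out, size_t out_size)
{
    size_t len = strlen(header);
    int n = snprintf(out, out_size, "NV_%s_PRESENT", header);

    if (n < 0 || (size_t)n >= out_size) {
        return -1;  /* formatting error or truncated output */
    }

    /* Rewrite only the header portion, which starts after the "NV_" prefix. */
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)out[3 + i];
        out[3 + i] = isalnum(c) ? (char)toupper(c) : '_';
    }
    return 0;
}
```

Under this assumed rule, `linux/sched/mm.h` would map to `NV_LINUX_SCHED_MM_H_PRESENT`, which driver code could then test with `#if defined(...)` before including the header.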
|
||||
|
||||
@@ -1,334 +0,0 @@
|
||||
/*
|
||||
* SPDX-FileCopyrightText: Copyright (c) 2016 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-License-Identifier: MIT
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
* to deal in the Software without restriction, including without limitation
|
||||
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
||||
* and/or sell copies of the Software, and to permit persons to whom the
|
||||
* Software is furnished to do so, subject to the following conditions:
|
||||
*
|
||||
* The above copyright notice and this permission notice shall be included in
|
||||
* all copies or substantial portions of the Software.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
||||
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
||||
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||
* DEALINGS IN THE SOFTWARE.
|
||||
*/
|
||||
|
||||
#include "nv-kthread-q.h"
|
||||
#include "nv-list-helpers.h"
|
||||
|
||||
#include <linux/kthread.h>
|
||||
#include <linux/interrupt.h>
|
||||
#include <linux/completion.h>
|
||||
#include <linux/module.h>
|
||||
#include <linux/mm.h>
|
||||
|
||||
#if defined(NV_LINUX_BUG_H_PRESENT)
|
||||
#include <linux/bug.h>
|
||||
#else
|
||||
#include <asm/bug.h>
|
||||
#endif
|
||||
|
||||
// Today's implementation is a little simpler and more limited than the
|
||||
// API description allows for in nv-kthread-q.h. Details include:
|
||||
//
|
||||
// 1. Each nv_kthread_q instance is a first-in, first-out queue.
|
||||
//
|
||||
// 2. Each nv_kthread_q instance is serviced by exactly one kthread.
|
||||
//
|
||||
// You can create any number of queues, each of which gets its own
|
||||
// named kernel thread (kthread). You can then insert arbitrary functions
|
||||
// into the queue, and those functions will be run in the context of the
|
||||
// queue's kthread.
|
||||
|
||||
#ifndef WARN
|
||||
// Only *really* old kernels (2.6.9) end up here. Just use a simple printk
|
||||
// to implement this, because such kernels won't be supported much longer.
|
||||
#define WARN(condition, format...) ({ \
|
||||
int __ret_warn_on = !!(condition); \
|
||||
if (unlikely(__ret_warn_on)) \
|
||||
printk(KERN_ERR format); \
|
||||
unlikely(__ret_warn_on); \
|
||||
})
|
||||
#endif
|
||||
|
||||
#define NVQ_WARN(fmt, ...) \
|
||||
do { \
|
||||
if (in_interrupt()) { \
|
||||
WARN(1, "nv_kthread_q: [in interrupt]: " fmt, \
|
||||
##__VA_ARGS__); \
|
||||
} \
|
||||
else { \
|
||||
WARN(1, "nv_kthread_q: task: %s: " fmt, \
|
||||
current->comm, \
|
||||
##__VA_ARGS__); \
|
||||
} \
|
||||
} while (0)
|
||||
|
||||
static int _main_loop(void *args)
|
||||
{
|
||||
nv_kthread_q_t *q = (nv_kthread_q_t *)args;
|
||||
nv_kthread_q_item_t *q_item = NULL;
|
||||
unsigned long flags;
|
||||
|
||||
while (1) {
|
||||
// Normally this thread is never interrupted. However,
|
||||
// down_interruptible (instead of down) is called here,
|
||||
// in order to avoid being classified as a potentially
|
||||
// hung task, by the kernel watchdog.
|
||||
while (down_interruptible(&q->q_sem))
|
||||
NVQ_WARN("Interrupted during semaphore wait\n");
|
||||
|
||||
if (atomic_read(&q->main_loop_should_exit))
|
||||
break;
|
||||
|
||||
spin_lock_irqsave(&q->q_lock, flags);
|
||||
|
||||
// The q_sem semaphore prevents us from getting here unless there is
|
||||
// at least one item in the list, so an empty list indicates a bug.
|
||||
if (unlikely(list_empty(&q->q_list_head))) {
|
||||
spin_unlock_irqrestore(&q->q_lock, flags);
|
||||
NVQ_WARN("_main_loop: Empty queue: q: 0x%p\n", q);
|
||||
continue;
|
||||
}
|
||||
|
||||
// Consume one item from the queue
|
||||
q_item = list_first_entry(&q->q_list_head,
|
||||
nv_kthread_q_item_t,
|
||||
q_list_node);
|
||||
|
||||
list_del_init(&q_item->q_list_node);
|
||||
|
||||
spin_unlock_irqrestore(&q->q_lock, flags);
|
||||
|
||||
// Run the item
|
||||
q_item->function_to_run(q_item->function_args);
|
||||
|
||||
// Make debugging a little simpler by clearing this between runs:
|
||||
q_item = NULL;
|
||||
}
|
||||
|
||||
while (!kthread_should_stop())
|
||||
schedule();
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
void nv_kthread_q_stop(nv_kthread_q_t *q)
|
||||
{
|
||||
// check if queue has been properly initialized
|
||||
if (unlikely(!q->q_kthread))
|
||||
return;
|
||||
|
||||
nv_kthread_q_flush(q);
|
||||
|
||||
// If this assertion fires, then a caller likely either broke the API rules,
|
||||
// by adding items after calling nv_kthread_q_stop, or possibly messed up
|
||||
// with inadequate flushing of self-rescheduling q_items.
|
||||
if (unlikely(!list_empty(&q->q_list_head)))
|
||||
NVQ_WARN("list not empty after flushing\n");
|
||||
|
||||
if (likely(!atomic_read(&q->main_loop_should_exit))) {
|
||||
|
||||
atomic_set(&q->main_loop_should_exit, 1);
|
||||
|
||||
// Wake up the kthread so that it can see that it needs to stop:
|
||||
up(&q->q_sem);
|
||||
|
||||
kthread_stop(q->q_kthread);
|
||||
q->q_kthread = NULL;
|
||||
}
|
||||
}
|
||||
|
||||
// When CONFIG_VMAP_STACK is defined, the kernel thread stack allocator used by
|
||||
// kthread_create_on_node relies on a 2 entry, per-core cache to minimize
|
||||
// vmalloc invocations. The cache is NUMA-unaware, so when there is a hit, the
|
||||
// stack location ends up being a function of the core assigned to the current
|
||||
// thread, instead of being a function of the specified NUMA node. The cache was
|
||||
// added to the kernel in commit ac496bf48d97f2503eaa353996a4dd5e4383eaf0
|
||||
// ("fork: Optimize task creation by caching two thread stacks per CPU if
|
||||
// CONFIG_VMAP_STACK=y")
|
||||
//
|
||||
// To work around the problematic cache, we create up to three kernel threads
|
||||
// -If the first thread's stack is resident on the preferred node, return this
|
||||
// thread.
|
||||
// -Otherwise, create a second thread. If its stack is resident on the
|
||||
// preferred node, stop the first thread and return this one.
|
||||
// -Otherwise, create a third thread. The stack allocator does not find a
|
||||
// cached stack, and so falls back to vmalloc, which takes the NUMA hint into
|
||||
// consideration. The first two threads are then stopped.
|
||||
//
|
||||
// When CONFIG_VMAP_STACK is not defined, the first kernel thread is returned.
|
||||
//
|
||||
// This function is never invoked when there is no NUMA preference (preferred
|
||||
// node is NUMA_NO_NODE).
|
||||
static struct task_struct *thread_create_on_node(int (*threadfn)(void *data),
|
||||
nv_kthread_q_t *q,
|
||||
int preferred_node,
|
||||
const char *q_name)
|
||||
{
|
||||
|
||||
unsigned i, j;
|
||||
const static unsigned attempts = 3;
|
||||
struct task_struct *thread[3];
|
||||
|
||||
for (i = 0;; i++) {
|
||||
struct page *stack;
|
||||
|
||||
thread[i] = kthread_create_on_node(threadfn, q, preferred_node, q_name);
|
||||
|
||||
if (unlikely(IS_ERR(thread[i]))) {
|
||||
|
||||
// Instead of failing, pick the previous thread, even if its
|
||||
// stack is not allocated on the preferred node.
|
||||
if (i > 0)
|
||||
i--;
|
||||
|
||||
break;
|
||||
}
|
||||
|
||||
// vmalloc is not used to allocate the stack, so simply return the
|
||||
// thread, even if its stack may not be allocated on the preferred node
|
||||
if (!is_vmalloc_addr(thread[i]->stack))
|
||||
break;
|
||||
|
||||
// Ran out of attempts - return thread even if its stack may not be
|
||||
// allocated on the preferred node
|
||||
if (i == (attempts - 1))
|
||||
break;
|
||||
|
||||
// Get the NUMA node where the first page of the stack is resident. If
|
||||
// it is the preferred node, select this thread.
|
||||
stack = vmalloc_to_page(thread[i]->stack);
|
||||
if (page_to_nid(stack) == preferred_node)
|
||||
break;
|
||||
}
|
||||
|
||||
for (j = i; j > 0; j--)
|
||||
kthread_stop(thread[j - 1]);
|
||||
|
||||
return thread[i];
|
||||
}
|
||||
|
||||
int nv_kthread_q_init_on_node(nv_kthread_q_t *q, const char *q_name, int preferred_node)
|
||||
{
|
||||
memset(q, 0, sizeof(*q));
|
||||
|
||||
INIT_LIST_HEAD(&q->q_list_head);
|
||||
spin_lock_init(&q->q_lock);
|
||||
sema_init(&q->q_sem, 0);
|
||||
|
||||
if (preferred_node == NV_KTHREAD_NO_NODE) {
|
||||
q->q_kthread = kthread_create(_main_loop, q, q_name);
|
||||
}
|
||||
else {
|
||||
q->q_kthread = thread_create_on_node(_main_loop, q, preferred_node, q_name);
|
||||
}
|
||||
|
||||
if (IS_ERR(q->q_kthread)) {
|
||||
int err = PTR_ERR(q->q_kthread);
|
||||
|
||||
// Clear q_kthread before returning so that nv_kthread_q_stop() can be
|
||||
// safely called on it making error handling easier.
|
||||
q->q_kthread = NULL;
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
wake_up_process(q->q_kthread);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
int nv_kthread_q_init(nv_kthread_q_t *q, const char *qname)
|
||||
{
|
||||
return nv_kthread_q_init_on_node(q, qname, NV_KTHREAD_NO_NODE);
|
||||
}
|
||||
|
||||
// Returns true (non-zero) if the item was actually scheduled, and false if the
|
||||
// item was already pending in a queue.
|
||||
static int _raw_q_schedule(nv_kthread_q_t *q, nv_kthread_q_item_t *q_item)
|
||||
{
|
||||
unsigned long flags;
|
||||
int ret = 1;
|
||||
|
||||
spin_lock_irqsave(&q->q_lock, flags);
|
||||
|
||||
if (likely(list_empty(&q_item->q_list_node)))
|
||||
list_add_tail(&q_item->q_list_node, &q->q_list_head);
|
||||
else
|
||||
ret = 0;
|
||||
|
||||
spin_unlock_irqrestore(&q->q_lock, flags);
|
||||
|
||||
if (likely(ret))
|
||||
up(&q->q_sem);
|
||||
|
||||
return ret;
|
||||
}
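The "already pending" check in `_raw_q_schedule` relies on a list node that points to itself whenever it is not linked into any list. The following is a minimal user-space sketch of that idea, not the kernel implementation: every `sketch_*` name is hypothetical, and the spinlock and semaphore from the real function are deliberately left out.

```c
/* Hypothetical stand-in for the kernel's struct list_head. */
struct sketch_node {
    struct sketch_node *next;
    struct sketch_node *prev;
};

/* An unlinked node points to itself (what INIT_LIST_HEAD sets up). */
static void sketch_init(struct sketch_node *n)
{
    n->next = n;
    n->prev = n;
}

/* Non-zero when the node is not on any list. */
static int sketch_is_unlinked(const struct sketch_node *n)
{
    return n->next == n;
}

/* Link item in just before head, i.e. at the tail of the list. */
static void sketch_add_tail(struct sketch_node *item, struct sketch_node *head)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Unlink item and make it self-pointing again (like list_del_init). */
static void sketch_del_init(struct sketch_node *item)
{
    item->prev->next = item->next;
    item->next->prev = item->prev;
    sketch_init(item);
}

/* Returns 1 if the item was queued, 0 if it was already pending. */
static int sketch_schedule(struct sketch_node *head, struct sketch_node *item)
{
    if (!sketch_is_unlinked(item))
        return 0;

    sketch_add_tail(item, head);
    return 1;
}
```

Because the consumer side uses the re-initializing delete, an item becomes schedulable again as soon as it has been taken off the queue, which is what lets a work item reschedule itself from inside its own callback.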
|
||||
|
||||
void nv_kthread_q_item_init(nv_kthread_q_item_t *q_item,
|
||||
nv_q_func_t function_to_run,
|
||||
void *function_args)
|
||||
{
|
||||
INIT_LIST_HEAD(&q_item->q_list_node);
|
||||
q_item->function_to_run = function_to_run;
|
||||
q_item->function_args = function_args;
|
||||
}
|
||||
|
||||
// Returns true (non-zero) if the q_item got scheduled, false otherwise.
|
||||
int nv_kthread_q_schedule_q_item(nv_kthread_q_t *q,
|
||||
nv_kthread_q_item_t *q_item)
|
||||
{
|
||||
if (unlikely(atomic_read(&q->main_loop_should_exit))) {
|
||||
NVQ_WARN("Not allowed: nv_kthread_q_schedule_q_item was "
|
||||
"called with a non-alive q: 0x%p\n", q);
|
||||
return 0;
|
||||
}
|
||||
|
||||
return _raw_q_schedule(q, q_item);
|
||||
}
|
||||
|
||||
static void _q_flush_function(void *args)
|
||||
{
|
||||
struct completion *completion = (struct completion *)args;
|
||||
complete(completion);
|
||||
}
|
||||
|
||||
|
||||
static void _raw_q_flush(nv_kthread_q_t *q)
|
||||
{
|
||||
nv_kthread_q_item_t q_item;
|
||||
DECLARE_COMPLETION_ONSTACK(completion);
|
||||
|
||||
nv_kthread_q_item_init(&q_item, _q_flush_function, &completion);
|
||||
|
||||
_raw_q_schedule(q, &q_item);
|
||||
|
||||
// Wait for the flush item to run. Once it has run, then all of the
|
||||
// previously queued items in front of it will have run, so that means
|
||||
// the flush is complete.
|
||||
wait_for_completion(&completion);
|
||||
}
|
||||
|
||||
void nv_kthread_q_flush(nv_kthread_q_t *q)
|
||||
{
|
||||
if (unlikely(atomic_read(&q->main_loop_should_exit))) {
|
||||
NVQ_WARN("Not allowed: nv_kthread_q_flush was called after "
|
||||
"nv_kthread_q_stop. q: 0x%p\n", q);
|
||||
return;
|
||||
}
|
||||
|
||||
// This 2x flush is not a typing mistake. The queue really does have to be
|
||||
// flushed twice, in order to take care of the case of a q_item that
|
||||
// reschedules itself.
|
||||
_raw_q_flush(q);
|
||||
_raw_q_flush(q);
|
||||
}
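Why two flushes are needed can be seen in a small single-threaded model. This is only a sketch under stated assumptions: the `model_*` names are hypothetical, the queue is a plain array rather than `nv_kthread_q_t`, the "kthread" is simulated by running items inline, and the work item is assumed to reschedule itself exactly once.

```c
#include <stddef.h>

#define MODEL_CAP 8  /* hypothetical capacity, large enough for this demo */

struct model_q;
typedef void (*model_fn_t)(struct model_q *q, void *arg);

struct model_item {
    model_fn_t fn;
    void *arg;
};

/* Single-threaded FIFO standing in for the real queue (an assumption). */
struct model_q {
    struct model_item items[MODEL_CAP];
    size_t head;  /* index of the next item to run */
    size_t tail;  /* index of the next free slot   */
};

static void model_schedule(struct model_q *q, model_fn_t fn, void *arg)
{
    q->items[q->tail % MODEL_CAP].fn = fn;
    q->items[q->tail % MODEL_CAP].arg = arg;
    q->tail++;
}

/* Marker item: once it runs, everything queued before it has run. */
static void model_flush_marker(struct model_q *q, void *arg)
{
    (void)q;
    *(int *)arg = 1;
}

/* Queue a marker, then run items in FIFO order until the marker has run. */
static void model_raw_flush(struct model_q *q)
{
    int done = 0;

    model_schedule(q, model_flush_marker, &done);
    while (!done) {
        struct model_item it = q->items[q->head % MODEL_CAP];

        q->head++;
        it.fn(q, it.arg);
    }
}

/* Work item that re-queues itself while its budget is non-zero. */
static void model_resched(struct model_q *q, void *arg)
{
    int *budget = arg;

    if (*budget > 0) {
        (*budget)--;
        model_schedule(q, model_resched, budget);
    }
}

/* Number of items still pending after the given number of flushes. */
static size_t model_pending_after_flushes(int flushes)
{
    struct model_q q = { .head = 0, .tail = 0 };
    int budget = 1;  /* the item reschedules itself exactly once */
    int i;

    model_schedule(&q, model_resched, &budget);
    for (i = 0; i < flushes; i++)
        model_raw_flush(&q);

    return q.tail - q.head;
}
```

In this model the rescheduled copy lands behind the first marker, so one flush returns with that copy still queued, and only the second flush drains it.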
|
||||
@@ -25,15 +25,6 @@
|
||||
#include <linux/module.h>
|
||||
|
||||
#include "nv-pci-table.h"
|
||||
#include "cpuopsys.h"
|
||||
|
||||
#if defined(NV_BSD)
|
||||
/* Define PCI classes that FreeBSD's linuxkpi is missing */
|
||||
#define PCI_VENDOR_ID_NVIDIA 0x10de
|
||||
#define PCI_CLASS_DISPLAY_VGA 0x0300
|
||||
#define PCI_CLASS_DISPLAY_3D 0x0302
|
||||
#define PCI_CLASS_BRIDGE_OTHER 0x0680
|
||||
#endif
|
||||
|
||||
/* Devices supported by RM */
|
||||
struct pci_device_id nv_pci_table[] = {
|
||||
@@ -57,7 +48,7 @@ struct pci_device_id nv_pci_table[] = {
|
||||
};
|
||||
|
||||
/* Devices supported by all drivers in nvidia.ko */
|
||||
struct pci_device_id nv_module_device_table[4] = {
|
||||
struct pci_device_id nv_module_device_table[] = {
|
||||
{
|
||||
.vendor = PCI_VENDOR_ID_NVIDIA,
|
||||
.device = PCI_ANY_ID,
|
||||
@@ -85,6 +76,4 @@ struct pci_device_id nv_module_device_table[4] = {
|
||||
{ }
|
||||
};
|
||||
|
||||
#if defined(NV_LINUX)
|
||||
MODULE_DEVICE_TABLE(pci, nv_module_device_table);
|
||||
#endif
|
||||
|
||||
@@ -27,6 +27,5 @@
|
||||
#include <linux/pci.h>
|
||||
|
||||
extern struct pci_device_id nv_pci_table[];
|
||||
extern struct pci_device_id nv_module_device_table[4];
|
||||
|
||||
#endif /* _NV_PCI_TABLE_H_ */
|
||||
|
||||
@@ -43,13 +43,9 @@
|
||||
#if defined(NV_LINUX_FENCE_H_PRESENT)
|
||||
typedef struct fence nv_dma_fence_t;
|
||||
typedef struct fence_ops nv_dma_fence_ops_t;
|
||||
typedef struct fence_cb nv_dma_fence_cb_t;
|
||||
typedef fence_func_t nv_dma_fence_func_t;
|
||||
#else
|
||||
typedef struct dma_fence nv_dma_fence_t;
|
||||
typedef struct dma_fence_ops nv_dma_fence_ops_t;
|
||||
typedef struct dma_fence_cb nv_dma_fence_cb_t;
|
||||
typedef dma_fence_func_t nv_dma_fence_func_t;
|
||||
#endif
|
||||
|
||||
#if defined(NV_LINUX_FENCE_H_PRESENT)
|
||||
@@ -101,14 +97,6 @@ static inline int nv_dma_fence_signal(nv_dma_fence_t *fence) {
|
||||
#endif
|
||||
}
|
||||
|
||||
static inline int nv_dma_fence_signal_locked(nv_dma_fence_t *fence) {
|
||||
#if defined(NV_LINUX_FENCE_H_PRESENT)
|
||||
return fence_signal_locked(fence);
|
||||
#else
|
||||
return dma_fence_signal_locked(fence);
|
||||
#endif
|
||||
}
|
||||
|
||||
static inline u64 nv_dma_fence_context_alloc(unsigned num) {
|
||||
#if defined(NV_LINUX_FENCE_H_PRESENT)
|
||||
return fence_context_alloc(num);
|
||||
@@ -120,7 +108,7 @@ static inline u64 nv_dma_fence_context_alloc(unsigned num) {
|
||||
static inline void
|
||||
nv_dma_fence_init(nv_dma_fence_t *fence,
|
||||
const nv_dma_fence_ops_t *ops,
|
||||
spinlock_t *lock, u64 context, uint64_t seqno) {
|
||||
spinlock_t *lock, u64 context, unsigned seqno) {
|
||||
#if defined(NV_LINUX_FENCE_H_PRESENT)
|
||||
fence_init(fence, ops, lock, context, seqno);
|
||||
#else
|
||||
@@ -128,29 +116,6 @@ nv_dma_fence_init(nv_dma_fence_t *fence,
|
||||
#endif
|
||||
}
|
||||
|
||||
static inline void
|
||||
nv_dma_fence_set_error(nv_dma_fence_t *fence,
|
||||
int error) {
|
||||
#if defined(NV_DMA_FENCE_SET_ERROR_PRESENT)
|
||||
return dma_fence_set_error(fence, error);
|
||||
#elif defined(NV_FENCE_SET_ERROR_PRESENT)
|
||||
return fence_set_error(fence, error);
|
||||
#else
|
||||
fence->status = error;
|
||||
#endif
|
||||
}
|
||||
|
||||
static inline int
|
||||
nv_dma_fence_add_callback(nv_dma_fence_t *fence,
|
||||
nv_dma_fence_cb_t *cb,
|
||||
nv_dma_fence_func_t func) {
|
||||
#if defined(NV_LINUX_FENCE_H_PRESENT)
|
||||
return fence_add_callback(fence, cb, func);
|
||||
#else
|
||||
return dma_fence_add_callback(fence, cb, func);
|
||||
#endif
|
||||
}
|
||||
|
||||
#endif /* defined(NV_DRM_FENCE_AVAILABLE) */
|
||||
|
||||
#endif /* __NVIDIA_DMA_FENCE_HELPER_H__ */
|
||||
|
||||
@@ -121,20 +121,6 @@ static inline void nv_dma_resv_add_excl_fence(nv_dma_resv_t *obj,
|
||||
#endif
|
||||
}
|
||||
|
||||
static inline void nv_dma_resv_add_shared_fence(nv_dma_resv_t *obj,
|
||||
nv_dma_fence_t *fence)
|
||||
{
|
||||
#if defined(NV_LINUX_DMA_RESV_H_PRESENT)
|
||||
#if defined(NV_DMA_RESV_ADD_FENCE_PRESENT)
|
||||
dma_resv_add_fence(obj, fence, DMA_RESV_USAGE_READ);
|
||||
#else
|
||||
dma_resv_add_shared_fence(obj, fence);
|
||||
#endif
|
||||
#else
|
||||
reservation_object_add_shared_fence(obj, fence);
|
||||
#endif
|
||||
}
|
||||
|
||||
#endif /* defined(NV_DRM_FENCE_AVAILABLE) */
|
||||
|
||||
#endif /* __NVIDIA_DMA_RESV_HELPER_H__ */
|
||||
|
||||
@@ -24,7 +24,6 @@
|
||||
#define __NVIDIA_DRM_CONFTEST_H__
|
||||
|
||||
#include "conftest.h"
|
||||
#include "nvtypes.h"
|
||||
|
||||
/*
|
||||
* NOTE: This file is expected to get included at the top before including any
|
||||
@@ -62,132 +61,4 @@
|
||||
#undef NV_DRM_FENCE_AVAILABLE
|
||||
#endif
|
||||
|
||||
/*
|
||||
* We can support color management if either drm_helper_crtc_enable_color_mgmt()
|
||||
* or drm_crtc_enable_color_mgmt() exist.
|
||||
*/
|
||||
#if defined(NV_DRM_HELPER_CRTC_ENABLE_COLOR_MGMT_PRESENT) || \
|
||||
defined(NV_DRM_CRTC_ENABLE_COLOR_MGMT_PRESENT)
|
||||
#define NV_DRM_COLOR_MGMT_AVAILABLE
|
||||
#else
|
||||
#undef NV_DRM_COLOR_MGMT_AVAILABLE
|
||||
#endif
|
||||
|
||||
/*
|
||||
* Adapt to quirks in FreeBSD's Linux kernel compatibility layer.
|
||||
*/
|
||||
#if defined(NV_BSD)
|
||||
|
||||
#include <linux/rwsem.h>
|
||||
#include <sys/param.h>
|
||||
#include <sys/lock.h>
|
||||
#include <sys/sx.h>
|
||||
|
||||
/* For nv_drm_gem_prime_force_fence_signal */
|
||||
#ifndef spin_is_locked
|
||||
#define spin_is_locked(lock) mtx_owned(lock.m)
|
||||
#endif
|
||||
|
||||
#ifndef rwsem_is_locked
|
||||
#define rwsem_is_locked(sem) (((sem)->sx.sx_lock & (SX_LOCK_SHARED)) \
|
||||
|| ((sem)->sx.sx_lock & ~(SX_LOCK_FLAGMASK & ~SX_LOCK_SHARED)))
|
||||
#endif
|
||||
|
||||
/*
|
||||
* FreeBSD does not define vm_flags_t in its linuxkpi, since there is already
|
||||
* a FreeBSD vm_flags_t (of a different size) and they don't want the names to
|
||||
* collide. Temporarily redefine it when including nv-mm.h
|
||||
*/
|
||||
#define vm_flags_t unsigned long
|
||||
#include "nv-mm.h"
|
||||
#undef vm_flags_t
|
||||
|
||||
/*
|
||||
* sys/nv.h and nvidia/nv.h have the same header guard
|
||||
* we need to clear it for nvlist_t to get loaded
|
||||
*/
|
||||
#undef _NV_H_
|
||||
#include <sys/nv.h>
|
||||
|
||||
/*
|
||||
* For now just use set_page_dirty as the lock variant
|
||||
* is not ported for FreeBSD. (in progress). This calls
|
||||
* vm_page_dirty. Used in nv-mm.h
|
||||
*/
|
||||
#define set_page_dirty_lock set_page_dirty
|
||||
|
||||
/*
|
||||
* FreeBSD does not implement drm_atomic_state_free, simply
|
||||
* default to drm_atomic_state_put
|
||||
*/
|
||||
#define drm_atomic_state_free drm_atomic_state_put
|
||||
|
||||
#if __FreeBSD_version < 1300000
|
||||
/* redefine LIST_HEAD_INIT to the linux version */
|
||||
#include <linux/list.h>
|
||||
#define LIST_HEAD_INIT(name) LINUX_LIST_HEAD_INIT(name)
|
||||
#endif
|
||||
|
||||
/*
|
||||
* FreeBSD currently has only vmf_insert_pfn_prot defined, and it has a
|
||||
* static assert warning not to use it since all of DRM's usages are in
|
||||
* loops with the vm obj lock(s) held. Instead we should use the lkpi
|
||||
* function itself directly. For us none of this applies so we can just
|
||||
* wrap it in our own definition of vmf_insert_pfn
|
||||
*/
|
||||
#ifndef NV_VMF_INSERT_PFN_PRESENT
|
||||
#define NV_VMF_INSERT_PFN_PRESENT 1
|
||||
|
||||
#if __FreeBSD_version < 1300000
|
||||
#define VM_SHARED (1 << 17)
|
||||
|
||||
/* Not present in 12.2 */
|
||||
static inline vm_fault_t
|
||||
lkpi_vmf_insert_pfn_prot_locked(struct vm_area_struct *vma, unsigned long addr,
|
||||
unsigned long pfn, pgprot_t prot)
|
||||
{
|
||||
vm_object_t vm_obj = vma->vm_obj;
|
||||
vm_page_t page;
|
||||
vm_pindex_t pindex;
|
||||
|
||||
VM_OBJECT_ASSERT_WLOCKED(vm_obj);
|
||||
pindex = OFF_TO_IDX(addr - vma->vm_start);
|
||||
if (vma->vm_pfn_count == 0)
|
||||
vma->vm_pfn_first = pindex;
|
||||
MPASS(pindex <= OFF_TO_IDX(vma->vm_end));
|
||||
|
||||
page = vm_page_grab(vm_obj, pindex, VM_ALLOC_NORMAL);
|
||||
if (page == NULL) {
|
||||
page = PHYS_TO_VM_PAGE(IDX_TO_OFF(pfn));
|
||||
vm_page_xbusy(page);
|
||||
if (vm_page_insert(page, vm_obj, pindex)) {
|
||||
vm_page_xunbusy(page);
|
||||
return (VM_FAULT_OOM);
|
||||
}
|
||||
page->valid = VM_PAGE_BITS_ALL;
|
||||
}
|
||||
pmap_page_set_memattr(page, pgprot2cachemode(prot));
|
||||
vma->vm_pfn_count++;
|
||||
|
||||
return (VM_FAULT_NOPAGE);
|
||||
}
|
||||
#endif
|
||||
|
||||
static inline vm_fault_t
|
||||
vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
|
||||
unsigned long pfn)
|
||||
{
|
||||
vm_fault_t ret;
|
||||
|
||||
VM_OBJECT_WLOCK(vma->vm_obj);
|
||||
ret = lkpi_vmf_insert_pfn_prot_locked(vma, addr, pfn, vma->vm_page_prot);
|
||||
VM_OBJECT_WUNLOCK(vma->vm_obj);
|
||||
|
||||
return (ret);
|
||||
}
|
||||
|
||||
#endif
|
||||
|
||||
#endif /* defined(NV_BSD) */
|
||||
|
||||
#endif /* defined(__NVIDIA_DRM_CONFTEST_H__) */
|
||||
|
||||
@@ -349,125 +349,10 @@ nv_drm_connector_best_encoder(struct drm_connector *connector)
|
||||
return NULL;
|
||||
}
|
||||
|
||||
#if defined(NV_DRM_MODE_CREATE_DP_COLORSPACE_PROPERTY_HAS_SUPPORTED_COLORSPACES_ARG)
|
||||
static const NvU32 __nv_drm_connector_supported_colorspaces =
|
||||
BIT(DRM_MODE_COLORIMETRY_BT2020_RGB) |
|
||||
BIT(DRM_MODE_COLORIMETRY_BT2020_YCC);
|
||||
#endif
|
||||
|
||||
#if defined(NV_DRM_CONNECTOR_ATTACH_HDR_OUTPUT_METADATA_PROPERTY_PRESENT)
|
||||
static int
|
||||
__nv_drm_connector_atomic_check(struct drm_connector *connector,
|
||||
struct drm_atomic_state *state)
|
||||
{
|
||||
struct drm_connector_state *new_connector_state =
|
||||
drm_atomic_get_new_connector_state(state, connector);
|
||||
struct drm_connector_state *old_connector_state =
|
||||
drm_atomic_get_old_connector_state(state, connector);
|
||||
struct nv_drm_device *nv_dev = to_nv_device(connector->dev);
|
||||
|
||||
struct drm_crtc *crtc = new_connector_state->crtc;
|
||||
struct drm_crtc_state *crtc_state;
|
||||
struct nv_drm_crtc_state *nv_crtc_state;
|
||||
struct NvKmsKapiHeadRequestedConfig *req_config;
|
||||
|
||||
if (!crtc) {
|
||||
return 0;
|
||||
}
|
||||
|
||||
crtc_state = drm_atomic_get_new_crtc_state(state, crtc);
|
||||
nv_crtc_state = to_nv_crtc_state(crtc_state);
|
||||
req_config = &nv_crtc_state->req_config;
|
||||
|
||||
/*
|
||||
* Override metadata for the entire head instead of allowing NVKMS to derive
|
||||
* it from the layers' metadata.
|
||||
*
|
||||
* This is the metadata that will be sent to the display, and if applicable,
|
||||
* layers will be tone mapped to this metadata rather than that of the
|
||||
* display.
|
||||
*/
|
||||
req_config->flags.hdrInfoFrameChanged =
|
||||
!drm_connector_atomic_hdr_metadata_equal(old_connector_state,
|
||||
new_connector_state);
|
||||
if (new_connector_state->hdr_output_metadata &&
|
||||
new_connector_state->hdr_output_metadata->data) {
|
||||
|
||||
/*
|
||||
* Note that HDMI definitions are used here even though we might not
|
||||
* be using HDMI. While that seems odd, it is consistent with
|
||||
* upstream behavior.
|
||||
*/
|
||||
|
||||
struct hdr_output_metadata *hdr_metadata =
|
||||
new_connector_state->hdr_output_metadata->data;
|
||||
struct hdr_metadata_infoframe *info_frame =
|
||||
&hdr_metadata->hdmi_metadata_type1;
|
||||
unsigned int i;
|
||||
|
||||
if (hdr_metadata->metadata_type != HDMI_STATIC_METADATA_TYPE1) {
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(info_frame->display_primaries); i++) {
|
||||
req_config->modeSetConfig.hdrInfoFrame.staticMetadata.displayPrimaries[i].x =
|
||||
info_frame->display_primaries[i].x;
|
||||
req_config->modeSetConfig.hdrInfoFrame.staticMetadata.displayPrimaries[i].y =
|
||||
info_frame->display_primaries[i].y;
|
||||
}
|
||||
|
||||
req_config->modeSetConfig.hdrInfoFrame.staticMetadata.whitePoint.x =
|
||||
info_frame->white_point.x;
|
||||
req_config->modeSetConfig.hdrInfoFrame.staticMetadata.whitePoint.y =
|
||||
info_frame->white_point.y;
|
||||
req_config->modeSetConfig.hdrInfoFrame.staticMetadata.maxDisplayMasteringLuminance =
|
||||
info_frame->max_display_mastering_luminance;
|
||||
req_config->modeSetConfig.hdrInfoFrame.staticMetadata.minDisplayMasteringLuminance =
|
||||
info_frame->min_display_mastering_luminance;
|
||||
req_config->modeSetConfig.hdrInfoFrame.staticMetadata.maxCLL =
|
||||
info_frame->max_cll;
|
||||
req_config->modeSetConfig.hdrInfoFrame.staticMetadata.maxFALL =
|
||||
info_frame->max_fall;
|
||||
|
||||
req_config->modeSetConfig.hdrInfoFrame.eotf = info_frame->eotf;
|
||||
|
||||
req_config->modeSetConfig.hdrInfoFrame.enabled = NV_TRUE;
|
||||
} else {
|
||||
req_config->modeSetConfig.hdrInfoFrame.enabled = NV_FALSE;
|
||||
}
|
||||
|
||||
req_config->flags.colorimetryChanged =
|
||||
(old_connector_state->colorspace != new_connector_state->colorspace);
|
||||
// When adding a case here, also add to __nv_drm_connector_supported_colorspaces
|
||||
switch (new_connector_state->colorspace) {
|
||||
case DRM_MODE_COLORIMETRY_DEFAULT:
|
||||
req_config->modeSetConfig.colorimetry =
|
||||
NVKMS_OUTPUT_COLORIMETRY_DEFAULT;
|
||||
break;
|
||||
case DRM_MODE_COLORIMETRY_BT2020_RGB:
|
||||
case DRM_MODE_COLORIMETRY_BT2020_YCC:
|
||||
// Ignore RGB/YCC
|
||||
// See https://patchwork.freedesktop.org/patch/525496/?series=111865&rev=4
|
||||
req_config->modeSetConfig.colorimetry =
|
||||
NVKMS_OUTPUT_COLORIMETRY_BT2100;
|
||||
break;
|
||||
default:
|
||||
// XXX HDR TODO: Add support for more color spaces
|
||||
NV_DRM_DEV_LOG_ERR(nv_dev, "Unsupported color space");
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
#endif /* defined(NV_DRM_CONNECTOR_ATTACH_HDR_OUTPUT_METADATA_PROPERTY_PRESENT) */
|
||||
|
||||
static const struct drm_connector_helper_funcs nv_connector_helper_funcs = {
|
||||
.get_modes = nv_drm_connector_get_modes,
|
||||
.mode_valid = nv_drm_connector_mode_valid,
|
||||
.best_encoder = nv_drm_connector_best_encoder,
|
||||
#if defined(NV_DRM_CONNECTOR_ATTACH_HDR_OUTPUT_METADATA_PROPERTY_PRESENT)
|
||||
.atomic_check = __nv_drm_connector_atomic_check,
|
||||
#endif
|
||||
};
|
||||
|
||||
static struct drm_connector*
|
||||
@@ -520,32 +405,6 @@ nv_drm_connector_new(struct drm_device *dev,
|
||||
DRM_CONNECTOR_POLL_CONNECT | DRM_CONNECTOR_POLL_DISCONNECT;
|
||||
}
|
||||
|
||||
#if defined(NV_DRM_CONNECTOR_ATTACH_HDR_OUTPUT_METADATA_PROPERTY_PRESENT)
|
||||
if (nv_connector->type == NVKMS_CONNECTOR_TYPE_HDMI) {
|
||||
#if defined(NV_DRM_MODE_CREATE_DP_COLORSPACE_PROPERTY_HAS_SUPPORTED_COLORSPACES_ARG)
|
||||
if (drm_mode_create_hdmi_colorspace_property(
|
||||
&nv_connector->base,
|
||||
__nv_drm_connector_supported_colorspaces) == 0) {
|
||||
#else
|
||||
if (drm_mode_create_hdmi_colorspace_property(&nv_connector->base) == 0) {
|
||||
#endif
|
||||
drm_connector_attach_colorspace_property(&nv_connector->base);
|
||||
}
|
||||
drm_connector_attach_hdr_output_metadata_property(&nv_connector->base);
|
||||
} else if (nv_connector->type == NVKMS_CONNECTOR_TYPE_DP) {
|
||||
#if defined(NV_DRM_MODE_CREATE_DP_COLORSPACE_PROPERTY_HAS_SUPPORTED_COLORSPACES_ARG)
|
||||
if (drm_mode_create_dp_colorspace_property(
|
||||
&nv_connector->base,
|
||||
__nv_drm_connector_supported_colorspaces) == 0) {
|
||||
#else
|
||||
if (drm_mode_create_dp_colorspace_property(&nv_connector->base) == 0) {
|
||||
#endif
|
||||
drm_connector_attach_colorspace_property(&nv_connector->base);
|
||||
}
|
||||
drm_connector_attach_hdr_output_metadata_property(&nv_connector->base);
|
||||
}
|
||||
#endif /* defined(NV_DRM_CONNECTOR_ATTACH_HDR_OUTPUT_METADATA_PROPERTY_PRESENT) */
|
||||
|
||||
/* Register connector with DRM subsystem */
|
||||
|
||||
ret = drm_connector_register(&nv_connector->base);
|
||||
|
||||
@@ -48,11 +48,6 @@
|
||||
#include <linux/host1x-next.h>
|
||||
#endif
|
||||
|
||||
#if defined(NV_DRM_DRM_COLOR_MGMT_H_PRESENT)
|
||||
#include <drm/drm_color_mgmt.h>
|
||||
#endif
|
||||
|
||||
|
||||
#if defined(NV_DRM_HAS_HDR_OUTPUT_METADATA)
|
||||
static int
|
||||
nv_drm_atomic_replace_property_blob_from_id(struct drm_device *dev,
|
||||
@@ -92,22 +87,11 @@ static void nv_drm_plane_destroy(struct drm_plane *plane)
|
||||
nv_drm_free(nv_plane);
|
||||
}
|
||||
|
||||
static inline void
|
||||
plane_config_clear(struct NvKmsKapiLayerConfig *layerConfig)
|
||||
{
|
||||
if (layerConfig == NULL) {
|
||||
return;
|
||||
}
|
||||
|
||||
memset(layerConfig, 0, sizeof(*layerConfig));
|
||||
layerConfig->csc = NVKMS_IDENTITY_CSC_MATRIX;
|
||||
}
|
||||
|
||||
static inline void
|
||||
plane_req_config_disable(struct NvKmsKapiLayerRequestedConfig *req_config)
|
||||
{
|
||||
/* Clear layer config */
|
||||
plane_config_clear(&req_config->config);
|
||||
memset(&req_config->config, 0, sizeof(req_config->config));
|
||||
|
||||
/* Set flags to get cleared layer config applied */
|
||||
req_config->flags.surfaceChanged = NV_TRUE;
|
||||
@@ -124,45 +108,6 @@ cursor_req_config_disable(struct NvKmsKapiCursorRequestedConfig *req_config)
|
||||
req_config->flags.surfaceChanged = NV_TRUE;
|
||||
}
|
||||
|
||||
#if defined(NV_DRM_COLOR_MGMT_AVAILABLE)
|
||||
static void color_mgmt_config_ctm_to_csc(struct NvKmsCscMatrix *nvkms_csc,
|
||||
struct drm_color_ctm *drm_ctm)
|
||||
{
|
||||
int y;
|
||||
|
||||
/* CTM is a 3x3 matrix while ours is 3x4. Zero out the last column. */
|
||||
nvkms_csc->m[0][3] = nvkms_csc->m[1][3] = nvkms_csc->m[2][3] = 0;
|
||||
|
||||
for (y = 0; y < 3; y++) {
|
||||
int x;
|
||||
|
||||
for (x = 0; x < 3; x++) {
|
||||
/*
|
||||
* Values in the CTM are encoded in S31.32 sign-magnitude fixed-
|
||||
* point format, while NvKms CSC values are signed 2's-complement
|
||||
* S15.16 (Ssign-extend12-3.16?) fixed-point format.
|
||||
*/
|
||||
NvU64 ctmVal = drm_ctm->matrix[y*3 + x];
|
||||
NvU64 signBit = ctmVal & (1ULL << 63);
|
||||
NvU64 magnitude = ctmVal & ~signBit;
|
||||
|
||||
/*
|
||||
* Drop the low 16 bits of the fractional part and the high 17 bits
|
||||
* of the integral part. Drop 17 bits to avoid corner cases where
|
||||
* the highest resulting bit is a 1, causing the `cscVal = -cscVal`
|
||||
* line to result in a positive number.
|
||||
*/
|
||||
NvS32 cscVal = (magnitude >> 16) & ((1ULL << 31) - 1);
|
||||
if (signBit) {
|
||||
cscVal = -cscVal;
|
||||
}
|
||||
|
||||
nvkms_csc->m[y][x] = cscVal;
|
||||
}
|
||||
}
|
||||
}
|
||||
#endif /* NV_DRM_COLOR_MGMT_AVAILABLE */
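The fixed-point conversion performed per matrix entry in `color_mgmt_config_ctm_to_csc` can be tried in isolation. The sketch below mirrors the same bit operations in portable user-space C; the function name `ctm_entry_to_s15_16` is hypothetical, and `uint64_t`/`int32_t` stand in for `NvU64`/`NvS32`.

```c
#include <stdint.h>

/*
 * Convert one CTM entry from S31.32 sign-magnitude to a signed
 * two's-complement value with 16 fractional bits.
 */
static int32_t ctm_entry_to_s15_16(uint64_t ctm_val)
{
    uint64_t sign_bit  = ctm_val & (1ULL << 63);
    uint64_t magnitude = ctm_val & ~sign_bit;

    /*
     * Drop the low 16 fractional bits, then keep only the low 31 bits so
     * the result is never negative before the sign is applied.
     */
    int32_t csc_val = (int32_t)((magnitude >> 16) & ((1ULL << 31) - 1));

    return sign_bit ? -csc_val : csc_val;
}
```

For example, 1.0 is `1ULL << 32` in S31.32 and comes out as `0x10000`, while an input with only the sign bit set (negative zero) comes out as plain 0. Integer bits that do not fit in the 31-bit mask are silently discarded, as in the original code.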
|
||||
|
||||
static void
|
||||
cursor_plane_req_config_update(struct drm_plane *plane,
|
||||
struct drm_plane_state *plane_state,
|
||||
@@ -289,8 +234,6 @@ plane_req_config_update(struct drm_plane *plane,
|
||||
.dstY = plane_state->crtc_y,
|
||||
.dstWidth = plane_state->crtc_w,
|
||||
.dstHeight = plane_state->crtc_h,
|
||||
|
||||
.csc = old_config.csc
|
||||
},
|
||||
};
|
||||
|
||||
@@ -456,25 +399,27 @@ plane_req_config_update(struct drm_plane *plane,
|
||||
}
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(info_frame->display_primaries); i ++) {
|
||||
req_config->config.hdrMetadata.val.displayPrimaries[i].x =
|
||||
req_config->config.hdrMetadata.displayPrimaries[i].x =
|
||||
info_frame->display_primaries[i].x;
|
||||
req_config->config.hdrMetadata.val.displayPrimaries[i].y =
|
||||
req_config->config.hdrMetadata.displayPrimaries[i].y =
|
||||
info_frame->display_primaries[i].y;
|
||||
}
|
||||
|
||||
req_config->config.hdrMetadata.val.whitePoint.x =
|
||||
req_config->config.hdrMetadata.whitePoint.x =
|
||||
info_frame->white_point.x;
|
||||
req_config->config.hdrMetadata.val.whitePoint.y =
|
||||
req_config->config.hdrMetadata.whitePoint.y =
|
||||
info_frame->white_point.y;
|
||||
req_config->config.hdrMetadata.val.maxDisplayMasteringLuminance =
|
||||
req_config->config.hdrMetadata.maxDisplayMasteringLuminance =
|
||||
info_frame->max_display_mastering_luminance;
|
||||
req_config->config.hdrMetadata.val.minDisplayMasteringLuminance =
|
||||
req_config->config.hdrMetadata.minDisplayMasteringLuminance =
|
||||
info_frame->min_display_mastering_luminance;
|
||||
req_config->config.hdrMetadata.val.maxCLL =
|
||||
req_config->config.hdrMetadata.maxCLL =
|
||||
info_frame->max_cll;
|
||||
req_config->config.hdrMetadata.val.maxFALL =
|
||||
req_config->config.hdrMetadata.maxFALL =
|
||||
info_frame->max_fall;
|
||||
|
||||
req_config->config.hdrMetadataSpecified = true;
|
||||
|
||||
switch (info_frame->eotf) {
|
||||
case HDMI_EOTF_SMPTE_ST2084:
|
||||
req_config->config.tf = NVKMS_OUTPUT_TF_PQ;
|
||||
@@ -487,21 +432,10 @@ plane_req_config_update(struct drm_plane *plane,
                 NV_DRM_DEV_LOG_ERR(nv_dev, "Unsupported EOTF");
                 return -1;
         }
-
-        req_config->config.hdrMetadata.enabled = true;
     } else {
-        req_config->config.hdrMetadata.enabled = false;
+        req_config->config.hdrMetadataSpecified = false;
         req_config->config.tf = NVKMS_OUTPUT_TF_NONE;
     }
-
-    req_config->flags.hdrMetadataChanged =
-        ((old_config.hdrMetadata.enabled !=
-          req_config->config.hdrMetadata.enabled) ||
-         memcmp(&old_config.hdrMetadata.val,
-                &req_config->config.hdrMetadata.val,
-                sizeof(struct NvKmsHDRStaticMetadata)));
-
-    req_config->flags.tfChanged = (old_config.tf != req_config->config.tf);
 #endif

     /*
@@ -630,24 +564,6 @@ static int nv_drm_plane_atomic_check(struct drm_plane *plane,
                 return ret;
             }

-#if defined(NV_DRM_COLOR_MGMT_AVAILABLE)
-            if (crtc_state->color_mgmt_changed) {
-                /*
-                 * According to the comment in the Linux kernel's
-                 * drivers/gpu/drm/drm_color_mgmt.c, if this property is NULL,
-                 * the CTM needs to be changed to the identity matrix
-                 */
-                if (crtc_state->ctm) {
-                    color_mgmt_config_ctm_to_csc(&plane_requested_config->config.csc,
-                                                 (struct drm_color_ctm *)crtc_state->ctm->data);
-                } else {
-                    plane_requested_config->config.csc = NVKMS_IDENTITY_CSC_MATRIX;
-                }
-                plane_requested_config->config.cscUseMain = NV_FALSE;
-                plane_requested_config->flags.cscChanged = NV_TRUE;
-            }
-#endif /* NV_DRM_COLOR_MGMT_AVAILABLE */
-
             if (__is_async_flip_requested(plane, crtc_state)) {
                 /*
                  * Async flip requests that the flip happen 'as soon as
@@ -738,38 +654,6 @@ static int nv_drm_plane_atomic_get_property(
     return -EINVAL;
 }

-/**
- * nv_drm_plane_atomic_reset - plane state reset hook
- * @plane: DRM plane
- *
- * Allocate an empty DRM plane state.
- */
-static void nv_drm_plane_atomic_reset(struct drm_plane *plane)
-{
-    struct nv_drm_plane_state *nv_plane_state =
-        nv_drm_calloc(1, sizeof(*nv_plane_state));
-
-    if (!nv_plane_state) {
-        return;
-    }
-
-    drm_atomic_helper_plane_reset(plane);
-
-    /*
-     * The drm atomic helper function allocates a state object that is the wrong
-     * size. Copy its contents into the one we allocated above and replace the
-     * pointer.
-     */
-    if (plane->state) {
-        nv_plane_state->base = *plane->state;
-        kfree(plane->state);
-        plane->state = &nv_plane_state->base;
-    } else {
-        kfree(nv_plane_state);
-    }
-}
-
-
 static struct drm_plane_state *
 nv_drm_plane_atomic_duplicate_state(struct drm_plane *plane)
 {
@@ -808,11 +692,9 @@ static inline void __nv_drm_plane_atomic_destroy_state(
 #endif

 #if defined(NV_DRM_HAS_HDR_OUTPUT_METADATA)
-    {
-        struct nv_drm_plane_state *nv_drm_plane_state =
-            to_nv_drm_plane_state(state);
-        drm_property_blob_put(nv_drm_plane_state->hdr_output_metadata);
-    }
+    struct nv_drm_plane_state *nv_drm_plane_state =
+        to_nv_drm_plane_state(state);
+    drm_property_blob_put(nv_drm_plane_state->hdr_output_metadata);
 #endif
 }

@@ -829,7 +711,7 @@ static const struct drm_plane_funcs nv_plane_funcs = {
     .update_plane = drm_atomic_helper_update_plane,
     .disable_plane = drm_atomic_helper_disable_plane,
     .destroy = nv_drm_plane_destroy,
-    .reset = nv_drm_plane_atomic_reset,
+    .reset = drm_atomic_helper_plane_reset,
     .atomic_get_property = nv_drm_plane_atomic_get_property,
     .atomic_set_property = nv_drm_plane_atomic_set_property,
     .atomic_duplicate_state = nv_drm_plane_atomic_duplicate_state,
@@ -886,52 +768,6 @@ static inline void nv_drm_crtc_duplicate_req_head_modeset_config(
     }
 }

-static inline struct nv_drm_crtc_state *nv_drm_crtc_state_alloc(void)
-{
-    struct nv_drm_crtc_state *nv_state = nv_drm_calloc(1, sizeof(*nv_state));
-    int i;
-
-    if (nv_state == NULL) {
-        return NULL;
-    }
-
-    for (i = 0; i < ARRAY_SIZE(nv_state->req_config.layerRequestedConfig); i++) {
-        plane_config_clear(&nv_state->req_config.layerRequestedConfig[i].config);
-    }
-    return nv_state;
-}
-
-
-/**
- * nv_drm_atomic_crtc_reset - crtc state reset hook
- * @crtc: DRM crtc
- *
- * Allocate an empty DRM crtc state.
- */
-static void nv_drm_atomic_crtc_reset(struct drm_crtc *crtc)
-{
-    struct nv_drm_crtc_state *nv_state = nv_drm_crtc_state_alloc();
-
-    if (!nv_state) {
-        return;
-    }
-
-    drm_atomic_helper_crtc_reset(crtc);
-
-    /*
-     * The drm atomic helper function allocates a state object that is the wrong
-     * size. Copy its contents into the one we allocated above and replace the
-     * pointer.
-     */
-    if (crtc->state) {
-        nv_state->base = *crtc->state;
-        kfree(crtc->state);
-        crtc->state = &nv_state->base;
-    } else {
-        kfree(nv_state);
-    }
-}
-
 /**
  * nv_drm_atomic_crtc_duplicate_state - crtc state duplicate hook
  * @crtc: DRM crtc
@@ -943,7 +779,7 @@ static void nv_drm_atomic_crtc_reset(struct drm_crtc *crtc)
 static struct drm_crtc_state*
 nv_drm_atomic_crtc_duplicate_state(struct drm_crtc *crtc)
 {
-    struct nv_drm_crtc_state *nv_state = nv_drm_crtc_state_alloc();
+    struct nv_drm_crtc_state *nv_state = nv_drm_calloc(1, sizeof(*nv_state));

     if (nv_state == NULL) {
         return NULL;
@@ -964,9 +800,6 @@ nv_drm_atomic_crtc_duplicate_state(struct drm_crtc *crtc)
         &(to_nv_crtc_state(crtc->state)->req_config),
         &nv_state->req_config);

-    nv_state->ilut_ramps = NULL;
-    nv_state->olut_ramps = NULL;
-
     return &nv_state->base;
 }

@@ -990,22 +823,16 @@ static void nv_drm_atomic_crtc_destroy_state(struct drm_crtc *crtc,

     __nv_drm_atomic_helper_crtc_destroy_state(crtc, &nv_state->base);

-    nv_drm_free(nv_state->ilut_ramps);
-    nv_drm_free(nv_state->olut_ramps);
-
     nv_drm_free(nv_state);
 }

 static struct drm_crtc_funcs nv_crtc_funcs = {
     .set_config = drm_atomic_helper_set_config,
     .page_flip = drm_atomic_helper_page_flip,
-    .reset = nv_drm_atomic_crtc_reset,
+    .reset = drm_atomic_helper_crtc_reset,
     .destroy = nv_drm_crtc_destroy,
     .atomic_duplicate_state = nv_drm_atomic_crtc_duplicate_state,
     .atomic_destroy_state = nv_drm_atomic_crtc_destroy_state,
-#if defined(NV_DRM_ATOMIC_HELPER_LEGACY_GAMMA_SET_PRESENT)
-    .gamma_set = drm_atomic_helper_legacy_gamma_set,
-#endif
 };

 /*
@@ -1039,132 +866,6 @@ static int head_modeset_config_attach_connector(
     return 0;
 }

-#if defined(NV_DRM_COLOR_MGMT_AVAILABLE)
-static int color_mgmt_config_copy_lut(struct NvKmsLutRamps *nvkms_lut,
-                                      struct drm_color_lut *drm_lut,
-                                      uint64_t lut_len)
-{
-    uint64_t i = 0;
-    if (lut_len != NVKMS_LUT_ARRAY_SIZE) {
-        return -EINVAL;
-    }
-
-    /*
-     * Both NvKms and drm LUT values are 16-bit linear values. NvKms LUT ramps
-     * are in arrays in a single struct while drm LUT ramps are an array of
-     * structs.
-     */
-    for (i = 0; i < lut_len; i++) {
-        nvkms_lut->red[i] = drm_lut[i].red;
-        nvkms_lut->green[i] = drm_lut[i].green;
-        nvkms_lut->blue[i] = drm_lut[i].blue;
-    }
-    return 0;
-}
-
-static int color_mgmt_config_set_luts(struct nv_drm_crtc_state *nv_crtc_state,
-                                      struct NvKmsKapiHeadRequestedConfig *req_config)
-{
-    struct NvKmsKapiHeadModeSetConfig *modeset_config =
-        &req_config->modeSetConfig;
-    struct drm_crtc_state *crtc_state = &nv_crtc_state->base;
-    int ret = 0;
-
-    /*
-     * According to the comment in the Linux kernel's
-     * drivers/gpu/drm/drm_color_mgmt.c, if either property is NULL, that LUT
-     * needs to be changed to a linear LUT
-     */
-
-    req_config->flags.lutChanged = NV_TRUE;
-    if (crtc_state->degamma_lut) {
-        struct drm_color_lut *degamma_lut = NULL;
-        uint64_t degamma_len = 0;
-
-        nv_crtc_state->ilut_ramps = nv_drm_calloc(1, sizeof(*nv_crtc_state->ilut_ramps));
-        if (!nv_crtc_state->ilut_ramps) {
-            ret = -ENOMEM;
-            goto fail;
-        }
-
-        degamma_lut = (struct drm_color_lut *)crtc_state->degamma_lut->data;
-        degamma_len = crtc_state->degamma_lut->length /
-                      sizeof(struct drm_color_lut);
-
-        if ((ret = color_mgmt_config_copy_lut(nv_crtc_state->ilut_ramps,
-                                              degamma_lut,
-                                              degamma_len)) != 0) {
-            goto fail;
-        }
-
-        modeset_config->lut.input.specified = NV_TRUE;
-        modeset_config->lut.input.depth = 30; /* specify the full LUT */
-        modeset_config->lut.input.start = 0;
-        modeset_config->lut.input.end = degamma_len - 1;
-        modeset_config->lut.input.pRamps = nv_crtc_state->ilut_ramps;
-    } else {
-        /* setting input.end to 0 is equivalent to disabling the LUT, which
-         * should be equivalent to a linear LUT */
-        modeset_config->lut.input.specified = NV_TRUE;
-        modeset_config->lut.input.depth = 30; /* specify the full LUT */
-        modeset_config->lut.input.start = 0;
-        modeset_config->lut.input.end = 0;
-        modeset_config->lut.input.pRamps = NULL;
-
-    }
-
-    if (crtc_state->gamma_lut) {
-        struct drm_color_lut *gamma_lut = NULL;
-        uint64_t gamma_len = 0;
-
-        nv_crtc_state->olut_ramps = nv_drm_calloc(1, sizeof(*nv_crtc_state->olut_ramps));
-        if (!nv_crtc_state->olut_ramps) {
-            ret = -ENOMEM;
-            goto fail;
-        }
-
-        gamma_lut = (struct drm_color_lut *)crtc_state->gamma_lut->data;
-        gamma_len = crtc_state->gamma_lut->length /
-                    sizeof(struct drm_color_lut);
-
-        if ((ret = color_mgmt_config_copy_lut(nv_crtc_state->olut_ramps,
-                                              gamma_lut,
-                                              gamma_len)) != 0) {
-            goto fail;
-        }
-
-        modeset_config->lut.output.specified = NV_TRUE;
-        modeset_config->lut.output.enabled = NV_TRUE;
-        modeset_config->lut.output.pRamps = nv_crtc_state->olut_ramps;
-    } else {
-        /* disabling the output LUT should be equivalent to setting a linear
-         * LUT */
-        modeset_config->lut.output.specified = NV_TRUE;
-        modeset_config->lut.output.enabled = NV_FALSE;
-        modeset_config->lut.output.pRamps = NULL;
-    }
-
-    return 0;
-
-fail:
-    /* free allocated state */
-    nv_drm_free(nv_crtc_state->ilut_ramps);
-    nv_drm_free(nv_crtc_state->olut_ramps);
-
-    /* remove dangling pointers */
-    nv_crtc_state->ilut_ramps = NULL;
-    nv_crtc_state->olut_ramps = NULL;
-    modeset_config->lut.input.pRamps = NULL;
-    modeset_config->lut.output.pRamps = NULL;
-
-    /* prevent attempts at reading NULLs */
-    modeset_config->lut.input.specified = NV_FALSE;
-    modeset_config->lut.output.specified = NV_FALSE;
-
-    return ret;
-}
-#endif /* NV_DRM_COLOR_MGMT_AVAILABLE */
-
 /**
  * nv_drm_crtc_atomic_check() can fail after it has modified
  * the 'nv_drm_crtc_state::req_config', that is fine because 'nv_drm_crtc_state'
@@ -1186,9 +887,6 @@ static int nv_drm_crtc_atomic_check(struct drm_crtc *crtc,
     struct NvKmsKapiHeadRequestedConfig *req_config =
         &nv_crtc_state->req_config;
     int ret = 0;
-#if defined(NV_DRM_COLOR_MGMT_AVAILABLE)
-    struct nv_drm_device *nv_dev = to_nv_device(crtc_state->crtc->dev);
-#endif

     if (crtc_state->mode_changed) {
         drm_mode_to_nvkms_display_mode(&crtc_state->mode,
@@ -1227,25 +925,6 @@ static int nv_drm_crtc_atomic_check(struct drm_crtc *crtc,
         req_config->flags.activeChanged = NV_TRUE;
     }

-#if defined(NV_DRM_CRTC_STATE_HAS_VRR_ENABLED)
-    req_config->modeSetConfig.vrrEnabled = crtc_state->vrr_enabled;
-#endif
-
-#if defined(NV_DRM_COLOR_MGMT_AVAILABLE)
-    if (nv_dev->drmMasterChangedSinceLastAtomicCommit &&
-        (crtc_state->degamma_lut ||
-         crtc_state->ctm ||
-         crtc_state->gamma_lut)) {
-
-        crtc_state->color_mgmt_changed = NV_TRUE;
-    }
-    if (crtc_state->color_mgmt_changed) {
-        if ((ret = color_mgmt_config_set_luts(nv_crtc_state, req_config)) != 0) {
-            return ret;
-        }
-    }
-#endif
-
     return ret;
 }

@@ -1477,8 +1156,6 @@ nv_drm_plane_create(struct drm_device *dev,
             plane,
             validLayerRRTransforms);

-    nv_drm_free(formats);
-
     return plane;

 failed_plane_init:
@@ -1510,7 +1187,7 @@ static struct drm_crtc *__nv_drm_crtc_create(struct nv_drm_device *nv_dev,
         goto failed;
     }

-    nv_state = nv_drm_crtc_state_alloc();
+    nv_state = nv_drm_calloc(1, sizeof(*nv_state));
     if (nv_state == NULL) {
         goto failed_state_alloc;
     }
@@ -1543,22 +1220,6 @@ static struct drm_crtc *__nv_drm_crtc_create(struct nv_drm_device *nv_dev,

     drm_crtc_helper_add(&nv_crtc->base, &nv_crtc_helper_funcs);

-#if defined(NV_DRM_COLOR_MGMT_AVAILABLE)
-#if defined(NV_DRM_CRTC_ENABLE_COLOR_MGMT_PRESENT)
-    drm_crtc_enable_color_mgmt(&nv_crtc->base, NVKMS_LUT_ARRAY_SIZE, true,
-                               NVKMS_LUT_ARRAY_SIZE);
-#else
-    drm_helper_crtc_enable_color_mgmt(&nv_crtc->base, NVKMS_LUT_ARRAY_SIZE,
-                                      NVKMS_LUT_ARRAY_SIZE);
-#endif
-    ret = drm_mode_crtc_set_gamma_size(&nv_crtc->base, NVKMS_LUT_ARRAY_SIZE);
-    if (ret != 0) {
-        NV_DRM_DEV_LOG_WARN(
-            nv_dev,
-            "Failed to initialize legacy gamma support for head %u", head);
-    }
-#endif
-
     return &nv_crtc->base;

 failed_init_crtc:
@@ -1667,16 +1328,10 @@ static void NvKmsKapiCrcsToDrm(const struct NvKmsKapiCrcs *crcs,
 {
     drmCrcs->outputCrc32.value = crcs->outputCrc32.value;
     drmCrcs->outputCrc32.supported = crcs->outputCrc32.supported;
-    drmCrcs->outputCrc32.__pad0 = 0;
-    drmCrcs->outputCrc32.__pad1 = 0;
     drmCrcs->rasterGeneratorCrc32.value = crcs->rasterGeneratorCrc32.value;
     drmCrcs->rasterGeneratorCrc32.supported = crcs->rasterGeneratorCrc32.supported;
-    drmCrcs->rasterGeneratorCrc32.__pad0 = 0;
-    drmCrcs->rasterGeneratorCrc32.__pad1 = 0;
     drmCrcs->compositorCrc32.value = crcs->compositorCrc32.value;
     drmCrcs->compositorCrc32.supported = crcs->compositorCrc32.supported;
-    drmCrcs->compositorCrc32.__pad0 = 0;
-    drmCrcs->compositorCrc32.__pad1 = 0;
 }

 int nv_drm_get_crtc_crc32_v2_ioctl(struct drm_device *dev,
@@ -129,9 +129,6 @@ struct nv_drm_crtc_state {
      */
     struct NvKmsKapiHeadRequestedConfig req_config;

-    struct NvKmsLutRamps *ilut_ramps;
-    struct NvKmsLutRamps *olut_ramps;
-
     /**
      * @nv_flip:
      *
@@ -44,10 +44,6 @@
 #include <drm/drmP.h>
 #endif

-#if defined(NV_DRM_DRM_ATOMIC_UAPI_H_PRESENT)
-#include <drm/drm_atomic_uapi.h>
-#endif
-
 #if defined(NV_DRM_DRM_VBLANK_H_PRESENT)
 #include <drm/drm_vblank.h>
 #endif
@@ -64,17 +60,7 @@
 #include <drm/drm_ioctl.h>
 #endif

-#if defined(NV_DRM_FBDEV_GENERIC_AVAILABLE)
-#include <drm/drm_aperture.h>
-#include <drm/drm_fb_helper.h>
-#endif
-
-#if defined(NV_DRM_DRM_FBDEV_GENERIC_H_PRESENT)
-#include <drm/drm_fbdev_generic.h>
-#endif
-
 #include <linux/pci.h>
-#include <linux/workqueue.h>

 /*
  * Commit fcd70cd36b9b ("drm: Split out drm_probe_helper.h")
@@ -98,11 +84,6 @@
 #include <drm/drm_atomic_helper.h>
 #endif

-static int nv_drm_revoke_modeset_permission(struct drm_device *dev,
-                                            struct drm_file *filep,
-                                            NvU32 dpyId);
-static int nv_drm_revoke_sub_ownership(struct drm_device *dev);
-
 static struct nv_drm_device *dev_list = NULL;

 static const char* nv_get_input_colorspace_name(
@@ -406,27 +387,6 @@ static int nv_drm_create_properties(struct nv_drm_device *nv_dev)
     return 0;
 }

-#if defined(NV_DRM_ATOMIC_MODESET_AVAILABLE)
-/*
- * We can't just call drm_kms_helper_hotplug_event directly because
- * fbdev_generic may attempt to set a mode from inside the hotplug event
- * handler. Because kapi event handling runs on nvkms_kthread_q, this blocks
- * other event processing including the flip completion notifier expected by
- * nv_drm_atomic_commit.
- *
- * Defer hotplug event handling to a work item so that nvkms_kthread_q can
- * continue processing events while a DRM modeset is in progress.
- */
-static void nv_drm_handle_hotplug_event(struct work_struct *work)
-{
-    struct delayed_work *dwork = to_delayed_work(work);
-    struct nv_drm_device *nv_dev =
-        container_of(dwork, struct nv_drm_device, hotplug_event_work);
-
-    drm_kms_helper_hotplug_event(nv_dev->dev);
-}
-#endif
-
 static int nv_drm_load(struct drm_device *dev, unsigned long flags)
 {
 #if defined(NV_DRM_ATOMIC_MODESET_AVAILABLE)
@@ -500,11 +460,6 @@ static int nv_drm_load(struct drm_device *dev, unsigned long flags)

     nv_dev->supportsSyncpts = resInfo.caps.supportsSyncpts;

-    nv_dev->semsurf_stride = resInfo.caps.semsurf.stride;
-
-    nv_dev->semsurf_max_submitted_offset =
-        resInfo.caps.semsurf.maxSubmittedOffset;
-
 #if defined(NV_DRM_FORMAT_MODIFIERS_PRESENT)
     gen = nv_dev->pageKindGeneration;
     kind = nv_dev->genericPageKind;
@@ -562,7 +517,6 @@ static int nv_drm_load(struct drm_device *dev, unsigned long flags)

     /* Enable event handling */

-    INIT_DELAYED_WORK(&nv_dev->hotplug_event_work, nv_drm_handle_hotplug_event);
     atomic_set(&nv_dev->enable_event_handling, true);

     init_waitqueue_head(&nv_dev->flip_event_wq);
@@ -590,11 +544,8 @@ static void __nv_drm_unload(struct drm_device *dev)
         return;
     }

-    cancel_delayed_work_sync(&nv_dev->hotplug_event_work);
     mutex_lock(&nv_dev->lock);

-    WARN_ON(nv_dev->subOwnershipGranted);
-
     /* Disable event handling */

     atomic_set(&nv_dev->enable_event_handling, false);
@@ -644,15 +595,9 @@ static int __nv_drm_master_set(struct drm_device *dev,
 {
     struct nv_drm_device *nv_dev = to_nv_device(dev);

-    /*
-     * If this device is driving a framebuffer, then nvidia-drm already has
-     * modeset ownership. Otherwise, grab ownership now.
-     */
-    if (!nv_dev->hasFramebufferConsole &&
-        !nvKms->grabOwnership(nv_dev->pDevice)) {
+    if (!nvKms->grabOwnership(nv_dev->pDevice)) {
         return -EINVAL;
     }
-    nv_dev->drmMasterChangedSinceLastAtomicCommit = NV_TRUE;

     return 0;
 }
@@ -686,9 +631,6 @@ void nv_drm_master_drop(struct drm_device *dev, struct drm_file *file_priv)
     struct nv_drm_device *nv_dev = to_nv_device(dev);
     int err;

-    nv_drm_revoke_modeset_permission(dev, file_priv, 0);
-    nv_drm_revoke_sub_ownership(dev);
-
     /*
      * After dropping nvkms modeset onwership, it is not guaranteed that
      * drm and nvkms modeset state will remain in sync. Therefore, disable
@@ -713,9 +655,7 @@ void nv_drm_master_drop(struct drm_device *dev, struct drm_file *file_priv)

     drm_modeset_unlock_all(dev);

-    if (!nv_dev->hasFramebufferConsole) {
-        nvKms->releaseOwnership(nv_dev->pDevice);
-    }
+    nvKms->releaseOwnership(nv_dev->pDevice);
 }
 #endif /* NV_DRM_ATOMIC_MODESET_AVAILABLE */

@@ -753,30 +693,15 @@ static int nv_drm_get_dev_info_ioctl(struct drm_device *dev,

     params->gpu_id = nv_dev->gpu_info.gpu_id;
     params->primary_index = dev->primary->index;
-    params->supports_alloc = false;
+#if defined(NV_DRM_ATOMIC_MODESET_AVAILABLE)
+    params->generic_page_kind = nv_dev->genericPageKind;
+    params->page_kind_generation = nv_dev->pageKindGeneration;
+    params->sector_layout = nv_dev->sectorLayout;
+#else
     params->generic_page_kind = 0;
     params->page_kind_generation = 0;
     params->sector_layout = 0;
-    params->supports_sync_fd = false;
-    params->supports_semsurf = false;
-
-#if defined(NV_DRM_ATOMIC_MODESET_AVAILABLE)
-    /* Memory allocation and semaphore surfaces are only supported
-     * if the modeset = 1 parameter is set */
-    if (nv_dev->pDevice != NULL) {
-        params->supports_alloc = true;
-        params->generic_page_kind = nv_dev->genericPageKind;
-        params->page_kind_generation = nv_dev->pageKindGeneration;
-        params->sector_layout = nv_dev->sectorLayout;
-
-        if (nv_dev->semsurf_stride != 0) {
-            params->supports_semsurf = true;
-#if defined(NV_SYNC_FILE_GET_FENCE_PRESENT)
-            params->supports_sync_fd = true;
-#endif /* defined(NV_SYNC_FILE_GET_FENCE_PRESENT) */
-        }
-    }
-#endif /* defined(NV_DRM_ATOMIC_MODESET_AVAILABLE) */
+#endif

     return 0;
 }
@@ -908,10 +833,10 @@ static NvU32 nv_drm_get_head_bit_from_connector(struct drm_connector *connector)
     return 0;
 }

-static int nv_drm_grant_modeset_permission(struct drm_device *dev,
-                                           struct drm_nvidia_grant_permissions_params *params,
-                                           struct drm_file *filep)
+static int nv_drm_grant_permission_ioctl(struct drm_device *dev, void *data,
+                                         struct drm_file *filep)
 {
+    struct drm_nvidia_grant_permissions_params *params = data;
     struct nv_drm_device *nv_dev = to_nv_device(dev);
     struct nv_drm_connector *target_nv_connector = NULL;
     struct nv_drm_crtc *target_nv_crtc = NULL;
@@ -1033,102 +958,26 @@ done:
     return ret;
 }

-static int nv_drm_grant_sub_ownership(struct drm_device *dev,
-                                      struct drm_nvidia_grant_permissions_params *params)
+static bool nv_drm_revoke_connector(struct nv_drm_device *nv_dev,
+                                    struct nv_drm_connector *nv_connector)
 {
-    int ret = -EINVAL;
-    struct nv_drm_device *nv_dev = to_nv_device(dev);
-    struct drm_modeset_acquire_ctx *pctx;
-#if NV_DRM_MODESET_LOCK_ALL_END_ARGUMENT_COUNT == 3
-    struct drm_modeset_acquire_ctx ctx;
-    DRM_MODESET_LOCK_ALL_BEGIN(dev, ctx, DRM_MODESET_ACQUIRE_INTERRUPTIBLE,
-                               ret);
-    pctx = &ctx;
-#else
-    mutex_lock(&dev->mode_config.mutex);
-    pctx = dev->mode_config.acquire_ctx;
-#endif
-
-    if (nv_dev->subOwnershipGranted ||
-        !nvKms->grantSubOwnership(params->fd, nv_dev->pDevice)) {
-        goto done;
-    }
-
-    /*
-     * When creating an ownership grant, shut down all heads and disable flip
-     * notifications.
-     */
-    ret = nv_drm_atomic_helper_disable_all(dev, pctx);
-    if (ret != 0) {
-        NV_DRM_DEV_LOG_ERR(
-            nv_dev,
-            "nv_drm_atomic_helper_disable_all failed with error code %d!",
-            ret);
-    }
-
-    atomic_set(&nv_dev->enable_event_handling, false);
-    nv_dev->subOwnershipGranted = NV_TRUE;
-
-    ret = 0;
-
-done:
-#if NV_DRM_MODESET_LOCK_ALL_END_ARGUMENT_COUNT == 3
-    DRM_MODESET_LOCK_ALL_END(dev, ctx, ret);
-#else
-    mutex_unlock(&dev->mode_config.mutex);
-#endif
-    return 0;
-}
-
-static int nv_drm_grant_permission_ioctl(struct drm_device *dev, void *data,
-                                         struct drm_file *filep)
-{
-    struct drm_nvidia_grant_permissions_params *params = data;
-
-    if (params->type == NV_DRM_PERMISSIONS_TYPE_MODESET) {
-        return nv_drm_grant_modeset_permission(dev, params, filep);
-    } else if (params->type == NV_DRM_PERMISSIONS_TYPE_SUB_OWNER) {
-        return nv_drm_grant_sub_ownership(dev, params);
-    }
-
-    return -EINVAL;
-}
-
-static int
-nv_drm_atomic_disable_connector(struct drm_atomic_state *state,
-                                struct nv_drm_connector *nv_connector)
-{
-    struct drm_crtc_state *crtc_state;
-    struct drm_connector_state *connector_state;
-    int ret = 0;
-
+    bool ret = true;
     if (nv_connector->modeset_permission_crtc) {
-        crtc_state = drm_atomic_get_crtc_state(
-            state, &nv_connector->modeset_permission_crtc->base);
-        if (!crtc_state) {
-            return -EINVAL;
-        }
-
-        crtc_state->active = false;
-        ret = drm_atomic_set_mode_prop_for_crtc(crtc_state, NULL);
-        if (ret < 0) {
-            return ret;
+        if (nv_connector->nv_detected_encoder) {
+            ret = nvKms->revokePermissions(
+                nv_dev->pDevice, nv_connector->modeset_permission_crtc->head,
+                nv_connector->nv_detected_encoder->hDisplay);
         }
+        nv_connector->modeset_permission_crtc->modeset_permission_filep = NULL;
+        nv_connector->modeset_permission_crtc = NULL;
     }
-
-    connector_state = drm_atomic_get_connector_state(state, &nv_connector->base);
-    if (!connector_state) {
-        return -EINVAL;
-    }
-
-    return drm_atomic_set_crtc_for_connector(connector_state, NULL);
+    nv_connector->modeset_permission_filep = NULL;
+    return ret;
 }

-static int nv_drm_revoke_modeset_permission(struct drm_device *dev,
-                                            struct drm_file *filep, NvU32 dpyId)
+static int nv_drm_revoke_permission(struct drm_device *dev,
+                                    struct drm_file *filep, NvU32 dpyId)
 {
-    struct drm_modeset_acquire_ctx *pctx;
-    struct drm_atomic_state *state;
     struct drm_connector *connector;
     struct drm_crtc *crtc;
     int ret = 0;
@@ -1139,19 +988,10 @@ static int nv_drm_revoke_modeset_permission(struct drm_device *dev,
     struct drm_modeset_acquire_ctx ctx;
     DRM_MODESET_LOCK_ALL_BEGIN(dev, ctx, DRM_MODESET_ACQUIRE_INTERRUPTIBLE,
                                ret);
-    pctx = &ctx;
 #else
     mutex_lock(&dev->mode_config.mutex);
-    pctx = dev->mode_config.acquire_ctx;
 #endif

-    state = drm_atomic_state_alloc(dev);
-    if (!state) {
-        ret = -ENOMEM;
-        goto done;
-    }
-    state->acquire_ctx = pctx;
-
     /*
      * If dpyId is set, only revoke those specific resources. Otherwise,
      * it is from closing the file so revoke all resources for that filep.
@@ -1163,13 +1003,10 @@ static int nv_drm_revoke_modeset_permission(struct drm_device *dev,
         struct nv_drm_connector *nv_connector = to_nv_connector(connector);
         if (nv_connector->modeset_permission_filep == filep &&
             (!dpyId || nv_drm_connector_is_dpy_id(connector, dpyId))) {
-            ret = nv_drm_atomic_disable_connector(state, nv_connector);
-            if (ret < 0) {
-                goto done;
+            if (!nv_drm_connector_revoke_permissions(dev, nv_connector)) {
+                ret = -EINVAL;
+                // Continue trying to revoke as much as possible.
             }
-
-            // Continue trying to revoke as much as possible.
-            nv_drm_connector_revoke_permissions(dev, nv_connector);
         }
     }
 #if defined(NV_DRM_CONNECTOR_LIST_ITER_PRESENT)
@@ -1183,25 +1020,6 @@ static int nv_drm_revoke_modeset_permission(struct drm_device *dev,
         }
     }

-    ret = drm_atomic_commit(state);
-done:
-#if defined(NV_DRM_ATOMIC_STATE_REF_COUNTING_PRESENT)
-    drm_atomic_state_put(state);
-#else
-    if (ret != 0) {
-        drm_atomic_state_free(state);
-    } else {
-        /*
-         * In case of success, drm_atomic_commit() takes care to cleanup and
-         * free @state.
-         *
-         * Comment placed above drm_atomic_commit() says: The caller must not
-         * free or in any other way access @state. If the function fails then
-         * the caller must clean up @state itself.
-         */
-    }
-#endif
-
 #if NV_DRM_MODESET_LOCK_ALL_END_ARGUMENT_COUNT == 3
     DRM_MODESET_LOCK_ALL_END(dev, ctx, ret);
 #else
@@ -1211,55 +1029,14 @@ done:
     return ret;
 }

-static int nv_drm_revoke_sub_ownership(struct drm_device *dev)
-{
-    int ret = -EINVAL;
-    struct nv_drm_device *nv_dev = to_nv_device(dev);
-#if NV_DRM_MODESET_LOCK_ALL_END_ARGUMENT_COUNT == 3
-    struct drm_modeset_acquire_ctx ctx;
-    DRM_MODESET_LOCK_ALL_BEGIN(dev, ctx, DRM_MODESET_ACQUIRE_INTERRUPTIBLE,
-                               ret);
-#else
-    mutex_lock(&dev->mode_config.mutex);
-#endif
-
-    if (!nv_dev->subOwnershipGranted) {
-        goto done;
-    }
-
-    if (!nvKms->revokeSubOwnership(nv_dev->pDevice)) {
-        NV_DRM_DEV_LOG_ERR(nv_dev, "Failed to revoke sub-ownership from NVKMS");
-        goto done;
-    }
-
-    nv_dev->subOwnershipGranted = NV_FALSE;
-    atomic_set(&nv_dev->enable_event_handling, true);
-    ret = 0;
-
-done:
-#if NV_DRM_MODESET_LOCK_ALL_END_ARGUMENT_COUNT == 3
-    DRM_MODESET_LOCK_ALL_END(dev, ctx, ret);
-#else
-    mutex_unlock(&dev->mode_config.mutex);
-#endif
-    return ret;
-}
-
 static int nv_drm_revoke_permission_ioctl(struct drm_device *dev, void *data,
                                           struct drm_file *filep)
 {
     struct drm_nvidia_revoke_permissions_params *params = data;
-
-    if (params->type == NV_DRM_PERMISSIONS_TYPE_MODESET) {
-        if (!params->dpyId) {
-            return -EINVAL;
-        }
-        return nv_drm_revoke_modeset_permission(dev, filep, params->dpyId);
-    } else if (params->type == NV_DRM_PERMISSIONS_TYPE_SUB_OWNER) {
-        return nv_drm_revoke_sub_ownership(dev);
+    if (!params->dpyId) {
+        return -EINVAL;
     }
-
-    return -EINVAL;
+    return nv_drm_revoke_permission(dev, filep, params->dpyId);
 }

 static void nv_drm_postclose(struct drm_device *dev, struct drm_file *filep)
@@ -1274,7 +1051,7 @@ static void nv_drm_postclose(struct drm_device *dev, struct drm_file *filep)
         dev->mode_config.num_connector > 0 &&
         dev->mode_config.connector_list.next != NULL &&
         dev->mode_config.connector_list.prev != NULL) {
-        nv_drm_revoke_modeset_permission(dev, filep, 0);
+        nv_drm_revoke_permission(dev, filep, 0);
     }
 }
 #endif /* NV_DRM_ATOMIC_MODESET_AVAILABLE */
@@ -1533,18 +1310,6 @@ static const struct drm_ioctl_desc nv_drm_ioctls[] = {
     DRM_IOCTL_DEF_DRV(NVIDIA_GEM_PRIME_FENCE_ATTACH,
                       nv_drm_gem_prime_fence_attach_ioctl,
                       DRM_RENDER_ALLOW|DRM_UNLOCKED),
-    DRM_IOCTL_DEF_DRV(NVIDIA_SEMSURF_FENCE_CTX_CREATE,
-                      nv_drm_semsurf_fence_ctx_create_ioctl,
-                      DRM_RENDER_ALLOW|DRM_UNLOCKED),
-    DRM_IOCTL_DEF_DRV(NVIDIA_SEMSURF_FENCE_CREATE,
-                      nv_drm_semsurf_fence_create_ioctl,
-                      DRM_RENDER_ALLOW|DRM_UNLOCKED),
-    DRM_IOCTL_DEF_DRV(NVIDIA_SEMSURF_FENCE_WAIT,
-                      nv_drm_semsurf_fence_wait_ioctl,
-                      DRM_RENDER_ALLOW|DRM_UNLOCKED),
-    DRM_IOCTL_DEF_DRV(NVIDIA_SEMSURF_FENCE_ATTACH,
-                      nv_drm_semsurf_fence_attach_ioctl,
-                      DRM_RENDER_ALLOW|DRM_UNLOCKED),
 #endif

 /*
@@ -1683,7 +1448,7 @@ static struct drm_driver nv_drm_driver = {
  * kernel supports atomic modeset and the 'modeset' kernel module
  * parameter is true.
  */
-void nv_drm_update_drm_driver_features(void)
+static void nv_drm_update_drm_driver_features(void)
 {
 #if defined(NV_DRM_ATOMIC_MODESET_AVAILABLE)

@@ -1709,7 +1474,7 @@ void nv_drm_update_drm_driver_features(void)
|
||||
/*
|
||||
* Helper function for allocate/register DRM device for given NVIDIA GPU ID.
|
||||
*/
|
||||
void nv_drm_register_drm_device(const nv_gpu_info_t *gpu_info)
|
||||
static void nv_drm_register_drm_device(const nv_gpu_info_t *gpu_info)
|
||||
{
|
||||
struct nv_drm_device *nv_dev = NULL;
|
||||
struct drm_device *dev = NULL;
|
||||
@@ -1747,15 +1512,8 @@ void nv_drm_register_drm_device(const nv_gpu_info_t *gpu_info)
|
||||
dev->dev_private = nv_dev;
|
||||
nv_dev->dev = dev;
|
||||
|
||||
bool bus_is_pci =
|
||||
#if defined(NV_LINUX)
|
||||
device->bus == &pci_bus_type;
|
||||
#elif defined(NV_BSD)
|
||||
devclass_find("pci");
|
||||
#endif
|
||||
|
||||
#if defined(NV_DRM_DEVICE_HAS_PDEV)
|
||||
if (bus_is_pci) {
|
||||
if (device->bus == &pci_bus_type) {
|
||||
dev->pdev = to_pci_dev(device);
|
||||
}
|
||||
#endif
|
||||
@@ -1767,30 +1525,6 @@ void nv_drm_register_drm_device(const nv_gpu_info_t *gpu_info)
|
||||
goto failed_drm_register;
|
||||
}
|
||||
|
||||
#if defined(NV_DRM_FBDEV_GENERIC_AVAILABLE)
|
||||
if (nv_drm_fbdev_module_param &&
|
||||
drm_core_check_feature(dev, DRIVER_MODESET)) {
|
||||
|
||||
if (!nvKms->grabOwnership(nv_dev->pDevice)) {
|
||||
NV_DRM_DEV_LOG_ERR(nv_dev, "Failed to grab NVKMS modeset ownership");
|
||||
goto failed_grab_ownership;
|
||||
}
|
||||
|
||||
if (bus_is_pci) {
|
||||
struct pci_dev *pdev = to_pci_dev(device);
|
||||
|
||||
#if defined(NV_DRM_APERTURE_REMOVE_CONFLICTING_PCI_FRAMEBUFFERS_HAS_DRIVER_ARG)
|
||||
drm_aperture_remove_conflicting_pci_framebuffers(pdev, &nv_drm_driver);
|
||||
#else
|
||||
drm_aperture_remove_conflicting_pci_framebuffers(pdev, nv_drm_driver.name);
|
||||
#endif
|
||||
}
|
||||
drm_fbdev_generic_setup(dev, 32);
|
||||
|
||||
nv_dev->hasFramebufferConsole = NV_TRUE;
|
||||
}
|
||||
#endif /* defined(NV_DRM_FBDEV_GENERIC_AVAILABLE) */
|
||||
|
||||
/* Add NVIDIA-DRM device into list */
|
||||
|
||||
nv_dev->next = dev_list;
|
||||
@@ -1798,12 +1532,6 @@ void nv_drm_register_drm_device(const nv_gpu_info_t *gpu_info)
|
||||
|
||||
return; /* Success */
|
||||
|
||||
#if defined(NV_DRM_FBDEV_GENERIC_AVAILABLE)
|
||||
failed_grab_ownership:
|
||||
|
||||
drm_dev_unregister(dev);
|
||||
#endif
|
||||
|
||||
failed_drm_register:
|
||||
|
||||
nv_drm_dev_free(dev);
|
||||
@@ -1816,7 +1544,6 @@ failed_drm_alloc:
|
||||
/*
|
||||
* Enumerate NVIDIA GPUs and allocate/register DRM device for each of them.
|
||||
*/
|
||||
#if defined(NV_LINUX)
|
||||
int nv_drm_probe_devices(void)
|
||||
{
|
||||
nv_gpu_info_t *gpu_info = NULL;
|
||||
@@ -1859,7 +1586,6 @@ done:
|
||||
|
||||
return ret;
|
||||
}
|
||||
#endif
|
||||
|
||||
/*
|
||||
* Unregister all NVIDIA DRM devices.
|
||||
@@ -1868,16 +1594,9 @@ void nv_drm_remove_devices(void)
|
||||
{
|
||||
while (dev_list != NULL) {
|
||||
struct nv_drm_device *next = dev_list->next;
|
||||
struct drm_device *dev = dev_list->dev;
|
||||
|
||||
#if defined(NV_DRM_FBDEV_GENERIC_AVAILABLE)
|
||||
if (dev_list->hasFramebufferConsole) {
|
||||
drm_atomic_helper_shutdown(dev);
|
||||
nvKms->releaseOwnership(dev_list->pDevice);
|
||||
}
|
||||
#endif
|
||||
drm_dev_unregister(dev);
|
||||
nv_drm_dev_free(dev);
|
||||
drm_dev_unregister(dev_list->dev);
|
||||
nv_drm_dev_free(dev_list->dev);
|
||||
|
||||
nv_drm_free(dev_list);
|
||||
|
||||
@@ -1885,79 +1604,4 @@ void nv_drm_remove_devices(void)
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Handle system suspend and resume.
|
||||
*
|
||||
* Normally, a DRM driver would use drm_mode_config_helper_suspend() to save the
|
||||
* current state on suspend and drm_mode_config_helper_resume() to restore it
|
||||
* after resume. This works for upstream drivers because user-mode tasks are
|
||||
* frozen before the suspend hook is called.
|
||||
*
|
||||
* In the case of nvidia-drm, the suspend hook is also called when 'suspend' is
|
||||
* written to /proc/driver/nvidia/suspend, before user-mode tasks are frozen.
|
||||
* However, we don't actually need to save and restore the display state because
|
||||
* the driver requires a VT switch to an unused VT before suspending and a
|
||||
* switch back to the application (or fbdev console) on resume. The DRM client
|
||||
* (or fbdev helper functions) will restore the appropriate mode on resume.
|
||||
*
|
||||
*/
|
||||
void nv_drm_suspend_resume(NvBool suspend)
|
||||
{
|
||||
static DEFINE_MUTEX(nv_drm_suspend_mutex);
|
||||
static NvU32 nv_drm_suspend_count = 0;
|
||||
struct nv_drm_device *nv_dev;
|
||||
|
||||
mutex_lock(&nv_drm_suspend_mutex);
|
||||
|
||||
/*
|
||||
* Count the number of times the driver is asked to suspend. Suspend all DRM
|
||||
* devices on the first suspend call and resume them on the last resume
|
||||
* call. This is necessary because the kernel may call nvkms_suspend()
|
||||
* simultaneously for each GPU, but NVKMS itself also suspends all GPUs on
|
||||
* the first call.
|
||||
*/
|
||||
if (suspend) {
|
||||
if (nv_drm_suspend_count++ > 0) {
|
||||
goto done;
|
||||
}
|
||||
} else {
|
||||
BUG_ON(nv_drm_suspend_count == 0);
|
||||
|
||||
if (--nv_drm_suspend_count > 0) {
|
||||
goto done;
|
||||
}
|
||||
}
|
||||
|
||||
#if defined(NV_DRM_ATOMIC_MODESET_AVAILABLE)
|
||||
nv_dev = dev_list;
|
||||
|
||||
/*
|
||||
* NVKMS shuts down all heads on suspend. Update DRM state accordingly.
|
||||
*/
|
||||
for (nv_dev = dev_list; nv_dev; nv_dev = nv_dev->next) {
|
||||
struct drm_device *dev = nv_dev->dev;
|
||||
|
||||
if (!drm_core_check_feature(dev, DRIVER_MODESET)) {
|
||||
continue;
|
||||
}
|
||||
|
||||
if (suspend) {
|
||||
drm_kms_helper_poll_disable(dev);
|
||||
#if defined(NV_DRM_FBDEV_GENERIC_AVAILABLE)
|
||||
drm_fb_helper_set_suspend_unlocked(dev->fb_helper, 1);
|
||||
#endif
|
||||
drm_mode_config_reset(dev);
|
||||
} else {
|
||||
#if defined(NV_DRM_FBDEV_GENERIC_AVAILABLE)
|
||||
drm_fb_helper_set_suspend_unlocked(dev->fb_helper, 0);
|
||||
#endif
|
||||
drm_kms_helper_poll_enable(dev);
|
||||
}
|
||||
}
|
||||
#endif /* NV_DRM_ATOMIC_MODESET_AVAILABLE */
|
||||
|
||||
done:
|
||||
mutex_unlock(&nv_drm_suspend_mutex);
|
||||
}
|
||||
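The suspend/resume counting described in the comment above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `suspend_resume`, `suspend_count`, and `transitions` are hypothetical names, the mutex from the driver is left out because the sketch is single-threaded, and `assert()` stands in for `BUG_ON()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counter of how often the real suspend/resume work ran. */
static unsigned int transitions = 0;

/* Number of outstanding suspend requests (assumed name). */
static unsigned int suspend_count = 0;

/*
 * Returns true if this call performed the actual suspend/resume work,
 * false if it only adjusted the reference count.
 */
static bool suspend_resume(bool suspend)
{
    if (suspend) {
        /* Only the first suspend request does the work. */
        if (suspend_count++ > 0) {
            return false;
        }
    } else {
        /* A resume without a matching suspend is a caller bug. */
        assert(suspend_count != 0);

        /* Only the last resume request does the work. */
        if (--suspend_count > 0) {
            return false;
        }
    }

    transitions++;
    return true;
}
```

With two overlapping suspend requests, only the first suspend and the last resume would reach the per-device work, which seems to be the behaviour the comment asks for when the suspend hook is called once per GPU.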
|
||||
#endif /* NV_DRM_AVAILABLE */
|
||||
|
||||
@@ -31,12 +31,6 @@ int nv_drm_probe_devices(void);
|
||||
|
||||
void nv_drm_remove_devices(void);
|
||||
|
||||
void nv_drm_suspend_resume(NvBool suspend);
|
||||
|
||||
void nv_drm_register_drm_device(const nv_gpu_info_t *);
|
||||
|
||||
void nv_drm_update_drm_driver_features(void);
|
||||
|
||||
#endif /* defined(NV_DRM_AVAILABLE) */
|
||||
|
||||
#endif /* __NVIDIA_DRM_DRV_H__ */
|
||||
|
||||
@@ -300,7 +300,7 @@ void nv_drm_handle_display_change(struct nv_drm_device *nv_dev,
|
||||
|
||||
nv_drm_connector_mark_connection_status_dirty(nv_encoder->nv_connector);
|
||||
|
||||
schedule_delayed_work(&nv_dev->hotplug_event_work, 0);
|
||||
drm_kms_helper_hotplug_event(dev);
|
||||
}
|
||||
|
||||
void nv_drm_handle_dynamic_display_connected(struct nv_drm_device *nv_dev,
|
||||
@@ -347,6 +347,6 @@ void nv_drm_handle_dynamic_display_connected(struct nv_drm_device *nv_dev,
|
||||
drm_reinit_primary_mode_group(dev);
|
||||
#endif
|
||||
|
||||
schedule_delayed_work(&nv_dev->hotplug_event_work, 0);
|
||||
drm_kms_helper_hotplug_event(dev);
|
||||
}
|
||||
#endif
|
||||
|
||||
@@ -240,7 +240,7 @@ struct drm_framebuffer *nv_drm_internal_framebuffer_create(
|
||||
if (nv_dev->modifiers[i] == DRM_FORMAT_MOD_INVALID) {
|
||||
NV_DRM_DEV_DEBUG_DRIVER(
|
||||
nv_dev,
|
||||
"Invalid format modifier for framebuffer object: 0x%016" NvU64_fmtx,
|
||||
"Invalid format modifier for framebuffer object: 0x%016llx",
|
||||
modifier);
|
||||
return ERR_PTR(-EINVAL);
|
||||
}
|
||||
|
||||
File diff suppressed because it is too large
@@ -41,22 +41,6 @@ int nv_drm_prime_fence_context_create_ioctl(struct drm_device *dev,
|
||||
int nv_drm_gem_prime_fence_attach_ioctl(struct drm_device *dev,
|
||||
void *data, struct drm_file *filep);
|
||||
|
||||
int nv_drm_semsurf_fence_ctx_create_ioctl(struct drm_device *dev,
|
||||
void *data,
|
||||
struct drm_file *filep);
|
||||
|
||||
int nv_drm_semsurf_fence_create_ioctl(struct drm_device *dev,
|
||||
void *data,
|
||||
struct drm_file *filep);
|
||||
|
||||
int nv_drm_semsurf_fence_wait_ioctl(struct drm_device *dev,
|
||||
void *data,
|
||||
struct drm_file *filep);
|
||||
|
||||
int nv_drm_semsurf_fence_attach_ioctl(struct drm_device *dev,
|
||||
void *data,
|
||||
struct drm_file *filep);
|
||||
|
||||
#endif /* NV_DRM_FENCE_AVAILABLE */
|
||||
|
||||
#endif /* NV_DRM_AVAILABLE */
|
||||
|
||||
@@ -71,42 +71,12 @@ static int __nv_drm_gem_dma_buf_create_mmap_offset(
|
||||
static int __nv_drm_gem_dma_buf_mmap(struct nv_drm_gem_object *nv_gem,
|
||||
struct vm_area_struct *vma)
|
||||
{
|
||||
#if defined(NV_LINUX)
|
||||
struct dma_buf_attachment *attach = nv_gem->base.import_attach;
|
||||
struct dma_buf *dma_buf = attach->dmabuf;
|
||||
#endif
|
||||
struct file *old_file;
|
||||
int ret;
|
||||
|
||||
/* check if buffer supports mmap */
|
||||
#if defined(NV_BSD)
|
||||
/*
|
||||
* Most of the FreeBSD DRM code refers to struct file*, which is actually
|
||||
* a struct linux_file*. The dmabuf code in FreeBSD is not actually plumbed
|
||||
* through the same linuxkpi bits it seems (probably so it can be used
|
||||
* elsewhere), so dma_buf->file really is a native FreeBSD struct file...
|
||||
*/
|
||||
if (!nv_gem->base.filp->f_op->mmap)
|
||||
return -EINVAL;
|
||||
|
||||
/* readjust the vma */
|
||||
get_file(nv_gem->base.filp);
|
||||
old_file = vma->vm_file;
|
||||
vma->vm_file = nv_gem->base.filp;
|
||||
vma->vm_pgoff -= drm_vma_node_start(&nv_gem->base.vma_node);
|
||||
|
||||
ret = nv_gem->base.filp->f_op->mmap(nv_gem->base.filp, vma);
|
||||
|
||||
if (ret) {
|
||||
/* restore old parameters on failure */
|
||||
vma->vm_file = old_file;
|
||||
vma->vm_pgoff += drm_vma_node_start(&nv_gem->base.vma_node);
|
||||
fput(nv_gem->base.filp);
|
||||
} else {
|
||||
if (old_file)
|
||||
fput(old_file);
|
||||
}
|
||||
#else
|
||||
if (!dma_buf->file->f_op->mmap)
|
||||
return -EINVAL;
|
||||
|
||||
@@ -114,20 +84,18 @@ static int __nv_drm_gem_dma_buf_mmap(struct nv_drm_gem_object *nv_gem,
|
||||
get_file(dma_buf->file);
|
||||
old_file = vma->vm_file;
|
||||
vma->vm_file = dma_buf->file;
|
||||
vma->vm_pgoff -= drm_vma_node_start(&nv_gem->base.vma_node);
|
||||
vma->vm_pgoff -= drm_vma_node_start(&nv_gem->base.vma_node);;
|
||||
|
||||
ret = dma_buf->file->f_op->mmap(dma_buf->file, vma);
|
||||
|
||||
if (ret) {
|
||||
/* restore old parameters on failure */
|
||||
vma->vm_file = old_file;
|
||||
vma->vm_pgoff += drm_vma_node_start(&nv_gem->base.vma_node);
|
||||
fput(dma_buf->file);
|
||||
} else {
|
||||
if (old_file)
|
||||
fput(old_file);
|
||||
}
|
||||
#endif
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
@@ -37,9 +37,6 @@
|
||||
#endif
|
||||
|
||||
#include <linux/io.h>
|
||||
#if defined(NV_BSD)
|
||||
#include <vm/vm_pageout.h>
|
||||
#endif
|
||||
|
||||
#include "nv-mm.h"
|
||||
|
||||
@@ -96,17 +93,7 @@ static vm_fault_t __nv_drm_gem_nvkms_handle_vma_fault(
|
||||
if (nv_nvkms_memory->pages_count == 0) {
|
||||
pfn = (unsigned long)(uintptr_t)nv_nvkms_memory->pPhysicalAddress;
|
||||
pfn >>= PAGE_SHIFT;
|
||||
#if defined(NV_LINUX)
|
||||
/*
|
||||
* FreeBSD doesn't set pgoff. We instead have pfn be the base physical
|
||||
* address, and we will calculate the index pidx from the virtual address.
|
||||
*
|
||||
* This only works because linux_cdev_pager_populate passes the pidx as
|
||||
* vmf->virtual_address. Then we turn the virtual address
|
||||
* into a physical page number.
|
||||
*/
|
||||
pfn += page_offset;
|
||||
#endif
|
||||
} else {
|
||||
BUG_ON(page_offset >= nv_nvkms_memory->pages_count);
|
||||
pfn = page_to_pfn(nv_nvkms_memory->pages[page_offset]);
|
||||
@@ -336,7 +323,7 @@ int nv_drm_dumb_create(
|
||||
ret = -ENOMEM;
|
||||
NV_DRM_DEV_LOG_ERR(
|
||||
nv_dev,
|
||||
"Failed to allocate NvKmsKapiMemory for dumb object of size %" NvU64_fmtu,
|
||||
"Failed to allocate NvKmsKapiMemory for dumb object of size %llu",
|
||||
args->size);
|
||||
goto nvkms_alloc_memory_failed;
|
||||
}
|
||||
@@ -487,7 +474,7 @@ int nv_drm_gem_alloc_nvkms_memory_ioctl(struct drm_device *dev,
|
||||
goto failed;
|
||||
}
|
||||
|
||||
if ((p->__pad0 != 0) || (p->__pad1 != 0)) {
|
||||
if (p->__pad != 0) {
|
||||
ret = -EINVAL;
|
||||
NV_DRM_DEV_LOG_ERR(nv_dev, "non-zero value in padding field");
|
||||
goto failed;
|
||||
|
||||
@@ -36,10 +36,6 @@
|
||||
#include "linux/mm.h"
|
||||
#include "nv-mm.h"
|
||||
|
||||
#if defined(NV_BSD)
|
||||
#include <vm/vm_pageout.h>
|
||||
#endif
|
||||
|
||||
static inline
|
||||
void __nv_drm_gem_user_memory_free(struct nv_drm_gem_object *nv_gem)
|
||||
{
|
||||
@@ -117,10 +113,6 @@ static vm_fault_t __nv_drm_gem_user_memory_handle_vma_fault(
|
||||
page_offset = vmf->pgoff - drm_vma_node_start(&gem->vma_node);
|
||||
|
||||
BUG_ON(page_offset >= nv_user_memory->pages_count);
|
||||
|
||||
#if !defined(NV_LINUX)
|
||||
ret = vmf_insert_pfn(vma, address, page_to_pfn(nv_user_memory->pages[page_offset]));
|
||||
#else /* !defined(NV_LINUX) */
|
||||
ret = vm_insert_page(vma, address, nv_user_memory->pages[page_offset]);
|
||||
switch (ret) {
|
||||
case 0:
|
||||
@@ -139,7 +131,6 @@ static vm_fault_t __nv_drm_gem_user_memory_handle_vma_fault(
|
||||
ret = VM_FAULT_SIGBUS;
|
||||
break;
|
||||
}
|
||||
#endif /* !defined(NV_LINUX) */
|
||||
|
||||
return ret;
|
||||
}
|
||||
@@ -179,7 +170,7 @@ int nv_drm_gem_import_userspace_memory_ioctl(struct drm_device *dev,
|
||||
if ((params->size % PAGE_SIZE) != 0) {
|
||||
NV_DRM_DEV_LOG_ERR(
|
||||
nv_dev,
|
||||
"Userspace memory 0x%" NvU64_fmtx " size should be in a multiple of page "
|
||||
"Userspace memory 0x%llx size should be in a multiple of page "
|
||||
"size to create a gem object",
|
||||
params->address);
|
||||
return -EINVAL;
|
||||
@@ -192,7 +183,7 @@ int nv_drm_gem_import_userspace_memory_ioctl(struct drm_device *dev,
|
||||
if (ret != 0) {
|
||||
NV_DRM_DEV_LOG_ERR(
|
||||
nv_dev,
|
||||
"Failed to lock user pages for address 0x%" NvU64_fmtx ": %d",
|
||||
"Failed to lock user pages for address 0x%llx: %d",
|
||||
params->address, ret);
|
||||
return ret;
|
||||
}
|
||||
|
||||
@@ -95,16 +95,6 @@ static inline struct nv_drm_gem_object *to_nv_gem_object(
|
||||
* 3e70fd160cf0b1945225eaa08dd2cb8544f21cb8 (2018-11-15).
|
||||
*/
|
||||
|
||||
static inline void
|
||||
nv_drm_gem_object_reference(struct nv_drm_gem_object *nv_gem)
|
||||
{
|
||||
#if defined(NV_DRM_GEM_OBJECT_GET_PRESENT)
|
||||
drm_gem_object_get(&nv_gem->base);
|
||||
#else
|
||||
drm_gem_object_reference(&nv_gem->base);
|
||||
#endif
|
||||
}
|
||||
|
||||
static inline void
|
||||
nv_drm_gem_object_unreference_unlocked(struct nv_drm_gem_object *nv_gem)
|
||||
{
|
||||
|
||||
@@ -306,36 +306,6 @@ int nv_drm_atomic_helper_disable_all(struct drm_device *dev,
|
||||
for_each_plane_in_state(__state, plane, plane_state, __i)
|
||||
#endif
|
||||
|
||||
/*
|
||||
* for_each_new_plane_in_state() was added by kernel commit
|
||||
* 581e49fe6b411f407102a7f2377648849e0fa37f which was Signed-off-by:
|
||||
* Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
|
||||
* Daniel Vetter <daniel.vetter@ffwll.ch>
|
||||
*
|
||||
* This commit also added the old_state and new_state pointers to
|
||||
* __drm_planes_state. Because of this, the best that can be done on kernel
|
||||
* versions without this macro is for_each_plane_in_state.
|
||||
*/
|
||||
|
||||
/**
|
||||
* nv_drm_for_each_new_plane_in_state - iterate over all planes in an atomic update
|
||||
* @__state: &struct drm_atomic_state pointer
|
||||
* @plane: &struct drm_plane iteration cursor
|
||||
* @new_plane_state: &struct drm_plane_state iteration cursor for the new state
|
||||
* @__i: int iteration cursor, for macro-internal use
|
||||
*
|
||||
* This iterates over all planes in an atomic update, tracking only the new
|
||||
* state. This is useful in enable functions, where we need the new state the
|
||||
* hardware should be in when the atomic commit operation has completed.
|
||||
*/
|
||||
#if !defined(for_each_new_plane_in_state)
|
||||
#define nv_drm_for_each_new_plane_in_state(__state, plane, new_plane_state, __i) \
|
||||
nv_drm_for_each_plane_in_state(__state, plane, new_plane_state, __i)
|
||||
#else
|
||||
#define nv_drm_for_each_new_plane_in_state(__state, plane, new_plane_state, __i) \
|
||||
for_each_new_plane_in_state(__state, plane, new_plane_state, __i)
|
||||
#endif
|
||||
|
||||
static inline struct drm_connector *
|
||||
nv_drm_connector_lookup(struct drm_device *dev, struct drm_file *filep,
|
||||
uint32_t id)
|
||||
|
||||
@@ -48,10 +48,6 @@
|
||||
#define DRM_NVIDIA_GET_CONNECTOR_ID_FOR_DPY_ID 0x11
|
||||
#define DRM_NVIDIA_GRANT_PERMISSIONS 0x12
|
||||
#define DRM_NVIDIA_REVOKE_PERMISSIONS 0x13
|
||||
#define DRM_NVIDIA_SEMSURF_FENCE_CTX_CREATE 0x14
|
||||
#define DRM_NVIDIA_SEMSURF_FENCE_CREATE 0x15
|
||||
#define DRM_NVIDIA_SEMSURF_FENCE_WAIT 0x16
|
||||
#define DRM_NVIDIA_SEMSURF_FENCE_ATTACH 0x17
|
||||
|
||||
#define DRM_IOCTL_NVIDIA_GEM_IMPORT_NVKMS_MEMORY \
|
||||
DRM_IOWR((DRM_COMMAND_BASE + DRM_NVIDIA_GEM_IMPORT_NVKMS_MEMORY), \
|
||||
@@ -71,7 +67,7 @@
|
||||
*
|
||||
* 'warning: suggest parentheses around arithmetic in operand of |'
|
||||
*/
|
||||
#if defined(NV_LINUX) || defined(NV_BSD)
|
||||
#if defined(NV_LINUX)
|
||||
#define DRM_IOCTL_NVIDIA_FENCE_SUPPORTED \
|
||||
DRM_IO(DRM_COMMAND_BASE + DRM_NVIDIA_FENCE_SUPPORTED)
|
||||
#define DRM_IOCTL_NVIDIA_DMABUF_SUPPORTED \
|
||||
@@ -137,26 +133,6 @@
|
||||
DRM_IOWR((DRM_COMMAND_BASE + DRM_NVIDIA_REVOKE_PERMISSIONS), \
|
||||
struct drm_nvidia_revoke_permissions_params)
|
||||
|
||||
#define DRM_IOCTL_NVIDIA_SEMSURF_FENCE_CTX_CREATE \
|
||||
DRM_IOWR((DRM_COMMAND_BASE + \
|
||||
DRM_NVIDIA_SEMSURF_FENCE_CTX_CREATE), \
|
||||
struct drm_nvidia_semsurf_fence_ctx_create_params)
|
||||
|
||||
#define DRM_IOCTL_NVIDIA_SEMSURF_FENCE_CREATE \
|
||||
DRM_IOWR((DRM_COMMAND_BASE + \
|
||||
DRM_NVIDIA_SEMSURF_FENCE_CREATE), \
|
||||
struct drm_nvidia_semsurf_fence_create_params)
|
||||
|
||||
#define DRM_IOCTL_NVIDIA_SEMSURF_FENCE_WAIT \
|
||||
DRM_IOW((DRM_COMMAND_BASE + \
|
||||
DRM_NVIDIA_SEMSURF_FENCE_WAIT), \
|
||||
struct drm_nvidia_semsurf_fence_wait_params)
|
||||
|
||||
#define DRM_IOCTL_NVIDIA_SEMSURF_FENCE_ATTACH \
|
||||
DRM_IOW((DRM_COMMAND_BASE + \
|
||||
DRM_NVIDIA_SEMSURF_FENCE_ATTACH), \
|
||||
struct drm_nvidia_semsurf_fence_attach_params)
|
||||
|
||||
struct drm_nvidia_gem_import_nvkms_memory_params {
|
||||
uint64_t mem_size; /* IN */
|
||||
|
||||
@@ -178,15 +154,10 @@ struct drm_nvidia_get_dev_info_params {
|
||||
uint32_t gpu_id; /* OUT */
|
||||
uint32_t primary_index; /* OUT; the "card%d" value */
|
||||
|
||||
uint32_t supports_alloc; /* OUT */
|
||||
/* The generic_page_kind, page_kind_generation, and sector_layout
|
||||
* fields are only valid if supports_alloc is true.
|
||||
* See DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D definitions of these. */
|
||||
/* See DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D definitions of these */
|
||||
uint32_t generic_page_kind; /* OUT */
|
||||
uint32_t page_kind_generation; /* OUT */
|
||||
uint32_t sector_layout; /* OUT */
|
||||
uint32_t supports_sync_fd; /* OUT */
|
||||
uint32_t supports_semsurf; /* OUT */
|
||||
};
|
||||
|
||||
struct drm_nvidia_prime_fence_context_create_params {
|
||||
@@ -208,7 +179,6 @@ struct drm_nvidia_gem_prime_fence_attach_params {
|
||||
uint32_t handle; /* IN GEM handle to attach fence to */
|
||||
uint32_t fence_context_handle; /* IN GEM handle to fence context on which fence is run on */
|
||||
uint32_t sem_thresh; /* IN Semaphore value to reach before signal */
|
||||
uint32_t __pad;
|
||||
};
|
||||
|
||||
struct drm_nvidia_get_client_capability_params {
|
||||
@@ -220,8 +190,6 @@ struct drm_nvidia_get_client_capability_params {
|
||||
struct drm_nvidia_crtc_crc32 {
|
||||
uint32_t value; /* Read value, undefined if supported is false */
|
||||
uint8_t supported; /* Supported boolean, true if readable by hardware */
|
||||
uint8_t __pad0;
|
||||
uint16_t __pad1;
|
||||
};
|
||||
|
||||
struct drm_nvidia_crtc_crc32_v2_out {
|
||||
@@ -261,11 +229,10 @@ struct drm_nvidia_gem_alloc_nvkms_memory_params {
|
||||
uint32_t handle; /* OUT */
|
||||
uint8_t block_linear; /* IN */
|
||||
uint8_t compressible; /* IN/OUT */
|
||||
uint16_t __pad0;
|
||||
uint16_t __pad;
|
||||
|
||||
uint64_t memory_size; /* IN */
|
||||
uint32_t flags; /* IN */
|
||||
uint32_t __pad1;
|
||||
};
|
||||
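The explicit padding members in this structure pair with the "non-zero value in padding field" check in the ioctl handler. A minimal userspace sketch of that pattern follows. The structure `example_alloc_params`, its `pad0`/`pad1` members, and `validate_padding` are hypothetical names chosen for illustration and are not part of the driver's UAPI.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical parameter block with explicit padding members. */
struct example_alloc_params {
    uint32_t handle;
    uint8_t  block_linear;
    uint8_t  compressible;
    uint16_t pad0;          /* explicit padding, must be zero */
    uint64_t memory_size;
    uint32_t flags;
    uint32_t pad1;          /* explicit padding, must be zero */
};

/* With explicit padding the compiler has no holes left to insert. */
static_assert(sizeof(struct example_alloc_params) == 24,
              "unexpected implicit padding");

/* Reject requests whose padding is non-zero so the bits stay reusable. */
static int validate_padding(const struct example_alloc_params *p)
{
    if (p->pad0 != 0 || p->pad1 != 0) {
        return -EINVAL;
    }
    return 0;
}
```

Rejecting non-zero padding up front is a common way to keep those bytes available for later extensions without breaking existing callers.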
|
||||
struct drm_nvidia_gem_export_dmabuf_memory_params {
|
||||
@@ -299,90 +266,13 @@ struct drm_nvidia_get_connector_id_for_dpy_id_params {
|
||||
uint32_t connectorId; /* OUT */
|
||||
};
|
||||
|
||||
enum drm_nvidia_permissions_type {
|
||||
NV_DRM_PERMISSIONS_TYPE_MODESET = 2,
|
||||
NV_DRM_PERMISSIONS_TYPE_SUB_OWNER = 3
|
||||
};
|
||||
|
||||
struct drm_nvidia_grant_permissions_params {
|
||||
int32_t fd; /* IN */
|
||||
uint32_t dpyId; /* IN */
|
||||
uint32_t type; /* IN */
|
||||
};
|
||||
|
||||
struct drm_nvidia_revoke_permissions_params {
|
||||
uint32_t dpyId; /* IN */
|
||||
uint32_t type; /* IN */
|
||||
};
|
||||
|
||||
struct drm_nvidia_semsurf_fence_ctx_create_params {
|
||||
uint64_t index; /* IN Index of the desired semaphore in the
|
||||
* fence context's semaphore surface */
|
||||
|
||||
/* Params for importing userspace semaphore surface */
|
||||
uint64_t nvkms_params_ptr; /* IN */
|
||||
uint64_t nvkms_params_size; /* IN */
|
||||
|
||||
uint32_t handle; /* OUT GEM handle to fence context */
|
||||
uint32_t __pad;
|
||||
};
|
||||
|
||||
struct drm_nvidia_semsurf_fence_create_params {
|
||||
uint32_t fence_context_handle; /* IN GEM handle to fence context on which
|
||||
* fence is run on */
|
||||
|
||||
uint32_t timeout_value_ms; /* IN Timeout value in ms for the fence
|
||||
* after which the fence will be signaled
|
||||
* with its error status set to -ETIMEDOUT.
|
||||
* Default timeout value is 5000ms */
|
||||
|
||||
uint64_t wait_value; /* IN Semaphore value to reach before signal */
|
||||
|
||||
int32_t fd; /* OUT sync FD object representing the
|
||||
* semaphore at the specified index reaching
|
||||
* a value >= wait_value */
|
||||
uint32_t __pad;
|
||||
};
|
||||
|
||||
/*
|
||||
* Note there is no provision for timeouts in this ioctl. The kernel
|
||||
* documentation asserts timeouts should be handled by fence producers, and
|
||||
* that waiters should not second-guess their logic, as it is producers rather
|
||||
* than consumers that have better information when it comes to determining a
|
||||
* reasonable timeout for a given workload.
|
||||
*/
|
||||
struct drm_nvidia_semsurf_fence_wait_params {
|
||||
uint32_t fence_context_handle; /* IN GEM handle to fence context which will
|
||||
* be used to wait on the sync FD. Need not
|
||||
* be the fence context used to create the
|
||||
* sync FD. */
|
||||
|
||||
int32_t fd; /* IN sync FD object to wait on */
|
||||
|
||||
uint64_t pre_wait_value; /* IN Wait for the semaphore represented by
|
||||
* fence_context to reach this value before
|
||||
* waiting for the sync file. */
|
||||
|
||||
uint64_t post_wait_value; /* IN Signal the semaphore represented by
|
||||
* fence_context to this value after waiting
|
||||
* for the sync file */
|
||||
};
|
||||
|
||||
struct drm_nvidia_semsurf_fence_attach_params {
|
||||
uint32_t handle; /* IN GEM handle of buffer */
|
||||
|
||||
uint32_t fence_context_handle; /* IN GEM handle of fence context */
|
||||
|
||||
uint32_t timeout_value_ms; /* IN Timeout value in ms for the fence
|
||||
* after which the fence will be signaled
|
||||
* with its error status set to -ETIMEDOUT.
|
||||
* Default timeout value is 5000ms */
|
||||
|
||||
uint32_t shared; /* IN If true, fence will reserve shared
|
||||
* access to the buffer, otherwise it will
|
||||
* reserve exclusive access */
|
||||
|
||||
uint64_t wait_value; /* IN Semaphore value to reach before signal */
|
||||
};
|
||||
|
||||
#endif /* _UAPI_NVIDIA_DRM_IOCTL_H_ */
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright (c) 2015-2023, NVIDIA CORPORATION. All rights reserved.
|
||||
* Copyright (c) 2015, NVIDIA CORPORATION. All rights reserved.
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
* copy of this software and associated documentation files (the "Software"),
|
||||
@@ -21,6 +21,8 @@
|
||||
*/
|
||||
|
||||
#include <linux/module.h>
|
||||
#include <linux/slab.h>
|
||||
#include <linux/err.h>
|
||||
|
||||
#include "nvidia-drm-os-interface.h"
|
||||
#include "nvidia-drm.h"
|
||||
@@ -29,18 +31,135 @@
|
||||
|
||||
#if defined(NV_DRM_AVAILABLE)
|
||||
|
||||
#if defined(NV_DRM_DRMP_H_PRESENT)
|
||||
#include <drm/drmP.h>
|
||||
#endif
|
||||
|
||||
#include <linux/vmalloc.h>
|
||||
|
||||
#include "nv-mm.h"
|
||||
|
||||
MODULE_PARM_DESC(
|
||||
modeset,
|
||||
"Enable atomic kernel modesetting (1 = enable, 0 = disable (default))");
|
||||
bool nv_drm_modeset_module_param = false;
|
||||
module_param_named(modeset, nv_drm_modeset_module_param, bool, 0400);
|
||||
|
||||
#if defined(NV_DRM_FBDEV_GENERIC_AVAILABLE)
|
||||
MODULE_PARM_DESC(
|
||||
fbdev,
|
||||
"Create a framebuffer device (1 = enable, 0 = disable (default)) (EXPERIMENTAL)");
|
||||
module_param_named(fbdev, nv_drm_fbdev_module_param, bool, 0400);
|
||||
void *nv_drm_calloc(size_t nmemb, size_t size)
|
||||
{
|
||||
size_t total_size = nmemb * size;
|
||||
//
|
||||
// Check for overflow.
|
||||
//
|
||||
if ((nmemb != 0) && ((total_size / nmemb) != size))
|
||||
{
|
||||
return NULL;
|
||||
}
|
||||
return kzalloc(nmemb * size, GFP_KERNEL);
|
||||
}
|
||||
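The overflow check in `nv_drm_calloc()` relies on unsigned multiplication wrapping around: if `nmemb * size` overflowed, dividing the product by `nmemb` no longer gives back `size`. A userspace sketch of the same idea, assuming `calloc()` as a stand-in for `kzalloc()` and using the hypothetical names `checked_calloc` and `checked_calloc_demo`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Zeroed allocation of nmemb * size bytes, or NULL if the product wraps. */
static void *checked_calloc(size_t nmemb, size_t size)
{
    /* Unsigned overflow is well defined in C: the result wraps. */
    size_t total = nmemb * size;

    /* If the product wrapped, dividing it back does not recover size. */
    if (nmemb != 0 && total / nmemb != size) {
        return NULL;
    }

    return calloc(1, total);
}

/* Small usage example: one normal request and one that overflows. */
static void checked_calloc_demo(void)
{
    int *values = checked_calloc(4, sizeof(*values));

    assert(values != NULL);
    assert(values[0] == 0 && values[3] == 0);
    free(values);

    /* SIZE_MAX * 2 cannot be represented, so the request is refused. */
    assert(checked_calloc(SIZE_MAX, 2) == NULL);
}
```

In kernel code the same guarantee is usually obtained from `kcalloc()`, which performs the overflow check internally.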
|
||||
void nv_drm_free(void *ptr)
|
||||
{
|
||||
if (IS_ERR(ptr)) {
|
||||
return;
|
||||
}
|
||||
|
||||
kfree(ptr);
|
||||
}
|
||||
|
||||
char *nv_drm_asprintf(const char *fmt, ...)
|
||||
{
|
||||
va_list ap;
|
||||
char *p;
|
||||
|
||||
va_start(ap, fmt);
|
||||
p = kvasprintf(GFP_KERNEL, fmt, ap);
|
||||
va_end(ap);
|
||||
|
||||
return p;
|
||||
}
|
||||
|
||||
#if defined(NVCPU_X86) || defined(NVCPU_X86_64)
|
||||
#define WRITE_COMBINE_FLUSH() asm volatile("sfence":::"memory")
|
||||
#elif defined(NVCPU_FAMILY_ARM)
|
||||
#if defined(NVCPU_ARM)
|
||||
#define WRITE_COMBINE_FLUSH() { dsb(); outer_sync(); }
|
||||
#elif defined(NVCPU_AARCH64)
|
||||
#define WRITE_COMBINE_FLUSH() mb()
|
||||
#endif
|
||||
#elif defined(NVCPU_PPC64LE)
|
||||
#define WRITE_COMBINE_FLUSH() asm volatile("sync":::"memory")
|
||||
#endif
|
||||
|
||||
void nv_drm_write_combine_flush(void)
|
||||
{
|
||||
WRITE_COMBINE_FLUSH();
|
||||
}
|
||||
|
||||
int nv_drm_lock_user_pages(unsigned long address,
|
||||
unsigned long pages_count, struct page ***pages)
|
||||
{
|
||||
struct mm_struct *mm = current->mm;
|
||||
struct page **user_pages;
|
||||
int pages_pinned;
|
||||
|
||||
user_pages = nv_drm_calloc(pages_count, sizeof(*user_pages));
|
||||
|
||||
if (user_pages == NULL) {
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
nv_mmap_read_lock(mm);
|
||||
|
||||
pages_pinned = NV_PIN_USER_PAGES(address, pages_count, FOLL_WRITE,
|
||||
user_pages, NULL);
|
||||
nv_mmap_read_unlock(mm);
|
||||
|
||||
if (pages_pinned < 0 || (unsigned)pages_pinned < pages_count) {
|
||||
goto failed;
|
||||
}
|
||||
|
||||
*pages = user_pages;
|
||||
|
||||
return 0;
|
||||
|
||||
failed:
|
||||
|
||||
if (pages_pinned > 0) {
|
||||
int i;
|
||||
|
||||
for (i = 0; i < pages_pinned; i++) {
|
||||
NV_UNPIN_USER_PAGE(user_pages[i]);
|
||||
}
|
||||
}
|
||||
|
||||
nv_drm_free(user_pages);
|
||||
|
||||
return (pages_pinned < 0) ? pages_pinned : -EINVAL;
|
||||
}
|
||||
|
||||
void nv_drm_unlock_user_pages(unsigned long pages_count, struct page **pages)
|
||||
{
|
||||
unsigned long i;
|
||||
|
||||
for (i = 0; i < pages_count; i++) {
|
||||
set_page_dirty_lock(pages[i]);
|
||||
NV_UNPIN_USER_PAGE(pages[i]);
|
||||
}
|
||||
|
||||
nv_drm_free(pages);
|
||||
}
|
||||
|
||||
void *nv_drm_vmap(struct page **pages, unsigned long pages_count)
|
||||
{
|
||||
return vmap(pages, pages_count, VM_USERMAP, PAGE_KERNEL);
|
||||
}
|
||||
|
||||
void nv_drm_vunmap(void *address)
|
||||
{
|
||||
vunmap(address);
|
||||
}
|
||||
|
||||
#endif /* NV_DRM_AVAILABLE */
|
||||
|
||||
/*************************************************************************
|
||||
|
||||
@@ -237,14 +237,6 @@ nv_drm_atomic_apply_modeset_config(struct drm_device *dev,
|
||||
int i;
|
||||
int ret;
|
||||
|
||||
/*
|
||||
* If sub-owner permission was granted to another NVKMS client, disallow
|
||||
* modesets through the DRM interface.
|
||||
*/
|
||||
if (nv_dev->subOwnershipGranted) {
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
memset(requested_config, 0, sizeof(*requested_config));
|
||||
|
||||
/* Loop over affected crtcs and construct NvKmsKapiRequestedModeSetConfig */
|
||||
@@ -282,6 +274,9 @@ nv_drm_atomic_apply_modeset_config(struct drm_device *dev,
|
||||
|
||||
nv_new_crtc_state->nv_flip = NULL;
|
||||
}
|
||||
#if defined(NV_DRM_CRTC_STATE_HAS_VRR_ENABLED)
|
||||
requested_config->headRequestedConfig[nv_crtc->head].modeSetConfig.vrrEnabled = new_crtc_state->vrr_enabled;
|
||||
#endif
|
||||
}
|
||||
}
|
||||
|
||||
@@ -297,9 +292,7 @@ nv_drm_atomic_apply_modeset_config(struct drm_device *dev,
|
||||
requested_config,
|
||||
&reply_config,
|
||||
commit)) {
|
||||
if (commit || reply_config.flipResult != NV_KMS_FLIP_RESULT_IN_PROGRESS) {
|
||||
return -EINVAL;
|
||||
}
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
if (commit && nv_dev->supportsSyncpts) {
|
||||
@@ -321,24 +314,6 @@ int nv_drm_atomic_check(struct drm_device *dev,
|
||||
{
|
||||
int ret = 0;
|
||||
|
||||
#if defined(NV_DRM_COLOR_MGMT_AVAILABLE)
|
||||
struct drm_crtc *crtc;
|
||||
struct drm_crtc_state *crtc_state;
|
||||
int i;
|
||||
|
||||
nv_drm_for_each_crtc_in_state(state, crtc, crtc_state, i) {
|
||||
/*
|
||||
* if the color management changed on the crtc, we need to update the
|
||||
* crtc's plane's CSC matrices, so add the crtc's planes to the commit
|
||||
*/
|
||||
if (crtc_state->color_mgmt_changed) {
|
||||
if ((ret = drm_atomic_add_affected_planes(state, crtc)) != 0) {
|
||||
goto done;
|
||||
}
|
||||
}
|
||||
}
|
||||
#endif /* NV_DRM_COLOR_MGMT_AVAILABLE */
|
||||
|
||||
if ((ret = drm_atomic_helper_check(dev, state)) != 0) {
|
||||
goto done;
|
||||
}
|
||||
@@ -413,56 +388,42 @@ int nv_drm_atomic_commit(struct drm_device *dev,
|
||||
struct nv_drm_device *nv_dev = to_nv_device(dev);
|
||||
|
||||
/*
|
||||
* XXX: drm_mode_config_funcs::atomic_commit() mandates to return -EBUSY
|
||||
* for nonblocking commit if the commit would need to wait for previous
|
||||
* updates (commit tasks/flip event) to complete. In case of blocking
|
||||
* commits it mandates to wait for previous updates to complete. However,
|
||||
* the kernel DRM-KMS documentation does explicitly allow maintaining a
|
||||
* queue of outstanding commits.
|
||||
*
|
||||
* Our system already implements such a queue, but due to
|
||||
* bug 4054608, it is currently not used.
|
||||
* drm_mode_config_funcs::atomic_commit() mandates to return -EBUSY
|
||||
* for nonblocking commit if previous updates (commit tasks/flip event) are
|
||||
* pending. In case of blocking commits it mandates to wait for previous
|
||||
* updates to complete.
|
||||
*/
|
||||
nv_drm_for_each_crtc_in_state(state, crtc, crtc_state, i) {
|
||||
struct nv_drm_crtc *nv_crtc = to_nv_crtc(crtc);
|
||||
if (nonblock) {
|
||||
nv_drm_for_each_crtc_in_state(state, crtc, crtc_state, i) {
|
||||
struct nv_drm_crtc *nv_crtc = to_nv_crtc(crtc);
|
||||
|
||||
/*
|
||||
* Here you aren't required to hold nv_drm_crtc::flip_list_lock
|
||||
* because:
|
||||
*
|
||||
* The core DRM driver acquires lock for all affected crtcs before
|
||||
* calling into ->commit() hook, therefore it is not possible for
|
||||
* other threads to call into ->commit() hook affecting same crtcs
|
||||
* and enqueue flip objects into flip_list -
|
||||
*
|
||||
* nv_drm_atomic_commit_internal()
|
||||
* |-> nv_drm_atomic_apply_modeset_config(commit=true)
|
||||
* |-> nv_drm_crtc_enqueue_flip()
|
||||
*
|
||||
* Only possibility is list_empty check races with code path
|
||||
* dequeuing flip object -
|
||||
*
|
||||
* __nv_drm_handle_flip_event()
|
||||
* |-> nv_drm_crtc_dequeue_flip()
|
||||
*
|
||||
* But this race condition can't lead list_empty() to return
|
||||
* incorrect result. nv_drm_crtc_dequeue_flip() in the middle of
|
||||
* updating the list could not trick us into thinking the list is
|
||||
* empty when it isn't.
|
||||
*/
|
||||
if (nonblock) {
|
||||
/*
|
||||
* Here you aren't required to hold nv_drm_crtc::flip_list_lock
|
||||
* because:
|
||||
*
|
||||
* The core DRM driver acquires lock for all affected crtcs before
|
||||
* calling into ->commit() hook, therefore it is not possible for
|
||||
* other threads to call into ->commit() hook affecting same crtcs
|
||||
* and enqueue flip objects into flip_list -
|
||||
*
|
||||
* nv_drm_atomic_commit_internal()
|
||||
* |-> nv_drm_atomic_apply_modeset_config(commit=true)
|
||||
* |-> nv_drm_crtc_enqueue_flip()
|
||||
*
|
||||
* Only possibility is list_empty check races with code path
|
||||
* dequeuing flip object -
|
||||
*
|
||||
* __nv_drm_handle_flip_event()
|
||||
* |-> nv_drm_crtc_dequeue_flip()
|
||||
*
|
||||
* But this race condition can't lead list_empty() to return
|
||||
* incorrect result. nv_drm_crtc_dequeue_flip() in the middle of
|
||||
* updating the list could not trick us into thinking the list is
|
||||
* empty when it isn't.
|
||||
*/
|
||||
if (!list_empty(&nv_crtc->flip_list)) {
|
||||
return -EBUSY;
|
||||
}
|
||||
} else {
|
||||
if (wait_event_timeout(
|
||||
nv_dev->flip_event_wq,
|
||||
list_empty(&nv_crtc->flip_list),
|
||||
3 * HZ /* 3 second */) == 0) {
|
||||
NV_DRM_DEV_LOG_ERR(
|
||||
nv_dev,
|
||||
"Flip event timeout on head %u", nv_crtc->head);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -506,7 +467,6 @@ int nv_drm_atomic_commit(struct drm_device *dev,
|
||||
|
||||
goto done;
|
||||
}
|
||||
nv_dev->drmMasterChangedSinceLastAtomicCommit = NV_FALSE;
|
||||
|
||||
nv_drm_for_each_crtc_in_state(state, crtc, crtc_state, i) {
|
||||
struct nv_drm_crtc *nv_crtc = to_nv_crtc(crtc);
|
||||
|
||||
@@ -1,285 +0,0 @@
|
||||
/*
 * Copyright (c) 2015-2023, NVIDIA CORPORATION. All rights reserved.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

#include <linux/slab.h>

#include "nvidia-drm-os-interface.h"

#if defined(NV_DRM_AVAILABLE)

#if defined(NV_LINUX_SYNC_FILE_H_PRESENT)
#include <linux/file.h>
#include <linux/sync_file.h>
#endif

#include <linux/vmalloc.h>
#include <linux/sched.h>
#include <linux/device.h>

#include "nv-mm.h"

#if defined(NV_DRM_DRMP_H_PRESENT)
#include <drm/drmP.h>
#endif

bool nv_drm_modeset_module_param = false;
bool nv_drm_fbdev_module_param = false;

void *nv_drm_calloc(size_t nmemb, size_t size)
{
    size_t total_size = nmemb * size;
    //
    // Check for overflow.
    //
    if ((nmemb != 0) && ((total_size / nmemb) != size))
    {
        return NULL;
    }
    return kzalloc(nmemb * size, GFP_KERNEL);
}

void nv_drm_free(void *ptr)
{
    if (IS_ERR(ptr)) {
        return;
    }

    kfree(ptr);
}

char *nv_drm_asprintf(const char *fmt, ...)
{
    va_list ap;
    char *p;

    va_start(ap, fmt);
    p = kvasprintf(GFP_KERNEL, fmt, ap);
    va_end(ap);

    return p;
}

#if defined(NVCPU_X86) || defined(NVCPU_X86_64)
  #define WRITE_COMBINE_FLUSH() asm volatile("sfence":::"memory")
#elif defined(NVCPU_PPC64LE)
  #define WRITE_COMBINE_FLUSH() asm volatile("sync":::"memory")
#else
  #define WRITE_COMBINE_FLUSH() mb()
#endif

void nv_drm_write_combine_flush(void)
{
    WRITE_COMBINE_FLUSH();
}

int nv_drm_lock_user_pages(unsigned long address,
                           unsigned long pages_count, struct page ***pages)
{
    struct mm_struct *mm = current->mm;
    struct page **user_pages;
    int pages_pinned;

    user_pages = nv_drm_calloc(pages_count, sizeof(*user_pages));

    if (user_pages == NULL) {
        return -ENOMEM;
    }

    nv_mmap_read_lock(mm);

    pages_pinned = NV_PIN_USER_PAGES(address, pages_count, FOLL_WRITE,
                                     user_pages);
    nv_mmap_read_unlock(mm);

    if (pages_pinned < 0 || (unsigned)pages_pinned < pages_count) {
        goto failed;
    }

    *pages = user_pages;

    return 0;

failed:

    if (pages_pinned > 0) {
        int i;

        for (i = 0; i < pages_pinned; i++) {
            NV_UNPIN_USER_PAGE(user_pages[i]);
        }
    }

    nv_drm_free(user_pages);

    return (pages_pinned < 0) ? pages_pinned : -EINVAL;
}

void nv_drm_unlock_user_pages(unsigned long pages_count, struct page **pages)
{
    unsigned long i;

    for (i = 0; i < pages_count; i++) {
        set_page_dirty_lock(pages[i]);
        NV_UNPIN_USER_PAGE(pages[i]);
    }

    nv_drm_free(pages);
}

/*
 * linuxkpi vmap doesn't use the flags argument as it
 * doesn't seem to be needed. Define VM_USERMAP to 0
 * to make errors go away
 *
 * vmap: sys/compat/linuxkpi/common/src/linux_compat.c
 */
#if defined(NV_BSD)
#define VM_USERMAP 0
#endif

void *nv_drm_vmap(struct page **pages, unsigned long pages_count)
{
    return vmap(pages, pages_count, VM_USERMAP, PAGE_KERNEL);
}

void nv_drm_vunmap(void *address)
{
    vunmap(address);
}

bool nv_drm_workthread_init(nv_drm_workthread *worker, const char *name)
{
    worker->shutting_down = false;
    if (nv_kthread_q_init(&worker->q, name)) {
        return false;
    }

    spin_lock_init(&worker->lock);

    return true;
}

void nv_drm_workthread_shutdown(nv_drm_workthread *worker)
{
    unsigned long flags;

    spin_lock_irqsave(&worker->lock, flags);
    worker->shutting_down = true;
    spin_unlock_irqrestore(&worker->lock, flags);

    nv_kthread_q_stop(&worker->q);
}

void nv_drm_workthread_work_init(nv_drm_work *work,
                                 void (*callback)(void *),
                                 void *arg)
{
    nv_kthread_q_item_init(work, callback, arg);
}

int nv_drm_workthread_add_work(nv_drm_workthread *worker, nv_drm_work *work)
{
    unsigned long flags;
    int ret = 0;

    spin_lock_irqsave(&worker->lock, flags);
    if (!worker->shutting_down) {
        ret = nv_kthread_q_schedule_q_item(&worker->q, work);
    }
    spin_unlock_irqrestore(&worker->lock, flags);

    return ret;
}

void nv_drm_timer_setup(nv_drm_timer *timer, void (*callback)(nv_drm_timer *nv_drm_timer))
{
    nv_timer_setup(timer, callback);
}

void nv_drm_mod_timer(nv_drm_timer *timer, unsigned long timeout_native)
{
    mod_timer(&timer->kernel_timer, timeout_native);
}

unsigned long nv_drm_timer_now(void)
{
    return jiffies;
}

unsigned long nv_drm_timeout_from_ms(NvU64 relative_timeout_ms)
{
    return jiffies + msecs_to_jiffies(relative_timeout_ms);
}

bool nv_drm_del_timer_sync(nv_drm_timer *timer)
{
    if (del_timer_sync(&timer->kernel_timer)) {
        return true;
    } else {
        return false;
    }
}

#if defined(NV_DRM_FENCE_AVAILABLE)
int nv_drm_create_sync_file(nv_dma_fence_t *fence)
{
#if defined(NV_LINUX_SYNC_FILE_H_PRESENT)
    struct sync_file *sync;
    int fd = get_unused_fd_flags(O_CLOEXEC);

    if (fd < 0) {
        return fd;
    }

    /* sync_file_create() generates its own reference to the fence */
    sync = sync_file_create(fence);

    if (IS_ERR(sync)) {
        put_unused_fd(fd);
        return PTR_ERR(sync);
    }

    fd_install(fd, sync->file);

    return fd;
#else /* defined(NV_LINUX_SYNC_FILE_H_PRESENT) */
    return -EINVAL;
#endif /* defined(NV_LINUX_SYNC_FILE_H_PRESENT) */
}

nv_dma_fence_t *nv_drm_sync_file_get_fence(int fd)
{
#if defined(NV_SYNC_FILE_GET_FENCE_PRESENT)
    return sync_file_get_fence(fd);
#else /* defined(NV_SYNC_FILE_GET_FENCE_PRESENT) */
    return NULL;
#endif /* defined(NV_SYNC_FILE_GET_FENCE_PRESENT) */
}
#endif /* defined(NV_DRM_FENCE_AVAILABLE) */

void nv_drm_yield(void)
{
    set_current_state(TASK_INTERRUPTIBLE);
    schedule_timeout(1);
}

#endif /* NV_DRM_AVAILABLE */
@@ -29,47 +29,10 @@
|
||||
|
||||
#if defined(NV_DRM_AVAILABLE)

#if defined(NV_DRM_FENCE_AVAILABLE)
#include "nvidia-dma-fence-helper.h"
#endif

#if defined(NV_LINUX) || defined(NV_BSD)
#include "nv-kthread-q.h"
#include "linux/spinlock.h"

typedef struct nv_drm_workthread {
    spinlock_t lock;
    struct nv_kthread_q q;
    bool shutting_down;
} nv_drm_workthread;

typedef nv_kthread_q_item_t nv_drm_work;

#else
#error "Need to define deferred work primitives for this OS"
#endif

#if defined(NV_LINUX) || defined(NV_BSD)
#include "nv-timer.h"

typedef struct nv_timer nv_drm_timer;

#else
#error "Need to define kernel timer callback primitives for this OS"
#endif

#if defined(NV_DRM_FBDEV_GENERIC_SETUP_PRESENT) && defined(NV_DRM_APERTURE_REMOVE_CONFLICTING_PCI_FRAMEBUFFERS_PRESENT)
#define NV_DRM_FBDEV_GENERIC_AVAILABLE
#endif

struct page;

/* Set to true when the atomic modeset feature is enabled. */
extern bool nv_drm_modeset_module_param;
#if defined(NV_DRM_FBDEV_GENERIC_AVAILABLE)
/* Set to true when the nvidia-drm driver should install a framebuffer device */
extern bool nv_drm_fbdev_module_param;
#endif

void *nv_drm_calloc(size_t nmemb, size_t size);

@@ -88,37 +51,6 @@ void *nv_drm_vmap(struct page **pages, unsigned long pages_count);
|
||||
|
||||
void nv_drm_vunmap(void *address);

bool nv_drm_workthread_init(nv_drm_workthread *worker, const char *name);

/* Can be called concurrently with nv_drm_workthread_add_work() */
void nv_drm_workthread_shutdown(nv_drm_workthread *worker);

void nv_drm_workthread_work_init(nv_drm_work *work,
                                 void (*callback)(void *),
                                 void *arg);

/* Can be called concurrently with nv_drm_workthread_shutdown() */
int nv_drm_workthread_add_work(nv_drm_workthread *worker, nv_drm_work *work);

void nv_drm_timer_setup(nv_drm_timer *timer,
                        void (*callback)(nv_drm_timer *nv_drm_timer));

void nv_drm_mod_timer(nv_drm_timer *timer, unsigned long relative_timeout_ms);

bool nv_drm_del_timer_sync(nv_drm_timer *timer);

unsigned long nv_drm_timer_now(void);

unsigned long nv_drm_timeout_from_ms(NvU64 relative_timeout_ms);

#if defined(NV_DRM_FENCE_AVAILABLE)
int nv_drm_create_sync_file(nv_dma_fence_t *fence);

nv_dma_fence_t *nv_drm_sync_file_get_fence(int fd);
#endif /* defined(NV_DRM_FENCE_AVAILABLE) */

void nv_drm_yield(void);

#endif /* defined(NV_DRM_AVAILABLE) */
#endif

#endif /* __NVIDIA_DRM_OS_INTERFACE_H__ */

@@ -46,33 +46,12 @@
|
||||
#define NV_DRM_LOG_ERR(__fmt, ...) \
    DRM_ERROR("[nvidia-drm] " __fmt "\n", ##__VA_ARGS__)

/*
 * DRM_WARN() was added in v4.9 by kernel commit
 * 30b0da8d556e65ff935a56cd82c05ba0516d3e4a
 *
 * Before this commit, only DRM_INFO and DRM_ERROR were defined and
 * DRM_INFO(fmt, ...) was defined as
 * printk(KERN_INFO "[" DRM_NAME "] " fmt, ##__VA_ARGS__). So, if
 * DRM_WARN is undefined this defines NV_DRM_LOG_WARN following the
 * same pattern as DRM_INFO.
 */
#ifdef DRM_WARN
#define NV_DRM_LOG_WARN(__fmt, ...) \
    DRM_WARN("[nvidia-drm] " __fmt "\n", ##__VA_ARGS__)
#else
#define NV_DRM_LOG_WARN(__fmt, ...) \
    printk(KERN_WARNING "[" DRM_NAME "] [nvidia-drm] " __fmt "\n", ##__VA_ARGS__)
#endif

#define NV_DRM_LOG_INFO(__fmt, ...) \
    DRM_INFO("[nvidia-drm] " __fmt "\n", ##__VA_ARGS__)

#define NV_DRM_DEV_LOG_INFO(__dev, __fmt, ...) \
    NV_DRM_LOG_INFO("[GPU ID 0x%08x] " __fmt, __dev->gpu_info.gpu_id, ##__VA_ARGS__)

#define NV_DRM_DEV_LOG_WARN(__dev, __fmt, ...) \
    NV_DRM_LOG_WARN("[GPU ID 0x%08x] " __fmt, __dev->gpu_info.gpu_id, ##__VA_ARGS__)

#define NV_DRM_DEV_LOG_ERR(__dev, __fmt, ...) \
    NV_DRM_LOG_ERR("[GPU ID 0x%08x] " __fmt, __dev->gpu_info.gpu_id, ##__VA_ARGS__)

@@ -126,7 +105,6 @@ struct nv_drm_device {
|
||||
NvU64 modifiers[6 /* block linear */ + 1 /* linear */ + 1 /* terminator */];
|
||||
#endif
|
||||
|
||||
struct delayed_work hotplug_event_work;
|
||||
atomic_t enable_event_handling;
|
||||
|
||||
/**
|
||||
@@ -139,26 +117,9 @@ struct nv_drm_device {
|
||||
|
||||
#endif
|
||||
|
||||
#if defined(NV_DRM_FENCE_AVAILABLE)
|
||||
NvU64 semsurf_stride;
|
||||
NvU64 semsurf_max_submitted_offset;
|
||||
#endif
|
||||
|
||||
NvBool hasVideoMemory;
|
||||
|
||||
NvBool supportsSyncpts;
|
||||
NvBool subOwnershipGranted;
|
||||
NvBool hasFramebufferConsole;
|
||||
|
||||
/**
|
||||
* @drmMasterChangedSinceLastAtomicCommit:
|
||||
*
|
||||
* This flag is set in nv_drm_master_set and reset after a completed atomic
|
||||
* commit. It is used to restore or recommit state that is lost by the
|
||||
* NvKms modeset owner change, such as the CRTC color management
|
||||
* properties.
|
||||
*/
|
||||
NvBool drmMasterChangedSinceLastAtomicCommit;
|
||||
|
||||
struct drm_property *nv_out_fence_property;
|
||||
struct drm_property *nv_input_colorspace_property;
|
||||
|
||||
@@ -1,131 +0,0 @@
|
||||
###########################################################################
|
||||
# Kbuild fragment for nvidia-drm.ko
|
||||
###########################################################################
|
||||
|
||||
#
|
||||
# Define NVIDIA_DRM_SOURCES
|
||||
#
|
||||
|
||||
NVIDIA_DRM_SOURCES =
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-drv.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-utils.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-crtc.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-encoder.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-connector.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-gem.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-fb.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-modeset.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-fence.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-helper.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nv-kthread-q.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nv-pci-table.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-gem-nvkms-memory.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-gem-user-memory.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-gem-dma-buf.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-format.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-os-interface.c
|
||||
|
||||
#
|
||||
# Register the conftests needed by nvidia-drm.ko
|
||||
#
|
||||
|
||||
NV_CONFTEST_GENERIC_COMPILE_TESTS += drm_available
|
||||
NV_CONFTEST_GENERIC_COMPILE_TESTS += drm_atomic_available
|
||||
NV_CONFTEST_GENERIC_COMPILE_TESTS += is_export_symbol_gpl_refcount_inc
|
||||
NV_CONFTEST_GENERIC_COMPILE_TESTS += is_export_symbol_gpl_refcount_dec_and_test
|
||||
NV_CONFTEST_GENERIC_COMPILE_TESTS += drm_alpha_blending_available
|
||||
NV_CONFTEST_GENERIC_COMPILE_TESTS += is_export_symbol_present_drm_gem_prime_fd_to_handle
|
||||
NV_CONFTEST_GENERIC_COMPILE_TESTS += is_export_symbol_present_drm_gem_prime_handle_to_fd
|
||||
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_dev_unref
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_reinit_primary_mode_group
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += get_user_pages_remote
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += get_user_pages
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += pin_user_pages_remote
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += pin_user_pages
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_gem_object_lookup
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_atomic_state_ref_counting
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_driver_has_gem_prime_res_obj
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_atomic_helper_connector_dpms
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_connector_funcs_have_mode_in_name
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_connector_has_vrr_capable_property
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += vmf_insert_pfn
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_framebuffer_get
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_gem_object_get
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_dev_put
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_format_num_planes
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_connector_for_each_possible_encoder
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_rotation_available
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_vma_offset_exact_lookup_locked
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_gem_object_put_unlocked
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += nvhost_dma_fence_unpack
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += list_is_first
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += timer_setup
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += dma_fence_set_error
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += fence_set_error
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += sync_file_get_fence
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_aperture_remove_conflicting_pci_framebuffers
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_fbdev_generic_setup
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_connector_attach_hdr_output_metadata_property
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_helper_crtc_enable_color_mgmt
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_crtc_enable_color_mgmt
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_atomic_helper_legacy_gamma_set
|
||||
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_bus_present
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_bus_has_bus_type
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_bus_has_get_irq
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_bus_has_get_name
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_device_list
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_legacy_dev_list
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_set_busid
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_connectors_changed
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_init_function_args
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_helper_mode_fill_fb_struct
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_master_drop_has_from_release_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_unload_has_int_return_type
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += vm_fault_has_address
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += vm_ops_fault_removed_vma_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_atomic_helper_crtc_destroy_state_has_crtc_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_atomic_helper_plane_destroy_state_has_plane_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_mode_object_find_has_file_priv_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += dma_buf_owner
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_connector_list_iter
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_atomic_helper_swap_state_has_stall_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_prime_flag_present
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += vm_fault_t
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_gem_object_has_resv
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_async_flip
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_pageflip_flags
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_vrr_enabled
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_format_modifiers_present
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += mm_has_mmap_lock
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_vma_node_is_allowed_has_tag_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_vma_offset_node_has_readonly
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_display_mode_has_vrefresh
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_master_set_has_int_return_type
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_gem_free_object
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_prime_pages_to_sg_has_drm_device_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_gem_prime_callbacks
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_atomic_check_has_atomic_state_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_gem_object_vmap_has_map_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_plane_atomic_check_has_atomic_state_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_device_has_pdev
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_no_vblank
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_mode_config_has_allow_fb_modifiers
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_has_hdr_output_metadata
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += dma_resv_add_fence
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += dma_resv_reserve_fences
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += reservation_object_reserve_shared_has_num_fences_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_connector_has_override_edid
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_master_has_leases
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_file_get_master
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_modeset_lock_all_end
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_connector_lookup
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_connector_put
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += vm_area_struct_has_const_vm_flags
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_dumb_destroy
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += fence_ops_use_64bit_seqno
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_aperture_remove_conflicting_pci_framebuffers_has_driver_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_mode_create_dp_colorspace_property_has_supported_colorspaces_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_unlocked_ioctl_flag_present
|
||||
@@ -2,16 +2,29 @@
|
||||
# Kbuild fragment for nvidia-drm.ko
|
||||
###########################################################################
|
||||
|
||||
# Get our source file list and conftest list from the common file
|
||||
include $(src)/nvidia-drm/nvidia-drm-sources.mk
|
||||
|
||||
# Linux-specific sources
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-linux.c
|
||||
|
||||
#
|
||||
# Define NVIDIA_DRM_{SOURCES,OBJECTS}
|
||||
#
|
||||
|
||||
NVIDIA_DRM_SOURCES =
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-drv.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-utils.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-crtc.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-encoder.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-connector.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-gem.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-fb.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-modeset.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-fence.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-linux.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-helper.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nv-pci-table.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-gem-nvkms-memory.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-gem-user-memory.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-gem-dma-buf.c
|
||||
NVIDIA_DRM_SOURCES += nvidia-drm/nvidia-drm-format.c
|
||||
|
||||
NVIDIA_DRM_OBJECTS = $(patsubst %.c,%.o,$(NVIDIA_DRM_SOURCES))
|
||||
|
||||
obj-m += nvidia-drm.o
|
||||
@@ -30,4 +43,94 @@ NVIDIA_DRM_CFLAGS += -UDEBUG -U_DEBUG -DNDEBUG -DNV_BUILD_MODULE_INSTANCES=0
|
||||
|
||||
$(call ASSIGN_PER_OBJ_CFLAGS, $(NVIDIA_DRM_OBJECTS), $(NVIDIA_DRM_CFLAGS))
|
||||
|
||||
#
|
||||
# Register the conftests needed by nvidia-drm.ko
|
||||
#
|
||||
|
||||
NV_OBJECTS_DEPEND_ON_CONFTEST += $(NVIDIA_DRM_OBJECTS)
|
||||
|
||||
NV_CONFTEST_GENERIC_COMPILE_TESTS += drm_available
|
||||
NV_CONFTEST_GENERIC_COMPILE_TESTS += drm_atomic_available
|
||||
NV_CONFTEST_GENERIC_COMPILE_TESTS += is_export_symbol_gpl_refcount_inc
|
||||
NV_CONFTEST_GENERIC_COMPILE_TESTS += is_export_symbol_gpl_refcount_dec_and_test
|
||||
NV_CONFTEST_GENERIC_COMPILE_TESTS += drm_alpha_blending_available
|
||||
NV_CONFTEST_GENERIC_COMPILE_TESTS += is_export_symbol_present_drm_gem_prime_fd_to_handle
|
||||
NV_CONFTEST_GENERIC_COMPILE_TESTS += is_export_symbol_present_drm_gem_prime_handle_to_fd
|
||||
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_dev_unref
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_reinit_primary_mode_group
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += get_user_pages_remote
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += get_user_pages
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += pin_user_pages_remote
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += pin_user_pages
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_gem_object_lookup
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_atomic_state_ref_counting
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_driver_has_gem_prime_res_obj
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_atomic_helper_connector_dpms
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_connector_funcs_have_mode_in_name
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_connector_has_vrr_capable_property
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += vmf_insert_pfn
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_framebuffer_get
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_gem_object_get
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_dev_put
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_format_num_planes
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_connector_for_each_possible_encoder
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_rotation_available
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_vma_offset_exact_lookup_locked
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += drm_gem_object_put_unlocked
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += nvhost_dma_fence_unpack
|
||||
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_bus_present
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_bus_has_bus_type
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_bus_has_get_irq
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_bus_has_get_name
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_device_list
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_legacy_dev_list
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_set_busid
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_connectors_changed
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_init_function_args
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_helper_mode_fill_fb_struct
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_master_drop_has_from_release_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_unload_has_int_return_type
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += vm_fault_has_address
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += vm_ops_fault_removed_vma_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_atomic_helper_crtc_destroy_state_has_crtc_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_atomic_helper_plane_destroy_state_has_plane_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_mode_object_find_has_file_priv_arg
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += dma_buf_owner
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_connector_list_iter
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_atomic_helper_swap_state_has_stall_arg
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_prime_flag_present
NV_CONFTEST_TYPE_COMPILE_TESTS += vm_fault_t
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_gem_object_has_resv
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_async_flip
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_pageflip_flags
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_vrr_enabled
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_format_modifiers_present
NV_CONFTEST_TYPE_COMPILE_TESTS += mm_has_mmap_lock
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_vma_node_is_allowed_has_tag_arg
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_vma_offset_node_has_readonly
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_display_mode_has_vrefresh
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_master_set_has_int_return_type
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_gem_free_object
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_prime_pages_to_sg_has_drm_device_arg
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_gem_prime_callbacks
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_atomic_check_has_atomic_state_arg
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_gem_object_vmap_has_map_arg
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_plane_atomic_check_has_atomic_state_arg
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_device_has_pdev
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_crtc_state_has_no_vblank
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_mode_config_has_allow_fb_modifiers
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_has_hdr_output_metadata
NV_CONFTEST_TYPE_COMPILE_TESTS += dma_resv_add_fence
NV_CONFTEST_TYPE_COMPILE_TESTS += dma_resv_reserve_fences
NV_CONFTEST_TYPE_COMPILE_TESTS += reservation_object_reserve_shared_has_num_fences_arg
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_connector_has_override_edid
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_master_has_leases
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_file_get_master
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_modeset_lock_all_end
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_connector_lookup
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_connector_put
NV_CONFTEST_TYPE_COMPILE_TESTS += vm_area_struct_has_const_vm_flags
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_driver_has_dumb_destroy
NV_CONFTEST_TYPE_COMPILE_TESTS += drm_unlocked_ioctl_flag_present

@@ -45,7 +45,6 @@ int nv_drm_init(void)
return -EINVAL;
}

nvKms->setSuspendResumeCallback(nv_drm_suspend_resume);
return nv_drm_probe_devices();
#else
return 0;
@@ -55,7 +54,6 @@ int nv_drm_init(void)
void nv_drm_exit(void)
{
#if defined(NV_DRM_AVAILABLE)
nvKms->setSuspendResumeCallback(NULL);
nv_drm_remove_devices();
#endif
}

@@ -201,7 +201,7 @@ static struct task_struct *thread_create_on_node(int (*threadfn)(void *data),
|
||||
|
||||
// Ran out of attempts - return thread even if its stack may not be
|
||||
// allocated on the preferred node
|
||||
if (i == (attempts - 1))
|
||||
if ((i == (attempts - 1)))
|
||||
break;
|
||||
|
||||
// Get the NUMA node where the first page of the stack is resident. If
|
||||
@@ -247,11 +247,6 @@ int nv_kthread_q_init_on_node(nv_kthread_q_t *q, const char *q_name, int preferr
|
||||
return 0;
|
||||
}
|
||||
|
||||
int nv_kthread_q_init(nv_kthread_q_t *q, const char *qname)
|
||||
{
|
||||
return nv_kthread_q_init_on_node(q, qname, NV_KTHREAD_NO_NODE);
|
||||
}
|
||||
|
||||
// Returns true (non-zero) if the item was actually scheduled, and false if the
|
||||
// item was already pending in a queue.
|
||||
static int _raw_q_schedule(nv_kthread_q_t *q, nv_kthread_q_item_t *q_item)
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*
|
||||
* SPDX-FileCopyrightText: Copyright (c) 2015-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-FileCopyrightText: Copyright (c) 2015-21 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
* SPDX-License-Identifier: MIT
|
||||
*
|
||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||
@@ -35,13 +35,12 @@
|
||||
#include <linux/list.h>
|
||||
#include <linux/rwsem.h>
|
||||
#include <linux/freezer.h>
|
||||
#include <linux/poll.h>
|
||||
#include <linux/cdev.h>
|
||||
|
||||
#include <acpi/video.h>
|
||||
|
||||
#include "nvstatus.h"
|
||||
|
||||
#include "nv-register-module.h"
|
||||
#include "nv-modeset-interface.h"
|
||||
#include "nv-kref.h"
|
||||
|
||||
@@ -54,7 +53,6 @@
|
||||
#include "nv-kthread-q.h"
|
||||
#include "nv-time.h"
|
||||
#include "nv-lock.h"
|
||||
#include "nv-chardev-numbers.h"
|
||||
|
||||
/*
|
||||
* Commit aefb2f2e619b ("x86/bugs: Rename CONFIG_RETPOLINE =>
|
||||
@@ -71,18 +69,9 @@
static bool output_rounding_fix = true;
module_param_named(output_rounding_fix, output_rounding_fix, bool, 0400);

static bool disable_hdmi_frl = false;
module_param_named(disable_hdmi_frl, disable_hdmi_frl, bool, 0400);

static bool disable_vrr_memclk_switch = false;
module_param_named(disable_vrr_memclk_switch, disable_vrr_memclk_switch, bool, 0400);

static bool hdmi_deepcolor = false;
module_param_named(hdmi_deepcolor, hdmi_deepcolor, bool, 0400);

static bool vblank_sem_control = false;
module_param_named(vblank_sem_control, vblank_sem_control, bool, 0400);

static bool opportunistic_display_sync = true;
module_param_named(opportunistic_display_sync, opportunistic_display_sync, bool, 0400);

@@ -96,7 +85,6 @@ MODULE_PARM_DESC(malloc_verbose, "Report information about malloc calls on modul
static bool malloc_verbose = false;
module_param_named(malloc_verbose, malloc_verbose, bool, 0400);

#if NVKMS_CONFIG_FILE_SUPPORTED
/* This parameter is used to find the dpy override conf file */
#define NVKMS_CONF_FILE_SPECIFIED (nvkms_conf != NULL)

@@ -105,7 +93,6 @@ MODULE_PARM_DESC(config_file,
"(default: disabled)");
static char *nvkms_conf = NULL;
module_param_named(config_file, nvkms_conf, charp, 0400);
#endif

static atomic_t nvkms_alloc_called_count;

@@ -114,26 +101,11 @@ NvBool nvkms_output_rounding_fix(void)
|
||||
return output_rounding_fix;
|
||||
}
|
||||
|
||||
NvBool nvkms_disable_hdmi_frl(void)
|
||||
{
|
||||
return disable_hdmi_frl;
|
||||
}
|
||||
|
||||
NvBool nvkms_disable_vrr_memclk_switch(void)
|
||||
{
|
||||
return disable_vrr_memclk_switch;
|
||||
}
|
||||
|
||||
NvBool nvkms_hdmi_deepcolor(void)
|
||||
{
|
||||
return hdmi_deepcolor;
|
||||
}
|
||||
|
||||
NvBool nvkms_vblank_sem_control(void)
|
||||
{
|
||||
return vblank_sem_control;
|
||||
}
|
||||
|
||||
NvBool nvkms_opportunistic_display_sync(void)
|
||||
{
|
||||
return opportunistic_display_sync;
|
||||
@@ -389,7 +361,7 @@ NvU64 nvkms_get_usec(void)
|
||||
struct timespec64 ts;
|
||||
NvU64 ns;
|
||||
|
||||
ktime_get_raw_ts64(&ts);
|
||||
ktime_get_real_ts64(&ts);
|
||||
|
||||
ns = timespec64_to_ns(&ts);
|
||||
return ns / 1000;
|
||||
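Note on the hunk above: the two sides differ only in which kernel clock helper fills `ts` before the nanosecond-to-microsecond division. As a rough user-space sketch (not driver code), assuming POSIX `clock_gettime()` with `CLOCK_MONOTONIC_RAW` and `CLOCK_REALTIME` as stand-ins for `ktime_get_raw_ts64()` and `ktime_get_real_ts64()`, and with the hypothetical helper name `demo_get_usec`:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper, not part of the driver: read the given POSIX clock
 * and convert the result to microseconds, mirroring "ns / 1000" above.
 * Returns 0 if the clock cannot be read.
 */
uint64_t demo_get_usec(clockid_t clock_id)
{
    struct timespec ts;
    uint64_t ns;

    if (clock_gettime(clock_id, &ts) != 0)
        return 0;

    /* tv_nsec is always below one second for a valid timespec. */
    assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);

    ns = (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
    return ns / 1000;
}
```

`CLOCK_REALTIME` follows wall-clock time and can jump when the system time is adjusted, whereas `CLOCK_MONOTONIC_RAW` does not, so the difference between two readings is only dependable on the latter.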
@@ -503,8 +475,6 @@ nvkms_event_queue_changed(nvkms_per_open_handle_t *pOpenKernel,
|
||||
|
||||
static void nvkms_suspend(NvU32 gpuId)
|
||||
{
|
||||
nvKmsKapiSuspendResume(NV_TRUE /* suspend */);
|
||||
|
||||
if (gpuId == 0) {
|
||||
nvkms_write_lock_pm_lock();
|
||||
}
|
||||
@@ -523,8 +493,6 @@ static void nvkms_resume(NvU32 gpuId)
|
||||
if (gpuId == 0) {
|
||||
nvkms_write_unlock_pm_lock();
|
||||
}
|
||||
|
||||
nvKmsKapiSuspendResume(NV_FALSE /* suspend */);
|
||||
}
|
||||
|
||||
|
||||
@@ -853,6 +821,49 @@ void nvkms_free_timer(nvkms_timer_handle_t *handle)
|
||||
timer->cancel = NV_TRUE;
|
||||
}
|
||||
|
||||
void* nvkms_get_per_open_data(int fd)
|
||||
{
|
||||
struct file *filp = fget(fd);
|
||||
struct nvkms_per_open *popen = NULL;
|
||||
dev_t rdev = 0;
|
||||
void *data = NULL;
|
||||
|
||||
if (filp == NULL) {
|
||||
return NULL;
|
||||
}
|
||||
|
||||
if (filp->f_inode == NULL) {
|
||||
goto done;
|
||||
}
|
||||
rdev = filp->f_inode->i_rdev;
|
||||
|
||||
if ((MAJOR(rdev) != NVKMS_MAJOR_DEVICE_NUMBER) ||
|
||||
(MINOR(rdev) != NVKMS_MINOR_DEVICE_NUMBER)) {
|
||||
goto done;
|
||||
}
|
||||
|
||||
popen = filp->private_data;
|
||||
if (popen == NULL) {
|
||||
goto done;
|
||||
}
|
||||
|
||||
data = popen->data;
|
||||
|
||||
done:
|
||||
/*
|
||||
* fget() incremented the struct file's reference count, which
|
||||
* needs to be balanced with a call to fput(). It is safe to
|
||||
* decrement the reference count before returning
|
||||
* filp->private_data because core NVKMS is currently holding the
|
||||
* nvkms_lock, which prevents the nvkms_close() => nvKmsClose()
|
||||
* call chain from freeing the file out from under the caller of
|
||||
* nvkms_get_per_open_data().
|
||||
*/
|
||||
fput(filp);
|
||||
|
||||
return data;
|
||||
}
|
||||
|
||||
NvBool nvkms_fd_is_nvidia_chardev(int fd)
|
||||
{
|
||||
struct file *filp = fget(fd);
|
||||
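The longer variant of `nvkms_get_per_open_data()` in this hunk takes a reference with `fget()`, compares the inode's device numbers against the expected major and minor, and releases the reference with `fput()` on every exit path through the `done:` label. A user-space analogue of that shape on Linux, offered only as a sketch: `open()`/`close()` stand in for `fget()`/`fput()`, `fstat()` supplies `st_rdev`, and the function name `demo_path_is_chardev` is hypothetical.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if `path` is a character device with the
 * given major/minor numbers, 0 otherwise. Every path after a successful
 * open() leaves through `done`, so the descriptor is always closed.
 */
int demo_path_is_chardev(const char *path, unsigned int want_major,
                         unsigned int want_minor)
{
    struct stat st;
    int result = 0;
    int fd = open(path, O_RDONLY | O_NONBLOCK);

    if (fd < 0)
        return 0;               /* nothing acquired, nothing to release */

    if (fstat(fd, &st) != 0)
        goto done;

    if (!S_ISCHR(st.st_mode))
        goto done;

    if (major(st.st_rdev) != want_major || minor(st.st_rdev) != want_minor)
        goto done;

    result = 1;

done:
    close(fd);                  /* balances the open() above */
    return result;
}
```

For example, `demo_path_is_chardev("/dev/null", 1, 3)` checks for the conventional Linux null device. The single `done:` label keeps acquire and release paired no matter which check rejects the file.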
@@ -1397,7 +1408,6 @@ static void nvkms_proc_exit(void)
|
||||
/*************************************************************************
|
||||
* NVKMS Config File Read
|
||||
************************************************************************/
|
||||
#if NVKMS_CONFIG_FILE_SUPPORTED
|
||||
static NvBool nvkms_fs_mounted(void)
|
||||
{
|
||||
return current->fs != NULL;
|
||||
@@ -1505,11 +1515,6 @@ static void nvkms_read_config_file_locked(void)
|
||||
|
||||
nvkms_free(buffer, buf_size);
|
||||
}
|
||||
#else
|
||||
static void nvkms_read_config_file_locked(void)
|
||||
{
|
||||
}
|
||||
#endif
|
||||
|
||||
/*************************************************************************
|
||||
* NVKMS KAPI functions
|
||||
@@ -1604,12 +1609,6 @@ static int nvkms_ioctl(struct inode *inode, struct file *filp,
|
||||
return status;
|
||||
}
|
||||
|
||||
static long nvkms_unlocked_ioctl(struct file *filp, unsigned int cmd,
|
||||
unsigned long arg)
|
||||
{
|
||||
return nvkms_ioctl(filp->f_inode, filp, cmd, arg);
|
||||
}
|
||||
|
||||
static unsigned int nvkms_poll(struct file *filp, poll_table *wait)
|
||||
{
|
||||
unsigned int mask = 0;
|
||||
@@ -1637,73 +1636,17 @@ static unsigned int nvkms_poll(struct file *filp, poll_table *wait)
|
||||
* Module loading support code.
|
||||
*************************************************************************/
|
||||
|
||||
#define NVKMS_RDEV (MKDEV(NV_MAJOR_DEVICE_NUMBER, \
|
||||
NV_MINOR_DEVICE_NUMBER_MODESET_DEVICE))
|
||||
|
||||
static struct file_operations nvkms_fops = {
|
||||
static nvidia_module_t nvidia_modeset_module = {
|
||||
.owner = THIS_MODULE,
|
||||
.poll = nvkms_poll,
|
||||
.unlocked_ioctl = nvkms_unlocked_ioctl,
|
||||
#if NVCPU_IS_X86_64 || NVCPU_IS_AARCH64
|
||||
.compat_ioctl = nvkms_unlocked_ioctl,
|
||||
#endif
|
||||
.mmap = nvkms_mmap,
|
||||
.module_name = "nvidia-modeset",
|
||||
.instance = 1, /* minor number: 255-1=254 */
|
||||
.open = nvkms_open,
|
||||
.release = nvkms_close,
|
||||
.close = nvkms_close,
|
||||
.mmap = nvkms_mmap,
|
||||
.ioctl = nvkms_ioctl,
|
||||
.poll = nvkms_poll,
|
||||
};
|
||||
|
||||
static struct cdev nvkms_device_cdev;
|
||||
|
||||
static int __init nvkms_register_chrdev(void)
|
||||
{
|
||||
int ret;
|
||||
|
||||
ret = register_chrdev_region(NVKMS_RDEV, 1, "nvidia-modeset");
|
||||
if (ret < 0) {
|
||||
return ret;
|
||||
}
|
||||
|
||||
cdev_init(&nvkms_device_cdev, &nvkms_fops);
|
||||
ret = cdev_add(&nvkms_device_cdev, NVKMS_RDEV, 1);
|
||||
if (ret < 0) {
|
||||
unregister_chrdev_region(NVKMS_RDEV, 1);
|
||||
return ret;
|
||||
}
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void nvkms_unregister_chrdev(void)
|
||||
{
|
||||
cdev_del(&nvkms_device_cdev);
|
||||
unregister_chrdev_region(NVKMS_RDEV, 1);
|
||||
}
|
||||
|
||||
void* nvkms_get_per_open_data(int fd)
|
||||
{
|
||||
struct file *filp = fget(fd);
|
||||
void *data = NULL;
|
||||
|
||||
if (filp) {
|
||||
if (filp->f_op == &nvkms_fops && filp->private_data) {
|
||||
struct nvkms_per_open *popen = filp->private_data;
|
||||
data = popen->data;
|
||||
}
|
||||
|
||||
/*
|
||||
* fget() incremented the struct file's reference count, which needs to
|
||||
* be balanced with a call to fput(). It is safe to decrement the
|
||||
* reference count before returning filp->private_data because core
|
||||
* NVKMS is currently holding the nvkms_lock, which prevents the
|
||||
* nvkms_close() => nvKmsClose() call chain from freeing the file out
|
||||
* from under the caller of nvkms_get_per_open_data().
|
||||
*/
|
||||
fput(filp);
|
||||
}
|
||||
|
||||
return data;
|
||||
}
|
||||
|
||||
static int __init nvkms_init(void)
|
||||
{
|
||||
int ret;
|
||||
@@ -1734,9 +1677,10 @@ static int __init nvkms_init(void)
|
||||
INIT_LIST_HEAD(&nvkms_timers.list);
|
||||
spin_lock_init(&nvkms_timers.lock);
|
||||
|
||||
ret = nvkms_register_chrdev();
|
||||
ret = nvidia_register_module(&nvidia_modeset_module);
|
||||
|
||||
if (ret != 0) {
|
||||
goto fail_register_chrdev;
|
||||
goto fail_register_module;
|
||||
}
|
||||
|
||||
down(&nvkms_lock);
|
||||
@@ -1755,8 +1699,8 @@ static int __init nvkms_init(void)
|
||||
return 0;
|
||||
|
||||
fail_module_load:
|
||||
nvkms_unregister_chrdev();
|
||||
fail_register_chrdev:
|
||||
nvidia_unregister_module(&nvidia_modeset_module);
|
||||
fail_register_module:
|
||||
nv_kthread_q_stop(&nvkms_deferred_close_kthread_q);
|
||||
fail_deferred_close_kthread:
|
||||
nv_kthread_q_stop(&nvkms_kthread_q);
|
||||
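The failure labels in `nvkms_init()` above follow the usual kernel unwinding idiom: each label is named after the step that failed and falls through to undo every earlier step in reverse order. A self-contained sketch of that control flow, with hypothetical `demo_*` names and a `fail_at` parameter that exists only to force a failure:

```c
#include <stdbool.h>

/* Hypothetical bookkeeping: which of the three demo steps are active. */
static bool demo_started[3];

static int demo_step_start(int idx, bool fail)
{
    if (fail)
        return -1;
    demo_started[idx] = true;
    return 0;
}

static void demo_step_stop(int idx)
{
    demo_started[idx] = false;
}

/* Number of steps currently active; 0 after a fully unwound failure. */
int demo_started_count(void)
{
    return demo_started[0] + demo_started[1] + demo_started[2];
}

/*
 * Start three steps in order. `fail_at` selects the step that fails
 * (pass -1 for success).
 */
int demo_init(int fail_at)
{
    int ret;

    ret = demo_step_start(0, fail_at == 0);
    if (ret != 0)
        goto fail_step0;

    ret = demo_step_start(1, fail_at == 1);
    if (ret != 0)
        goto fail_step1;

    ret = demo_step_start(2, fail_at == 2);
    if (ret != 0)
        goto fail_step2;

    return 0;

fail_step2:
    demo_step_stop(1);          /* undo step 1, then fall through */
fail_step1:
    demo_step_stop(0);          /* undo step 0 */
fail_step0:
    return ret;
}
```

Because a label undoes only the steps that completed before the one it is named after, renaming a step (as the registration step is renamed between the two sides of this hunk) means renaming its label and its undo call together.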
@@ -1820,7 +1764,7 @@ restart:
|
||||
nv_kthread_q_stop(&nvkms_deferred_close_kthread_q);
|
||||
nv_kthread_q_stop(&nvkms_kthread_q);
|
||||
|
||||
nvkms_unregister_chrdev();
|
||||
nvidia_unregister_module(&nvidia_modeset_module);
|
||||
nvkms_free_rm();
|
||||
|
||||
if (malloc_verbose) {
|
||||
|
||||
@@ -97,10 +97,8 @@ typedef struct {
} NvKmsSyncPtOpParams;

NvBool nvkms_output_rounding_fix(void);
NvBool nvkms_disable_hdmi_frl(void);

NvBool nvkms_disable_vrr_memclk_switch(void);
NvBool nvkms_hdmi_deepcolor(void);
NvBool nvkms_vblank_sem_control(void);
NvBool nvkms_opportunistic_display_sync(void);

void nvkms_call_rm (void *ops);

@@ -58,18 +58,6 @@ nvidia-modeset-y += $(NVIDIA_MODESET_BINARY_OBJECT_O)
NVIDIA_MODESET_CFLAGS += -I$(src)/nvidia-modeset
NVIDIA_MODESET_CFLAGS += -UDEBUG -U_DEBUG -DNDEBUG -DNV_BUILD_MODULE_INSTANCES=0

# Some Android kernels prohibit driver use of filesystem functions like
# filp_open() and kernel_read(). Disable the NVKMS_CONFIG_FILE_SUPPORTED
# functionality that uses those functions when building for Android.

PLATFORM_IS_ANDROID ?= 0

ifeq ($(PLATFORM_IS_ANDROID),1)
NVIDIA_MODESET_CFLAGS += -DNVKMS_CONFIG_FILE_SUPPORTED=0
else
NVIDIA_MODESET_CFLAGS += -DNVKMS_CONFIG_FILE_SUPPORTED=1
endif

$(call ASSIGN_PER_OBJ_CFLAGS, $(NVIDIA_MODESET_OBJECTS), $(NVIDIA_MODESET_CFLAGS))



@@ -66,8 +66,6 @@ enum NvKmsClientType {
NVKMS_CLIENT_KERNEL_SPACE,
};

struct NvKmsPerOpenDev;

NvBool nvKmsIoctl(
void *pOpenVoid,
NvU32 cmd,
@@ -103,11 +101,7 @@ NvBool nvKmsKapiGetFunctionsTableInternal
struct NvKmsKapiFunctionsTable *funcsTable
);

void nvKmsKapiSuspendResume(NvBool suspend);

NvBool nvKmsGetBacklight(NvU32 display_id, void *drv_priv, NvU32 *brightness);
NvBool nvKmsSetBacklight(NvU32 display_id, void *drv_priv, NvU32 brightness);

NvBool nvKmsOpenDevHasSubOwnerPermissionOrBetter(const struct NvKmsPerOpenDev *pOpenDev);

#endif /* __NV_KMS_H__ */

@@ -283,8 +283,8 @@ static int nv_dma_map(struct sg_table *sg_head, void *context,
|
||||
nv_mem_context->sg_allocated = 1;
|
||||
for_each_sg(sg_head->sgl, sg, nv_mem_context->npages, i) {
|
||||
sg_set_page(sg, NULL, nv_mem_context->page_size, 0);
|
||||
sg_dma_address(sg) = dma_mapping->dma_addresses[i];
|
||||
sg_dma_len(sg) = nv_mem_context->page_size;
|
||||
sg->dma_address = dma_mapping->dma_addresses[i];
|
||||
sg->dma_length = nv_mem_context->page_size;
|
||||
}
|
||||
nv_mem_context->sg_head = *sg_head;
|
||||
*nmap = nv_mem_context->npages;
|
||||
@@ -338,13 +338,8 @@ static void nv_mem_put_pages_common(int nc,
|
||||
return;
|
||||
|
||||
if (nc) {
|
||||
#ifdef NVIDIA_P2P_CAP_GET_PAGES_PERSISTENT_API
|
||||
ret = nvidia_p2p_put_pages_persistent(nv_mem_context->page_virt_start,
|
||||
nv_mem_context->page_table, 0);
|
||||
#else
|
||||
ret = nvidia_p2p_put_pages(0, 0, nv_mem_context->page_virt_start,
|
||||
nv_mem_context->page_table);
|
||||
#endif
|
||||
} else {
|
||||
ret = nvidia_p2p_put_pages(0, 0, nv_mem_context->page_virt_start,
|
||||
nv_mem_context->page_table);
|
||||
@@ -452,15 +447,9 @@ static int nv_mem_get_pages_nc(unsigned long addr,
|
||||
nv_mem_context->core_context = core_context;
|
||||
nv_mem_context->page_size = GPU_PAGE_SIZE;
|
||||
|
||||
#ifdef NVIDIA_P2P_CAP_GET_PAGES_PERSISTENT_API
|
||||
ret = nvidia_p2p_get_pages_persistent(nv_mem_context->page_virt_start,
|
||||
nv_mem_context->mapped_size,
|
||||
&nv_mem_context->page_table, 0);
|
||||
#else
|
||||
ret = nvidia_p2p_get_pages(0, 0, nv_mem_context->page_virt_start, nv_mem_context->mapped_size,
|
||||
&nv_mem_context->page_table, NULL, NULL);
|
||||
#endif
|
||||
|
||||
if (ret < 0) {
|
||||
peer_err("error %d while calling nvidia_p2p_get_pages() with NULL callback\n", ret);
|
||||
return ret;
|
||||
@@ -505,6 +494,8 @@ static int __init nv_mem_client_init(void)
|
||||
}
|
||||
|
||||
#if defined (NV_MLNX_IB_PEER_MEM_SYMBOLS_PRESENT)
|
||||
int status = 0;
|
||||
|
||||
// off by one, to leave space for the trailing '1' which is flagging
|
||||
// the new client type
|
||||
BUG_ON(strlen(DRV_NAME) > IB_PEER_MEMORY_NAME_MAX-1);
|
||||
@@ -533,7 +524,7 @@ static int __init nv_mem_client_init(void)
|
||||
&mem_invalidate_callback);
|
||||
if (!reg_handle) {
|
||||
peer_err("nv_mem_client_init -- error while registering traditional client\n");
|
||||
rc = -EINVAL;
|
||||
status = -EINVAL;
|
||||
goto out;
|
||||
}
|
||||
|
||||
@@ -543,12 +534,12 @@ static int __init nv_mem_client_init(void)
|
||||
reg_handle_nc = ib_register_peer_memory_client(&nv_mem_client_nc, NULL);
|
||||
if (!reg_handle_nc) {
|
||||
peer_err("nv_mem_client_init -- error while registering nc client\n");
|
||||
rc = -EINVAL;
|
||||
status = -EINVAL;
|
||||
goto out;
|
||||
}
|
||||
|
||||
out:
|
||||
if (rc) {
|
||||
if (status) {
|
||||
if (reg_handle) {
|
||||
ib_unregister_peer_memory_client(reg_handle);
|
||||
reg_handle = NULL;
|
||||
@@ -560,7 +551,7 @@ out:
|
||||
}
|
||||
}
|
||||
|
||||
return rc;
|
||||
return status;
|
||||
#else
|
||||
return -EINVAL;
|
||||
#endif
|
||||
|
||||
@@ -1,5 +1,5 @@
/*******************************************************************************
Copyright (c) 2023 NVIDIA Corporation
Copyright (c) 2022 NVIDIA Corporation

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to

@@ -1,5 +1,5 @@
/*******************************************************************************
Copyright (c) 2023 NVIDIA Corporation
Copyright (c) 2022 NVIDIA Corporation

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to

@@ -1,5 +1,5 @@
/*******************************************************************************
Copyright (c) 2013-2023 NVIDIA Corporation
Copyright (c) 2013-2022 NVIDIA Corporation

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to

@@ -201,7 +201,7 @@ static struct task_struct *thread_create_on_node(int (*threadfn)(void *data),
|
||||
|
||||
// Ran out of attempts - return thread even if its stack may not be
|
||||
// allocated on the preferred node
|
||||
if (i == (attempts - 1))
|
||||
if ((i == (attempts - 1)))
|
||||
break;
|
||||
|
||||
// Get the NUMA node where the first page of the stack is resident. If
|
||||
@@ -247,11 +247,6 @@ int nv_kthread_q_init_on_node(nv_kthread_q_t *q, const char *q_name, int preferr
|
||||
return 0;
|
||||
}
|
||||
|
||||
int nv_kthread_q_init(nv_kthread_q_t *q, const char *qname)
|
||||
{
|
||||
return nv_kthread_q_init_on_node(q, qname, NV_KTHREAD_NO_NODE);
|
||||
}
|
||||
|
||||
// Returns true (non-zero) if the item was actually scheduled, and false if the
|
||||
// item was already pending in a queue.
|
||||
static int _raw_q_schedule(nv_kthread_q_t *q, nv_kthread_q_item_t *q_item)
|
||||
|
||||
@@ -27,7 +27,6 @@ NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_rm_mem.c
|
||||
NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_channel.c
|
||||
NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_lock.c
|
||||
NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_hal.c
|
||||
NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_processors.c
|
||||
NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_range_tree.c
|
||||
NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_rb_tree.c
|
||||
NVIDIA_UVM_SOURCES += nvidia-uvm/uvm_range_allocator.c
|
||||
|
||||
@@ -86,8 +86,6 @@ NV_CONFTEST_FUNCTION_COMPILE_TESTS += mmget_not_zero
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += mmgrab
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += iommu_sva_bind_device_has_drvdata_arg
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += vm_fault_to_errno
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += find_next_bit_wrap
|
||||
NV_CONFTEST_FUNCTION_COMPILE_TESTS += iommu_is_dma_domain
|
||||
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += backing_dev_info
|
||||
NV_CONFTEST_TYPE_COMPILE_TESTS += mm_context_t
|
||||
|
||||
@@ -24,17 +24,16 @@
|
||||
#include "nvstatus.h"
|
||||
|
||||
#if !defined(NV_PRINTF_STRING_SECTION)
|
||||
#if defined(NVRM) && NVOS_IS_LIBOS
|
||||
#include "libos_log.h"
|
||||
#define NV_PRINTF_STRING_SECTION LIBOS_SECTION_LOGGING
|
||||
#else // defined(NVRM) && NVOS_IS_LIBOS
|
||||
#if defined(NVRM) && NVCPU_IS_RISCV64
|
||||
#define NV_PRINTF_STRING_SECTION __attribute__ ((section (".logging")))
|
||||
#else // defined(NVRM) && NVCPU_IS_RISCV64
|
||||
#define NV_PRINTF_STRING_SECTION
|
||||
#endif // defined(NVRM) && NVOS_IS_LIBOS
|
||||
#endif // defined(NVRM) && NVCPU_IS_RISCV64
|
||||
#endif // !defined(NV_PRINTF_STRING_SECTION)
|
||||
|
||||
/*
|
||||
* Include nvstatuscodes.h twice. Once for creating constant strings in the
|
||||
* the NV_PRINTF_STRING_SECTION section of the executable, and once to build
|
||||
* the NV_PRINTF_STRING_SECTION section of the ececutable, and once to build
|
||||
* the g_StatusCodeList table.
|
||||
*/
|
||||
#undef NV_STATUS_CODE
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*******************************************************************************
|
||||
Copyright (c) 2015-2023 NVIDIA Corporation
|
||||
Copyright (c) 2015-2022 NVIDIA Corporation
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to
|
||||
@@ -1053,7 +1053,7 @@ NV_STATUS uvm_test_register_unload_state_buffer(UVM_TEST_REGISTER_UNLOAD_STATE_B
|
||||
// are not used because unload_state_buf may be a managed memory pointer and
|
||||
// therefore a locking assertion from the CPU fault handler could be fired.
|
||||
nv_mmap_read_lock(current->mm);
|
||||
ret = NV_PIN_USER_PAGES(params->unload_state_buf, 1, FOLL_WRITE, &page);
|
||||
ret = NV_PIN_USER_PAGES(params->unload_state_buf, 1, FOLL_WRITE, &page, NULL);
|
||||
nv_mmap_read_unlock(current->mm);
|
||||
|
||||
if (ret < 0)
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*******************************************************************************
|
||||
Copyright (c) 2013-2023 NVIDIA Corporation
|
||||
Copyright (c) 2013-2022 NVIDIA Corporation
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to
|
||||
@@ -45,20 +45,16 @@
// #endif
// 3) Do the same thing for the function definition, and for any structs that
// are taken as arguments to these functions.
// 4) Let this change propagate over to cuda_a and dev_a, so that the CUDA and
// nvidia-cfg libraries can start using the new API by bumping up the API
// version number it's using.
// Places where UVM_API_REVISION is defined are:
// drivers/gpgpu/cuda/cuda.nvmk (cuda_a)
// drivers/setup/linux/nvidia-cfg/makefile.nvmk (dev_a)
// 5) Once the dev_a and cuda_a changes have made it back into chips_a,
// remove the old API declaration, definition, and any old structs that were
// in use.
// 4) Let this change propagate over to cuda_a, so that the CUDA driver can
// start using the new API by bumping up the API version number its using.
// This can be found in gpgpu/cuda/cuda.nvmk.
// 5) Once the cuda_a changes have made it back into chips_a, remove the old API
// declaration, definition, and any old structs that were in use.

#ifndef _UVM_H_
#define _UVM_H_

#define UVM_API_LATEST_REVISION 11
#define UVM_API_LATEST_REVISION 8

#if !defined(UVM_API_REVISION)
#error "please define UVM_API_REVISION macro to a desired version number or UVM_API_LATEST_REVISION macro"
@@ -184,8 +180,12 @@ NV_STATUS UvmSetDriverVersion(NvU32 major, NvU32 changelist);
|
||||
// because it is not very informative.
|
||||
//
|
||||
//------------------------------------------------------------------------------
|
||||
#if UVM_API_REV_IS_AT_MOST(4)
|
||||
NV_STATUS UvmInitialize(UvmFileDescriptor fd);
|
||||
#else
|
||||
NV_STATUS UvmInitialize(UvmFileDescriptor fd,
|
||||
NvU64 flags);
|
||||
#endif
|
||||
|
||||
//------------------------------------------------------------------------------
|
||||
// UvmDeinitialize
|
||||
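The `#if UVM_API_REV_IS_AT_MOST(4)` guard above selects between two prototypes of the same function according to the API revision the client compiled against. The definition of `UVM_API_REV_IS_AT_MOST` is not part of this hunk, so the following is only a sketch of how such a guard can work, using hypothetical `DEMO_*` names. Unlike the header shown earlier, which raises `#error` when `UVM_API_REVISION` is undefined, the sketch falls back to the latest revision.

```c
#include <stdint.h>

#define DEMO_API_LATEST_REVISION 11

/* Clients may pin an older revision with -DDEMO_API_REVISION=<n>. */
#ifndef DEMO_API_REVISION
#define DEMO_API_REVISION DEMO_API_LATEST_REVISION
#endif

/* True when the selected revision is `rev` or older. */
#define DEMO_API_REV_IS_AT_MOST(rev) (DEMO_API_REVISION <= (rev))

#if DEMO_API_REV_IS_AT_MOST(4)
/* Old signature: no flags argument. */
int demo_initialize(int fd);

int demo_initialize(int fd)
{
    return fd >= 0 ? 0 : -1;
}
#else
/* Newer signature: callers also pass a flags word. */
int demo_initialize(int fd, uint64_t flags);

int demo_initialize(int fd, uint64_t flags)
{
    (void)flags;    /* unused in this sketch */
    return fd >= 0 ? 0 : -1;
}
#endif
```

Because the macro expands to a constant integer comparison, it can be used both in `#if` directives and in ordinary expressions. Only one of the two signatures is visible in any given translation unit, so old callers keep compiling until they raise their pinned revision.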
@@ -216,10 +216,6 @@ NV_STATUS UvmDeinitialize(void);
|
||||
// Note that it is not required to release VA ranges that were reserved with
|
||||
// UvmReserveVa().
|
||||
//
|
||||
// This is useful for per-process checkpoint and restore, where kernel-mode
|
||||
// state needs to be reconfigured to match the expectations of a pre-existing
|
||||
// user-mode process.
|
||||
//
|
||||
// UvmReopen() closes the open file returned by UvmGetFileDescriptor() and
|
||||
// replaces it with a new open file with the same name.
|
||||
//
|
||||
@@ -297,9 +293,7 @@ NV_STATUS UvmIsPageableMemoryAccessSupported(NvBool *pageableMemAccess);
|
||||
//
|
||||
// Arguments:
|
||||
// gpuUuid: (INPUT)
|
||||
// UUID of the physical GPU if the GPU is not SMC capable or SMC
|
||||
// enabled, or the GPU instance UUID of the partition for which
|
||||
// pageable memory access support is queried.
|
||||
// UUID of the GPU for which pageable memory access support is queried.
|
||||
//
|
||||
// pageableMemAccess: (OUTPUT)
|
||||
// Returns true (non-zero) if the GPU represented by gpuUuid supports
|
||||
@@ -329,19 +323,9 @@ NV_STATUS UvmIsPageableMemoryAccessSupportedOnGpu(const NvProcessorUuid *gpuUuid
|
||||
// usage. Calling UvmRegisterGpu multiple times on the same GPU from the same
|
||||
// process results in an error.
|
||||
//
|
||||
// After successfully registering a GPU partition, all subsequent API calls
|
||||
// which take a NvProcessorUuid argument (including UvmGpuMappingAttributes),
|
||||
// must use the GI partition UUID which can be obtained with
|
||||
// NvRmControl(NVC637_CTRL_CMD_GET_UUID). Otherwise, if the GPU is not SMC
|
||||
// capable or SMC enabled, the physical GPU UUID must be used.
|
||||
//
|
||||
// Arguments:
|
||||
// gpuUuid: (INPUT)
|
||||
// UUID of the physical GPU to register.
|
||||
//
|
||||
// platformParams: (INPUT)
|
||||
// User handles identifying the GPU partition to register.
|
||||
// This should be NULL if the GPU is not SMC capable or SMC enabled.
|
||||
// UUID of the GPU to register.
|
||||
//
|
||||
// Error codes:
|
||||
// NV_ERR_NO_MEMORY:
|
||||
@@ -376,31 +360,27 @@ NV_STATUS UvmIsPageableMemoryAccessSupportedOnGpu(const NvProcessorUuid *gpuUuid
|
||||
// OS state required to register the GPU is not present.
|
||||
//
|
||||
// NV_ERR_INVALID_STATE:
|
||||
// OS state required to register the GPU is malformed, or the partition
|
||||
// identified by the user handles or its configuration changed.
|
||||
// OS state required to register the GPU is malformed.
|
||||
//
|
||||
// NV_ERR_GENERIC:
|
||||
// Unexpected error. We try hard to avoid returning this error code,
|
||||
// because it is not very informative.
|
||||
//
|
||||
//------------------------------------------------------------------------------
|
||||
#if UVM_API_REV_IS_AT_MOST(8)
|
||||
NV_STATUS UvmRegisterGpu(const NvProcessorUuid *gpuUuid);
|
||||
#else
|
||||
NV_STATUS UvmRegisterGpu(const NvProcessorUuid *gpuUuid,
|
||||
const UvmGpuPlatformParams *platformParams);
|
||||
#endif
|
||||
|
||||
#if UVM_API_REV_IS_AT_MOST(8)
|
||||
//------------------------------------------------------------------------------
|
||||
// UvmRegisterGpuSmc
|
||||
//
|
||||
// The same as UvmRegisterGpu, but takes additional parameters to specify the
|
||||
// GPU partition being registered if SMC is enabled.
|
||||
//
|
||||
// TODO: Bug 2844714: Merge UvmRegisterGpuSmc() with UvmRegisterGpu() once
|
||||
// the initial SMC support is in place.
|
||||
//
|
||||
// Arguments:
|
||||
// gpuUuid: (INPUT)
|
||||
// UUID of the physical GPU of the SMC partition to register.
|
||||
// UUID of the parent GPU of the SMC partition to register.
|
||||
//
|
||||
// platformParams: (INPUT)
|
||||
// User handles identifying the partition to register.
|
||||
@@ -413,7 +393,6 @@ NV_STATUS UvmRegisterGpu(const NvProcessorUuid *gpuUuid,
|
||||
//
|
||||
NV_STATUS UvmRegisterGpuSmc(const NvProcessorUuid *gpuUuid,
|
||||
const UvmGpuPlatformParams *platformParams);
|
||||
#endif
|
||||
|
||||
//------------------------------------------------------------------------------
|
||||
// UvmUnregisterGpu
|
||||
@@ -439,8 +418,7 @@ NV_STATUS UvmRegisterGpuSmc(const NvProcessorUuid *gpuUuid,
|
||||
//
|
||||
// Arguments:
|
||||
// gpuUuid: (INPUT)
|
||||
// UUID of the physical GPU if the GPU is not SMC capable or SMC
|
||||
// enabled, or the GPU instance UUID of the partition to unregister.
|
||||
// UUID of the GPU to unregister.
|
||||
//
|
||||
// Error codes:
|
||||
// NV_ERR_INVALID_DEVICE:
|
||||
@@ -498,8 +476,7 @@ NV_STATUS UvmUnregisterGpu(const NvProcessorUuid *gpuUuid);
|
||||
//
|
||||
// Arguments:
|
||||
// gpuUuid: (INPUT)
|
||||
// UUID of the physical GPU if the GPU is not SMC capable or SMC
|
||||
// enabled, or the GPU instance UUID of the partition to register.
|
||||
// UUID of the GPU to register.
|
||||
//
|
||||
// platformParams: (INPUT)
|
||||
// On Linux: RM ctrl fd, hClient and hVaSpace.
|
||||
@@ -570,9 +547,7 @@ NV_STATUS UvmRegisterGpuVaSpace(const NvProcessorUuid *gpuUuid,
|
||||
//
|
||||
// Arguments:
|
||||
// gpuUuid: (INPUT)
|
||||
// UUID of the physical GPU if the GPU is not SMC capable or SMC
|
||||
// enabled, or the GPU instance UUID of the partition whose VA space
|
||||
// should be unregistered.
|
||||
// UUID of the GPU whose VA space should be unregistered.
|
||||
//
|
||||
// Error codes:
|
||||
// NV_ERR_INVALID_DEVICE:
|
||||
@@ -602,7 +577,7 @@ NV_STATUS UvmUnregisterGpuVaSpace(const NvProcessorUuid *gpuUuid);
|
||||
//
|
||||
// The two GPUs must be connected via PCIe. An error is returned if the GPUs are
|
||||
// not connected or are connected over an interconnect different than PCIe
|
||||
// (NVLink or SMC partitions, for example).
|
||||
// (NVLink, for example).
|
||||
//
|
||||
// If both GPUs have GPU VA spaces registered for them, the two GPU VA spaces
|
||||
// must support the same set of page sizes for GPU mappings.
|
||||
@@ -615,12 +590,10 @@ NV_STATUS UvmUnregisterGpuVaSpace(const NvProcessorUuid *gpuUuid);
|
||||
//
|
||||
// Arguments:
|
||||
// gpuUuidA: (INPUT)
|
||||
// UUID of the physical GPU if the GPU is not SMC capable or SMC
|
||||
// enabled, or the GPU instance UUID of the partition A.
|
||||
// UUID of GPU A.
|
||||
//
|
||||
// gpuUuidB: (INPUT)
|
||||
// UUID of the physical GPU if the GPU is not SMC capable or SMC
|
||||
// enabled, or the GPU instance UUID of the partition B.
|
||||
// UUID of GPU B.
|
||||
//
|
||||
// Error codes:
|
||||
// NV_ERR_NO_MEMORY:
|
||||
@@ -666,12 +639,10 @@ NV_STATUS UvmEnablePeerAccess(const NvProcessorUuid *gpuUuidA,
|
||||
//
|
||||
// Arguments:
|
||||
// gpuUuidA: (INPUT)
|
||||
// UUID of the physical GPU if the GPU is not SMC capable or SMC
|
||||
// enabled, or the GPU instance UUID of the partition A.
|
||||
// UUID of GPU A.
|
||||
//
|
||||
// gpuUuidB: (INPUT)
|
||||
// UUID of the physical GPU if the GPU is not SMC capable or SMC
|
||||
// enabled, or the GPU instance UUID of the partition B.
|
||||
// UUID of GPU B.
|
||||
//
|
||||
// Error codes:
|
||||
// NV_ERR_INVALID_DEVICE:
|
||||
@@ -716,9 +687,7 @@ NV_STATUS UvmDisablePeerAccess(const NvProcessorUuid *gpuUuidA,
|
||||
//
|
||||
// Arguments:
|
||||
// gpuUuid: (INPUT)
|
||||
// UUID of the physical GPU if the GPU is not SMC capable or SMC
|
||||
// enabled, or the GPU instance UUID of the partition that the channel is
|
||||
// associated with.
|
||||
// UUID of the GPU that the channel is associated with.
|
||||
//
|
||||
// platformParams: (INPUT)
|
||||
// On Linux: RM ctrl fd, hClient and hChannel.
|
@@ -1157,14 +1126,11 @@ NV_STATUS UvmAllowMigrationRangeGroups(const NvU64 *rangeGroupIds,
// Length, in bytes, of the range.
//
// preferredLocationUuid: (INPUT)
// UUID of the CPU, UUID of the physical GPU if the GPU is not SMC
// capable or SMC enabled, or the GPU instance UUID of the partition of
// the preferred location for this VA range.
// UUID of the preferred location for this VA range.
//
// accessedByUuids: (INPUT)
// UUID of the CPU, UUID of the physical GPUs if the GPUs are not SMC
// capable or SMC enabled, or the GPU instance UUID of the partitions
// that should have persistent mappings to this VA range.
// UUIDs of all processors that should have persistent mappings to this
// VA range.
//
// accessedByCount: (INPUT)
// Number of elements in the accessedByUuids array.
@@ -1442,15 +1408,12 @@ NV_STATUS UvmAllocSemaphorePool(void *base,
// Length, in bytes, of the range.
//
// destinationUuid: (INPUT)
// UUID of the physical GPU if the GPU is not SMC capable or SMC
// enabled, the GPU instance UUID of the partition, or the CPU UUID to
// migrate pages to.
// UUID of the destination processor to migrate pages to.
//
// preferredCpuMemoryNode: (INPUT)
// Preferred CPU NUMA memory node used if the destination processor is
// the CPU. -1 indicates no preference, in which case the pages used
// can be on any of the available CPU NUMA nodes. If NUMA is disabled
// only 0 and -1 are allowed.
// the CPU. This argument is ignored if the given virtual address range
// corresponds to managed memory.
//
// Error codes:
// NV_ERR_INVALID_ADDRESS:
@@ -1464,11 +1427,6 @@ NV_STATUS UvmAllocSemaphorePool(void *base,
// The VA range exceeds the largest virtual address supported by the
// destination processor.
//
// NV_ERR_INVALID_ARGUMENT:
// preferredCpuMemoryNode is not a valid CPU NUMA node or it corresponds
// to a NUMA node ID for a registered GPU. If NUMA is disabled, it
// indicates that preferredCpuMemoryNode was not either 0 or -1.
//
// NV_ERR_INVALID_DEVICE:
// destinationUuid does not represent a valid processor such as a CPU or
// a GPU with a GPU VA space registered for it. Or destinationUuid is a
@@ -1494,10 +1452,16 @@ NV_STATUS UvmAllocSemaphorePool(void *base,
// pages were associated with a non-migratable range group.
//
//------------------------------------------------------------------------------
#if UVM_API_REV_IS_AT_MOST(5)
NV_STATUS UvmMigrate(void *base,
NvLength length,
const NvProcessorUuid *destinationUuid);
#else
NV_STATUS UvmMigrate(void *base,
NvLength length,
const NvProcessorUuid *destinationUuid,
NvS32 preferredCpuMemoryNode);
#endif

//------------------------------------------------------------------------------
// UvmMigrateAsync
@@ -1529,15 +1493,12 @@ NV_STATUS UvmMigrate(void *base,
// Length, in bytes, of the range.
//
// destinationUuid: (INPUT)
// UUID of the physical GPU if the GPU is not SMC capable or SMC
// enabled, the GPU instance UUID of the partition, or the CPU UUID to
// migrate pages to.
// UUID of the destination processor to migrate pages to.
//
// preferredCpuMemoryNode: (INPUT)
// Preferred CPU NUMA memory node used if the destination processor is
// the CPU. -1 indicates no preference, in which case the pages used
// can be on any of the available CPU NUMA nodes. If NUMA is disabled
// only 0 and -1 are allowed.
// the CPU. This argument is ignored if the given virtual address range
// corresponds to managed memory.
//
// semaphoreAddress: (INPUT)
// Base address of the semaphore.
@@ -1582,20 +1543,30 @@ NV_STATUS UvmMigrate(void *base,
// pages were associated with a non-migratable range group.
//
//------------------------------------------------------------------------------
#if UVM_API_REV_IS_AT_MOST(5)
NV_STATUS UvmMigrateAsync(void *base,
NvLength length,
const NvProcessorUuid *destinationUuid,
void *semaphoreAddress,
NvU32 semaphorePayload);
#else
NV_STATUS UvmMigrateAsync(void *base,
NvLength length,
const NvProcessorUuid *destinationUuid,
NvS32 preferredCpuMemoryNode,
void *semaphoreAddress,
NvU32 semaphorePayload);
#endif

//------------------------------------------------------------------------------
// UvmMigrateRangeGroup
//
// Migrates the backing of all virtual address ranges associated with the given
// range group to the specified destination processor. The behavior of this API
// is equivalent to calling UvmMigrate with preferredCpuMemoryNode = -1 on each
// VA range associated with this range group.
// is equivalent to calling UvmMigrate on each VA range associated with this
// range group. The value for the preferredCpuMemoryNode is irrelevant in this
// case as it only applies to migrations of pageable address, which cannot be
// used to create range groups.
//
// Any errors encountered during migration are returned immediately. No attempt
// is made to migrate the remaining unmigrated ranges and the ranges that are
@@ -1609,9 +1580,7 @@ NV_STATUS UvmMigrateAsync(void *base,
// Id of the range group whose associated VA ranges have to be migrated.
//
// destinationUuid: (INPUT)
// UUID of the physical GPU if the GPU is not SMC capable or SMC
// enabled, the GPU instance UUID of the partition, or the CPU UUID to
// migrate pages to.
// UUID of the destination processor to migrate pages to.
//
// Error codes:
// NV_ERR_OBJECT_NOT_FOUND:
@@ -1973,9 +1942,7 @@ NV_STATUS UvmMapExternalAllocation(void *base,
//
//
// gpuUuid: (INPUT)
// UUID of the physical GPU if the GPU is not SMC capable or SMC
// enabled, or the GPU instance UUID of the partition to map the sparse
// region on.
// UUID of the GPU to map the sparse region on.
//
// Errors:
// NV_ERR_INVALID_ADDRESS:
@@ -2032,9 +1999,7 @@ NV_STATUS UvmMapExternalSparse(void *base,
// The length of the virtual address range.
//
// gpuUuid: (INPUT)
// UUID of the physical GPU if the GPU is not SMC capable or SMC
// enabled, or the GPU instance UUID of the partition to unmap the VA
// range from.
// UUID of the GPU to unmap the VA range from.
//
// Errors:
// NV_ERR_INVALID_ADDRESS:
@@ -2101,9 +2066,7 @@ NV_STATUS UvmUnmapExternalAllocation(void *base,
// supported by the GPU.
//
// gpuUuid: (INPUT)
// UUID of the physical GPU if the GPU is not SMC capable or SMC
// enabled, or the GPU instance UUID of the partition to map the
// dynamic parallelism region on.
// UUID of the GPU to map the dynamic parallelism region on.
//
// Errors:
// NV_ERR_UVM_ADDRESS_IN_USE:
@@ -2177,8 +2140,7 @@ NV_STATUS UvmMapDynamicParallelismRegion(void *base,
//
// If any page in the VA range has a preferred location, then the migration and
// mapping policies associated with this API take precedence over those related
// to the preferred location. If the preferred location is a specific CPU NUMA
// node, that NUMA node will be used for a CPU-resident copy of the page.
// to the preferred location.
//
// If any pages in this VA range have any processors present in their
// accessed-by list, the migration and mapping policies associated with this
@@ -2309,7 +2271,7 @@ NV_STATUS UvmDisableReadDuplication(void *base,
// UvmPreventMigrationRangeGroups has not been called on the range group that
// those pages are associated with, then the migration and mapping policies
// associated with UvmEnableReadDuplication override the policies outlined
// above. Note that enabling read duplication on any pages in this VA range
// above. Note that enabling read duplication on on any pages in this VA range
// does not clear the state set by this API for those pages. It merely overrides
// the policies associated with this state until read duplication is disabled
// for those pages.
@@ -2335,15 +2297,15 @@ NV_STATUS UvmDisableReadDuplication(void *base,
// Length, in bytes, of the range.
//
// preferredLocationUuid: (INPUT)
// UUID of the physical GPU if the GPU is not SMC capable or SMC
// enabled, the GPU instance UUID of the partition, or the CPU UUID
// preferred location.
// UUID of the preferred location.
//
// preferredCpuMemoryNode: (INPUT)
// preferredCpuNumaNode: (INPUT)
// Preferred CPU NUMA memory node used if preferredLocationUuid is the
// UUID of the CPU. -1 is a special value which indicates all CPU nodes
// allowed by the global and thread memory policies. If NUMA is disabled
// only 0 and -1 are allowed.
// allowed by the global and thread memory policies. This argument is
// ignored if preferredLocationUuid refers to a GPU or the given virtual
// address range corresponds to managed memory. If NUMA is not enabled,
// only 0 or -1 is allowed.
//
// Errors:
// NV_ERR_INVALID_ADDRESS:
@@ -2373,11 +2335,10 @@ NV_STATUS UvmDisableReadDuplication(void *base,
//
// NV_ERR_INVALID_ARGUMENT:
// One of the following occured:
// - preferredLocationUuid is the UUID of the CPU and
// preferredCpuMemoryNode is either:
// - not a valid NUMA node,
// - not a possible NUMA node, or
// - a NUMA node ID corresponding to a registered GPU.
// - preferredLocationUuid is the UUID of a CPU and preferredCpuNumaNode
// refers to a registered GPU.
// - preferredCpuNumaNode is invalid and preferredLocationUuid is the
// UUID of the CPU.
//
// NV_ERR_NOT_SUPPORTED:
// The UVM file descriptor is associated with another process and the
@@ -2388,10 +2349,16 @@ NV_STATUS UvmDisableReadDuplication(void *base,
// because it is not very informative.
//
//------------------------------------------------------------------------------
#if UVM_API_REV_IS_AT_MOST(7)
NV_STATUS UvmSetPreferredLocation(void *base,
NvLength length,
const NvProcessorUuid *preferredLocationUuid);
#else
NV_STATUS UvmSetPreferredLocation(void *base,
NvLength length,
const NvProcessorUuid *preferredLocationUuid,
NvS32 preferredCpuMemoryNode);
NvS32 preferredCpuNumaNode);
#endif

//------------------------------------------------------------------------------
// UvmUnsetPreferredLocation
@@ -2514,9 +2481,8 @@ NV_STATUS UvmUnsetPreferredLocation(void *base,
// Length, in bytes, of the range.
//
// accessedByUuid: (INPUT)
// UUID of the physical GPU if the GPU is not SMC capable or SMC
// enabled, the GPU instance UUID of the partition, or the CPU UUID
// that should have pages in the VA range mapped when possible.
// UUID of the processor that should have pages in the the VA range
// mapped when possible.
//
// Errors:
// NV_ERR_INVALID_ADDRESS:
@@ -2584,10 +2550,8 @@ NV_STATUS UvmSetAccessedBy(void *base,
// Length, in bytes, of the range.
//
// accessedByUuid: (INPUT)
// UUID of the physical GPU if the GPU is not SMC capable or SMC
// enabled, the GPU instance UUID of the partition, or the CPU UUID
// from which any policies set by UvmSetAccessedBy should be revoked
// for the given VA range.
// UUID of the processor from which any policies set by
// UvmSetAccessedBy should be revoked for the given VA range.
//
// Errors:
// NV_ERR_INVALID_ADDRESS:
@@ -2645,9 +2609,7 @@ NV_STATUS UvmUnsetAccessedBy(void *base,
//
// Arguments:
// gpuUuid: (INPUT)
// UUID of the physical GPU if the GPU is not SMC capable or SMC
// enabled, or the GPU instance UUID of the partition to enable
// software-assisted system-wide atomics on.
// UUID of the GPU to enable software-assisted system-wide atomics on.
//
// Error codes:
// NV_ERR_NO_MEMORY:
@@ -2683,9 +2645,7 @@ NV_STATUS UvmEnableSystemWideAtomics(const NvProcessorUuid *gpuUuid);
//
// Arguments:
// gpuUuid: (INPUT)
// UUID of the physical GPU if the GPU is not SMC capable or SMC
// enabled, or the GPU instance UUID of the partition to disable
// software-assisted system-wide atomics on.
// UUID of the GPU to disable software-assisted system-wide atomics on.
//
// Error codes:
// NV_ERR_INVALID_DEVICE:
@@ -2914,9 +2874,7 @@ NV_STATUS UvmDebugCountersEnable(UvmDebugSession session,
// Name of the counter in that scope.
//
// gpu: (INPUT)
// UUID of the physical GPU if the GPU is not SMC capable or SMC
// enabled, or the GPU instance UUID of the partition of the scoped GPU.
// This parameter is ignored in AllGpu scopes.
// Gpuid of the scoped GPU. This parameter is ignored in AllGpu scopes.
//
// pCounterHandle: (OUTPUT)
// Handle to the counter address.
@@ -2970,7 +2928,7 @@ NV_STATUS UvmDebugGetCounterVal(UvmDebugSession session,
// UvmEventQueueCreate
//
// This call creates an event queue of the given size.
// No events are added in the queue until they are enabled by the user.
// No events are added in the queue till they are enabled by the user.
// Event queue data is visible to the user even after the target process dies
// if the session is active and queue is not freed.
//
@@ -3021,7 +2979,7 @@ NV_STATUS UvmEventQueueCreate(UvmDebugSession sessionHandle,
// UvmEventQueueDestroy
//
// This call frees all interal resources associated with the queue, including
// unpinning of the memory associated with that queue. Freeing user buffer is
// upinning of the memory associated with that queue. Freeing user buffer is
// responsibility of a caller. Event queue might be also destroyed as a side
// effect of destroying a session associated with this queue.
//
@@ -3205,9 +3163,9 @@ NV_STATUS UvmEventGetNotificationHandles(UvmEventQueueHandle *queueHandleArray,
// UvmEventGetGpuUuidTable
//
// Each migration event entry contains the gpu index to/from where data is
// migrated. This index maps to a corresponding physical gpu UUID in the
// gpuUuidTable. Using indices saves on the size of each event entry. This API
// provides the gpuIndex to gpuUuid relation to the user.
// migrated. This index maps to a corresponding gpu UUID in the gpuUuidTable.
// Using indices saves on the size of each event entry. This API provides the
// gpuIndex to gpuUuid relation to the user.
//
// This API does not access the queue state maintained in the user
// library and so the user doesn't need to acquire a lock to protect the
@@ -3215,9 +3173,9 @@ NV_STATUS UvmEventGetNotificationHandles(UvmEventQueueHandle *queueHandleArray,
//
// Arguments:
// gpuUuidTable: (OUTPUT)
// The return value is an array of physical GPU UUIDs. The array index
// is the corresponding gpuIndex. There can be at max 32 GPUs
// associated with UVM, so array size is 32.
// The return value is an array of UUIDs. The array index is the
// corresponding gpuIndex. There can be at max 32 gpus associated with
// UVM, so array size is 32.
//
// validCount: (OUTPUT)
// The system doesn't normally contain 32 GPUs. This field gives the
@@ -3276,7 +3234,7 @@ NV_STATUS UvmEventGetGpuUuidTable(NvProcessorUuid *gpuUuidTable,
//------------------------------------------------------------------------------
NV_STATUS UvmEventFetch(UvmDebugSession sessionHandle,
UvmEventQueueHandle queueHandle,
UvmEventEntry_V1 *pBuffer,
UvmEventEntry *pBuffer,
NvU64 *nEntries);

//------------------------------------------------------------------------------
@@ -3472,15 +3430,10 @@ NV_STATUS UvmToolsDestroySession(UvmToolsSessionHandle session);
// 4. Destroy event Queue using UvmToolsDestroyEventQueue
//

#if UVM_API_REV_IS_AT_MOST(10)
// This is deprecated and replaced by sizeof(UvmToolsEventControlData_V1) or
// sizeof(UvmToolsEventControlData_V2).

NvLength UvmToolsGetEventControlSize(void);

// This is deprecated and replaced by sizeof(UvmEventEntry_V1) or
// sizeof(UvmEventEntry_V2).
NvLength UvmToolsGetEventEntrySize(void);
#endif

NvLength UvmToolsGetNumberOfCounters(void);

@@ -3495,12 +3448,6 @@ NvLength UvmToolsGetNumberOfCounters(void);
// session: (INPUT)
// Handle to the tools session.
//
// version: (INPUT)
// Requested version for events or counters.
// See UvmEventEntry_V1 and UvmEventEntry_V2.
// UvmToolsEventControlData_V2::version records the entry version that
// will be generated.
//
// event_buffer: (INPUT)
// User allocated buffer. Must be page-aligned. Must be large enough to
// hold at least event_buffer_size events. Gets pinned until queue is
@@ -3512,9 +3459,10 @@ NvLength UvmToolsGetNumberOfCounters(void);
//
// event_control (INPUT)
// User allocated buffer. Must be page-aligned. Must be large enough to
// hold UvmToolsEventControlData_V1 if version is UvmEventEntry_V1 or
// UvmToolsEventControlData_V2 (although single page-size allocation
// should be more than enough). Gets pinned until queue is destroyed.
// hold UvmToolsEventControlData (although single page-size allocation
// should be more than enough). One could call
// UvmToolsGetEventControlSize() function to find out current size of
// UvmToolsEventControlData. Gets pinned until queue is destroyed.
//
// queue: (OUTPUT)
// Handle to the created queue.
@@ -3524,32 +3472,22 @@ NvLength UvmToolsGetNumberOfCounters(void);
// Session handle does not refer to a valid session
//
// NV_ERR_INVALID_ARGUMENT:
// The version is not UvmEventEntry_V1 or UvmEventEntry_V2.
// One of the parameters: event_buffer, event_buffer_size, event_control
// is not valid
//
// NV_ERR_INSUFFICIENT_RESOURCES:
// There could be multiple reasons for this error. One would be that
// it's not possible to allocate a queue of requested size. Another
// would be either event_buffer or event_control memory couldn't be
// pinned (e.g. because of OS limitation of pinnable memory). Also it
// could not have been possible to create UvmToolsEventQueueDescriptor.
// There could be multiple reasons for this error. One would be that it's
// not possible to allocate a queue of requested size. Another would be
// that either event_buffer or event_control memory couldn't be pinned
// (e.g. because of OS limitation of pinnable memory). Also it could not
// have been possible to create UvmToolsEventQueueDescriptor.
//
//------------------------------------------------------------------------------
#if UVM_API_REV_IS_AT_MOST(10)
NV_STATUS UvmToolsCreateEventQueue(UvmToolsSessionHandle session,
void *event_buffer,
NvLength event_buffer_size,
void *event_control,
UvmToolsEventQueueHandle *queue);
#else
NV_STATUS UvmToolsCreateEventQueue(UvmToolsSessionHandle session,
UvmToolsEventQueueVersion version,
void *event_buffer,
NvLength event_buffer_size,
void *event_control,
UvmToolsEventQueueHandle *queue);
#endif

UvmToolsEventQueueDescriptor UvmToolsGetEventQueueDescriptor(UvmToolsEventQueueHandle queue);

@@ -3586,7 +3524,7 @@ NV_STATUS UvmToolsSetNotificationThreshold(UvmToolsEventQueueHandle queue,
//------------------------------------------------------------------------------
// UvmToolsDestroyEventQueue
//
// Destroys all internal resources associated with the queue. It unpins the
// Destroys all internal resources associated with the queue. It unpinns the
// buffers provided in UvmToolsCreateEventQueue. Event Queue is also auto
// destroyed when corresponding session gets destroyed.
//
@@ -3608,7 +3546,7 @@ NV_STATUS UvmToolsDestroyEventQueue(UvmToolsEventQueueHandle queue);
// UvmEventQueueEnableEvents
//
// This call enables a particular event type in the event queue. All events are
// disabled by default. Any event type is considered listed if and only if its
// disabled by default. Any event type is considered listed if and only if it's
// corresponding value is equal to 1 (in other words, bit is set). Disabled
// events listed in eventTypeFlags are going to be enabled. Enabled events and
// events not listed in eventTypeFlags are not affected by this call.
@@ -3641,7 +3579,7 @@ NV_STATUS UvmToolsEventQueueEnableEvents(UvmToolsEventQueueHandle queue,
// UvmToolsEventQueueDisableEvents
//
// This call disables a particular event type in the event queue. Any event type
// is considered listed if and only if its corresponding value is equal to 1
// is considered listed if and only if it's corresponding value is equal to 1
// (in other words, bit is set). Enabled events listed in eventTypeFlags are
// going to be disabled. Disabled events and events not listed in eventTypeFlags
// are not affected by this call.
@@ -3679,7 +3617,7 @@ NV_STATUS UvmToolsEventQueueDisableEvents(UvmToolsEventQueueHandle queue,
//
// Counters position follows the layout of the memory that UVM driver decides to
// use. To obtain particular counter value, user should perform consecutive
// atomic reads at a given buffer + offset address.
// atomic reads at a a given buffer + offset address.
//
// It is not defined what is the initial value of a counter. User should rely on
// a difference between each snapshot.
@@ -3702,9 +3640,9 @@ NV_STATUS UvmToolsEventQueueDisableEvents(UvmToolsEventQueueHandle queue,
// Provided session is not valid
//
// NV_ERR_INSUFFICIENT_RESOURCES
// There could be multiple reasons for this error. One would be that
// it's not possible to allocate counters structure. Another would be
// that either event_buffer or event_control memory couldn't be pinned
// There could be multiple reasons for this error. One would be that it's
// not possible to allocate counters structure. Another would be that
// either event_buffer or event_control memory couldn't be pinned
// (e.g. because of OS limitation of pinnable memory)
//
//------------------------------------------------------------------------------
@@ -3715,12 +3653,12 @@ NV_STATUS UvmToolsCreateProcessAggregateCounters(UvmToolsSessionHandle session
//------------------------------------------------------------------------------
// UvmToolsCreateProcessorCounters
//
// Creates the counters structure for tracking per-processor counters.
// Creates the counters structure for tracking per-process counters.
// These counters are disabled by default.
//
// Counters position follows the layout of the memory that UVM driver decides to
// use. To obtain particular counter value, user should perform consecutive
// atomic reads at a given buffer + offset address.
// atomic reads at a a given buffer + offset address.
//
// It is not defined what is the initial value of a counter. User should rely on
// a difference between each snapshot.
@@ -3736,9 +3674,7 @@ NV_STATUS UvmToolsCreateProcessAggregateCounters(UvmToolsSessionHandle session
// counters are destroyed.
//
// processorUuid: (INPUT)
// UUID of the physical GPU if the GPU is not SMC capable or SMC
// enabled, the GPU instance UUID of the partition, or the CPU UUID of
// the resource, for which counters will provide statistic data.
// UUID of the resource, for which counters will provide statistic data.
//
// counters: (OUTPUT)
// Handle to the created counters.
@@ -3748,9 +3684,9 @@ NV_STATUS UvmToolsCreateProcessAggregateCounters(UvmToolsSessionHandle session
// session handle does not refer to a valid tools session
//
// NV_ERR_INSUFFICIENT_RESOURCES
// There could be multiple reasons for this error. One would be that
// it's not possible to allocate counters structure. Another would be
// that either event_buffer or event_control memory couldn't be pinned
// There could be multiple reasons for this error. One would be that it's
// not possible to allocate counters structure. Another would be that
// either event_buffer or event_control memory couldn't be pinned
// (e.g. because of OS limitation of pinnable memory)
//
// NV_ERR_INVALID_ARGUMENT
@@ -3766,7 +3702,7 @@ NV_STATUS UvmToolsCreateProcessorCounters(UvmToolsSessionHandle session,
// UvmToolsDestroyCounters
//
// Destroys all internal resources associated with this counters structure.
// It unpins the buffer provided in UvmToolsCreate*Counters. Counters structure
// It unpinns the buffer provided in UvmToolsCreate*Counters. Counters structure
// also gest destroyed when corresponding session is destroyed.
//
// Arguments:
@@ -3787,7 +3723,7 @@ NV_STATUS UvmToolsDestroyCounters(UvmToolsCountersHandle counters);
// UvmToolsEnableCounters
//
// This call enables certain counter types in the counters structure. Any
// counter type is considered listed if and only if its corresponding value is
// counter type is considered listed if and only if it's corresponding value is
// equal to 1 (in other words, bit is set). Disabled counter types listed in
// counterTypeFlags are going to be enabled. Already enabled counter types and
// counter types not listed in counterTypeFlags are not affected by this call.
@@ -3821,7 +3757,7 @@ NV_STATUS UvmToolsEnableCounters(UvmToolsCountersHandle counters,
// UvmToolsDisableCounters
//
// This call disables certain counter types in the counters structure. Any
// counter type is considered listed if and only if its corresponding value is
// counter type is considered listed if and only if it's corresponding value is
// equal to 1 (in other words, bit is set). Enabled counter types listed in
// counterTypeFlags are going to be disabled. Already disabled counter types and
// counter types not listed in counterTypeFlags are not affected by this call.
||||
@@ -3966,72 +3902,32 @@ NV_STATUS UvmToolsWriteProcessMemory(UvmToolsSessionHandle session,
|
||||
// UvmToolsGetProcessorUuidTable
|
||||
//
|
||||
// Populate a table with the UUIDs of all the currently registered processors
|
||||
// in the target process. When a GPU is registered, it is added to the table.
|
||||
// When a GPU is unregistered, it is removed. As long as a GPU remains
|
||||
// registered, its index in the table does not change.
|
||||
// Note that the index in the table corresponds to the processor ID reported
|
||||
// in UvmEventEntry event records and that the table is not contiguously packed
|
||||
// with non-zero UUIDs even with no GPU unregistrations.
|
||||
// in the target process. When a GPU is registered, it is added to the table.
|
||||
// When a GPU is unregistered, it is removed. As long as a GPU remains registered,
|
||||
// its index in the table does not change. New registrations obtain the first
|
||||
// unused index.
|
||||
//
|
||||
// Arguments:
|
||||
// session: (INPUT)
|
||||
// Handle to the tools session.
|
||||
//
|
||||
// version: (INPUT)
|
||||
// Requested version for the UUID table returned. The version must
|
||||
// match the requested version of the event queue created with
|
||||
// UvmToolsCreateEventQueue().
|
||||
// See UvmEventEntry_V1 and UvmEventEntry_V2.
|
||||
//
|
||||
// table: (OUTPUT)
|
||||
// Array of processor UUIDs, including the CPU's UUID which is always
|
||||
// at index zero. The srcIndex and dstIndex fields of the
|
||||
// UvmEventMigrationInfo struct index this array. Unused indices will
|
||||
// have a UUID of zero. Version UvmEventEntry_V1 only uses GPU UUIDs
|
||||
// for the UUID of the physical GPU and only supports a single SMC
|
||||
// partition registered per process. Version UvmEventEntry_V2 supports
// multiple SMC partitions registered per process and uses physical GPU
// UUIDs if the GPU is not SMC capable or SMC enabled and GPU instance
// UUIDs for SMC partitions.
// The table pointer can be NULL in which case, the size of the table
// needed to hold all the UUIDs is returned in 'count'.
//
// table_size: (INPUT)
// The size of the table in number of array elements. This can be
// zero if the table pointer is NULL.
// have a UUID of zero.
//
// count: (OUTPUT)
// On output, it is set by UVM to the number of UUIDs needed to hold
// all the UUIDs, including any gaps in the table due to unregistered
// GPUs.
// Set by UVM to the number of UUIDs written, including any gaps in
// the table due to unregistered GPUs.
//
// Error codes:
// NV_ERR_INVALID_ADDRESS:
// writing to table failed or the count pointer was invalid.
//
// NV_ERR_INVALID_ARGUMENT:
// The version is not UvmEventEntry_V1 or UvmEventEntry_V2.
// The count pointer is NULL.
// See UvmToolsEventQueueVersion.
//
// NV_WARN_MISMATCHED_TARGET:
// The kernel returned a table suitable for UvmEventEntry_V1 events.
// (i.e., the kernel is older and doesn't support UvmEventEntry_V2).
//
// NV_ERR_NO_MEMORY:
// Internal memory allocation failed.
// writing to table failed.
//------------------------------------------------------------------------------
#if UVM_API_REV_IS_AT_MOST(10)
NV_STATUS UvmToolsGetProcessorUuidTable(UvmToolsSessionHandle session,
                                        NvProcessorUuid *table,
                                        NvLength *count);
#else
NV_STATUS UvmToolsGetProcessorUuidTable(UvmToolsSessionHandle session,
                                        UvmToolsEventQueueVersion version,
                                        NvProcessorUuid *table,
                                        NvLength table_size,
                                        NvLength *count);
#endif
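The comment above describes a query-then-fill calling convention: pass a NULL table to learn how many entries are needed, then call again with a buffer of that size. A minimal sketch of that pattern in C follows. It does not call the real UVM tools API. `sketch_get_uuid_table`, `sketch_uuid_t`, `SKETCH_REGISTERED`, and the status codes are hypothetical stand-ins, and the truncation behavior when the table is too small is an assumption, not something this diff states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical stand-in for NvProcessorUuid.
typedef struct {
    unsigned char uuid[16];
} sketch_uuid_t;

// Hypothetical status codes; the real API returns NV_STATUS values.
enum { SKETCH_OK = 0, SKETCH_ERR_INVALID_ARGUMENT = 1 };

// Assumption: the mock pretends that three processors are registered.
#define SKETCH_REGISTERED ((size_t)3)

// Mock of the documented contract: a NULL table only reports the required
// size in *count; otherwise up to table_size entries are written.
static int sketch_get_uuid_table(sketch_uuid_t *table, size_t table_size, size_t *count)
{
    size_t i, n;

    if (count == NULL)
        return SKETCH_ERR_INVALID_ARGUMENT;

    if (table == NULL) {
        *count = SKETCH_REGISTERED;
        return SKETCH_OK;
    }

    // Assumed behavior: never write past the caller's buffer.
    n = table_size < SKETCH_REGISTERED ? table_size : SKETCH_REGISTERED;
    for (i = 0; i < n; i++)
        memset(table[i].uuid, (int)(i + 1), sizeof(table[i].uuid));

    *count = n;
    return SKETCH_OK;
}

// Query-then-fill: ask for the size first, allocate, then fetch the entries.
// The caller owns the returned buffer and releases it with free().
static sketch_uuid_t *sketch_fetch_all(size_t *out_count)
{
    size_t needed = 0;
    sketch_uuid_t *table;

    assert(out_count != NULL);

    if (sketch_get_uuid_table(NULL, 0, &needed) != SKETCH_OK || needed == 0)
        return NULL;

    table = calloc(needed, sizeof(*table));
    if (table == NULL)
        return NULL;

    if (sketch_get_uuid_table(table, needed, out_count) != SKETCH_OK) {
        free(table);
        return NULL;
    }

    return table;
}
```

Sizing the buffer from the first call avoids hard-coding a maximum processor count in the caller, which is the usual reason for exposing a NULL-table query.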

//------------------------------------------------------------------------------
// UvmToolsFlushEvents
@@ -79,8 +79,6 @@ void uvm_hal_ada_arch_init_properties(uvm_parent_gpu_t *parent_gpu)

    parent_gpu->access_counters_supported = true;

    parent_gpu->access_counters_can_use_physical_addresses = false;

    parent_gpu->fault_cancel_va_supported = true;

    parent_gpu->scoped_atomics_supported = true;
@@ -1,5 +1,5 @@
/*******************************************************************************
    Copyright (c) 2018-2023 NVIDIA Corporation
    Copyright (c) 2018-20221 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -38,12 +38,10 @@ void uvm_hal_ampere_arch_init_properties(uvm_parent_gpu_t *parent_gpu)

    parent_gpu->utlb_per_gpc_count = uvm_ampere_get_utlbs_per_gpc(parent_gpu);

    parent_gpu->fault_buffer_info.replayable.utlb_count = parent_gpu->rm_info.maxGpcCount *
                                                          parent_gpu->utlb_per_gpc_count;
    parent_gpu->fault_buffer_info.replayable.utlb_count = parent_gpu->rm_info.maxGpcCount * parent_gpu->utlb_per_gpc_count;
    {
        uvm_fault_buffer_entry_t *dummy;
        UVM_ASSERT(parent_gpu->fault_buffer_info.replayable.utlb_count <= (1 <<
                   (sizeof(dummy->fault_source.utlb_id) * 8)));
        UVM_ASSERT(parent_gpu->fault_buffer_info.replayable.utlb_count <= (1 << (sizeof(dummy->fault_source.utlb_id) * 8)));
    }

    // A single top level PDE on Ampere covers 128 TB and that's the minimum
@@ -55,7 +53,7 @@ void uvm_hal_ampere_arch_init_properties(uvm_parent_gpu_t *parent_gpu)
    parent_gpu->uvm_mem_va_size = UVM_MEM_VA_SIZE;

    // See uvm_mmu.h for mapping placement
    parent_gpu->flat_vidmem_va_base = 160 * UVM_SIZE_1TB;
    parent_gpu->flat_vidmem_va_base = 136 * UVM_SIZE_1TB;
    parent_gpu->flat_sysmem_va_base = 256 * UVM_SIZE_1TB;

    parent_gpu->ce_phys_vidmem_write_supported = true;
@@ -83,8 +81,6 @@ void uvm_hal_ampere_arch_init_properties(uvm_parent_gpu_t *parent_gpu)

    parent_gpu->access_counters_supported = true;

    parent_gpu->access_counters_can_use_physical_addresses = false;

    parent_gpu->fault_cancel_va_supported = true;

    parent_gpu->scoped_atomics_supported = true;
@@ -1,5 +1,5 @@
/*******************************************************************************
    Copyright (c) 2018-2023 NVIDIA Corporation
    Copyright (c) 2018-2022 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -117,7 +117,7 @@ bool uvm_hal_ampere_ce_memcopy_is_valid_c6b5(uvm_push_t *push, uvm_gpu_address_t
    NvU64 push_begin_gpu_va;
    uvm_gpu_t *gpu = uvm_push_get_gpu(push);

    if (!uvm_parent_gpu_is_virt_mode_sriov_heavy(gpu->parent))
    if (!uvm_gpu_is_virt_mode_sriov_heavy(gpu))
        return true;

    if (uvm_channel_is_proxy(push->channel)) {
@@ -196,7 +196,7 @@ bool uvm_hal_ampere_ce_memset_is_valid_c6b5(uvm_push_t *push,
{
    uvm_gpu_t *gpu = uvm_push_get_gpu(push);

    if (!uvm_parent_gpu_is_virt_mode_sriov_heavy(gpu->parent))
    if (!uvm_gpu_is_virt_mode_sriov_heavy(gpu))
        return true;

    if (uvm_channel_is_proxy(push->channel)) {
@@ -1,5 +1,5 @@
/*******************************************************************************
    Copyright (c) 2018-2023 NVIDIA Corporation
    Copyright (c) 2018-2022 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -33,7 +33,7 @@ bool uvm_hal_ampere_host_method_is_valid(uvm_push_t *push, NvU32 method_address,
{
    uvm_gpu_t *gpu = uvm_push_get_gpu(push);

    if (!uvm_parent_gpu_is_virt_mode_sriov_heavy(gpu->parent))
    if (!uvm_gpu_is_virt_mode_sriov_heavy(gpu))
        return true;

    if (uvm_channel_is_privileged(push->channel)) {
@@ -203,7 +203,7 @@ done:
        ats_context->prefetch_state.has_preferred_location = false;
#endif

    ats_context->residency_id = gpu ? gpu->id : UVM_ID_CPU;
    ats_context->residency_id = gpu ? gpu->parent->id : UVM_ID_CPU;
    ats_context->residency_node = residency;
}

@@ -53,11 +53,10 @@
    #define UVM_ATS_SVA_SUPPORTED() 0
#endif

// If NV_MMU_NOTIFIER_OPS_HAS_ARCH_INVALIDATE_SECONDARY_TLBS is defined it
// means the upstream fix is in place so no need for the WAR from
// Bug 4130089: [GH180][r535] WAR for kernel not issuing SMMU TLB
// invalidates on read-only
#if defined(NV_MMU_NOTIFIER_OPS_HAS_ARCH_INVALIDATE_SECONDARY_TLBS)
// If NV_ARCH_INVALIDATE_SECONDARY_TLBS is defined it means the upstream fix is
// in place so no need for the WAR from Bug 4130089: [GH180][r535] WAR for
// kernel not issuing SMMU TLB invalidates on read-only
#if defined(NV_ARCH_INVALIDATE_SECONDARY_TLBS)
    #define UVM_ATS_SMMU_WAR_REQUIRED() 0
#elif NVCPU_IS_AARCH64
    #define UVM_ATS_SMMU_WAR_REQUIRED() 1
@@ -56,7 +56,7 @@ static NV_STATUS test_non_pipelined(uvm_gpu_t *gpu)

    // TODO: Bug 3839176: the test is waived on Confidential Computing because
    // it assumes that GPU can access system memory without using encryption.
    if (g_uvm_global.conf_computing_enabled)
    if (uvm_conf_computing_mode_enabled(gpu))
        return NV_OK;

    status = uvm_rm_mem_alloc_and_map_cpu(gpu, UVM_RM_MEM_TYPE_SYS, CE_TEST_MEM_SIZE, 0, &host_mem);
@@ -176,7 +176,7 @@ static NV_STATUS test_membar(uvm_gpu_t *gpu)

    // TODO: Bug 3839176: the test is waived on Confidential Computing because
    // it assumes that GPU can access system memory without using encryption.
    if (g_uvm_global.conf_computing_enabled)
    if (uvm_conf_computing_mode_enabled(gpu))
        return NV_OK;

    status = uvm_rm_mem_alloc_and_map_cpu(gpu, UVM_RM_MEM_TYPE_SYS, sizeof(NvU32), 0, &host_mem);
@@ -411,11 +411,10 @@ static NV_STATUS test_memcpy_and_memset(uvm_gpu_t *gpu)
    size_t i, j, k, s;
    uvm_mem_alloc_params_t mem_params = {0};

    if (g_uvm_global.conf_computing_enabled)
    if (uvm_conf_computing_mode_enabled(gpu))
        TEST_NV_CHECK_GOTO(uvm_mem_alloc_sysmem_dma_and_map_cpu_kernel(size, gpu, current->mm, &verif_mem), done);
    else
        TEST_NV_CHECK_GOTO(uvm_mem_alloc_sysmem_and_map_cpu_kernel(size, current->mm, &verif_mem), done);

    TEST_NV_CHECK_GOTO(uvm_mem_map_gpu_kernel(verif_mem, gpu), done);

    gpu_verif_addr = uvm_mem_gpu_address_virtual_kernel(verif_mem, gpu);
@@ -437,7 +436,7 @@ static NV_STATUS test_memcpy_and_memset(uvm_gpu_t *gpu)
    TEST_NV_CHECK_GOTO(uvm_rm_mem_alloc(gpu, UVM_RM_MEM_TYPE_SYS, size, 0, &sys_rm_mem), done);
    gpu_addresses[0] = uvm_rm_mem_get_gpu_va(sys_rm_mem, gpu, is_proxy_va_space);

    if (g_uvm_global.conf_computing_enabled) {
    if (uvm_conf_computing_mode_enabled(gpu)) {
        for (i = 0; i < iterations; ++i) {
            for (s = 0; s < ARRAY_SIZE(element_sizes); s++) {
                TEST_NV_CHECK_GOTO(test_memcpy_and_memset_inner(gpu,
@@ -560,7 +559,7 @@ static NV_STATUS test_semaphore_reduction_inc(uvm_gpu_t *gpu)

    // TODO: Bug 3839176: the test is waived on Confidential Computing because
    // it assumes that GPU can access system memory without using encryption.
    if (g_uvm_global.conf_computing_enabled)
    if (uvm_conf_computing_mode_enabled(gpu))
        return NV_OK;

    status = test_semaphore_alloc_sem(gpu, size, &mem);
@@ -612,7 +611,7 @@ static NV_STATUS test_semaphore_release(uvm_gpu_t *gpu)

    // TODO: Bug 3839176: the test is waived on Confidential Computing because
    // it assumes that GPU can access system memory without using encryption.
    if (g_uvm_global.conf_computing_enabled)
    if (uvm_conf_computing_mode_enabled(gpu))
        return NV_OK;

    status = test_semaphore_alloc_sem(gpu, size, &mem);
@@ -666,7 +665,7 @@ static NV_STATUS test_semaphore_timestamp(uvm_gpu_t *gpu)

    // TODO: Bug 3839176: the test is waived on Confidential Computing because
    // it assumes that GPU can access system memory without using encryption.
    if (g_uvm_global.conf_computing_enabled)
    if (uvm_conf_computing_mode_enabled(gpu))
        return NV_OK;

    status = test_semaphore_alloc_sem(gpu, size, &mem);
@@ -855,7 +854,6 @@ static NV_STATUS cpu_decrypt_in_order(uvm_channel_t *channel,
                                      uvm_mem_t *dst_mem,
                                      uvm_mem_t *src_mem,
                                      const UvmCslIv *decrypt_iv,
                                      NvU32 key_version,
                                      uvm_mem_t *auth_tag_mem,
                                      size_t size,
                                      NvU32 copy_size)
@@ -870,7 +868,6 @@ static NV_STATUS cpu_decrypt_in_order(uvm_channel_t *channel,
                                                         dst_plain + i * copy_size,
                                                         src_cipher + i * copy_size,
                                                         decrypt_iv + i,
                                                         key_version,
                                                         copy_size,
                                                         auth_tag_buffer + i * UVM_CONF_COMPUTING_AUTH_TAG_SIZE));
    }
@@ -881,7 +878,6 @@ static NV_STATUS cpu_decrypt_out_of_order(uvm_channel_t *channel,
                                          uvm_mem_t *dst_mem,
                                          uvm_mem_t *src_mem,
                                          const UvmCslIv *decrypt_iv,
                                          NvU32 key_version,
                                          uvm_mem_t *auth_tag_mem,
                                          size_t size,
                                          NvU32 copy_size)
@@ -899,7 +895,6 @@ static NV_STATUS cpu_decrypt_out_of_order(uvm_channel_t *channel,
                                                         dst_plain + i * copy_size,
                                                         src_cipher + i * copy_size,
                                                         decrypt_iv + i,
                                                         key_version,
                                                         copy_size,
                                                         auth_tag_buffer + i * UVM_CONF_COMPUTING_AUTH_TAG_SIZE));
    }
@@ -963,7 +958,7 @@ static void gpu_encrypt(uvm_push_t *push,
                i * UVM_CONF_COMPUTING_AUTH_TAG_SIZE,
            dst_cipher);

        uvm_conf_computing_log_gpu_encryption(push->channel, copy_size, decrypt_iv);
        uvm_conf_computing_log_gpu_encryption(push->channel, decrypt_iv);

        if (i > 0)
            uvm_push_set_flag(push, UVM_PUSH_FLAG_CE_NEXT_PIPELINED);
@@ -1024,7 +1019,6 @@ static NV_STATUS test_cpu_to_gpu_roundtrip(uvm_gpu_t *gpu,
    size_t auth_tag_buffer_size = (size / copy_size) * UVM_CONF_COMPUTING_AUTH_TAG_SIZE;
    UvmCslIv *decrypt_iv = NULL;
    UvmCslIv *encrypt_iv = NULL;
    NvU32 key_version;
    uvm_tracker_t tracker;
    size_t src_plain_size;

@@ -1094,11 +1088,6 @@ static NV_STATUS test_cpu_to_gpu_roundtrip(uvm_gpu_t *gpu,

    gpu_encrypt(&push, dst_cipher, dst_plain_gpu, auth_tag_mem, decrypt_iv, size, copy_size);

    // There shouldn't be any key rotation between the end of the push and the
    // CPU decryption(s), but it is more robust against test changes to force
    // decryption to use the saved key.
    key_version = uvm_channel_pool_key_version(push.channel->pool);

    TEST_NV_CHECK_GOTO(uvm_push_end_and_wait(&push), out);

    TEST_CHECK_GOTO(!mem_match(src_plain, src_cipher, size), out);
@@ -1111,7 +1100,6 @@ static NV_STATUS test_cpu_to_gpu_roundtrip(uvm_gpu_t *gpu,
                                            dst_plain,
                                            dst_cipher,
                                            decrypt_iv,
                                            key_version,
                                            auth_tag_mem,
                                            size,
                                            copy_size),
@@ -1122,7 +1110,6 @@ static NV_STATUS test_cpu_to_gpu_roundtrip(uvm_gpu_t *gpu,
                                                dst_plain,
                                                dst_cipher,
                                                decrypt_iv,
                                                key_version,
                                                auth_tag_mem,
                                                size,
                                                copy_size),
@@ -1166,7 +1153,7 @@ static NV_STATUS test_encryption_decryption(uvm_gpu_t *gpu,
    } small_sizes[] = {{1, 1}, {3, 1}, {8, 1}, {2, 2}, {8, 4}, {UVM_PAGE_SIZE_4K - 8, 8}, {UVM_PAGE_SIZE_4K + 8, 8}};

    // Only Confidential Computing uses CE encryption/decryption
    if (!g_uvm_global.conf_computing_enabled)
    if (!uvm_conf_computing_mode_enabled(gpu))
        return NV_OK;

    // Use a size, and copy size, that are not a multiple of common page sizes.
File diff suppressed because it is too large
@@ -228,65 +228,21 @@ typedef struct
    // variant is required when the thread holding the pool lock must sleep
    // (ex: acquire another mutex) deeper in the call stack, either in UVM or
    // RM.
    union
    {
    union {
        uvm_spinlock_t spinlock;
        uvm_mutex_t mutex;
    };

    struct
    {
        // Secure operations require that uvm_push_begin order matches
        // uvm_push_end order, because the engine's state is used in its
        // internal operation and each push may modify this state.
        // push_locks is protected by the channel pool lock.
        DECLARE_BITMAP(push_locks, UVM_CHANNEL_MAX_NUM_CHANNELS_PER_POOL);
    // Secure operations require that uvm_push_begin order matches
    // uvm_push_end order, because the engine's state is used in its internal
    // operation and each push may modify this state. push_locks is protected by
    // the channel pool lock.
    DECLARE_BITMAP(push_locks, UVM_CHANNEL_MAX_NUM_CHANNELS_PER_POOL);

        // Counting semaphore for available and unlocked channels, it must be
        // acquired before submitting work to a channel when the Confidential
        // Computing feature is enabled.
        uvm_semaphore_t push_sem;

        // Per channel buffers in unprotected sysmem.
        uvm_rm_mem_t *pool_sysmem;

        // Per channel buffers in protected vidmem.
        uvm_rm_mem_t *pool_vidmem;

        struct
        {
            // Current encryption key version, incremented upon key rotation.
            // While there are separate keys for encryption and decryption, the
            // two keys are rotated at once, so the versioning applies to both.
            NvU32 version;

            // Lock used to ensure mutual exclusion during key rotation.
            uvm_mutex_t mutex;

            // CSL contexts passed to RM for key rotation. This is usually an
            // array containing the CSL contexts associated with the channels in
            // the pool. In the case of the WLC pool, the array also includes
            // CSL contexts associated with LCIC channels.
            UvmCslContext **csl_contexts;

            // Number of elements in the CSL context array.
            unsigned num_csl_contexts;

            // Number of bytes encrypted, or decrypted, on the engine associated
            // with the pool since the last key rotation. Only used during
            // testing, to force key rotations after a certain encryption size,
            // see UVM_CONF_COMPUTING_KEY_ROTATION_LOWER_THRESHOLD.
            //
            // Encryptions on a LCIC pool are accounted for in the paired WLC
            // pool.
            //
            // TODO: Bug 4612912: these accounting variables can be removed once
            // RM exposes an API to set the key rotation lower threshold.
            atomic64_t encrypted;
            atomic64_t decrypted;
        } key_rotation;

    } conf_computing;
    // Counting semaphore for available and unlocked channels, it must be
    // acquired before submitting work to a channel when the Confidential
    // Computing feature is enabled.
    uvm_semaphore_t push_sem;
} uvm_channel_pool_t;
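The `key_rotation` comments in this struct describe per-pool byte counters that are only used in testing, to force a rotation once enough data has been encrypted or decrypted. A minimal sketch of that accounting follows, using C11 atomics in user space in place of the kernel's `atomic64_t`. All `sketch_*` names and the 1024-byte threshold are hypothetical, the mutex mentioned in the comments is omitted, and the actual rotation (performed through RM in the driver) is reduced to a version bump.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical threshold in bytes; the value behind
// UVM_CONF_COMPUTING_KEY_ROTATION_LOWER_THRESHOLD is not shown in this diff.
#define SKETCH_ROTATION_THRESHOLD ((uint64_t)1024)

typedef struct {
    uint32_t version;               // bumped on every rotation
    atomic_uint_fast64_t encrypted; // bytes encrypted since the last rotation
    atomic_uint_fast64_t decrypted; // bytes decrypted since the last rotation
} sketch_key_rotation_t;

static void sketch_key_rotation_init(sketch_key_rotation_t *kr)
{
    assert(kr != NULL);
    kr->version = 0;
    atomic_init(&kr->encrypted, 0);
    atomic_init(&kr->decrypted, 0);
}

// Accounting calls may come from several threads, hence the atomic adds.
static void sketch_log_encryption(sketch_key_rotation_t *kr, uint64_t size)
{
    atomic_fetch_add(&kr->encrypted, size);
}

static void sketch_log_decryption(sketch_key_rotation_t *kr, uint64_t size)
{
    atomic_fetch_add(&kr->decrypted, size);
}

// Rotates when either counter exceeds the threshold. Assumption: the caller
// serializes rotations, for example by holding a lock that is not modeled here.
static bool sketch_rotate_if_due(sketch_key_rotation_t *kr)
{
    if (atomic_load(&kr->encrypted) <= SKETCH_ROTATION_THRESHOLD &&
        atomic_load(&kr->decrypted) <= SKETCH_ROTATION_THRESHOLD)
        return false;

    // Both keys rotate together, so one version covers both directions.
    kr->version++;
    atomic_store(&kr->encrypted, 0);
    atomic_store(&kr->decrypted, 0);
    return true;
}
```

Keeping a single version number for both directions mirrors the comment that the encryption and decryption keys are rotated at once.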

struct uvm_channel_struct
@@ -366,14 +322,43 @@ struct uvm_channel_struct
        // work launches to match the order of push end-s that triggered them.
        volatile NvU32 gpu_put;

        // Protected sysmem location makes WLC independent from the pushbuffer
        // allocator. Unprotected sysmem and protected vidmem counterparts
        // are allocated from the channel pool (sysmem, vidmem).
        // Static pushbuffer for channels with static schedule (WLC/LCIC)
        uvm_rm_mem_t *static_pb_protected_vidmem;

        // Static pushbuffer staging buffer for WLC
        uvm_rm_mem_t *static_pb_unprotected_sysmem;
        void *static_pb_unprotected_sysmem_cpu;
        void *static_pb_unprotected_sysmem_auth_tag_cpu;

        // The above static locations are required by the WLC (and LCIC)
        // schedule. Protected sysmem location completes WLC's independence
        // from the pushbuffer allocator.
        void *static_pb_protected_sysmem;

        // Static tracking semaphore notifier values
        // Because of LCIC's fixed schedule, the secure semaphore release
        // mechanism uses two additional static locations for incrementing the
        // notifier values. See:
        // . channel_semaphore_secure_release()
        // . setup_lcic_schedule()
        // . internal_channel_submit_work_wlc()
        uvm_rm_mem_t *static_notifier_unprotected_sysmem;
        NvU32 *static_notifier_entry_unprotected_sysmem_cpu;
        NvU32 *static_notifier_exit_unprotected_sysmem_cpu;
        uvm_gpu_address_t static_notifier_entry_unprotected_sysmem_gpu_va;
        uvm_gpu_address_t static_notifier_exit_unprotected_sysmem_gpu_va;

        // Explicit location for push launch tag used by WLC.
        // Encryption auth tags have to be located in unprotected sysmem.
        void *launch_auth_tag_cpu;
        NvU64 launch_auth_tag_gpu_va;

        // Used to decrypt the push back to protected sysmem.
        // This happens when profilers register callbacks for migration data.
        uvm_push_crypto_bundle_t *push_crypto_bundles;

        // Accompanying authentication tags for the crypto bundles
        uvm_rm_mem_t *push_crypto_bundle_auth_tags;
    } conf_computing;

    // RM channel information
@@ -433,7 +418,7 @@ struct uvm_channel_manager_struct
    unsigned num_channel_pools;

    // Mask containing the indexes of the usable Copy Engines. Each usable CE
    // has at least one pool of type UVM_CHANNEL_POOL_TYPE_CE associated with it
    // has at least one pool associated with it.
    DECLARE_BITMAP(ce_mask, UVM_COPY_ENGINE_COUNT_MAX);

    struct
@@ -466,16 +451,6 @@ struct uvm_channel_manager_struct
        UVM_BUFFER_LOCATION gpput_loc;
        UVM_BUFFER_LOCATION pushbuffer_loc;
    } conf;

    struct
    {
        // Flag indicating that the WLC/LCIC mechanism is ready/setup; should
        // only be false during (de)initialization.
        bool wlc_ready;

        // True indicates that key rotation is enabled (UVM-wise).
        bool key_rotation_enabled;
    } conf_computing;
};

// Create a channel manager for the GPU
@@ -522,18 +497,6 @@ static bool uvm_channel_is_lcic(uvm_channel_t *channel)
    return uvm_channel_pool_is_lcic(channel->pool);
}

uvm_channel_t *uvm_channel_lcic_get_paired_wlc(uvm_channel_t *lcic_channel);

uvm_channel_t *uvm_channel_wlc_get_paired_lcic(uvm_channel_t *wlc_channel);

NvU64 uvm_channel_get_static_pb_protected_vidmem_gpu_va(uvm_channel_t *channel);

NvU64 uvm_channel_get_static_pb_unprotected_sysmem_gpu_va(uvm_channel_t *channel);

char* uvm_channel_get_static_pb_unprotected_sysmem_cpu(uvm_channel_t *channel);

char *uvm_channel_get_push_crypto_bundle_auth_tags_cpu_va(uvm_channel_t *channel, unsigned tag_index);

static bool uvm_channel_pool_is_proxy(uvm_channel_pool_t *pool)
{
    UVM_ASSERT(uvm_pool_type_is_valid(pool->pool_type));
@@ -565,17 +528,6 @@ static uvm_channel_type_t uvm_channel_proxy_channel_type(void)
    return UVM_CHANNEL_TYPE_MEMOPS;
}

// Force key rotation in the engine associated with the given channel pool.
// Rotation may still not happen if RM cannot acquire the necessary locks (in
// which case the function returns NV_ERR_STATE_IN_USE).
//
// This function should be only invoked in pools in which key rotation is
// enabled.
NV_STATUS uvm_channel_pool_rotate_key(uvm_channel_pool_t *pool);

// Retrieve the current encryption key version associated with the channel pool.
NvU32 uvm_channel_pool_key_version(uvm_channel_pool_t *pool);

// Privileged channels support all the Host and engine methods, while
// non-privileged channels don't support privileged methods.
//
@@ -623,9 +575,12 @@ NvU32 uvm_channel_manager_update_progress(uvm_channel_manager_t *channel_manager
// beginning.
NV_STATUS uvm_channel_manager_wait(uvm_channel_manager_t *manager);

// Check if WLC/LCIC mechanism is ready/setup
// Should only return false during initialization
static bool uvm_channel_manager_is_wlc_ready(uvm_channel_manager_t *manager)
{
    return manager->conf_computing.wlc_ready;
    return (manager->pool_to_use.default_for_type[UVM_CHANNEL_TYPE_WLC] != NULL) &&
           (manager->pool_to_use.default_for_type[UVM_CHANNEL_TYPE_LCIC] != NULL);
}
// Get the GPU VA of semaphore_channel's tracking semaphore within the VA space
// associated with access_channel.
@@ -648,11 +603,6 @@ bool uvm_channel_is_value_completed(uvm_channel_t *channel, NvU64 value);
// Update and get the latest completed value by the channel
NvU64 uvm_channel_update_completed_value(uvm_channel_t *channel);

// Wait for the channel to idle
// It waits for anything that is running, but doesn't prevent new work from
// beginning.
NV_STATUS uvm_channel_wait(uvm_channel_t *channel);

// Select and reserve a channel with the specified type for a push
NV_STATUS uvm_channel_reserve_type(uvm_channel_manager_t *manager,
                                   uvm_channel_type_t type,
@@ -667,9 +617,6 @@ NV_STATUS uvm_channel_reserve_gpu_to_gpu(uvm_channel_manager_t *channel_manager,
// Reserve a specific channel for a push or for a control GPFIFO entry.
NV_STATUS uvm_channel_reserve(uvm_channel_t *channel, NvU32 num_gpfifo_entries);

// Release reservation on a specific channel
void uvm_channel_release(uvm_channel_t *channel, NvU32 num_gpfifo_entries);

// Set optimal CE for P2P transfers between manager->gpu and peer
void uvm_channel_manager_set_p2p_ce(uvm_channel_manager_t *manager, uvm_gpu_t *peer, NvU32 optimal_ce);

@@ -701,8 +648,6 @@ NvU32 uvm_channel_get_available_gpfifo_entries(uvm_channel_t *channel);

void uvm_channel_print_pending_pushes(uvm_channel_t *channel);

bool uvm_channel_is_locked_for_push(uvm_channel_t *channel);

static uvm_gpu_t *uvm_channel_get_gpu(uvm_channel_t *channel)
{
    return channel->pool->manager->gpu;
@@ -1,5 +1,5 @@
/*******************************************************************************
    Copyright (c) 2015-2023 NVIDIA Corporation
    Copyright (c) 2015-2022 NVIDIA Corporation

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to
@@ -24,7 +24,6 @@
#include "uvm_global.h"
#include "uvm_channel.h"
#include "uvm_hal.h"
#include "uvm_mem.h"
#include "uvm_push.h"
#include "uvm_test.h"
#include "uvm_test_rng.h"
@@ -58,14 +57,14 @@ static NV_STATUS test_ordering(uvm_va_space_t *va_space)
    const NvU32 values_count = iters_per_channel_type_per_gpu;
    const size_t buffer_size = sizeof(NvU32) * values_count;

    // TODO: Bug 3839176: the test is waived on Confidential Computing because
    // it assumes that GPU can access system memory without using encryption.
    if (g_uvm_global.conf_computing_enabled)
        return NV_OK;

    gpu = uvm_va_space_find_first_gpu(va_space);
    TEST_CHECK_RET(gpu != NULL);

    // TODO: Bug 3839176: the test is waived on Confidential Computing because
    // it assumes that GPU can access system memory without using encryption.
    if (uvm_conf_computing_mode_enabled(gpu))
        return NV_OK;

    status = uvm_rm_mem_alloc_and_map_all(gpu, UVM_RM_MEM_TYPE_SYS, buffer_size, 0, &mem);
    TEST_CHECK_GOTO(status == NV_OK, done);

@@ -85,7 +84,7 @@ static NV_STATUS test_ordering(uvm_va_space_t *va_space)

    TEST_NV_CHECK_GOTO(uvm_tracker_add_push(&tracker, &push), done);

    exclude_proxy_channel_type = uvm_parent_gpu_needs_proxy_channel_pool(gpu->parent);
    exclude_proxy_channel_type = uvm_gpu_uses_proxy_channel_pool(gpu);

    for (i = 0; i < iters_per_channel_type_per_gpu; ++i) {
        for (j = 0; j < UVM_CHANNEL_TYPE_CE_COUNT; ++j) {
@@ -223,7 +222,7 @@ static NV_STATUS uvm_test_rc_for_gpu(uvm_gpu_t *gpu)
    // Check RC on a proxy channel (SR-IOV heavy) or internal channel (any other
    // mode). It is not allowed to use a virtual address in a memset pushed to
    // a proxy channel, so we use a physical address instead.
    if (uvm_parent_gpu_needs_proxy_channel_pool(gpu->parent)) {
    if (uvm_gpu_uses_proxy_channel_pool(gpu)) {
        uvm_gpu_address_t dst_address;

        // Save the line number the push that's supposed to fail was started on
@@ -315,110 +314,6 @@ static NV_STATUS test_rc(uvm_va_space_t *va_space)
    return NV_OK;
}

static NV_STATUS uvm_test_iommu_rc_for_gpu(uvm_gpu_t *gpu)
{
    NV_STATUS status = NV_OK;

#if defined(NV_IOMMU_IS_DMA_DOMAIN_PRESENT) && defined(CONFIG_IOMMU_DEFAULT_DMA_STRICT)
    // This test needs the DMA API to immediately invalidate IOMMU mappings on
    // DMA unmap (as apposed to lazy invalidation). The policy can be changed
    // on boot (e.g. iommu.strict=1), but there isn't a good way to check for
    // the runtime setting. CONFIG_IOMMU_DEFAULT_DMA_STRICT checks for the
    // default value.

    uvm_push_t push;
    uvm_mem_t *sysmem;
    uvm_gpu_address_t sysmem_dma_addr;
    char *cpu_ptr = NULL;
    const size_t data_size = PAGE_SIZE;
    size_t i;

    struct device *dev = &gpu->parent->pci_dev->dev;
    struct iommu_domain *domain = iommu_get_domain_for_dev(dev);

    // Check that the iommu domain is controlled by linux DMA API
    if (!domain || !iommu_is_dma_domain(domain))
        return NV_OK;

    // Only run if ATS is enabled with 64kB base page.
    // Otherwise the CE doesn't get response on writing to unmapped location.
    if (!g_uvm_global.ats.enabled || PAGE_SIZE != UVM_PAGE_SIZE_64K)
        return NV_OK;

    status = uvm_mem_alloc_sysmem_and_map_cpu_kernel(data_size, NULL, &sysmem);
    TEST_NV_CHECK_RET(status);

    status = uvm_mem_map_gpu_phys(sysmem, gpu);
    TEST_NV_CHECK_GOTO(status, done);

    cpu_ptr = uvm_mem_get_cpu_addr_kernel(sysmem);
    sysmem_dma_addr = uvm_mem_gpu_address_physical(sysmem, gpu, 0, data_size);

    status = uvm_push_begin(gpu->channel_manager, UVM_CHANNEL_TYPE_GPU_TO_CPU, &push, "Test memset to IOMMU mapped sysmem");
    TEST_NV_CHECK_GOTO(status, done);

    gpu->parent->ce_hal->memset_8(&push, sysmem_dma_addr, 0, data_size);

    status = uvm_push_end_and_wait(&push);
    TEST_NV_CHECK_GOTO(status, done);

    // Check that we have zeroed the memory
    for (i = 0; i < data_size; ++i)
        TEST_CHECK_GOTO(cpu_ptr[i] == 0, done);

    // Unmap the buffer and try write again to the same address
    uvm_mem_unmap_gpu_phys(sysmem, gpu);

    status = uvm_push_begin(gpu->channel_manager, UVM_CHANNEL_TYPE_GPU_TO_CPU, &push, "Test memset after IOMMU unmap");
    TEST_NV_CHECK_GOTO(status, done);

    gpu->parent->ce_hal->memset_4(&push, sysmem_dma_addr, 0xffffffff, data_size);

    status = uvm_push_end_and_wait(&push);

    TEST_CHECK_GOTO(status == NV_ERR_RC_ERROR, done);
    TEST_CHECK_GOTO(uvm_channel_get_status(push.channel) == NV_ERR_RC_ERROR, done);
    TEST_CHECK_GOTO(uvm_global_reset_fatal_error() == NV_ERR_RC_ERROR, done);

    // Check that writes after unmap did not succeed
    for (i = 0; i < data_size; ++i)
        TEST_CHECK_GOTO(cpu_ptr[i] == 0, done);

    status = NV_OK;

done:
    uvm_mem_free(sysmem);
#endif
    return status;
}

static NV_STATUS test_iommu(uvm_va_space_t *va_space)
{
    uvm_gpu_t *gpu;

    uvm_assert_mutex_locked(&g_uvm_global.global_lock);

    for_each_va_space_gpu(gpu, va_space) {
        NV_STATUS test_status, create_status;

        // The GPU channel manager is destroyed and then re-created after
        // testing ATS RC fault, so this test requires exclusive access to the GPU.
        TEST_CHECK_RET(uvm_gpu_retained_count(gpu) == 1);

        g_uvm_global.disable_fatal_error_assert = true;
        test_status = uvm_test_iommu_rc_for_gpu(gpu);
        g_uvm_global.disable_fatal_error_assert = false;

        uvm_channel_manager_destroy(gpu->channel_manager);
        create_status = uvm_channel_manager_create(gpu, &gpu->channel_manager);

        TEST_NV_CHECK_RET(test_status);
        TEST_NV_CHECK_RET(create_status);
    }

    return NV_OK;
}

typedef struct
{
    uvm_push_t push;
@@ -508,7 +403,7 @@ static uvm_channel_type_t random_ce_channel_type_except(uvm_test_rng_t *rng, uvm
|
||||
|
||||
static uvm_channel_type_t gpu_random_internal_ce_channel_type(uvm_gpu_t *gpu, uvm_test_rng_t *rng)
|
||||
{
|
||||
if (uvm_parent_gpu_needs_proxy_channel_pool(gpu->parent))
|
||||
if (uvm_gpu_uses_proxy_channel_pool(gpu))
|
||||
return random_ce_channel_type_except(rng, uvm_channel_proxy_channel_type());
|
||||
|
||||
return random_ce_channel_type(rng);
|
||||
@@ -691,16 +586,12 @@ static NV_STATUS stress_test_all_gpus_in_va(uvm_va_space_t *va_space,
|
||||
if (uvm_test_rng_range_32(&rng, 0, 1) == 0) {
|
||||
NvU32 random_stream_index = uvm_test_rng_range_32(&rng, 0, num_streams - 1);
|
||||
uvm_test_stream_t *random_stream = &streams[random_stream_index];
|
||||
|
||||
if ((random_stream->push.gpu == gpu) || uvm_push_allow_dependencies_across_gpus()) {
|
||||
uvm_push_acquire_tracker(&stream->push, &random_stream->tracker);
|
||||
|
||||
snapshot_counter(&stream->push,
|
||||
random_stream->counter_mem,
|
||||
stream->other_stream_counter_snapshots_mem,
|
||||
i,
|
||||
random_stream->queued_counter_repeat);
|
||||
}
|
||||
uvm_push_acquire_tracker(&stream->push, &random_stream->tracker);
|
||||
snapshot_counter(&stream->push,
|
||||
random_stream->counter_mem,
|
||||
stream->other_stream_counter_snapshots_mem,
|
||||
i,
|
||||
random_stream->queued_counter_repeat);
|
||||
}
|
||||
|
||||
uvm_push_end(&stream->push);
|
||||
@@ -796,10 +687,15 @@ done:
|
||||
NV_STATUS test_conf_computing_channel_selection(uvm_va_space_t *va_space)
|
||||
{
|
||||
NV_STATUS status = NV_OK;
|
||||
uvm_push_t *pushes = NULL;
|
||||
uvm_gpu_t *gpu = NULL;
|
||||
uvm_channel_pool_t *pool;
|
||||
uvm_push_t *pushes;
|
||||
uvm_gpu_t *gpu;
|
||||
NvU32 i;
|
||||
NvU32 num_pushes;
|
||||
|
||||
if (!g_uvm_global.conf_computing_enabled)
|
||||
gpu = uvm_va_space_find_first_gpu(va_space);
|
||||
|
||||
if (!uvm_conf_computing_mode_enabled(gpu))
|
||||
return NV_OK;
|
||||
|
||||
uvm_thread_context_lock_disable_tracking();
|
||||
@@ -807,19 +703,9 @@ NV_STATUS test_conf_computing_channel_selection(uvm_va_space_t *va_space)
|
||||
for_each_va_space_gpu(gpu, va_space) {
|
||||
uvm_channel_type_t channel_type;
|
||||
|
||||
// Key rotation is disabled because this test relies on nested pushes,
|
||||
// which is illegal. If any push other than the first one triggers key
|
||||
// rotation, the test won't complete. This is because key rotation
|
||||
// depends on waiting for ongoing pushes to end, which doesn't happen
|
||||
// if those pushes are ended after the current one begins.
|
||||
uvm_conf_computing_disable_key_rotation(gpu);
|
||||
|
||||
for (channel_type = 0; channel_type < UVM_CHANNEL_TYPE_COUNT; channel_type++) {
|
||||
NvU32 i;
|
||||
NvU32 num_pushes;
|
||||
uvm_channel_pool_t *pool = gpu->channel_manager->pool_to_use.default_for_type[channel_type];
|
||||
|
||||
TEST_CHECK_GOTO(pool != NULL, error);
|
||||
pool = gpu->channel_manager->pool_to_use.default_for_type[channel_type];
|
||||
TEST_CHECK_RET(pool != NULL);
|
||||
|
||||
// Skip LCIC channels as those can't accept any pushes
|
||||
if (uvm_channel_pool_is_lcic(pool))
|
||||
@@ -831,7 +717,7 @@ NV_STATUS test_conf_computing_channel_selection(uvm_va_space_t *va_space)
|
||||
num_pushes = min(pool->num_channels, (NvU32)UVM_PUSH_MAX_CONCURRENT_PUSHES);
|
||||
|
||||
pushes = uvm_kvmalloc_zero(sizeof(*pushes) * num_pushes);
|
||||
TEST_CHECK_GOTO(pushes != NULL, error);
|
||||
TEST_CHECK_RET(pushes != NULL);
|
||||
|
||||
for (i = 0; i < num_pushes; i++) {
|
||||
uvm_push_t *push = &pushes[i];
|
||||
@@ -848,431 +734,18 @@ NV_STATUS test_conf_computing_channel_selection(uvm_va_space_t *va_space)
|
||||
|
||||
uvm_kvfree(pushes);
|
||||
}
|
||||
|
||||
uvm_conf_computing_enable_key_rotation(gpu);
|
||||
}
|
||||
|
||||
uvm_thread_context_lock_enable_tracking();
|
||||
|
||||
return status;
|
||||
|
||||
error:
|
||||
if (gpu != NULL)
|
||||
uvm_conf_computing_enable_key_rotation(gpu);
|
||||
|
||||
uvm_thread_context_lock_enable_tracking();
|
||||
uvm_kvfree(pushes);
|
||||
|
||||
return status;
|
||||
}
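The listing above shows the same checks written two ways: one side returns directly from the check (`TEST_CHECK_RET`), the other jumps to a shared `error:` label (`TEST_CHECK_GOTO`) that re-enables tracking and frees `pushes`. A minimal user-space sketch of the single-exit cleanup pattern follows. It is not the UVM code: `run_with_cleanup`, `g_tracking_enabled`, and the integer status values are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical stand-in for a global state that must be restored on every
// exit path (comparable to lock tracking being disabled during the test).
static int g_tracking_enabled = 1;

// Returns 0 on success and a negative value on failure. Every path taken
// after the state change leaves through the "out" label, so the buffer is
// always freed and the state is always restored.
static int run_with_cleanup(size_t count, int fail_midway)
{
    int status = 0;
    int *items = NULL;

    if (count == 0)
        return -3;              // nothing acquired yet, a direct return is safe

    g_tracking_enabled = 0;     // state that must be undone

    items = calloc(count, sizeof(*items));
    if (items == NULL) {
        status = -1;
        goto out;
    }

    if (fail_midway) {          // simulated failing check in the middle of the work
        status = -2;
        goto out;
    }

    items[0] = 1;               // the actual work would go here

out:
    free(items);                // free(NULL) is a no-op
    g_tracking_enabled = 1;
    return status;
}
```

A direct return from the simulated failing check would leave `g_tracking_enabled` at 0 and leak `items`, which is the hazard the shared label avoids.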
|
||||
|
||||
NV_STATUS test_channel_iv_rotation(uvm_va_space_t *va_space)
|
||||
{
|
||||
uvm_gpu_t *gpu;
|
||||
|
||||
if (!g_uvm_global.conf_computing_enabled)
|
||||
return NV_OK;
|
||||
|
||||
for_each_va_space_gpu(gpu, va_space) {
|
||||
uvm_channel_pool_t *pool;
|
||||
|
||||
uvm_for_each_pool(pool, gpu->channel_manager) {
|
||||
NvU64 before_rotation_enc, before_rotation_dec, after_rotation_enc, after_rotation_dec;
|
||||
NV_STATUS status = NV_OK;
|
||||
|
||||
// Check one (the first) channel per pool
|
||||
uvm_channel_t *channel = pool->channels;
|
||||
|
||||
// Create a dummy encrypt/decrypt push to use a few IVs.
// SEC2 already used encryption during initialization, so it does not need
// a dummy push.
|
||||
if (!uvm_channel_is_sec2(channel)) {
|
||||
uvm_push_t push;
|
||||
size_t data_size;
|
||||
uvm_conf_computing_dma_buffer_t *cipher_text;
|
||||
void *cipher_cpu_va, *plain_cpu_va, *tag_cpu_va;
|
||||
uvm_gpu_address_t cipher_gpu_address, plain_gpu_address, tag_gpu_address;
|
||||
uvm_channel_t *work_channel = uvm_channel_is_lcic(channel) ? uvm_channel_lcic_get_paired_wlc(channel) : channel;
|
||||
|
||||
plain_cpu_va = &status;
|
||||
data_size = sizeof(status);
|
||||
|
||||
TEST_NV_CHECK_RET(uvm_conf_computing_dma_buffer_alloc(&gpu->conf_computing.dma_buffer_pool,
|
||||
&cipher_text,
|
||||
NULL));
|
||||
cipher_cpu_va = uvm_mem_get_cpu_addr_kernel(cipher_text->alloc);
|
||||
tag_cpu_va = uvm_mem_get_cpu_addr_kernel(cipher_text->auth_tag);
|
||||
|
||||
cipher_gpu_address = uvm_mem_gpu_address_virtual_kernel(cipher_text->alloc, gpu);
|
||||
tag_gpu_address = uvm_mem_gpu_address_virtual_kernel(cipher_text->auth_tag, gpu);
|
||||
|
||||
TEST_NV_CHECK_GOTO(uvm_push_begin_on_channel(work_channel, &push, "Dummy push for IV rotation"), free);
|
||||
|
||||
(void)uvm_push_get_single_inline_buffer(&push,
|
||||
data_size,
|
||||
UVM_CONF_COMPUTING_BUF_ALIGNMENT,
|
||||
&plain_gpu_address);
|
||||
|
||||
uvm_conf_computing_cpu_encrypt(work_channel, cipher_cpu_va, plain_cpu_va, NULL, data_size, tag_cpu_va);
|
||||
gpu->parent->ce_hal->decrypt(&push, plain_gpu_address, cipher_gpu_address, data_size, tag_gpu_address);
|
||||
|
||||
TEST_NV_CHECK_GOTO(uvm_push_end_and_wait(&push), free);
|
||||
|
||||
free:
|
||||
uvm_conf_computing_dma_buffer_free(&gpu->conf_computing.dma_buffer_pool, cipher_text, NULL);
|
||||
|
||||
if (status != NV_OK)
|
||||
return status;
|
||||
}
|
||||
|
||||
// Reserve a channel to hold the push lock during rotation
|
||||
if (!uvm_channel_is_lcic(channel))
|
||||
TEST_NV_CHECK_RET(uvm_channel_reserve(channel, 1));
|
||||
|
||||
uvm_conf_computing_query_message_pools(channel, &before_rotation_enc, &before_rotation_dec);
|
||||
TEST_NV_CHECK_GOTO(uvm_conf_computing_rotate_channel_ivs_below_limit(channel, -1, true), release);
|
||||
uvm_conf_computing_query_message_pools(channel, &after_rotation_enc, &after_rotation_dec);
|
||||
|
||||
release:
|
||||
if (!uvm_channel_is_lcic(channel))
|
||||
uvm_channel_release(channel, 1);
|
||||
|
||||
if (status != NV_OK)
|
||||
return status;
|
||||
|
||||
// All channels except SEC2 used at least a single IV to release tracking.
|
||||
// SEC2 doesn't support decrypt direction.
|
||||
if (uvm_channel_is_sec2(channel))
|
||||
TEST_CHECK_RET(before_rotation_dec == after_rotation_dec);
|
||||
else
|
||||
TEST_CHECK_RET(before_rotation_dec < after_rotation_dec);
|
||||
|
||||
// All channels used one CPU encrypt/GPU decrypt, either during
|
||||
// initialization or in the push above, with the exception of LCIC.
|
||||
// LCIC is used in tandem with WLC, but it never uses CPU encrypt/
|
||||
// GPU decrypt ops.
|
||||
if (uvm_channel_is_lcic(channel))
|
||||
TEST_CHECK_RET(before_rotation_enc == after_rotation_enc);
|
||||
else
|
||||
TEST_CHECK_RET(before_rotation_enc < after_rotation_enc);
|
||||
}
|
||||
}
|
||||
|
||||
return NV_OK;
|
||||
}
|
||||
|
||||
static NV_STATUS force_key_rotations(uvm_channel_pool_t *pool, unsigned num_rotations)
|
||||
{
|
||||
unsigned num_tries;
|
||||
unsigned max_num_tries = 20;
|
||||
unsigned num_rotations_completed = 0;
|
||||
|
||||
if (num_rotations == 0)
|
||||
return NV_OK;
|
||||
|
||||
// The number of accepted rotations is kept low, so failed rotation
|
||||
// invocations due to RM not acquiring the necessary locks (which imply a
|
||||
// sleep in the test) do not balloon the test execution time.
|
||||
UVM_ASSERT(num_rotations <= 10);
|
||||
|
||||
for (num_tries = 0; (num_tries < max_num_tries) && (num_rotations_completed < num_rotations); num_tries++) {
|
||||
// Force key rotation, irrespective of encryption usage.
|
||||
NV_STATUS status = uvm_channel_pool_rotate_key(pool);
|
||||
|
||||
// Key rotation may not be able to complete due to RM failing to acquire
// the necessary locks. Detect the situation, sleep for a bit, and then
// try again.
//
// The maximum time spent sleeping in a single rotation call is
// (max_num_tries * max_sleep_us).
|
||||
if (status == NV_ERR_STATE_IN_USE) {
|
||||
NvU32 min_sleep_us = 1000;
|
||||
NvU32 max_sleep_us = 10000;
|
||||
|
||||
usleep_range(min_sleep_us, max_sleep_us);
|
||||
continue;
|
||||
}
|
||||
|
||||
TEST_NV_CHECK_RET(status);
|
||||
|
||||
num_rotations_completed++;
|
||||
}
|
||||
|
||||
// If not a single key rotation occurred, the dependent tests still pass,
// but there is not much value in them. Instead, return an error so that the
// maximum number of tries, or the maximum sleep time, can be adjusted to
// ensure that at least one rotation completes.
|
||||
if (num_rotations_completed > 0)
|
||||
return NV_OK;
|
||||
else
|
||||
return NV_ERR_STATE_IN_USE;
|
||||
}
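The retry logic of `force_key_rotations` can be sketched outside the kernel as a bounded loop that treats one status as "busy, try again" and any other failure as fatal. The sketch below is an illustration under stated assumptions, not the driver code: `status_t`, `attempt_fn`, `retry_bounded`, and the scripted stub are hypothetical names, and the sleep between attempts is left as a comment.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { STATUS_OK, STATUS_BUSY, STATUS_ERROR } status_t;

// Hypothetical operation: performs one attempt and reports its outcome.
typedef status_t (*attempt_fn)(void *ctx);

// Tries to complete "wanted" successful attempts in at most "max_tries" calls.
// STATUS_BUSY consumes a try and is retried; any other failure aborts.
// Succeeds if at least one attempt completed, mirroring the listing above.
static status_t retry_bounded(attempt_fn attempt,
                              void *ctx,
                              unsigned wanted,
                              unsigned max_tries,
                              unsigned *completed_out)
{
    unsigned tries;
    unsigned completed = 0;
    status_t result = STATUS_OK;

    if (wanted > 0) {
        for (tries = 0; tries < max_tries && completed < wanted; tries++) {
            status_t status = attempt(ctx);

            if (status == STATUS_BUSY)
                continue;       // kernel code would sleep here before retrying

            if (status != STATUS_OK) {
                result = status;
                break;
            }

            completed++;
        }

        if (result == STATUS_OK && completed == 0)
            result = STATUS_BUSY;
    }

    if (completed_out != NULL)
        *completed_out = completed;

    return result;
}

// Test stub: replays a fixed script of outcomes, then reports busy forever.
typedef struct {
    const status_t *script;
    unsigned length;
    unsigned next;
} scripted_ctx_t;

static status_t scripted_attempt(void *ctx)
{
    scripted_ctx_t *s = (scripted_ctx_t *)ctx;

    return s->next < s->length ? s->script[s->next++] : STATUS_BUSY;
}
```

Capping both the requested count and the number of tries bounds the worst-case time spent sleeping, which is the concern the comments in the listing raise.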
|
||||
|
||||
static NV_STATUS force_key_rotation(uvm_channel_pool_t *pool)
{
    return force_key_rotations(pool, 1);
}
|
||||
|
||||
// Test key rotation in all pools. This is useful because key rotation may not
|
||||
// happen otherwise on certain engines during UVM test execution. For example,
|
||||
// if the MEMOPS channel type is mapped to a CE not shared with any other
|
||||
// channel type, then the only encryption taking place in the engine is due to
|
||||
// semaphore releases (4 bytes each). This small encryption size makes it
|
||||
// unlikely to exceed even small rotation thresholds.
|
||||
static NV_STATUS test_channel_key_rotation_basic(uvm_gpu_t *gpu)
|
||||
{
|
||||
uvm_channel_pool_t *pool;
|
||||
|
||||
uvm_for_each_pool(pool, gpu->channel_manager) {
|
||||
if (!uvm_conf_computing_is_key_rotation_enabled_in_pool(pool))
|
||||
continue;
|
||||
|
||||
TEST_NV_CHECK_RET(force_key_rotation(pool));
|
||||
}
|
||||
|
||||
return NV_OK;
|
||||
}
|
||||
|
||||
// Interleave GPU encryptions and decryptions, and their CPU counterparts, with
|
||||
// key rotations.
|
||||
static NV_STATUS test_channel_key_rotation_interleave(uvm_gpu_t *gpu)
|
||||
{
|
||||
int i;
|
||||
uvm_channel_pool_t *gpu_to_cpu_pool;
|
||||
uvm_channel_pool_t *cpu_to_gpu_pool;
|
||||
NV_STATUS status = NV_OK;
|
||||
size_t size = UVM_CONF_COMPUTING_DMA_BUFFER_SIZE;
|
||||
void *initial_plain_cpu = NULL;
|
||||
void *final_plain_cpu = NULL;
|
||||
uvm_mem_t *plain_gpu = NULL;
|
||||
uvm_gpu_address_t plain_gpu_address;
|
||||
|
||||
cpu_to_gpu_pool = gpu->channel_manager->pool_to_use.default_for_type[UVM_CHANNEL_TYPE_CPU_TO_GPU];
|
||||
TEST_CHECK_RET(uvm_conf_computing_is_key_rotation_enabled_in_pool(cpu_to_gpu_pool));
|
||||
|
||||
gpu_to_cpu_pool = gpu->channel_manager->pool_to_use.default_for_type[UVM_CHANNEL_TYPE_GPU_TO_CPU];
|
||||
TEST_CHECK_RET(uvm_conf_computing_is_key_rotation_enabled_in_pool(gpu_to_cpu_pool));
|
||||
|
||||
initial_plain_cpu = uvm_kvmalloc_zero(size);
|
||||
if (initial_plain_cpu == NULL) {
|
||||
status = NV_ERR_NO_MEMORY;
|
||||
goto out;
|
||||
}
|
||||
|
||||
final_plain_cpu = uvm_kvmalloc_zero(size);
|
||||
if (final_plain_cpu == NULL) {
|
||||
status = NV_ERR_NO_MEMORY;
|
||||
goto out;
|
||||
}
|
||||
|
||||
TEST_NV_CHECK_GOTO(uvm_mem_alloc_vidmem(size, gpu, &plain_gpu), out);
|
||||
TEST_NV_CHECK_GOTO(uvm_mem_map_gpu_kernel(plain_gpu, gpu), out);
|
||||
plain_gpu_address = uvm_mem_gpu_address_virtual_kernel(plain_gpu, gpu);
|
||||
|
||||
memset(initial_plain_cpu, 1, size);
|
||||
|
||||
for (i = 0; i < 5; i++) {
|
||||
TEST_NV_CHECK_GOTO(force_key_rotation(gpu_to_cpu_pool), out);
|
||||
TEST_NV_CHECK_GOTO(force_key_rotation(cpu_to_gpu_pool), out);
|
||||
|
||||
TEST_NV_CHECK_GOTO(uvm_conf_computing_util_memcopy_cpu_to_gpu(gpu,
|
||||
plain_gpu_address,
|
||||
initial_plain_cpu,
|
||||
size,
|
||||
NULL,
|
||||
"CPU > GPU"),
|
||||
out);
|
||||
|
||||
TEST_NV_CHECK_GOTO(force_key_rotation(gpu_to_cpu_pool), out);
|
||||
TEST_NV_CHECK_GOTO(force_key_rotation(cpu_to_gpu_pool), out);
|
||||
|
||||
TEST_NV_CHECK_GOTO(uvm_conf_computing_util_memcopy_gpu_to_cpu(gpu,
|
||||
final_plain_cpu,
|
||||
plain_gpu_address,
|
||||
size,
|
||||
NULL,
|
||||
"GPU > CPU"),
|
||||
out);
|
||||
|
||||
TEST_CHECK_GOTO(!memcmp(initial_plain_cpu, final_plain_cpu, size), out);
|
||||
|
||||
memset(final_plain_cpu, 0, size);
|
||||
}
|
||||
|
||||
out:
|
||||
uvm_mem_free(plain_gpu);
|
||||
uvm_kvfree(final_plain_cpu);
|
||||
uvm_kvfree(initial_plain_cpu);
|
||||
|
||||
return status;
|
||||
}
|
||||
|
||||
static NV_STATUS memset_vidmem(uvm_mem_t *mem, NvU8 val)
{
    uvm_push_t push;
    uvm_gpu_address_t gpu_address;
    uvm_gpu_t *gpu = mem->backing_gpu;

    UVM_ASSERT(uvm_mem_is_vidmem(mem));

    TEST_NV_CHECK_RET(uvm_push_begin(gpu->channel_manager, UVM_CHANNEL_TYPE_GPU_INTERNAL, &push, "zero vidmem"));

    gpu_address = uvm_mem_gpu_address_virtual_kernel(mem, gpu);
    gpu->parent->ce_hal->memset_1(&push, gpu_address, val, mem->size);

    TEST_NV_CHECK_RET(uvm_push_end_and_wait(&push));

    return NV_OK;
}
|
||||
|
||||
// Custom version of uvm_conf_computing_util_memcopy_gpu_to_cpu that allows
// tests to insert key rotations between the end of the push and the CPU
// decryption.
|
||||
static NV_STATUS encrypted_memcopy_gpu_to_cpu(uvm_gpu_t *gpu,
|
||||
void *dst_plain,
|
||||
uvm_gpu_address_t src_gpu_address,
|
||||
size_t size,
|
||||
unsigned num_rotations_to_insert)
|
||||
{
|
||||
NV_STATUS status;
|
||||
uvm_push_t push;
|
||||
uvm_conf_computing_dma_buffer_t *dma_buffer;
|
||||
uvm_gpu_address_t dst_gpu_address, auth_tag_gpu_address;
|
||||
void *src_cipher, *auth_tag;
|
||||
uvm_channel_t *channel;
|
||||
|
||||
UVM_ASSERT(g_uvm_global.conf_computing_enabled);
|
||||
UVM_ASSERT(size <= UVM_CONF_COMPUTING_DMA_BUFFER_SIZE);
|
||||
|
||||
status = uvm_conf_computing_dma_buffer_alloc(&gpu->conf_computing.dma_buffer_pool, &dma_buffer, NULL);
|
||||
if (status != NV_OK)
|
||||
return status;
|
||||
|
||||
status = uvm_push_begin(gpu->channel_manager, UVM_CHANNEL_TYPE_GPU_TO_CPU, &push, "Small GPU > CPU encryption");
|
||||
if (status != NV_OK)
|
||||
goto out;
|
||||
|
||||
channel = push.channel;
|
||||
uvm_conf_computing_log_gpu_encryption(channel, size, dma_buffer->decrypt_iv);
|
||||
dma_buffer->key_version[0] = uvm_channel_pool_key_version(channel->pool);
|
||||
|
||||
dst_gpu_address = uvm_mem_gpu_address_virtual_kernel(dma_buffer->alloc, gpu);
|
||||
auth_tag_gpu_address = uvm_mem_gpu_address_virtual_kernel(dma_buffer->auth_tag, gpu);
|
||||
gpu->parent->ce_hal->encrypt(&push, dst_gpu_address, src_gpu_address, size, auth_tag_gpu_address);
|
||||
|
||||
status = uvm_push_end_and_wait(&push);
|
||||
if (status != NV_OK)
|
||||
goto out;
|
||||
|
||||
TEST_NV_CHECK_GOTO(force_key_rotations(channel->pool, num_rotations_to_insert), out);
|
||||
|
||||
// If num_rotations_to_insert is not zero, the current encryption key will
|
||||
// be different from the one used during CE encryption.
|
||||
|
||||
src_cipher = uvm_mem_get_cpu_addr_kernel(dma_buffer->alloc);
|
||||
auth_tag = uvm_mem_get_cpu_addr_kernel(dma_buffer->auth_tag);
|
||||
status = uvm_conf_computing_cpu_decrypt(channel,
|
||||
dst_plain,
|
||||
src_cipher,
|
||||
dma_buffer->decrypt_iv,
|
||||
dma_buffer->key_version[0],
|
||||
size,
|
||||
auth_tag);
|
||||
|
||||
out:
|
||||
uvm_conf_computing_dma_buffer_free(&gpu->conf_computing.dma_buffer_pool, dma_buffer, NULL);
|
||||
return status;
|
||||
}
|
||||
|
||||
static NV_STATUS test_channel_key_rotation_cpu_decryption(uvm_gpu_t *gpu,
|
||||
unsigned num_repetitions,
|
||||
unsigned num_rotations_to_insert)
|
||||
{
|
||||
unsigned i;
|
||||
uvm_channel_pool_t *gpu_to_cpu_pool;
|
||||
NV_STATUS status = NV_OK;
|
||||
size_t size = UVM_CONF_COMPUTING_DMA_BUFFER_SIZE;
|
||||
NvU8 *plain_cpu = NULL;
|
||||
uvm_mem_t *plain_gpu = NULL;
|
||||
uvm_gpu_address_t plain_gpu_address;
|
||||
|
||||
if (!uvm_conf_computing_is_key_rotation_enabled(gpu))
|
||||
return NV_OK;
|
||||
|
||||
gpu_to_cpu_pool = gpu->channel_manager->pool_to_use.default_for_type[UVM_CHANNEL_TYPE_GPU_TO_CPU];
|
||||
TEST_CHECK_RET(uvm_conf_computing_is_key_rotation_enabled_in_pool(gpu_to_cpu_pool));
|
||||
|
||||
plain_cpu = (NvU8 *) uvm_kvmalloc_zero(size);
|
||||
if (plain_cpu == NULL) {
|
||||
status = NV_ERR_NO_MEMORY;
|
||||
goto out;
|
||||
}
|
||||
|
||||
TEST_NV_CHECK_GOTO(uvm_mem_alloc_vidmem(size, gpu, &plain_gpu), out);
|
||||
TEST_NV_CHECK_GOTO(uvm_mem_map_gpu_kernel(plain_gpu, gpu), out);
|
||||
TEST_NV_CHECK_GOTO(memset_vidmem(plain_gpu, 1), out);
|
||||
|
||||
plain_gpu_address = uvm_mem_gpu_address_virtual_kernel(plain_gpu, gpu);
|
||||
|
||||
for (i = 0; i < num_repetitions; i++) {
|
||||
unsigned j;
|
||||
|
||||
TEST_NV_CHECK_GOTO(encrypted_memcopy_gpu_to_cpu(gpu,
|
||||
plain_cpu,
|
||||
plain_gpu_address,
|
||||
size,
|
||||
num_rotations_to_insert),
|
||||
out);
|
||||
|
||||
for (j = 0; j < size; j++)
|
||||
TEST_CHECK_GOTO(plain_cpu[j] == 1, out);
|
||||
|
||||
memset(plain_cpu, 0, size);
|
||||
|
||||
}
|
||||
out:
|
||||
uvm_mem_free(plain_gpu);
|
||||
uvm_kvfree(plain_cpu);
|
||||
|
||||
return status;
|
||||
}
|
||||
|
||||
// Test that CPU decryptions can use old keys, i.e. previous versions of the
// keys that are no longer the current key due to key rotation. Given that SEC2
// does not expose encryption capabilities, the "decrypt-after-rotation" problem
// is exclusive to CE encryptions.
|
||||
static NV_STATUS test_channel_key_rotation_decrypt_after_key_rotation(uvm_gpu_t *gpu)
|
||||
{
|
||||
// Instruct encrypted_memcopy_gpu_to_cpu to insert several key rotations
|
||||
// between the GPU encryption, and the associated CPU decryption.
|
||||
unsigned num_rotations_to_insert = 8;
|
||||
|
||||
TEST_NV_CHECK_RET(test_channel_key_rotation_cpu_decryption(gpu, 1, num_rotations_to_insert));
|
||||
|
||||
return NV_OK;
|
||||
}
|
||||
|
||||
static NV_STATUS test_channel_key_rotation(uvm_va_space_t *va_space)
|
||||
{
|
||||
uvm_gpu_t *gpu;
|
||||
|
||||
if (!g_uvm_global.conf_computing_enabled)
|
||||
return NV_OK;
|
||||
|
||||
for_each_va_space_gpu(gpu, va_space) {
|
||||
if (!uvm_conf_computing_is_key_rotation_enabled(gpu))
|
||||
break;
|
||||
|
||||
TEST_NV_CHECK_RET(test_channel_key_rotation_basic(gpu));
|
||||
|
||||
TEST_NV_CHECK_RET(test_channel_key_rotation_interleave(gpu));
|
||||
|
||||
TEST_NV_CHECK_RET(test_channel_key_rotation_decrypt_after_key_rotation(gpu));
|
||||
}
|
||||
|
||||
return NV_OK;
|
||||
}
|
||||
|
||||
NV_STATUS test_write_ctrl_gpfifo_noop(uvm_va_space_t *va_space)
|
||||
{
|
||||
uvm_gpu_t *gpu;
|
||||
@@ -1372,9 +845,11 @@ NV_STATUS test_write_ctrl_gpfifo_tight(uvm_va_space_t *va_space)
|
||||
NvU64 entry;
|
||||
uvm_push_t push;
|
||||
|
||||
gpu = uvm_va_space_find_first_gpu(va_space);
|
||||
|
||||
// TODO: Bug 3839176: the test is waived on Confidential Computing because
|
||||
// it assumes that GPU can access system memory without using encryption.
|
||||
if (g_uvm_global.conf_computing_enabled)
|
||||
if (uvm_conf_computing_mode_enabled(gpu))
|
||||
return NV_OK;
|
||||
|
||||
for_each_va_space_gpu(gpu, va_space) {
|
||||
@@ -1449,7 +924,7 @@ static NV_STATUS test_channel_pushbuffer_extension_base(uvm_va_space_t *va_space
|
||||
uvm_channel_manager_t *manager;
|
||||
uvm_channel_pool_t *pool;
|
||||
|
||||
if (!uvm_parent_gpu_needs_pushbuffer_segments(gpu->parent))
|
||||
if (!uvm_gpu_has_pushbuffer_segments(gpu))
|
||||
continue;
|
||||
|
||||
// The GPU channel manager pushbuffer is destroyed and then re-created
|
||||
@@ -1524,14 +999,6 @@ NV_STATUS uvm_test_channel_sanity(UVM_TEST_CHANNEL_SANITY_PARAMS *params, struct
|
||||
if (status != NV_OK)
|
||||
goto done;
|
||||
|
||||
status = test_channel_iv_rotation(va_space);
|
||||
if (status != NV_OK)
|
||||
goto done;
|
||||
|
||||
status = test_channel_key_rotation(va_space);
|
||||
if (status != NV_OK)
|
||||
goto done;
|
||||
|
||||
// The following tests have side effects: they reset the GPU's
// channel_manager.
|
||||
status = test_channel_pushbuffer_extension_base(va_space);
|
||||
@@ -1552,10 +1019,6 @@ NV_STATUS uvm_test_channel_sanity(UVM_TEST_CHANNEL_SANITY_PARAMS *params, struct
|
||||
goto done;
|
||||
}
|
||||
|
||||
status = test_iommu(va_space);
|
||||
if (status != NV_OK)
|
||||
goto done;
|
||||
|
||||
done:
|
||||
uvm_va_space_up_read_rm(va_space);
|
||||
uvm_mutex_unlock(&g_uvm_global.global_lock);
|
||||
@@ -1571,22 +1034,23 @@ static NV_STATUS uvm_test_channel_stress_stream(uvm_va_space_t *va_space,
|
||||
if (params->iterations == 0 || params->num_streams == 0)
|
||||
return NV_ERR_INVALID_PARAMETER;
|
||||
|
||||
// TODO: Bug 3839176: the test is waived on Confidential Computing because
|
||||
// it assumes that GPU can access system memory without using encryption.
|
||||
if (g_uvm_global.conf_computing_enabled)
|
||||
return NV_OK;
|
||||
|
||||
// TODO: Bug 1764963: Rework the test to not rely on the global lock as that
|
||||
// serializes all the threads calling this at the same time.
|
||||
uvm_mutex_lock(&g_uvm_global.global_lock);
|
||||
uvm_va_space_down_read_rm(va_space);
|
||||
|
||||
// TODO: Bug 3839176: the test is waived on Confidential Computing because
|
||||
// it assumes that GPU can access system memory without using encryption.
|
||||
if (uvm_conf_computing_mode_enabled(uvm_va_space_find_first_gpu(va_space)))
|
||||
goto done;
|
||||
|
||||
status = stress_test_all_gpus_in_va(va_space,
|
||||
params->num_streams,
|
||||
params->iterations,
|
||||
params->seed,
|
||||
params->verbose);
|
||||
|
||||
done:
|
||||
uvm_va_space_up_read_rm(va_space);
|
||||
uvm_mutex_unlock(&g_uvm_global.global_lock);
|
||||
|
||||
@@ -1667,126 +1131,6 @@ done:
|
||||
return status;
|
||||
}
|
||||
|
||||
static NV_STATUS channel_stress_key_rotation_cpu_encryption(uvm_gpu_t *gpu, UVM_TEST_CHANNEL_STRESS_PARAMS *params)
|
||||
{
|
||||
int i;
|
||||
uvm_channel_pool_t *cpu_to_gpu_pool;
|
||||
NV_STATUS status = NV_OK;
|
||||
size_t size = UVM_CONF_COMPUTING_DMA_BUFFER_SIZE;
|
||||
void *initial_plain_cpu = NULL;
|
||||
uvm_mem_t *plain_gpu = NULL;
|
||||
uvm_gpu_address_t plain_gpu_address;
|
||||
|
||||
UVM_ASSERT(params->key_rotation_operation == UVM_TEST_CHANNEL_STRESS_KEY_ROTATION_OPERATION_CPU_TO_GPU);
|
||||
|
||||
cpu_to_gpu_pool = gpu->channel_manager->pool_to_use.default_for_type[UVM_CHANNEL_TYPE_CPU_TO_GPU];
|
||||
TEST_CHECK_RET(uvm_conf_computing_is_key_rotation_enabled_in_pool(cpu_to_gpu_pool));
|
||||
|
||||
initial_plain_cpu = uvm_kvmalloc_zero(size);
|
||||
if (initial_plain_cpu == NULL) {
|
||||
status = NV_ERR_NO_MEMORY;
|
||||
goto out;
|
||||
}
|
||||
|
||||
TEST_NV_CHECK_GOTO(uvm_mem_alloc_vidmem(size, gpu, &plain_gpu), out);
|
||||
TEST_NV_CHECK_GOTO(uvm_mem_map_gpu_kernel(plain_gpu, gpu), out);
|
||||
plain_gpu_address = uvm_mem_gpu_address_virtual_kernel(plain_gpu, gpu);
|
||||
|
||||
memset(initial_plain_cpu, 1, size);
|
||||
|
||||
for (i = 0; i < params->iterations; i++) {
|
||||
TEST_NV_CHECK_GOTO(uvm_conf_computing_util_memcopy_cpu_to_gpu(gpu,
|
||||
plain_gpu_address,
|
||||
initial_plain_cpu,
|
||||
size,
|
||||
NULL,
|
||||
"CPU > GPU"),
|
||||
out);
|
||||
}
|
||||
|
||||
out:
|
||||
uvm_mem_free(plain_gpu);
|
||||
uvm_kvfree(initial_plain_cpu);
|
||||
|
||||
return status;
|
||||
}
|
||||
|
||||
static NV_STATUS channel_stress_key_rotation_cpu_decryption(uvm_gpu_t *gpu, UVM_TEST_CHANNEL_STRESS_PARAMS *params)
|
||||
{
|
||||
unsigned num_rotations_to_insert = 0;
|
||||
|
||||
UVM_ASSERT(params->key_rotation_operation == UVM_TEST_CHANNEL_STRESS_KEY_ROTATION_OPERATION_GPU_TO_CPU);
|
||||
|
||||
return test_channel_key_rotation_cpu_decryption(gpu, params->iterations, num_rotations_to_insert);
|
||||
}
|
||||
|
||||
static NV_STATUS channel_stress_key_rotation_rotate(uvm_gpu_t *gpu, UVM_TEST_CHANNEL_STRESS_PARAMS *params)
|
||||
{
|
||||
NvU32 i;
|
||||
|
||||
UVM_ASSERT(params->key_rotation_operation == UVM_TEST_CHANNEL_STRESS_KEY_ROTATION_OPERATION_ROTATE);
|
||||
|
||||
for (i = 0; i < params->iterations; ++i) {
|
||||
NV_STATUS status;
|
||||
uvm_channel_pool_t *pool;
|
||||
uvm_channel_type_t type;
|
||||
|
||||
if ((i % 3) == 0)
|
||||
type = UVM_CHANNEL_TYPE_CPU_TO_GPU;
|
||||
else if ((i % 3) == 1)
|
||||
type = UVM_CHANNEL_TYPE_GPU_TO_CPU;
|
||||
else
|
||||
type = UVM_CHANNEL_TYPE_WLC;
|
||||
|
||||
pool = gpu->channel_manager->pool_to_use.default_for_type[type];
|
||||
|
||||
if (!uvm_conf_computing_is_key_rotation_enabled_in_pool(pool))
|
||||
return NV_ERR_INVALID_STATE;
|
||||
|
||||
status = force_key_rotation(pool);
|
||||
if (status != NV_OK)
|
||||
return status;
|
||||
}
|
||||
|
||||
return NV_OK;
|
||||
}
|
||||
|
||||
// The objective of this test is documented in the user-level function
|
||||
static NV_STATUS uvm_test_channel_stress_key_rotation(uvm_va_space_t *va_space, UVM_TEST_CHANNEL_STRESS_PARAMS *params)
|
||||
{
|
||||
uvm_test_rng_t rng;
|
||||
uvm_gpu_t *gpu;
|
||||
NV_STATUS status = NV_OK;
|
||||
|
||||
if (!g_uvm_global.conf_computing_enabled)
|
||||
return NV_OK;
|
||||
|
||||
uvm_test_rng_init(&rng, params->seed);
|
||||
|
||||
uvm_va_space_down_read(va_space);
|
||||
|
||||
// Key rotation should be enabled, or disabled, in all GPUs. Pick a random
|
||||
// one.
|
||||
gpu = random_va_space_gpu(&rng, va_space);
|
||||
|
||||
if (!uvm_conf_computing_is_key_rotation_enabled(gpu))
|
||||
goto out;
|
||||
|
||||
if (params->key_rotation_operation == UVM_TEST_CHANNEL_STRESS_KEY_ROTATION_OPERATION_CPU_TO_GPU)
|
||||
status = channel_stress_key_rotation_cpu_encryption(gpu, params);
|
||||
else if (params->key_rotation_operation == UVM_TEST_CHANNEL_STRESS_KEY_ROTATION_OPERATION_GPU_TO_CPU)
|
||||
status = channel_stress_key_rotation_cpu_decryption(gpu, params);
|
||||
else if (params->key_rotation_operation == UVM_TEST_CHANNEL_STRESS_KEY_ROTATION_OPERATION_ROTATE)
|
||||
status = channel_stress_key_rotation_rotate(gpu, params);
|
||||
else
|
||||
status = NV_ERR_INVALID_PARAMETER;
|
||||
|
||||
out:
|
||||
uvm_va_space_up_read(va_space);
|
||||
|
||||
return status;
|
||||
}
|
||||
|
||||
NV_STATUS uvm_test_channel_stress(UVM_TEST_CHANNEL_STRESS_PARAMS *params, struct file *filp)
|
||||
{
|
||||
uvm_va_space_t *va_space = uvm_va_space_get(filp);
|
||||
@@ -1798,8 +1142,6 @@ NV_STATUS uvm_test_channel_stress(UVM_TEST_CHANNEL_STRESS_PARAMS *params, struct
|
||||
return uvm_test_channel_stress_update_channels(va_space, params);
|
||||
case UVM_TEST_CHANNEL_STRESS_MODE_NOOP_PUSH:
|
||||
return uvm_test_channel_noop_push(va_space, params);
|
||||
case UVM_TEST_CHANNEL_STRESS_MODE_KEY_ROTATION:
|
||||
return uvm_test_channel_stress_key_rotation(va_space, params);
|
||||
default:
|
||||
return NV_ERR_INVALID_PARAMETER;
|
||||
}
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*******************************************************************************
|
||||
Copyright (c) 2013-2023 NVIDIA Corporation
|
||||
Copyright (c) 2013-2021 NVIDIA Corporation
|
||||
|
||||
This program is free software; you can redistribute it and/or
|
||||
modify it under the terms of the GNU General Public License
|
||||
@@ -233,6 +233,18 @@ unsigned uvm_get_stale_thread_id(void)
|
||||
return (unsigned)task_pid_vnr(current);
|
||||
}
|
||||
|
||||
//
|
||||
// A simple security rule for allowing access to UVM user space memory: if you
|
||||
// are the same user as the owner of the memory, or if you are root, then you
|
||||
// are granted access. The idea is to allow debuggers and profilers to work, but
|
||||
// without opening up any security holes.
|
||||
//
|
||||
NvBool uvm_user_id_security_check(uid_t euidTarget)
{
    return (NV_CURRENT_EUID() == euidTarget) ||
           (UVM_ROOT_UID == euidTarget);
}
|
||||
|
||||
void on_uvm_test_fail(void)
|
||||
{
|
||||
(void)NULL;
|
||||
@@ -318,11 +330,10 @@ int format_uuid_to_buffer(char *buffer, unsigned bufferLength, const NvProcessor
|
||||
unsigned i;
|
||||
unsigned dashMask = 1 << 4 | 1 << 6 | 1 << 8 | 1 << 10;
|
||||
|
||||
memcpy(buffer, "UVM-GPU-", 8);
|
||||
if (bufferLength < (8 /*prefix*/+ 16 * 2 /*digits*/ + 4 * 1 /*dashes*/ + 1 /*null*/))
|
||||
return *buffer = 0;
|
||||
|
||||
memcpy(buffer, "UVM-GPU-", 8);
|
||||
|
||||
for (i = 0; i < 16; i++) {
|
||||
*str++ = uvm_digit_to_hex(pUuidStruct->uuid[i] >> 4);
|
||||
*str++ = uvm_digit_to_hex(pUuidStruct->uuid[i] & 0xF);
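The hunk above is cut off before the dash handling, so the following is only a sketch of how a dash mask of this shape can produce an 8-4-4-4-12 layout. It assumes that bit `i + 1` of the mask means "emit a dash after byte `i`" and that hex digits are lowercase; neither is confirmed by the listing. `format_uuid_sketch` and `digit_to_hex` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper: maps 0..15 to a lowercase hex digit.
static char digit_to_hex(unsigned value)
{
    return (char)(value < 10 ? '0' + value : 'a' + (value - 10));
}

// Formats a 16-byte UUID as "UVM-GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx".
// Returns the number of characters written, excluding the terminating null,
// or 0 if the buffer cannot hold prefix + digits + dashes + null.
static int format_uuid_sketch(char *buffer, size_t buffer_length, const uint8_t uuid[16])
{
    // Assumption: bit (i + 1) set means a dash follows byte i.
    const unsigned dash_mask = 1u << 4 | 1u << 6 | 1u << 8 | 1u << 10;
    char *str = buffer;
    unsigned i;

    // Check the size before writing anything into the buffer.
    if (buffer_length < 8 /*prefix*/ + 16 * 2 /*digits*/ + 4 /*dashes*/ + 1 /*null*/) {
        if (buffer_length > 0)
            buffer[0] = '\0';
        return 0;
    }

    memcpy(str, "UVM-GPU-", 8);
    str += 8;

    for (i = 0; i < 16; i++) {
        *str++ = digit_to_hex(uuid[i] >> 4);
        *str++ = digit_to_hex(uuid[i] & 0xF);

        if (dash_mask & (1u << (i + 1)))
            *str++ = '-';
    }

    *str = '\0';
    return (int)(str - buffer);
}
```

Doing the length check before the `memcpy` of the prefix matters: with the opposite order, a buffer shorter than eight bytes would be overrun before the check runs.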
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*******************************************************************************
|
||||
Copyright (c) 2013-2023 NVIDIA Corporation
|
||||
Copyright (c) 2013-2021 NVIDIA Corporation
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to
|
||||
@@ -204,6 +204,13 @@ extern bool uvm_release_asserts_set_global_error_for_tests;
|
||||
#define UVM_ASSERT_MSG_RELEASE(expr, fmt, ...) _UVM_ASSERT_MSG_RELEASE(expr, #expr, ": " fmt, ##__VA_ARGS__)
|
||||
#define UVM_ASSERT_RELEASE(expr) _UVM_ASSERT_MSG_RELEASE(expr, #expr, "\n")
|
||||
|
||||
// Provide a short form of UUIDs, typically for use in debug printing:
|
||||
#define ABBREV_UUID(uuid) (unsigned)(uuid)
|
||||
|
||||
static inline NvBool uvm_uuid_is_cpu(const NvProcessorUuid *uuid)
{
    return memcmp(uuid, &NV_PROCESSOR_UUID_CPU_DEFAULT, sizeof(*uuid)) == 0;
}
|
||||
#define UVM_SIZE_1KB (1024ULL)
|
||||
#define UVM_SIZE_1MB (1024 * UVM_SIZE_1KB)
|
||||
#define UVM_SIZE_1GB (1024 * UVM_SIZE_1MB)
|
||||
@@ -275,6 +282,9 @@ static inline void kmem_cache_destroy_safe(struct kmem_cache **ppCache)
|
||||
}
|
||||
}
|
||||
|
||||
static const uid_t UVM_ROOT_UID = 0;
|
||||
|
||||
|
||||
typedef struct
|
||||
{
|
||||
NvU64 start_time_ns;
|
||||
@@ -325,6 +335,7 @@ NV_STATUS errno_to_nv_status(int errnoCode);
|
||||
int nv_status_to_errno(NV_STATUS status);
|
||||
unsigned uvm_get_stale_process_id(void);
|
||||
unsigned uvm_get_stale_thread_id(void);
|
||||
NvBool uvm_user_id_security_check(uid_t euidTarget);
|
||||
|
||||
extern int uvm_enable_builtin_tests;
|
||||
|
||||
|
||||
@@ -33,43 +33,22 @@
|
||||
#include "nv_uvm_interface.h"
|
||||
#include "uvm_va_block.h"
|
||||
|
||||
// Amount of encrypted data on a given engine that triggers key rotation. This
// is a UVM internal threshold, different from that of RM, and used only during
// testing.
//
// Key rotation is triggered when the total encryption size, or the total
// decryption size (whichever comes first), reaches this lower threshold on the
// engine.
|
||||
#define UVM_CONF_COMPUTING_KEY_ROTATION_LOWER_THRESHOLD (UVM_SIZE_1MB * 8)
|
||||
|
||||
// The maximum number of secure operations per push is:
|
||||
// UVM_MAX_PUSH_SIZE / min(CE encryption size, CE decryption size)
|
||||
// + 1 (tracking semaphore) = 128 * 1024 / 56 + 1 = 2342
|
||||
#define UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MIN 2342lu
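The figure 2342 in the comment above matches the formula only if the quotient is rounded up; plain C integer division of 128 * 1024 by 56 truncates to 2340. A small sketch of the arithmetic follows, assuming that rounding up is what the comment intends. `div_round_up` and `max_secure_ops_per_push` are hypothetical helper names, and the 56-byte operation size is taken from the comment itself.

```c
#include <assert.h>
#include <stdint.h>

// Integer division rounded up. The divisor must be non-zero.
static uint64_t div_round_up(uint64_t numerator, uint64_t divisor)
{
    return (numerator + divisor - 1) / divisor;
}

// Upper bound on secure operations in one push: as many minimum-sized
// operations as fit in the push, rounded up, plus one tracking semaphore.
static uint64_t max_secure_ops_per_push(uint64_t max_push_size, uint64_t min_op_size)
{
    return div_round_up(max_push_size, min_op_size) + 1;
}
```

With a 128 KiB push and 56-byte operations this gives 2341 + 1 = 2342, whereas truncating division would give 2340 + 1 = 2341.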
|
||||
|
||||
// Channels use 32-bit counters, so the value after rotation is 0xffffffff.
// Setting the limit to this value (or higher) will result in rotation
// on every check. However, pre-emptive rotation when submitting control
// GPFIFO entries relies on the fact that multiple successive checks after
// rotation do not trigger more rotations if no IV was used in between.
|
||||
#define UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MAX 0xfffffffelu
|
||||
|
||||
// Attempt rotation when two billion IVs are left. An IV rotation call can fail
// if the necessary locks are not available, so multiple attempts may be needed
// for IV rotation to succeed.
|
||||
#define UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_DEFAULT (1lu << 31)
|
||||
|
||||
// Start rotating after 500 encryptions/decryptions when running tests.
|
||||
#define UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_TESTS ((1lu << 32) - 500lu)
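How the limits above relate to the remaining-IV counter can be sketched as follows. The comparison is an assumption: the listing does not show it, but the comment about the maximum limit (a limit of 0xffffffff would rotate on every check) suggests that rotation is requested when the remaining count is at or below the limit. All names here are hypothetical, and `1ull` is used so the sketch does not depend on `unsigned long` being 64 bits wide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Remaining-IV count right after a rotation, per the 32-bit counter comment.
#define IV_REMAINING_AFTER_ROTATION 0xffffffffull

#define IV_LIMIT_MAX     0xfffffffeull
#define IV_LIMIT_DEFAULT (1ull << 31)
#define IV_LIMIT_TESTS   ((1ull << 32) - 500ull)

// Assumed check: rotate once the remaining IVs are at or below the limit.
static bool needs_rotation(uint64_t remaining, uint64_t limit)
{
    return remaining <= limit;
}

// Number of IVs that can be consumed after a rotation before the assumed
// check asks for the next one.
static uint64_t ivs_until_rotation(uint64_t limit)
{
    if (needs_rotation(IV_REMAINING_AFTER_ROTATION, limit))
        return 0;

    return IV_REMAINING_AFTER_ROTATION - limit;
}
```

Under this assumed comparison, the maximum limit triggers a new rotation after a single IV is used, and the test limit triggers after 499 IVs, which is in line with the "about 500" described in the comment. A strict less-than comparison would shift both figures by one.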
|
||||
static ulong uvm_conf_computing_channel_iv_rotation_limit = UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_DEFAULT;
|
||||
|
||||
module_param(uvm_conf_computing_channel_iv_rotation_limit, ulong, S_IRUGO);
|
||||
|
||||
static UvmGpuConfComputeMode uvm_conf_computing_get_mode(const uvm_parent_gpu_t *parent)
{
    return parent->rm_info.gpuConfComputeCaps.mode;
}

bool uvm_conf_computing_mode_enabled_parent(const uvm_parent_gpu_t *parent)
{
    return uvm_conf_computing_get_mode(parent) != UVM_GPU_CONF_COMPUTE_MODE_NONE;
}

bool uvm_conf_computing_mode_enabled(const uvm_gpu_t *gpu)
{
    return uvm_conf_computing_mode_enabled_parent(gpu->parent);
}
|
||||
|
||||
bool uvm_conf_computing_mode_is_hcc(const uvm_gpu_t *gpu)
|
||||
{
|
||||
return uvm_conf_computing_get_mode(gpu->parent) == UVM_GPU_CONF_COMPUTE_MODE_HCC;
|
||||
@@ -77,20 +56,24 @@ bool uvm_conf_computing_mode_is_hcc(const uvm_gpu_t *gpu)
|
||||
|
||||
void uvm_conf_computing_check_parent_gpu(const uvm_parent_gpu_t *parent)
|
||||
{
|
||||
uvm_parent_gpu_t *other_parent;
|
||||
UvmGpuConfComputeMode parent_mode = uvm_conf_computing_get_mode(parent);
|
||||
uvm_gpu_t *first_gpu;
|
||||
|
||||
uvm_assert_mutex_locked(&g_uvm_global.global_lock);
|
||||
|
||||
// The Confidential Computing state of the GPU should match that of the
|
||||
// system.
|
||||
UVM_ASSERT((parent_mode != UVM_GPU_CONF_COMPUTE_MODE_NONE) == g_uvm_global.conf_computing_enabled);
|
||||
UVM_ASSERT(uvm_conf_computing_mode_enabled_parent(parent) == g_uvm_global.conf_computing_enabled);
|
||||
|
||||
// TODO: Bug 2844714: since we have no routine to traverse parent GPUs,
|
||||
// find first child GPU and get its parent.
|
||||
first_gpu = uvm_global_processor_mask_find_first_gpu(&g_uvm_global.retained_gpus);
|
||||
if (first_gpu == NULL)
|
||||
return;
|
||||
|
||||
// All GPUs derive Confidential Computing status from their parent. By
|
||||
// current policy all parent GPUs have identical Confidential Computing
|
||||
// status.
|
||||
for_each_parent_gpu(other_parent)
|
||||
UVM_ASSERT(parent_mode == uvm_conf_computing_get_mode(other_parent));
|
||||
UVM_ASSERT(uvm_conf_computing_get_mode(parent) == uvm_conf_computing_get_mode(first_gpu->parent));
|
||||
}
|
||||
|
||||
static void dma_buffer_destroy_locked(uvm_conf_computing_dma_buffer_pool_t *dma_buffer_pool,
|
||||
@@ -204,11 +187,15 @@ static void dma_buffer_pool_add(uvm_conf_computing_dma_buffer_pool_t *dma_buffer
|
||||
static NV_STATUS conf_computing_dma_buffer_pool_init(uvm_conf_computing_dma_buffer_pool_t *dma_buffer_pool)
|
||||
{
|
||||
size_t i;
|
||||
uvm_gpu_t *gpu;
|
||||
size_t num_dma_buffers = 32;
|
||||
NV_STATUS status = NV_OK;
|
||||
|
||||
UVM_ASSERT(dma_buffer_pool->num_dma_buffers == 0);
|
||||
UVM_ASSERT(g_uvm_global.conf_computing_enabled);
|
||||
|
||||
gpu = dma_buffer_pool_to_gpu(dma_buffer_pool);
|
||||
|
||||
UVM_ASSERT(uvm_conf_computing_mode_enabled(gpu));
|
||||
|
||||
INIT_LIST_HEAD(&dma_buffer_pool->free_dma_buffers);
|
||||
uvm_mutex_init(&dma_buffer_pool->lock, UVM_LOCK_ORDER_CONF_COMPUTING_DMA_BUFFER_POOL);
|
||||
@@ -361,24 +348,11 @@ error:
|
||||
return status;
|
||||
}
|
||||
|
||||
// The production key rotation defaults are such that key rotations rarely
|
||||
// happen. During UVM testing more frequent rotations are triggered by relying
|
||||
// on internal encryption usage accounting. When key rotations are triggered by
|
||||
// UVM, the driver does not rely on channel key rotation notifiers.
|
||||
//
|
||||
// TODO: Bug 4612912: UVM should be able to programmatically set the rotation
|
||||
// lower threshold. This function, and all the metadata associated with it
|
||||
// (per-pool encryption accounting, for example) can be removed at that point.
|
||||
static bool key_rotation_is_notifier_driven(void)
|
||||
{
|
||||
return !uvm_enable_builtin_tests;
|
||||
}
|
||||
|
||||
NV_STATUS uvm_conf_computing_gpu_init(uvm_gpu_t *gpu)
|
||||
{
|
||||
NV_STATUS status;
|
||||
|
||||
if (!g_uvm_global.conf_computing_enabled)
|
||||
if (!uvm_conf_computing_mode_enabled(gpu))
|
||||
return NV_OK;
|
||||
|
||||
status = conf_computing_dma_buffer_pool_init(&gpu->conf_computing.dma_buffer_pool);
|
||||
@@ -389,20 +363,6 @@ NV_STATUS uvm_conf_computing_gpu_init(uvm_gpu_t *gpu)
|
||||
if (status != NV_OK)
|
||||
goto error;
|
||||
|
||||
if (uvm_enable_builtin_tests && uvm_conf_computing_channel_iv_rotation_limit == UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_DEFAULT)
|
||||
uvm_conf_computing_channel_iv_rotation_limit = UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_TESTS;
|
||||
|
||||
if (uvm_conf_computing_channel_iv_rotation_limit < UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MIN ||
|
||||
uvm_conf_computing_channel_iv_rotation_limit > UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MAX) {
|
||||
UVM_ERR_PRINT("Value of uvm_conf_computing_channel_iv_rotation_limit: %lu is outside of the safe "
|
||||
"range: <%lu, %lu>. Using the default value instead (%lu)\n",
|
||||
uvm_conf_computing_channel_iv_rotation_limit,
|
||||
UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MIN,
|
||||
UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MAX,
|
||||
UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_DEFAULT);
|
||||
uvm_conf_computing_channel_iv_rotation_limit = UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_DEFAULT;
|
||||
}
|
||||
|
||||
return NV_OK;
|
||||
|
||||
error:
|
||||
@@ -416,35 +376,18 @@ void uvm_conf_computing_gpu_deinit(uvm_gpu_t *gpu)
|
||||
conf_computing_dma_buffer_pool_deinit(&gpu->conf_computing.dma_buffer_pool);
|
||||
}
|
||||
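The range check on uvm_conf_computing_channel_iv_rotation_limit in uvm_conf_computing_gpu_init follows a common pattern: accept a tunable only inside a safe interval, otherwise fall back to the default. A minimal user-space sketch of that pattern follows. LIMIT_MIN is a placeholder value, since the real UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MIN is defined outside this hunk, and sanitize_rotation_limit is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

#define LIMIT_MIN     UINT64_C(1000)        /* hypothetical; real MIN not shown here */
#define LIMIT_MAX     UINT64_C(0xfffffffe)  /* same value as the MAX limit above */
#define LIMIT_DEFAULT (UINT64_C(1) << 31)   /* same value as the DEFAULT limit above */

/* Return the requested limit if it lies in [LIMIT_MIN, LIMIT_MAX], else the default. */
static uint64_t sanitize_rotation_limit(uint64_t requested)
{
    /* The fallback itself must be inside the safe range. */
    assert(LIMIT_MIN <= LIMIT_DEFAULT && LIMIT_DEFAULT <= LIMIT_MAX);

    if (requested < LIMIT_MIN || requested > LIMIT_MAX)
        return LIMIT_DEFAULT;

    return requested;
}
```

Falling back to the default rather than clamping to the nearest bound mirrors what the driver code in this diff does when the value is out of range.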
|
||||
void uvm_conf_computing_log_gpu_encryption(uvm_channel_t *channel, size_t size, UvmCslIv *iv)
|
||||
void uvm_conf_computing_log_gpu_encryption(uvm_channel_t *channel, UvmCslIv *iv)
|
||||
{
|
||||
NV_STATUS status;
|
||||
uvm_channel_pool_t *pool;
|
||||
|
||||
if (uvm_channel_is_lcic(channel))
|
||||
pool = uvm_channel_lcic_get_paired_wlc(channel)->pool;
|
||||
else
|
||||
pool = channel->pool;
|
||||
|
||||
uvm_mutex_lock(&channel->csl.ctx_lock);
|
||||
|
||||
if (uvm_conf_computing_is_key_rotation_enabled_in_pool(pool)) {
|
||||
status = nvUvmInterfaceCslLogEncryption(&channel->csl.ctx, UVM_CSL_OPERATION_DECRYPT, size);
|
||||
|
||||
// Informing RM of an encryption/decryption should not fail
|
||||
UVM_ASSERT(status == NV_OK);
|
||||
|
||||
if (!key_rotation_is_notifier_driven())
|
||||
atomic64_add(size, &pool->conf_computing.key_rotation.decrypted);
|
||||
}
|
||||
|
||||
status = nvUvmInterfaceCslIncrementIv(&channel->csl.ctx, UVM_CSL_OPERATION_DECRYPT, 1, iv);
|
||||
|
||||
// IV rotation is done preemptively as needed, so the above
|
||||
// call cannot return failure.
|
||||
UVM_ASSERT(status == NV_OK);
|
||||
|
||||
uvm_mutex_unlock(&channel->csl.ctx_lock);
|
||||
|
||||
// TODO: Bug 4014720: If nvUvmInterfaceCslIncrementIv returns with
|
||||
// NV_ERR_INSUFFICIENT_RESOURCES then the IV needs to be rotated via
|
||||
// nvUvmInterfaceCslRotateIv.
|
||||
UVM_ASSERT(status == NV_OK);
|
||||
}
|
||||
|
||||
void uvm_conf_computing_acquire_encryption_iv(uvm_channel_t *channel, UvmCslIv *iv)
|
||||
@@ -455,8 +398,9 @@ void uvm_conf_computing_acquire_encryption_iv(uvm_channel_t *channel, UvmCslIv *
|
||||
status = nvUvmInterfaceCslIncrementIv(&channel->csl.ctx, UVM_CSL_OPERATION_ENCRYPT, 1, iv);
|
||||
uvm_mutex_unlock(&channel->csl.ctx_lock);
|
||||
|
||||
// IV rotation is done preemptively as needed, so the above
|
||||
// call cannot return failure.
|
||||
// TODO: Bug 4014720: If nvUvmInterfaceCslIncrementIv returns with
|
||||
// NV_ERR_INSUFFICIENT_RESOURCES then the IV needs to be rotated via
|
||||
// nvUvmInterfaceCslRotateIv.
|
||||
UVM_ASSERT(status == NV_OK);
|
||||
}
|
||||
|
||||
@@ -468,79 +412,41 @@ void uvm_conf_computing_cpu_encrypt(uvm_channel_t *channel,
|
||||
void *auth_tag_buffer)
|
||||
{
|
||||
NV_STATUS status;
|
||||
uvm_channel_pool_t *pool;
|
||||
|
||||
UVM_ASSERT(size);
|
||||
|
||||
if (uvm_channel_is_lcic(channel))
|
||||
pool = uvm_channel_lcic_get_paired_wlc(channel)->pool;
|
||||
else
|
||||
pool = channel->pool;
|
||||
|
||||
uvm_mutex_lock(&channel->csl.ctx_lock);
|
||||
|
||||
status = nvUvmInterfaceCslEncrypt(&channel->csl.ctx,
|
||||
size,
|
||||
(NvU8 const *) src_plain,
|
||||
encrypt_iv,
|
||||
(NvU8 *) dst_cipher,
|
||||
(NvU8 *) auth_tag_buffer);
|
||||
|
||||
// IV rotation is done preemptively as needed, so the above
|
||||
// call cannot return failure.
|
||||
UVM_ASSERT(status == NV_OK);
|
||||
|
||||
if (uvm_conf_computing_is_key_rotation_enabled_in_pool(pool)) {
|
||||
status = nvUvmInterfaceCslLogEncryption(&channel->csl.ctx, UVM_CSL_OPERATION_ENCRYPT, size);
|
||||
|
||||
// Informing RM of an encryption/decryption should not fail
|
||||
UVM_ASSERT(status == NV_OK);
|
||||
|
||||
if (!key_rotation_is_notifier_driven())
|
||||
atomic64_add(size, &pool->conf_computing.key_rotation.encrypted);
|
||||
}
|
||||
|
||||
uvm_mutex_unlock(&channel->csl.ctx_lock);
|
||||
|
||||
// nvUvmInterfaceCslEncrypt fails when a 64-bit encryption counter
|
||||
// overflows. This is not supposed to happen on CC.
|
||||
UVM_ASSERT(status == NV_OK);
|
||||
}
|
||||
|
||||
NV_STATUS uvm_conf_computing_cpu_decrypt(uvm_channel_t *channel,
|
||||
void *dst_plain,
|
||||
const void *src_cipher,
|
||||
const UvmCslIv *src_iv,
|
||||
NvU32 key_version,
|
||||
size_t size,
|
||||
const void *auth_tag_buffer)
|
||||
{
|
||||
NV_STATUS status;
|
||||
|
||||
// The CSL context associated with a channel can be used by multiple
|
||||
// threads. The IV sequence is thus guaranteed only while the channel is
|
||||
// "locked for push". The channel/push lock is released in
|
||||
// "uvm_channel_end_push", and at that time the GPU encryption operations
|
||||
// have not executed yet. Therefore the caller has to use
|
||||
// "uvm_conf_computing_log_gpu_encryption" to explicitly store IVs needed
|
||||
// to perform CPU decryption and pass those IVs to this function after the
|
||||
// push that did the encryption completes.
|
||||
UVM_ASSERT(src_iv);
|
||||
|
||||
uvm_mutex_lock(&channel->csl.ctx_lock);
|
||||
status = nvUvmInterfaceCslDecrypt(&channel->csl.ctx,
|
||||
size,
|
||||
(const NvU8 *) src_cipher,
|
||||
src_iv,
|
||||
key_version,
|
||||
(NvU8 *) dst_plain,
|
||||
NULL,
|
||||
0,
|
||||
(const NvU8 *) auth_tag_buffer);
|
||||
|
||||
if (status != NV_OK) {
|
||||
UVM_ERR_PRINT("nvUvmInterfaceCslDecrypt() failed: %s, channel %s, GPU %s\n",
|
||||
nvstatusToString(status),
|
||||
channel->name,
|
||||
uvm_gpu_name(uvm_channel_get_gpu(channel)));
|
||||
}
|
||||
|
||||
uvm_mutex_unlock(&channel->csl.ctx_lock);
|
||||
|
||||
return status;
|
||||
@@ -553,8 +459,6 @@ NV_STATUS uvm_conf_computing_fault_decrypt(uvm_parent_gpu_t *parent_gpu,
|
||||
NvU8 valid)
|
||||
{
|
||||
NV_STATUS status;
|
||||
NvU32 fault_entry_size = parent_gpu->fault_buffer_hal->entry_size(parent_gpu);
|
||||
UvmCslContext *csl_context = &parent_gpu->fault_buffer_info.rm_info.replayable.cslCtx;
|
||||
|
||||
// There is no dedicated lock for the CSL context associated with replayable
|
||||
// faults. The mutual exclusion required by the RM CSL API is enforced by
|
||||
@@ -562,376 +466,36 @@ NV_STATUS uvm_conf_computing_fault_decrypt(uvm_parent_gpu_t *parent_gpu,
|
||||
// decryption is invoked as part of fault servicing.
|
||||
UVM_ASSERT(uvm_sem_is_locked(&parent_gpu->isr.replayable_faults.service_lock));
|
||||
|
||||
UVM_ASSERT(g_uvm_global.conf_computing_enabled);
|
||||
UVM_ASSERT(!uvm_parent_gpu_replayable_fault_buffer_is_uvm_owned(parent_gpu));
|
||||
|
||||
status = nvUvmInterfaceCslLogEncryption(csl_context, UVM_CSL_OPERATION_DECRYPT, fault_entry_size);
|
||||
|
||||
// Informing RM of an encryption/decryption should not fail
|
||||
UVM_ASSERT(status == NV_OK);
|
||||
|
||||
status = nvUvmInterfaceCslDecrypt(csl_context,
|
||||
fault_entry_size,
|
||||
status = nvUvmInterfaceCslDecrypt(&parent_gpu->fault_buffer_info.rm_info.replayable.cslCtx,
|
||||
parent_gpu->fault_buffer_hal->entry_size(parent_gpu),
|
||||
(const NvU8 *) src_cipher,
|
||||
NULL,
|
||||
NV_U32_MAX,
|
||||
(NvU8 *) dst_plain,
|
||||
&valid,
|
||||
sizeof(valid),
|
||||
(const NvU8 *) auth_tag_buffer);
|
||||
|
||||
if (status != NV_OK) {
|
||||
UVM_ERR_PRINT("nvUvmInterfaceCslDecrypt() failed: %s, GPU %s\n",
|
||||
nvstatusToString(status),
|
||||
uvm_parent_gpu_name(parent_gpu));
|
||||
|
||||
}
|
||||
if (status != NV_OK)
|
||||
UVM_ERR_PRINT("nvUvmInterfaceCslDecrypt() failed: %s, GPU %s\n", nvstatusToString(status), parent_gpu->name);
|
||||
|
||||
return status;
|
||||
}
|
||||
|
||||
void uvm_conf_computing_fault_increment_decrypt_iv(uvm_parent_gpu_t *parent_gpu)
|
||||
void uvm_conf_computing_fault_increment_decrypt_iv(uvm_parent_gpu_t *parent_gpu, NvU64 increment)
|
||||
{
|
||||
NV_STATUS status;
|
||||
NvU32 fault_entry_size = parent_gpu->fault_buffer_hal->entry_size(parent_gpu);
|
||||
UvmCslContext *csl_context = &parent_gpu->fault_buffer_info.rm_info.replayable.cslCtx;
|
||||
|
||||
// See comment in uvm_conf_computing_fault_decrypt
|
||||
UVM_ASSERT(uvm_sem_is_locked(&parent_gpu->isr.replayable_faults.service_lock));
|
||||
|
||||
UVM_ASSERT(g_uvm_global.conf_computing_enabled);
|
||||
UVM_ASSERT(!uvm_parent_gpu_replayable_fault_buffer_is_uvm_owned(parent_gpu));
|
||||
|
||||
status = nvUvmInterfaceCslLogEncryption(csl_context, UVM_CSL_OPERATION_DECRYPT, fault_entry_size);
|
||||
|
||||
// Informing RM of an encryption/decryption should not fail
|
||||
UVM_ASSERT(status == NV_OK);
|
||||
|
||||
status = nvUvmInterfaceCslIncrementIv(csl_context, UVM_CSL_OPERATION_DECRYPT, 1, NULL);
|
||||
status = nvUvmInterfaceCslIncrementIv(&parent_gpu->fault_buffer_info.rm_info.replayable.cslCtx,
|
||||
UVM_CSL_OPERATION_DECRYPT,
|
||||
increment,
|
||||
NULL);
|
||||
|
||||
UVM_ASSERT(status == NV_OK);
|
||||
}
|
||||
|
||||
void uvm_conf_computing_query_message_pools(uvm_channel_t *channel,
|
||||
NvU64 *remaining_encryptions,
|
||||
NvU64 *remaining_decryptions)
|
||||
{
|
||||
NV_STATUS status;
|
||||
|
||||
UVM_ASSERT(channel);
|
||||
UVM_ASSERT(remaining_encryptions);
|
||||
UVM_ASSERT(remaining_decryptions);
|
||||
|
||||
uvm_mutex_lock(&channel->csl.ctx_lock);
|
||||
status = nvUvmInterfaceCslQueryMessagePool(&channel->csl.ctx, UVM_CSL_OPERATION_ENCRYPT, remaining_encryptions);
|
||||
UVM_ASSERT(status == NV_OK);
|
||||
UVM_ASSERT(*remaining_encryptions <= NV_U32_MAX);
|
||||
|
||||
status = nvUvmInterfaceCslQueryMessagePool(&channel->csl.ctx, UVM_CSL_OPERATION_DECRYPT, remaining_decryptions);
|
||||
UVM_ASSERT(status == NV_OK);
|
||||
UVM_ASSERT(*remaining_decryptions <= NV_U32_MAX);
|
||||
|
||||
// LCIC channels never use CPU encrypt/GPU decrypt
|
||||
if (uvm_channel_is_lcic(channel))
|
||||
UVM_ASSERT(*remaining_encryptions == NV_U32_MAX);
|
||||
|
||||
uvm_mutex_unlock(&channel->csl.ctx_lock);
|
||||
}
|
||||
|
||||
static NV_STATUS uvm_conf_computing_rotate_channel_ivs_below_limit_internal(uvm_channel_t *channel, NvU64 limit)
|
||||
{
|
||||
NV_STATUS status = NV_OK;
|
||||
NvU64 remaining_encryptions, remaining_decryptions;
|
||||
bool rotate_encryption_iv, rotate_decryption_iv;
|
||||
|
||||
UVM_ASSERT(uvm_channel_is_locked_for_push(channel) ||
|
||||
(uvm_channel_is_lcic(channel) && uvm_channel_manager_is_wlc_ready(channel->pool->manager)));
|
||||
|
||||
uvm_conf_computing_query_message_pools(channel, &remaining_encryptions, &remaining_decryptions);
|
||||
|
||||
// Ignore decryption limit for SEC2, only CE channels support
|
||||
// GPU encrypt/CPU decrypt. However, RM reports _some_ decrementing
|
||||
// value for SEC2 decryption counter.
|
||||
rotate_decryption_iv = (remaining_decryptions <= limit) && uvm_channel_is_ce(channel);
|
||||
rotate_encryption_iv = remaining_encryptions <= limit;
|
||||
|
||||
if (!rotate_encryption_iv && !rotate_decryption_iv)
|
||||
return NV_OK;
|
||||
|
||||
// Wait for all in-flight pushes. The caller needs to guarantee that there
|
||||
// are no concurrent pushes created, e.g. by only calling rotate after
|
||||
// a channel is locked_for_push.
|
||||
status = uvm_channel_wait(channel);
|
||||
if (status != NV_OK)
|
||||
return status;
|
||||
|
||||
uvm_mutex_lock(&channel->csl.ctx_lock);
|
||||
|
||||
if (rotate_encryption_iv)
|
||||
status = nvUvmInterfaceCslRotateIv(&channel->csl.ctx, UVM_CSL_OPERATION_ENCRYPT);
|
||||
|
||||
if (status == NV_OK && rotate_decryption_iv)
|
||||
status = nvUvmInterfaceCslRotateIv(&channel->csl.ctx, UVM_CSL_OPERATION_DECRYPT);
|
||||
|
||||
uvm_mutex_unlock(&channel->csl.ctx_lock);
|
||||
|
||||
// Change the error to out of resources if the available IVs are running
|
||||
// too low
|
||||
if (status == NV_ERR_STATE_IN_USE &&
|
||||
(remaining_encryptions < UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MIN ||
|
||||
remaining_decryptions < UVM_CONF_COMPUTING_IV_REMAINING_LIMIT_MIN))
|
||||
return NV_ERR_INSUFFICIENT_RESOURCES;
|
||||
|
||||
return status;
|
||||
}
|
||||
|
||||
NV_STATUS uvm_conf_computing_rotate_channel_ivs_below_limit(uvm_channel_t *channel, NvU64 limit, bool retry_if_busy)
|
||||
{
|
||||
NV_STATUS status;
|
||||
|
||||
do {
|
||||
status = uvm_conf_computing_rotate_channel_ivs_below_limit_internal(channel, limit);
|
||||
} while (retry_if_busy && status == NV_ERR_STATE_IN_USE);
|
||||
|
||||
// Hide "busy" error. The rotation will be retried at the next opportunity.
|
||||
if (!retry_if_busy && status == NV_ERR_STATE_IN_USE)
|
||||
status = NV_OK;
|
||||
|
||||
return status;
|
||||
}
|
||||
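The busy-handling protocol in uvm_conf_computing_rotate_channel_ivs_below_limit can be isolated from the driver. The sketch below is a user-space illustration under stated assumptions: status_t stands in for NV_STATUS (ST_BUSY plays the role of NV_ERR_STATE_IN_USE), and rotate_with_policy and busy_n_times are hypothetical names. With retry enabled the call spins until the operation stops reporting busy. Without retry a busy result is reported as success, so the caller tries again at the next opportunity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for NV_OK, NV_ERR_STATE_IN_USE and any other error. */
typedef enum { ST_OK, ST_BUSY, ST_ERROR } status_t;

typedef status_t (*rotate_fn_t)(void *ctx);

static status_t rotate_with_policy(rotate_fn_t rotate_once, void *ctx, bool retry_if_busy)
{
    status_t status;

    assert(rotate_once != NULL);

    /* Retry only while the operation reports "busy" and retries are allowed. */
    do {
        status = rotate_once(ctx);
    } while (retry_if_busy && status == ST_BUSY);

    /* Hide "busy" from callers that did not ask for retries. */
    if (!retry_if_busy && status == ST_BUSY)
        status = ST_OK;

    return status;
}

/* Test double: reports busy until the counter pointed to by ctx reaches zero. */
static status_t busy_n_times(void *ctx)
{
    int *remaining_busy = ctx;

    if (*remaining_busy > 0) {
        (*remaining_busy)--;
        return ST_BUSY;
    }

    return ST_OK;
}
```

Errors other than busy pass through unchanged in both modes, which matches the wrapper in the diff.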
|
||||
NV_STATUS uvm_conf_computing_maybe_rotate_channel_ivs(uvm_channel_t *channel)
|
||||
{
|
||||
return uvm_conf_computing_rotate_channel_ivs_below_limit(channel, uvm_conf_computing_channel_iv_rotation_limit, false);
|
||||
}
|
||||
|
||||
NV_STATUS uvm_conf_computing_maybe_rotate_channel_ivs_retry_busy(uvm_channel_t *channel)
|
||||
{
|
||||
return uvm_conf_computing_rotate_channel_ivs_below_limit(channel, uvm_conf_computing_channel_iv_rotation_limit, true);
|
||||
}
|
||||
|
||||
void uvm_conf_computing_enable_key_rotation(uvm_gpu_t *gpu)
|
||||
{
|
||||
if (!g_uvm_global.conf_computing_enabled)
|
||||
return;
|
||||
|
||||
// Key rotation cannot be enabled on UVM if it is disabled on RM
|
||||
if (!gpu->parent->rm_info.gpuConfComputeCaps.bKeyRotationEnabled)
|
||||
return;
|
||||
|
||||
gpu->channel_manager->conf_computing.key_rotation_enabled = true;
|
||||
}
|
||||
|
||||
void uvm_conf_computing_disable_key_rotation(uvm_gpu_t *gpu)
|
||||
{
|
||||
if (!g_uvm_global.conf_computing_enabled)
|
||||
return;
|
||||
|
||||
gpu->channel_manager->conf_computing.key_rotation_enabled = false;
|
||||
}
|
||||
|
||||
bool uvm_conf_computing_is_key_rotation_enabled(uvm_gpu_t *gpu)
|
||||
{
|
||||
return gpu->channel_manager->conf_computing.key_rotation_enabled;
|
||||
}
|
||||
|
||||
bool uvm_conf_computing_is_key_rotation_enabled_in_pool(uvm_channel_pool_t *pool)
|
||||
{
|
||||
if (!uvm_conf_computing_is_key_rotation_enabled(pool->manager->gpu))
|
||||
return false;
|
||||
|
||||
// TODO: Bug 4586447: key rotation must be disabled in the SEC2 engine,
|
||||
// because currently the encryption key is shared between UVM and RM, but
|
||||
// UVM is not able to idle SEC2 channels owned by RM.
|
||||
if (uvm_channel_pool_is_sec2(pool))
|
||||
return false;
|
||||
|
||||
// Key rotation happens as part of channel reservation, and LCIC channels
|
||||
// are never reserved directly. Rotation of keys in LCIC channels happens
|
||||
// as the result of key rotation in WLC channels.
|
||||
//
|
||||
// Return false even if there is nothing fundamental prohibiting direct key
|
||||
// rotation on LCIC pools
|
||||
if (uvm_channel_pool_is_lcic(pool))
|
||||
return false;
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
static bool conf_computing_is_key_rotation_pending_use_stats(uvm_channel_pool_t *pool)
|
||||
{
|
||||
NvU64 decrypted, encrypted;
|
||||
|
||||
UVM_ASSERT(!key_rotation_is_notifier_driven());
|
||||
|
||||
decrypted = atomic64_read(&pool->conf_computing.key_rotation.decrypted);
|
||||
|
||||
if (decrypted > UVM_CONF_COMPUTING_KEY_ROTATION_LOWER_THRESHOLD)
|
||||
return true;
|
||||
|
||||
encrypted = atomic64_read(&pool->conf_computing.key_rotation.encrypted);
|
||||
|
||||
if (encrypted > UVM_CONF_COMPUTING_KEY_ROTATION_LOWER_THRESHOLD)
|
||||
return true;
|
||||
|
||||
return false;
|
||||
}
|
||||
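The usage-accounting path (log sizes on each encryption or decryption, compare against a threshold, reset after a successful rotation) can be sketched outside the kernel with C11 atomics in place of atomic64_t. This is an illustration only: ROTATION_THRESHOLD is a placeholder for UVM_CONF_COMPUTING_KEY_ROTATION_LOWER_THRESHOLD, whose value is not shown in this diff, and usage_stats_t, usage_log, rotation_pending and rotation_done are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ROTATION_THRESHOLD UINT64_C(4096)  /* hypothetical threshold, in bytes */

typedef struct {
    atomic_uint_fast64_t encrypted;  /* bytes encrypted since the last rotation */
    atomic_uint_fast64_t decrypted;  /* bytes decrypted since the last rotation */
} usage_stats_t;

/* Account for one operation of the given size on the given counter. */
static void usage_log(atomic_uint_fast64_t *counter, uint64_t size)
{
    atomic_fetch_add(counter, size);
}

/* A rotation is pending once either counter exceeds the threshold. */
static bool rotation_pending(usage_stats_t *stats)
{
    assert(stats != NULL);

    return atomic_load(&stats->decrypted) > ROTATION_THRESHOLD ||
           atomic_load(&stats->encrypted) > ROTATION_THRESHOLD;
}

/* Reset both counters after a successful rotation. */
static void rotation_done(usage_stats_t *stats)
{
    atomic_store(&stats->decrypted, 0);
    atomic_store(&stats->encrypted, 0);
}
```

The comparison is strict, as in the driver code, so a counter exactly at the threshold does not yet mark a rotation as pending.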
|
||||
static bool conf_computing_is_key_rotation_pending_use_notifier(uvm_channel_pool_t *pool)
|
||||
{
|
||||
// If key rotation is pending for the pool's engine, then the key rotation
|
||||
// notifier in any of the engine channels can be used by UVM to detect the
|
||||
// situation. Note that RM doesn't update all the notifiers in a single
|
||||
// atomic operation, so it is possible that the channel read by UVM (the
|
||||
// first one in the pool) indicates that a key rotation is pending, but
|
||||
// another channel in the pool (temporarily) indicates the opposite, or vice
|
||||
// versa.
|
||||
uvm_channel_t *first_channel = pool->channels;
|
||||
|
||||
UVM_ASSERT(key_rotation_is_notifier_driven());
|
||||
UVM_ASSERT(first_channel != NULL);
|
||||
|
||||
return first_channel->channel_info.keyRotationNotifier->status == UVM_KEY_ROTATION_STATUS_PENDING;
|
||||
}
|
||||
|
||||
bool uvm_conf_computing_is_key_rotation_pending_in_pool(uvm_channel_pool_t *pool)
|
||||
{
|
||||
if (!uvm_conf_computing_is_key_rotation_enabled_in_pool(pool))
|
||||
return false;
|
||||
|
||||
if (key_rotation_is_notifier_driven())
|
||||
return conf_computing_is_key_rotation_pending_use_notifier(pool);
|
||||
else
|
||||
return conf_computing_is_key_rotation_pending_use_stats(pool);
|
||||
}
|
||||
|
||||
NV_STATUS uvm_conf_computing_rotate_pool_key(uvm_channel_pool_t *pool)
|
||||
{
|
||||
NV_STATUS status;
|
||||
|
||||
UVM_ASSERT(uvm_conf_computing_is_key_rotation_enabled_in_pool(pool));
|
||||
UVM_ASSERT(pool->conf_computing.key_rotation.csl_contexts != NULL);
|
||||
UVM_ASSERT(pool->conf_computing.key_rotation.num_csl_contexts > 0);
|
||||
|
||||
// NV_ERR_STATE_IN_USE indicates that RM was not able to acquire the
|
||||
// required locks at this time. This status is not interpreted as an error,
|
||||
// but as a sign for UVM to try again later. This is the same "protocol"
|
||||
// used in IV rotation.
|
||||
status = nvUvmInterfaceCslRotateKey(pool->conf_computing.key_rotation.csl_contexts,
|
||||
pool->conf_computing.key_rotation.num_csl_contexts);
|
||||
|
||||
if (status == NV_OK) {
|
||||
pool->conf_computing.key_rotation.version++;
|
||||
|
||||
if (!key_rotation_is_notifier_driven()) {
|
||||
atomic64_set(&pool->conf_computing.key_rotation.decrypted, 0);
|
||||
atomic64_set(&pool->conf_computing.key_rotation.encrypted, 0);
|
||||
}
|
||||
}
|
||||
else if (status != NV_ERR_STATE_IN_USE) {
|
||||
UVM_DBG_PRINT("nvUvmInterfaceCslRotateKey() failed in engine %u: %s\n",
|
||||
pool->engine_index,
|
||||
nvstatusToString(status));
|
||||
}
|
||||
|
||||
return status;
|
||||
}
|
||||
|
||||
__attribute__ ((format(printf, 6, 7)))
|
||||
NV_STATUS uvm_conf_computing_util_memcopy_cpu_to_gpu(uvm_gpu_t *gpu,
|
||||
uvm_gpu_address_t dst_gpu_address,
|
||||
void *src_plain,
|
||||
size_t size,
|
||||
uvm_tracker_t *tracker,
|
||||
const char *format,
|
||||
...)
|
||||
{
|
||||
NV_STATUS status;
|
||||
uvm_push_t push;
|
||||
uvm_conf_computing_dma_buffer_t *dma_buffer;
|
||||
uvm_gpu_address_t src_gpu_address, auth_tag_gpu_address;
|
||||
void *dst_cipher, *auth_tag;
|
||||
va_list args;
|
||||
|
||||
UVM_ASSERT(g_uvm_global.conf_computing_enabled);
|
||||
UVM_ASSERT(size <= UVM_CONF_COMPUTING_DMA_BUFFER_SIZE);
|
||||
|
||||
status = uvm_conf_computing_dma_buffer_alloc(&gpu->conf_computing.dma_buffer_pool, &dma_buffer, NULL);
|
||||
if (status != NV_OK)
|
||||
return status;
|
||||
|
||||
va_start(args, format);
|
||||
status = uvm_push_begin_acquire(gpu->channel_manager, UVM_CHANNEL_TYPE_CPU_TO_GPU, tracker, &push, format, args);
|
||||
va_end(args);
|
||||
|
||||
if (status != NV_OK)
|
||||
goto out;
|
||||
|
||||
dst_cipher = uvm_mem_get_cpu_addr_kernel(dma_buffer->alloc);
|
||||
auth_tag = uvm_mem_get_cpu_addr_kernel(dma_buffer->auth_tag);
|
||||
uvm_conf_computing_cpu_encrypt(push.channel, dst_cipher, src_plain, NULL, size, auth_tag);
|
||||
|
||||
src_gpu_address = uvm_mem_gpu_address_virtual_kernel(dma_buffer->alloc, gpu);
|
||||
auth_tag_gpu_address = uvm_mem_gpu_address_virtual_kernel(dma_buffer->auth_tag, gpu);
|
||||
gpu->parent->ce_hal->decrypt(&push, dst_gpu_address, src_gpu_address, size, auth_tag_gpu_address);
|
||||
|
||||
status = uvm_push_end_and_wait(&push);
|
||||
|
||||
out:
|
||||
uvm_conf_computing_dma_buffer_free(&gpu->conf_computing.dma_buffer_pool, dma_buffer, NULL);
|
||||
return status;
|
||||
}
|
||||
|
||||
__attribute__ ((format(printf, 6, 7)))
|
||||
NV_STATUS uvm_conf_computing_util_memcopy_gpu_to_cpu(uvm_gpu_t *gpu,
|
||||
void *dst_plain,
|
||||
uvm_gpu_address_t src_gpu_address,
|
||||
size_t size,
|
||||
uvm_tracker_t *tracker,
|
||||
const char *format,
|
||||
...)
|
||||
{
|
||||
NV_STATUS status;
|
||||
uvm_push_t push;
|
||||
uvm_conf_computing_dma_buffer_t *dma_buffer;
|
||||
uvm_gpu_address_t dst_gpu_address, auth_tag_gpu_address;
|
||||
void *src_cipher, *auth_tag;
|
||||
va_list args;
|
||||
|
||||
UVM_ASSERT(g_uvm_global.conf_computing_enabled);
|
||||
UVM_ASSERT(size <= UVM_CONF_COMPUTING_DMA_BUFFER_SIZE);
|
||||
|
||||
status = uvm_conf_computing_dma_buffer_alloc(&gpu->conf_computing.dma_buffer_pool, &dma_buffer, NULL);
|
||||
if (status != NV_OK)
|
||||
return status;
|
||||
|
||||
va_start(args, format);
|
||||
status = uvm_push_begin_acquire(gpu->channel_manager, UVM_CHANNEL_TYPE_GPU_TO_CPU, tracker, &push, format, args);
|
||||
va_end(args);
|
||||
|
||||
if (status != NV_OK)
|
||||
goto out;
|
||||
|
||||
uvm_conf_computing_log_gpu_encryption(push.channel, size, dma_buffer->decrypt_iv);
|
||||
dma_buffer->key_version[0] = uvm_channel_pool_key_version(push.channel->pool);
|
||||
|
||||
dst_gpu_address = uvm_mem_gpu_address_virtual_kernel(dma_buffer->alloc, gpu);
|
||||
auth_tag_gpu_address = uvm_mem_gpu_address_virtual_kernel(dma_buffer->auth_tag, gpu);
|
||||
gpu->parent->ce_hal->encrypt(&push, dst_gpu_address, src_gpu_address, size, auth_tag_gpu_address);
|
||||
|
||||
status = uvm_push_end_and_wait(&push);
|
||||
if (status != NV_OK)
|
||||
goto out;
|
||||
|
||||
src_cipher = uvm_mem_get_cpu_addr_kernel(dma_buffer->alloc);
|
||||
auth_tag = uvm_mem_get_cpu_addr_kernel(dma_buffer->auth_tag);
|
||||
status = uvm_conf_computing_cpu_decrypt(push.channel,
|
||||
dst_plain,
|
||||
src_cipher,
|
||||
dma_buffer->decrypt_iv,
|
||||
dma_buffer->key_version[0],
|
||||
size,
|
||||
auth_tag);
|
||||
|
||||
out:
|
||||
uvm_conf_computing_dma_buffer_free(&gpu->conf_computing.dma_buffer_pool, dma_buffer, NULL);
|
||||
return status;
|
||||
}
|
||||
|
||||
@@ -62,6 +62,8 @@
|
||||
|
||||
void uvm_conf_computing_check_parent_gpu(const uvm_parent_gpu_t *parent);
|
||||
|
||||
bool uvm_conf_computing_mode_enabled_parent(const uvm_parent_gpu_t *parent);
|
||||
bool uvm_conf_computing_mode_enabled(const uvm_gpu_t *gpu);
|
||||
bool uvm_conf_computing_mode_is_hcc(const uvm_gpu_t *gpu);
|
||||
|
||||
typedef struct
|
||||
@@ -87,9 +89,9 @@ typedef struct
|
||||
// a free buffer.
|
||||
uvm_tracker_t tracker;
|
||||
|
||||
// When the DMA buffer is used as the destination of a GPU encryption, the
|
||||
// engine (CE or SEC2) writes the authentication tag here. When the buffer
|
||||
// is decrypted on the CPU the authentication tag is used by CSL to verify
|
||||
// When the DMA buffer is used as the destination of a GPU encryption, SEC2
|
||||
// writes the authentication tag here. Later when the buffer is decrypted
|
||||
// on the CPU the authentication tag is used again (read) for CSL to verify
|
||||
// the authenticity. The allocation is big enough for one authentication
|
||||
// tag per PAGE_SIZE page in the alloc buffer.
|
||||
uvm_mem_t *auth_tag;
|
||||
@@ -98,12 +100,7 @@ typedef struct
|
||||
// to the authentication tag. The allocation is big enough for one IV per
|
||||
// PAGE_SIZE page in the alloc buffer. The granularity between the decrypt
|
||||
// IV and authentication tag must match.
|
||||
UvmCslIv decrypt_iv[UVM_CONF_COMPUTING_DMA_BUFFER_SIZE / PAGE_SIZE];
|
||||
|
||||
// When the DMA buffer is used as the destination of a GPU encryption, the
|
||||
// key version used during GPU encryption of each PAGE_SIZE page can be
|
||||
// saved here, so CPU decryption uses the correct decryption key.
|
||||
NvU32 key_version[UVM_CONF_COMPUTING_DMA_BUFFER_SIZE / PAGE_SIZE];
|
||||
UvmCslIv decrypt_iv[(UVM_CONF_COMPUTING_DMA_BUFFER_SIZE / PAGE_SIZE)];
|
||||
|
||||
// Bitmap of the encrypted pages in the backing allocation
|
||||
uvm_page_mask_t encrypted_page_mask;
|
||||
@@ -152,7 +149,7 @@ NV_STATUS uvm_conf_computing_gpu_init(uvm_gpu_t *gpu);
|
||||
void uvm_conf_computing_gpu_deinit(uvm_gpu_t *gpu);
|
||||
|
||||
// Logs encryption information from the GPU and returns the IV.
|
||||
void uvm_conf_computing_log_gpu_encryption(uvm_channel_t *channel, size_t size, UvmCslIv *iv);
|
||||
void uvm_conf_computing_log_gpu_encryption(uvm_channel_t *channel, UvmCslIv *iv);
|
||||
|
||||
// Acquires next CPU encryption IV and returns it.
|
||||
void uvm_conf_computing_acquire_encryption_iv(uvm_channel_t *channel, UvmCslIv *iv);
|
||||
@@ -172,14 +169,10 @@ void uvm_conf_computing_cpu_encrypt(uvm_channel_t *channel,
|
||||
// CPU side decryption helper. Decrypts data from src_cipher and writes the
|
||||
// plain text in dst_plain. src_cipher and dst_plain can't overlap. IV obtained
|
||||
// from uvm_conf_computing_log_gpu_encryption() needs to be passed to src_iv.
|
||||
//
|
||||
// The caller must indicate which key to use for decryption by passing the
|
||||
// appropriate key version number.
|
||||
NV_STATUS uvm_conf_computing_cpu_decrypt(uvm_channel_t *channel,
|
||||
void *dst_plain,
|
||||
const void *src_cipher,
|
||||
const UvmCslIv *src_iv,
|
||||
NvU32 key_version,
|
||||
size_t size,
|
||||
const void *auth_tag_buffer);
|
||||
|
||||
@@ -200,94 +193,10 @@ NV_STATUS uvm_conf_computing_fault_decrypt(uvm_parent_gpu_t *parent_gpu,
|
||||
NvU8 valid);
|
||||
|
||||
// Increment the CPU-side decrypt IV of the CSL context associated with
|
||||
// replayable faults.
|
||||
// replayable faults. The function is a no-op if the given increment is zero.
|
||||
//
|
||||
// The IV associated with a fault CSL context is a 64-bit counter.
|
||||
//
|
||||
// Locking: this function must be invoked while holding the replayable ISR lock.
|
||||
void uvm_conf_computing_fault_increment_decrypt_iv(uvm_parent_gpu_t *parent_gpu);
|
||||
|
||||
// Query the number of remaining messages before IV needs to be rotated.
|
||||
void uvm_conf_computing_query_message_pools(uvm_channel_t *channel,
|
||||
NvU64 *remaining_encryptions,
|
||||
NvU64 *remaining_decryptions);
|
||||
|
||||
// Check if there are more than uvm_conf_computing_channel_iv_rotation_limit
|
||||
// messages available in the channel and try to rotate if not.
|
||||
NV_STATUS uvm_conf_computing_maybe_rotate_channel_ivs(uvm_channel_t *channel);
|
||||
|
||||
// Check if there are more than uvm_conf_computing_channel_iv_rotation_limit
|
||||
// messages available in the channel and rotate if not.
|
||||
NV_STATUS uvm_conf_computing_maybe_rotate_channel_ivs_retry_busy(uvm_channel_t *channel);
|
||||
|
||||
// Check if there are fewer than 'limit' messages available in either direction
|
||||
// and rotate if so.
|
||||
NV_STATUS uvm_conf_computing_rotate_channel_ivs_below_limit(uvm_channel_t *channel, NvU64 limit, bool retry_if_busy);
|
||||
|
||||
// Rotate the engine key associated with the given channel pool.
|
||||
NV_STATUS uvm_conf_computing_rotate_pool_key(uvm_channel_pool_t *pool);
|
||||
|
||||
// Returns true if key rotation is allowed in the channel pool.
|
||||
bool uvm_conf_computing_is_key_rotation_enabled_in_pool(uvm_channel_pool_t *pool);
|
||||
|
||||
// Returns true if key rotation is pending in the channel pool.
|
||||
bool uvm_conf_computing_is_key_rotation_pending_in_pool(uvm_channel_pool_t *pool);
|
||||
|
||||
// Enable/disable key rotation in the passed GPU. Note that UVM enablement is
|
||||
// dependent on RM enablement: key rotation may still be disabled upon calling
|
||||
// this function, if it is disabled in RM. On the other hand, key rotation can
|
||||
// be disabled in UVM, even if it is enabled in RM.
|
||||
//
|
||||
// Enablement/Disablement affects only kernel key rotation in keys owned by UVM.
|
||||
// It doesn't affect user key rotation (CUDA, Video...), nor does it affect RM
|
||||
// kernel key rotation.
|
||||
void uvm_conf_computing_enable_key_rotation(uvm_gpu_t *gpu);
|
||||
void uvm_conf_computing_disable_key_rotation(uvm_gpu_t *gpu);
|
||||
|
||||
// Returns true if key rotation is enabled on UVM in the given GPU. Key rotation
|
||||
// can be enabled on the GPU but disabled on some of GPU engines (LCEs or SEC2),
|
||||
// see uvm_conf_computing_is_key_rotation_enabled_in_pool.
|
||||
bool uvm_conf_computing_is_key_rotation_enabled(uvm_gpu_t *gpu);
|
||||
|
||||
// Launch a synchronous, encrypted copy between CPU and GPU.
|
||||
//
|
||||
// The maximum copy size allowed is UVM_CONF_COMPUTING_DMA_BUFFER_SIZE.
|
||||
//
|
||||
// The source CPU buffer pointed by src_plain contains the unencrypted (plain
|
||||
// text) contents; the function internally performs a CPU-side encryption step
|
||||
// before launching the GPU-side CE decryption. The source buffer can be in
|
||||
// protected or unprotected sysmem, while the destination buffer must be in
|
||||
// protected vidmem.
|
||||
//
|
||||
// The input tracker, if not NULL, is internally acquired by the push
|
||||
// responsible for the encrypted copy.
|
||||
__attribute__ ((format(printf, 6, 7)))
|
||||
NV_STATUS uvm_conf_computing_util_memcopy_cpu_to_gpu(uvm_gpu_t *gpu,
|
||||
uvm_gpu_address_t dst_gpu_address,
|
||||
void *src_plain,
|
||||
size_t size,
|
||||
uvm_tracker_t *tracker,
|
||||
const char *format,
|
||||
...);
|
||||
|
||||
// Launch a synchronous, encrypted copy between GPU and CPU.
|
||||
//
|
||||
// The maximum copy size allowed is UVM_CONF_COMPUTING_DMA_BUFFER_SIZE.
|
||||
//
|
||||
// The destination CPU buffer pointed to by dst_plain receives the unencrypted
|
||||
// (plain text) contents; the function internally performs a CPU-side
|
||||
// decryption step after the GPU-side CE encryption. The source buffer must be
|
||||
// in protected vidmem, while the destination buffer can be in protected or
|
||||
// unprotected sysmem.
|
||||
//
|
||||
// The input tracker, if not NULL, is internally acquired by the push
|
||||
// responsible for the encrypted copy.
|
||||
__attribute__ ((format(printf, 6, 7)))
|
||||
NV_STATUS uvm_conf_computing_util_memcopy_gpu_to_cpu(uvm_gpu_t *gpu,
|
||||
void *dst_plain,
|
||||
uvm_gpu_address_t src_gpu_address,
|
||||
size_t size,
|
||||
uvm_tracker_t *tracker,
|
||||
const char *format,
|
||||
...);
|
||||
void uvm_conf_computing_fault_increment_decrypt_iv(uvm_parent_gpu_t *parent_gpu, NvU64 increment);
|
||||
#endif // __UVM_CONF_COMPUTING_H__
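The two memcopy declarations above carry `__attribute__ ((format(printf, 6, 7)))`, which tells GCC-compatible compilers that the sixth parameter is a printf-style format string and that the variadic arguments to check start at position seven. The following is a minimal standalone sketch of that convention, not code from this repository: `demo_describe_copy` and its parameters are hypothetical and only mirror the parameter positions.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

// Hypothetical stand-in (not a UVM function): the format string is the sixth
// parameter and the variadic arguments begin at the seventh, which is what
// format(printf, 6, 7) declares to the compiler.
__attribute__((format(printf, 6, 7)))
static int demo_describe_copy(char *out,
                              size_t out_size,
                              const void *dst,
                              const void *src,
                              size_t size,
                              const char *format,
                              ...)
{
    va_list args;
    int written;

    // The copy parameters are unused here; they only occupy positions 3 to 5.
    (void)dst;
    (void)src;
    (void)size;

    assert(out != NULL && out_size > 0);

    // Forward the variadic arguments to vsnprintf, as a push-description
    // helper might do with its caller-supplied format.
    va_start(args, format);
    written = vsnprintf(out, out_size, format, args);
    va_end(args);

    return written;
}
```

With the attribute in place, a call whose arguments do not match the format string (for example passing an integer for `%s`) is reported at compile time when format warnings are enabled.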
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*******************************************************************************
|
||||
Copyright (c) 2016-2023 NVIDIA Corporation
|
||||
Copyright (c) 2016-2019 NVIDIA Corporation
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to
|
||||
@@ -34,27 +34,24 @@ NV_STATUS uvm_test_fault_buffer_flush(UVM_TEST_FAULT_BUFFER_FLUSH_PARAMS *params
|
||||
NV_STATUS status = NV_OK;
|
||||
uvm_va_space_t *va_space = uvm_va_space_get(filp);
|
||||
uvm_gpu_t *gpu;
|
||||
uvm_processor_mask_t *retained_gpus;
|
||||
uvm_global_processor_mask_t retained_gpus;
|
||||
NvU64 i;
|
||||
|
||||
retained_gpus = uvm_processor_mask_cache_alloc();
|
||||
if (!retained_gpus)
|
||||
return NV_ERR_NO_MEMORY;
|
||||
|
||||
uvm_processor_mask_zero(retained_gpus);
|
||||
uvm_global_processor_mask_zero(&retained_gpus);
|
||||
|
||||
uvm_va_space_down_read(va_space);
|
||||
|
||||
uvm_processor_mask_and(retained_gpus, &va_space->faultable_processors, &va_space->registered_gpus);
|
||||
for_each_va_space_gpu(gpu, va_space) {
|
||||
if (gpu->parent->replayable_faults_supported)
|
||||
uvm_global_processor_mask_set(&retained_gpus, gpu->global_id);
|
||||
}
|
||||
|
||||
uvm_global_gpu_retain(retained_gpus);
|
||||
uvm_global_mask_retain(&retained_gpus);
|
||||
|
||||
uvm_va_space_up_read(va_space);
|
||||
|
||||
if (uvm_processor_mask_empty(retained_gpus)) {
|
||||
status = NV_ERR_INVALID_DEVICE;
|
||||
goto out;
|
||||
}
|
||||
if (uvm_global_processor_mask_empty(&retained_gpus))
|
||||
return NV_ERR_INVALID_DEVICE;
|
||||
|
||||
for (i = 0; i < params->iterations; i++) {
|
||||
if (fatal_signal_pending(current)) {
|
||||
@@ -62,12 +59,11 @@ NV_STATUS uvm_test_fault_buffer_flush(UVM_TEST_FAULT_BUFFER_FLUSH_PARAMS *params
|
||||
break;
|
||||
}
|
||||
|
||||
for_each_gpu_in_mask(gpu, retained_gpus)
|
||||
for_each_global_gpu_in_mask(gpu, &retained_gpus)
|
||||
TEST_CHECK_GOTO(uvm_gpu_fault_buffer_flush(gpu) == NV_OK, out);
|
||||
}
|
||||
|
||||
out:
|
||||
uvm_global_gpu_release(retained_gpus);
|
||||
uvm_processor_mask_cache_free(retained_gpus);
|
||||
uvm_global_mask_release(&retained_gpus);
|
||||
return status;
|
||||
}
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*******************************************************************************
|
||||
Copyright (c) 2016-2023 NVIDIA Corporation
|
||||
Copyright (c) 2016-2021 NVidia Corporation
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to
|
||||
@@ -168,8 +168,7 @@ static NV_STATUS test_get_rm_ptes_single_gpu(uvm_va_space_t *va_space, UVM_TEST_
|
||||
client = params->hClient;
|
||||
memory = params->hMemory;
|
||||
|
||||
// Note: This check is safe as single GPU test does not run on SLI enabled
|
||||
// devices.
|
||||
// Note: This check is safe as single GPU test does not run on SLI enabled devices.
|
||||
memory_mapping_gpu = uvm_va_space_get_gpu_by_uuid_with_gpu_va_space(va_space, &params->gpu_uuid);
|
||||
if (!memory_mapping_gpu)
|
||||
return NV_ERR_INVALID_DEVICE;
|
||||
@@ -181,7 +180,7 @@ static NV_STATUS test_get_rm_ptes_single_gpu(uvm_va_space_t *va_space, UVM_TEST_
|
||||
if (status != NV_OK)
|
||||
return status;
|
||||
|
||||
TEST_CHECK_GOTO(uvm_uuid_eq(&memory_info.uuid, &params->gpu_uuid), done);
|
||||
TEST_CHECK_GOTO(uvm_processor_uuid_eq(&memory_info.uuid, &params->gpu_uuid), done);
|
||||
|
||||
TEST_CHECK_GOTO((memory_info.size == params->size), done);
|
||||
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*******************************************************************************
|
||||
Copyright (c) 2015-2024 NVIDIA Corporation
|
||||
Copyright (c) 2015-2022 NVIDIA Corporation
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to
|
||||
@@ -27,7 +27,6 @@
|
||||
#include "uvm_gpu_replayable_faults.h"
|
||||
#include "uvm_mem.h"
|
||||
#include "uvm_perf_events.h"
|
||||
#include "uvm_processors.h"
|
||||
#include "uvm_procfs.h"
|
||||
#include "uvm_thread_context.h"
|
||||
#include "uvm_va_range.h"
|
||||
@@ -122,12 +121,6 @@ NV_STATUS uvm_global_init(void)
|
||||
g_uvm_global.num_simulated_devices = 0;
|
||||
g_uvm_global.conf_computing_enabled = platform_info.confComputingEnabled;
|
||||
|
||||
status = uvm_processor_mask_cache_init();
|
||||
if (status != NV_OK) {
|
||||
UVM_ERR_PRINT("uvm_processor_mask_cache_init() failed: %s\n", nvstatusToString(status));
|
||||
goto error;
|
||||
}
|
||||
|
||||
status = uvm_gpu_init();
|
||||
if (status != NV_OK) {
|
||||
UVM_ERR_PRINT("uvm_gpu_init() failed: %s\n", nvstatusToString(status));
|
||||
@@ -230,7 +223,6 @@ void uvm_global_exit(void)
|
||||
uvm_mem_global_exit();
|
||||
uvm_pmm_sysmem_exit();
|
||||
uvm_gpu_exit();
|
||||
uvm_processor_mask_cache_exit();
|
||||
|
||||
if (g_uvm_global.rm_session_handle != 0)
|
||||
uvm_rm_locked_call_void(nvUvmInterfaceSessionDestroy(g_uvm_global.rm_session_handle));
|
||||
@@ -249,19 +241,19 @@ void uvm_global_exit(void)
|
||||
|
||||
// Signal to the top-half ISR whether calls from the RM's top-half ISR are to
|
||||
// be completed without processing.
|
||||
static void uvm_parent_gpu_set_isr_suspended(uvm_parent_gpu_t *parent_gpu, bool is_suspended)
|
||||
static void uvm_gpu_set_isr_suspended(uvm_gpu_t *gpu, bool is_suspended)
|
||||
{
|
||||
uvm_spin_lock_irqsave(&parent_gpu->isr.interrupts_lock);
|
||||
uvm_spin_lock_irqsave(&gpu->parent->isr.interrupts_lock);
|
||||
|
||||
parent_gpu->isr.is_suspended = is_suspended;
|
||||
gpu->parent->isr.is_suspended = is_suspended;
|
||||
|
||||
uvm_spin_unlock_irqrestore(&parent_gpu->isr.interrupts_lock);
|
||||
uvm_spin_unlock_irqrestore(&gpu->parent->isr.interrupts_lock);
|
||||
}
|
||||
|
||||
static NV_STATUS uvm_suspend(void)
|
||||
{
|
||||
uvm_va_space_t *va_space = NULL;
|
||||
uvm_gpu_id_t gpu_id;
|
||||
uvm_global_gpu_id_t gpu_id;
|
||||
uvm_gpu_t *gpu;
|
||||
|
||||
// Upon entry into this function, the following is true:
|
||||
@@ -295,7 +287,7 @@ static NV_STATUS uvm_suspend(void)
|
||||
// Though global_lock isn't held here, pm.lock indirectly prevents the
|
||||
// addition and removal of GPUs, since these operations can currently
|
||||
// only occur in response to ioctl() calls.
|
||||
for_each_gpu_id_in_mask(gpu_id, &g_uvm_global.retained_gpus) {
|
||||
for_each_global_gpu_id_in_mask(gpu_id, &g_uvm_global.retained_gpus) {
|
||||
gpu = uvm_gpu_get(gpu_id);
|
||||
|
||||
// Since fault buffer state may be lost across sleep cycles, UVM must
|
||||
@@ -314,9 +306,9 @@ static NV_STATUS uvm_suspend(void)
|
||||
// interrupts in the bottom half in the future, the bottom half flush
|
||||
// below will no longer be able to guarantee that all outstanding
|
||||
// notifications have been handled.
|
||||
uvm_parent_gpu_access_counters_set_ignore(gpu->parent, true);
|
||||
uvm_gpu_access_counters_set_ignore(gpu, true);
|
||||
|
||||
uvm_parent_gpu_set_isr_suspended(gpu->parent, true);
|
||||
uvm_gpu_set_isr_suspended(gpu, true);
|
||||
|
||||
nv_kthread_q_flush(&gpu->parent->isr.bottom_half_q);
|
||||
|
||||
@@ -349,7 +341,7 @@ NV_STATUS uvm_suspend_entry(void)
|
||||
static NV_STATUS uvm_resume(void)
|
||||
{
|
||||
uvm_va_space_t *va_space = NULL;
|
||||
uvm_gpu_id_t gpu_id;
|
||||
uvm_global_gpu_id_t gpu_id;
|
||||
uvm_gpu_t *gpu;
|
||||
|
||||
g_uvm_global.pm.is_suspended = false;
|
||||
@@ -368,18 +360,18 @@ static NV_STATUS uvm_resume(void)
|
||||
uvm_mutex_unlock(&g_uvm_global.va_spaces.lock);
|
||||
|
||||
// pm.lock is held in lieu of global_lock to prevent GPU addition/removal
|
||||
for_each_gpu_id_in_mask(gpu_id, &g_uvm_global.retained_gpus) {
|
||||
for_each_global_gpu_id_in_mask(gpu_id, &g_uvm_global.retained_gpus) {
|
||||
gpu = uvm_gpu_get(gpu_id);
|
||||
|
||||
// Bring the fault buffer software state back in sync with the
|
||||
// hardware state.
|
||||
uvm_parent_gpu_fault_buffer_resume(gpu->parent);
|
||||
uvm_gpu_fault_buffer_resume(gpu->parent);
|
||||
|
||||
uvm_parent_gpu_set_isr_suspended(gpu->parent, false);
|
||||
uvm_gpu_set_isr_suspended(gpu, false);
|
||||
|
||||
// Reenable access counter interrupt processing unless notifications
|
||||
// have been set to be suppressed.
|
||||
uvm_parent_gpu_access_counters_set_ignore(gpu->parent, false);
|
||||
uvm_gpu_access_counters_set_ignore(gpu, false);
|
||||
}
|
||||
|
||||
uvm_up_write(&g_uvm_global.pm.lock);
|
||||
@@ -433,36 +425,35 @@ NV_STATUS uvm_global_reset_fatal_error(void)
|
||||
return nv_atomic_xchg(&g_uvm_global.fatal_error, NV_OK);
|
||||
}
|
||||
|
||||
void uvm_global_gpu_retain(const uvm_processor_mask_t *mask)
|
||||
void uvm_global_mask_retain(const uvm_global_processor_mask_t *mask)
|
||||
{
|
||||
uvm_gpu_t *gpu;
|
||||
|
||||
for_each_gpu_in_mask(gpu, mask)
|
||||
for_each_global_gpu_in_mask(gpu, mask)
|
||||
uvm_gpu_retain(gpu);
|
||||
}
|
||||
|
||||
void uvm_global_gpu_release(const uvm_processor_mask_t *mask)
|
||||
void uvm_global_mask_release(const uvm_global_processor_mask_t *mask)
|
||||
{
|
||||
uvm_gpu_id_t gpu_id;
|
||||
uvm_global_gpu_id_t gpu_id;
|
||||
|
||||
if (uvm_processor_mask_empty(mask))
|
||||
if (uvm_global_processor_mask_empty(mask))
|
||||
return;
|
||||
|
||||
uvm_mutex_lock(&g_uvm_global.global_lock);
|
||||
|
||||
// Do not use for_each_gpu_in_mask as it reads the GPU state and it
|
||||
// might get destroyed.
|
||||
for_each_gpu_id_in_mask(gpu_id, mask)
|
||||
// Do not use for_each_global_gpu_in_mask as it reads the GPU state and it
|
||||
// might get destroyed
|
||||
for_each_global_gpu_id_in_mask(gpu_id, mask)
|
||||
uvm_gpu_release_locked(uvm_gpu_get(gpu_id));
|
||||
|
||||
uvm_mutex_unlock(&g_uvm_global.global_lock);
|
||||
}
|
||||
|
||||
NV_STATUS uvm_global_gpu_check_ecc_error(uvm_processor_mask_t *gpus)
|
||||
NV_STATUS uvm_global_mask_check_ecc_error(uvm_global_processor_mask_t *gpus)
|
||||
{
|
||||
uvm_gpu_t *gpu;
|
||||
|
||||
for_each_gpu_in_mask(gpu, gpus) {
|
||||
for_each_global_gpu_in_mask(gpu, gpus) {
|
||||
NV_STATUS status = uvm_gpu_check_ecc_error(gpu);
|
||||
if (status != NV_OK)
|
||||
return status;
|
||||
|
||||
@@ -40,13 +40,13 @@ struct uvm_global_struct
|
||||
// Note that GPUs are added to this mask as the last step of add_gpu() and
|
||||
// removed from it as the first step of remove_gpu() implying that a GPU
|
||||
// that's being initialized or deinitialized will not be in it.
|
||||
uvm_processor_mask_t retained_gpus;
|
||||
uvm_global_processor_mask_t retained_gpus;
|
||||
|
||||
// Array of the parent GPUs registered with UVM. Note that GPUs will have
|
||||
// ids offset by 1 to accommodate the UVM_ID_CPU so e.g., parent_gpus[0]
|
||||
// will have GPU id = 1. A GPU entry is unused iff it does not exist
|
||||
// (is a NULL pointer) in this table.
|
||||
uvm_parent_gpu_t *parent_gpus[UVM_PARENT_ID_MAX_GPUS];
|
||||
// ids offset by 1 to accommodate the UVM_GLOBAL_ID_CPU so e.g.
|
||||
// parent_gpus[0] will have GPU id = 1. A GPU entry is unused iff it does
|
||||
// not exist (is a NULL pointer) in this table.
|
||||
uvm_parent_gpu_t *parent_gpus[UVM_MAX_GPUS];
|
||||
|
||||
// A global RM session (RM client)
|
||||
// Created on module load and destroyed on module unload
|
||||
@@ -172,7 +172,7 @@ NV_STATUS uvm_resume_entry(void);
|
||||
// LOCKING: requires that you hold the global lock and gpu_table_lock
|
||||
static void uvm_global_add_parent_gpu(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
NvU32 gpu_index = uvm_parent_id_gpu_index(parent_gpu->id);
|
||||
NvU32 gpu_index = uvm_id_gpu_index(parent_gpu->id);
|
||||
|
||||
uvm_assert_mutex_locked(&g_uvm_global.global_lock);
|
||||
uvm_assert_spinlock_locked(&g_uvm_global.gpu_table_lock);
|
||||
@@ -186,7 +186,7 @@ static void uvm_global_add_parent_gpu(uvm_parent_gpu_t *parent_gpu)
|
||||
// LOCKING: requires that you hold the global lock and gpu_table_lock
|
||||
static void uvm_global_remove_parent_gpu(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
NvU32 gpu_index = uvm_parent_id_gpu_index(parent_gpu->id);
|
||||
NvU32 gpu_index = uvm_id_gpu_index(parent_gpu->id);
|
||||
|
||||
uvm_assert_mutex_locked(&g_uvm_global.global_lock);
|
||||
uvm_assert_spinlock_locked(&g_uvm_global.gpu_table_lock);
|
||||
@@ -201,29 +201,47 @@ static void uvm_global_remove_parent_gpu(uvm_parent_gpu_t *parent_gpu)
|
||||
//
|
||||
// LOCKING: requires that you hold the gpu_table_lock, the global lock, or have
|
||||
// retained at least one of the child GPUs.
|
||||
static uvm_parent_gpu_t *uvm_parent_gpu_get(uvm_parent_gpu_id_t id)
|
||||
static uvm_parent_gpu_t *uvm_parent_gpu_get(uvm_gpu_id_t id)
|
||||
{
|
||||
return g_uvm_global.parent_gpus[uvm_parent_id_gpu_index(id)];
|
||||
return g_uvm_global.parent_gpus[uvm_id_gpu_index(id)];
|
||||
}
|
||||
|
||||
// Get a gpu by its GPU id.
|
||||
// Get a gpu by its global id.
|
||||
// Returns a pointer to the GPU object, or NULL if not found.
|
||||
//
|
||||
// LOCKING: requires that you hold the gpu_table_lock, the global_lock, or have
|
||||
// retained the gpu.
|
||||
static uvm_gpu_t *uvm_gpu_get(uvm_gpu_id_t gpu_id)
|
||||
static uvm_gpu_t *uvm_gpu_get(uvm_global_gpu_id_t global_gpu_id)
|
||||
{
|
||||
uvm_parent_gpu_t *parent_gpu;
|
||||
|
||||
parent_gpu = g_uvm_global.parent_gpus[uvm_parent_id_gpu_index_from_gpu_id(gpu_id)];
|
||||
parent_gpu = g_uvm_global.parent_gpus[uvm_id_gpu_index_from_global_gpu_id(global_gpu_id)];
|
||||
if (!parent_gpu)
|
||||
return NULL;
|
||||
|
||||
return parent_gpu->gpus[uvm_id_sub_processor_index(gpu_id)];
|
||||
return parent_gpu->gpus[uvm_global_id_sub_processor_index(global_gpu_id)];
|
||||
}
|
||||
|
||||
static uvmGpuSessionHandle uvm_global_session_handle(void)
|
||||
// Get a gpu by its processor id.
|
||||
// Returns a pointer to the GPU object, or NULL if not found.
|
||||
//
|
||||
// LOCKING: requires that you hold the gpu_table_lock, the global_lock, or have
|
||||
// retained the gpu.
|
||||
static uvm_gpu_t *uvm_gpu_get_by_processor_id(uvm_processor_id_t id)
|
||||
{
|
||||
uvm_global_gpu_id_t global_id = uvm_global_gpu_id_from_gpu_id(id);
|
||||
uvm_gpu_t *gpu = uvm_gpu_get(global_id);
|
||||
|
||||
if (gpu)
|
||||
UVM_ASSERT(!gpu->parent->smc.enabled);
|
||||
|
||||
return gpu;
|
||||
}
|
||||
|
||||
static uvmGpuSessionHandle uvm_gpu_session_handle(uvm_gpu_t *gpu)
|
||||
{
|
||||
if (gpu->parent->smc.enabled)
|
||||
return gpu->smc.rm_session_handle;
|
||||
return g_uvm_global.rm_session_handle;
|
||||
}
|
||||
|
||||
@@ -276,57 +294,56 @@ static NV_STATUS uvm_global_get_status(void)
|
||||
// reset call was made.
|
||||
NV_STATUS uvm_global_reset_fatal_error(void);
|
||||
|
||||
static uvm_gpu_t *uvm_processor_mask_find_first_gpu(const uvm_processor_mask_t *gpus)
|
||||
static uvm_gpu_t *uvm_global_processor_mask_find_first_gpu(const uvm_global_processor_mask_t *global_gpus)
|
||||
{
|
||||
uvm_gpu_t *gpu;
|
||||
uvm_gpu_id_t gpu_id = uvm_processor_mask_find_first_gpu_id(gpus);
|
||||
uvm_global_gpu_id_t gpu_id = uvm_global_processor_mask_find_first_gpu_id(global_gpus);
|
||||
|
||||
if (UVM_ID_IS_INVALID(gpu_id))
|
||||
if (UVM_GLOBAL_ID_IS_INVALID(gpu_id))
|
||||
return NULL;
|
||||
|
||||
gpu = uvm_gpu_get(gpu_id);
|
||||
|
||||
// If there is a valid GPU id in the mask, assert that the corresponding
|
||||
// uvm_gpu_t is present. Otherwise it would stop a
|
||||
// for_each_gpu_in_mask() loop prematurely. Today, this could only
|
||||
// for_each_global_gpu_in_mask() loop prematurely. Today, this could only
|
||||
// happen in remove_gpu() because the GPU being removed is deleted from the
|
||||
// global table very early.
|
||||
UVM_ASSERT_MSG(gpu, "gpu_id %u\n", uvm_id_value(gpu_id));
|
||||
UVM_ASSERT_MSG(gpu, "gpu_id %u\n", uvm_global_id_value(gpu_id));
|
||||
|
||||
return gpu;
|
||||
}
|
||||
|
||||
static uvm_gpu_t *__uvm_processor_mask_find_next_gpu(const uvm_processor_mask_t *gpus, uvm_gpu_t *gpu)
|
||||
static uvm_gpu_t *__uvm_global_processor_mask_find_next_gpu(const uvm_global_processor_mask_t *global_gpus, uvm_gpu_t *gpu)
|
||||
{
|
||||
uvm_gpu_id_t gpu_id;
|
||||
uvm_global_gpu_id_t gpu_id;
|
||||
|
||||
UVM_ASSERT(gpu);
|
||||
|
||||
gpu_id = uvm_processor_mask_find_next_id(gpus, uvm_gpu_id_next(gpu->id));
|
||||
if (UVM_ID_IS_INVALID(gpu_id))
|
||||
gpu_id = uvm_global_processor_mask_find_next_id(global_gpus, uvm_global_gpu_id_next(gpu->global_id));
|
||||
if (UVM_GLOBAL_ID_IS_INVALID(gpu_id))
|
||||
return NULL;
|
||||
|
||||
gpu = uvm_gpu_get(gpu_id);
|
||||
|
||||
// See comment in uvm_processor_mask_find_first_gpu().
|
||||
UVM_ASSERT_MSG(gpu, "gpu_id %u\n", uvm_id_value(gpu_id));
|
||||
// See comment in uvm_global_processor_mask_find_first_gpu().
|
||||
UVM_ASSERT_MSG(gpu, "gpu_id %u\n", uvm_global_id_value(gpu_id));
|
||||
|
||||
return gpu;
|
||||
}
|
||||
|
||||
// Helper to iterate over all GPUs in the input mask
|
||||
#define for_each_gpu_in_mask(gpu, mask) \
|
||||
for (gpu = uvm_processor_mask_find_first_gpu(mask); \
|
||||
gpu != NULL; \
|
||||
gpu = __uvm_processor_mask_find_next_gpu(mask, gpu))
|
||||
#define for_each_global_gpu_in_mask(gpu, global_mask) \
|
||||
for (gpu = uvm_global_processor_mask_find_first_gpu(global_mask); \
|
||||
gpu != NULL; \
|
||||
gpu = __uvm_global_processor_mask_find_next_gpu(global_mask, gpu))
|
||||
|
||||
// Helper to iterate over all GPUs retained by the UVM driver
|
||||
// (across all va spaces).
|
||||
#define for_each_gpu(gpu) \
|
||||
for (({uvm_assert_mutex_locked(&g_uvm_global.global_lock); \
|
||||
gpu = uvm_processor_mask_find_first_gpu(&g_uvm_global.retained_gpus);}); \
|
||||
gpu != NULL; \
|
||||
gpu = __uvm_processor_mask_find_next_gpu(&g_uvm_global.retained_gpus, gpu))
|
||||
// Helper to iterate over all GPUs retained by the UVM driver (across all va spaces)
|
||||
#define for_each_global_gpu(gpu) \
|
||||
for (({uvm_assert_mutex_locked(&g_uvm_global.global_lock); \
|
||||
gpu = uvm_global_processor_mask_find_first_gpu(&g_uvm_global.retained_gpus);}); \
|
||||
gpu != NULL; \
|
||||
gpu = __uvm_global_processor_mask_find_next_gpu(&g_uvm_global.retained_gpus, gpu))
|
||||
|
||||
// LOCKING: Must hold either the global_lock or the gpu_table_lock
|
||||
static uvm_parent_gpu_t *uvm_global_find_next_parent_gpu(uvm_parent_gpu_t *parent_gpu)
|
||||
@@ -334,7 +351,7 @@ static uvm_parent_gpu_t *uvm_global_find_next_parent_gpu(uvm_parent_gpu_t *paren
|
||||
NvU32 i;
|
||||
|
||||
if (parent_gpu) {
|
||||
NvU32 gpu_index = uvm_parent_id_gpu_index(parent_gpu->id);
|
||||
NvU32 gpu_index = uvm_id_gpu_index(parent_gpu->id);
|
||||
i = gpu_index + 1;
|
||||
}
|
||||
else {
|
||||
@@ -343,7 +360,7 @@ static uvm_parent_gpu_t *uvm_global_find_next_parent_gpu(uvm_parent_gpu_t *paren
|
||||
|
||||
parent_gpu = NULL;
|
||||
|
||||
while (i < UVM_PARENT_ID_MAX_GPUS) {
|
||||
while (i < UVM_MAX_GPUS) {
|
||||
if (g_uvm_global.parent_gpus[i]) {
|
||||
parent_gpu = g_uvm_global.parent_gpus[i];
|
||||
break;
|
||||
@@ -359,18 +376,18 @@ static uvm_parent_gpu_t *uvm_global_find_next_parent_gpu(uvm_parent_gpu_t *paren
|
||||
static uvm_gpu_t *uvm_gpu_find_next_valid_gpu_in_parent(uvm_parent_gpu_t *parent_gpu, uvm_gpu_t *cur_gpu)
|
||||
{
|
||||
uvm_gpu_t *gpu = NULL;
|
||||
uvm_gpu_id_t gpu_id;
|
||||
uvm_global_gpu_id_t global_gpu_id;
|
||||
NvU32 sub_processor_index;
|
||||
NvU32 cur_sub_processor_index;
|
||||
|
||||
UVM_ASSERT(parent_gpu);
|
||||
|
||||
gpu_id = uvm_gpu_id_from_parent_gpu_id(parent_gpu->id);
|
||||
cur_sub_processor_index = cur_gpu ? uvm_id_sub_processor_index(cur_gpu->id) : -1;
|
||||
global_gpu_id = uvm_global_gpu_id_from_gpu_id(parent_gpu->id);
|
||||
cur_sub_processor_index = cur_gpu ? uvm_global_id_sub_processor_index(cur_gpu->global_id) : -1;
|
||||
|
||||
sub_processor_index = find_next_bit(parent_gpu->valid_gpus, UVM_PARENT_ID_MAX_SUB_PROCESSORS, cur_sub_processor_index + 1);
|
||||
if (sub_processor_index < UVM_PARENT_ID_MAX_SUB_PROCESSORS) {
|
||||
gpu = uvm_gpu_get(uvm_id_from_value(uvm_id_value(gpu_id) + sub_processor_index));
|
||||
sub_processor_index = find_next_bit(parent_gpu->valid_gpus, UVM_ID_MAX_SUB_PROCESSORS, cur_sub_processor_index + 1);
|
||||
if (sub_processor_index < UVM_ID_MAX_SUB_PROCESSORS) {
|
||||
gpu = uvm_gpu_get(uvm_global_id_from_value(uvm_global_id_value(global_gpu_id) + sub_processor_index));
|
||||
UVM_ASSERT(gpu != NULL);
|
||||
}
|
||||
|
||||
@@ -390,18 +407,18 @@ static uvm_gpu_t *uvm_gpu_find_next_valid_gpu_in_parent(uvm_parent_gpu_t *parent
|
||||
(gpu) != NULL; \
|
||||
(gpu) = uvm_gpu_find_next_valid_gpu_in_parent((parent_gpu), (gpu)))
|
||||
|
||||
// Helper which calls uvm_gpu_retain() on each GPU in mask.
|
||||
void uvm_global_gpu_retain(const uvm_processor_mask_t *mask);
|
||||
// Helper which calls uvm_gpu_retain on each GPU in mask
|
||||
void uvm_global_mask_retain(const uvm_global_processor_mask_t *mask);
|
||||
|
||||
// Helper which calls uvm_gpu_release_locked on each GPU in mask.
|
||||
//
|
||||
// LOCKING: this function takes and releases the global lock if the input mask
|
||||
// is not empty
|
||||
void uvm_global_gpu_release(const uvm_processor_mask_t *mask);
|
||||
void uvm_global_mask_release(const uvm_global_processor_mask_t *mask);
|
||||
|
||||
// Check for ECC errors for all GPUs in a mask
|
||||
// Notably this check cannot be performed where it's not safe to call into RM.
|
||||
NV_STATUS uvm_global_gpu_check_ecc_error(uvm_processor_mask_t *gpus);
|
||||
NV_STATUS uvm_global_mask_check_ecc_error(uvm_global_processor_mask_t *gpus);
|
||||
|
||||
// Pre-allocate fault service contexts.
|
||||
NV_STATUS uvm_service_block_context_init(void);
|
||||
@@ -409,10 +426,4 @@ NV_STATUS uvm_service_block_context_init(void);
|
||||
// Release fault service contexts if any exist.
|
||||
void uvm_service_block_context_exit(void);
|
||||
|
||||
// Allocate a service block context
|
||||
uvm_service_block_context_t *uvm_service_block_context_alloc(struct mm_struct *mm);
|
||||
|
||||
// Free a service block context
|
||||
void uvm_service_block_context_free(uvm_service_block_context_t *service_context);
|
||||
|
||||
#endif // __UVM_GLOBAL_H__
|
||||
|
||||
File diff suppressed because it is too large
@@ -1,5 +1,5 @@
|
||||
/*******************************************************************************
|
||||
Copyright (c) 2015-2023 NVIDIA Corporation
|
||||
Copyright (c) 2015-2022 NVIDIA Corporation
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to
|
||||
@@ -160,10 +160,6 @@ struct uvm_service_block_context_struct
|
||||
// Pages whose permissions need to be revoked from other processors
|
||||
uvm_page_mask_t revocation_mask;
|
||||
|
||||
// Temporary mask used in service_va_block_locked() in
|
||||
// uvm_gpu_access_counters.c.
|
||||
uvm_processor_mask_t update_processors;
|
||||
|
||||
struct
|
||||
{
|
||||
// Per-processor mask with the pages that will be resident after
|
||||
@@ -173,7 +169,7 @@ struct uvm_service_block_context_struct
|
||||
} per_processor_masks[UVM_ID_MAX_PROCESSORS];
|
||||
|
||||
// State used by the VA block routines called by the servicing routine
|
||||
uvm_va_block_context_t *block_context;
|
||||
uvm_va_block_context_t block_context;
|
||||
|
||||
// Prefetch state hint
|
||||
uvm_perf_prefetch_hint_t prefetch_hint;
|
||||
@@ -597,41 +593,23 @@ typedef enum
|
||||
UVM_GPU_LINK_MAX
|
||||
} uvm_gpu_link_type_t;
|
||||
|
||||
// UVM does not support P2P copies on pre-Pascal GPUs. Pascal+ GPUs only
|
||||
// support virtual addresses in P2P copies. Therefore, a peer identity mapping
|
||||
// needs to be created.
|
||||
// Ampere+ GPUs support physical peer copies, too, so identity mappings are not
|
||||
// needed
|
||||
typedef enum
|
||||
{
|
||||
// Peer copies can be disallowed for a variety of reasons. For example,
|
||||
// P2P transfers are disabled in pre-Pascal GPUs because there is no
|
||||
// compelling use case for direct peer migrations.
|
||||
UVM_GPU_PEER_COPY_MODE_UNSUPPORTED,
|
||||
|
||||
// Pascal+ GPUs support virtual addresses in P2P copies. Virtual peer copies
|
||||
// require the creation of peer identity mappings.
|
||||
UVM_GPU_PEER_COPY_MODE_VIRTUAL,
|
||||
|
||||
// Ampere+ GPUs support virtual and physical peer copies. Physical peer
|
||||
// copies do not depend on peer identity mappings.
|
||||
UVM_GPU_PEER_COPY_MODE_PHYSICAL,
|
||||
|
||||
UVM_GPU_PEER_COPY_MODE_COUNT
|
||||
} uvm_gpu_peer_copy_mode_t;
|
||||
|
||||
// In order to support SMC/MIG GPU partitions, we split UVM GPUs into two
|
||||
// parts: parent GPUs (uvm_parent_gpu_t) which represent unique PCIe devices
|
||||
// (including VFs), and sub/child GPUs (uvm_gpu_t) which represent individual
|
||||
// partitions within the parent. The parent GPU and partition GPU have
|
||||
// different "id" and "uuid".
|
||||
struct uvm_gpu_struct
|
||||
{
|
||||
uvm_parent_gpu_t *parent;
|
||||
|
||||
// The gpu's GI uuid if SMC is enabled; otherwise, a copy of parent->uuid.
|
||||
NvProcessorUuid uuid;
|
||||
|
||||
// Nice printable name in the format:
|
||||
// ID: 999: GPU-<parent_uuid> UVM-GI-<gi_uuid>.
|
||||
// UVM_GPU_UUID_TEXT_BUFFER_LENGTH includes the null character.
|
||||
char name[9 + 2 * UVM_GPU_UUID_TEXT_BUFFER_LENGTH];
|
||||
|
||||
// Refcount of the gpu, i.e. how many times it has been retained. This is
|
||||
// roughly a count of how many times it has been registered with a VA space,
|
||||
// except that some paths retain the GPU temporarily without a VA space.
|
||||
@@ -650,9 +628,13 @@ struct uvm_gpu_struct
|
||||
// user can create a lot of va spaces and register the gpu with them).
|
||||
atomic64_t retained_count;
|
||||
|
||||
// A unique uvm gpu id in range [1, UVM_ID_MAX_PROCESSORS).
|
||||
// A unique uvm gpu id in range [1, UVM_ID_MAX_PROCESSORS); this is a copy
|
||||
// of the parent's id.
|
||||
uvm_gpu_id_t id;
|
||||
|
||||
// A unique uvm global_gpu id in range [1, UVM_GLOBAL_ID_MAX_PROCESSORS)
|
||||
uvm_global_gpu_id_t global_id;
|
||||
|
||||
// Should be UVM_GPU_MAGIC_VALUE. Used for memory checking.
|
||||
NvU64 magic;
|
||||
|
||||
@@ -666,10 +648,6 @@ struct uvm_gpu_struct
|
||||
// can allocate through PMM (PMA).
|
||||
NvU64 max_allocatable_address;
|
||||
|
||||
// Max supported vidmem page size may be smaller than the max GMMU page
|
||||
// size, because of the vMMU supported page sizes.
|
||||
NvU64 max_vidmem_page_size;
|
||||
|
||||
struct
|
||||
{
|
||||
// True if the platform supports HW coherence and the GPU's memory
|
||||
@@ -846,6 +824,8 @@ struct uvm_gpu_struct
|
||||
{
|
||||
NvU32 swizz_id;
|
||||
|
||||
uvmGpuSessionHandle rm_session_handle;
|
||||
|
||||
// RM device handle used in many of the UVM/RM APIs.
|
||||
//
|
||||
// Do not read this field directly, use uvm_gpu_device_handle instead.
|
||||
@@ -858,9 +838,6 @@ struct uvm_gpu_struct
|
||||
|
||||
struct proc_dir_entry *dir_symlink;
|
||||
|
||||
// The GPU instance UUID symlink if SMC is enabled.
|
||||
struct proc_dir_entry *gpu_instance_uuid_symlink;
|
||||
|
||||
struct proc_dir_entry *info_file;
|
||||
|
||||
struct proc_dir_entry *dir_peers;
|
||||
@@ -873,11 +850,6 @@ struct uvm_gpu_struct
|
||||
bool uvm_test_force_upper_pushbuffer_segment;
|
||||
};
|
||||
|
||||
// In order to support SMC/MIG GPU partitions, we split UVM GPUs into two
|
||||
// parts: parent GPUs (uvm_parent_gpu_t) which represent unique PCIe devices
|
||||
// (including VFs), and sub/child GPUs (uvm_gpu_t) which represent individual
|
||||
// partitions within the parent. The parent GPU and partition GPU have
|
||||
// different "id" and "uuid".
|
||||
struct uvm_parent_gpu_struct
|
||||
{
|
||||
// Reference count for how many places are holding on to a parent GPU
|
||||
@@ -890,11 +862,11 @@ struct uvm_parent_gpu_struct
|
||||
// The number of uvm_gpu_ts referencing this uvm_parent_gpu_t.
|
||||
NvU32 num_retained_gpus;
|
||||
|
||||
uvm_gpu_t *gpus[UVM_PARENT_ID_MAX_SUB_PROCESSORS];
|
||||
uvm_gpu_t *gpus[UVM_ID_MAX_SUB_PROCESSORS];
|
||||
|
||||
// Bitmap of valid child entries in the gpus[] table. Used to retrieve a
|
||||
// usable child GPU in bottom-halves.
|
||||
DECLARE_BITMAP(valid_gpus, UVM_PARENT_ID_MAX_SUB_PROCESSORS);
|
||||
DECLARE_BITMAP(valid_gpus, UVM_ID_MAX_SUB_PROCESSORS);
|
||||
|
||||
// The gpu's uuid
|
||||
NvProcessorUuid uuid;
|
||||
@@ -906,8 +878,8 @@ struct uvm_parent_gpu_struct
|
||||
// hardware classes, etc.).
|
||||
UvmGpuInfo rm_info;
|
||||
|
||||
// A unique uvm gpu id in range [1, UVM_PARENT_ID_MAX_PROCESSORS)
|
||||
uvm_parent_gpu_id_t id;
|
||||
// A unique uvm gpu id in range [1, UVM_ID_MAX_PROCESSORS)
|
||||
uvm_gpu_id_t id;
|
||||
|
||||
// Reference to the Linux PCI device
|
||||
//
|
||||
@@ -942,13 +914,12 @@ struct uvm_parent_gpu_struct
|
||||
// dma_addressable_start (in bifSetupDmaWindow_IMPL()) and hence when
|
||||
// referencing sysmem from the GPU, dma_addressable_start should be
|
||||
// subtracted from the physical address. The DMA mapping helpers like
|
||||
// uvm_parent_gpu_map_cpu_pages() and uvm_parent_gpu_dma_alloc_page() take
|
||||
// care of that.
|
||||
// uvm_gpu_map_cpu_pages() and uvm_gpu_dma_alloc_page() take care of that.
|
||||
NvU64 dma_addressable_start;
|
||||
NvU64 dma_addressable_limit;
|
||||
|
||||
// Total size (in bytes) of physically mapped (with
|
||||
// uvm_parent_gpu_map_cpu_pages) sysmem pages, used for leak detection.
|
||||
// Total size (in bytes) of physically mapped (with uvm_gpu_map_cpu_pages)
|
||||
// sysmem pages, used for leak detection.
|
||||
atomic64_t mapped_cpu_pages_size;
|
||||
|
||||
// Hardware Abstraction Layer
|
||||
@@ -967,11 +938,7 @@ struct uvm_parent_gpu_struct
|
||||
// Virtualization mode of the GPU.
|
||||
UVM_VIRT_MODE virt_mode;
|
||||
|
||||
// Pascal+ GPUs can trigger faults on prefetch instructions. If false, this
|
||||
// feature must be disabled at all times in GPUs of the given architecture.
|
||||
// If true, the feature can be toggled at will by SW.
|
||||
//
|
||||
// The field should not be used unless the GPU supports replayable faults.
|
||||
// Whether the GPU can trigger faults on prefetch instructions
|
||||
bool prefetch_fault_supported;
|
||||
|
||||
// Number of membars required to flush out HSHUB following a TLB invalidate
|
||||
@@ -986,11 +953,6 @@ struct uvm_parent_gpu_struct
|
||||
|
||||
bool access_counters_supported;
|
||||
|
||||
// If this is true, physical address based access counter notifications are
|
||||
// potentially generated. If false, only virtual address based notifications
|
||||
// are generated (assuming access_counters_supported is true too).
|
||||
bool access_counters_can_use_physical_addresses;
|
||||
|
||||
bool fault_cancel_va_supported;
|
||||
|
||||
// True if the GPU has hardware support for scoped atomics
|
||||
@@ -1217,14 +1179,14 @@ struct uvm_parent_gpu_struct
|
||||
} smmu_war;
|
||||
};
|
||||
|
||||
static const char *uvm_parent_gpu_name(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
return parent_gpu->name;
|
||||
}
|
||||
|
||||
static const char *uvm_gpu_name(uvm_gpu_t *gpu)
|
||||
{
|
||||
return gpu->name;
|
||||
return gpu->parent->name;
|
||||
}
|
||||
|
||||
static const NvProcessorUuid *uvm_gpu_uuid(uvm_gpu_t *gpu)
|
||||
{
|
||||
return &gpu->parent->uuid;
|
||||
}
|
||||
|
||||
static uvmGpuDeviceHandle uvm_gpu_device_handle(uvm_gpu_t *gpu)
|
||||
@@ -1246,9 +1208,6 @@ struct uvm_gpu_peer_struct
|
||||
// - The global lock is held.
|
||||
//
|
||||
// - While the global lock was held in the past, the two GPUs were detected
|
||||
// to be SMC peers and were both retained.
|
||||
//
|
||||
// - While the global lock was held in the past, the two GPUs were detected
|
||||
// to be NVLINK peers and were both retained.
|
||||
//
|
||||
// - While the global lock was held in the past, the two GPUs were detected
|
||||
@@ -1334,17 +1293,17 @@ static uvm_gpu_phys_address_t uvm_gpu_page_to_phys_address(uvm_gpu_t *gpu, struc
|
||||
// Note that there is a uvm_gpu_get() function defined in uvm_global.h to break
|
||||
// a circular dep between global and gpu modules.
|
||||
|
||||
// Get a uvm_gpu_t by UUID (physical GPU UUID if SMC is not enabled, otherwise
|
||||
// GPU instance UUID).
|
||||
// This returns NULL if the GPU is not present.
|
||||
// This is the general purpose call that should be used normally.
|
||||
// Get a uvm_gpu_t by UUID. This returns NULL if the GPU is not present. This
|
||||
// is the general purpose call that should be used normally.
|
||||
// That is, unless a uvm_gpu_t for a specific SMC partition needs to be
|
||||
// retrieved, in which case uvm_gpu_get_by_parent_and_swizz_id() must be used
|
||||
// instead.
|
||||
//
|
||||
// LOCKING: requires the global lock to be held
|
||||
uvm_gpu_t *uvm_gpu_get_by_uuid(const NvProcessorUuid *gpu_uuid);
|
||||
|
||||
// Get a uvm_parent_gpu_t by UUID (physical GPU UUID).
|
||||
// Like uvm_gpu_get_by_uuid(), this function returns NULL if the GPU has not
|
||||
// been registered.
|
||||
// Get a uvm_parent_gpu_t by UUID. Like uvm_gpu_get_by_uuid(), this function
|
||||
// returns NULL if the GPU has not been registered.
|
||||
//
|
||||
// LOCKING: requires the global lock to be held
|
||||
uvm_parent_gpu_t *uvm_parent_gpu_get_by_uuid(const NvProcessorUuid *gpu_uuid);
|
||||
@@ -1355,6 +1314,13 @@ uvm_parent_gpu_t *uvm_parent_gpu_get_by_uuid(const NvProcessorUuid *gpu_uuid);
|
||||
// limited cases.
|
||||
uvm_parent_gpu_t *uvm_parent_gpu_get_by_uuid_locked(const NvProcessorUuid *gpu_uuid);
|
||||
|
||||
// Get the uvm_gpu_t for a partition by parent and swizzId. This returns NULL if
|
||||
// the partition hasn't been registered. This call needs to be used instead of
|
||||
// uvm_gpu_get_by_uuid() when a specific partition is targeted.
|
||||
//
|
||||
// LOCKING: requires the global lock to be held
|
||||
uvm_gpu_t *uvm_gpu_get_by_parent_and_swizz_id(uvm_parent_gpu_t *parent_gpu, NvU32 swizz_id);
|
||||
|
||||
// Retain a gpu by uuid
|
||||
// Returns the retained uvm_gpu_t in gpu_out on success
|
||||
//
|
||||
@@ -1385,7 +1351,7 @@ static NvU64 uvm_gpu_retained_count(uvm_gpu_t *gpu)
|
||||
void uvm_parent_gpu_kref_put(uvm_parent_gpu_t *gpu);
|
||||
|
||||
// Calculates peer table index using GPU ids.
|
||||
NvU32 uvm_gpu_peer_table_index(const uvm_gpu_id_t gpu_id0, const uvm_gpu_id_t gpu_id1);
|
||||
NvU32 uvm_gpu_peer_table_index(uvm_gpu_id_t gpu_id1, uvm_gpu_id_t gpu_id2);
|
||||
|
||||
// Either retains an existing PCIe peer entry or creates a new one. In both
|
||||
// cases the two GPUs are also each retained.
|
||||
@@ -1405,7 +1371,7 @@ uvm_aperture_t uvm_gpu_peer_aperture(uvm_gpu_t *local_gpu, uvm_gpu_t *remote_gpu
|
||||
uvm_processor_id_t uvm_gpu_get_processor_id_by_address(uvm_gpu_t *gpu, uvm_gpu_phys_address_t addr);
|
||||
|
||||
// Get the P2P capabilities between the gpus with the given indexes
|
||||
uvm_gpu_peer_t *uvm_gpu_index_peer_caps(const uvm_gpu_id_t gpu_id0, const uvm_gpu_id_t gpu_id1);
|
||||
uvm_gpu_peer_t *uvm_gpu_index_peer_caps(uvm_gpu_id_t gpu_id1, uvm_gpu_id_t gpu_id2);
|
||||
|
||||
// Get the P2P capabilities between the given gpus
|
||||
static uvm_gpu_peer_t *uvm_gpu_peer_caps(const uvm_gpu_t *gpu0, const uvm_gpu_t *gpu1)
|
||||
@@ -1413,10 +1379,10 @@ static uvm_gpu_peer_t *uvm_gpu_peer_caps(const uvm_gpu_t *gpu0, const uvm_gpu_t
|
||||
return uvm_gpu_index_peer_caps(gpu0->id, gpu1->id);
|
||||
}
|
||||
|
||||
static bool uvm_gpus_are_nvswitch_connected(const uvm_gpu_t *gpu0, const uvm_gpu_t *gpu1)
|
||||
static bool uvm_gpus_are_nvswitch_connected(uvm_gpu_t *gpu1, uvm_gpu_t *gpu2)
|
||||
{
|
||||
if (gpu0->parent->nvswitch_info.is_nvswitch_connected && gpu1->parent->nvswitch_info.is_nvswitch_connected) {
|
||||
UVM_ASSERT(uvm_gpu_peer_caps(gpu0, gpu1)->link_type >= UVM_GPU_LINK_NVLINK_2);
|
||||
if (gpu1->parent->nvswitch_info.is_nvswitch_connected && gpu2->parent->nvswitch_info.is_nvswitch_connected) {
|
||||
UVM_ASSERT(uvm_gpu_peer_caps(gpu1, gpu2)->link_type >= UVM_GPU_LINK_NVLINK_2);
|
||||
return true;
|
||||
}
|
||||
|
||||
@@ -1446,11 +1412,10 @@ static bool uvm_gpus_are_indirect_peers(uvm_gpu_t *gpu0, uvm_gpu_t *gpu1)
|
||||
// mapping covering the passed address, has been previously created.
|
||||
static uvm_gpu_address_t uvm_gpu_address_virtual_from_vidmem_phys(uvm_gpu_t *gpu, NvU64 pa)
|
||||
{
|
||||
UVM_ASSERT(uvm_mmu_parent_gpu_needs_static_vidmem_mapping(gpu->parent) ||
|
||||
uvm_mmu_parent_gpu_needs_dynamic_vidmem_mapping(gpu->parent));
|
||||
UVM_ASSERT(uvm_mmu_gpu_needs_static_vidmem_mapping(gpu) || uvm_mmu_gpu_needs_dynamic_vidmem_mapping(gpu));
|
||||
UVM_ASSERT(pa <= gpu->mem_info.max_allocatable_address);
|
||||
|
||||
if (uvm_mmu_parent_gpu_needs_static_vidmem_mapping(gpu->parent))
|
||||
if (uvm_mmu_gpu_needs_static_vidmem_mapping(gpu))
|
||||
UVM_ASSERT(gpu->static_flat_mapping.ready);
|
||||
|
||||
return uvm_gpu_address_virtual(gpu->parent->flat_vidmem_va_base + pa);
|
||||
@@ -1462,12 +1427,12 @@ static uvm_gpu_address_t uvm_gpu_address_virtual_from_vidmem_phys(uvm_gpu_t *gpu
|
||||
//
|
||||
// The actual GPU mapping only exists if a linear mapping covering the passed
|
||||
// address has been previously created.
|
||||
static uvm_gpu_address_t uvm_parent_gpu_address_virtual_from_sysmem_phys(uvm_parent_gpu_t *parent_gpu, NvU64 pa)
|
||||
static uvm_gpu_address_t uvm_gpu_address_virtual_from_sysmem_phys(uvm_gpu_t *gpu, NvU64 pa)
|
||||
{
|
||||
UVM_ASSERT(uvm_mmu_parent_gpu_needs_dynamic_sysmem_mapping(parent_gpu));
|
||||
UVM_ASSERT(pa <= (parent_gpu->dma_addressable_limit - parent_gpu->dma_addressable_start));
|
||||
UVM_ASSERT(uvm_mmu_gpu_needs_dynamic_sysmem_mapping(gpu));
|
||||
UVM_ASSERT(pa <= (gpu->parent->dma_addressable_limit - gpu->parent->dma_addressable_start));
|
||||
|
||||
return uvm_gpu_address_virtual(parent_gpu->flat_sysmem_va_base + pa);
|
||||
return uvm_gpu_address_virtual(gpu->parent->flat_sysmem_va_base + pa);
|
||||
}
|
||||
|
||||
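The comments in this hunk describe two address adjustments: sysmem physical addresses are made relative to dma_addressable_start, and the flat sysmem mapping adds flat_sysmem_va_base to that offset. The following is a minimal sketch of that arithmetic only, under the assumption that both steps are plain offset computations. The sketch_* type and function names are hypothetical and are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the three fields the hunk refers to. */
typedef struct {
    uint64_t dma_addressable_start;
    uint64_t dma_addressable_limit;
    uint64_t flat_sysmem_va_base;
} sketch_parent_gpu_t;

/* Convert a CPU physical address into the offset the GPU uses, by
 * subtracting dma_addressable_start as the struct comment describes. */
static uint64_t sketch_cpu_phys_to_gpu_phys(const sketch_parent_gpu_t *p, uint64_t cpu_pa)
{
    assert(cpu_pa >= p->dma_addressable_start);
    assert(cpu_pa <= p->dma_addressable_limit);
    return cpu_pa - p->dma_addressable_start;
}

/* Map an already-relative sysmem physical address into the flat virtual
 * mapping, mirroring the range check and addition shown in the diff. */
static uint64_t sketch_sysmem_phys_to_flat_va(const sketch_parent_gpu_t *p, uint64_t pa)
{
    assert(pa <= p->dma_addressable_limit - p->dma_addressable_start);
    return p->flat_sysmem_va_base + pa;
}
```

The range check in the second helper is expressed against the window size (limit minus start) because the address passed in is already relative to the start of the window.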
// Given a GPU or CPU physical address (not peer), retrieve an address suitable
|
||||
@@ -1477,12 +1442,11 @@ static uvm_gpu_address_t uvm_gpu_address_copy(uvm_gpu_t *gpu, uvm_gpu_phys_addre
|
||||
UVM_ASSERT(phys_addr.aperture == UVM_APERTURE_VID || phys_addr.aperture == UVM_APERTURE_SYS);
|
||||
|
||||
if (phys_addr.aperture == UVM_APERTURE_VID) {
|
||||
if (uvm_mmu_parent_gpu_needs_static_vidmem_mapping(gpu->parent) ||
|
||||
uvm_mmu_parent_gpu_needs_dynamic_vidmem_mapping(gpu->parent))
|
||||
if (uvm_mmu_gpu_needs_static_vidmem_mapping(gpu) || uvm_mmu_gpu_needs_dynamic_vidmem_mapping(gpu))
|
||||
return uvm_gpu_address_virtual_from_vidmem_phys(gpu, phys_addr.address);
|
||||
}
|
||||
else if (uvm_mmu_parent_gpu_needs_dynamic_sysmem_mapping(gpu->parent)) {
|
||||
return uvm_parent_gpu_address_virtual_from_sysmem_phys(gpu->parent, phys_addr.address);
|
||||
else if (uvm_mmu_gpu_needs_dynamic_sysmem_mapping(gpu)) {
|
||||
return uvm_gpu_address_virtual_from_sysmem_phys(gpu, phys_addr.address);
|
||||
}
|
||||
|
||||
return uvm_gpu_address_from_phys(phys_addr);
|
||||
@@ -1511,19 +1475,19 @@ NV_STATUS uvm_gpu_check_ecc_error_no_rm(uvm_gpu_t *gpu);
|
||||
//
|
||||
// Returns the physical address of the pages that can be used to access them on
|
||||
// the GPU.
|
||||
NV_STATUS uvm_parent_gpu_map_cpu_pages(uvm_parent_gpu_t *parent_gpu, struct page *page, size_t size, NvU64 *dma_address_out);
|
||||
NV_STATUS uvm_gpu_map_cpu_pages(uvm_parent_gpu_t *parent_gpu, struct page *page, size_t size, NvU64 *dma_address_out);
|
||||
|
||||
// Unmap num_pages pages previously mapped with uvm_parent_gpu_map_cpu_pages().
|
||||
void uvm_parent_gpu_unmap_cpu_pages(uvm_parent_gpu_t *parent_gpu, NvU64 dma_address, size_t size);
|
||||
// Unmap num_pages pages previously mapped with uvm_gpu_map_cpu_pages().
|
||||
void uvm_gpu_unmap_cpu_pages(uvm_parent_gpu_t *parent_gpu, NvU64 dma_address, size_t size);
|
||||
|
||||
static NV_STATUS uvm_parent_gpu_map_cpu_page(uvm_parent_gpu_t *parent_gpu, struct page *page, NvU64 *dma_address_out)
|
||||
static NV_STATUS uvm_gpu_map_cpu_page(uvm_parent_gpu_t *parent_gpu, struct page *page, NvU64 *dma_address_out)
|
||||
{
|
||||
return uvm_parent_gpu_map_cpu_pages(parent_gpu, page, PAGE_SIZE, dma_address_out);
|
||||
return uvm_gpu_map_cpu_pages(parent_gpu, page, PAGE_SIZE, dma_address_out);
|
||||
}
|
||||
|
||||
static void uvm_parent_gpu_unmap_cpu_page(uvm_parent_gpu_t *parent_gpu, NvU64 dma_address)
|
||||
static void uvm_gpu_unmap_cpu_page(uvm_parent_gpu_t *parent_gpu, NvU64 dma_address)
|
||||
{
|
||||
uvm_parent_gpu_unmap_cpu_pages(parent_gpu, dma_address, PAGE_SIZE);
|
||||
uvm_gpu_unmap_cpu_pages(parent_gpu, dma_address, PAGE_SIZE);
|
||||
}
|
||||
|
||||
// Allocate and map a page of system DMA memory on the GPU for physical access
|
||||
@@ -1532,13 +1496,13 @@ static void uvm_parent_gpu_unmap_cpu_page(uvm_parent_gpu_t *parent_gpu, NvU64 dm
|
||||
// - the address of the page that can be used to access them on
|
||||
// the GPU in the dma_address_out parameter.
|
||||
// - the address of allocated memory in CPU virtual address space.
|
||||
void *uvm_parent_gpu_dma_alloc_page(uvm_parent_gpu_t *parent_gpu,
|
||||
gfp_t gfp_flags,
|
||||
NvU64 *dma_address_out);
|
||||
void *uvm_gpu_dma_alloc_page(uvm_parent_gpu_t *parent_gpu,
|
||||
gfp_t gfp_flags,
|
||||
NvU64 *dma_address_out);
|
||||
|
||||
// Unmap and free size bytes of contiguous sysmem DMA previously allocated
|
||||
// with uvm_parent_gpu_map_cpu_pages().
|
||||
void uvm_parent_gpu_dma_free_page(uvm_parent_gpu_t *parent_gpu, void *va, NvU64 dma_address);
|
||||
// with uvm_gpu_map_cpu_pages().
|
||||
void uvm_gpu_dma_free_page(uvm_parent_gpu_t *parent_gpu, void *va, NvU64 dma_address);
|
||||
|
||||
// Returns whether the given range is within the GPU's addressable VA ranges.
|
||||
// It requires the input 'addr' to be in canonical form for platforms compliant
|
||||
@@ -1565,45 +1529,44 @@ bool uvm_platform_uses_canonical_form_address(void);
|
||||
// addresses.
|
||||
NvU64 uvm_parent_gpu_canonical_address(uvm_parent_gpu_t *parent_gpu, NvU64 addr);
|
||||
|
||||
static bool uvm_parent_gpu_is_coherent(const uvm_parent_gpu_t *parent_gpu)
|
||||
static bool uvm_gpu_is_coherent(const uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
return parent_gpu->system_bus.memory_window_end > parent_gpu->system_bus.memory_window_start;
|
||||
}
|
||||
|
||||
static bool uvm_parent_gpu_needs_pushbuffer_segments(uvm_parent_gpu_t *parent_gpu)
|
||||
static bool uvm_gpu_has_pushbuffer_segments(uvm_gpu_t *gpu)
|
||||
{
|
||||
return parent_gpu->max_host_va > (1ull << 40);
|
||||
return gpu->parent->max_host_va > (1ull << 40);
|
||||
}
|
||||
|
||||
static bool uvm_parent_gpu_supports_eviction(uvm_parent_gpu_t *parent_gpu)
|
||||
static bool uvm_gpu_supports_eviction(uvm_gpu_t *gpu)
|
||||
{
|
||||
// Eviction is supported only if the GPU supports replayable faults
|
||||
return parent_gpu->replayable_faults_supported;
|
||||
return gpu->parent->replayable_faults_supported;
|
||||
}
|
||||
|
||||
static bool uvm_parent_gpu_is_virt_mode_sriov_heavy(const uvm_parent_gpu_t *parent_gpu)
|
||||
static bool uvm_gpu_is_virt_mode_sriov_heavy(const uvm_gpu_t *gpu)
|
||||
{
|
||||
return parent_gpu->virt_mode == UVM_VIRT_MODE_SRIOV_HEAVY;
|
||||
return gpu->parent->virt_mode == UVM_VIRT_MODE_SRIOV_HEAVY;
|
||||
}
|
||||
|
||||
static bool uvm_parent_gpu_is_virt_mode_sriov_standard(const uvm_parent_gpu_t *parent_gpu)
|
||||
static bool uvm_gpu_is_virt_mode_sriov_standard(const uvm_gpu_t *gpu)
|
||||
{
|
||||
return parent_gpu->virt_mode == UVM_VIRT_MODE_SRIOV_STANDARD;
|
||||
return gpu->parent->virt_mode == UVM_VIRT_MODE_SRIOV_STANDARD;
|
||||
}
|
||||
|
||||
// Returns true if the virtualization mode is SR-IOV heavy or SR-IOV standard.
|
||||
static bool uvm_parent_gpu_is_virt_mode_sriov(const uvm_parent_gpu_t *parent_gpu)
|
||||
static bool uvm_gpu_is_virt_mode_sriov(const uvm_gpu_t *gpu)
|
||||
{
|
||||
return uvm_parent_gpu_is_virt_mode_sriov_heavy(parent_gpu) ||
|
||||
uvm_parent_gpu_is_virt_mode_sriov_standard(parent_gpu);
|
||||
return uvm_gpu_is_virt_mode_sriov_heavy(gpu) || uvm_gpu_is_virt_mode_sriov_standard(gpu);
|
||||
}
|
||||
|
||||
static bool uvm_parent_gpu_needs_proxy_channel_pool(const uvm_parent_gpu_t *parent_gpu)
|
||||
static bool uvm_gpu_uses_proxy_channel_pool(const uvm_gpu_t *gpu)
|
||||
{
|
||||
return uvm_parent_gpu_is_virt_mode_sriov_heavy(parent_gpu);
|
||||
return uvm_gpu_is_virt_mode_sriov_heavy(gpu);
|
||||
}
|
||||
|
||||
uvm_aperture_t uvm_get_page_tree_location(const uvm_parent_gpu_t *parent_gpu);
|
||||
uvm_aperture_t uvm_gpu_page_tree_init_location(const uvm_gpu_t *gpu);
|
||||
|
||||
// Debug print of GPU properties
|
||||
void uvm_gpu_print(uvm_gpu_t *gpu);
|
||||
@@ -1611,8 +1574,8 @@ void uvm_gpu_print(uvm_gpu_t *gpu);
|
||||
// Add the given instance pointer -> user_channel mapping to this GPU. The
|
||||
// bottom half GPU page fault handler uses this to look up the VA space for GPU
|
||||
// faults.
|
||||
NV_STATUS uvm_parent_gpu_add_user_channel(uvm_parent_gpu_t *parent_gpu, uvm_user_channel_t *user_channel);
|
||||
void uvm_parent_gpu_remove_user_channel(uvm_parent_gpu_t *parent_gpu, uvm_user_channel_t *user_channel);
|
||||
NV_STATUS uvm_gpu_add_user_channel(uvm_gpu_t *gpu, uvm_user_channel_t *user_channel);
|
||||
void uvm_gpu_remove_user_channel(uvm_gpu_t *gpu, uvm_user_channel_t *user_channel);
|
||||
|
||||
// Looks up an entry added by uvm_gpu_add_user_channel. Return codes:
|
||||
// NV_OK Translation successful
|
||||
@@ -1623,13 +1586,13 @@ void uvm_parent_gpu_remove_user_channel(uvm_parent_gpu_t *parent_gpu, uvm_user_c
|
||||
// out_va_space is valid if NV_OK is returned, otherwise it's NULL. The caller
|
||||
// is responsible for ensuring that the returned va_space can't be destroyed,
|
||||
// so these functions should only be called from the bottom half.
|
||||
NV_STATUS uvm_parent_gpu_fault_entry_to_va_space(uvm_parent_gpu_t *parent_gpu,
|
||||
uvm_fault_buffer_entry_t *fault,
|
||||
uvm_va_space_t **out_va_space);
|
||||
NV_STATUS uvm_gpu_fault_entry_to_va_space(uvm_gpu_t *gpu,
|
||||
uvm_fault_buffer_entry_t *fault,
|
||||
uvm_va_space_t **out_va_space);
|
||||
|
||||
NV_STATUS uvm_parent_gpu_access_counter_entry_to_va_space(uvm_parent_gpu_t *parent_gpu,
|
||||
uvm_access_counter_buffer_entry_t *entry,
|
||||
uvm_va_space_t **out_va_space);
|
||||
NV_STATUS uvm_gpu_access_counter_entry_to_va_space(uvm_gpu_t *gpu,
|
||||
uvm_access_counter_buffer_entry_t *entry,
|
||||
uvm_va_space_t **out_va_space);
|
||||
|
||||
typedef enum
|
||||
{
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*******************************************************************************
|
||||
Copyright (c) 2017-2024 NVIDIA Corporation
|
||||
Copyright (c) 2017-2022 NVIDIA Corporation
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to
|
||||
@@ -99,8 +99,7 @@ MODULE_PARM_DESC(uvm_perf_access_counter_threshold,
|
||||
"Number of remote accesses on a region required to trigger a notification."
|
||||
"Valid values: [1, 65535]");
|
||||
|
||||
static void access_counter_buffer_flush_locked(uvm_parent_gpu_t *parent_gpu,
|
||||
uvm_gpu_buffer_flush_mode_t flush_mode);
|
||||
static void access_counter_buffer_flush_locked(uvm_gpu_t *gpu, uvm_gpu_buffer_flush_mode_t flush_mode);
|
||||
|
||||
static uvm_perf_module_event_callback_desc_t g_callbacks_access_counters[] = {};
|
||||
|
||||
@@ -282,7 +281,7 @@ get_config_for_type(const uvm_access_counter_buffer_info_t *access_counters, uvm
|
||||
&(access_counters)->current_config.momc;
|
||||
}
|
||||
|
||||
bool uvm_parent_gpu_access_counters_pending(uvm_parent_gpu_t *parent_gpu)
|
||||
bool uvm_gpu_access_counters_pending(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
UVM_ASSERT(parent_gpu->access_counters_supported);
|
||||
|
||||
@@ -341,7 +340,7 @@ static void init_access_counter_types_config(const UvmGpuAccessCntrConfig *confi
|
||||
UVM_ASSERT(counter_type_config->sub_granularity_regions_per_translation <= UVM_SUB_GRANULARITY_REGIONS);
|
||||
}
|
||||
|
||||
NV_STATUS uvm_parent_gpu_init_access_counters(uvm_parent_gpu_t *parent_gpu)
|
||||
NV_STATUS uvm_gpu_init_access_counters(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
NV_STATUS status = NV_OK;
|
||||
uvm_access_counter_buffer_info_t *access_counters = &parent_gpu->access_counter_buffer_info;
|
||||
@@ -373,7 +372,7 @@ NV_STATUS uvm_parent_gpu_init_access_counters(uvm_parent_gpu_t *parent_gpu)
|
||||
if (status != NV_OK) {
|
||||
UVM_ERR_PRINT("Failed to init notify buffer info from RM: %s, GPU %s\n",
|
||||
nvstatusToString(status),
|
||||
uvm_parent_gpu_name(parent_gpu));
|
||||
parent_gpu->name);
|
||||
|
||||
// nvUvmInterfaceInitAccessCntrInfo may leave fields in rm_info
|
||||
// populated when it returns an error. Set the buffer handle to zero as
|
||||
@@ -407,7 +406,7 @@ NV_STATUS uvm_parent_gpu_init_access_counters(uvm_parent_gpu_t *parent_gpu)
|
||||
|
||||
if (access_counters->max_batch_size != uvm_perf_access_counter_batch_count) {
|
||||
pr_info("Invalid uvm_perf_access_counter_batch_count value on GPU %s: %u. Valid range [%u:%u] Using %u instead\n",
|
||||
uvm_parent_gpu_name(parent_gpu),
|
||||
parent_gpu->name,
|
||||
uvm_perf_access_counter_batch_count,
|
||||
UVM_PERF_ACCESS_COUNTER_BATCH_COUNT_MIN,
|
||||
access_counters->max_notifications,
|
||||
@@ -445,12 +444,12 @@ NV_STATUS uvm_parent_gpu_init_access_counters(uvm_parent_gpu_t *parent_gpu)
|
||||
return NV_OK;
|
||||
|
||||
fail:
|
||||
uvm_parent_gpu_deinit_access_counters(parent_gpu);
|
||||
uvm_gpu_deinit_access_counters(parent_gpu);
|
||||
|
||||
return status;
|
||||
}
|
||||
|
||||
void uvm_parent_gpu_deinit_access_counters(uvm_parent_gpu_t *parent_gpu)
|
||||
void uvm_gpu_deinit_access_counters(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
uvm_access_counter_buffer_info_t *access_counters = &parent_gpu->access_counter_buffer_info;
|
||||
uvm_access_counter_service_batch_context_t *batch_context = &access_counters->batch_service_context;
|
||||
@@ -476,7 +475,7 @@ void uvm_parent_gpu_deinit_access_counters(uvm_parent_gpu_t *parent_gpu)
|
||||
batch_context->phys.translations = NULL;
|
||||
}
|
||||
|
||||
bool uvm_parent_gpu_access_counters_required(const uvm_parent_gpu_t *parent_gpu)
|
||||
bool uvm_gpu_access_counters_required(const uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
if (!parent_gpu->access_counters_supported)
|
||||
return false;
|
||||
@@ -519,7 +518,7 @@ static NV_STATUS access_counters_take_ownership(uvm_gpu_t *gpu, UvmGpuAccessCntr
|
||||
// taken control of the notify buffer since the GPU was initialized. Then
|
||||
// flush old notifications. This will update the cached_put pointer.
|
||||
access_counters->cached_get = UVM_GPU_READ_ONCE(*access_counters->rm_info.pAccessCntrBufferGet);
|
||||
access_counter_buffer_flush_locked(gpu->parent, UVM_GPU_BUFFER_FLUSH_MODE_UPDATE_PUT);
|
||||
access_counter_buffer_flush_locked(gpu, UVM_GPU_BUFFER_FLUSH_MODE_UPDATE_PUT);
|
||||
|
||||
access_counters->current_config.threshold = config->threshold;
|
||||
|
||||
@@ -538,20 +537,20 @@ error:
|
||||
|
||||
// If ownership is yielded as part of reconfiguration, the access counters
|
||||
// handling refcount may not be 0
|
||||
static void access_counters_yield_ownership(uvm_parent_gpu_t *parent_gpu)
|
||||
static void access_counters_yield_ownership(uvm_gpu_t *gpu)
|
||||
{
|
||||
NV_STATUS status;
|
||||
uvm_access_counter_buffer_info_t *access_counters = &parent_gpu->access_counter_buffer_info;
|
||||
uvm_access_counter_buffer_info_t *access_counters = &gpu->parent->access_counter_buffer_info;
|
||||
|
||||
UVM_ASSERT(parent_gpu->access_counters_supported);
|
||||
UVM_ASSERT(uvm_sem_is_locked(&parent_gpu->isr.access_counters.service_lock));
|
||||
UVM_ASSERT(gpu->parent->access_counters_supported);
|
||||
UVM_ASSERT(uvm_sem_is_locked(&gpu->parent->isr.access_counters.service_lock));
|
||||
|
||||
// Wait for any pending clear operation before releasing ownership
|
||||
status = uvm_tracker_wait(&access_counters->clear_tracker);
|
||||
if (status != NV_OK)
|
||||
UVM_ASSERT(status == uvm_global_get_status());
|
||||
|
||||
status = uvm_rm_locked_call(nvUvmInterfaceDisableAccessCntr(parent_gpu->rm_device,
|
||||
status = uvm_rm_locked_call(nvUvmInterfaceDisableAccessCntr(gpu->parent->rm_device,
|
||||
&access_counters->rm_info));
|
||||
UVM_ASSERT(status == NV_OK);
|
||||
}
|
||||
@@ -580,14 +579,14 @@ static NV_STATUS gpu_access_counters_enable(uvm_gpu_t *gpu, UvmGpuAccessCntrConf
|
||||
|
||||
// Decrement the refcount of access counter enablement. If this is the last
|
||||
// reference, disable the HW feature.
|
||||
static void parent_gpu_access_counters_disable(uvm_parent_gpu_t *parent_gpu)
|
||||
static void gpu_access_counters_disable(uvm_gpu_t *gpu)
|
||||
{
|
||||
UVM_ASSERT(uvm_sem_is_locked(&parent_gpu->isr.access_counters.service_lock));
|
||||
UVM_ASSERT(parent_gpu->access_counters_supported);
|
||||
UVM_ASSERT(parent_gpu->isr.access_counters.handling_ref_count > 0);
|
||||
UVM_ASSERT(uvm_sem_is_locked(&gpu->parent->isr.access_counters.service_lock));
|
||||
UVM_ASSERT(gpu->parent->access_counters_supported);
|
||||
UVM_ASSERT(gpu->parent->isr.access_counters.handling_ref_count > 0);
|
||||
|
||||
if (--parent_gpu->isr.access_counters.handling_ref_count == 0)
|
||||
access_counters_yield_ownership(parent_gpu);
|
||||
if (--gpu->parent->isr.access_counters.handling_ref_count == 0)
|
||||
access_counters_yield_ownership(gpu);
|
||||
}
|
||||
|
||||
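The disable path above drops one reference and yields ownership of the hardware feature only when the count reaches zero. A minimal sketch of that pattern follows. The sketch_* names and the hw_enabled flag are hypothetical, and the enable side is an assumption, since the body of gpu_access_counters_enable() is not shown in this hunk. The real code also runs with the ISR service lock held, which the sketch omits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the access counter handling state. */
typedef struct {
    unsigned handling_ref_count;
    bool hw_enabled;
} sketch_counters_t;

/* Assumed enable path: the first reference turns the feature on. */
static void sketch_enable(sketch_counters_t *c)
{
    if (c->handling_ref_count++ == 0)
        c->hw_enabled = true;
}

/* Disable path as in the diff: the last reference turns the feature off. */
static void sketch_disable(sketch_counters_t *c)
{
    assert(c->handling_ref_count > 0);

    if (--c->handling_ref_count == 0)
        c->hw_enabled = false;
}
```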
// Invoked during registration of the GPU in the VA space
|
||||
@@ -597,9 +596,9 @@ NV_STATUS uvm_gpu_access_counters_enable(uvm_gpu_t *gpu, uvm_va_space_t *va_spac
|
||||
|
||||
UVM_ASSERT(gpu->parent->access_counters_supported);
|
||||
|
||||
uvm_parent_gpu_access_counters_isr_lock(gpu->parent);
|
||||
uvm_gpu_access_counters_isr_lock(gpu->parent);
|
||||
|
||||
if (uvm_parent_processor_mask_test(&va_space->access_counters_enabled_processors, gpu->parent->id)) {
|
||||
if (uvm_processor_mask_test(&va_space->access_counters_enabled_processors, gpu->id)) {
|
||||
status = NV_ERR_INVALID_DEVICE;
|
||||
}
|
||||
else {
|
||||
@@ -617,34 +616,32 @@ NV_STATUS uvm_gpu_access_counters_enable(uvm_gpu_t *gpu, uvm_va_space_t *va_spac
|
||||
// modified to protect from concurrent enablement of access counters in
|
||||
// another GPU
|
||||
if (status == NV_OK)
|
||||
uvm_parent_processor_mask_set_atomic(&va_space->access_counters_enabled_processors, gpu->parent->id);
|
||||
uvm_processor_mask_set_atomic(&va_space->access_counters_enabled_processors, gpu->id);
|
||||
}
|
||||
|
||||
// If this is the first reference taken on access counters, dropping the
|
||||
// ISR lock will enable interrupts.
|
||||
uvm_parent_gpu_access_counters_isr_unlock(gpu->parent);
|
||||
uvm_gpu_access_counters_isr_unlock(gpu->parent);
|
||||
|
||||
return status;
|
||||
}
|
||||
|
||||
void uvm_parent_gpu_access_counters_disable(uvm_parent_gpu_t *parent_gpu,
|
||||
uvm_va_space_t *va_space)
|
||||
void uvm_gpu_access_counters_disable(uvm_gpu_t *gpu, uvm_va_space_t *va_space)
|
||||
{
|
||||
UVM_ASSERT(parent_gpu->access_counters_supported);
|
||||
UVM_ASSERT(gpu->parent->access_counters_supported);
|
||||
|
||||
uvm_parent_gpu_access_counters_isr_lock(parent_gpu);
|
||||
uvm_gpu_access_counters_isr_lock(gpu->parent);
|
||||
|
||||
if (uvm_parent_processor_mask_test_and_clear_atomic(&va_space->access_counters_enabled_processors,
|
||||
parent_gpu->id)) {
|
||||
parent_gpu_access_counters_disable(parent_gpu);
|
||||
if (uvm_processor_mask_test_and_clear_atomic(&va_space->access_counters_enabled_processors, gpu->id)) {
|
||||
gpu_access_counters_disable(gpu);
|
||||
|
||||
// If this is VA space reconfigured access counters, clear the
|
||||
// ownership to allow for other processes to invoke the reconfiguration
|
||||
if (parent_gpu->access_counter_buffer_info.reconfiguration_owner == va_space)
|
||||
parent_gpu->access_counter_buffer_info.reconfiguration_owner = NULL;
|
||||
if (gpu->parent->access_counter_buffer_info.reconfiguration_owner == va_space)
|
||||
gpu->parent->access_counter_buffer_info.reconfiguration_owner = NULL;
|
||||
}
|
||||
|
||||
uvm_parent_gpu_access_counters_isr_unlock(parent_gpu);
|
||||
uvm_gpu_access_counters_isr_unlock(gpu->parent);
|
||||
}
|
||||
|
||||
static void write_get(uvm_parent_gpu_t *parent_gpu, NvU32 get)
|
||||
@@ -663,16 +660,15 @@ static void write_get(uvm_parent_gpu_t *parent_gpu, NvU32 get)
|
||||
UVM_GPU_WRITE_ONCE(*access_counters->rm_info.pAccessCntrBufferGet, get);
|
||||
}
|
||||
|
||||
static void access_counter_buffer_flush_locked(uvm_parent_gpu_t *parent_gpu,
|
||||
uvm_gpu_buffer_flush_mode_t flush_mode)
|
||||
static void access_counter_buffer_flush_locked(uvm_gpu_t *gpu, uvm_gpu_buffer_flush_mode_t flush_mode)
|
||||
{
|
||||
NvU32 get;
|
||||
NvU32 put;
|
||||
uvm_spin_loop_t spin;
|
||||
uvm_access_counter_buffer_info_t *access_counters = &parent_gpu->access_counter_buffer_info;
|
||||
uvm_access_counter_buffer_info_t *access_counters = &gpu->parent->access_counter_buffer_info;
|
||||
|
||||
UVM_ASSERT(uvm_sem_is_locked(&parent_gpu->isr.access_counters.service_lock));
|
||||
UVM_ASSERT(parent_gpu->access_counters_supported);
|
||||
UVM_ASSERT(uvm_sem_is_locked(&gpu->parent->isr.access_counters.service_lock));
|
||||
UVM_ASSERT(gpu->parent->access_counters_supported);
|
||||
|
||||
// Read PUT pointer from the GPU if requested
|
||||
UVM_ASSERT(flush_mode != UVM_GPU_BUFFER_FLUSH_MODE_WAIT_UPDATE_PUT);
|
||||
@@ -684,28 +680,28 @@ static void access_counter_buffer_flush_locked(uvm_parent_gpu_t *parent_gpu,
|
||||
|
||||
while (get != put) {
|
||||
// Wait until valid bit is set
|
||||
UVM_SPIN_WHILE(!parent_gpu->access_counter_buffer_hal->entry_is_valid(parent_gpu, get), &spin);
|
||||
UVM_SPIN_WHILE(!gpu->parent->access_counter_buffer_hal->entry_is_valid(gpu->parent, get), &spin);
|
||||
|
||||
parent_gpu->access_counter_buffer_hal->entry_clear_valid(parent_gpu, get);
|
||||
gpu->parent->access_counter_buffer_hal->entry_clear_valid(gpu->parent, get);
|
||||
++get;
|
||||
if (get == access_counters->max_notifications)
|
||||
get = 0;
|
||||
}
|
||||
|
||||
write_get(parent_gpu, get);
|
||||
write_get(gpu->parent, get);
|
||||
}
|
||||
|
||||
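The flush routine above walks the notification buffer from GET to PUT, clears the valid bit of each entry, wraps at max_notifications, and finally writes GET back. The sketch below shows that loop over an in-memory array. The sketch_* names and the fixed buffer size are hypothetical, and where the driver spins until the hardware sets the valid bit, the sketch simply asserts that it is already set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_NOTIFICATIONS 8u /* hypothetical buffer size */

/* Hypothetical stand-in for the notification buffer state. */
typedef struct {
    bool valid[SKETCH_MAX_NOTIFICATIONS];
    uint32_t cached_get;
} sketch_buffer_t;

/* Discard every entry in [get, put), wrapping around the end of the
 * buffer, and return how many entries were discarded. */
static uint32_t sketch_flush(sketch_buffer_t *buf, uint32_t put)
{
    uint32_t get = buf->cached_get;
    uint32_t flushed = 0;

    assert(put < SKETCH_MAX_NOTIFICATIONS);

    while (get != put) {
        /* The driver waits here for the valid bit; the sketch assumes
         * the entry has already been written. */
        assert(buf->valid[get]);
        buf->valid[get] = false;
        ++flushed;

        if (++get == SKETCH_MAX_NOTIFICATIONS)
            get = 0;
    }

    /* Equivalent of write_get(): publish the new GET position. */
    buf->cached_get = get;
    return flushed;
}
```

Comparing GET and PUT for inequality, rather than with a less-than test, is what lets the loop cross the wrap point without special cases.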
void uvm_parent_gpu_access_counter_buffer_flush(uvm_parent_gpu_t *parent_gpu)
|
||||
void uvm_gpu_access_counter_buffer_flush(uvm_gpu_t *gpu)
|
||||
{
|
||||
UVM_ASSERT(parent_gpu->access_counters_supported);
|
||||
UVM_ASSERT(gpu->parent->access_counters_supported);
|
||||
|
||||
// Disables access counter interrupts and notification servicing
|
||||
uvm_parent_gpu_access_counters_isr_lock(parent_gpu);
|
||||
uvm_gpu_access_counters_isr_lock(gpu->parent);
|
||||
|
||||
if (parent_gpu->isr.access_counters.handling_ref_count > 0)
|
||||
access_counter_buffer_flush_locked(parent_gpu, UVM_GPU_BUFFER_FLUSH_MODE_UPDATE_PUT);
|
||||
if (gpu->parent->isr.access_counters.handling_ref_count > 0)
|
||||
access_counter_buffer_flush_locked(gpu, UVM_GPU_BUFFER_FLUSH_MODE_UPDATE_PUT);
|
||||
|
||||
uvm_parent_gpu_access_counters_isr_unlock(parent_gpu);
|
||||
uvm_gpu_access_counters_isr_unlock(gpu->parent);
|
||||
}
|
||||
|
||||
static inline int cmp_access_counter_instance_ptr(const uvm_access_counter_buffer_entry_t *a,
|
||||
@@ -879,7 +875,7 @@ done:
|
||||
return notification_index;
|
||||
}
|
||||
|
||||
static void translate_virt_notifications_instance_ptrs(uvm_parent_gpu_t *parent_gpu,
|
||||
static void translate_virt_notifications_instance_ptrs(uvm_gpu_t *gpu,
|
||||
uvm_access_counter_service_batch_context_t *batch_context)
|
||||
{
|
||||
NvU32 i;
|
||||
@@ -893,9 +889,9 @@ static void translate_virt_notifications_instance_ptrs(uvm_parent_gpu_t *parent_
|
||||
// If instance_ptr is different, make a new translation. If the
|
||||
// translation fails then va_space will be NULL and the entry will
|
||||
// simply be ignored in subsequent processing.
|
||||
status = uvm_parent_gpu_access_counter_entry_to_va_space(parent_gpu,
|
||||
current_entry,
|
||||
&current_entry->virtual_info.va_space);
|
||||
status = uvm_gpu_access_counter_entry_to_va_space(gpu,
|
||||
current_entry,
|
||||
&current_entry->virtual_info.va_space);
|
||||
if (status != NV_OK)
|
||||
UVM_ASSERT(current_entry->virtual_info.va_space == NULL);
|
||||
}
|
||||
@@ -908,7 +904,7 @@ static void translate_virt_notifications_instance_ptrs(uvm_parent_gpu_t *parent_
|
||||
// GVA notifications provide an instance_ptr and ve_id that can be directly
|
||||
// translated to a VA space. In order to minimize translations, we sort the
|
||||
// entries by instance_ptr, va_space and notification address in that order.
|
||||
static void preprocess_virt_notifications(uvm_parent_gpu_t *parent_gpu,
|
||||
static void preprocess_virt_notifications(uvm_gpu_t *gpu,
|
||||
uvm_access_counter_service_batch_context_t *batch_context)
|
||||
{
|
||||
if (!batch_context->virt.is_single_instance_ptr) {
|
||||
@@ -919,7 +915,7 @@ static void preprocess_virt_notifications(uvm_parent_gpu_t *parent_gpu,
|
||||
NULL);
|
||||
}
|
||||
|
||||
translate_virt_notifications_instance_ptrs(parent_gpu, batch_context);
|
||||
translate_virt_notifications_instance_ptrs(gpu, batch_context);
|
||||
|
||||
sort(batch_context->virt.notifications,
|
||||
batch_context->virt.num_notifications,
|
||||
@@ -978,7 +974,6 @@ static NV_STATUS service_va_block_locked(uvm_processor_id_t processor,
|
||||
uvm_page_index_t last_page_index;
|
||||
NvU32 page_count = 0;
|
||||
const uvm_page_mask_t *residency_mask;
|
||||
const bool hmm_migratable = true;
|
||||
|
||||
uvm_assert_mutex_locked(&va_block->lock);
|
||||
|
||||
@@ -995,7 +990,7 @@ static NV_STATUS service_va_block_locked(uvm_processor_id_t processor,
|
||||
return NV_OK;
|
||||
|
||||
if (uvm_processor_mask_test(&va_block->resident, processor))
|
||||
residency_mask = uvm_va_block_resident_mask_get(va_block, processor, NUMA_NO_NODE);
|
||||
residency_mask = uvm_va_block_resident_mask_get(va_block, processor);
|
||||
else
|
||||
residency_mask = NULL;
|
||||
|
||||
@@ -1031,7 +1026,7 @@ static NV_STATUS service_va_block_locked(uvm_processor_id_t processor,
|
||||
if (!iter.migratable)
|
||||
continue;
|
||||
|
||||
thrashing_hint = uvm_perf_thrashing_get_hint(va_block, service_context->block_context, address, processor);
|
||||
thrashing_hint = uvm_perf_thrashing_get_hint(va_block, address, processor);
|
||||
if (thrashing_hint.type == UVM_PERF_THRASHING_HINT_TYPE_THROTTLE) {
|
||||
// If the page is throttling, ignore the access counter
|
||||
// notification
|
||||
@@ -1046,8 +1041,8 @@ static NV_STATUS service_va_block_locked(uvm_processor_id_t processor,
|
||||
|
||||
// If the underlying VMA is gone, skip HMM migrations.
|
||||
if (uvm_va_block_is_hmm(va_block)) {
|
||||
status = uvm_hmm_find_vma(service_context->block_context->mm,
|
||||
&service_context->block_context->hmm.vma,
|
||||
status = uvm_hmm_find_vma(service_context->block_context.mm,
|
||||
&service_context->block_context.hmm.vma,
|
||||
address);
|
||||
if (status == NV_ERR_INVALID_ADDRESS)
|
||||
continue;
|
||||
@@ -1058,14 +1053,13 @@ static NV_STATUS service_va_block_locked(uvm_processor_id_t processor,
|
||||
policy = uvm_va_policy_get(va_block, address);
|
||||
|
||||
new_residency = uvm_va_block_select_residency(va_block,
|
||||
service_context->block_context,
|
||||
&service_context->block_context,
|
||||
page_index,
|
||||
processor,
|
||||
uvm_fault_access_type_mask_bit(UVM_FAULT_ACCESS_TYPE_PREFETCH),
|
||||
policy,
|
||||
&thrashing_hint,
|
||||
UVM_SERVICE_OPERATION_ACCESS_COUNTERS,
|
||||
hmm_migratable,
|
||||
&read_duplicate);
|
||||
|
||||
if (!uvm_processor_mask_test_and_set(&service_context->resident_processors, new_residency))
|
||||
@@ -1087,14 +1081,14 @@ static NV_STATUS service_va_block_locked(uvm_processor_id_t processor,
|
||||
// pages to be serviced
|
||||
if (page_count > 0) {
|
||||
uvm_processor_id_t id;
|
||||
uvm_processor_mask_t *update_processors = &service_context->update_processors;
|
||||
uvm_processor_mask_t update_processors;
|
||||
|
||||
uvm_processor_mask_and(update_processors, &va_block->resident, &service_context->resident_processors);
|
||||
uvm_processor_mask_and(&update_processors, &va_block->resident, &service_context->resident_processors);
|
||||
|
||||
// Remove pages that are already resident in the destination processors
|
||||
for_each_id_in_mask(id, update_processors) {
|
||||
for_each_id_in_mask(id, &update_processors) {
|
||||
bool migrate_pages;
|
||||
uvm_page_mask_t *residency_mask = uvm_va_block_resident_mask_get(va_block, id, NUMA_NO_NODE);
|
||||
uvm_page_mask_t *residency_mask = uvm_va_block_resident_mask_get(va_block, id);
|
||||
UVM_ASSERT(residency_mask);
|
||||
|
||||
migrate_pages = uvm_page_mask_andnot(&service_context->per_processor_masks[uvm_id_value(id)].new_residency,
|
||||
@@ -1112,9 +1106,9 @@ static NV_STATUS service_va_block_locked(uvm_processor_id_t processor,
|
||||
|
||||
if (uvm_va_block_is_hmm(va_block)) {
|
||||
status = NV_ERR_INVALID_ADDRESS;
|
||||
if (service_context->block_context->mm) {
|
||||
if (service_context->block_context.mm) {
|
||||
status = uvm_hmm_find_policy_vma_and_outer(va_block,
|
||||
&service_context->block_context->hmm.vma,
|
||||
&service_context->block_context.hmm.vma,
|
||||
first_page_index,
|
||||
&policy,
|
||||
&outer);
|
||||
@@ -1216,18 +1210,18 @@ static NV_STATUS service_phys_single_va_block(uvm_gpu_t *gpu,
|
||||
|
||||
service_context->operation = UVM_SERVICE_OPERATION_ACCESS_COUNTERS;
|
||||
service_context->num_retries = 0;
|
||||
service_context->block_context.mm = mm;
|
||||
|
||||
uvm_va_block_context_init(service_context->block_context, mm);
|
||||
|
||||
if (uvm_va_block_is_hmm(va_block))
|
||||
if (uvm_va_block_is_hmm(va_block)) {
|
||||
uvm_hmm_service_context_init(service_context);
|
||||
uvm_hmm_migrate_begin_wait(va_block);
|
||||
}
|
||||
|
||||
uvm_mutex_lock(&va_block->lock);
|
||||
|
||||
reverse_mappings_to_va_block_page_mask(va_block, reverse_mappings, num_reverse_mappings, accessed_pages);
|
||||
|
||||
status = UVM_VA_BLOCK_RETRY_LOCKED(va_block,
|
||||
&va_block_retry,
|
||||
status = UVM_VA_BLOCK_RETRY_LOCKED(va_block, &va_block_retry,
|
||||
service_va_block_locked(processor,
|
||||
va_block,
|
||||
&va_block_retry,
|
||||
@@ -1236,15 +1230,9 @@ static NV_STATUS service_phys_single_va_block(uvm_gpu_t *gpu,
|
||||
|
||||
uvm_mutex_unlock(&va_block->lock);
|
||||
|
||||
if (uvm_va_block_is_hmm(va_block)) {
|
||||
if (uvm_va_block_is_hmm(va_block))
|
||||
uvm_hmm_migrate_finish(va_block);
|
||||
|
||||
// If the pages could not be migrated, no need to try again,
|
||||
// this is best effort only.
|
||||
if (status == NV_WARN_MORE_PROCESSING_REQUIRED || status == NV_WARN_MISMATCHED_TARGET)
|
||||
status = NV_OK;
|
||||
}
|
||||
|
||||
if (status == NV_OK)
|
||||
*out_flags |= UVM_ACCESS_COUNTER_ACTION_CLEAR;
|
||||
}
|
||||
@@ -1416,7 +1404,7 @@ static NV_STATUS service_phys_notification(uvm_gpu_t *gpu,
|
||||
sub_granularity = 1;
|
||||
|
||||
if (UVM_ID_IS_GPU(current_entry->physical_info.resident_id)) {
|
||||
resident_gpu = uvm_gpu_get(current_entry->physical_info.resident_id);
|
||||
resident_gpu = uvm_gpu_get_by_processor_id(current_entry->physical_info.resident_id);
|
||||
UVM_ASSERT(resident_gpu != NULL);
|
||||
|
||||
if (gpu != resident_gpu && uvm_gpus_are_nvswitch_connected(gpu, resident_gpu)) {
|
||||
@@ -1472,8 +1460,6 @@ static NV_STATUS service_phys_notifications(uvm_gpu_t *gpu,
|
||||
NvU32 i;
|
||||
uvm_access_counter_buffer_entry_t **notifications = batch_context->phys.notifications;
|
||||
|
||||
UVM_ASSERT(gpu->parent->access_counters_can_use_physical_addresses);
|
||||
|
||||
preprocess_phys_notifications(batch_context);
|
||||
|
||||
for (i = 0; i < batch_context->phys.num_notifications; ++i) {
|
||||
@@ -1511,6 +1497,7 @@ static NV_STATUS service_notification_va_block_helper(struct mm_struct *mm,
|
||||
|
||||
service_context->operation = UVM_SERVICE_OPERATION_ACCESS_COUNTERS;
|
||||
service_context->num_retries = 0;
|
||||
service_context->block_context.mm = mm;
|
||||
|
||||
return UVM_VA_BLOCK_RETRY_LOCKED(va_block,
|
||||
&va_block_retry,
|
||||
@@ -1523,7 +1510,6 @@ static NV_STATUS service_notification_va_block_helper(struct mm_struct *mm,
|
||||
|
||||
static void expand_notification_block(uvm_gpu_va_space_t *gpu_va_space,
|
||||
uvm_va_block_t *va_block,
|
||||
uvm_va_block_context_t *va_block_context,
|
||||
uvm_page_mask_t *accessed_pages,
|
||||
const uvm_access_counter_buffer_entry_t *current_entry)
|
||||
{
|
||||
@@ -1551,7 +1537,7 @@ static void expand_notification_block(uvm_gpu_va_space_t *gpu_va_space,
|
||||
|
||||
page_index = uvm_va_block_cpu_page_index(va_block, addr);
|
||||
|
||||
resident_id = uvm_va_block_page_get_closest_resident(va_block, va_block_context, page_index, gpu->id);
|
||||
resident_id = uvm_va_block_page_get_closest_resident(va_block, page_index, gpu->id);
|
||||
|
||||
// resident_id might be invalid or might already be the same as the GPU
|
||||
// which received the notification if the memory was already migrated before
|
||||
@@ -1573,7 +1559,7 @@ static void expand_notification_block(uvm_gpu_va_space_t *gpu_va_space,
|
||||
unsigned long sub_granularity = current_entry->sub_granularity;
|
||||
NvU32 num_regions = config->sub_granularity_regions_per_translation;
|
||||
NvU32 num_sub_pages = config->sub_granularity_region_size / PAGE_SIZE;
|
||||
uvm_page_mask_t *resident_mask = uvm_va_block_resident_mask_get(va_block, resident_id, NUMA_NO_NODE);
|
||||
uvm_page_mask_t *resident_mask = uvm_va_block_resident_mask_get(va_block, resident_id);
|
||||
|
||||
UVM_ASSERT(num_sub_pages >= 1);
|
||||
|
||||
@@ -1607,7 +1593,6 @@ static NV_STATUS service_virt_notifications_in_block(uvm_gpu_va_space_t *gpu_va_
|
||||
uvm_va_space_t *va_space = gpu_va_space->va_space;
|
||||
uvm_page_mask_t *accessed_pages = &batch_context->accessed_pages;
|
||||
uvm_access_counter_buffer_entry_t **notifications = batch_context->virt.notifications;
|
||||
uvm_service_block_context_t *service_context = &batch_context->block_service_context;
|
||||
|
||||
UVM_ASSERT(va_block);
|
||||
UVM_ASSERT(index < batch_context->virt.num_notifications);
|
||||
@@ -1616,24 +1601,16 @@ static NV_STATUS service_virt_notifications_in_block(uvm_gpu_va_space_t *gpu_va_
|
||||
|
||||
uvm_page_mask_zero(accessed_pages);
|
||||
|
||||
uvm_va_block_context_init(service_context->block_context, mm);
|
||||
|
||||
uvm_mutex_lock(&va_block->lock);
|
||||
|
||||
for (i = index; i < batch_context->virt.num_notifications; i++) {
|
||||
uvm_access_counter_buffer_entry_t *current_entry = notifications[i];
|
||||
NvU64 address = current_entry->address.address;
|
||||
|
||||
if ((current_entry->virtual_info.va_space == va_space) && (address <= va_block->end)) {
|
||||
expand_notification_block(gpu_va_space,
|
||||
va_block,
|
||||
batch_context->block_service_context.block_context,
|
||||
accessed_pages,
|
||||
current_entry);
|
||||
}
|
||||
else {
|
||||
if ((current_entry->virtual_info.va_space == va_space) && (address <= va_block->end))
|
||||
expand_notification_block(gpu_va_space, va_block, accessed_pages, current_entry);
|
||||
else
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
*out_index = i;
|
||||
@@ -1828,7 +1805,7 @@ static NV_STATUS service_virt_notifications(uvm_gpu_t *gpu,
|
||||
0);
|
||||
}
|
||||
|
||||
preprocess_virt_notifications(gpu->parent, batch_context);
|
||||
preprocess_virt_notifications(gpu, batch_context);
|
||||
|
||||
while (i < batch_context->virt.num_notifications) {
|
||||
uvm_access_counter_buffer_entry_t *current_entry = batch_context->virt.notifications[i];
|
||||
@@ -1896,17 +1873,13 @@ void uvm_gpu_service_access_counters(uvm_gpu_t *gpu)
|
||||
|
||||
++batch_context->batch_id;
|
||||
|
||||
if (batch_context->virt.num_notifications) {
|
||||
status = service_virt_notifications(gpu, batch_context);
|
||||
if (status != NV_OK)
|
||||
break;
|
||||
}
|
||||
status = service_virt_notifications(gpu, batch_context);
|
||||
if (status != NV_OK)
|
||||
break;
|
||||
|
||||
if (batch_context->phys.num_notifications) {
|
||||
status = service_phys_notifications(gpu, batch_context);
|
||||
if (status != NV_OK)
|
||||
break;
|
||||
}
|
||||
status = service_phys_notifications(gpu, batch_context);
|
||||
if (status != NV_OK)
|
||||
break;
|
||||
}
|
||||
|
||||
if (status != NV_OK) {
|
||||
@@ -2002,7 +1975,7 @@ NV_STATUS uvm_test_access_counters_enabled_by_default(UVM_TEST_ACCESS_COUNTERS_E
|
||||
if (!gpu)
|
||||
return NV_ERR_INVALID_DEVICE;
|
||||
|
||||
params->enabled = uvm_parent_gpu_access_counters_required(gpu->parent);
|
||||
params->enabled = uvm_gpu_access_counters_required(gpu->parent);
|
||||
|
||||
uvm_gpu_release(gpu);
|
||||
|
||||
@@ -2034,7 +2007,7 @@ NV_STATUS uvm_test_reconfigure_access_counters(UVM_TEST_RECONFIGURE_ACCESS_COUNT
|
||||
// ISR lock ensures that we own GET/PUT registers. It disables interrupts
|
||||
// and ensures that no other thread (nor the top half) will be able to
|
||||
// re-enable interrupts during reconfiguration.
|
||||
uvm_parent_gpu_access_counters_isr_lock(gpu->parent);
|
||||
uvm_gpu_access_counters_isr_lock(gpu->parent);
|
||||
|
||||
uvm_va_space_down_read_rm(va_space);
|
||||
|
||||
@@ -2067,11 +2040,11 @@ NV_STATUS uvm_test_reconfigure_access_counters(UVM_TEST_RECONFIGURE_ACCESS_COUNT
|
||||
goto exit_isr_unlock;
|
||||
}
|
||||
|
||||
if (!uvm_parent_processor_mask_test(&va_space->access_counters_enabled_processors, gpu->parent->id)) {
|
||||
if (!uvm_processor_mask_test(&va_space->access_counters_enabled_processors, gpu->id)) {
|
||||
status = gpu_access_counters_enable(gpu, &config);
|
||||
|
||||
if (status == NV_OK)
|
||||
uvm_parent_processor_mask_set_atomic(&va_space->access_counters_enabled_processors, gpu->parent->id);
|
||||
uvm_processor_mask_set_atomic(&va_space->access_counters_enabled_processors, gpu->id);
|
||||
else
|
||||
goto exit_isr_unlock;
|
||||
}
|
||||
@@ -2083,7 +2056,7 @@ NV_STATUS uvm_test_reconfigure_access_counters(UVM_TEST_RECONFIGURE_ACCESS_COUNT
|
||||
// enabled in at least gpu. This inconsistent state is not visible to other
|
||||
// threads or VA spaces because of the ISR lock, and it is immediately
|
||||
// rectified by retaking ownership.
|
||||
access_counters_yield_ownership(gpu->parent);
|
||||
access_counters_yield_ownership(gpu);
|
||||
status = access_counters_take_ownership(gpu, &config);
|
||||
|
||||
// Retaking ownership failed, so RM owns the interrupt.
|
||||
@@ -2097,8 +2070,8 @@ NV_STATUS uvm_test_reconfigure_access_counters(UVM_TEST_RECONFIGURE_ACCESS_COUNT
|
||||
"Access counters interrupt still owned by RM, other VA spaces may experience failures");
|
||||
}
|
||||
|
||||
uvm_parent_processor_mask_clear_atomic(&va_space->access_counters_enabled_processors, gpu->parent->id);
|
||||
parent_gpu_access_counters_disable(gpu->parent);
|
||||
uvm_processor_mask_clear_atomic(&va_space->access_counters_enabled_processors, gpu->id);
|
||||
gpu_access_counters_disable(gpu);
|
||||
goto exit_isr_unlock;
|
||||
}
|
||||
|
||||
@@ -2114,7 +2087,7 @@ exit_isr_unlock:
|
||||
if (status != NV_OK)
|
||||
uvm_va_space_up_read_rm(va_space);
|
||||
|
||||
uvm_parent_gpu_access_counters_isr_unlock(gpu->parent);
|
||||
uvm_gpu_access_counters_isr_unlock(gpu->parent);
|
||||
|
||||
exit_release_gpu:
|
||||
uvm_gpu_release(gpu);
|
||||
@@ -2146,7 +2119,7 @@ NV_STATUS uvm_test_reset_access_counters(UVM_TEST_RESET_ACCESS_COUNTERS_PARAMS *
|
||||
goto exit_release_gpu;
|
||||
}
|
||||
|
||||
uvm_parent_gpu_access_counters_isr_lock(gpu->parent);
|
||||
uvm_gpu_access_counters_isr_lock(gpu->parent);
|
||||
|
||||
// Access counters not enabled. Nothing to reset
|
||||
if (gpu->parent->isr.access_counters.handling_ref_count == 0)
|
||||
@@ -2176,7 +2149,7 @@ NV_STATUS uvm_test_reset_access_counters(UVM_TEST_RESET_ACCESS_COUNTERS_PARAMS *
|
||||
status = uvm_tracker_wait(&access_counters->clear_tracker);
|
||||
|
||||
exit_isr_unlock:
|
||||
uvm_parent_gpu_access_counters_isr_unlock(gpu->parent);
|
||||
uvm_gpu_access_counters_isr_unlock(gpu->parent);
|
||||
|
||||
exit_release_gpu:
|
||||
uvm_gpu_release(gpu);
|
||||
@@ -2184,42 +2157,42 @@ exit_release_gpu:
|
||||
return status;
|
||||
}
|
||||
|
||||
void uvm_parent_gpu_access_counters_set_ignore(uvm_parent_gpu_t *parent_gpu, bool do_ignore)
|
||||
void uvm_gpu_access_counters_set_ignore(uvm_gpu_t *gpu, bool do_ignore)
|
||||
{
|
||||
bool change_intr_state = false;
|
||||
|
||||
if (!parent_gpu->access_counters_supported)
|
||||
if (!gpu->parent->access_counters_supported)
|
||||
return;
|
||||
|
||||
uvm_parent_gpu_access_counters_isr_lock(parent_gpu);
|
||||
uvm_gpu_access_counters_isr_lock(gpu->parent);
|
||||
|
||||
if (do_ignore) {
|
||||
if (parent_gpu->access_counter_buffer_info.notifications_ignored_count++ == 0)
|
||||
if (gpu->parent->access_counter_buffer_info.notifications_ignored_count++ == 0)
|
||||
change_intr_state = true;
|
||||
}
|
||||
else {
|
||||
UVM_ASSERT(parent_gpu->access_counter_buffer_info.notifications_ignored_count >= 1);
|
||||
if (--parent_gpu->access_counter_buffer_info.notifications_ignored_count == 0)
|
||||
UVM_ASSERT(gpu->parent->access_counter_buffer_info.notifications_ignored_count >= 1);
|
||||
if (--gpu->parent->access_counter_buffer_info.notifications_ignored_count == 0)
|
||||
change_intr_state = true;
|
||||
}
|
||||
|
||||
if (change_intr_state) {
|
||||
// We need to avoid an interrupt storm while ignoring notifications. We
|
||||
// just disable the interrupt.
|
||||
uvm_spin_lock_irqsave(&parent_gpu->isr.interrupts_lock);
|
||||
uvm_spin_lock_irqsave(&gpu->parent->isr.interrupts_lock);
|
||||
|
||||
if (do_ignore)
|
||||
uvm_parent_gpu_access_counters_intr_disable(parent_gpu);
|
||||
uvm_gpu_access_counters_intr_disable(gpu->parent);
|
||||
else
|
||||
uvm_parent_gpu_access_counters_intr_enable(parent_gpu);
|
||||
uvm_gpu_access_counters_intr_enable(gpu->parent);
|
||||
|
||||
uvm_spin_unlock_irqrestore(&parent_gpu->isr.interrupts_lock);
|
||||
uvm_spin_unlock_irqrestore(&gpu->parent->isr.interrupts_lock);
|
||||
|
||||
if (!do_ignore)
|
||||
access_counter_buffer_flush_locked(parent_gpu, UVM_GPU_BUFFER_FLUSH_MODE_CACHED_PUT);
|
||||
access_counter_buffer_flush_locked(gpu, UVM_GPU_BUFFER_FLUSH_MODE_CACHED_PUT);
|
||||
}
|
||||
|
||||
uvm_parent_gpu_access_counters_isr_unlock(parent_gpu);
|
||||
uvm_gpu_access_counters_isr_unlock(gpu->parent);
|
||||
}
|
||||
|
||||
NV_STATUS uvm_test_set_ignore_access_counters(UVM_TEST_SET_IGNORE_ACCESS_COUNTERS_PARAMS *params, struct file *filp)
|
||||
@@ -2233,7 +2206,7 @@ NV_STATUS uvm_test_set_ignore_access_counters(UVM_TEST_SET_IGNORE_ACCESS_COUNTER
|
||||
return NV_ERR_INVALID_DEVICE;
|
||||
|
||||
if (gpu->parent->access_counters_supported)
|
||||
uvm_parent_gpu_access_counters_set_ignore(gpu->parent, params->ignore);
|
||||
uvm_gpu_access_counters_set_ignore(gpu, params->ignore);
|
||||
else
|
||||
status = NV_ERR_NOT_SUPPORTED;
|
||||
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*******************************************************************************
|
||||
Copyright (c) 2017-2024 NVIDIA Corporation
|
||||
Copyright (c) 2017 NVIDIA Corporation
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to
|
||||
@@ -27,13 +27,13 @@
|
||||
#include "uvm_forward_decl.h"
|
||||
#include "uvm_test_ioctl.h"
|
||||
|
||||
NV_STATUS uvm_parent_gpu_init_access_counters(uvm_parent_gpu_t *parent_gpu);
|
||||
void uvm_parent_gpu_deinit_access_counters(uvm_parent_gpu_t *parent_gpu);
|
||||
bool uvm_parent_gpu_access_counters_pending(uvm_parent_gpu_t *parent_gpu);
|
||||
NV_STATUS uvm_gpu_init_access_counters(uvm_parent_gpu_t *parent_gpu);
|
||||
void uvm_gpu_deinit_access_counters(uvm_parent_gpu_t *parent_gpu);
|
||||
bool uvm_gpu_access_counters_pending(uvm_parent_gpu_t *parent_gpu);
|
||||
|
||||
void uvm_gpu_service_access_counters(uvm_gpu_t *gpu);
|
||||
|
||||
void uvm_parent_gpu_access_counter_buffer_flush(uvm_parent_gpu_t *parent_gpu);
|
||||
void uvm_gpu_access_counter_buffer_flush(uvm_gpu_t *gpu);
|
||||
|
||||
// Ignore or unignore access counters notifications. Ignoring means that the
|
||||
// bottom half is a no-op which just leaves notifications in the HW buffer
|
||||
@@ -46,7 +46,7 @@ void uvm_parent_gpu_access_counter_buffer_flush(uvm_parent_gpu_t *parent_gpu);
|
||||
//
|
||||
// When uningoring, the interrupt conditions will be re-evaluated to trigger
|
||||
// processing of buffered notifications, if any exist.
|
||||
void uvm_parent_gpu_access_counters_set_ignore(uvm_parent_gpu_t *parent_gpu, bool do_ignore);
|
||||
void uvm_gpu_access_counters_set_ignore(uvm_gpu_t *gpu, bool do_ignore);
|
||||
|
||||
// Return whether the VA space has access counter migrations enabled. The
|
||||
// caller must ensure that the VA space cannot go away.
|
||||
@@ -63,7 +63,7 @@ void uvm_perf_access_counters_unload(uvm_va_space_t *va_space);
|
||||
|
||||
// Check whether access counters should be enabled when the given GPU is
|
||||
// registered on any VA space.
|
||||
bool uvm_parent_gpu_access_counters_required(const uvm_parent_gpu_t *parent_gpu);
|
||||
bool uvm_gpu_access_counters_required(const uvm_parent_gpu_t *parent_gpu);
|
||||
|
||||
// Functions used to enable/disable access counters on a GPU in the given VA
|
||||
// space.
|
||||
@@ -72,12 +72,12 @@ bool uvm_parent_gpu_access_counters_required(const uvm_parent_gpu_t *parent_gpu)
|
||||
// counters are currently enabled. The hardware notifications and interrupts on
|
||||
// the GPU are enabled the first time any VA space invokes
|
||||
// uvm_gpu_access_counters_enable, and disabled when the last VA space invokes
|
||||
// uvm_parent_gpu_access_counters_disable().
|
||||
// uvm_gpu_access_counters_disable
|
||||
//
|
||||
// Locking: the VA space lock must not be held by the caller since these
|
||||
// functions may take the access counters ISR lock.
|
||||
NV_STATUS uvm_gpu_access_counters_enable(uvm_gpu_t *gpu, uvm_va_space_t *va_space);
|
||||
void uvm_parent_gpu_access_counters_disable(uvm_parent_gpu_t *parent_gpu, uvm_va_space_t *va_space);
|
||||
void uvm_gpu_access_counters_disable(uvm_gpu_t *gpu, uvm_va_space_t *va_space);
|
||||
|
||||
NV_STATUS uvm_test_access_counters_enabled_by_default(UVM_TEST_ACCESS_COUNTERS_ENABLED_BY_DEFAULT_PARAMS *params,
|
||||
struct file *filp);
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*******************************************************************************
|
||||
Copyright (c) 2016-2024 NVIDIA Corporation
|
||||
Copyright (c) 2016-2021 NVIDIA Corporation
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to
|
||||
@@ -67,21 +67,21 @@ static void access_counters_isr_bottom_half_entry(void *args);
|
||||
// interrupts should be disabled. The caller is guaranteed that replayable page
|
||||
// faults are disabled upon return. Interrupts might already be disabled prior
|
||||
// to making this call. Each call is ref-counted, so this must be paired with a
|
||||
// call to uvm_parent_gpu_replayable_faults_intr_enable().
|
||||
// call to uvm_gpu_replayable_faults_intr_enable().
|
||||
//
|
||||
// parent_gpu->isr.interrupts_lock must be held to call this function.
|
||||
static void uvm_parent_gpu_replayable_faults_intr_disable(uvm_parent_gpu_t *parent_gpu);
|
||||
static void uvm_gpu_replayable_faults_intr_disable(uvm_parent_gpu_t *parent_gpu);
|
||||
|
||||
// Decrements the reference count tracking whether replayable page fault
|
||||
// interrupts should be disabled. Only once the count reaches 0 are the HW
|
||||
// interrupts actually enabled, so this call does not guarantee that the
|
||||
// interrupts have been re-enabled upon return.
|
||||
//
|
||||
// uvm_parent_gpu_replayable_faults_intr_disable() must have been called prior
|
||||
// to calling this function.
|
||||
// uvm_gpu_replayable_faults_intr_disable() must have been called prior to
|
||||
// calling this function.
|
||||
//
|
||||
// parent_gpu->isr.interrupts_lock must be held to call this function.
|
||||
static void uvm_parent_gpu_replayable_faults_intr_enable(uvm_parent_gpu_t *parent_gpu);
|
||||
static void uvm_gpu_replayable_faults_intr_enable(uvm_parent_gpu_t *parent_gpu);
|
||||
|
||||
static unsigned schedule_replayable_faults_handler(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
@@ -100,7 +100,7 @@ static unsigned schedule_replayable_faults_handler(uvm_parent_gpu_t *parent_gpu)
|
||||
if (down_trylock(&parent_gpu->isr.replayable_faults.service_lock.sem) != 0)
|
||||
return 0;
|
||||
|
||||
if (!uvm_parent_gpu_replayable_faults_pending(parent_gpu)) {
|
||||
if (!uvm_gpu_replayable_faults_pending(parent_gpu)) {
|
||||
up(&parent_gpu->isr.replayable_faults.service_lock.sem);
|
||||
return 0;
|
||||
}
|
||||
@@ -108,7 +108,7 @@ static unsigned schedule_replayable_faults_handler(uvm_parent_gpu_t *parent_gpu)
|
||||
nv_kref_get(&parent_gpu->gpu_kref);
|
||||
|
||||
// Interrupts need to be disabled here to avoid an interrupt storm
|
||||
uvm_parent_gpu_replayable_faults_intr_disable(parent_gpu);
|
||||
uvm_gpu_replayable_faults_intr_disable(parent_gpu);
|
||||
|
||||
// Schedule a bottom half, but do *not* release the GPU ISR lock. The bottom
|
||||
// half releases the GPU ISR lock as part of its cleanup.
|
||||
@@ -137,7 +137,7 @@ static unsigned schedule_non_replayable_faults_handler(uvm_parent_gpu_t *parent_
|
||||
// interrupts will be triggered by the gpu and faults may stay
|
||||
// unserviced. Therefore, if there is a fault in the queue, we schedule
|
||||
// a bottom half unconditionally.
|
||||
if (!uvm_parent_gpu_non_replayable_faults_pending(parent_gpu))
|
||||
if (!uvm_gpu_non_replayable_faults_pending(parent_gpu))
|
||||
return 0;
|
||||
|
||||
nv_kref_get(&parent_gpu->gpu_kref);
|
||||
@@ -167,7 +167,7 @@ static unsigned schedule_access_counters_handler(uvm_parent_gpu_t *parent_gpu)
|
||||
if (down_trylock(&parent_gpu->isr.access_counters.service_lock.sem) != 0)
|
||||
return 0;
|
||||
|
||||
if (!uvm_parent_gpu_access_counters_pending(parent_gpu)) {
|
||||
if (!uvm_gpu_access_counters_pending(parent_gpu)) {
|
||||
up(&parent_gpu->isr.access_counters.service_lock.sem);
|
||||
return 0;
|
||||
}
|
||||
@@ -175,7 +175,7 @@ static unsigned schedule_access_counters_handler(uvm_parent_gpu_t *parent_gpu)
|
||||
nv_kref_get(&parent_gpu->gpu_kref);
|
||||
|
||||
// Interrupts need to be disabled to avoid an interrupt storm
|
||||
uvm_parent_gpu_access_counters_intr_disable(parent_gpu);
|
||||
uvm_gpu_access_counters_intr_disable(parent_gpu);
|
||||
|
||||
nv_kthread_q_schedule_q_item(&parent_gpu->isr.bottom_half_q,
|
||||
&parent_gpu->isr.access_counters.bottom_half_q_item);
|
||||
@@ -288,18 +288,17 @@ static NV_STATUS init_queue_on_node(nv_kthread_q_t *queue, const char *name, int
|
||||
return errno_to_nv_status(nv_kthread_q_init(queue, name));
|
||||
}
|
||||
|
||||
NV_STATUS uvm_parent_gpu_init_isr(uvm_parent_gpu_t *parent_gpu)
|
||||
NV_STATUS uvm_gpu_init_isr(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
NV_STATUS status = NV_OK;
|
||||
char kthread_name[TASK_COMM_LEN + 1];
|
||||
uvm_va_block_context_t *block_context;
|
||||
|
||||
if (parent_gpu->replayable_faults_supported) {
|
||||
status = uvm_parent_gpu_fault_buffer_init(parent_gpu);
|
||||
status = uvm_gpu_fault_buffer_init(parent_gpu);
|
||||
if (status != NV_OK) {
|
||||
UVM_ERR_PRINT("Failed to initialize GPU fault buffer: %s, GPU: %s\n",
|
||||
nvstatusToString(status),
|
||||
uvm_parent_gpu_name(parent_gpu));
|
||||
parent_gpu->name);
|
||||
return status;
|
||||
}
|
||||
|
||||
@@ -312,20 +311,14 @@ NV_STATUS uvm_parent_gpu_init_isr(uvm_parent_gpu_t *parent_gpu)
|
||||
if (!parent_gpu->isr.replayable_faults.stats.cpu_exec_count)
|
||||
return NV_ERR_NO_MEMORY;
|
||||
|
||||
block_context = uvm_va_block_context_alloc(NULL);
|
||||
if (!block_context)
|
||||
return NV_ERR_NO_MEMORY;
|
||||
|
||||
parent_gpu->fault_buffer_info.replayable.block_service_context.block_context = block_context;
|
||||
|
||||
parent_gpu->isr.replayable_faults.handling = true;
|
||||
|
||||
snprintf(kthread_name, sizeof(kthread_name), "UVM GPU%u BH", uvm_parent_id_value(parent_gpu->id));
|
||||
snprintf(kthread_name, sizeof(kthread_name), "UVM GPU%u BH", uvm_id_value(parent_gpu->id));
|
||||
status = init_queue_on_node(&parent_gpu->isr.bottom_half_q, kthread_name, parent_gpu->closest_cpu_numa_node);
|
||||
if (status != NV_OK) {
|
||||
UVM_ERR_PRINT("Failed in nv_kthread_q_init for bottom_half_q: %s, GPU %s\n",
|
||||
nvstatusToString(status),
|
||||
uvm_parent_gpu_name(parent_gpu));
|
||||
parent_gpu->name);
|
||||
return status;
|
||||
}
|
||||
|
||||
@@ -340,42 +333,29 @@ NV_STATUS uvm_parent_gpu_init_isr(uvm_parent_gpu_t *parent_gpu)
|
||||
if (!parent_gpu->isr.non_replayable_faults.stats.cpu_exec_count)
|
||||
return NV_ERR_NO_MEMORY;
|
||||
|
||||
block_context = uvm_va_block_context_alloc(NULL);
|
||||
if (!block_context)
|
||||
return NV_ERR_NO_MEMORY;
|
||||
|
||||
parent_gpu->fault_buffer_info.non_replayable.block_service_context.block_context = block_context;
|
||||
|
||||
parent_gpu->isr.non_replayable_faults.handling = true;
|
||||
|
||||
snprintf(kthread_name, sizeof(kthread_name), "UVM GPU%u KC", uvm_parent_id_value(parent_gpu->id));
|
||||
snprintf(kthread_name, sizeof(kthread_name), "UVM GPU%u KC", uvm_id_value(parent_gpu->id));
|
||||
status = init_queue_on_node(&parent_gpu->isr.kill_channel_q,
|
||||
kthread_name,
|
||||
parent_gpu->closest_cpu_numa_node);
|
||||
if (status != NV_OK) {
|
||||
UVM_ERR_PRINT("Failed in nv_kthread_q_init for kill_channel_q: %s, GPU %s\n",
|
||||
nvstatusToString(status),
|
||||
uvm_parent_gpu_name(parent_gpu));
|
||||
parent_gpu->name);
|
||||
return status;
|
||||
}
|
||||
}
|
||||
|
||||
if (parent_gpu->access_counters_supported) {
|
||||
status = uvm_parent_gpu_init_access_counters(parent_gpu);
|
||||
status = uvm_gpu_init_access_counters(parent_gpu);
|
||||
if (status != NV_OK) {
|
||||
UVM_ERR_PRINT("Failed to initialize GPU access counters: %s, GPU: %s\n",
|
||||
nvstatusToString(status),
|
||||
uvm_parent_gpu_name(parent_gpu));
|
||||
parent_gpu->name);
|
||||
return status;
|
||||
}
|
||||
|
||||
block_context = uvm_va_block_context_alloc(NULL);
|
||||
if (!block_context)
|
||||
return NV_ERR_NO_MEMORY;
|
||||
|
||||
parent_gpu->access_counter_buffer_info.batch_service_context.block_service_context.block_context =
|
||||
block_context;
|
||||
|
||||
nv_kthread_q_item_init(&parent_gpu->isr.access_counters.bottom_half_q_item,
|
||||
access_counters_isr_bottom_half_entry,
|
||||
parent_gpu);
|
||||
@@ -393,13 +373,13 @@ NV_STATUS uvm_parent_gpu_init_isr(uvm_parent_gpu_t *parent_gpu)
|
||||
return NV_OK;
|
||||
}
|
||||
|
||||
void uvm_parent_gpu_flush_bottom_halves(uvm_parent_gpu_t *parent_gpu)
|
||||
void uvm_gpu_flush_bottom_halves(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
nv_kthread_q_flush(&parent_gpu->isr.bottom_half_q);
|
||||
nv_kthread_q_flush(&parent_gpu->isr.kill_channel_q);
|
||||
}
|
||||
|
||||
void uvm_parent_gpu_disable_isr(uvm_parent_gpu_t *parent_gpu)
|
||||
void uvm_gpu_disable_isr(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
UVM_ASSERT(parent_gpu->isr.access_counters.handling_ref_count == 0);
|
||||
|
||||
@@ -408,7 +388,7 @@ void uvm_parent_gpu_disable_isr(uvm_parent_gpu_t *parent_gpu)
|
||||
// any more bottom halves.
|
||||
uvm_spin_lock_irqsave(&parent_gpu->isr.interrupts_lock);
|
||||
|
||||
uvm_parent_gpu_replayable_faults_intr_disable(parent_gpu);
|
||||
uvm_gpu_replayable_faults_intr_disable(parent_gpu);
|
||||
|
||||
parent_gpu->isr.replayable_faults.was_handling = parent_gpu->isr.replayable_faults.handling;
|
||||
parent_gpu->isr.non_replayable_faults.was_handling = parent_gpu->isr.non_replayable_faults.handling;
|
||||
@@ -423,57 +403,44 @@ void uvm_parent_gpu_disable_isr(uvm_parent_gpu_t *parent_gpu)
|
||||
// bottom half never take the global lock, since we're holding it here.
|
||||
//
|
||||
// Note that it's safe to call nv_kthread_q_stop() even if
|
||||
// nv_kthread_q_init() failed in uvm_parent_gpu_init_isr().
|
||||
// nv_kthread_q_init() failed in uvm_gpu_init_isr().
|
||||
nv_kthread_q_stop(&parent_gpu->isr.bottom_half_q);
|
||||
nv_kthread_q_stop(&parent_gpu->isr.kill_channel_q);
|
||||
}
|
||||
|
||||
void uvm_parent_gpu_deinit_isr(uvm_parent_gpu_t *parent_gpu)
|
||||
void uvm_gpu_deinit_isr(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
uvm_va_block_context_t *block_context;
|
||||
|
||||
// Return ownership to RM:
|
||||
if (parent_gpu->isr.replayable_faults.was_handling) {
|
||||
// No user threads could have anything left on
|
||||
// replayable_faults.disable_intr_ref_count since they must retain the
|
||||
// GPU across uvm_parent_gpu_replayable_faults_isr_lock/
|
||||
// uvm_parent_gpu_replayable_faults_isr_unlock. This means the
|
||||
// uvm_parent_gpu_replayable_faults_disable_intr above could only have
|
||||
// raced with bottom halves.
|
||||
// GPU across uvm_gpu_replayable_faults_isr_lock/
|
||||
// uvm_gpu_replayable_faults_isr_unlock. This means the
|
||||
// uvm_gpu_replayable_faults_disable_intr above could only have raced
|
||||
// with bottom halves.
|
||||
//
|
||||
// If we cleared replayable_faults.handling before the bottom half got
|
||||
// to its uvm_parent_gpu_replayable_faults_isr_unlock, when it
|
||||
// eventually reached uvm_parent_gpu_replayable_faults_isr_unlock it
|
||||
// would have skipped the disable, leaving us with extra ref counts
|
||||
// here.
|
||||
// to its uvm_gpu_replayable_faults_isr_unlock, when it eventually
|
||||
// reached uvm_gpu_replayable_faults_isr_unlock it would have skipped
|
||||
// the disable, leaving us with extra ref counts here.
|
||||
//
|
||||
// In any case we're guaranteed that replayable faults interrupts are
|
||||
// disabled and can't get re-enabled, so we can safely ignore the ref
|
||||
// count value and just clean things up.
|
||||
UVM_ASSERT_MSG(parent_gpu->isr.replayable_faults.disable_intr_ref_count > 0,
|
||||
"%s replayable_faults.disable_intr_ref_count: %llu\n",
|
||||
uvm_parent_gpu_name(parent_gpu),
|
||||
parent_gpu->name,
|
||||
parent_gpu->isr.replayable_faults.disable_intr_ref_count);
|
||||
|
||||
uvm_parent_gpu_fault_buffer_deinit(parent_gpu);
|
||||
uvm_gpu_fault_buffer_deinit(parent_gpu);
|
||||
}
|
||||
|
||||
if (parent_gpu->access_counters_supported) {
|
||||
// It is safe to deinitialize access counters even if they have not been
|
||||
// successfully initialized.
|
||||
uvm_parent_gpu_deinit_access_counters(parent_gpu);
|
||||
block_context =
|
||||
parent_gpu->access_counter_buffer_info.batch_service_context.block_service_context.block_context;
|
||||
uvm_va_block_context_free(block_context);
|
||||
uvm_gpu_deinit_access_counters(parent_gpu);
|
||||
}
|
||||
|
||||
if (parent_gpu->non_replayable_faults_supported) {
|
||||
block_context = parent_gpu->fault_buffer_info.non_replayable.block_service_context.block_context;
|
||||
uvm_va_block_context_free(block_context);
|
||||
}
|
||||
|
||||
block_context = parent_gpu->fault_buffer_info.replayable.block_service_context.block_context;
|
||||
uvm_va_block_context_free(block_context);
|
||||
uvm_kvfree(parent_gpu->isr.replayable_faults.stats.cpu_exec_count);
|
||||
uvm_kvfree(parent_gpu->isr.non_replayable_faults.stats.cpu_exec_count);
|
||||
uvm_kvfree(parent_gpu->isr.access_counters.stats.cpu_exec_count);
|
||||
@@ -481,6 +448,7 @@ void uvm_parent_gpu_deinit_isr(uvm_parent_gpu_t *parent_gpu)
|
||||
|
||||
static uvm_gpu_t *find_first_valid_gpu(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
uvm_global_gpu_id_t global_gpu_id = uvm_global_gpu_id_from_gpu_id(parent_gpu->id);
|
||||
uvm_gpu_t *gpu;
|
||||
|
||||
// When SMC is enabled, there's no longer a 1:1 relationship between the
|
||||
@@ -495,10 +463,10 @@ static uvm_gpu_t *find_first_valid_gpu(uvm_parent_gpu_t *parent_gpu)
|
||||
|
||||
uvm_spin_lock_irqsave(&g_uvm_global.gpu_table_lock);
|
||||
|
||||
sub_processor_index = find_first_bit(parent_gpu->valid_gpus, UVM_PARENT_ID_MAX_SUB_PROCESSORS);
|
||||
sub_processor_index = find_first_bit(parent_gpu->valid_gpus, UVM_ID_MAX_SUB_PROCESSORS);
|
||||
|
||||
if (sub_processor_index < UVM_PARENT_ID_MAX_SUB_PROCESSORS) {
|
||||
gpu = parent_gpu->gpus[sub_processor_index];
|
||||
if (sub_processor_index < UVM_ID_MAX_SUB_PROCESSORS) {
|
||||
gpu = uvm_gpu_get(uvm_global_id_from_value(uvm_global_id_value(global_gpu_id) + sub_processor_index));
|
||||
UVM_ASSERT(gpu != NULL);
|
||||
}
|
||||
else {
|
||||
@@ -508,7 +476,7 @@ static uvm_gpu_t *find_first_valid_gpu(uvm_parent_gpu_t *parent_gpu)
|
||||
uvm_spin_unlock_irqrestore(&g_uvm_global.gpu_table_lock);
|
||||
}
|
||||
else {
|
||||
gpu = parent_gpu->gpus[0];
|
||||
gpu = uvm_gpu_get(global_gpu_id);
|
||||
UVM_ASSERT(gpu != NULL);
|
||||
}
|
||||
|
||||
@@ -547,12 +515,12 @@ static void replayable_faults_isr_bottom_half(void *args)
|
||||
|
||||
uvm_gpu_service_replayable_faults(gpu);
|
||||
|
||||
uvm_parent_gpu_replayable_faults_isr_unlock(parent_gpu);
|
||||
uvm_gpu_replayable_faults_isr_unlock(parent_gpu);
|
||||
|
||||
put_kref:
|
||||
// It is OK to drop a reference on the parent GPU if a bottom half has
|
||||
// been retriggered within uvm_parent_gpu_replayable_faults_isr_unlock,
|
||||
// because the rescheduling added an additional reference.
|
||||
// been retriggered within uvm_gpu_replayable_faults_isr_unlock, because the
|
||||
// rescheduling added an additional reference.
|
||||
uvm_parent_gpu_kref_put(parent_gpu);
|
||||
}
|
||||
|
||||
@@ -573,7 +541,7 @@ static void non_replayable_faults_isr_bottom_half(void *args)
|
||||
|
||||
UVM_ASSERT(parent_gpu->non_replayable_faults_supported);
|
||||
|
||||
uvm_parent_gpu_non_replayable_faults_isr_lock(parent_gpu);
|
||||
uvm_gpu_non_replayable_faults_isr_lock(parent_gpu);
|
||||
|
||||
// Multiple bottom halves for non-replayable faults can be running
|
||||
// concurrently, but only one can enter this section for a given GPU
|
||||
@@ -586,7 +554,7 @@ static void non_replayable_faults_isr_bottom_half(void *args)
|
||||
|
||||
uvm_gpu_service_non_replayable_fault_buffer(gpu);
|
||||
|
||||
uvm_parent_gpu_non_replayable_faults_isr_unlock(parent_gpu);
|
||||
uvm_gpu_non_replayable_faults_isr_unlock(parent_gpu);
|
||||
|
||||
put_kref:
|
||||
uvm_parent_gpu_kref_put(parent_gpu);
|
||||
@@ -622,7 +590,7 @@ static void access_counters_isr_bottom_half(void *args)
|
||||
|
||||
uvm_gpu_service_access_counters(gpu);
|
||||
|
||||
uvm_parent_gpu_access_counters_isr_unlock(parent_gpu);
|
||||
uvm_gpu_access_counters_isr_unlock(parent_gpu);
|
||||
|
||||
put_kref:
|
||||
uvm_parent_gpu_kref_put(parent_gpu);
|
||||
@@ -651,7 +619,7 @@ static void replayable_faults_retrigger_bottom_half(uvm_parent_gpu_t *parent_gpu
|
||||
//
|
||||
// (1) UVM didn't process all the entries up to cached PUT
|
||||
//
|
||||
// (2) UVM did process all the entries up to cached PUT, but GSP-RM
|
||||
// (2) UVM did process all the entries up to cached PUT, but GPS-RM
|
||||
// added new entries such that cached PUT is out-of-date
|
||||
//
|
||||
// In both cases, re-enablement of interrupts would have caused the
|
||||
@@ -663,7 +631,7 @@ static void replayable_faults_retrigger_bottom_half(uvm_parent_gpu_t *parent_gpu
|
||||
// While in the typical case the retriggering happens within a replayable
|
||||
// fault bottom half, it can also happen within a non-interrupt path such as
|
||||
// uvm_gpu_fault_buffer_flush.
|
||||
if (g_uvm_global.conf_computing_enabled)
|
||||
if (uvm_conf_computing_mode_enabled_parent(parent_gpu))
|
||||
retrigger = true;
|
||||
|
||||
if (!retrigger)
|
||||
@@ -678,7 +646,7 @@ static void replayable_faults_retrigger_bottom_half(uvm_parent_gpu_t *parent_gpu
|
||||
uvm_spin_unlock_irqrestore(&parent_gpu->isr.interrupts_lock);
|
||||
}
|
||||
|
||||
void uvm_parent_gpu_replayable_faults_isr_lock(uvm_parent_gpu_t *parent_gpu)
|
||||
void uvm_gpu_replayable_faults_isr_lock(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
UVM_ASSERT(nv_kref_read(&parent_gpu->gpu_kref) > 0);
|
||||
|
||||
@@ -687,7 +655,7 @@ void uvm_parent_gpu_replayable_faults_isr_lock(uvm_parent_gpu_t *parent_gpu)
|
||||
// Bump the disable ref count. This guarantees that the bottom half or
|
||||
// another thread trying to take the replayable_faults.service_lock won't
|
||||
// inadvertently re-enable interrupts during this locking sequence.
|
||||
uvm_parent_gpu_replayable_faults_intr_disable(parent_gpu);
|
||||
uvm_gpu_replayable_faults_intr_disable(parent_gpu);
|
||||
|
||||
uvm_spin_unlock_irqrestore(&parent_gpu->isr.interrupts_lock);
|
||||
|
||||
@@ -696,7 +664,7 @@ void uvm_parent_gpu_replayable_faults_isr_lock(uvm_parent_gpu_t *parent_gpu)
|
||||
uvm_down(&parent_gpu->isr.replayable_faults.service_lock);
|
||||
}
|
||||
|
||||
void uvm_parent_gpu_replayable_faults_isr_unlock(uvm_parent_gpu_t *parent_gpu)
|
||||
void uvm_gpu_replayable_faults_isr_unlock(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
UVM_ASSERT(nv_kref_read(&parent_gpu->gpu_kref) > 0);
|
||||
|
||||
@@ -733,10 +701,9 @@ void uvm_parent_gpu_replayable_faults_isr_unlock(uvm_parent_gpu_t *parent_gpu)
|
||||
// Note that if we're in the bottom half and the GPU was removed before
|
||||
// we checked replayable_faults.handling, we won't drop our interrupt
|
||||
// disable ref count from the corresponding top-half call to
|
||||
// uvm_parent_gpu_replayable_faults_intr_disable. That's ok because
|
||||
// remove_gpu ignores the refcount after waiting for the bottom half to
|
||||
// finish.
|
||||
uvm_parent_gpu_replayable_faults_intr_enable(parent_gpu);
|
||||
// uvm_gpu_replayable_faults_intr_disable. That's ok because remove_gpu
|
||||
// ignores the refcount after waiting for the bottom half to finish.
|
||||
uvm_gpu_replayable_faults_intr_enable(parent_gpu);
|
||||
|
||||
// Rearm pulse interrupts. This guarantees that the state of the pending
|
||||
// interrupt is current and the top level rearm performed by RM is only
|
||||
@@ -763,42 +730,42 @@ void uvm_parent_gpu_replayable_faults_isr_unlock(uvm_parent_gpu_t *parent_gpu)
|
||||
replayable_faults_retrigger_bottom_half(parent_gpu);
|
||||
}
|
||||
|
||||
void uvm_parent_gpu_non_replayable_faults_isr_lock(uvm_parent_gpu_t *parent_gpu)
|
||||
void uvm_gpu_non_replayable_faults_isr_lock(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
UVM_ASSERT(nv_kref_read(&parent_gpu->gpu_kref) > 0);
|
||||
|
||||
uvm_down(&parent_gpu->isr.non_replayable_faults.service_lock);
|
||||
}
|
||||
|
||||
void uvm_parent_gpu_non_replayable_faults_isr_unlock(uvm_parent_gpu_t *parent_gpu)
|
||||
void uvm_gpu_non_replayable_faults_isr_unlock(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
UVM_ASSERT(nv_kref_read(&parent_gpu->gpu_kref) > 0);
|
||||
|
||||
uvm_up(&parent_gpu->isr.non_replayable_faults.service_lock);
|
||||
}
|
||||
|
||||
void uvm_parent_gpu_access_counters_isr_lock(uvm_parent_gpu_t *parent_gpu)
|
||||
void uvm_gpu_access_counters_isr_lock(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
// See comments in uvm_parent_gpu_replayable_faults_isr_lock
|
||||
// See comments in uvm_gpu_replayable_faults_isr_lock
|
||||
|
||||
uvm_spin_lock_irqsave(&parent_gpu->isr.interrupts_lock);
|
||||
|
||||
uvm_parent_gpu_access_counters_intr_disable(parent_gpu);
|
||||
uvm_gpu_access_counters_intr_disable(parent_gpu);
|
||||
|
||||
uvm_spin_unlock_irqrestore(&parent_gpu->isr.interrupts_lock);
|
||||
|
||||
uvm_down(&parent_gpu->isr.access_counters.service_lock);
|
||||
}
|
||||
|
||||
void uvm_parent_gpu_access_counters_isr_unlock(uvm_parent_gpu_t *parent_gpu)
|
||||
void uvm_gpu_access_counters_isr_unlock(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
UVM_ASSERT(nv_kref_read(&parent_gpu->gpu_kref) > 0);
|
||||
|
||||
// See comments in uvm_parent_gpu_replayable_faults_isr_unlock
|
||||
// See comments in uvm_gpu_replayable_faults_isr_unlock
|
||||
|
||||
uvm_spin_lock_irqsave(&parent_gpu->isr.interrupts_lock);
|
||||
|
||||
uvm_parent_gpu_access_counters_intr_enable(parent_gpu);
|
||||
uvm_gpu_access_counters_intr_enable(parent_gpu);
|
||||
|
||||
if (parent_gpu->isr.access_counters.handling_ref_count > 0) {
|
||||
parent_gpu->access_counter_buffer_hal->clear_access_counter_notifications(parent_gpu,
|
||||
@@ -812,7 +779,7 @@ void uvm_parent_gpu_access_counters_isr_unlock(uvm_parent_gpu_t *parent_gpu)
|
||||
uvm_spin_unlock_irqrestore(&parent_gpu->isr.interrupts_lock);
|
||||
}
|
||||
|
||||
static void uvm_parent_gpu_replayable_faults_intr_disable(uvm_parent_gpu_t *parent_gpu)
|
||||
static void uvm_gpu_replayable_faults_intr_disable(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
uvm_assert_spinlock_locked(&parent_gpu->isr.interrupts_lock);
|
||||
|
||||
@@ -822,7 +789,7 @@ static void uvm_parent_gpu_replayable_faults_intr_disable(uvm_parent_gpu_t *pare
|
||||
++parent_gpu->isr.replayable_faults.disable_intr_ref_count;
|
||||
}
|
||||
|
||||
static void uvm_parent_gpu_replayable_faults_intr_enable(uvm_parent_gpu_t *parent_gpu)
|
||||
static void uvm_gpu_replayable_faults_intr_enable(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
uvm_assert_spinlock_locked(&parent_gpu->isr.interrupts_lock);
|
||||
UVM_ASSERT(parent_gpu->isr.replayable_faults.disable_intr_ref_count > 0);
|
||||
@@ -832,7 +799,7 @@ static void uvm_parent_gpu_replayable_faults_intr_enable(uvm_parent_gpu_t *paren
|
||||
parent_gpu->fault_buffer_hal->enable_replayable_faults(parent_gpu);
|
||||
}
|
||||
|
||||
void uvm_parent_gpu_access_counters_intr_disable(uvm_parent_gpu_t *parent_gpu)
|
||||
void uvm_gpu_access_counters_intr_disable(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
uvm_assert_spinlock_locked(&parent_gpu->isr.interrupts_lock);
|
||||
|
||||
@@ -849,7 +816,7 @@ void uvm_parent_gpu_access_counters_intr_disable(uvm_parent_gpu_t *parent_gpu)
|
||||
++parent_gpu->isr.access_counters.disable_intr_ref_count;
|
||||
}
|
||||
|
||||
void uvm_parent_gpu_access_counters_intr_enable(uvm_parent_gpu_t *parent_gpu)
|
||||
void uvm_gpu_access_counters_intr_enable(uvm_parent_gpu_t *parent_gpu)
|
||||
{
|
||||
uvm_assert_spinlock_locked(&parent_gpu->isr.interrupts_lock);
|
||||
UVM_ASSERT(uvm_sem_is_locked(&parent_gpu->isr.access_counters.service_lock));
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
/*******************************************************************************
|
||||
Copyright (c) 2016-2023 NVIDIA Corporation
|
||||
Copyright (c) 2016-2019 NVIDIA Corporation
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to
|
||||
@@ -131,19 +131,19 @@ typedef struct
|
||||
NV_STATUS uvm_isr_top_half_entry(const NvProcessorUuid *gpu_uuid);
|
||||
|
||||
// Initialize ISR handling state
|
||||
NV_STATUS uvm_parent_gpu_init_isr(uvm_parent_gpu_t *parent_gpu);
|
||||
NV_STATUS uvm_gpu_init_isr(uvm_parent_gpu_t *parent_gpu);
|
||||
|
||||
// Flush any currently scheduled bottom halves. This is called during GPU
|
||||
// removal.
|
||||
void uvm_parent_gpu_flush_bottom_halves(uvm_parent_gpu_t *parent_gpu);
|
||||
void uvm_gpu_flush_bottom_halves(uvm_parent_gpu_t *parent_gpu);
|
||||
|
||||
// Prevent new bottom halves from being scheduled. This is called during parent
|
||||
// GPU removal.
|
||||
void uvm_parent_gpu_disable_isr(uvm_parent_gpu_t *parent_gpu);
|
||||
void uvm_gpu_disable_isr(uvm_parent_gpu_t *parent_gpu);
|
||||
|
||||
// Destroy ISR handling state and return interrupt ownership to RM. This is
|
||||
// called during parent GPU removal
|
||||
void uvm_parent_gpu_deinit_isr(uvm_parent_gpu_t *parent_gpu);
|
||||
void uvm_gpu_deinit_isr(uvm_parent_gpu_t *parent_gpu);
|
||||
|
||||
// Take parent_gpu->isr.replayable_faults.service_lock from a non-top/bottom
|
||||
// half thread. This will also disable replayable page fault interrupts (if
|
||||
@@ -151,46 +151,46 @@ void uvm_parent_gpu_deinit_isr(uvm_parent_gpu_t *parent_gpu);
|
||||
// would cause an interrupt storm if we didn't disable them first.
|
||||
//
|
||||
// At least one GPU under the parent must have been previously retained.
|
||||
void uvm_parent_gpu_replayable_faults_isr_lock(uvm_parent_gpu_t *parent_gpu);
|
||||
void uvm_gpu_replayable_faults_isr_lock(uvm_parent_gpu_t *parent_gpu);
|
||||
|
||||
// Unlock parent_gpu->isr.replayable_faults.service_lock. This call may
|
||||
// re-enable replayable page fault interrupts. Unlike
|
||||
// uvm_parent_gpu_replayable_faults_isr_lock(), which should only be called from
|
||||
// uvm_gpu_replayable_faults_isr_lock(), which should only be called from
|
||||
// non-top/bottom half threads, this can be called by any thread.
|
||||
void uvm_parent_gpu_replayable_faults_isr_unlock(uvm_parent_gpu_t *parent_gpu);
|
||||
void uvm_gpu_replayable_faults_isr_unlock(uvm_parent_gpu_t *parent_gpu);
|
||||
|
||||
// Lock/unlock routines for non-replayable faults. These do not need to prevent
|
||||
// interrupt storms since the GPU fault buffers for non-replayable faults are
|
||||
// managed by RM. Unlike uvm_parent_gpu_replayable_faults_isr_lock, no GPUs
|
||||
// under the parent need to have been previously retained.
|
||||
void uvm_parent_gpu_non_replayable_faults_isr_lock(uvm_parent_gpu_t *parent_gpu);
|
||||
void uvm_parent_gpu_non_replayable_faults_isr_unlock(uvm_parent_gpu_t *parent_gpu);
|
||||
// managed by RM. Unlike uvm_gpu_replayable_faults_isr_lock, no GPUs under
|
||||
// the parent need to have been previously retained.
|
||||
void uvm_gpu_non_replayable_faults_isr_lock(uvm_parent_gpu_t *parent_gpu);
|
||||
void uvm_gpu_non_replayable_faults_isr_unlock(uvm_parent_gpu_t *parent_gpu);
|
||||
|
||||
// See uvm_parent_gpu_replayable_faults_isr_lock/unlock
|
||||
void uvm_parent_gpu_access_counters_isr_lock(uvm_parent_gpu_t *parent_gpu);
|
||||
void uvm_parent_gpu_access_counters_isr_unlock(uvm_parent_gpu_t *parent_gpu);
|
||||
// See uvm_gpu_replayable_faults_isr_lock/unlock
|
||||
void uvm_gpu_access_counters_isr_lock(uvm_parent_gpu_t *parent_gpu);
|
||||
void uvm_gpu_access_counters_isr_unlock(uvm_parent_gpu_t *parent_gpu);
|
||||
|
||||
// Increments the reference count tracking whether access counter interrupts
|
||||
// should be disabled. The caller is guaranteed that access counter interrupts
|
||||
// are disabled upon return. Interrupts might already be disabled prior to
|
||||
// making this call. Each call is ref-counted, so this must be paired with a
|
||||
// call to uvm_parent_gpu_access_counters_intr_enable().
|
||||
// call to uvm_gpu_access_counters_intr_enable().
|
||||
//
|
||||
// parent_gpu->isr.interrupts_lock must be held to call this function.
|
||||
void uvm_parent_gpu_access_counters_intr_disable(uvm_parent_gpu_t *parent_gpu);
|
||||
void uvm_gpu_access_counters_intr_disable(uvm_parent_gpu_t *parent_gpu);
|
||||
|
||||
// Decrements the reference count tracking whether access counter interrupts
|
||||
// should be disabled. Only once the count reaches 0 are the HW interrupts
|
||||
// actually enabled, so this call does not guarantee that the interrupts have
|
||||
// been re-enabled upon return.
|
||||
//
|
||||
// uvm_parent_gpu_access_counters_intr_disable() must have been called prior to
|
||||
// calling this function.
|
||||
// uvm_gpu_access_counters_intr_disable() must have been called prior to calling
|
||||
// this function.
|
||||
//
|
||||
// NOTE: For pulse-based interrupts, the caller is responsible for re-arming
|
||||
// the interrupt.
|
||||
//
|
||||
// parent_gpu->isr.interrupts_lock must be held to call this function.
|
||||
void uvm_parent_gpu_access_counters_intr_enable(uvm_parent_gpu_t *parent_gpu);
|
||||
void uvm_gpu_access_counters_intr_enable(uvm_parent_gpu_t *parent_gpu);
|
||||
|
||||
#endif // __UVM_GPU_ISR_H__
|
||||
|
||||
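The header comments above describe access counter interrupts as ref-counted: every disable call increments a count, and the hardware interrupt is only re-enabled once matching enable calls bring the count back to 0. The following is a minimal sketch of that pattern, not the driver's implementation. The type `intr_state_t` and the functions `intr_disable` and `intr_enable` are hypothetical names invented for illustration, and the sketch omits the requirement that `parent_gpu->isr.interrupts_lock` be held.

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical stand-in for per-GPU interrupt state (not the driver's struct).
typedef struct {
    unsigned long long disable_ref_count; // outstanding disable calls
    bool hw_enabled;                      // models the hardware enable bit
} intr_state_t;

// Models a ref-counted disable: interrupts are guaranteed off on return,
// whether or not they were already off before the call.
static void intr_disable(intr_state_t *s)
{
    if (s->disable_ref_count == 0)
        s->hw_enabled = false;
    ++s->disable_ref_count;
}

// Models a ref-counted enable: the hardware bit is set only when the last
// outstanding disable is dropped, so a single call does not guarantee that
// interrupts are back on when it returns.
static void intr_enable(intr_state_t *s)
{
    assert(s->disable_ref_count > 0); // must be paired with a prior disable
    if (--s->disable_ref_count == 0)
        s->hw_enabled = true;
}
```

With two nested disables, the first `intr_enable` call leaves `hw_enabled` false and only the second one sets it to true. That is why the comments say the enable call does not guarantee re-enablement on return.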
Some files were not shown because too many files have changed in this diff