path: root/libavutil/hwcontext_vulkan.c
Commit message / Author / Age
* Replace all occurrences of av_mallocz_array() by av_calloc()Andreas Rheinhardt2021-09-20
| | | | | | | They do the same. Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
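The behaviour both names share is an overflow-checked, zero-filled array allocation. A minimal sketch of that behaviour in plain C follows; `zeroed_array_alloc` is a hypothetical stand-in written for illustration, not the FFmpeg implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a zeroing array allocator: returns NULL when
 * nmemb * size would overflow size_t, otherwise zero-initialized memory. */
static void *zeroed_array_alloc(size_t nmemb, size_t size)
{
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;             /* the multiplication would overflow */
    return calloc(nmemb, size);  /* calloc zero-fills the allocation */
}
```

The caller frees the result with `free()`, as with any `calloc()` allocation.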
* hwcontext_vulkan: use GPU memcpy when copying to system RAMLynne2021-08-14
| | | | This should speed it up significantly on systems where it matters.
* hwcontext_vulkan: fix typo in vulkan_device_init()Lynne2021-06-10
| | | | load_functions() did not load the device-level functions.
* hwcontext_vulkan: dlopen libvulkanLynne2021-04-30
| | | | | | | | | | | | While Vulkan itself went more or less the way it was expected to go, libvulkan didn't quite solve all of the opengl loader issues. It's multi-vendor, yes, but unfortunately, the code is Google/Khronos QUALITY, so suffers from big static linking issues (static linking on anything but OSX is unsupported), has bugs, and due to the prefix system used, there are 3 or so ways to type out functions. Just solve all of those problems by dlopening it. We even have nice emulation for it on Windows.
* hwcontext_vulkan: dynamically load functionsLynne2021-04-30
| | | | This patch allows for alternative loader implementations.
* avutil/hwcontext_vulkan: fix format specifiers for some printed variablesJames Almer2021-04-29
| | | | | | | | | | VkPhysicalDeviceLimits.optimalBufferCopyRowPitchAlignment and VkPhysicalDeviceExternalMemoryHostPropertiesEXT.minImportedHostPointerAlignment are of type VkDeviceSize (a typedef uint64_t). VkPhysicalDeviceLimits.minMemoryMapAlignment is of type size_t. Signed-off-by: James Almer <jamrial@gmail.com> Reviewed-by: Lynne <dev@lynne.ee>
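A short sketch of the portable specifiers for these two types: `PRIu64` from `<inttypes.h>` for a `uint64_t` such as `VkDeviceSize`, and `%zu` for a `size_t`. The function name and message text are illustrative assumptions, not the strings used in hwcontext_vulkan.c.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Formats a 64-bit value (such as a VkDeviceSize) and a size_t into buf.
 * Returns the number of characters snprintf would have written. */
static int format_alignments(char *buf, size_t buf_size,
                             uint64_t row_pitch_align, size_t map_align)
{
    assert(buf != NULL && buf_size > 0);
    return snprintf(buf, buf_size, "pitch=%" PRIu64 " map=%zu",
                    row_pitch_align, map_align);
}
```

Using `%d` or `%lu` here would be wrong on platforms where `int` or `long` is narrower than 64 bits.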
* avutil/buffer: Switch AVBuffer API to size_tAndreas Rheinhardt2021-04-27
| | | | | | | Announced in 14040a1d913794d9a3fd6406a6d8c2f0e37e0062. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Signed-off-by: James Almer <jamrial@gmail.com>
* avutil: use the buffer_size_t typedef where requiredJames Almer2021-03-10
| | | | Signed-off-by: James Almer <jamrial@gmail.com>
* avutils/vulkan: hwmap, respect src frame resolutionXu Guangxin2021-01-22
| | | | | | | | | | | | | fixes http://trac.ffmpeg.org/ticket/9055 The hw decoder may allocate a large frame from AVHWFramesContext, and adjust width and height based on bitstream. We need to use resolution from src frame instead of AVHWFramesContext. test command: ffmpeg -loglevel debug -hide_banner -hwaccel vaapi -init_hw_device vaapi=va:/dev/dri/renderD128 -hwaccel_device va -hwaccel_output_format vaapi -init_hw_device vulkan=vulk -filter_hw_device vulk -i 1920x1080.264 -c:v libx264 -r:v 30 -profile:v high -preset veryfast -vf "hwmap,chromaber_vulkan=0:0,hwdownload,format=nv12" -map 0 -y vaapiouts.mkv expected: No green bar at bottom.
* hwcontext_vulkan: wait and signal semaphores when transferring to CUDALynne2020-12-05
| | | | | Same as when downloading. Not sure why this isn't done, probably because the CUDA code predates the sync mechanism we settled on.
* hwcontext_vulkan: reduce priority for PACK32 formatsLynne2020-11-27
| | | | Due to some endian-dependent overlap, these should be used last.
* hwcontext_vulkan: optionally enable more functionalityNiklas Haas2020-11-25
| | | | | | These two extensions and two features are both optionally used by libplacebo to speed up rendering, so it makes sense for libavutil to automatically enable them as well.
* hwcontext_vulkan: support additional pixel formatsLynne2020-11-25
| | | | | We support every single packed format possible now. There are some fringe leftover mappings which are uninteresting.
* hwcontext_vulkan: fix incorrect A/0BGR mappingLynne2020-11-25
| | | | | | Vulkan formats with a PACK suffix define native endianness. Vulkan formats without a PACK suffix are in bytestream order. Pixel formats with a LE/BE suffix define endianness. Pixel formats without LE/BE suffix are in bytestream order.
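The distinction can be illustrated in plain C. In this sketch (all function names are hypothetical), a "packed" pixel is written as one native 32-bit word, so its byte layout in memory depends on the host, while a "bytestream" pixel is written byte by byte and looks the same everywhere.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Packed interpretation: one native word, A in the top byte, B in the low
 * byte. The order of the bytes in memory follows the host endianness. */
static void store_packed_argb(uint8_t out[4],
                              uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
    const uint32_t word = ((uint32_t)a << 24) | ((uint32_t)r << 16) |
                          ((uint32_t)g << 8)  |  (uint32_t)b;
    memcpy(out, &word, sizeof(word));
}

/* Bytestream interpretation: A, R, G, B in memory regardless of the host. */
static void store_bytestream_argb(uint8_t out[4],
                                  uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
    out[0] = a;
    out[1] = r;
    out[2] = g;
    out[3] = b;
}
```

On a little-endian host the packed store lays the bytes out as B, G, R, A, which is why a packed format and a bytestream format with the same component names are not interchangeable there.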
* hwcontext_vulkan: simplify plane size calculations and support 4-plane formatsLynne2020-11-25
| | | | Needed to support YUVA.
* hwcontext_vulkan: do not segfault when failing to init an AVHWFramesContextLynne2020-11-25
| | | | | frames_uninit is always called on failure, and the free_exec_ctx function did not zero the pool when freeing it, so it resulted in a double free.
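The general remedy for this kind of double free is to clear the pointer at the point of freeing, so that a second cleanup pass is harmless. A minimal sketch, with `ExecCtx` and `exec_ctx_free` as hypothetical stand-ins rather than the real structures:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an execution context that owns a pool. */
typedef struct ExecCtx {
    int *pool;
} ExecCtx;

/* Frees the pool and zeroes the pointer so the function is idempotent. */
static void exec_ctx_free(ExecCtx *ctx)
{
    assert(ctx != NULL);
    free(ctx->pool);   /* free(NULL) is a no-op */
    ctx->pool = NULL;  /* a later call now sees NULL instead of a stale pointer */
}
```

With the pointer zeroed, an init failure path and an unconditional uninit path can both call the free function safely.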
* hwcontext_vulkan: always attempt to map host memory when transferringLynne2020-11-25
| | | | | | | | | | | | | | This relies on the fact that host memory is always going to be required to be aligned to the platform's page size, which means we can adjust the pointers when we map them to buffers and therefore skip an entire copy. This has already had extensive testing in libplacebo without problems, so it's safe to use here as well. Speeds up downloads and uploads on platforms which do not pool their memory hugely, but less so on platforms that do. We can pool the buffers ourselves, but that can come as a later patch if necessary.
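The pointer adjustment described here amounts to rounding the host address down to the required alignment and remembering the offset. The sketch below shows only that arithmetic; `AlignedSpan`, `align_host_span` and the 4096-byte alignment in the usage note are assumptions for illustration, not the code in this file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct AlignedSpan {
    uintptr_t base;   /* address rounded down to the alignment */
    size_t    offset; /* distance from base to the original address */
    size_t    size;   /* covered length, rounded up to the alignment */
} AlignedSpan;

/* Computes the aligned region that fully contains [addr, addr + len).
 * align must be a non-zero power of two. */
static AlignedSpan align_host_span(uintptr_t addr, size_t len, size_t align)
{
    AlignedSpan s;
    assert(align != 0 && (align & (align - 1)) == 0);
    s.base   = addr & ~(uintptr_t)(align - 1);
    s.offset = (size_t)(addr - s.base);
    s.size   = (s.offset + len + align - 1) & ~(align - 1);
    return s;
}
```

For example, an address of 0x10010 with a length of 100 and an alignment of 4096 gives a base of 0x10000, an offset of 16 and a size of 4096; the imported buffer would cover the aligned region and the copy would start at the offset.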
* hwcontext_vulkan: check for memory size before choosing typeLynne2020-11-25
| | | | | It makes allocation a bit more robust in case some weird device with weird drivers which segments memory in weird ways appears.
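A sketch of what such a check can look like, using plain structs in place of the Vulkan memory property types. `MemType`, its fields and `find_mem_type` are hypothetical; the point is only that a candidate type is rejected when the requested size exceeds the size of its heap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of one memory type and its backing heap. */
typedef struct MemType {
    uint32_t flags;      /* property flags of this memory type */
    uint64_t heap_size;  /* size in bytes of the heap behind it */
} MemType;

/* Returns the index of the first type that is allowed by type_bits, has all
 * requested flags, and whose heap can hold `size` bytes; -1 if none fits. */
static int find_mem_type(const MemType *types, int nb_types,
                         uint32_t type_bits, uint32_t req_flags, uint64_t size)
{
    assert(nb_types >= 0 && nb_types <= 32);
    for (int i = 0; i < nb_types; i++) {
        if (!(type_bits & (UINT32_C(1) << i)))
            continue;  /* type not permitted for this resource */
        if ((types[i].flags & req_flags) != req_flags)
            continue;  /* missing a required property */
        if (size > types[i].heap_size)
            continue;  /* allocation could never fit in this heap */
        return i;
    }
    return -1;
}
```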
* hwcontext_vulkan: correctly access the p->extensions bitmaskLynne2020-11-25
| | | | It's a 64-bit bitfield being put directly into an int.
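One way this class of bug shows up is sketched below: narrowing a 64-bit mask before testing a flag silently drops every flag above bit 31. The flag value and function names are hypothetical, and the sketch narrows to `uint32_t` rather than `int` so that the conversion is well-defined in C.

```c
#include <assert.h>
#include <stdint.h>

#define EXT_HIGH_BIT (UINT64_C(1) << 40)  /* hypothetical flag above bit 31 */

/* Buggy pattern: the mask is narrowed first, so the upper 32 bits are lost. */
static int has_ext_narrowed(uint64_t extensions, uint64_t flag)
{
    const uint32_t narrowed = (uint32_t)extensions;
    assert(flag != 0);
    return (narrowed & flag) != 0;
}

/* Correct pattern: test the flag at full width, then reduce to a boolean. */
static int has_ext(uint64_t extensions, uint64_t flag)
{
    assert(flag != 0);
    return (extensions & flag) != 0;
}
```

With `EXT_HIGH_BIT` set in the mask, `has_ext` reports the flag and `has_ext_narrowed` does not.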
* hwcontext_vulkan: unify download/upload functionsLynne2020-11-25
| | | | They were identical, save for variable names and order.
* hwcontext_vulkan: add VkExternalMemoryBufferCreateInfo to imported buffersLynne2020-11-25
| | | | It's a validation layer thing.
* hwcontext_vulkan: do not use uninitialized variables on errors in CUDA codeLynne2020-11-25
* hwcontext_vulkan: remove plane size alignment checks when host importingLynne2020-08-02
| | | | | | | The process space is guaranteed to be aligned to the page size, hence we're never going to map outside of our address space. There are more optimizations to do with respect to chroma plane alignment and buffer offsets, but that can be done later.
* hwcontext_vulkan: fix uploading and downloading from/to flipped imagesLynne2020-05-26
| | | | | | We want to copy the lowest amount of bytes per line, but while the buffer stride is sanitized, the src/dst stride can be negative, and negative numbers of bytes do not make a lot of sense.
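A minimal sketch of a row copy that tolerates a negative source stride, assuming the destination stride is positive (as the sanitized buffer stride is said to be). `copy_plane` is a hypothetical helper, not the function in this file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies `rows` lines from src to dst. src_stride may be negative, in which
 * case src must point at the line that should be copied first. */
static void copy_plane(uint8_t *dst, ptrdiff_t dst_stride,
                       const uint8_t *src, ptrdiff_t src_stride, int rows)
{
    /* Bytes per line: the smaller of the two stride magnitudes. */
    const ptrdiff_t src_abs    = src_stride < 0 ? -src_stride : src_stride;
    const size_t    line_bytes = (size_t)(src_abs < dst_stride ? src_abs
                                                               : dst_stride);
    assert(dst_stride > 0);
    for (int y = 0; y < rows; y++)
        memcpy(dst + y * dst_stride, src + y * src_stride, line_bytes);
}
```

Taking the absolute value before the minimum is the essential step: passing a negative stride straight into the byte count would convert to a huge unsigned length.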
* hwcontext_vulkan: check for dedicated allocation when mapping from drm/vaapiLynne2020-05-26
| | | | | Some vendors (AMD) require dedicated allocation to be used for all imported images.
* hwcontext_vulkan: initialize the frames context when derivingLynne2020-05-26
| | | | | | Otherwise, the frames context is considered to be ready to handle mapping, and it doesn't get initialized the normal way through .frames_init.
* hwcontext_vulkan: use dedicated allocation for buffers when necessaryLynne2020-05-26
* hwcontext_vulkan: use host mapped buffers when uploading and downloadingLynne2020-05-26
| | | | Speeds up both use cases by 30%.
* hwcontext_vulkan: move physical device feature discovery to device_initLynne2020-05-23
| | | | Otherwise custom vulkan device contexts won't work.
* hwcontext_vulkan: split uploading and downloading contextsLynne2020-05-23
| | | | This allows us to speed up only-uploading or only-downloading use cases.
* hwcontext_vulkan: set usage for DRM imports to the frames context usageLynne2020-05-23
| | | | | They're nothing special, and there's no reason they should always use the default flags.
* hwcontext_vulkan: do not OR the user-specified usage with our default flagsLynne2020-05-23
| | | | | Some users may need special formats that aren't available when the STORAGE flag bit is set, which would result in allocations failing.
* hwcontext_vulkan: actually use the frames exec context for prep/import/exportLynne2020-05-23
| | | | | | | | This was never actually used, likely due to confusion, as the device context also had one used for uploads and downloads. Also, since we're only using it for very quick image barriers (which are practically free on all hardware), use the compute queue instead of the transfer queue.
* hwcontext_vulkan: support user-provided poolsLynne2020-05-23
| | | | | If an external pool was provided we skipped all of frames init, including the exec context.
* hwcontext_vulkan: use all enabled queues for transfers, make uploads asyncLynne2020-05-23
| | | | | | This commit makes full use of the enabled queues to provide asynchronous uploads of images (downloads remain synchronous). For pure uploading use cases, the performance gains can be significant.
* hwcontext_vulkan: wrap ImageBufs into AVBufferRefsLynne2020-05-23
| | | | Makes it easier to support multiple queues
* hwcontext_vulkan: expose the enabled device featuresLynne2020-05-23
| | | | | | | With this, the puzzle of making libplacebo, ffmpeg and any other Vulkan API users interoperable is complete. Users of both libraries can initialize one another's contexts without having to create a new one.
* hwcontext_vulkan: expose the amount of queues for each queue familyLynne2020-05-23
| | | | | This, along with the next patch, are the last missing pieces to being interoperable with libplacebo.
* hwcontext: add av_hwdevice_ctx_create_derived_optsLynne2020-05-23
| | | | | | | | | | | | | This allows for users who derive devices to set options for the new device context they derive. The main use case of this is to allow users to enable extensions (such as surface drawing extensions) in Vulkan while deriving from the device their frames are on. That way, users don't need to write any initialization code themselves, since the Vulkan spec invalidates mixing instances, physical devices and active devices. Apart from Vulkan, other hwcontexts ignore the opts argument since they don't support options at all (or in VAAPI and OpenCL's case, options are currently only used for device selection, which device_derive overrides).
* hwcontext_vulkan: fix incorrect print argumentLynne2020-05-14
* hwcontext_vulkan: don't add the optional VK_KHR_surface extension by defaultLynne2020-05-12
| | | | Both API and CLI users can enable any extension they'd like using the options.
* hwcontext_vulkan: don't error on unavailable user-specified extensionsLynne2020-05-12
| | | | | | Only warn instead. API users can find out which extensions were unavailable by using the enabled_inst_extensions and enabled_dev_extensions fields. This eliminates having to trial-and-error to find which extensions were missing.
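The warn-and-skip policy can be sketched as a filter over the requested names. Everything here is illustrative: `enable_extensions`, its parameters and the warning text are assumptions, and the real code reports results through the `enabled_inst_extensions` and `enabled_dev_extensions` fields mentioned above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copies every requested name that is also available into `enabled`, which
 * must have room for nb_requested entries. Missing names only produce a
 * warning on stderr. Returns the number of enabled extensions. */
static int enable_extensions(const char *const *requested, int nb_requested,
                             const char *const *available, int nb_available,
                             const char **enabled)
{
    int nb_enabled = 0;
    assert(requested && available && enabled);
    for (int i = 0; i < nb_requested; i++) {
        int found = 0;
        for (int j = 0; j < nb_available && !found; j++)
            found = !strcmp(requested[i], available[j]);
        if (found)
            enabled[nb_enabled++] = requested[i];
        else
            fprintf(stderr, "Warning: extension \"%s\" not found, skipping\n",
                    requested[i]);
    }
    return nb_enabled;
}
```

A caller can compare the returned list against what it asked for to learn which extensions were unavailable, without retrying device creation.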
* hwcontext_vulkan: use the maximum amount of queues for each familyLynne2020-05-12
| | | | | | | | | Due to our AVHWDevice infrastructure, where API users are offered a way to derive contexts rather than always create a new one, our filterchains, being supported by a single hardware device context, can grow to considerable size. Hence, in such situations, using the maximum amount of queues the device offers can be beneficial to eliminating bottlenecks where queue submissions on the same family have to wait for the previous one to finish.
* hwcontext_vulkan: update prepare_frame() for multiple semaphores when exportingLynne2020-05-12
* Revert "hwcontext_vulkan: only use one semaphore per image"Lynne2020-05-11
| | | | | | | | | This reverts commit 97b526c192add6f252b327245fd9223546867352. It broke the API, and assumed no other APIs used multiple semaphores. This also disallowed certain optimizations to happen. Dealing with APIs that give or expect single semaphores is easier when we use per-image semaphores.
* hwcontext_vulkan: convert to general layout and transfer queue when exportingLynne2020-05-10
| | | | | | | The specs note that images should be in the GENERAL layout when exporting for maximum compatibility. CUDA exported images are handled differently, and the queue is the same, so we don't need to do that there.
* hwcontext_vulkan: create all images with concurrent sharing modeLynne2020-05-10
| | | | | | | | As it turns out, we were already assuming and treating all images as if they had concurrent access mode. This just changes the flag to CONCURRENT, which has fewer restrictions than EXCLUSIVE, and fixes validation messages on machines with multiple queues. The validation layer didn't pick this up because the machine I was testing on had only a single queue.
* hwcontext_vulkan: fix inverted condition when exporting images to drm_primeLynne2020-05-10
| | | | Calling vkGetImageSubresourceLayout is only legal for linear and drm images.
* hwcontext_vulkan: update debugging layer nameLynne2020-05-10
* hwcontext_vulkan: remove unused internal REQUIRED extension flagLynne2020-05-10
| | | | | This is a leftover from an old version which used the 1.0 Vulkan API with the maintenance extensions being required.