path: root/tests/checkasm/checkasm.h
Commit message (Author, Age)
* lavc/aarch64: motion estimation functions in neon (Swinney, Jonathan, 2022-06-28)
    - ff_pix_abs16_neon
    - ff_pix_abs16_xy2_neon

    In direct micro benchmarks of these ff functions versus their C
    implementations, these functions performed as follows on AWS Graviton 3.

    ff_pix_abs16_neon:
    pix_abs_0_0_c: 141.1
    pix_abs_0_0_neon: 19.6

    ff_pix_abs16_xy2_neon:
    pix_abs_0_3_c: 269.1
    pix_abs_0_3_neon: 39.3

    Tested with:
    ./tests/checkasm/checkasm --test=motion --bench --disable-linux-perf

    Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
    Signed-off-by: Martin Storsjö <martin@martin.st>
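The benchmark figures in this commit message are easier to compare as ratios. The following is a minimal sketch of that arithmetic, not part of checkasm itself; the function names `speedup` and `print_speedups` are made up for illustration, and the inputs are the four numbers quoted above.

```c
#include <stdio.h>

/* Ratio of the C reference timing to the optimized timing.
 * Both arguments are checkasm --bench figures in the same unit.
 * (Hypothetical helper, not a checkasm function.) */
double speedup(double c_time, double simd_time)
{
    return c_time / simd_time;
}

/* Print the ratios for the two function pairs quoted in the commit message. */
void print_speedups(void)
{
    printf("pix_abs_0_0: %.2fx\n", speedup(141.1, 19.6));
    printf("pix_abs_0_3: %.2fx\n", speedup(269.1, 39.3));
}
```

Calling `print_speedups()` shows that both NEON versions come out roughly seven times faster than the C reference on the quoted figures.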
* checkasm: Add idctdsp add/put-pixels-clamped tests (Ben Avison, 2022-04-01)
    Signed-off-by: Ben Avison <bavison@riscosopen.org>
    Signed-off-by: Martin Storsjö <martin@martin.st>
* checkasm: Add vc1dsp in-loop deblocking filter tests (Ben Avison, 2022-04-01)
    Note that the benchmarking results for these functions are highly
    dependent upon the input data. Therefore, each function is benchmarked
    twice, corresponding to the best and worst case complexity of the
    reference C implementation. The performance of a real stream decode will
    fall somewhere between these two extremes.

    Signed-off-by: Ben Avison <bavison@riscosopen.org>
    Signed-off-by: Martin Storsjö <martin@martin.st>
* swscale/x86/output.asm: add x86-optimized planer gbr yuv2anyX functions (Mark Reid, 2022-01-11)
    changes since v2:
     * fixed label
    changes since v1:
     * remove vex instruction on sse4 path
     * some load/pack macros use fewer instructions
     * fixed some typos

    yuv2gbrp_full_X_4_512_c: 12757.6
    yuv2gbrp_full_X_4_512_sse2: 8946.6
    yuv2gbrp_full_X_4_512_sse4: 5138.6
    yuv2gbrp_full_X_4_512_avx2: 3889.6
    yuv2gbrap_full_X_4_512_c: 15368.6
    yuv2gbrap_full_X_4_512_sse2: 11916.1
    yuv2gbrap_full_X_4_512_sse4: 6294.6
    yuv2gbrap_full_X_4_512_avx2: 3477.1
    yuv2gbrp9be_full_X_4_512_c: 14381.6
    yuv2gbrp9be_full_X_4_512_sse2: 9139.1
    yuv2gbrp9be_full_X_4_512_sse4: 5150.1
    yuv2gbrp9be_full_X_4_512_avx2: 2834.6
    yuv2gbrp9le_full_X_4_512_c: 12990.1
    yuv2gbrp9le_full_X_4_512_sse2: 9118.1
    yuv2gbrp9le_full_X_4_512_sse4: 5132.1
    yuv2gbrp9le_full_X_4_512_avx2: 2833.1
    yuv2gbrp10be_full_X_4_512_c: 14401.6
    yuv2gbrp10be_full_X_4_512_sse2: 9133.1
    yuv2gbrp10be_full_X_4_512_sse4: 5126.1
    yuv2gbrp10be_full_X_4_512_avx2: 2837.6
    yuv2gbrp10le_full_X_4_512_c: 12718.1
    yuv2gbrp10le_full_X_4_512_sse2: 9106.1
    yuv2gbrp10le_full_X_4_512_sse4: 5120.1
    yuv2gbrp10le_full_X_4_512_avx2: 2826.1
    yuv2gbrap10be_full_X_4_512_c: 18535.6
    yuv2gbrap10be_full_X_4_512_sse2: 33617.6
    yuv2gbrap10be_full_X_4_512_sse4: 6264.1
    yuv2gbrap10be_full_X_4_512_avx2: 3422.1
    yuv2gbrap10le_full_X_4_512_c: 16724.1
    yuv2gbrap10le_full_X_4_512_sse2: 11787.1
    yuv2gbrap10le_full_X_4_512_sse4: 6282.1
    yuv2gbrap10le_full_X_4_512_avx2: 3441.6
    yuv2gbrp12be_full_X_4_512_c: 13723.6
    yuv2gbrp12be_full_X_4_512_sse2: 9128.1
    yuv2gbrp12be_full_X_4_512_sse4: 7997.6
    yuv2gbrp12be_full_X_4_512_avx2: 2844.1
    yuv2gbrp12le_full_X_4_512_c: 12257.1
    yuv2gbrp12le_full_X_4_512_sse2: 9107.6
    yuv2gbrp12le_full_X_4_512_sse4: 5142.6
    yuv2gbrp12le_full_X_4_512_avx2: 2837.6
    yuv2gbrap12be_full_X_4_512_c: 18511.1
    yuv2gbrap12be_full_X_4_512_sse2: 12156.6
    yuv2gbrap12be_full_X_4_512_sse4: 6251.1
    yuv2gbrap12be_full_X_4_512_avx2: 3444.6
    yuv2gbrap12le_full_X_4_512_c: 16687.1
    yuv2gbrap12le_full_X_4_512_sse2: 11785.1
    yuv2gbrap12le_full_X_4_512_sse4: 6243.6
    yuv2gbrap12le_full_X_4_512_avx2: 3446.1
    yuv2gbrp14be_full_X_4_512_c: 13690.6
    yuv2gbrp14be_full_X_4_512_sse2: 9120.6
    yuv2gbrp14be_full_X_4_512_sse4: 5138.1
    yuv2gbrp14be_full_X_4_512_avx2: 2843.1
    yuv2gbrp14le_full_X_4_512_c: 14995.6
    yuv2gbrp14le_full_X_4_512_sse2: 9119.1
    yuv2gbrp14le_full_X_4_512_sse4: 5126.1
    yuv2gbrp14le_full_X_4_512_avx2: 2843.1
    yuv2gbrp16be_full_X_4_512_c: 12367.1
    yuv2gbrp16be_full_X_4_512_sse2: 8233.6
    yuv2gbrp16be_full_X_4_512_sse4: 4820.1
    yuv2gbrp16be_full_X_4_512_avx2: 2666.6
    yuv2gbrp16le_full_X_4_512_c: 10904.1
    yuv2gbrp16le_full_X_4_512_sse2: 8214.1
    yuv2gbrp16le_full_X_4_512_sse4: 4824.1
    yuv2gbrp16le_full_X_4_512_avx2: 2629.1
    yuv2gbrap16be_full_X_4_512_c: 26569.6
    yuv2gbrap16be_full_X_4_512_sse2: 10884.1
    yuv2gbrap16be_full_X_4_512_sse4: 5488.1
    yuv2gbrap16be_full_X_4_512_avx2: 3272.1
    yuv2gbrap16le_full_X_4_512_c: 14010.1
    yuv2gbrap16le_full_X_4_512_sse2: 10562.1
    yuv2gbrap16le_full_X_4_512_sse4: 5463.6
    yuv2gbrap16le_full_X_4_512_avx2: 3255.1
    yuv2gbrpf32be_full_X_4_512_c: 14524.1
    yuv2gbrpf32be_full_X_4_512_sse2: 8552.6
    yuv2gbrpf32be_full_X_4_512_sse4: 4636.1
    yuv2gbrpf32be_full_X_4_512_avx2: 2474.6
    yuv2gbrpf32le_full_X_4_512_c: 13060.6
    yuv2gbrpf32le_full_X_4_512_sse2: 9682.6
    yuv2gbrpf32le_full_X_4_512_sse4: 4298.1
    yuv2gbrpf32le_full_X_4_512_avx2: 2453.1
    yuv2gbrapf32be_full_X_4_512_c: 18629.6
    yuv2gbrapf32be_full_X_4_512_sse2: 11363.1
    yuv2gbrapf32be_full_X_4_512_sse4: 15201.6
    yuv2gbrapf32be_full_X_4_512_avx2: 3727.1
    yuv2gbrapf32le_full_X_4_512_c: 16677.6
    yuv2gbrapf32le_full_X_4_512_sse2: 10221.6
    yuv2gbrapf32le_full_X_4_512_sse4: 5693.6
    yuv2gbrapf32le_full_X_4_512_avx2: 3656.6

    Reviewed-by: Paul B Mahol <onemda@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* checkasm: collapse hevc pel tests (J. Dekker, 2021-08-24)
    Also add to `make fate-checkasm' target.

    Signed-off-by: J. Dekker <jdek@itanimul.li>
* lavu/checkasm: add (private) kperf timing for macOS (J. Dekker, 2021-07-20)
    Signed-off-by: J. Dekker <jdek@itanimul.li>
* checkasm: add av_tx FFT SIMD testing code (Lynne, 2021-04-24)
    This sadly required making changes to the code itself, due to the same
    context needing to be reused for both versions. The lookup table had to
    be duplicated for both versions.
* checkasm: add hevc_pel tests (Josh Dekker, 2021-01-25)
    Co-authored-by: Niklas Haas <git@haasn.xyz>
    Signed-off-by: Josh Dekker <josh@itanimul.li>
* checkasm: aarch64: Check for stack overflows (Martin Storsjö, 2020-05-15)
    Also fill x8-x17 with garbage before calling the function.

    Figure out the number of stack parameters and make sure that the value
    on the stack after those is untouched.

    Signed-off-by: Martin Storsjö <martin@martin.st>
* checkasm: arm: Check for stack overflows (Martin Storsjö, 2020-05-15)
    Figure out the number of stack parameters and make sure that the value
    on the stack after those is untouched.

    Signed-off-by: Martin Storsjö <martin@martin.st>
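The check described in these two commits lives in assembly call wrappers, which portable C cannot reproduce. The sketch below is only an analogy of the same guard-value idea applied to an ordinary buffer: everything after the region the callee may use is filled with a known pattern and verified after the call. All names (`fill_fn`, `guard_intact_after_call`, `good_fill`, `bad_fill`) and the pattern `0xA5` are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

#define AREA_SIZE  64    /* usable bytes handed to the function under test */
#define GUARD_SIZE 8     /* extra bytes that must stay untouched */
#define GUARD_BYTE 0xA5  /* arbitrary guard pattern (assumption of this sketch) */

/* Hypothetical function under test: may write at most n bytes to dst. */
typedef void (*fill_fn)(uint8_t *dst, size_t n);

/* Call fn with n usable bytes and verify that every byte after them still
 * holds the guard pattern. Returns 1 if the guard is intact, 0 otherwise. */
int guard_intact_after_call(fill_fn fn, size_t n)
{
    uint8_t area[AREA_SIZE + GUARD_SIZE];
    size_t i;

    if (n > AREA_SIZE)
        return 0;

    memset(area, 0, n);                              /* region the callee may use */
    memset(area + n, GUARD_BYTE, sizeof(area) - n);  /* everything after it is guarded */

    fn(area, n);

    for (i = n; i < sizeof(area); i++)
        if (area[i] != GUARD_BYTE)
            return 0;
    return 1;
}

/* Well-behaved example: writes exactly n bytes. */
void good_fill(uint8_t *dst, size_t n)
{
    memset(dst, 1, n);
}

/* Buggy example: writes one byte too many. */
void bad_fill(uint8_t *dst, size_t n)
{
    memset(dst, 1, n + 1);
}
```

With `n = 16`, `guard_intact_after_call(good_fill, 16)` reports an intact guard and `guard_intact_after_call(bad_fill, 16)` reports a clobbered one. The real wrappers do the equivalent on the stack slot that follows the last stack parameter.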
* checkasm: add hscale test (Josh de Kock, 2020-05-15)
    This tests the hscale 8bpp to 14/18bpp functions with different filter
    sizes.

    Signed-off-by: Josh de Kock <josh@itanimul.li>
* checkasm: add function to check and diff memory (Martin Storsjö, 2020-05-15)
    This was ported from dav1d (c950e7101bdf5f7117bfca816984a21e550509f0).

    Signed-off-by: Josh de Kock <josh@itanimul.li>
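A "check and diff" helper compares a reference buffer with a tested one and, on mismatch, prints both so the differing positions are visible. The following is a minimal sketch of that idea for 8-bit data. It is not the function added by this commit: the name `check_and_diff_u8`, its signature and its output layout are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Compare two w x h rectangles of bytes that may have different strides.
 * Returns 0 if they match. On mismatch, prints each row as
 * "reference   tested   markers" to stderr and returns 1.
 * (Hypothetical helper, not the checkasm implementation.) */
int check_and_diff_u8(const uint8_t *ref, ptrdiff_t ref_stride,
                      const uint8_t *tst, ptrdiff_t tst_stride,
                      int w, int h, const char *name)
{
    int x, y;

    /* Fast path: look for the first row that differs. */
    for (y = 0; y < h; y++)
        if (memcmp(ref + y * ref_stride, tst + y * tst_stride, (size_t)w))
            break;
    if (y == h)
        return 0;

    fprintf(stderr, "%s: mismatch, first differing row is %d\n", name, y);
    for (y = 0; y < h; y++) {
        const uint8_t *r = ref + y * ref_stride;
        const uint8_t *t = tst + y * tst_stride;

        for (x = 0; x < w; x++)
            fprintf(stderr, " %02x", r[x]);
        fprintf(stderr, "   ");
        for (x = 0; x < w; x++)
            fprintf(stderr, " %02x", t[x]);
        fprintf(stderr, "   ");
        for (x = 0; x < w; x++)
            fputc(r[x] != t[x] ? 'x' : '.', stderr);
        fputc('\n', stderr);
    }
    return 1;
}
```

Taking separate strides for the two buffers lets a test compare a tightly packed reference against an output that was written into a padded frame.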
* checkasm/vf_eq: add test for vf_eq (Ting Fu, 2019-09-26)
    Signed-off-by: Ting Fu <ting.fu@intel.com>
    Signed-off-by: Ruiling Song <ruiling.song@intel.com>
* checkasm: add opusdsp tests (Lynne, 2019-09-11)
* checkasm/vf_gblur: add test for horiz_slice simd (Ruiling Song, 2019-06-12)
    Signed-off-by: Ruiling Song <ruiling.song@intel.com>
* checkasm: add test for v210dec (James Darnley, 2019-05-02)
* checkasm: add an af_afir test (James Almer, 2019-01-03)
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* checkasm: add vf_nlmeans test for ssd_integral_image (Clément Bœsch, 2018-05-08)
* checkasm/swscale : add test for rgb shuffle_bytes func (Martin Vignali, 2018-03-24)
* checkasm/hevc_sao : add hevc_sao for checkasm (Yingming Fan, 2018-03-07)
    Signed-off-by: James Almer <jamrial@gmail.com>
* checkasm : add test for losslessvideoencdsp for diff bytes and sub_left_pred (Martin Vignali, 2018-01-28)
* Revert "checkasm/vf_interlace : add test for lowpass_line 8 and 16" (James Almer, 2017-12-19)
    This reverts commit adff97be5e2ff51c0bb66080c2f904ed40b6c571.

    It currently fails on Windows targets.

    Signed-off-by: James Almer <jamrial@gmail.com>
* checkasm/vf_interlace : add test for lowpass_line 8 and 16 (Martin Vignali, 2017-12-19)
* checkasm/vf_hflip : add test for vf_hflip byte and short simd (Martin Vignali, 2017-12-13)
* checkasm/vf_threshold : add checkasm test for threshold8 (Martin Vignali, 2017-12-03)
* checkasm : add test for huffyuvdsp add_int16 (Martin Vignali, 2017-11-21)
* checkasm : add utvideodsp test (Martin Vignali, 2017-11-21)
* checkasm: add an exrdsp test (James Almer, 2017-09-17)
    Signed-off-by: James Almer <jamrial@gmail.com>
* checkasm: use perf API on Linux ARM* (Clément Bœsch, 2017-09-08)
    On ARM platforms, accessing the PMU registers requires special user
    access permissions. Since there is no other way to get accurate timers,
    the current implementation of timers in FFmpeg relies on these registers.
    Unfortunately, enabling user access to these registers on Linux is not
    trivial, and generally involves compiling a random and unreliable GitHub
    kernel module, or somehow patching your kernel. Such a module is very
    unlikely to reach upstream anytime soon. Quoting Robin Murphin from ARM:

    > Say you do give userspace direct access to the PMU; now run two or more
    > programs at once that believe they can use the counters for their own
    > "minimal-overhead" profiling. Have fun interpreting those results...
    >
    > And that's not even getting into the implications of scheduling across
    > different CPUs, CPUidle, etc. where the PMU state is completely beyond
    > userspace's control. In general, the plan to provide userspace with
    > something which might happen to just about work in a few corner cases,
    > but is meaningless, misleading or downright broken in all others, is to
    > never do so.

    As a result, the alternative is to use the Performance Monitoring Linux
    API which makes use of these registers internally (assuming the PMU of
    your ARM board is supported in the kernel, which is definitely not a
    given...).

    While the Linux API is obviously cross platform, it does have a
    significant overhead which needs to be taken into account. As a result,
    that mode is only weakly enabled on ARM platforms exclusively.

    Note on the inflexibility of the implementation: the timers (native
    FFmpeg vs Linux API) are selected at compilation time to avoid the need
    for function calls, which would have a negative impact on the cycle
    counters.
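For readers unfamiliar with the Linux performance monitoring API mentioned in this commit, here is a minimal sketch of counting CPU cycles through `perf_event_open`. It is not the checkasm implementation: the helper names `cycles_open`, `cycles_measure` and `busy_work` are hypothetical, and on non-Linux systems the helpers are stubs that simply fail. The counter may also be unavailable on Linux (restricted permissions, no PMU support), in which case `cycles_open` returns -1.

```c
#include <stdint.h>
#include <string.h>

#ifdef __linux__
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Open a user-space-only CPU cycle counter for the calling thread.
 * Returns a file descriptor, or -1 if the counter cannot be opened. */
int cycles_open(void)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.type           = PERF_TYPE_HARDWARE;
    attr.size           = sizeof(attr);
    attr.config         = PERF_COUNT_HW_CPU_CYCLES;
    attr.disabled       = 1;  /* start stopped, enable around the measured call */
    attr.exclude_kernel = 1;
    attr.exclude_hv     = 1;

    /* pid = 0 (this thread), cpu = -1 (any), group_fd = -1, flags = 0 */
    return (int)syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
}

/* Count the cycles spent in one call of fn.
 * Returns 0 and stores the count in *cycles, or -1 on failure. */
int cycles_measure(int fd, void (*fn)(void), uint64_t *cycles)
{
    if (fd < 0)
        return -1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    fn();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    if (read(fd, cycles, sizeof(*cycles)) != (ssize_t)sizeof(*cycles))
        return -1;
    return 0;
}
#else
/* Stubs for platforms without the Linux perf API. */
int cycles_open(void)
{
    return -1;
}

int cycles_measure(int fd, void (*fn)(void), uint64_t *cycles)
{
    (void)fd; (void)fn; (void)cycles;
    return -1;
}
#endif

/* Small workload to measure (illustrative only). */
void busy_work(void)
{
    volatile unsigned sum = 0;
    unsigned i;

    for (i = 0; i < 1000; i++)
        sum += i;
}
```

The two `ioctl` calls and the `read` are themselves part of the measured overhead, which is the cost the commit message warns about. This sketch also calls the workload through a function pointer for brevity, whereas the commit selects the timer at compilation time precisely to avoid such extra calls.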
* checkasm: add a g722dsp test (James Almer, 2017-07-13)
    Signed-off-by: James Almer <jamrial@gmail.com>
* checkasm: add sbrdsp tests (Matthieu Bouron, 2017-07-03)
* checkasm: add AAC PS tests (Clément Bœsch, 2017-06-28)
    This includes various fixes and improvements from James Almer.

    Signed-off-by: James Almer <jamrial@gmail.com>
* build: Generalize yasm/nasm-related variable names (Diego Biurrun, 2017-06-21)
    None of them are specific to the YASM assembler.

    (Cherry-picked from libav commit 39e208f4d4756367c7cd2d581847e0c1b8a429c1)
    Signed-off-by: James Almer <jamrial@gmail.com>
* checkasm: add float_dsp tests (James Almer, 2017-06-14)
    Ported from libavutil/tests/float_dsp.c

    Signed-off-by: James Almer <jamrial@gmail.com>
* checkasm: add a checkasm_checked_call function that doesn't issue emms (James Almer, 2017-06-14)
    Meant for DSP functions returning a float or double, as they'd fail if
    emms is called after every run on x86_32.

    Signed-off-by: James Almer <jamrial@gmail.com>
* checkasm: add fixed_dsp tests (James Almer, 2017-04-11)
    Tested-by: Michael Niedermayer <michael@niedermayer.cc>
    Signed-off-by: James Almer <jamrial@gmail.com>
* Merge commit 'ed48a9d8143d2575a4458589cebde69ec326afd8' (Clément Bœsch, 2017-03-24)
|\
    * commit 'ed48a9d8143d2575a4458589cebde69ec326afd8':
      checkasm: Add a test for HEVC add_residual

    Merged-by: Clément Bœsch <u@pkh.me>
| * checkasm: Add a test for HEVC add_residual (Alexandra Hájková, 2016-10-22)
* | Merge commit 'c91d6a33f872574c95c8784277cf60ffcf6bff4f' (James Almer, 2017-03-23)
|\|
    * commit 'c91d6a33f872574c95c8784277cf60ffcf6bff4f':
      checkasm: aarch64: Add filler args to make sure all parameters are passed on the stack

    Merged-by: James Almer <jamrial@gmail.com>
| * checkasm: aarch64: Add filler args to make sure all parameters are passed on the stack (Martin Storsjö, 2016-10-16)
    This, combined with clobbering the stack space prior to the call,
    increases the chances of finding cases where 32 bit parameters are
    erroneously treated as 64 bit.

    Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit 'f1b3e131385176c3c9d9783b25047856a0dcebf6' (James Almer, 2017-03-23)
|\|
    * commit 'f1b3e131385176c3c9d9783b25047856a0dcebf6':
      checkasm: aarch64: Clobber the stack before calling functions

    Merged-by: James Almer <jamrial@gmail.com>
| * checkasm: aarch64: Clobber the stack before calling functions (Martin Storsjö, 2016-10-16)
    Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit '22c3ab18646924ce24dc6017a9e882ff69689e40' (Clément Bœsch, 2017-03-22)
|\|
    * commit '22c3ab18646924ce24dc6017a9e882ff69689e40':
      checkasm: Add test for huffyuvdsp add_bytes

    huffyuvdsp is renamed to llviddsp to be consistent with our codebase.

    Note: af607b7e07 wasn't actually required for this test since this
    commit is not actually testing huffyuvdsp.

    Merged-by: Clément Bœsch <u@pkh.me>
| * checkasm: Add test for huffyuvdsp add_bytes (Alexandra Hájková, 2016-10-02)
    Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* | Merge commit 'e9ef6171396dc4106526aaa86b620c61ca3d1017' (Clément Bœsch, 2017-03-20)
|\|
    * commit 'e9ef6171396dc4106526aaa86b620c61ca3d1017':
      checkasm: add tests for audiodsp

    Merged-by: Clément Bœsch <u@pkh.me>
| * checkasm: add tests for audiodsp (Anton Khirnov, 2016-09-22)
* | Merge commit '2eb97af66af90ca3978229da151f0b8b3a5d9370' (Clément Bœsch, 2017-03-20)
|\|
    * commit '2eb97af66af90ca3978229da151f0b8b3a5d9370':
      checkasm: add a test for blockdsp

    Merged-by: Clément Bœsch <u@pkh.me>
| * checkasm: add a test for blockdsp (Anton Khirnov, 2016-09-22)
| * checkasm: add vp9 MC tests. (Ronald S. Bultje, 2016-08-03)
    Signed-off-by: Anton Khirnov <anton@khirnov.net>
* | Merge commit '9064777dbb335ab4809ae09e3fdcc0245f925cdc' (Clément Bœsch, 2017-02-02)
|\|
    * commit '9064777dbb335ab4809ae09e3fdcc0245f925cdc':
      checkasm: add HEVC test for testing IDCT DC

    Merged-by: Clément Bœsch <cboesch@gopro.com>