path: root/libavcodec/x86/hevcdsp.h
Commit message | Author | Age
* avcodec/x86/hevc_mc: add qpel_h64_8_avx512icl | Wu Jianhua | 2022-04-24
  ff_hevc_put_hevc_qpel_h64_8_sse4      56782981
  ff_hevc_put_hevc_qpel_h64_8_avx2      40097816
  ff_hevc_put_hevc_qpel_h64_8_avx512icl 25488576
  Reviewed-by: Henrik Gramner <henrik@gramner.com>
  Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
* avcodec/x86/hevc_mc: add qpel_h32_8_avx512icl | Wu Jianhua | 2022-04-24
  ff_hevc_put_hevc_qpel_h32_8_sse4      14122151
  ff_hevc_put_hevc_qpel_h32_8_avx2      9337675
  ff_hevc_put_hevc_qpel_h32_8_avx512icl 6424654
  Reviewed-by: Henrik Gramner <henrik@gramner.com>
  Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
* avcodec/x86/hevc_mc: add qpel_h4_8_avx512icl | Wu Jianhua | 2022-04-24
  ff_hevc_put_hevc_qpel_h4_8_sse4      993694
  ff_hevc_put_hevc_qpel_h4_8_avx512icl 686647
  Reviewed-by: Henrik Gramner <henrik@gramner.com>
  Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
* avcodec/x86/hevc_mc: add qpel_h16_8_avx512icl | Wu Jianhua | 2022-04-24
  ff_hevc_put_hevc_qpel_h16_8_sse4      3290870
  ff_hevc_put_hevc_qpel_h16_8_avx512icl 1730033
  Reviewed-by: Henrik Gramner <henrik@gramner.com>
  Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
* avcodec/x86/hevc_mc: add qpel_h8_8_avx512icl and qpel_hv8_8_avx512icl | Wu Jianhua | 2022-04-24
  This commit uses the instruction `vpdpbusd` introduced by AVX512 VNNI to calculate the horizontal filter.
  ff_hevc_put_hevc_qpel_h8_8_sse4       1039169
  ff_hevc_put_hevc_qpel_h8_8_avx512icl  677153
  ff_hevc_put_hevc_qpel_hv8_8_sse4      3603511
  ff_hevc_put_hevc_qpel_hv8_8_avx512icl 2995354
  Reviewed-by: Henrik Gramner <henrik@gramner.com>
  Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
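Note on `vpdpbusd`: the commit above says the horizontal filter is computed with this instruction. The following is a minimal scalar sketch of what one 32-bit lane of `vpdpbusd` does (four unsigned bytes times four signed bytes, summed into an accumulator without saturation), and of how an 8-tap filter could be split into two such steps. The helper names `dpbusd_lane` and `qpel_h_sample` are hypothetical, the tap table is the HEVC luma quarter-sample filter, and the data layout is an assumption chosen for readability, not the arrangement used in the actual assembly.

```c
#include <stdint.h>

/* Scalar model of one 32-bit lane of vpdpbusd:
 * acc += sum over i of (unsigned byte u[i]) * (signed byte s[i]), no saturation. */
static int32_t dpbusd_lane(int32_t acc, const uint8_t u[4], const int8_t s[4])
{
    for (int i = 0; i < 4; i++)
        acc += (int32_t)u[i] * (int32_t)s[i];
    return acc;
}

/* HEVC luma 8-tap filter for the quarter-sample position (taps sum to 64). */
static const int8_t qpel_taps_frac1[8] = { -1, 4, -10, 58, 17, -5, 1, 0 };

/* Hypothetical helper: filters one output sample from 8 neighbouring 8-bit
 * pixels. src points at the leftmost of the 8 input pixels. Two dot-product
 * steps cover taps 0..3 and 4..7. */
static int32_t qpel_h_sample(const uint8_t *src, const int8_t taps[8])
{
    int32_t acc = 0;
    acc = dpbusd_lane(acc, src,     taps);
    acc = dpbusd_lane(acc, src + 4, taps + 4);
    return acc;
}
```

With pixels as the unsigned operand and taps as the signed operand, the mixed signedness of the instruction matches the filter's inputs, so no widening step is needed before the multiply.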
* Merge commit '6d5636ad9ab6bd9bedf902051d88b7044385f88b' | Clément Bœsch | 2017-03-24
  * commit '6d5636ad9ab6bd9bedf902051d88b7044385f88b':
    hevc: x86: Add add_residual() SIMD optimizations
  See a6af4bf64dae46356a5f91537a1c8c5f86456b37
  This merge is only cosmetics (renames, space shuffling, etc). The functional changes in the ASM are *not* merged:
  - unrolling with %rep is kept
  - ADD_RES_MMX_4_8 is left untouched: this needs investigation
  Merged-by: Clément Bœsch <u@pkh.me>
* Merge commit 'fca3c3b61952aacc45e9ca54d86a762946c21942' | Clément Bœsch | 2017-01-31
  * commit 'fca3c3b61952aacc45e9ca54d86a762946c21942':
    hevc: Add AVX2 DC IDCT
  Mostly noop as we already have that code.
  In the ASM, code is merged with the exception of SECTION which is kept uppercase for consistency with the rest of the codebase.
  Still in the ASM, the prototype comment is fixed to honor the '_' added from the original commit.
  idct_dc_proto() is dropped as it's not used anymore here.
  Merged-by: Clément Bœsch <cboesch@gopro.com>
* Merge commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d' | Clément Bœsch | 2017-01-31
  * commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d':
    hevc: Separate adding residual to prediction from IDCT
  This commit should be a noop but isn't because of the following renames:
  - transform_add → add_residual
  - transform_skip → dequant
  - idct_4x4_luma → transform_4x4_luma
  Merged-by: Clément Bœsch <cboesch@gopro.com>
* x86: hevc: remove a parameter to WP internals | Christophe Gisquet | 2015-02-14
  The second stride is always the internal buffer one, MAX_PB_SIZE (times 2 to get the value in bytes).
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
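Note on the fixed stride: the "times 2" in the commit above follows from the intermediate buffer holding 16-bit samples. A minimal sketch of that arithmetic, assuming `MAX_PB_SIZE` is 64 and the buffer element type is `int16_t` (both are assumptions made for this example and should be checked against the tree being read); `tmp_stride_bytes` and `tmp_row` are hypothetical helpers, not FFmpeg functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed value for illustration; verify against hevcdsp.h in your checkout. */
#define MAX_PB_SIZE 64

/* Row stride of the intermediate buffer in bytes:
 * MAX_PB_SIZE elements of 2 bytes each. */
static ptrdiff_t tmp_stride_bytes(void)
{
    return (ptrdiff_t)(MAX_PB_SIZE * sizeof(int16_t));
}

/* Address of row y when the stride is known at compile time, so no
 * stride argument has to be passed around. */
static int16_t *tmp_row(int16_t *tmp, int y)
{
    return tmp + (ptrdiff_t)y * MAX_PB_SIZE;
}
```

Because the stride never varies, a function that takes it as a parameter carries a constant argument on every call, which is what removing the parameter avoids.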
* x86: hevc_mc: add AVX2 optimizations | Pierre Edouard Lepere | 2015-02-06
  before
  33304 decicycles in luma_bi_1, 523066 runs, 1222 skips
  38138 decicycles in luma_bi_2, 523427 runs, 861 skips
  13490 decicycles in luma_uni, 516138 runs, 8150 skips
  after
  20185 decicycles in luma_bi_1, 519970 runs, 4318 skips
  24620 decicycles in luma_bi_2, 521024 runs, 3264 skips
  10397 decicycles in luma_uni, 515715 runs, 8573 skips
  Conflicts:
    libavcodec/x86/hevc_mc.asm
    libavcodec/x86/hevcdsp_init.c
  Reviewed-by: James Almer <jamrial@gmail.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* x86/hevc_res_add: add ff_hevc_transform_add32_8_avx2 | James Almer | 2014-09-04
  ~20% faster than AVX.
  Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
  Signed-off-by: James Almer <jamrial@gmail.com>
* hevcdsp: remove more instances of compile-time-fixed parameters | Christophe Gisquet | 2014-08-22
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* hevcdsp: remove compilation-time-fixed parameter | Christophe Gisquet | 2014-08-22
  The dststride parameter is always MAX_PB_SIZE.
  Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* x86/hecv_res_add: add ff_hevc_transform_add{8,16,32}_8_avx | James Almer | 2014-08-20
  ~15% faster than sse2
  Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
  Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
  Signed-off-by: James Almer <jamrial@gmail.com>
* x86: hevc: adding transform_add | Pierre Edouard Lepere | 2014-08-20
  Reviewed-by: James Almer <jamrial@gmail.com>
  Approved-by: Ronald S. Bultje
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* x86/hevc_idct: replace old and unused idct functions | James Almer | 2014-07-26
  Only 8-bit and 10-bit idct_dc() functions are included (adding others should be trivial).
  Benchmarks on an Intel Core i5-4200U:
  idct8x8_dc    SSE2  MMXEXT  C
  cycles        22    26      57
  idct16x16_dc  AVX2  SSE2    C
  cycles        27    32      249
  idct32x32_dc  AVX2  SSE2    C
  cycles        62    126     1375
  Signed-off-by: James Almer <jamrial@gmail.com>
  Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* x86/hevc: add 12bits support for MC | Mickaël Raulet | 2014-07-26
  cherry picked from commit 3fcb7a4595a6f40100a22110a5805e3b7510c0fd
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/x86/hevc: add avx2 dc idct | plepere | 2014-06-25
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avcodec/hevc: new idct + asm | plepere | 2014-06-17
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* hevcdsp: include stddef.h for ptrdiff_t definition | James Almer | 2014-05-10
  Including stdint.h was enough for systems like Mingw, but apparently not for Linux.
  This should fix make checkheaders failures on every platform
  Signed-off-by: James Almer <jamrial@gmail.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
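Note on the include fix: in standard C, `ptrdiff_t` is declared by `<stddef.h>`, while `<stdint.h>` only guarantees the fixed-width integer types, so a header that uses both has to include both to compile on its own. A minimal sketch of a self-sufficient header-style fragment follows; `example_add_rows` is a hypothetical function written for this illustration, not a prototype from hevcdsp.h.

```c
#include <stddef.h> /* ptrdiff_t */
#include <stdint.h> /* uint8_t, int16_t */

/* Hypothetical example: adds a w x h block of 16-bit residuals to 8-bit
 * pixels with clipping. stride is the distance between dst rows in bytes,
 * which is why it is typed as ptrdiff_t. */
static void example_add_rows(uint8_t *dst, const int16_t *res,
                             ptrdiff_t stride, int w, int h)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int v = dst[x] + res[x];
            dst[x] = (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
        }
        dst += stride; /* next destination row */
        res += w;      /* residuals are stored densely */
    }
}
```

Relying on another header to pull in `<stddef.h>` indirectly works only on platforms where that happens to be the case, which is consistent with the Mingw versus Linux difference described in the commit message.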
* hevcdsp: add missing header include | James Almer | 2014-05-10
  Fixes make checkheaders
  Signed-off-by: James Almer <jamrial@gmail.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* hvcodec/x86/hevcdsp: make macros more modular to support functions that are not sse4 | plepere | 2014-05-09
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* HEVC : added assembly MC functions | plepere | 2014-05-06
  pretty print x86
  Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>