path: root/libavutil/x86/pixelutils.asm
Commit message (Author, Age)
* avutil/x86/pixelutils: Remove obsolete MMX(EXT) functions (Andreas Rheinhardt, 2022-06-22)
  x64 always has MMX, MMXEXT, SSE and SSE2, which means that some functions for MMX, MMXEXT, SSE and 3dnow are always overridden by other functions (unless one e.g. explicitly disables SSE2). Given that the only systems which benefit from the 8x8 MMX (overridden by MMXEXT) or the 16x16 MMXEXT (overridden by SSE2) are truly ancient 32-bit x86s, they are removed.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
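The override behaviour described in this commit can be sketched in C roughly as follows. This is a minimal illustration, not the actual FFmpeg init code: the flag names (`CPU_MMXEXT`, `CPU_SSE2`), the `sad_fn` type and all function names are hypothetical, and the "SIMD" variants are plain C stand-ins. It only shows why a later, wider check makes the earlier assignment dead on any CPU that always has SSE2.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical CPU feature bits, for illustration only. */
enum { CPU_MMX = 1, CPU_MMXEXT = 2, CPU_SSE2 = 4 };

typedef int (*sad_fn)(const uint8_t *a, ptrdiff_t sa,
                      const uint8_t *b, ptrdiff_t sb);

/* Scalar 16x16 sum of absolute differences. */
static int sad_16x16_c(const uint8_t *a, ptrdiff_t sa,
                       const uint8_t *b, ptrdiff_t sb)
{
    int sum = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sum += abs(a[x] - b[x]);
        a += sa;
        b += sb;
    }
    return sum;
}

/* Stand-ins for the assembly versions; they just call the C code. */
static int sad_16x16_mmxext(const uint8_t *a, ptrdiff_t sa,
                            const uint8_t *b, ptrdiff_t sb)
{
    return sad_16x16_c(a, sa, b, sb);
}

static int sad_16x16_sse2(const uint8_t *a, ptrdiff_t sa,
                          const uint8_t *b, ptrdiff_t sb)
{
    return sad_16x16_c(a, sa, b, sb);
}

/* Later checks overwrite earlier ones, so with SSE2 always present
 * the MMXEXT assignment is never the final choice. */
static sad_fn select_sad_16x16(int cpu_flags)
{
    sad_fn fn = sad_16x16_c;
    if (cpu_flags & CPU_MMXEXT)
        fn = sad_16x16_mmxext;
    if (cpu_flags & CPU_SSE2)
        fn = sad_16x16_sse2;
    return fn;
}
```

With all three flags set, as on any x64 machine, `select_sad_16x16` returns the SSE2 variant, so the MMXEXT function is only ever selected on a 32-bit CPU that lacks SSE2.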
* libavutil: include assembly with full path from source root (Alexander Kanavin, 2022-02-08)
  Otherwise nasm writes the full host-specific paths into .o output, which breaks binary reproducibility.
  Signed-off-by: Alexander Kanavin <alex.kanavin@gmail.com>
  Signed-off-by: Anton Khirnov <anton@khirnov.net>
* x86/pixelutils: add missing preprocessor wrapper to the AVX2 functions (James Almer, 2018-07-31)
  Should fix compilation with old yasm/nasm
  Signed-off-by: James Almer <jamrial@gmail.com>
* avutil/pixelutils: sad_32x32 sse2/avx2 optimizations. (Jun Zhao, 2018-07-31)
  add ff_pixelutils_sad_32x32_sse2, ff_pixelutils_sad_{a,u}_32x32_sse2, ff_pixelutils_sad_32x32_avx2, ff_pixelutils_sad_{a,u}_32x32_avx2

  Profiling with perf record/report, instructions:u for avx2 sad_32x32:

    72.05%  pixelutils  pixelutils  [.] block_sad_32x32_c
    18.50%  pixelutils  pixelutils  [.] block_sad_16x16_c
     4.78%  pixelutils  pixelutils  [.] block_sad_8x8_c
     2.69%  pixelutils  pixelutils  [.] block_sad_4x4_c
     0.89%  pixelutils  pixelutils  [.] block_sad_2x2_c
     0.16%  pixelutils  pixelutils  [.] ff_pixelutils_sad_32x32_avx2
     0.16%  pixelutils  pixelutils  [.] ff_pixelutils_sad_u_32x32_avx2
     0.12%  pixelutils  pixelutils  [.] ff_pixelutils_sad_a_32x32_avx2

  instructions:u for sse2 sad_32x32:

    71.86%  pixelutils  pixelutils  [.] block_sad_32x32_c
    18.42%  pixelutils  pixelutils  [.] block_sad_16x16_c
     4.81%  pixelutils  pixelutils  [.] block_sad_8x8_c
     2.68%  pixelutils  pixelutils  [.] block_sad_4x4_c
     0.88%  pixelutils  pixelutils  [.] block_sad_2x2_c
     0.29%  pixelutils  pixelutils  [.] ff_pixelutils_sad_32x32_sse2
     0.26%  pixelutils  pixelutils  [.] ff_pixelutils_sad_u_32x32_sse2
     0.23%  pixelutils  pixelutils  [.] ff_pixelutils_sad_a_32x32_sse2

  Signed-off-by: Jun Zhao <mypopydev@gmail.com>
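For context, the quantity that the `block_sad_*_c` and `ff_pixelutils_sad_*` symbols in the profile compute is a sum of absolute differences over a square block. A scalar reference can be sketched as below. The names `sad_ref` and `sad_32x32_ref` are hypothetical and the code is an assumption about the general shape of such a function, not a copy of the FFmpeg source.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences over a w x h block of 8-bit samples.
 * stride1/stride2 are the distances in bytes between consecutive rows. */
static int sad_ref(const uint8_t *src1, ptrdiff_t stride1,
                   const uint8_t *src2, ptrdiff_t stride2,
                   int w, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            sum += abs(src1[x] - src2[x]);
        src1 += stride1;
        src2 += stride2;
    }
    return sum;
}

/* 32x32 case: 1024 byte differences per call, which is why a scalar
 * version dominates an instruction-count profile. */
static int sad_32x32_ref(const uint8_t *src1, ptrdiff_t stride1,
                         const uint8_t *src2, ptrdiff_t stride2)
{
    return sad_ref(src1, stride1, src2, stride2, 32, 32);
}
```

The worst case for a 32x32 block is 1024 * 255 = 261120, so a plain `int` accumulator is sufficient.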
* avutil/pixelutils: correct the function name in comments (Jun Zhao, 2018-07-11)
  Signed-off-by: Jun Zhao <mypopydev@gmail.com>
* x86inc: Drop SECTION_TEXT macro (Henrik Gramner, 2015-08-04)
  The .text section is already 16-byte aligned by default on all supported platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.
* avutil/pixelutils: faster pixelutils_sad_16x16 (Clément Bœsch, 2014-08-23)
  501 to 439 decicycles. See 45c7f3997ea11c3d1007b2126b1c0049a8c27105.
* avutil/pixelutils: faster pixelutils_sad_[au]_16x16 (Clément Bœsch, 2014-08-23)
  ~560 → ~500 decicycles

  This follows the comments from Michael in https://ffmpeg.org/pipermail/ffmpeg-devel/2014-August/160599.html

  Using 2 registers for the accumulator didn't help. On the other hand, some re-ordering between the movs and psadbw allowed going from ~538 to ~500.
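The single-accumulator structure mentioned here can be sketched with SSE2 intrinsics, where `_mm_sad_epu8` maps to `psadbw`. This is an assumption-laden illustration rather than the committed assembly: the function name `sad_16x16_sketch` is hypothetical, the loads are unaligned for simplicity, and a C compiler is free to reschedule the instructions, so the source order only hints at the "loads first, then psadbw" arrangement. A scalar fallback is included for targets where `__SSE2__` is not defined.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* 16x16 SAD using one accumulator register and two rows per iteration. */
static int sad_16x16_sketch(const uint8_t *src1, ptrdiff_t stride1,
                            const uint8_t *src2, ptrdiff_t stride2)
{
#if defined(__SSE2__)
    __m128i acc = _mm_setzero_si128();
    for (int y = 0; y < 16; y += 2) {
        /* Group the loads, then the psadbw operations. */
        __m128i a0 = _mm_loadu_si128((const __m128i *)src1);
        __m128i a1 = _mm_loadu_si128((const __m128i *)(src1 + stride1));
        __m128i b0 = _mm_loadu_si128((const __m128i *)src2);
        __m128i b1 = _mm_loadu_si128((const __m128i *)(src2 + stride2));
        /* psadbw leaves one partial sum in each 64-bit half. */
        acc = _mm_add_epi32(acc, _mm_sad_epu8(a0, b0));
        acc = _mm_add_epi32(acc, _mm_sad_epu8(a1, b1));
        src1 += 2 * stride1;
        src2 += 2 * stride2;
    }
    /* Fold the high half onto the low half and extract the total. */
    acc = _mm_add_epi32(acc, _mm_srli_si128(acc, 8));
    return _mm_cvtsi128_si32(acc);
#else
    /* Portable fallback when SSE2 is not available at compile time. */
    int sum = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sum += abs(src1[x] - src2[x]);
        src1 += stride1;
        src2 += stride2;
    }
    return sum;
#endif
}
```

Each 64-bit half accumulates at most 8 * 255 * 16 = 32640 over the block, so the running sums cannot overflow before the final fold.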
* avutil: add pixelutils API (Clément Bœsch, 2014-08-05)