path: root/libavcodec/aarch64/me_cmp_init_aarch64.c
Commit message | Author | Age
* lavc/aarch64: Add neon implementation for pix_abs16_y2 | Hubert Mazur | 2022-08-18

    Provide optimized implementation of pix_abs16_y2 function for arm64.

    Performance comparison tests are shown below.
    - pix_abs_0_2_c: 317.2
    - pix_abs_0_2_neon: 37.5

    Benchmarks and tests run with checkasm tool on AWS Graviton 3.

    Signed-off-by: Hubert Mazur <hum@semihalf.com>
    Signed-off-by: Martin Storsjö <martin@martin.st>

* lavc/aarch64: Add neon implementation for sse4 | Hubert Mazur | 2022-08-18

    Provide neon implementation for sse4 function.

    Performance comparison tests are shown below.
    - sse_2_c: 80.7
    - sse_2_neon: 31.0

    Benchmarks and tests are run with checkasm tool on AWS Graviton 3.

    Signed-off-by: Hubert Mazur <hum@semihalf.com>
    Signed-off-by: Martin Storsjö <martin@martin.st>

* lavc/aarch64: Add neon implementation for sse16 | Hubert Mazur | 2022-08-18

    Provide neon implementation for sse16 function.

    Performance comparison tests are shown below.
    - sse_0_c: 268.2
    - sse_0_neon: 43.5

    Benchmarks and tests run with checkasm tool on AWS Graviton 3.

    Signed-off-by: Hubert Mazur <hum@semihalf.com>
    Signed-off-by: Martin Storsjö <martin@martin.st>

* aarch64: me_cmp: Fix the indentation of function declarations | Martin Storsjö | 2022-08-18

    Signed-off-by: Martin Storsjö <martin@martin.st>

* avcodec/me_cmp: Constify me_cmp_func buffer parameters | Andreas Rheinhardt | 2022-07-31

    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

* lavc/aarch64: Add pix_abs16_x2 neon implementation | Hubert Mazur | 2022-07-13

    Provide neon implementation for pix_abs16_x2 function.

    Performance tests of implementation are below.
    - pix_abs_0_1_c: 283.5
    - pix_abs_0_1_neon: 39.0

    Benchmarks and tests run with checkasm tool on AWS Graviton 3.

    Signed-off-by: Hubert Mazur <hum@semihalf.com>
    Signed-off-by: Martin Storsjö <martin@martin.st>

* lavc/aarch64: Hook up the existing ff_pix_abs16_neon to the sad[0] function pointer | Hubert Mazur | 2022-07-11

    Signed-off-by: Hubert Mazur <hum@semihalf.com>
    Signed-off-by: Martin Storsjö <martin@martin.st>

* lavc/aarch64: motion estimation functions in neon | Swinney, Jonathan | 2022-06-28

    - ff_pix_abs16_neon
    - ff_pix_abs16_xy2_neon

    In direct micro benchmarks of these ff functions versus their C
    implementations, these functions performed as follows on AWS Graviton 3.

    ff_pix_abs16_neon:
    pix_abs_0_0_c: 141.1
    pix_abs_0_0_neon: 19.6

    ff_pix_abs16_xy2_neon:
    pix_abs_0_3_c: 269.1
    pix_abs_0_3_neon: 39.3

    Tested with:
    ./tests/checkasm/checkasm --test=motion --bench --disable-linux-perf

    Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
    Signed-off-by: Martin Storsjö <martin@martin.st>
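
For readers unfamiliar with the metrics named in this history, the following is a minimal scalar sketch of what a 16-pixel-wide sum of absolute differences (the metric behind pix_abs16) and a sum of squared errors (the metric behind sse16) compute. It is not FFmpeg's code: the names `sad16_ref` and `sse16_ref` and their simplified signatures are hypothetical, chosen only for illustration, and the half-pel variants (x2, y2, xy2) are not covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reference: sum of absolute differences over a block that is
 * 16 pixels wide and h rows tall. stride is the distance in bytes between
 * the starts of consecutive rows in both buffers. */
int sad16_ref(const uint8_t *pix1, const uint8_t *pix2, ptrdiff_t stride, int h)
{
    int sum = 0;

    assert(h >= 0);
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < 16; x++)
            sum += abs(pix1[x] - pix2[x]);
        pix1 += stride;
        pix2 += stride;
    }
    return sum;
}

/* Hypothetical reference: sum of squared errors over the same block shape. */
int sse16_ref(const uint8_t *pix1, const uint8_t *pix2, ptrdiff_t stride, int h)
{
    int sum = 0;

    assert(h >= 0);
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < 16; x++) {
            int d = pix1[x] - pix2[x];
            sum += d * d;
        }
        pix1 += stride;
        pix2 += stride;
    }
    return sum;
}
```

A NEON version processes a whole 16-byte row per vector load instead of one pixel per iteration, which is consistent with the kind of speedups the checkasm figures above report.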