path: root/libavcodec/x86/vp9mc.asm
Commit message [Author, Age]
* vp9: don't overread by 4 pixels in ff_vp9_avg4_mmxext(). [Ronald S. Bultje, 2022-06-01]
|     If the block is at the end of the allocated buffer and there is no padding, this will over-read, which may cause crashes.
|
|     Reported by Firefox.
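|
|     As a rough illustration of the constraint described above, the following scalar C sketch averages a 4-pixel-wide block while reading exactly 4 bytes per row. It is not the assembly from this commit: the function name avg4_rows and its signature are hypothetical, and the rounding (a + b + 1) >> 1 is the one the pavgb instruction performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a 4-pixel-wide "avg" step; not FFmpeg's code.
 * Every row reads exactly 4 bytes from src and dst, so a block that ends at
 * the last byte of an unpadded buffer is never read past its end. */
static void avg4_rows(uint8_t *dst, ptrdiff_t dst_stride,
                      const uint8_t *src, ptrdiff_t src_stride, int h)
{
    assert(dst != NULL && src != NULL && h >= 0);

    for (int y = 0; y < h; y++) {
        uint8_t s[4], d[4];

        memcpy(s, src, 4);              /* 4-byte load, no wider */
        memcpy(d, dst, 4);
        for (int x = 0; x < 4; x++)     /* rounded average, as pavgb does */
            d[x] = (uint8_t)((d[x] + s[x] + 1) >> 1);
        memcpy(dst, d, 4);              /* 4-byte store */

        src += src_stride;
        dst += dst_stride;
    }
}
```

|     A SIMD version would need the same property: a load wider than 4 bytes is only safe when the caller guarantees padding after the block.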
* Merge commit 'e99ecda55082cb9dde8fd349361e169dc383943a' [Clément Bœsch, 2017-03-16]
|\
| |   * commit 'e99ecda55082cb9dde8fd349361e169dc383943a':
| |       checkasm: add vp9 MC tests.
| |       vp9mc/x86: sse2 MC assembly.
| |       vp9mc/x86: add AVX and AVX2 MC
| |       vp9mc/x86: rename ff_* to ff_vp9_*
| |       vp9mc/x86: rename ff_avg[48]_sse to ff_avg[48]_mmxext
| |       vp9mc/x86: simplify a few inits.
| |       vp9mc/x86: add 16px functions (64bit only).
| |
| |   Noop (aside from a formatting comment in vp9mc.asm). We already have all of this.
| |
| |   We should consider making a final diff between the two projects when the dust comes down.
| |
| |   Merged-by: Clément Bœsch <u@pkh.me>
| * vp9mc/x86: sse2 MC assembly. [Ronald S. Bultje, 2016-08-03]
| |   Also a slight change to the ssse3 code, which prevents a theoretical overflow in the sharp filter.
| |
| |   Signed-off-by: Anton Khirnov <anton@khirnov.net>
| * vp9mc/x86: add AVX and AVX2 MC [James Almer, 2016-08-03]
| |   Roughly 25% faster MC than ssse3 for blocksizes 32 and 64.
| |
| |   Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
| |   Signed-off-by: James Almer <jamrial@gmail.com>
| |   Signed-off-by: Anton Khirnov <anton@khirnov.net>
| * vp9mc/x86: rename ff_* to ff_vp9_* [Clément Bœsch, 2016-08-03]
| |   Signed-off-by: Anton Khirnov <anton@khirnov.net>
| * vp9mc/x86: rename ff_avg[48]_sse to ff_avg[48]_mmxext [James Almer, 2016-08-03]
| |   pavgb is an sse integer instruction, so the mmxext flag is enough
| |
| |   Signed-off-by: James Almer <jamrial@gmail.com>
| |   Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
| |   Signed-off-by: Anton Khirnov <anton@khirnov.net>
| * vp9mc/x86: simplify a few inits. [Clément Bœsch, 2016-08-03]
| |   Signed-off-by: Anton Khirnov <anton@khirnov.net>
| * vp9mc/x86: add 16px functions (64bit only). [Ronald S. Bultje, 2016-08-03]
| |   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* | Merge commit '89466de4aeaf5e359489b81b8a9920a2bc7936d6' [Clément Bœsch, 2017-03-16]
|\|
| |   * commit '89466de4aeaf5e359489b81b8a9920a2bc7936d6':
| |       vp9/x86: rename vp9dsp to vp9mc
| |
| |   File was already renamed, only the top description is updated.
| |
| |   Merged-by: Clément Bœsch <u@pkh.me>
| * vp9/x86: rename vp9dsp to vp9mc [Anton Khirnov, 2016-08-03]
| |   It only contains the MC SIMD, other SIMD will go into different files.
* x86/vp9mc: fix string concatenation of fullpel function names [James Almer, 2015-09-20]
|     Fixes compilation with NASM
|
|     Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
|     Signed-off-by: James Almer <jamrial@gmail.com>
* vp9: add subpel MC SIMD for 10/12bpp. [Ronald S. Bultje, 2015-09-16]
|
* vp9: add fullpel (avg) MC SIMD for 10/12bpp. [Ronald S. Bultje, 2015-09-16]
|
* vp9: add fullpel (put) MC SIMD for 10/12bpp. [Ronald S. Bultje, 2015-09-16]
|
* vp9/x86: sse2 MC assembly. [Ronald S. Bultje, 2014-12-15]
|     Also a slight change to the ssse3 code, which prevents a theoretical overflow in the sharp filter.
|
|     Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* x86/vp9: add AVX and AVX2 MC [James Almer, 2014-09-22]
|     Roughly 25% faster MC than ssse3 for blocksizes 32 and 64.
|
|     Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
|     Signed-off-by: James Almer <jamrial@gmail.com>
* x86: vpx/h264/hevc/mpeg2: share constants [Christophe Gisquet, 2014-08-06]
|     Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* x86/vp9mc: add vp9 namespace. [Clément Bœsch, 2014-03-29]
|
* vp9/x86: rename ff_avg[48]_sse to ff_avg[48]_mmxext [James Almer, 2014-01-18]
|     pavgb is an sse integer instruction, so the mmxext flag is enough
|
|     Signed-off-by: James Almer <jamrial@gmail.com>
|     Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
|     Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* vp9/x86: simplify a few mc inits. [Clément Bœsch, 2014-01-16]
|
* vp9/x86: 16px MC functions (64bit only). [Ronald S. Bultje, 2013-12-26]
|     Cycle counts for large MCs (old -> new on ped1080p.webm, mx!=0&&my!=0):
|       16x8:    876 ->   870 (0.7%)
|       16x16:  1444 ->  1435 (0.7%)
|       16x32:  2784 ->  2748 (1.3%)
|       32x16:  2455 ->  2349 (4.5%)
|       32x32:  4641 ->  4084 (13.6%)
|       32x64:  9200 ->  7834 (17.4%)
|       64x32:  8980 ->  7197 (24.8%)
|       64x64: 17330 -> 13796 (25.6%)
|
|     Total decoding time goes from 9.326sec to 9.182sec.
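|
|     The commit message does not say how the percentages were derived. One reading that reproduces the quoted figures for the 32- and 64-pixel block sizes is a speed-up measured relative to the new cycle count, (old - new) / new. The sketch below encodes that assumption; the helper name speedup_percent is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: speed-up in percent, relative to the NEW cycle count.
 * This formula is an inference from the numbers in the commit message,
 * not something the commit states. */
static double speedup_percent(double old_cycles, double new_cycles)
{
    assert(new_cycles > 0.0);
    return (old_cycles - new_cycles) / new_cycles * 100.0;
}
```

|     For example, speedup_percent(17330, 13796) evaluates to roughly 25.6, matching the 64x64 row.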
* vp9: split x86 assembly in two files. [Ronald S. Bultje, 2013-12-07]
      (And in future, loopfilter or intra pred could be put in their own respective files also.)