| Commit message (Collapse) | Author | Age |
|
If the block is at the end of the allocated buffer and there is no
padding, this will over-read, which may cause crashes. Reported by
Firefox.
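
To illustrate the failure mode described above: a fixed-width vector load reads a full register's worth of bytes even when fewer are needed, so a block that ends flush with its allocation is only safe if padding follows it. The sketch below models this with a plain bounds check. The 16-byte load width and the helper names are assumptions made for illustration and are not taken from the commit.

```python
LOAD_WIDTH = 16  # assumed vector load width in bytes (hypothetical)


def load_overreads(buf_len: int, offset: int, width: int = LOAD_WIDTH) -> bool:
    """Return True if a `width`-byte load starting at `offset` reads past the buffer."""
    return offset + width > buf_len


def padded_size(payload_len: int, width: int = LOAD_WIDTH) -> int:
    """Smallest allocation that keeps any load starting inside the payload in bounds."""
    return payload_len + width - 1


if __name__ == "__main__":
    # A 64-byte buffer with no padding: a load at offset 56 touches bytes 56..71.
    print(load_overreads(64, 56))
    # The same load against a padded allocation stays in bounds.
    print(load_overreads(padded_size(64), 56))
```

With padding of `width - 1` bytes, even a load that starts on the last payload byte stays inside the allocation.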
|
* commit 'e99ecda55082cb9dde8fd349361e169dc383943a':
checkasm: add vp9 MC tests.
vp9mc/x86: sse2 MC assembly.
vp9mc/x86: add AVX and AVX2 MC
vp9mc/x86: rename ff_* to ff_vp9_*
vp9mc/x86: rename ff_avg[48]_sse to ff_avg[48]_mmxext
vp9mc/x86: simplify a few inits.
vp9mc/x86: add 16px functions (64bit only).
Noop (aside from a formatting comment in vp9mc.asm). We already have all
of this. We should consider making a final diff between the two projects
once the dust settles.
Merged-by: Clément Bœsch <u@pkh.me>
|
Also a slight change to the ssse3 code, which prevents a theoretical
overflow in the sharp filter.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
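
As a rough illustration of the kind of overflow mentioned above, the sketch below shows how an 8-tap filter sum can exceed the signed 16-bit range, and how wrapping and saturating arithmetic then differ. The taps are hypothetical values chosen only so that they sum to 128; they are not the actual VP9 sharp filter coefficients, and this is not a reconstruction of the change made to the ssse3 code.

```python
INT16_MIN, INT16_MAX = -32768, 32767

# Hypothetical 8-tap kernel summing to 128; NOT the real VP9 sharp filter taps.
TAPS = [-4, 10, -22, 80, 80, -22, 10, -4]


def wrap16(value: int) -> int:
    """Model a wrapping signed 16-bit result (two's complement truncation)."""
    return (value + 32768) % 65536 - 32768


def sat16(value: int) -> int:
    """Model a saturating signed 16-bit result (clamp to the int16 range)."""
    return max(INT16_MIN, min(INT16_MAX, value))


def worst_case_pixels(taps):
    """8-bit pixels that maximise the sum: 255 under positive taps, 0 elsewhere."""
    return [255 if t > 0 else 0 for t in taps]


def filter_sum(taps, pixels):
    """Exact (unbounded) sum of tap * pixel products."""
    return sum(t * p for t, p in zip(taps, pixels))


if __name__ == "__main__":
    exact = filter_sum(TAPS, worst_case_pixels(TAPS))
    print(exact, wrap16(exact), sat16(exact))
```

With these assumed taps the exact worst-case sum is above 32767, so a wrapping 16-bit accumulator would flip sign while a saturating one clamps to the maximum.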
|
Roughly 25% faster MC than ssse3 for blocksizes 32 and 64.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
pavgb is an SSE integer instruction, so the mmxext flag is enough.
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
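
For readers unfamiliar with the instruction named above, pavgb computes a per-byte unsigned average with upward rounding, i.e. (a + b + 1) >> 1 for each pair of bytes. A minimal scalar model follows; the Python function is only an illustration of the arithmetic, not code from the project.

```python
def pavgb(a: bytes, b: bytes) -> bytes:
    """Scalar model of pavgb: rounded unsigned average of each byte pair."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    # The +1 rounds halves upward; the 9-bit intermediate never overflows here
    # because Python integers are unbounded.
    return bytes((x + y + 1) >> 1 for x, y in zip(a, b))


if __name__ == "__main__":
    print(list(pavgb(bytes([0, 1, 255, 10]), bytes([0, 2, 255, 13]))))
```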
|
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
* commit '89466de4aeaf5e359489b81b8a9920a2bc7936d6':
vp9/x86: rename vp9dsp to vp9mc
File was already renamed, only the top description is updated.
Merged-by: Clément Bœsch <u@pkh.me>
|
It only contains the MC SIMD, other SIMD will go into different files.
|
Fixes compilation with NASM
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
Also a slight change to the ssse3 code, which prevents a theoretical
overflow in the sharp filter.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
Roughly 25% faster MC than ssse3 for blocksizes 32 and 64.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
pavgb is an SSE integer instruction, so the mmxext flag is enough.
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
Cycle counts for large MCs (old -> new on ped1080p.webm, mx!=0&&my!=0):
16x8: 876 -> 870 (0.7%)
16x16: 1444 -> 1435 (0.7%)
16x32: 2784 -> 2748 (1.3%)
32x16: 2455 -> 2349 (4.5%)
32x32: 4641 -> 4084 (13.6%)
32x64: 9200 -> 7834 (17.4%)
64x32: 8980 -> 7197 (24.8%)
64x64: 17330 -> 13796 (25.6%)
Total decoding time goes from 9.326sec to 9.182sec.
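
The percentages in the table above appear to be computed as old / new - 1. The short script below recomputes them from the listed cycle counts under that assumption; the dictionary layout and function name are illustrative only. Under this formula the 16x16 row comes out at 0.6 rather than the listed 0.7, which may simply reflect rounding in the underlying measurements.

```python
# (old, new) cycle counts copied from the table above.
CYCLES = {
    "16x8": (876, 870),
    "16x16": (1444, 1435),
    "16x32": (2784, 2748),
    "32x16": (2455, 2349),
    "32x32": (4641, 4084),
    "32x64": (9200, 7834),
    "64x32": (8980, 7197),
    "64x64": (17330, 13796),
}


def speedup_percent(old: float, new: float) -> float:
    """Speedup as old / new - 1, in percent, rounded to one decimal place."""
    return round((old / new - 1) * 100, 1)


if __name__ == "__main__":
    for size, (old, new) in CYCLES.items():
        print(f"{size}: {old} -> {new} ({speedup_percent(old, new)}%)")
    # Same formula applied to the total decoding time in seconds.
    print(f"total: {speedup_percent(9.326, 9.182)}%")
```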
|
(In the future, loopfilter or intra pred could also be put in their own
files.)
|