| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
| |
x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). So given that the only systems which benefit
from the MMXEXT resamplers (which are overridden by SSE2)
are truely ancient 32bit x86s they are removed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
|
|
|
|
|
| |
benchmark:
sse2 10.670s
avx 8.763s
fma3 8.380s
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
|
|
|
|
|
|
|
| |
phase_shift and phase_mask is removed
generally exact_rational=on is faster than exact_rational=off
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
Only two functions that use xop multiply-accumulate instructions where the
first operand is the same as the fourth actually took advantage of the macros.
This further reduces differences with x264's x86inc.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
|
|
|
| |
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
|
|
|
|
| |
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
|
|
|
|
|
|
|
| |
Signed-off-by: James Almer <jamrial@gmail.com>
312531 -> 311528 dezicycles
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
|
|
|
| |
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
|
Linear interpolation goes from 63 (llvm) or 58 (gcc) to 48 (yasm)
cycles/sample on 64bit, or from 66 (llvm/gcc) to 52 (yasm) cycles/
sample on 32bit. Bon-linear goes from 43 (llvm) or 38 (gcc) to
32 (yasm) cycles/sample on 64bit, or from 46 (llvm) or 44 (gcc) to
38 (yasm) cycles/sample on 32bit (all testing on OSX 10.9.2, llvm
5.1 and gcc 4.8/9).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|