| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
| |
benchmark:
sse2 10.670s
avx 8.763s
fma3 8.380s
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
|
|
|
|
|
|
|
| |
phase_shift and phase_mask is removed
generally exact_rational=on is faster than exact_rational=off
Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
Only two functions that use xop multiply-accumulate instructions where the
first operand is the same as the fourth actually took advantage of the macros.
This further reduces differences with x264's x86inc.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
|
|
|
| |
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
|
|
|
|
| |
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
|
|
|
|
|
|
|
| |
Signed-off-by: James Almer <jamrial@gmail.com>
312531 -> 311528 dezicycles
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
|
|
|
| |
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
|
Linear interpolation goes from 63 (llvm) or 58 (gcc) to 48 (yasm)
cycles/sample on 64bit, or from 66 (llvm/gcc) to 52 (yasm) cycles/
sample on 32bit. Bon-linear goes from 43 (llvm) or 38 (gcc) to
32 (yasm) cycles/sample on 64bit, or from 46 (llvm) or 44 (gcc) to
38 (yasm) cycles/sample on 32bit (all testing on OSX 10.9.2, llvm
5.1 and gcc 4.8/9).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|