| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Timings for Arrandale:
C SSE
win32: 2108 334
win64: 1152 322
Factorizing the inner loop with a call/jmp is a >15 cycles cost, even with
the jmp destination being aligned.
Unrolling for ARCH_X86_64 is a 20 cycles gain.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
|
|
| |
DSPContext.vector_fmul_window()
DCADSPContext.lfe_fir()
SynthFilterContext.synth_filter_float()
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
| |
2.7x faster DCA decoding on Cortex-A8
Originally committed as revision 22828 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 22827 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 22337 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 20415 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 20413 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
Originally committed as revision 20396 to svn://svn.ffmpeg.org/ffmpeg/trunk
|