path: root/libavcodec/x86
Commit message (Author, Age)
...
* dsputil: Split off quarterpel bits into their own context (Diego Biurrun, 2014-05-29)
|
* dsputil: Move APE-specific bits into apedsp (Diego Biurrun, 2014-05-29)
|
* dsputil: Move SVQ1 encoding specific bits into svq1enc (Diego Biurrun, 2014-05-29)
|
* dsputil: Split off HuffYUV encoding bits into their own context (Diego Biurrun, 2014-05-27)
|     Also shorten HuffYUV context member names to avoid clutter.
* dsputil: Split off HuffYUV decoding bits into their own context (Diego Biurrun, 2014-05-27)
|     Also shorten HuffYUV context member names to avoid clutter.
* x86/synth_filter: remove the fma3 version ifdefs (James Almer, 2014-04-13)
|     This fixes compilation failures with --disable-fma3
|
|     Signed-off-by: James Almer <jamrial@gmail.com>
|     Signed-off-by: Anton Khirnov <anton@khirnov.net>
* DNxHD: convert inline asm to yasm (Timothy Gu, 2014-04-11)
|
* DNxHD: make get_pixel_8x4_sym accept ptrdiff_t as stride (Timothy Gu, 2014-04-11)
|
* x86: dsputil: Move ff_apply_window_int16_* bits to ac3dsp, where they belong (Diego Biurrun, 2014-04-04)
|
* x86: h264_qpel: Simplify an #if conditional (Diego Biurrun, 2014-04-04)
|     The extra conditions are covered by previous #ifs and conditional
|     compilation.
* x86: Drop some unnecessary YASM ifdefs (Diego Biurrun, 2014-04-04)
|     Dead code elimination is enough to avoid undefined references in
|     these cases.
* x86: dsputil: Eliminate some unnecessary dsputil_x86.h #includes (Diego Biurrun, 2014-04-04)
|
* Remove a number of unnecessary dsputil.h #includes (Diego Biurrun, 2014-04-04)
|
* x86/synth_filter: add synth_filter_fma3 (James Almer, 2014-04-04)
|     Signed-off-by: James Almer <jamrial@gmail.com>
|     Signed-off-by: Anton Khirnov <anton@khirnov.net>
* x86/synth_filter: add synth_filter_avx (James Almer, 2014-04-04)
|     Sandy Bridge Win64:
|     180 cycles in ff_synth_filter_inner_sse2
|     150 cycles in ff_synth_filter_inner_avx
|
|     Also switch some instructions to a three operand format to avoid
|     assembly errors with Yasm 1.1.0 or older.
|
|     Signed-off-by: James Almer <jamrial@gmail.com>
|     Signed-off-by: Anton Khirnov <anton@khirnov.net>
* x86/synth_filter: add synth_filter_sse (James Almer, 2014-04-04)
|     Build only on x86_32 targets.
|
|     Signed-off-by: James Almer <jamrial@gmail.com>
|     Signed-off-by: Anton Khirnov <anton@khirnov.net>
* On2 VP7 decoder (Peter Ross, 2014-04-04)
|     Further performance improvements and security fixes by Vittorio
|     Giovara, Luca Barbato and Diego Biurrun.
|
|     Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
|     Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
|     Signed-off-by: Diego Biurrun <diego@biurrun.de>
* x86: hpeldsp: Keep all rnd_template instantiations in hpeldsp_init (Diego Biurrun, 2014-03-26)
|     There is no point in having a separate file just for the
|     instantiation that provides the public functions.
* Add missing headers to make template files compile (more) standalone (Diego Biurrun, 2014-03-26)
|
* x86: h264_qpel: Fix typo in CALL_2X_PIXELS macro invocation (Diego Biurrun, 2014-03-26)
|     This fixes FATE with mmxext CPUFLAGS set.
* x86: dsputil: Move hpeldsp-related declarations to a separate header (Diego Biurrun, 2014-03-22)
|
* x86: dsputil: Move fpel declarations to a separate header (Diego Biurrun, 2014-03-22)
|
* dsputil: Refactor duplicated CALL_2X_PIXELS / PIXELS16 macros (Diego Biurrun, 2014-03-22)
|
* imgconvert: Move ff_deinterlace_line_*_mmx declarations out of dsputil (Diego Biurrun, 2014-03-22)
|
* x86: dsputil: Move inline assembly macros to a separate header (Diego Biurrun, 2014-03-22)
|
* dsputil: Use correct type in me_cmp_func function pointer (Diego Biurrun, 2014-03-20)
|
* build: Group general components separate from de/encoders in arch Makefiles (Diego Biurrun, 2014-03-20)
|     This is in line with how the top-level libavcodec Makefile is
|     structured.
* dsputil: Propagate bit depth information to all (sub)init functions (Diego Biurrun, 2014-03-20)
|     This avoids recalculating the value over and over again.
* x86: dsputil_init: Drop some unnecessary parentheses (Diego Biurrun, 2014-03-13)
|
* x86: dsputil_init: K&R formatting cosmetics (Diego Biurrun, 2014-03-13)
|
* x86: dsputil_x86.h: K&R formatting cosmetics (Diego Biurrun, 2014-03-13)
|
* x86: motion_est: K&R formatting cosmetics (Diego Biurrun, 2014-03-13)
|
* dsputilenc_mmx: K&R formatting cosmetics (Diego Biurrun, 2014-03-13)
|
* dsputil_mmx: K&R formatting cosmetics (Diego Biurrun, 2014-03-13)
|
* dsputilenc_mmx: Merge two assignment blocks with identical conditions (Diego Biurrun, 2014-03-13)
|
* x86: Make function prototype comments in assembly code consistent (Diego Biurrun, 2014-03-13)
|     This helps grepping for functions, among other things.
* x86: h264_idct_10_bit: Use proper type in function prototype comments (Diego Biurrun, 2014-03-13)
|
* Update dsputil- and SIMD-related comments to match reality more closely (Diego Biurrun, 2014-03-13)
|
* x86: Add some more missing headers (Diego Biurrun, 2014-03-13)
|
* x86: mpegvideoenc: Remove some remnants of the long-gone libmpeg2 IDCT (Diego Biurrun, 2014-03-13)
|
* x86: dcadsp: Fix linking with yasm and optimizations disabled (Diego Biurrun, 2014-03-05)
|     Some optimized functions reference optimized symbols, so the
|     functions must be explicitly disabled when those symbols are
|     unavailable.
* x86: cabac: Use correct #includes to make header compile standalone (Diego Biurrun, 2014-03-05)
|
* dcadec: simplify decoding of VQ high frequencies (Christophe Gisquet, 2014-02-28)
|     The vector dequantization has a test in a loop preventing effective
|     SIMD implementation. By moving it out of the loop, this loop can be
|     DSPized.
|
|     Therefore, modify the current DSP implementation. In particular, the
|     DSP implementation no longer has to handle null loop sizes.
|
|     The decode_hf implementations have following timings:
|     For x86 Arrandale:
|                C  SSE  SSE2  SSE4
|     win32:   260  162   119   104
|     win64:   242  N/A    89    72
|
|     The arm NEON optimizations follow in a later patch as external asm.
|     The now unused check for the y modifier in arm inline asm is removed
|     from configure.
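The hoisting described in the commit message above can be illustrated with a minimal C sketch. This is not the libav decode_hf code: the function names, the VQ_LEN constant, the per-band threshold test and the data layout are all hypothetical, chosen only to show how moving a test out of a loop leaves a branch-free kernel that never sees a zero-length range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VQ_LEN 8 /* hypothetical vector length, not taken from libav */

/* Test inside the loop: every iteration branches before doing any work. */
static void dequant_branchy(float *dst, const int8_t *vq, const float *scale,
                            size_t first_vq_band, size_t nb_bands)
{
    for (size_t band = 0; band < nb_bands; band++) {
        if (band < first_vq_band)
            continue; /* band is not VQ-coded, skip it */
        for (size_t i = 0; i < VQ_LEN; i++)
            dst[band * VQ_LEN + i] = vq[band * VQ_LEN + i] * scale[band];
    }
}

/* Branch-free kernel: the shape a SIMD version could replace. */
static void dequant_kernel(float *dst, const int8_t *vq, const float *scale,
                           size_t start, size_t end)
{
    /* Precondition: the range is non-empty, so the kernel does not
     * need to handle a null loop size. */
    assert(start < end);
    for (size_t band = start; band < end; band++)
        for (size_t i = 0; i < VQ_LEN; i++)
            dst[band * VQ_LEN + i] = vq[band * VQ_LEN + i] * scale[band];
}

/* Test hoisted: the range is checked once, outside the loop. */
static void dequant_hoisted(float *dst, const int8_t *vq, const float *scale,
                            size_t first_vq_band, size_t nb_bands)
{
    if (first_vq_band < nb_bands)
        dequant_kernel(dst, vq, scale, first_vq_band, nb_bands);
}
```

Both variants write the same output; the difference is only that the second keeps the range check in the caller, which is what lets an optimized kernel assume at least one iteration.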
* x86: synth filter float: implement SSE2 version (Christophe Gisquet, 2014-02-28)
|     Timings for Arrandale:
|                 C  SSE
|     win32:   2108  334
|     win64:   1152  322
|
|     Factorizing the inner loop with a call/jmp is a >15 cycles cost, even
|     with the jmp destination being aligned.
|
|     Unrolling for ARCH_X86_64 is a 20 cycles gain.
|
|     Signed-off-by: Janne Grunau <janne-libav@jannau.net>
* x86: dcadsp: implement SSE lfe_dir (Christophe Gisquet, 2014-02-28)
|     Results for Arrandale/Windows:
|     32: 1670 -> 316
|     64:  728 -> 298
|
|     Signed-off-by: Janne Grunau <janne-libav@jannau.net>
* prores: Use consistent names for DSP arch initialization functions (Diego Biurrun, 2014-02-28)
|
* x86: dsputil: Use correct file name as multiple inclusion guard (Diego Biurrun, 2014-02-20)
|
* x86: dca: Add missing multiple inclusion guards (Diego Biurrun, 2014-02-19)
|
* dca: include dcadsp.h in {arm,x86}/dca.h for checkheaders (Janne Grunau, 2014-02-08)
|
* x86: use the inline int8x8_fmul_int32 only if inline SSE2 is available (Janne Grunau, 2014-02-08)
|     Fixes compilation with MSVC. Also do not rely on an earlier config.h
|     include but include it directly.