path: root/libavcodec/x86
Commit message (Author, Age)
* mpegvideoenc: make a table const (Anton Khirnov, 2017-01-19)
|
* cabac: x86: Give optimizations header a more meaningful name (Diego Biurrun, 2016-12-01)
|
* h264_qpel: x86: Move function with only one instance out of template macro (Diego Biurrun, 2016-11-08)
|   libavcodec/x86/h264_qpel.c:392:785: warning: unused function 'ff_avg_h264_qpel8or16_hv1_lowpass_mmxext' [-Wunused-function]
* x86: Drop stray semicolons after function definitions (Diego Biurrun, 2016-11-05)
|   libavcodec/x86/rv40dsp_init.c:97:2: warning: ISO C does not allow extra ‘;’ outside of a function [-Wpedantic]
|   libavcodec/x86/vp9dsp_init.c:94:40: warning: ISO C does not allow extra ‘;’ outside of a function [-Wpedantic]
* vp9: Flip the order of arguments in MC functions (Martin Storsjö, 2016-11-03)
|   This makes it match the pattern already used for VP8 MC functions.
|   This also makes the signature match ffmpeg's version of these functions, easing porting of code in both directions.
|   Signed-off-by: Martin Storsjö <martin@martin.st>
* hevc: x86: Add add_residual() SIMD optimizations (Pierre Edouard Lepere, 2016-10-22)
|   Initially written by Pierre Edouard Lepere <Pierre-Edouard.Lepere@insa-rennes.fr>, extended by James Almer <jamrial@gmail.com>.
|   Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
* audiodsp: x86: Remove pointless header file (Diego Biurrun, 2016-10-19)
|   Its single forward declaration can be moved to the only place it is used, like is done for all other dsp init files.
* x86: videodsp: Add parentheses to expression to work around warning (Diego Biurrun, 2016-10-19)
|   libavcodec/x86/videodsp.asm:128: warning: signed dword value exceeds bounds
* x86: Add missing colons after assembly labels (Diego Biurrun, 2016-10-17)
|   This fixes many warnings of the sort
|   warning: label alone on a line without a colon might be in error
* hevc: Add SSE2 and AVX IDCT (Alexandra Hájková, 2016-10-11)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* Revert "hevc: x86: Refactor IDCT macro declarations" (Anton Khirnov, 2016-10-06)
|   This reverts commit d9dccc03890a976dba59d66ed3b5aceeaa33d14c.
|   There were outstanding objections to this commit.
* h264_intrapred: x86: Update comments left behind in 95c89da36ebeeb96b7146c0d70f46c582397da7f (Diego Biurrun, 2016-10-06)
* hevc: x86: Refactor IDCT macro declarations (Diego Biurrun, 2016-10-06)
|
* vp9lpf/x86: make filter_16_h work on 32-bit. (Ronald S. Bultje, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: make filter_48/84/88_h work on 32-bit. (Ronald S. Bultje, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: make filter_44_h work on 32-bit. (Ronald S. Bultje, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: make filter_16_v work on 32-bit. (Ronald S. Bultje, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: make filter_48/84_v work on 32-bit. (Ronald S. Bultje, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: make filter_88_v work on 32-bit. (Ronald S. Bultje, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: make filter_44_v work on 32-bit. (Ronald S. Bultje, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: save one register in SIGN_ADD/SUB. (Ronald S. Bultje, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: store unpacked intermediates for filter6/14 on stack. (Ronald S. Bultje, 2016-10-04)
|   filter16 goes from 508 to 482 (h) or 346 to 314 (v) cycles; filter88 goes from 240 to 238 (h) or 174 to 165 (v) cycles, measured on TOS.
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: move variable assigned inside macro branch. (Ronald S. Bultje, 2016-10-04)
|   The value is not used outside the branch.
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: simplify ABSSUM_CMP by inverting the comparison meaning. (Ronald S. Bultje, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: remove unused register from ABSSUB_CMP macro. (Ronald S. Bultje, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: slightly simplify 44/48/84/88 h stores. (Ronald S. Bultje, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: make cglobal statement more conservative in register allocation. (Ronald S. Bultje, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: save one register in loopfilter surface coverage. (Ronald S. Bultje, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: add ff_vp9_loop_filter_[vh]_44_16_{sse2,ssse3,avx}. (Clément Bœsch, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: add ff_vp9_loop_filter_h_{48,84}_16_{sse2,ssse3,avx}(). (Clément Bœsch, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: add an SSE2 version of vp9_loop_filter_[vh]_88_16 (James Almer, 2016-10-04)
|   Similar gains as the ssse3 version once again
|   Additional improvements by Clément Bœsch <u@pkh.me>.
|   Signed-off-by: James Almer <jamrial@gmail.com>
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: add ff_vp9_loop_filter_[vh]_88_16_{ssse3,avx}. (Clément Bœsch, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: add ff_vp9_loop_filter_[vh]_16_16_sse2(). (James Almer, 2016-10-04)
|   Similar gains in performance as the SSSE3 version
|   Signed-off-by: James Almer <jamrial@gmail.com>
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* vp9lpf/x86: add x86 SSSE3/AVX SIMD for vp9_loop_filter_[vh]_16_16. (Clément Bœsch, 2016-10-04)
|   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* ac3dsp: x86: Replace inline asm for in-decoder downmixing with standalone asm (Justin Ruggles, 2016-10-01)
|   Adds a wrapper function for downmixing which detects channel count changes and updates the selected downmix function accordingly.
|   Simplification and porting to current x86inc infrastructure by Diego Biurrun.
|   Signed-off-by: Diego Biurrun <diego@biurrun.de>
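The wrapper pattern described in this commit message can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the code from the commit: `DownmixContext`, `select_downmix`, `downmix_generic`, `downmix_stereo_to_mono` and the `selections` counter are hypothetical names, and the `matrix[out][in]` layout is assumed from the neighbouring "Reverse matrix in/out order" commit.

```c
#include <stddef.h>

/* Hypothetical signature for a downmix routine; samples are mixed in place. */
typedef void (*downmix_fn)(float **samples, float **matrix,
                           int out_ch, int in_ch, int len);

/* Hypothetical context that caches the routine chosen for the last layout. */
typedef struct DownmixContext {
    int in_channels;
    int out_channels;
    downmix_fn fn;
    int selections;   /* illustration only: how often a routine was (re)selected */
} DownmixContext;

/* Generic path for 1 or 2 output channels, assuming matrix[out][in]. */
static void downmix_generic(float **samples, float **matrix,
                            int out_ch, int in_ch, int len)
{
    for (int i = 0; i < len; i++) {
        float v0 = 0.0f, v1 = 0.0f;
        for (int j = 0; j < in_ch; j++) {
            v0 += samples[j][i] * matrix[0][j];
            if (out_ch == 2)
                v1 += samples[j][i] * matrix[1][j];
        }
        samples[0][i] = v0;
        if (out_ch == 2)
            samples[1][i] = v1;
    }
}

/* Specialised path standing in for an optimised (e.g. SIMD) routine. */
static void downmix_stereo_to_mono(float **samples, float **matrix,
                                   int out_ch, int in_ch, int len)
{
    (void)out_ch;
    (void)in_ch;
    for (int i = 0; i < len; i++)
        samples[0][i] = samples[0][i] * matrix[0][0] +
                        samples[1][i] * matrix[0][1];
}

static downmix_fn select_downmix(int out_ch, int in_ch)
{
    if (in_ch == 2 && out_ch == 1)
        return downmix_stereo_to_mono;
    return downmix_generic;
}

/* Wrapper: reselect the routine only when the channel counts change. */
static void downmix(DownmixContext *c, float **samples, float **matrix,
                    int out_ch, int in_ch, int len)
{
    if (c->fn == NULL || c->in_channels != in_ch || c->out_channels != out_ch) {
        c->in_channels  = in_ch;
        c->out_channels = out_ch;
        c->fn           = select_downmix(out_ch, in_ch);
        c->selections++;
    }
    c->fn(samples, matrix, out_ch, in_ch, len);
}
```

Callers always go through `downmix()`, so a mid-stream change in channel layout is picked up on the next call without the caller having to reinitialise anything.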
* ac3dsp: Reverse matrix in/out order in downmix() (Justin Ruggles, 2016-10-01)
|   Also use (float **) instead of (float (*)[2]).
|   This matches the matrix layout in libavresample so we can reuse assembly code between the two.
|   Signed-off-by: Diego Biurrun <diego@biurrun.de>
* x86/h264_weight: use appropriate register size for weight parameters (Hendrik Leppkes, 2016-09-30)
|   This fixes decoding corruption on 64 bit windows.
|   Signed-off-by: Martin Storsjö <martin@martin.st>
* mpegaudiodsp: Change type of array stride parameters to ptrdiff_t (Diego Biurrun, 2016-09-29)
|   This avoids SIMD-optimized functions having to sign-extend their stride argument manually to be able to do pointer arithmetic.
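As a rough illustration of why a pointer-sized signed stride is convenient, the sketch below uses `ptrdiff_t` strides directly in pointer arithmetic, including a negative stride for bottom-up access. `copy_rows` is a hypothetical helper written for this example; it is not a libavcodec function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy `height` rows of `width` bytes each.
 * Because the strides are ptrdiff_t, they already have pointer width and
 * sign, so `y * stride` can be added to a pointer without any manual
 * widening, and negative strides (bottom-up images) work as well.
 */
static void copy_rows(uint8_t *dst, ptrdiff_t dst_stride,
                      const uint8_t *src, ptrdiff_t src_stride,
                      int width, int height)
{
    assert(width >= 0 && height >= 0);
    for (int y = 0; y < height; y++)
        memcpy(dst + y * dst_stride, src + y * src_stride, (size_t)width);
}
```

With an `int` stride, hand-written 64-bit assembly receiving the argument in a register would have to sign-extend it before using it as an address offset, which is the manual step the commit message refers to.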
* h264chroma: Change type of stride parameters to ptrdiff_t (Diego Biurrun, 2016-09-29)
|   This avoids SIMD-optimized functions having to sign-extend their stride argument manually to be able to do pointer arithmetic.
* idct: Change type of array stride parameters to ptrdiff_t (Diego Biurrun, 2016-09-29)
|   ptrdiff_t is the correct type for array strides and similar.
* x86: fpel: Remove unnecessary sign extend (Diego Biurrun, 2016-09-29)
|
* lavc: add clobber tests for the new encoding/decoding API (Anton Khirnov, 2016-09-28)
|
* audiodsp/x86: yasmify vector_clipf_sse (Anton Khirnov, 2016-09-22)
|
* audiodsp: reorder arguments for vector_clipf (Anton Khirnov, 2016-09-22)
|   This will make the x86 asm simpler.
|   ARM conversion by Martin Storsjö <martin@martin.st> and Janne Grunau <janne-libav@jannau.net>
* blockdsp: drop the high_bit_depth parameter (Anton Khirnov, 2016-09-22)
|   It has no effect, since the code is supposed to operate the same way for any bit depth.
* audiodsp/x86: clear the high bits of the order parameter on 64bit (Anton Khirnov, 2016-09-19)
|   Also change shl to add, since it can be faster on some CPUs.
|   CC: libav-stable@libav.org
* audiodsp/x86: fix ff_vector_clip_int32_sse2 (Anton Khirnov, 2016-09-19)
|   This version, which is the only one doing two processing cycles per loop iteration, computes the load/store indices incorrectly for the second cycle.
|   CC: libav-stable@libav.org
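The kind of indexing this fix is about can be pictured with a scalar analogue. The sketch below is not the SSE2 assembly from the commit; it is a hypothetical two-way unrolled loop (`vector_clip_int32_x2` and `clip_int32` are made-up names) in which the second step must use an index offset from the first.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp v into [lo, hi]. */
static int32_t clip_int32(int32_t v, int32_t lo, int32_t hi)
{
    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return v;
}

/*
 * Hypothetical scalar stand-in for a loop that does two processing steps
 * per iteration. The second step has to load from and store to index
 * i + 1; getting that offset wrong silently reads or writes the wrong
 * element. `len` is assumed to be even.
 */
static void vector_clip_int32_x2(int32_t *dst, const int32_t *src,
                                 int32_t min, int32_t max, unsigned len)
{
    assert(len % 2 == 0);
    for (unsigned i = 0; i < len; i += 2) {
        dst[i]     = clip_int32(src[i],     min, max);  /* first step  */
        dst[i + 1] = clip_int32(src[i + 1], min, max);  /* second step */
    }
}
```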
* pixblockdsp: Change type of stride parameters to ptrdiff_t (Diego Biurrun, 2016-09-14)
|   This avoids SIMD-optimized functions having to sign-extend their line size argument manually to be able to do pointer arithmetic.
|   Also adjust parameter names to be "stride" everywhere.
* vp56: Separate VP5 and VP6 dsp initialization (Diego Biurrun, 2016-08-26)
|   VP5 has no arch-specific optimizations (nor will it get some in the future), so it makes no sense to try to share dsp init code with VP6.
* prores: Change type of stride parameters to ptrdiff_t (Diego Biurrun, 2016-08-26)
|   This avoids SIMD-optimized functions having to sign-extend their line size argument manually to be able to do pointer arithmetic.
|   Also adjust parameter names to be "linesize" everywhere.