path: root/libavcodec/x86
Commit message  Author  Age
* avcodec/h264: add named parameters to x86 function  James Darnley  2017-02-18
|
* avcodec/x86: deduplicate PASS8ROWS macro  James Darnley  2017-02-18
|
* x86/rv34dsp: add ff_rv34_idct_dc_add_sse2  James Almer  2017-02-02
|     Also disable ff_rv34_idct_dc_add_mmx on x86_64 as the presence of sse2 is guaranteed in such builds.
|
|     Signed-off-by: James Almer <jamrial@gmail.com>
|
* x86/vp8dsp: add ff_vp8_idct_dc_add_sse2  James Almer  2017-02-02
|     Also disable ff_vp8_idct_dc_add_mmx on x86_64 as the presence of sse2 is guaranteed in such builds.
|
|     Signed-off-by: James Almer <jamrial@gmail.com>
|
* Revert "Merge commit '0a39c9ac0bfd7345fe676b4e2707d9cec3cbb553'"  Michael Niedermayer  2017-02-01
|     The assumption this is based on is wrong, the code is not always run with bitexact flags
|
|     This reverts commit a956164e1eb3418922cae949f02ad4035f013213, reversing changes made to f6005907fdeb9e4de37568ed5c1a8e7b869126f6.
|
|     Approved-by: James Almer <jamrial@gmail.com>
|
* Merge commit 'd06dfaa5cbdd20acfd2364b16c0f4ae4ddb30a65'  James Almer  2017-01-31
|\
| |     * commit 'd06dfaa5cbdd20acfd2364b16c0f4ae4ddb30a65':
| |       x86: huffyuv: Use EXTERNAL_SSSE3_FAST convenience macro where appropriate
| |
| |     Merged-by: James Almer <jamrial@gmail.com>
| |
| * x86: huffyuv: Use EXTERNAL_SSSE3_FAST convenience macro where appropriate  Diego Biurrun  2016-07-20
| |
* | Merge commit '4efab89332ea39a77145e8b15562b981d9dbde68'  James Almer  2017-01-31
|\|
| |     * commit '4efab89332ea39a77145e8b15562b981d9dbde68':
| |       x86: Use *_FAST/*_SLOW CPU feature detection macros where appropriate
| |
| |     Merged-by: James Almer <jamrial@gmail.com>
| |
| * x86: Use *_FAST/*_SLOW CPU feature detection macros where appropriate  Diego Biurrun  2016-07-20
| |
* | Merge commit '0a39c9ac0bfd7345fe676b4e2707d9cec3cbb553'  James Almer  2017-01-31
|\|
| |     * commit '0a39c9ac0bfd7345fe676b4e2707d9cec3cbb553':
| |       x86: hpeldsp: Don't check for bitexact flag when initializing VP3-specific code
| |
| |     Merged-by: James Almer <jamrial@gmail.com>
| |
| * x86: hpeldsp: Don't check for bitexact flag when initializing VP3-specific code  Diego Biurrun  2016-07-20
| |     That code is only ever initialized with that flag set.
| |
* | Merge commit '95c1df929b92d81454656c222a35ec5f7db576b4'  James Almer  2017-01-31
|\|
| |     * commit '95c1df929b92d81454656c222a35ec5f7db576b4':
| |       x86: hpeldsp: Drop unused function parameters
| |
| |     Merged-by: James Almer <jamrial@gmail.com>
| |
| * x86: hpeldsp: Drop unused function parameters  Diego Biurrun  2016-07-20
| |
* | Merge commit 'c3e83ad3b7d75f3597f47ada2616ba4479665009'  James Almer  2017-01-31
|\|
| |     * commit 'c3e83ad3b7d75f3597f47ada2616ba4479665009':
| |       x86: hpeldsp: Use EXTERNAL_SSE2_FAST where appropriate
| |
| |     Merged-by: James Almer <jamrial@gmail.com>
| |
| * x86: hpeldsp: Use EXTERNAL_SSE2_FAST where appropriate  Diego Biurrun  2016-07-20
| |
* | Merge commit '1dfc3cf89d0eb026af28be46294b85d79499ffb5'  James Almer  2017-01-31
|\|
| |     * commit '1dfc3cf89d0eb026af28be46294b85d79499ffb5':
| |       x86: hpeldsp: Split off VP3-specific bits into a separate file
| |
| |     Merged-by: James Almer <jamrial@gmail.com>
| |
| * x86: hpeldsp: Split off VP3-specific bits into a separate file  Diego Biurrun  2016-07-20
| |
* | lavc/hevc: remove a few random spaces to reduce diff with libav  Clément Bœsch  2017-01-31
| |
* | Merge commit 'fca3c3b61952aacc45e9ca54d86a762946c21942'  Clément Bœsch  2017-01-31
|\|
| |     * commit 'fca3c3b61952aacc45e9ca54d86a762946c21942':
| |       hevc: Add AVX2 DC IDCT
| |
| |     Mostly noop as we already have that code.
| |
| |     In the ASM, code is merged with the exception of SECTION which is kept uppercase for consistency with the rest of the codebase.
| |
| |     Still in the ASM, the prototype comment is fixed to honor the '_' added from the original commit.
| |
| |     idct_dc_proto() is dropped as it's not used anymore here.
| |
| |     Merged-by: Clément Bœsch <cboesch@gopro.com>
| |
| * hevc: Add AVX2 DC IDCT  James Almer  2016-07-18
| |     Originally written by Pierre Edouard Lepere <pierre-edouard.lepere@insa-rennes.fr>.
| |     Integrated to Libav by Josh de Kock <josh@itanimul.li>.
| |
| |     Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
| |
* | Merge commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d'  Clément Bœsch  2017-01-31
|\|
| |     * commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d':
| |       hevc: Separate adding residual to prediction from IDCT
| |
| |     This commit should be a noop but isn't because of the following renames:
| |     - transform_add → add_residual
| |     - transform_skip → dequant
| |     - idct_4x4_luma → transform_4x4_luma
| |
| |     Merged-by: Clément Bœsch <cboesch@gopro.com>
| |
* | lossless_videodsp: rename add_hfyu_left_pred_int16 to add_left_pred_int16  James Almer  2017-01-12
| |     Signed-off-by: James Almer <jamrial@gmail.com>
| |
* | huffyuvdsp: move functions only used by huffyuv from lossless_videodsp  James Almer  2017-01-12
| |     Signed-off-by: James Almer <jamrial@gmail.com>
| |
* | huffyuvencdsp: move shared functions to a new lossless_videoencdsp context  James Almer  2017-01-12
| |     Signed-off-by: James Almer <jamrial@gmail.com>
| |
* | huffyuvencdsp: move functions only used by huffyuv from lossless_videodsp  James Almer  2017-01-12
| |     Signed-off-by: James Almer <jamrial@gmail.com>
| |
* | lossless_videodsp: move shared functions from huffyuvdsp  James Almer  2017-01-12
| |     Several codecs other than huffyuv use them.
| |
| |     Signed-off-by: James Almer <jamrial@gmail.com>
| |
* | avcodec/x86/vc1dsp_mc: Fix build with NASM 2.09.10  Michael Niedermayer  2017-01-02
| |     make fate passes
| |
| |     Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
| |     Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
| |
* | avcodec/x86/imdct36: fix building with nasm 2.11.05  John Comeau  2017-01-02
| |     fixes `operation size not specified` errors as described here:
| |     http://stackoverflow.com/questions/36854583/compiling-ffmpeg-for-kali-linux-2
| |
| |     I rebuilt again with yasm and made sure it didn't break that.
| |
| |     Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
| |
* | avcodec/magicyuv: add 10 bit support  Paul B Mahol  2016-12-20
| |     Signed-off-by: Paul B Mahol <onemda@gmail.com>
| |
* | avcodec/h264: resolve assert being triggered when stack is not aligned  James Darnley  2016-12-07
| |     32-bit msvc.
| |
* | avcodec/h264: mmx2, sse2, avx 10-bit 4:2:2 h chroma deblock/loop filter  James Darnley  2016-12-07
| |     Yorkfield:
| |      - mmx2: 2.53x (504 vs. 199 cycles)
| |      - sse2: 3.83x (504 vs. 131 cycles)
| |
| |     Nehalem:
| |      - mmx2: 2.42x (365 vs. 151 cycles)
| |      - sse2: 3.56x (365 vs. 103 cycles)
| |
| |     Skylake:
| |      - mmx2: 1.81x (308 vs. 170 cycles)
| |      - sse2: 2.84x (308 vs. 108 cycles)
| |      - avx:  2.93x (308 vs. 105 cycles)
| |
* | avcodec/h264: mmx2, sse2, avx 10-bit h chroma deblock/loop filter  James Darnley  2016-12-07
| |     Yorkfield:
| |      - mmx2: 2.45x (279 vs. 114 cycles)
| |      - sse2: 3.36x (279 vs. 83 cycles)
| |
| |     Nehalem:
| |      - mmx2: 2.10x (192 vs. 92 cycles)
| |      - sse2: 2.84x (192 vs. 68 cycles)
| |
| |     Skylake:
| |      - mmx2: 1.75x (170 vs. 97 cycles)
| |      - sse2: 2.47x (170 vs. 69 cycles)
| |      - avx:  2.47x (170 vs. 69 cycles)
| |
* | whitespace changes after last commit  James Darnley  2016-12-07
| |
* | avcodec/h264: clean up and expand x86 function definitions  James Darnley  2016-12-07
| |
* | avcodec/h264: sse2 and avx 4:2:2 idct add8 10-bit functions  James Darnley  2016-11-30
| |     Yorkfield:
| |      - sse2:
| |        - complex: 4.13x faster (1514 vs. 367 cycles)
| |        - simple:  4.38x faster (1836 vs. 419 cycles)
| |
| |     Skylake:
| |      - sse2:
| |        - complex: 3.61x faster ( 936 vs. 260 cycles)
| |        - simple:  3.97x faster (1126 vs. 284 cycles)
| |      - avx (versus sse2):
| |        - complex: 1.07x faster (260 vs. 244 cycles)
| |        - simple:  1.03x faster (284 vs. 274 cycles)
| |
* | avcodec/h264: mmx 4:2:2 idct add8 function  James Darnley  2016-11-30
| |     2.87 times faster (1830 vs. 638 cycles)
| |
* | avcodec/h264: mmxext 4:2:2 chroma intra deblock/loop filter  James Darnley  2016-11-30
| |     2.1 times faster (401 vs. 194 cycles)
| |
* | x86/vp9itxfm: add missing AVX2 guards  James Almer  2016-11-18
| |     Fixes compilation with Yasm 1.1.0 and older.
| |
| |     Signed-off-by: James Almer <jamrial@gmail.com>
| |
* | vp9: add avx2 iadst16 implementations.  Ronald S. Bultje  2016-11-15
| |     Also a small cosmetic change to the avx2 idct16 version to make it explicit that one of the arguments to the write-out macros is unused for >=avx2 (it uses pmovzxbw instead of punpcklbw).
| |
* | Merge commit '4a081f224e12f4227ae966bcbdd5384f22121ecf'  Hendrik Leppkes  2016-11-13
|\|
| |     * commit '4a081f224e12f4227ae966bcbdd5384f22121ecf':
| |       libavcodec: fix constness in clobber test avcodec_open2() wrappers
| |
| |     Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
| |
| * libavcodec: fix constness in clobber test avcodec_open2() wrappers  Clément Bœsch  2016-06-26
| |     Signed-off-by: Martin Storsjö <martin@martin.st>
| |
* | doc: fix spelling errors  Andreas Cadhalpun  2016-10-21
| |     Thanks to Mathieu Malaterre <malat@debian.org> for reporting the Que/Queue typo. (https://bugs.debian.org/839542)
| |
| |     Reviewed-by: Lou Logan <lou@lrcd.com>
| |     Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
| |
* | aacenc: add SIMD optimizations for abs_pow34 and quantization  Rostislav Pehlivanov  2016-10-18
| |     Performance improvements:
| |
| |     quant_bands:
| |         with:    681 decicycles in quant_bands, 8388453 runs, 155 skips
| |         without: 1190 decicycles in quant_bands, 8388386 runs, 222 skips
| |     Around 42% for the function
| |
| |     Twoloop coder:
| |
| |     abs_pow34:
| |         with/without: 7.82s/8.17s
| |     Around 4% for the entire encoder
| |
| |     Both:
| |         with/without: 7.15s/8.17s
| |     Around 12% for the entire encoder
| |
| |     Fast coder:
| |
| |     abs_pow34:
| |         with/without: 3.40s/3.77s
| |     Around 10% for the entire encoder
| |
| |     Both:
| |         with/without: 3.02s/3.77s
| |     Around 20% faster for the entire encoder
| |
| |     Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
| |     Tested-by: Michael Niedermayer <michael@niedermayer.cc>
| |     Reviewed-by: James Almer <jamrial@gmail.com>
| |
* | avcodec: fix arguments on xmm/neon clobber test wrappers  James Almer  2016-10-02
| |     Signed-off-by: James Almer <jamrial@gmail.com>
| |
* | avcodec: add missing xmm/neon clobber test wrappers for the new encode API  James Almer  2016-10-01
| |     Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
| |     Signed-off-by: James Almer <jamrial@gmail.com>
| |
* | x86/h264_weight: use appropriate register size for weight parameters  Hendrik Leppkes  2016-09-23
| |     Fixes trac 5579
| |
| |     Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
| |     Acked-by: Michael Niedermayer <michael@niedermayer.cc>
| |
* | avcodec/h264: Use ptrdiff_t for (bi)weight functions  Michael Niedermayer  2016-09-23
| |     Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
| |
* | avcodec/ttadsp: cosmetics  James Almer  2016-08-06
| |     Clean some header includes and use the same naming scheme as in ttaencdsp
| |
| |     Signed-off-by: James Almer <jamrial@gmail.com>
| |
* | x86/ttaenc: add ff_ttaenc_filter_process_{ssse3,sse4}  James Almer  2016-08-02
| |     Signed-off-by: James Almer <jamrial@gmail.com>
| |
* | Merge commit '9df889a5f116c1ee78c2f239e0ba599c492431aa'  Clément Bœsch  2016-07-29
|\|
| |     * commit '9df889a5f116c1ee78c2f239e0ba599c492431aa':
| |       h264: rename h264.[ch] to h264dec.[ch]
| |
| |     Merged-by: Clément Bœsch <u@pkh.me>