path: root/libavcodec/x86
Commit message    Author    Age
* Merge commit '0338c396987c82b41d322630ea9712fe5f9561d6'    Michael Niedermayer    2013-11-08
|\
| |   * commit '0338c396987c82b41d322630ea9712fe5f9561d6':
| |     dsputil: Split off H.263 bits into their own H263DSPContext
| |
| |   Conflicts:
| |     configure
| |     libavcodec/mpegvideo.h
| |     libavcodec/mpegvideo_enc.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * dsputil: Split off H.263 bits into their own H263DSPContext    Diego Biurrun    2013-11-08
| |
* | avcodec/vp9: add ff_vp9_idct_idct_{4x4,8x8}_ssse3().    Clément Bœsch    2013-11-05
| |   1789 decicycles in idct_idct_4x4_add_c, 262136 runs, 8 skips
| |   1839 decicycles in idct_idct_4x4_add_c, 524270 runs, 18 skips
| |   1864 decicycles in idct_idct_4x4_add_c, 1048548 runs, 28 skips
| |    529 decicycles in ff_vp9_idct_idct_4x4_add_ssse3, 262138 runs, 6 skips
| |    516 decicycles in ff_vp9_idct_idct_4x4_add_ssse3, 524282 runs, 6 skips
| |    474 decicycles in ff_vp9_idct_idct_4x4_add_ssse3, 1048565 runs, 11 skips
| |   (~3.9x faster)
| |
| |   7726 decicycles in idct_idct_8x8_add_c, 1048433 runs, 143 skips
| |   7732 decicycles in idct_idct_8x8_add_c, 2096882 runs, 270 skips
| |   7731 decicycles in idct_idct_8x8_add_c, 4193772 runs, 532 skips
| |   1145 decicycles in ff_vp9_idct_idct_8x8_add_ssse3, 1048549 runs, 27 skips
| |   1137 decicycles in ff_vp9_idct_idct_8x8_add_ssse3, 2097097 runs, 55 skips
| |   1086 decicycles in ff_vp9_idct_idct_8x8_add_ssse3, 4194188 runs, 116 skips
| |   (~7.1x faster)
| |
| |   Overall decode time before commit:
| |     16.48s user 0.03s system 99% cpu 16.526 total
| |     16.54s user 0.01s system 99% cpu 16.566 total
| |     16.46s user 0.03s system 99% cpu 16.511 total
| |
| |   Overall decode time after commit:
| |     16.34s user 0.02s system 99% cpu 16.378 total
| |     16.28s user 0.02s system 99% cpu 16.315 total
| |     16.32s user 0.03s system 99% cpu 16.366 total
| |
| |   Tested on i7 920 with 40s 1080p footage.
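The "~3.9x" and "~7.1x" figures in the message above appear to correspond to the ratio of the last (longest) run of each C/SSSE3 pair. A minimal sketch of that arithmetic in C; the helper name `speedup` is hypothetical and not part of FFmpeg:

```c
/* Hypothetical helper (not an FFmpeg function): ratio of the reference
 * cost to the optimized cost, both expressed in the same unit
 * (decicycles in the measurements quoted above). */
static double speedup(double reference, double optimized)
{
    return reference / optimized;
}
```

With the quoted numbers, `speedup(1864, 474)` is roughly 3.93 and `speedup(7731, 1086)` is roughly 7.12. The whole-file decode time moves by only about 1% (16.5 s to 16.4 s total), which is what the message reports.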
* | Merge commit 'e2b5b097898c9155f4bdff4d83cdc54d5eef6930'    Michael Niedermayer    2013-11-05
|\|
| |   * commit 'e2b5b097898c9155f4bdff4d83cdc54d5eef6930':
| |     x86: rv40dsp: Use PAVGB instruction macro where appropriate
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: rv40dsp: Use PAVGB instruction macro where appropriate    Diego Biurrun    2013-11-04
| |
| * x86: hpeldsp: Use PAVGB instruction macro where necessary    Mikulas Patocka    2013-11-04
| |   Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
| |   Signed-off-by: Diego Biurrun <diego@biurrun.de>
* | avcodec/x86/hpeldsp: fix crash on AMD K6-3+    Mikulas Patocka    2013-11-03
| |   There are instructions pavgb and pavgusb. Both instructions do the same
| |   operation but they have different encodings. Pavgb exists in the SSE (or
| |   MMXEXT) instruction set and pavgusb exists in the 3DNow! instruction set.
| |
| |   libavcodec uses the macro PAVGB to select the proper instruction. However,
| |   the function avg_pixels8_xy2 doesn't use this macro; it uses pavgb directly.
| |
| |   As a consequence, the function avg_pixels8_xy2 crashes on AMD K6-2 and K6-3
| |   processors, because they have pavgusb, but not pavgb.
| |
| |   This bug seems to have been introduced by commit
| |   71155d7b4157fee44c0d3d0fc1b660ebfb9ccf46, "dsputil: x86: Convert mpeg4 qpel
| |   and dsputil avg to yasm"
| |
| |   Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
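Both instructions compute a rounded average of unsigned bytes, so the choice between them depends only on which instruction set the CPU offers. A scalar C sketch of the per-byte operation and of the kind of selection a PAVGB-style macro performs; the enum and function names are hypothetical illustrations, and the real macro lives in the assembly sources:

```c
#include <stdint.h>

/* Per-byte operation shared by pavgb and pavgusb:
 * unsigned average with the halfway case rounded up. */
static uint8_t avg_round_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* Hypothetical instruction-set tags, for illustration only. */
enum simd_set { SIMD_3DNOW, SIMD_MMXEXT };

/* Mnemonic a PAVGB-style macro would expand to for the given set. */
static const char *pavgb_mnemonic(enum simd_set set)
{
    return set == SIMD_3DNOW ? "pavgusb" : "pavgb";
}
```

Code that hard-codes `pavgb` bypasses this selection, which is the failure mode the message describes for CPUs that only have 3DNow!.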
* | Merge commit '1700b4e678ed329611a16b20d11e64b7abda4839'    Michael Niedermayer    2013-11-02
|\|
| |   * commit '1700b4e678ed329611a16b20d11e64b7abda4839':
| |     x86: vp8dsp: Split loopfilter code into a separate file
| |
| |   Conflicts:
| |     libavcodec/x86/Makefile
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: vp8dsp: Split loopfilter code into a separate file    Diego Biurrun    2013-11-01
| |
* | avcodec/cabac: support UNCHECKED_BITSTREAM_READER = 0    Michael Niedermayer    2013-10-31
| |   Fixes overreads in HEVC
| |   Fixes Ticket3070
| |   Also fixed remaining issues from Ticket3075 and Ticket3076
| |
| |   Some lines of code taken from
| |   0c5f839693da2276c2da23400f67a67be4ea0af1:libavcodec/x86/cabac.h and
| |   0c5f839693da2276c2da23400f67a67be4ea0af1:libavcodec/cabac_functions.h
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | avcodec/x86/videodsp: Small speedups in ff_emulated_edge_mc x86 SIMD.    Ronald S. Bultje    2013-10-27
| |   Don't use word-size multiplications if size == 2, and if we're using
| |   SIMD instructions (size >= 8), complete leftover 4byte sets using movd,
| |   not mov. Both of these changes lead to minor speedups.
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | avcodec/x86/videodsp: fix a bug in a %if statement where we used '%%' instead of '&&'.    Ronald S. Bultje    2013-10-27
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | avcodec/x86/cabac: include get_cabac_bypass_sign_x86() under #if !BROKEN_COMPILER    Michael Niedermayer    2013-10-26
| |   This might fix Ticket2999 as well as some fate clients.
| |   Untested, as the original patch submitter no longer has the environment to test.
| |   This should be reverted if it does not fix the issues.
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | avcodec/x86/videodsp: Properly mark sse2 instructions in emulated_edge_mc x86 simd as such.    Ronald S. Bultje    2013-10-24
| |   Should fix crashes or corrupt output on pre-SSE2 CPUs when they were
| |   using SSE2-code (e.g. AMD Athlon XP 2400+ or Intel Pentium III) in
| |   hfix or hvar single-edge (left/right) extension functions.
| |
| |   Tested-by: Ingo Brückl <ib@wupperonline.de>
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | avcodec/x86/dsputil_init: move ff_idct_xvid_mmxext init    Michael Niedermayer    2013-10-15
| |   This decreases the diff to libav
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | avcodec/x86/dsputil_init: remove duplicated sse2 idct init    Michael Niedermayer    2013-10-15
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | avcodec/x86/dsputil_init: fix cpu flag checks    Michael Niedermayer    2013-10-15
| |   Fixes linking failure with --disable-sse2
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | libavcodec/x86: Fix emulated_edge_mc SSE code to not contain SSE2 instructions on x86-32.    Ronald S. Bultje    2013-10-10
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | x86: Fix compilation with nasm on PPC & OS/2    Ronald S. Bultje    2013-10-08
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge remote-tracking branch 'qatar/master'    Michael Niedermayer    2013-10-08
|\|
| |   * qatar/master:
| |     x86: h264_idct: Update comments to match 8/10-bit depth optimization split
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: h264_idct: Update comments to match 8/10-bit depth optimization split    Diego Biurrun    2013-10-07
| |
* | Merge commit 'bbe4a6db44f0b55b424a5cc9d3e89cd88e250450'    Michael Niedermayer    2013-10-08
|\|
| |   * commit 'bbe4a6db44f0b55b424a5cc9d3e89cd88e250450':
| |     x86inc: Utilize the shadow space on 64-bit Windows
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86inc: Utilize the shadow space on 64-bit Windows    Henrik Gramner    2013-10-07
| |   Store XMM6 and XMM7 in the shadow space in functions that clobber them.
| |   This way we don't have to adjust the stack pointer as often, reducing
| |   the number of instructions as well as code size.
| |
| |   Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
* | avcodec/x86/vp9dsp: Fix compilation with nasm.    Ronald S. Bultje    2013-10-08
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge remote-tracking branch 'qatar/master'    Michael Niedermayer    2013-10-07
|\|
| |   * qatar/master:
| |     x86: fdct: Employ more specific ifdefs
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: fdct: Employ more specific ifdefs    Diego Biurrun    2013-10-06
| |   This avoids building mmxext and sse2 code when disabled by configure.
* | Merge commit '2ddb35b91131115c094d90e04031451023441b4d'    Michael Niedermayer    2013-10-06
|\|
| |   * commit '2ddb35b91131115c094d90e04031451023441b4d':
| |     x86: dsputil: Separate ff_add_hfyu_median_prediction_cmov from dsputil_mmx
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: dsputil: Separate ff_add_hfyu_median_prediction_cmov from dsputil_mmx    Diego Biurrun    2013-10-05
| |   The function does not depend on MMX and compilation without MMX enabled
| |   fails if the function is compiled conditional on MMX availability.
* | Merge commit '258414d0771845d20f646ffe4d4e60f22fba217c'    Michael Niedermayer    2013-10-06
|\|
| |   * commit '258414d0771845d20f646ffe4d4e60f22fba217c':
| |     x86: fdct: Initialize optimized fdct implementations in the standard way
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: fdct: Initialize optimized fdct implementations in the standard way    Diego Biurrun    2013-10-05
| |
* | Merge commit '0b8b2ae5e93d616c2ece59f7175f483154cff918'    Michael Niedermayer    2013-10-06
|\|
| |   * commit '0b8b2ae5e93d616c2ece59f7175f483154cff918':
| |     x86: xviddct: Employ more specific ifdefs
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: xviddct: Employ more specific ifdefs    Diego Biurrun    2013-10-05
| |   This avoids building mmxext and sse2 code when disabled by configure.
* | Merge remote-tracking branch 'qatar/master'    Michael Niedermayer    2013-10-04
|\|
| |   * qatar/master:
| |     x86: fdct: Only build fdct code if encoders have been enabled
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: fdct: Only build fdct code if encoders have been enabled    Diego Biurrun    2013-10-04
| |   fdct is only initialized if encoders are enabled.
| * x86: Add an xmm clobbering wrapper for avcodec_encode_video2    Martin Storsjö    2013-09-16
| |   This is required since 187105ff8 when we started trying to
| |   wrap this function as well.
| |
| |   Signed-off-by: Martin Storsjö <martin@martin.st>
| * mathops/x86: work around inline asm miscompilation with GCC 4.8.1    Hendrik Leppkes    2013-09-15
| |   The volatile is not required here, and removing it prevents a
| |   miscompilation with GCC 4.8.1 when building on x86 with --cpu=i686
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |   Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
* | Full-pixel MC functions.    Ronald S. Bultje    2013-10-02
| |   Decoding time of ped1080p.webm goes from 11.3sec to 11.1sec.
* | VP9 MC (ssse3) optimizations.    Ronald S. Bultje    2013-10-02
| |   Decoding time of ped1080p.webm goes from 20.7sec to 11.3sec.
* | Rewrite emu_edge functions to have separate src/dst_stride arguments.    Ronald S. Bultje    2013-09-28
| |   This allows supporting files for which the image stride is smaller than
| |   the max. block size + number of subpel mc taps, e.g. a 64x64 VP9 file
| |   or a 16x16 VP8 file with -fflags +emu_edge.
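Why separate strides help can be sketched in C: the source picture keeps its own, possibly small, stride while the padded block is written into a scratch buffer with a larger one. This is a simplified illustration, not FFmpeg's implementation; the function name, the parameter order, and the convention that `src` points at the picture origin are all assumptions made for the example:

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp v into the inclusive range [lo, hi]. */
static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical edge-emulating copy: fills a block_w x block_h block in dst
 * from the picture at (src_x, src_y), replicating border pixels for
 * coordinates that fall outside the pic_w x pic_h picture.
 * dst and src each use their own stride. */
static void emu_edge_copy(uint8_t *dst, ptrdiff_t dst_stride,
                          const uint8_t *src, ptrdiff_t src_stride,
                          int block_w, int block_h,
                          int src_x, int src_y, int pic_w, int pic_h)
{
    for (int y = 0; y < block_h; y++) {
        int sy = clamp_int(src_y + y, 0, pic_h - 1);
        for (int x = 0; x < block_w; x++) {
            int sx = clamp_int(src_x + x, 0, pic_w - 1);
            dst[y * dst_stride + x] = src[sy * src_stride + sx];
        }
    }
}
```

With a single shared stride, a 4x4 block could not be written out for a picture whose stride is only 2 bytes; giving the destination its own stride removes that coupling.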
* | Convert multiplier for MV from int to ptrdiff_t.    Ronald S. Bultje    2013-09-28
| |   This prevents emulated_edge_mc from not undoing mvy*stride-related
| |   integer overflows.
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
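The overflow concern can be sketched in C: with an `int` multiplier, `mvy * stride` is computed in `int` and may overflow, whereas a `ptrdiff_t` operand makes the multiplication happen at pointer-difference width. The function name below is hypothetical, and the benefit assumes a platform where `ptrdiff_t` is wider than `int` (typical for 64-bit targets):

```c
#include <stddef.h>

/* Hypothetical helper: byte offset implied by a vertical motion vector.
 * Because stride is ptrdiff_t, mvy is converted to ptrdiff_t before the
 * multiplication, so the product is not truncated to int range. */
static ptrdiff_t mv_byte_offset(int mvy, ptrdiff_t stride)
{
    return mvy * stride;
}
```

An offset computed this way can later be subtracted again without the two sides disagreeing about wrapped values.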
* | x86: Add an xmm clobbering wrapper for avcodec_encode_video2    Martin Storsjö    2013-09-17
| |   This is required since 187105ff8 when we started trying to
| |   wrap this function as well.
| |
| |   Signed-off-by: Martin Storsjö <martin@martin.st>
* | avcodec: add emuedge_linesize_type    Michael Niedermayer    2013-09-04
| |   Currently all uses of the emu edge code, as well as the code itself,
| |   assume int linesize. Changing some but not all of them would introduce
| |   a security issue. Once all use this typedef, a simple search and replace
| |   can be done to switch them all to ptrdiff_t.
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | x86/simple_idct: use LOCAL_ALIGNED instead of DECLARE_ALIGNED    Paul B Mahol    2013-09-03
| |   Signed-off-by: Paul B Mahol <onemda@gmail.com>
* | Reinstate proper FFmpeg license for all files.    Thilo Borgmann    2013-08-30
| |
* | Fix compilation with --disable-mmx.    Carl Eugen Hoyos    2013-08-30
| |
* | Merge commit 'e998b56362c711701b3daa34e7b956e7126336f4'    Michael Niedermayer    2013-08-30
|\|
| |   * commit 'e998b56362c711701b3daa34e7b956e7126336f4':
| |     x86: avcodec: Consistently structure CPU extension initialization
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: avcodec: Consistently structure CPU extension initialization    Diego Biurrun    2013-08-29
| |
* | avcodec/x86/lpc: Fix cpu flag checks so they work    Michael Niedermayer    2013-08-30
| |   Broken by 6369ba3c9cc74becfaad2a8882dff3dd3e7ae3c0
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | avcodec/x86/vp8dsp: Fix cpu flag checks so they work    Michael Niedermayer    2013-08-30
| |   Broken by 6369ba3c9cc74becfaad2a8882dff3dd3e7ae3c0
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge commit '6369ba3c9cc74becfaad2a8882dff3dd3e7ae3c0'    Michael Niedermayer    2013-08-30
|\|
| |   * commit '6369ba3c9cc74becfaad2a8882dff3dd3e7ae3c0':
| |     x86: avcodec: Use convenience macros to check for CPU flags
| |
| |   Conflicts:
| |     libavcodec/x86/dsputil_init.c
| |     libavcodec/x86/hpeldsp_init.c
| |     libavcodec/x86/motion_est.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>