path: root/libavcodec
Commit message (Author, Age)
...
* | | x86/vp9lpf: add ff_vp9_loop_filter_[vh]_88_16_{ssse3,avx}. (Clément Bœsch, 2014-01-28)
| | |
| | | 9680 decicycles in loop_filter_v_88_16_c, 4193765 runs, 539 skips
| | | 9233 decicycles in loop_filter_h_88_16_c, 4193751 runs, 553 skips
| | | 1929 decicycles in ff_vp9_loop_filter_v_88_16_ssse3, 4194118 runs, 186 skips
| | | 2738 decicycles in ff_vp9_loop_filter_h_88_16_ssse3, 4193861 runs, 443 skips
| | |
| | | 5.978 → 5.417 overall decode time on ped1080p.webm (-threads 1)
| | |
| | | Adding SSE2 support should be relatively trivial (just a matter of
| | | changing the pshufb [mask_mix] with something else), patch welcome.
| | |
* | | avcodec/huffyuv: don't depend on bitstream_bpp having a specific value for version>2 (Michael Niedermayer, 2014-01-28)
| | |
| | | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| | |
* | | avcodec/libfdk-aacenc: change MODE_7_1_FRONT_CENTER to map to AV_CH_LAYOUT_7POINT1_WIDE_BACK (Michael Niedermayer, 2014-01-27)
| | |
| | | This was suggested by Rodeo on IRC
| | | <Rodeo> for consistency with the rest, MODE_7_1_FRONT_CENTER would be
| | | AV_CH_LAYOUT_7POINT1_WIDE_BACK (since LS+RS is mapped to back channels
| | | in other modes)
| | |
| | | Reviewed-by: Jean First <jeanfirst@gmail.com>
| | | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| | |
* | | avcodec/libfdk-aacenc: change MODE_7_1_REAR_SURROUND to map to AV_CH_LAYOUT_7POINT1 (Michael Niedermayer, 2014-01-27)
| | |
| | | This was suggested by Rodeo on IRC
| | | <Rodeo> sorry, I meant MODE_7_1_REAR_SURROUND would probably be
| | | AV_CH_LAYOUT_7POINT1
| | |
| | | Reviewed-by: Jean First <jeanfirst@gmail.com>
| | | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| | |
* | | x86/vp9lpf: add a preload system in FILTER_UPDATE. (Clément Bœsch, 2014-01-27)
| | |
| | | Allow some macro refactoring in filter14().
| | |
* | | x86/vp9lpf: refactor v/h using common macros for P7 to Q7. (Clément Bœsch, 2014-01-27)
| | |
* | | x86/vp9lpf: faster P7..Q7 accesses. (Clément Bœsch, 2014-01-27)
| |/
|/|
| | Introduce 2 additional registers for stride3 and mstride3 to allow
| | direct accesses (lea drops).
| |
| | 3931 → 3827 decicycles in ff_vp9_loop_filter_v_16_16_ssse3
| |
| | Also uses defines to clarify the code.
| |
* | Fix decoding of some 8 < bpc < 16 signed j2k samples with libopenjpeg. (Carl Eugen Hoyos, 2014-01-27)
| |
| | No testcase known.
| |
| | Reviewed-by: Michael Bradshaw
| |
* | dxva2: bump maximum number of slices for mpeg2 (Rainer Hochecker, 2014-01-27)
| |
| | Suggested by heleppkes on https://trac.ffmpeg.org/ticket/3133
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | avcodec/huffyuv: support gbrp9/10/12/14 (Michael Niedermayer, 2014-01-27)
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | avcodec/huffyuv: update years in copyright (Michael Niedermayer, 2014-01-27)
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | vp9: fix invalid ref frame w/h on size change. (Ronald S. Bultje, 2014-01-26)
| |
| | Fixes invalid reads and crashes in vp90-2-05-resize.webm and fuzzed6.ivf.
| | The output is still not identical to what libvpx does (because we don't
| | actually scale in MC).
| |
| | Reviewed-by: ubitux
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | vp9: disable use_last_frame_mvs on resolution change (scalable). (Ronald S. Bultje, 2014-01-26)
| |
| | Prevents some invalid memory accesses after resolution change in
| | vp90-2-05-resize.webm, and libvpx does this too.
| |
| | Reviewed-by: ubitux
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | avcodec/huffyuvdec: optimize >8bps VLC reading (Michael Niedermayer, 2014-01-26)
| |
| | 97479 -> 54891 decicycles
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | avcodec/huffyuvenc: fix end pointer for stats_out (Michael Niedermayer, 2014-01-26)
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | avcodec/huffyuvenc: fail if stats_out is too small instead of silently truncating (Michael Niedermayer, 2014-01-26)
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | avcodec/libfdk_aacenc: enable 7.1 channel encoding (Jean First, 2014-01-26)
| |
| | 7.1(wide) and 7.1(wide-side) channel layouts have been supported in
| | fdk_aac since October 2013 (commit fa3eba1644)
| |
| | Signed-off-by: Jean First <jeanfirst@gmail.com>
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | avcodec/mpeg12dec: Revert Change to mpeg2_fast_decode_block_non_intra (Michael Niedermayer, 2014-01-26)
| |
| | This fixes the speed regression from 20626f53e9f41cb3db82329ed3db7d773cfa3a8f
| | and still checks sufficiently to prevent out of allocated memory accesses
| | due to the index
| |
| | Before:
| | 1823 decicycles in mpeg2_fast_decode_block_non_intra, 8388493 runs, 115 skips
| | After:
| | 1808 decicycles in mpeg2_fast_decode_block_non_intra, 8388494 runs, 114 skips
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | avcodec/mpeg12dec: Redesign index checks for mpeg2_fast_decode_block_intra (Michael Niedermayer, 2014-01-26)
| |
| | This fixes the speed regression from 20626f53e9f41cb3db82329ed3db7d773cfa3a8f
| | and still checks sufficiently to prevent out of allocated memory accesses
| | due to the index
| |
| | Before:
| | 1681 decicycles in mpeg2_fast_decode_block_intra, 4194238 runs, 66 skips
| | After:
| | 1658 decicycles in mpeg2_fast_decode_block_intra, 4194248 runs, 56 skips
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | Merge commit '6d93307f8df81808f0dcdbc064b848054a6e83b3' (Michael Niedermayer, 2014-01-26)
|\|
| | * commit '6d93307f8df81808f0dcdbc064b848054a6e83b3':
| |   mpeg12: check scantable indices in all decode_block functions
| |
| | Benchmarks
| | Before:
| | 1878 decicycles in mpeg2_decode_block_non_intra, 8388487 runs, 121 skips
| | 1700 decicycles in mpeg2_decode_block_intra, 4194239 runs, 65 skips
| | 1808 decicycles in mpeg2_fast_decode_block_non_intra, 8388492 runs, 116 skips
| | 1669 decicycles in mpeg2_fast_decode_block_intra, 4194248 runs, 56 skips
| | --
| | 2056 decicycles in mpeg1_decode_block_inter, 65535 runs, 1 skips
| | 2346 decicycles in mpeg1_decode_block_intra, 32768 runs, 0 skips
| | 2011 decicycles in mpeg1_fast_decode_block_inter, 65533 runs, 3 skips
| | ----------------
| | After:
| | 1858 decicycles in mpeg2_decode_block_non_intra, 8388490 runs, 118 skips
| | 1691 decicycles in mpeg2_decode_block_intra, 4194233 runs, 71 skips
| | 1823 decicycles in mpeg2_fast_decode_block_non_intra, 8388493 runs, 115 skips
| | 1681 decicycles in mpeg2_fast_decode_block_intra, 4194238 runs, 66 skips
| | --
| | 2010 decicycles in mpeg1_decode_block_inter, 65535 runs, 1 skips
| | 2322 decicycles in mpeg1_decode_block_intra, 32766 runs, 2 skips
| | 1995 decicycles in mpeg1_fast_decode_block_inter, 65535 runs, 1 skips
| |
| | All benchmarks are the best scores of several runs
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| |
| * mpeg12: check scantable indices in all decode_block functions (Janne Grunau, 2014-01-25)
| |
| | Add checks to the fast functions used with CODEC_FLAGS2_FAST and move the
| | check for all other functions to before the invalid memory is accessed.
| |
| | Fixes https://trac.videolan.org/vlc/ticket/9713 with CODEC_FLAGS2_FAST.
| |
| | CC: libav-stable@libav.org
| |
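The idea of checking a scan index before the memory access, rather than after it, can be sketched as follows. This is only an illustration under stated assumptions, not the decoder's code: `RunLevel`, `place_coeffs`, and the pre-parsed symbol array are hypothetical stand-ins for the VLC reading done by the real `decode_block` functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pre-parsed (run, level) symbol; the real decoder reads
 * these from the bitstream one VLC at a time. */
typedef struct {
    uint8_t run;    /* number of zero coefficients skipped */
    int16_t level;  /* value of the next non-zero coefficient */
} RunLevel;

/* Place AC coefficients into a 64-entry block through a scan table.
 * The index is validated BEFORE it is used to address memory, so a
 * corrupt stream cannot cause a read or write past the tables.
 * Returns the last index written, or -1 on an out-of-range index. */
static int place_coeffs(int16_t block[64], const uint8_t scantable[64],
                        const RunLevel *syms, int nb_syms)
{
    int i = 0; /* index 0 is the DC coefficient, handled elsewhere */

    assert(nb_syms >= 0);
    for (int n = 0; n < nb_syms; n++) {
        i += syms[n].run + 1;
        if (i > 63)
            return -1;              /* reject before touching memory */
        block[scantable[i]] = syms[n].level;
    }
    return i;
}
```

Returning an error as soon as the index leaves the 0..63 range costs one comparison per coefficient, which is consistent with the small cycle differences reported in the benchmarks for these functions.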
* | Merge commit 'fb0c9d41d685abb58575c5482ca33b8cd457c5ec' (Michael Niedermayer, 2014-01-26)
|\|
| | * commit 'fb0c9d41d685abb58575c5482ca33b8cd457c5ec':
| |   avutil: remove timer.h include from internal.h
| |
| | Conflicts:
| |     libavcodec/ffv1dec.c
| |     libavutil/internal.h
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| |
| * avutil: remove timer.h include from internal.h (Janne Grunau, 2014-01-25)
| |
| | Added libavutil/timer.h include to all files with {START,STOP}_TIMER.
| |
* | avcodec/huffyuv: support AV_PIX_FMT_YUV(A)4XYP16 and GRAY16 (Michael Niedermayer, 2014-01-26)
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | avcodec/libx264: also consider ticks per frame for fps/timebase setup (Michael Niedermayer, 2014-01-25)
| |
| | Setting fps = 1/timebase is not correct
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
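Why `fps = 1/timebase` falls short can be shown with a little rational arithmetic. The sketch below is an assumption-laden illustration, not the libx264 wrapper's code: `Rational` and `frame_rate_from_timebase` are hypothetical names, and the formula assumed is that one frame spans `ticks_per_frame` time-base ticks.

```c
#include <assert.h>

/* Hypothetical numerator/denominator pair used only for this sketch. */
typedef struct {
    int num;
    int den;
} Rational;

/* Derive a frame rate from a time base when each frame lasts
 * ticks_per_frame ticks: fps = 1 / (time_base * ticks_per_frame).
 * With ticks_per_frame == 1 this reduces to the plain 1/time_base. */
static Rational frame_rate_from_timebase(Rational time_base, int ticks_per_frame)
{
    assert(time_base.num > 0 && time_base.den > 0 && ticks_per_frame > 0);

    Rational fps = { time_base.den, time_base.num * ticks_per_frame };
    return fps;
}
```

For example, a time base of 1001/60000 with two ticks per frame gives 60000/2002, which equals 30000/1001 (about 29.97 fps), whereas inverting the time base alone would report twice that rate.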
* | x86/lossless_videodsp: silly one-line cosmetic. (Clément Bœsch, 2014-01-25)
| |
* | x86/lossless_videodsp: use common macro for add and diff int16 loop. (Clément Bœsch, 2014-01-25)
| |
* | x86/lossless_videodsp: simplify and explicit aligned/unaligned flags (Clément Bœsch, 2014-01-25)
| |
* | Merge remote-tracking branch 'rbultje/vp9-simd' (Michael Niedermayer, 2014-01-25)
|\ \
| | |
| | | * rbultje/vp9-simd:
| | |   vp9: fix memory corruption if header decoding fails after size change.
| | |   vp9/x86: use explicit register for relative stack references.
| | |   vp9/x86: iwht4x4 (lossless) mmx.
| | |   vp9/x86: 4x4 iadst SIMD (ssse3) variants.
| | |   vp9/x86: 8x8 iadst SIMD (ssse3/avx) variants.
| | |
| | | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| | |
| * | vp9: fix memory corruption if header decoding fails after size change. (Ronald S. Bultje, 2014-01-24)
| | |
| * | vp9/x86: use explicit register for relative stack references. (Ronald S. Bultje, 2014-01-24)
| | |
| | | Before this patch, we explicitly modify rsp, which isn't necessarily
| | | universally acceptable, since the space under the stack pointer might
| | | be modified in things like signal handlers. Therefore, use an explicit
| | | register to hold the stack pointer relative to the bottom of the stack
| | | (i.e. rsp). This will also clear out valgrind errors about the use of
| | | uninitialized data that started occurring after the idct16x16/ssse3
| | | optimizations were first merged.
| | |
| * | vp9/x86: iwht4x4 (lossless) mmx. (Ronald S. Bultje, 2014-01-24)
| | |
| * | vp9/x86: 4x4 iadst SIMD (ssse3) variants. (Ronald S. Bultje, 2014-01-24)
| | |
| | | Cycle measurements for intra itxfm_4x4_add on ped1080p.webm:
| | | idct_idct:   66 -> 67 cycles (noise measurement)
| | | idct_iadst:  199 -> 79 cycles
| | | iadst_idct:  165 -> 70 cycles
| | | iadst_iadst: 183 -> 82 cycles
| | |
| * | vp9/x86: 8x8 iadst SIMD (ssse3/avx) variants. (Ronald S. Bultje, 2014-01-24)
| | |
| | | Cycle measurements for intra itxfm_8x8_add on ped1080p.webm:
| | | idct_idct:   133 -> 135 cycles (noise measurement)
| | | idct_iadst:  900 -> 241 cycles
| | | iadst_idct:  864 -> 215 cycles
| | | iadst_iadst: 973 -> 310 cycles
| | |
* | | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2014-01-25)
|\ \ \
| | |/
| |/|
| | | * qatar/master:
| | |   dxtory: compressed RGB555/RGB565 decoding support
| | |
| | | Conflicts:
| | |     libavcodec/dxtory.c
| | |
| | | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| | |
| * | dxtory: compressed RGB555/RGB565 decoding support (Kostya Shishkov, 2014-01-24)
| | |
* | | Merge commit '0e1ad2f591b87e944550c15b54e54f8189743289' (Michael Niedermayer, 2014-01-25)
|\| |
| | | * commit '0e1ad2f591b87e944550c15b54e54f8189743289':
| | |   dxtory: add more compressed and uncompressed modes
| | |
| | | Conflicts:
| | |     libavcodec/dxtory.c
| | |
| | | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| | |
| * | dxtory: add more compressed and uncompressed modes (Kostya Shishkov, 2014-01-24)
| | |
| * | vp9: fix bugs in updating coef probabilities with parallelmode=1 (Guillaume Martres, 2014-01-24)
| | |
| | | - The memcpy was completely wrong because s->prob_ctx[s->framectxid].coef is
| | |   a [4][2][2][6][6][3] array, whereas s->prob.coef is
| | |   a [4][2][2][6][6][11] array.
| | | - The additional check was committed to ffmpeg by Ronald S. Bultje.
| | |
| | | Signed-off-by: Anton Khirnov <anton@khirnov.net>
| | |
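The shape mismatch described here means a single flat `memcpy` between the two tables would interleave entries incorrectly, because the innermost rows have different strides (3 versus 11 bytes). A minimal sketch of a per-entry copy follows; the struct and function names are hypothetical, and the copy direction (wide table into narrow table) is an assumption made for the illustration rather than something the commit message states.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the two tables named in the commit message. */
typedef struct {
    uint8_t coef[4][2][2][6][6][3];   /* narrow table: 3 values per entry */
} SavedProbs;

typedef struct {
    uint8_t coef[4][2][2][6][6][11];  /* wide table: 11 values per entry */
} ActiveProbs;

/* Copy the first 3 values of every entry from the wide table into the
 * narrow one. Each innermost row is copied on its own, so the differing
 * row lengths never get mixed up. */
static void save_coef_probs(SavedProbs *dst, const ActiveProbs *src)
{
    assert(dst && src);
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 2; j++)
            for (int k = 0; k < 2; k++)
                for (int l = 0; l < 6; l++)
                    for (int m = 0; m < 6; m++)
                        memcpy(dst->coef[i][j][k][l][m],
                               src->coef[i][j][k][l][m],
                               sizeof(dst->coef[i][j][k][l][m]));
}
```

Sizing each copy from the destination row (`sizeof` yields 3 here) keeps the loop correct even if the narrow table's innermost dimension changes.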
| * | vp9: fix mvref finding to adhere to bug in libvpx. (Ronald S. Bultje, 2014-01-24)
| | |
| | | Fixes a particular youtube video that I unfortunately can't share.
| | |
| | | Signed-off-by: Anton Khirnov <anton@khirnov.net>
| | |
* | | avcodec/dvbsubdec: Remove unused display_list_size (Michael Niedermayer, 2014-01-25)
| | |
| | | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| | |
* | | Fixed a memory leak in dvbsubenc.c: sub->num_rects was reduced without freeing the associated rects. (Wim Vander Schelden, 2014-01-25)
| |/
|/|
| | Signed-off-by: Wim Vander Schelden <lists@fixnum.org>
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
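The leak pattern named in this commit (lowering a count while the entries beyond it stay allocated) can be sketched generically. The types and the helper below are hypothetical and much simpler than the real subtitle structures; they only show that entries must be released before the count is reduced.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for a subtitle and its rectangles. */
typedef struct {
    unsigned char *data;
} Rect;

typedef struct {
    Rect   **rects;
    unsigned num_rects;
} Sub;

/* Reduce the number of rectangles to new_count, freeing every entry that
 * falls off the end first. Lowering num_rects alone would leave those
 * entries allocated but unreachable by any later cleanup loop. */
static void shrink_rects(Sub *sub, unsigned new_count)
{
    assert(sub && new_count <= sub->num_rects);

    for (unsigned i = new_count; i < sub->num_rects; i++) {
        free(sub->rects[i]->data);
        free(sub->rects[i]);
        sub->rects[i] = NULL;
    }
    sub->num_rects = new_count;
}
```

Clearing the freed slots to `NULL` is a defensive choice in this sketch so that stale pointers cannot be reused by accident.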
* | avcodec/mpeg12dec: fix mis-indented line (Michael Niedermayer, 2014-01-24)
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | avcodec/mpeg12dec: Disable the checked bitstream reader (Michael Niedermayer, 2014-01-24)
| |
| | Mpeg1/2 should not need it
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | avcodec/mpeg12dec: Check for overread in mpeg_decode_slice() (Michael Niedermayer, 2014-01-24)
| |
| | This is needed in case the checked bitstream reader is disabled
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | avcodec/mpeg12dec: check block index in mpeg2_fast_decode_block_non_intra() (Michael Niedermayer, 2014-01-24)
| |
| | Prevents some overreads at the cost of 1 cpu cycle
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | avcodec/mpeg12dec: Optimize mpeg1_decode_block_intra() (Michael Niedermayer, 2014-01-24)
| |
| | sandybridge i7 274->260 cycles
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | avcodec/mpeg12dec: check for overread in mpeg1_fast_decode_block_inter() (Michael Niedermayer, 2014-01-24)
| |
| | No speed loss measured
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | avcodec/mpeg12dec: Make mpeg2_fast_decode_block_intra() more robust by breaking out on invalid vlcs (Michael Niedermayer, 2014-01-24)
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |
* | avcodec/vc1: fix type of tmp (Michael Niedermayer, 2014-01-24)
| |
| | Fixes CID1163850
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
| |