libav.git - [no description]

	Commit message (Collapse)	Author	Age
...
* \|	avcodecc/ccaption_dec: remove extra word from long codec description	Paul B Mahol	2017-01-25
\| \| \| \| \| \| \| \|	Signed-off-by: Paul B Mahol <onemda@gmail.com>
* \|	avformat: add Scenarist Closed Captions demuxer	Paul B Mahol	2017-01-25
\| \| \| \| \| \| \| \| \| \| \| \|	Fixes #4767. Signed-off-by: Paul B Mahol <onemda@gmail.com>
* \|	avformat: add Sample Dump eXchange demuxer	Paul B Mahol	2017-01-25
\| \| \| \| \| \| \| \|	Signed-off-by: Paul B Mahol <onemda@gmail.com>
* \|	lavf/mov: Unscramble dref debug output.	Carl Eugen Hoyos	2017-01-25
\| \|
* \|	isom: map xalg and avlg to h264, fixes ticket #6099	compn	2017-01-24
\| \|
* \|	avcodec/utils: correct align value for interplay	Michael Niedermayer	2017-01-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes out of array access Fixes: 452/fuzz-1-ffmpeg_VIDEO_AV_CODEC_ID_INTERPLAY_VIDEO_fuzzer Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* \|	Cosmetics: Reindent after last commit.	Carl Eugen Hoyos	2017-01-25
\| \|
* \|	lavd/v4l2: Avoid setting frame_size to a negative value.	Carl Eugen Hoyos	2017-01-25
\| \|
* \|	lavf/rtmpproto: Make bytes_read variables 64bit.	Carl Eugen Hoyos	2017-01-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When bytes_read overflowed, last_bytes_read did not yet overflow and no bytes-read report was created leading to a timeout. Analyzed-by: Thomas Bernhard Fixes ticket #5836.
* \|	avfilter/formats: do not allow unknown layouts in ff_parse_channel_layout if ↵	Marton Balint	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	nret is not set Current code returned the number of channels as channel layout in that case, and if nret is not set then unknown layouts are typically not supported. Also use the common parsing code. Use a temporary workaround to parse an unknown channel layout such as '13c', after a 1 year grace period only '13C' will work. Signed-off-by: Marton Balint <cus@passwd.hu>
* \|	avutil/channel_layout: add av_get_extended_channel_layout	Marton Balint	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Return a channel layout and the number of channels based on the specified name. This function is similar to av_get_channel_layout(), but can also parse unknown channel layout specifications. Unknown channel layout specifications are a decimal number and a capital 'C' suffix, in order to not break compatibility with the lowercase 'c' suffix, which is used for a guessed channel layout with the specified number of channels. Signed-off-by: Marton Balint <cus@passwd.hu>
* \|	avutil/channel_layout: fix remains of old syntax in docs and comments	Marton Balint	2017-01-24
\| \| \| \| \| \| \| \|	Signed-off-by: Marton Balint <cus@passwd.hu>
* \|	lavc/svq3: Fail for media key encryption.	Carl Eugen Hoyos	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \|	Tested-by: ami_stuff Fixes a part of ticket #6094.
* \|	avcodec/vp56: Check for the bitstream end, pass error codes on	Michael Niedermayer	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes timeout Fixes: 446/fuzz-3-ffmpeg_VIDEO_AV_CODEC_ID_VP6_fuzzer Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* \|	aarch64: Add NEON optimizations for 10 and 12 bit vp9 loop filter	Martin Storsjö	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This work is sponsored by, and copyright, Google. This is similar to the arm version, but due to the larger registers on aarch64, we can do 8 pixels at a time for all filter sizes. Examples of runtimes vs the 32 bit version, on a Cortex A53: ARM AArch64 vp9_loop_filter_h_4_8_10bpp_neon: 213.2 172.6 vp9_loop_filter_h_8_8_10bpp_neon: 281.2 244.2 vp9_loop_filter_h_16_8_10bpp_neon: 657.0 444.5 vp9_loop_filter_h_16_16_10bpp_neon: 1280.4 877.7 vp9_loop_filter_mix2_h_44_16_10bpp_neon: 397.7 358.0 vp9_loop_filter_mix2_h_48_16_10bpp_neon: 465.7 429.0 vp9_loop_filter_mix2_h_84_16_10bpp_neon: 465.7 428.0 vp9_loop_filter_mix2_h_88_16_10bpp_neon: 533.7 499.0 vp9_loop_filter_mix2_v_44_16_10bpp_neon: 271.5 244.0 vp9_loop_filter_mix2_v_48_16_10bpp_neon: 330.0 305.0 vp9_loop_filter_mix2_v_84_16_10bpp_neon: 329.0 306.0 vp9_loop_filter_mix2_v_88_16_10bpp_neon: 386.0 365.0 vp9_loop_filter_v_4_8_10bpp_neon: 150.0 115.2 vp9_loop_filter_v_8_8_10bpp_neon: 209.0 175.5 vp9_loop_filter_v_16_8_10bpp_neon: 492.7 345.2 vp9_loop_filter_v_16_16_10bpp_neon: 951.0 682.7 This is significantly faster than the ARM version in almost all cases except for the mix2 functions. Based on START_TIMER/STOP_TIMER wrapping around a few individual functions, the speedup vs C code is around 2-3x. Signed-off-by: Martin Storsjö <martin@martin.st>
* \|	aarch64: Add NEON optimizations for 10 and 12 bit vp9 itxfm	Martin Storsjö	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This work is sponsored by, and copyright, Google. Compared to the arm version, on aarch64 we can keep the full 8x8 transform in registers, and for 16x16 and 32x32, we can process it in slices of 4 pixels instead of 2. Examples of runtimes vs the 32 bit version, on a Cortex A53: ARM AArch64 vp9_inv_adst_adst_4x4_sub4_add_10_neon: 111.0 109.7 vp9_inv_adst_adst_8x8_sub8_add_10_neon: 914.0 733.5 vp9_inv_adst_adst_16x16_sub16_add_10_neon: 5184.0 3745.7 vp9_inv_dct_dct_4x4_sub1_add_10_neon: 65.0 65.7 vp9_inv_dct_dct_4x4_sub4_add_10_neon: 100.0 96.7 vp9_inv_dct_dct_8x8_sub1_add_10_neon: 111.0 119.7 vp9_inv_dct_dct_8x8_sub8_add_10_neon: 618.0 494.7 vp9_inv_dct_dct_16x16_sub1_add_10_neon: 295.1 284.6 vp9_inv_dct_dct_16x16_sub2_add_10_neon: 2303.2 1883.9 vp9_inv_dct_dct_16x16_sub8_add_10_neon: 2984.8 2189.3 vp9_inv_dct_dct_16x16_sub16_add_10_neon: 3890.0 2799.4 vp9_inv_dct_dct_32x32_sub1_add_10_neon: 1044.4 1012.7 vp9_inv_dct_dct_32x32_sub2_add_10_neon: 13333.7 9695.1 vp9_inv_dct_dct_32x32_sub16_add_10_neon: 18531.3 12459.8 vp9_inv_dct_dct_32x32_sub32_add_10_neon: 24470.7 16160.2 vp9_inv_wht_wht_4x4_sub4_add_10_neon: 83.0 79.7 The larger transforms are significantly faster than the corresponding ARM versions. The speedup vs C code is smaller than in 32 bit mode, probably because the 64 bit intermediates in the C code can be expressed more efficiently in aarch64. Signed-off-by: Martin Storsjö <martin@martin.st>
* \|	aarch64: Add NEON optimizations for 10 and 12 bit vp9 MC	Martin Storsjö	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This work is sponsored by, and copyright, Google. This has mostly got the same differences to the 8 bit version as in the arm version. For the horizontal filters, we do 16 pixels in parallel as well. For the 8 pixel wide vertical filters, we can accumulate 4 rows before storing, just as in the 8 bit version. Examples of runtimes vs the 32 bit version, on a Cortex A53: ARM AArch64 vp9_avg4_10bpp_neon: 35.7 30.7 vp9_avg8_10bpp_neon: 93.5 84.7 vp9_avg16_10bpp_neon: 324.4 296.6 vp9_avg32_10bpp_neon: 1236.5 1148.2 vp9_avg64_10bpp_neon: 4639.6 4571.1 vp9_avg_8tap_smooth_4h_10bpp_neon: 130.0 128.0 vp9_avg_8tap_smooth_4hv_10bpp_neon: 440.0 440.5 vp9_avg_8tap_smooth_4v_10bpp_neon: 114.0 105.5 vp9_avg_8tap_smooth_8h_10bpp_neon: 327.0 314.0 vp9_avg_8tap_smooth_8hv_10bpp_neon: 918.7 865.4 vp9_avg_8tap_smooth_8v_10bpp_neon: 330.0 300.2 vp9_avg_8tap_smooth_16h_10bpp_neon: 1187.5 1155.5 vp9_avg_8tap_smooth_16hv_10bpp_neon: 2663.1 2591.0 vp9_avg_8tap_smooth_16v_10bpp_neon: 1107.4 1078.3 vp9_avg_8tap_smooth_64h_10bpp_neon: 17754.6 17454.7 vp9_avg_8tap_smooth_64hv_10bpp_neon: 33285.2 33001.5 vp9_avg_8tap_smooth_64v_10bpp_neon: 16066.9 16048.6 vp9_put4_10bpp_neon: 25.5 21.7 vp9_put8_10bpp_neon: 56.0 52.0 vp9_put16_10bpp_neon/armv8: 183.0 163.1 vp9_put32_10bpp_neon/armv8: 678.6 563.1 vp9_put64_10bpp_neon/armv8: 2679.9 2195.8 vp9_put_8tap_smooth_4h_10bpp_neon: 120.0 118.0 vp9_put_8tap_smooth_4hv_10bpp_neon: 435.2 435.0 vp9_put_8tap_smooth_4v_10bpp_neon: 107.0 98.2 vp9_put_8tap_smooth_8h_10bpp_neon: 303.0 290.0 vp9_put_8tap_smooth_8hv_10bpp_neon: 893.7 828.7 vp9_put_8tap_smooth_8v_10bpp_neon: 305.5 263.5 vp9_put_8tap_smooth_16h_10bpp_neon: 1089.1 1059.2 vp9_put_8tap_smooth_16hv_10bpp_neon: 2578.8 2452.4 vp9_put_8tap_smooth_16v_10bpp_neon: 1009.5 933.5 vp9_put_8tap_smooth_64h_10bpp_neon: 16223.4 15918.6 vp9_put_8tap_smooth_64hv_10bpp_neon: 32153.0 31016.2 vp9_put_8tap_smooth_64v_10bpp_neon: 14516.5 13748.1 These are generally about as fast as the corresponding ARM routines on the same CPU (at least on the A53), in most cases marginally faster. The speedup vs C code is around 4-9x. Signed-off-by: Martin Storsjö <martin@martin.st>
* \|	aarch64: vp9dsp: Restructure the bpp checks	Martin Storsjö	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This work is sponsored by, and copyright, Google. This is more in line with how it will be extended for more bitdepths. Signed-off-by: Martin Storsjö <martin@martin.st>
* \|	arm: Add NEON optimizations for 10 and 12 bit vp9 loop filter	Martin Storsjö	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This work is sponsored by, and copyright, Google. This is pretty much similar to the 8 bpp version, but in some senses simpler. All input pixels are 16 bits, and all intermediates also fit in 16 bits, so there's no lengthening/narrowing in the filter at all. For the full 16 pixel wide filter, we can only process 4 pixels at a time (using an implementation very much similar to the one for 8 bpp), but we can do 8 pixels at a time for the 4 and 8 pixel wide filters with a different implementation of the core filter. Examples of relative speedup compared to the C version, from checkasm: Cortex A7 A8 A9 A53 vp9_loop_filter_h_4_8_10bpp_neon: 1.83 2.16 1.40 2.09 vp9_loop_filter_h_8_8_10bpp_neon: 1.39 1.67 1.24 1.70 vp9_loop_filter_h_16_8_10bpp_neon: 1.56 1.47 1.10 1.81 vp9_loop_filter_h_16_16_10bpp_neon: 1.94 1.69 1.33 2.24 vp9_loop_filter_mix2_h_44_16_10bpp_neon: 2.01 2.27 1.67 2.39 vp9_loop_filter_mix2_h_48_16_10bpp_neon: 1.84 2.06 1.45 2.19 vp9_loop_filter_mix2_h_84_16_10bpp_neon: 1.89 2.20 1.47 2.29 vp9_loop_filter_mix2_h_88_16_10bpp_neon: 1.69 2.12 1.47 2.08 vp9_loop_filter_mix2_v_44_16_10bpp_neon: 3.16 3.98 2.50 4.05 vp9_loop_filter_mix2_v_48_16_10bpp_neon: 2.84 3.64 2.25 3.77 vp9_loop_filter_mix2_v_84_16_10bpp_neon: 2.65 3.45 2.16 3.54 vp9_loop_filter_mix2_v_88_16_10bpp_neon: 2.55 3.30 2.16 3.55 vp9_loop_filter_v_4_8_10bpp_neon: 2.85 3.97 2.24 3.68 vp9_loop_filter_v_8_8_10bpp_neon: 2.27 3.19 1.96 3.08 vp9_loop_filter_v_16_8_10bpp_neon: 3.42 2.74 2.26 4.40 vp9_loop_filter_v_16_16_10bpp_neon: 2.86 2.44 1.93 3.88 The speedup vs C code measured in checkasm is around 1.1-4x. These numbers are quite inconclusive though, since the checkasm test runs multiple filterings on top of each other, so later rounds might end up with different codepaths (different decisions on which filter to apply, based on input pixel differences). Based on START_TIMER/STOP_TIMER wrapping around a few individual functions, the speedup vs C code is around 2-4x. Signed-off-by: Martin Storsjö <martin@martin.st>
* \|	arm: Add NEON optimizations for 10 and 12 bit vp9 itxfm	Martin Storsjö	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This work is sponsored by, and copyright, Google. This is structured similarly to the 8 bit version. In the 8 bit version, the coefficients are 16 bits, and intermediates are 32 bits. Here, the coefficients are 32 bit. For the 4x4 transforms for 10 bit content, the intermediates also fit in 32 bits, but for all other transforms (4x4 for 12 bit content, and 8x8 and larger for both 10 and 12 bit) the intermediates are 64 bit. For the existing 8 bit case, the 8x8 transform fit all coefficients in registers; for 10/12 bit, when the coefficients are 32 bit, the 8x8 transform also has to be done in slices of 4 pixels (just as 16x16 and 32x32 for 8 bit). The slice width also shrinks from 4 elements to 2 elements in parallel for the 16x16 and 32x32 cases. The 16 bit coefficients from idct_coeffs and similar tables also need to be lenghtened to 32 bit in order to be used in multiplication with vectors with 32 bit elements. This leads to the fixed coefficient vectors needing more space, leading to more cases where they have to be reloaded within the transform (in iadst16). This technically would need testing in checkasm for subpartitions in increments of 2, but that slows down normal checkasm runs excessively. Examples of relative speedup compared to the C version, from checkasm: Cortex A7 A8 A9 A53 vp9_inv_adst_adst_4x4_sub4_add_10_neon: 4.83 11.36 5.22 6.77 vp9_inv_adst_adst_8x8_sub8_add_10_neon: 4.12 7.60 4.06 4.84 vp9_inv_adst_adst_16x16_sub16_add_10_neon: 3.93 8.16 4.52 5.35 vp9_inv_dct_dct_4x4_sub1_add_10_neon: 1.36 2.57 1.41 1.61 vp9_inv_dct_dct_4x4_sub4_add_10_neon: 4.24 8.66 5.06 5.81 vp9_inv_dct_dct_8x8_sub1_add_10_neon: 2.63 4.18 1.68 2.87 vp9_inv_dct_dct_8x8_sub4_add_10_neon: 4.52 9.47 4.24 5.39 vp9_inv_dct_dct_8x8_sub8_add_10_neon: 3.45 7.34 3.45 4.30 vp9_inv_dct_dct_16x16_sub1_add_10_neon: 3.56 6.21 2.47 4.32 vp9_inv_dct_dct_16x16_sub2_add_10_neon: 5.68 12.73 5.28 7.07 vp9_inv_dct_dct_16x16_sub8_add_10_neon: 4.42 9.28 4.24 5.45 vp9_inv_dct_dct_16x16_sub16_add_10_neon: 3.41 7.29 3.35 4.19 vp9_inv_dct_dct_32x32_sub1_add_10_neon: 4.52 8.35 3.83 6.40 vp9_inv_dct_dct_32x32_sub2_add_10_neon: 5.86 13.19 6.14 7.04 vp9_inv_dct_dct_32x32_sub16_add_10_neon: 4.29 8.11 4.59 5.06 vp9_inv_dct_dct_32x32_sub32_add_10_neon: 3.31 5.70 3.56 3.84 vp9_inv_wht_wht_4x4_sub4_add_10_neon: 1.89 2.80 1.82 1.97 The speedup compared to the C functions is around 1.3 to 7x for the full transforms, even higher for the smaller subpartitions. Signed-off-by: Martin Storsjö <martin@martin.st>
* \|	arm: Add NEON optimizations for 10 and 12 bit vp9 MC	Martin Storsjö	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This work is sponsored by, and copyright, Google. The plain pixel put/copy functions are used from the 8 bit version, for the double size (e.g. put16 uses ff_vp9_copy32_neon), and a new copy128 is added. Compared with the 8 bit version, the filters can no longer use the trick to accumulate in 16 bit with only saturation at the end, but now the accumulators need to be 32 bit. This avoids the need to keep track of which filter index is the largest though, reducing the size of the executable code for these filters. For the horizontal filters, we only do 4 or 8 pixels wide in parallel (while doing two rows at a time), since we don't have enough register space to filter 16 pixels wide. For the vertical filters, we still do 4 and 8 pixels in parallel just as in the 8 bit case, but we need to store the output after every 2 rows instead of after every 4 rows. Examples of relative speedup compared to the C version, from checkasm: Cortex A7 A8 A9 A53 vp9_avg4_10bpp_neon: 2.25 2.44 3.05 2.16 vp9_avg8_10bpp_neon: 3.66 8.48 3.86 3.50 vp9_avg16_10bpp_neon: 3.39 8.26 3.37 2.72 vp9_avg32_10bpp_neon: 4.03 10.20 4.07 3.42 vp9_avg64_10bpp_neon: 4.15 10.01 4.13 3.70 vp9_avg_8tap_smooth_4h_10bpp_neon: 3.38 6.22 3.41 4.75 vp9_avg_8tap_smooth_4hv_10bpp_neon: 3.89 6.39 4.30 5.32 vp9_avg_8tap_smooth_4v_10bpp_neon: 5.32 9.73 6.34 7.31 vp9_avg_8tap_smooth_8h_10bpp_neon: 4.45 9.40 4.68 6.87 vp9_avg_8tap_smooth_8hv_10bpp_neon: 4.64 8.91 5.44 6.47 vp9_avg_8tap_smooth_8v_10bpp_neon: 6.44 13.42 8.68 8.79 vp9_avg_8tap_smooth_64h_10bpp_neon: 4.66 9.02 4.84 7.71 vp9_avg_8tap_smooth_64hv_10bpp_neon: 4.61 9.14 4.92 7.10 vp9_avg_8tap_smooth_64v_10bpp_neon: 6.90 14.13 9.57 10.41 vp9_put4_10bpp_neon: 1.33 1.46 2.09 1.33 vp9_put8_10bpp_neon: 1.57 3.42 1.83 1.84 vp9_put16_10bpp_neon: 1.55 4.78 2.17 1.89 vp9_put32_10bpp_neon: 2.06 5.35 2.14 2.30 vp9_put64_10bpp_neon: 3.00 2.41 1.95 1.66 vp9_put_8tap_smooth_4h_10bpp_neon: 3.19 5.81 3.31 4.63 vp9_put_8tap_smooth_4hv_10bpp_neon: 3.86 6.22 4.32 5.21 vp9_put_8tap_smooth_4v_10bpp_neon: 5.40 9.77 6.08 7.21 vp9_put_8tap_smooth_8h_10bpp_neon: 4.22 8.41 4.46 6.63 vp9_put_8tap_smooth_8hv_10bpp_neon: 4.56 8.51 5.39 6.25 vp9_put_8tap_smooth_8v_10bpp_neon: 6.60 12.43 8.17 8.89 vp9_put_8tap_smooth_64h_10bpp_neon: 4.41 8.59 4.54 7.49 vp9_put_8tap_smooth_64hv_10bpp_neon: 4.43 8.58 5.34 6.63 vp9_put_8tap_smooth_64v_10bpp_neon: 7.26 13.92 9.27 10.92 For the larger 8tap filters, the speedup vs C code is around 4-14x. Signed-off-by: Martin Storsjö <martin@martin.st>
* \|	arm: vp9dsp: Restructure the bpp checks	Martin Storsjö	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This work is sponsored by, and copyright, Google. This is more in line with how it will be extended for more bitdepths. Signed-off-by: Martin Storsjö <martin@martin.st>
* \|	Merge commit 'fd5e6a095f69495c558069315d6b36ea410c31fa'	Clément Bœsch	2017-01-24
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* commit 'fd5e6a095f69495c558069315d6b36ea410c31fa': x86util: Extend SPLATW for avx2 This commit is a noop, see 1ace9573dce509e2b25165199c3b658667860ecf (only libavutil/x86/x86util.asm chunk). Merged-by: Clément Bœsch <u@pkh.me>
\| *	x86util: Extend SPLATW for avx2	James Almer	2016-07-18
\| \| \| \| \| \| \| \| \| \| \| \|	Integration to Libav by Josh de Kock <josh@itanimul.li>. Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
* \|	Merge commit '37961044c6'	Clément Bœsch	2017-01-24
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* commit '37961044c6': checkasm: arm: Ignore changes to bits 0-4 and 7 of FPSCR cheackasm/arm: remove NEON instructions from checkasm_checked_call_vfp checkasm: arm: Don't start new const blocks for each string This merge is a noop: the changes were included in 9f1c81e5ec. Merged-by: Clément Bœsch <u@pkh.me>
\| *	checkasm: arm: Ignore changes to bits 0-4 and 7 of FPSCR	Martin Storsjö	2016-07-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These bits are set by exceptions in NEON instructions. Also print the differing bits when FPSCR is clobbered, and use bic instead of lsl, for clearing the topmost bits. Signed-off-by: Martin Storsjö <martin@martin.st>
\| *	cheackasm/arm: remove NEON instructions from checkasm_checked_call_vfp	Janne Grunau	2016-07-17
\| \| \| \| \| \| \| \| \| \| \| \|	Fixes AS error on non NEON builds introduced in 71a04721145. Also set the fpu directly to vfp in checkasm.S to cause build errors on NEON builds.
\| *	checkasm: arm: Don't start new const blocks for each string	Martin Storsjö	2016-07-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Each const block needs to be terminated by one endconst invocation so either call endconst after each, or just declare plain labels to the later strings. This fixes errors such as this, on some binutils versions: checkasm.S:38: Error: Macro `endconst' was already defined Signed-off-by: Martin Storsjö <martin@martin.st>
* \|	Merge commit '5ece6911010b3464d2fdacfa8031c15b5bd83418'	Clément Bœsch	2017-01-24
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* commit '5ece6911010b3464d2fdacfa8031c15b5bd83418': apichanges: Fill in missing hashes and dates This commit is a noop as we need to fill with our own hashes. Merged-by: Clément Bœsch <u@pkh.me>
\| *	apichanges: Fill in missing hashes and dates	Diego Biurrun	2016-07-16
\| \|
* \|	Merge commit 'facdfe40805559963b5875931af9406ed5ddcd5c'	Clément Bœsch	2017-01-24
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* commit 'facdfe40805559963b5875931af9406ed5ddcd5c': swscale: Add proper ff_ prefix to init functions This commit is a noop, see e8c37160640952ab036e643156add9638c062536 I'm keeping our ff_sws_ vs ff_ since we use ff_sws_ in other places in swscale. Merged-by: Clément Bœsch <u@pkh.me>
\| *	swscale: Add proper ff_ prefix to init functions	Diego Biurrun	2016-07-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	They are internal symbols that should not be exported. based on a patch by Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com> Signed-off-by: Diego Biurrun <diego@biurrun.de>
* \|	Merge commit 'c0fd2fb27bebd1d5ab028e6df6bca9119d269122'	Clément Bœsch	2017-01-24
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* commit 'c0fd2fb27bebd1d5ab028e6df6bca9119d269122': swscale: Rename sws_context_class to ff_sws_context_class This commit is a noop, see 8bfbc8c5e504ef3ae914499646d450987b419385 Merged-by: Clément Bœsch <u@pkh.me>
\| *	swscale: Rename sws_context_class to ff_sws_context_class	Andreas Cadhalpun	2016-07-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is an internal swscale symbol and thus should not be exported. Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com> Signed-off-by: Diego Biurrun <diego@biurrun.de>
* \|	Merge commit '71a0472114574993df7035f4de9aa007e03817b8'	Clément Bœsch	2017-01-24
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* commit '71a0472114574993df7035f4de9aa007e03817b8': checkasm: arm: report the first clobbered register in checkasm_checked_call Also includes 446353ea18, 59aeed93e4, and 37961044c6 to avoid breaking too much stuff. Merged-by: Clément Bœsch <u@pkh.me>
\| *	checkasm: arm: report the first clobbered register in checkasm_checked_call	Janne Grunau	2016-07-16
\| \|
* \|	avcodec/mjpegdec: Check remaining bitstream in ljpeg_decode_yuv_scan()	Michael Niedermayer	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes timeout Fixes: 445/fuzz-3-ffmpeg_VIDEO_AV_CODEC_ID_MJPEG_fuzzer Fixes: 456/fuzz-2-ffmpeg_VIDEO_AV_CODEC_ID_JPEGLS_fuzzer Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* \|	Merge commit 'a8fce24b9c5a87187f5bd864b18f5b3e575f8c3d'	Clément Bœsch	2017-01-24
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* commit 'a8fce24b9c5a87187f5bd864b18f5b3e575f8c3d': avconv_dxva2: support HEVC Main10 decoding This commit is a noop, see 1ec14612a5bd3506ffd755f8d42db776bfa40ae8 Merged-by: Clément Bœsch <u@pkh.me>
\| *	avconv_dxva2: support HEVC Main10 decoding	Hendrik Leppkes	2016-07-15
\| \| \| \| \| \| \| \|	Signed-off-by: Anton Khirnov <anton@khirnov.net>
* \|	Merge commit '33f6690eb4e21acc4b581688eecfc4cc5ea9515e'	Clément Bœsch	2017-01-24
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* commit '33f6690eb4e21acc4b581688eecfc4cc5ea9515e': hevc: offer DXVA2 for 10bit 420 This commit is a noop, see ccb94789e2968329947f1c2e00d019f387f9c409 Merged-by: Clément Bœsch <u@pkh.me>
\| *	hevc: offer DXVA2 for 10bit 420	Anton Khirnov	2016-07-15
\| \|
* \|	Merge commit '38efff92f1ef81f3de20ff0460ec7b70c253d714'	Clément Bœsch	2017-01-24
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* commit '38efff92f1ef81f3de20ff0460ec7b70c253d714': FATE: add a test for H.264 with two fields per packet h264: fix decoding multiple fields per packet with slice threads This merge includes two commits because the FATE test was useful in order to make proper testing. The merge gets rid of the now unused: - SLICE_SINGLETHREAD and SLICE_SKIPED macros - max_contexts - "again" label in decode_nal_units() This commit also includes the fix from d3e4d406b. Thanks to wm4 and Michael Niedermayer for their testing. Merged-by: Clément Bœsch <u@pkh.me> Merged-by: Matthieu Bouron <matthieu.bouron@gmail.com>
\| *	FATE: add a test for H.264 with two fields per packet	Anton Khirnov	2016-07-15
\| \|
\| *	h264: fix decoding multiple fields per packet with slice threads	Anton Khirnov	2016-07-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since we only know whether a NAL unit corresponds to a new field after parsing the slice header, this requires reorganizing the calls to slice parsing, per-slice/field/frame init and actual decoding. In the previous code, the function for slice header decoding also immediately started a new field/frame as necessary, so any slices already queued for decoding would no longer be decodable. After this patch, we first parse the slice header, and if we determine that a new field needs to be started we decode all the queued slices.
* \|	avformat/hlsenc: improve to write m3u8 head block	Steven Liu	2017-01-24
\| \| \| \| \| \| \| \|	Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
* \|	avcodec/h264dec: Fix regression with "make fate-h264-attachment-631 THREADS=8"	Michael Niedermayer	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This treats the case of no slices like no frames which it basically is. The field is added to the context as other nal related fields are also there and passing the has_slices field per *arguments is ugly and not consistent Found-by: ubitux Approved-by: ubitux Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* \|	avfilter: add EIA-608 line extractor	Paul B Mahol	2017-01-24
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Dave Rice <dave@dericed.com> Signed-off-by: Paul B Mahol <onemda@gmail.com>
* \|	avformat/flvenc: refine the flvenc shift_data code	Steven Liu	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \|	refine the flvenc shift_data move data option Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
* \|	avformat/hlsenc: refine the code readable for time unit	Steven Liu	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewed-by: Bodecs Bela <bodecsb@vivanet.hu> Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
* \|	libavformat/tee: tee was passing a wrong option name for fifo's format_options	Felipe Astroza	2017-01-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If fifo is enabled on tee muxer, ffmpeg exits because of an unknown option passed to fifo muxer. Option name "format_options" was replaced by "format_opts" on tee muxer. Signed-off-by: Felipe Astroza <felipe@astroza.cl> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>