libav.git - [no description]

	Commit message	Author	Age
*	swscale: cosmetic fixes	Nelson Gomez	2020-06-14
	Signed-off-by: Nelson Gomez <nelson.gomez@microsoft.com>
*	swscale/x86/output: add AVX2 version of yuv2nv12cX	Nelson Gomez	2020-06-14
	256 bits is just wide enough to fit all the operands needed to vectorize the software implementation, but AVX2 is needed for a couple of instructions like cross-lane permutation.
	Output is bit-for-bit identical to C.
	Signed-off-by: Nelson Gomez <nelson.gomez@microsoft.com>
*	swscale: make yuv2interleavedX more asm-friendly	Nelson Gomez	2020-06-14
	Extracting information from SwsContext in assembly is difficult, and rearranging SwsContext just for asm access didn't look good. These functions only need a couple of fields from it anyway, so just make them parameters in their own right.
	Signed-off-by: Nelson Gomez <nelson.gomez@microsoft.com>
*	swscale/utils: return better error code from initFilter()	Limin Wang	2020-06-14
	Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
	Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
*	swscale/utils: reindent	Limin Wang	2020-06-14
	Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
	Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
*	swscale/utils: remove FF_ALLOC_ARRAY_OR_GOTO macros	Limin Wang	2020-06-13
	Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
*	swscale: Add swscale input/output support for X2RGB10LE	Fei Wang	2020-06-12
	Signed-off-by: Fei Wang <fei.w.wang@intel.com>
*	Bump minor versions after branching 4.3	Michael Niedermayer	2020-06-08
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	Bump minor versions to separate 4.3 from master	Michael Niedermayer	2020-06-08
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale: aarch64: Add a NEON implementation of interleaveBytes	Martin Storsjö	2020-05-15
	This allows speeding up format conversions from yuv420 to nv12.
	                                Cortex A53      A72      A73
	interleave_bytes_c:                86077.5  51433.0  66972.0
	interleave_bytes_neon:             19701.7  23019.2  15859.2
	interleave_bytes_aligned_c:        86603.0  52017.2  67484.2
	interleave_bytes_aligned_neon:      9061.0   7623.0   6309.0
	Signed-off-by: Martin Storsjö <martin@martin.st>
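For orientation, the operation that the benchmarked functions perform can be sketched in plain C. This is a minimal sketch, not the libswscale source: the name `interleave_bytes_sketch` and its single-row signature are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: merge two byte planes (for example U and V of
 * yuv420) into one interleaved plane (for example the UV plane of nv12).
 * dst must have room for 2 * width bytes. */
static void interleave_bytes_sketch(const uint8_t *src1, const uint8_t *src2,
                                    uint8_t *dst, size_t width)
{
    assert(src1 && src2 && dst);
    for (size_t i = 0; i < width; i++) {
        dst[2 * i]     = src1[i]; /* even positions come from the first plane */
        dst[2 * i + 1] = src2[i]; /* odd positions come from the second plane */
    }
}
```

A SIMD version does the same thing many bytes at a time, which is presumably where the speedup in the table comes from.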
*	swscale: arm: fix NEON hscale init	Josh de Kock	2020-05-15
	The NEON hscale function only supports X8 filter sizes and should only be selected when these are being used. At the moment filterAlign is set to 8, but in the future, when extra NEON assembly for specific sizes is added, they will need to have checks here too.
	The immediate use case for this change is making the hscale checkasm test easier and without NEON-specific edge cases (x86 already has these guards).
	This applies the same fix from 718c8f9aa59751bb490e2688acf2b5cb68fd5ad1 on the 32 bit arm version of the function, fixing fate-checkasm-sw_scale there.
	Signed-off-by: Martin Storsjö <martin@martin.st>
*	swscale: fix NEON hscale init	Josh de Kock	2020-05-15
	The NEON hscale function only supports X8 filter sizes and should only be selected when these are being used. At the moment filterAlign is set to 8, but in the future, when extra NEON assembly for specific sizes is added, they will need to have checks here too.
	The immediate use case for this change is making the hscale checkasm test easier and without NEON-specific edge cases (x86 already has these guards).
	Signed-off-by: Josh de Kock <josh@itanimul.li>
*	libswscale: fix for floating point formats, require full chroma	Mark Reid	2020-05-12
	Upon more floating point testing, it looks like I missed adding this bit.
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	libswscale: add output support for AV_PIX_FMT_GBRAPF32	Mark Reid	2020-05-05
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	libswscale: add input support AV_PIX_FMT_GBRAPF32	Mark Reid	2020-05-05
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale/vscale: Increase type strictness	Andreas Rheinhardt	2020-04-27
	libswscale/vscale.c makes extensive use of function pointers and in doing so it converts these function pointers to and from a pointer to void. Yet this is actually against the C standard: C90 only guarantees that one can convert a pointer to any incomplete type or object type to void* and back with the result comparing equal to the original, which makes pointers to void generic pointers to incomplete or object type. Yet C90 lacks a generic function pointer type.
	C99 additionally guarantees that a pointer to a function of one type may be converted to a pointer to a function of another type with the result and the original comparing equal when converting back. This makes any function pointer type a generic function pointer type. Yet even this does not make pointers to void generic function pointers.
	Both GCC and Clang emit warnings for this when in pedantic mode.
	This commit fixes this by using a union that can hold one member of any of the required function pointer types to store the function pointer. This works even for C90.
	Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
	Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
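The union approach described in this message can be illustrated with a small standalone sketch. It is not the vscale.c code: the types `FnSlot` and `Op` and all function names are hypothetical, chosen only to show a function pointer being stored and called without passing through `void *`.

```c
#include <assert.h>

typedef int (*unary_fn)(int);
typedef int (*binary_fn)(int, int);

/* One member per function pointer type that may be stored. */
typedef union FnSlot {
    unary_fn  unary;
    binary_fn binary;
} FnSlot;

typedef struct Op {
    int    arity; /* records which union member is valid */
    FnSlot fn;
} Op;

static int negate(int a)     { return -a; }
static int add(int a, int b) { return a + b; }

static Op make_unary(unary_fn f)
{
    Op op;
    op.arity = 1;
    op.fn.unary = f;
    return op;
}

static Op make_binary(binary_fn f)
{
    Op op;
    op.arity = 2;
    op.fn.binary = f;
    return op;
}

static int run_op(const Op *op, int a, int b)
{
    assert(op->arity == 1 || op->arity == 2);
    /* Each pointer is called through the exact type it was stored as. */
    return op->arity == 1 ? op->fn.unary(a) : op->fn.binary(a, b);
}
```

Because every pointer is read back through the same member it was written to, no conversion between function pointers and object pointers takes place.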
*	swscale: aarch64: Don't clobber callee-saved registers v8-v15	Martin Storsjö	2020-04-21
	Signed-off-by: Martin Storsjö <martin@martin.st>
*	swscale: aarch64: Avoid using the x18 register	Martin Storsjö	2020-04-20
	x18 is a reserved platform register on Darwin and Windows.
	x8/w8 seems to be unused in this function though (and same about x10 and x14), so there's really no reason to use x18 here - just change the uses of x18/w18 into x8/w8 instead without any further rewrites.
	Signed-off-by: Martin Storsjö <martin@martin.st>
*	swscale/yuv2rgb: Fix vertical dither offset with slices	Michael Niedermayer	2020-04-12
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale/output: Fix integer overflow in yuv2rgb_write_full() with out of range input	Michael Niedermayer	2020-04-04
	Fixes: signed integer overflow: 1169365504 + 981452800 cannot be represented in type 'int'
	Fixes: ticket8293
	Found-by: Suhwan
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale/output: Fix integer overflow in alpha computation in yuv2gbrp16_full_X_c()	Michael Niedermayer	2020-04-04
	Fixes: signed integer overflow: 524280 * 4432 cannot be represented in type 'int'
	Fixes: ticket8322
	Found-by: Suhwan
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
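The two sanitizer reports above both concern arithmetic on `int` whose result exceeds the representable range, which is undefined behavior in C. One common remedy, shown here purely as an illustration and not as a description of what these commits changed, is to carry out the operation in `unsigned`, where wraparound is well defined. The helper names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: add two ints in unsigned arithmetic, so a result
 * above INT_MAX wraps instead of being undefined. */
static unsigned add_as_unsigned(int a, int b)
{
    return (unsigned)a + (unsigned)b;
}

/* Hypothetical helper: multiply two ints in unsigned arithmetic. */
static unsigned mul_as_unsigned(int a, int b)
{
    /* The example values assume unsigned int has at least 32 bits. */
    assert(UINT_MAX >= 0xFFFFFFFFu);
    return (unsigned)a * (unsigned)b;
}
```

With the operands quoted in the reports, both results fit in 32 unsigned bits, so later clipping code can still see the out-of-range value and clamp it.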
*	swscale/swscale: remove useless code	Ruiling Song	2020-04-03
	Signed-off-by: Ruiling Song <ruiling.song@intel.com>
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	lsws/input: Do not change transparency range.	Carl Eugen Hoyos	2020-03-11
	Fixes ticket #8509.
*	libswscale/x86/yuv2rgb: Fix Segmentation Fault when load unaligned data	Ting Fu	2020-02-26
	Fixes ticket #8532
	Signed-off-by: Ting Fu <ting.fu@intel.com>
*	swscale: Add swscale input support for Y210LE	Linjie Fu	2020-02-24
	Add swscale input support for Y210LE. Output support and a FATE test could be added later if there is a requirement for software CSC to this packed format.
	Signed-off-by: Linjie Fu <linjie.fu@intel.com>
*	libswscale/x86/yuv2rgb: add ssse3 version	Ting Fu	2020-02-10
	Tested using this command:
	/ffmpeg -pix_fmt yuv420p -s 1920x1080 -i ArashRawYuv420.yuv \
	-vcodec rawvideo -s 1920x1080 -pix_fmt rgb24 -f null /dev/null
	The fps increases from 389 to 640 on an Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz.
	Signed-off-by: Ting Fu <ting.fu@intel.com>
*	libswscale/utils.c: Fix bug #8255	Gautam Ramakrishnan	2020-02-09
	Bug #8255 points out a double free error in the libswscale/utils.c file. The double free occurs because the pointer to cascaded_context of an sw_context is not set to NULL after freeing it. When the sw_context is later freed, sws_freeContext is called on the cascaded_context, causing a double free.
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
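The failure pattern described here, and the usual remedy of clearing the pointer after freeing it, can be sketched as follows. This is a simplified model, not the libswscale code: the `Ctx` type and the `ctx_*` functions are hypothetical stand-ins for the real context and its free routine.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Ctx {
    struct Ctx *cascaded; /* optional nested context, owned by this one */
} Ctx;

static Ctx *ctx_alloc(void)
{
    return calloc(1, sizeof(Ctx));
}

/* Frees a context together with any nested context it still owns. */
static void ctx_free(Ctx *c)
{
    if (!c)
        return;
    ctx_free(c->cascaded);
    free(c);
}

/* Releases only the nested context, for example on an error path. */
static void ctx_drop_cascaded(Ctx *c)
{
    assert(c);
    ctx_free(c->cascaded);
    c->cascaded = NULL; /* without this, a later ctx_free(c) frees it again */
}
```

After `ctx_drop_cascaded`, a later `ctx_free` on the parent sees a NULL nested pointer and skips it.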
*	libswscale/x86/yuv2rgb: Change inline assembly into nasm code	Ting Fu	2020-02-05
	The original inline assembly and the NASM code have the same fps when run from the command line. The NASM code has almost no impact on performance.
	Signed-off-by: Ting Fu <ting.fu@intel.com>
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale/input: Fix several invalid shifts related to rgb2yuv constants	Michael Niedermayer	2020-01-22
	Fixes: Invalid shifts
	Fixes: #8140
	Fixes: #8146
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale/output: Fix several invalid shifts in yuv2rgb_full_1_c_template()	Michael Niedermayer	2020-01-22
	Fixes: Invalid shifts
	Fixes: #8320
	Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale/swscale: Fix several invalid shifts related to vChrDrop	Michael Niedermayer	2020-01-22
	Fixes: Invalid shifts
	Fixes: #8166
	Fixes: filter-crop_scale_vflip FATE-test
	Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	Silence "string-plus-int" warning shown by clang.	Carl Eugen Hoyos	2020-01-06
	libswscale/utils.c:89:42: warning: adding 'unsigned long' to a string does not append to the string [-Wstring-plus-int]
*	swscale/aarch64: use multiply accumulate and shift-right narrow	Sebastian Pop	2020-01-04
	This patch rewrites the innermost loop of ff_yuv2planeX_8_neon to avoid zips and horizontal adds by using fused multiply adds. The patch also uses ld1r to load one element and replicate it across all lanes of the vector. The patch also improves the clipping code by removing the shift right instructions and performing the shift with the shift-right narrow instructions.
	I see an 8% difference on an m6g instance with neoverse-n1 CPUs:
	$ ffmpeg -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null -
	before: t:0.014015 avg:0.014096 max:0.015018 min:0.013971
	after:  t:0.012985 avg:0.013013 max:0.013996 min:0.012818
	Tested with `make check` on aarch64-linux.
	Signed-off-by: Sebastian Pop <spop@amazon.com>
	Reviewed-by: Clément Bœsch <u@pkh.me>
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale/utils: remove access of AV_PIX_FMT_NB	Zhao Zhili	2019-12-31
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale/aarch64: use multiply accumulate and increase vector factor to 4	Sebastian Pop	2019-12-17
	This patch implements ff_hscale_8_to_15_neon with NEON fused multiply accumulate and bumps the vectorization factor from 2 to 4.
	The speedup is 25% on Graviton1 A1 instances based on A-72 cpus:
	$ ffmpeg -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null -
	before: t:0.040303 avg:0.040287 max:0.040371 min:0.039214
	after:  t:0.032168 avg:0.032215 max:0.033081 min:0.032146
	The speedup is 39% on Graviton2 m6g instances based on Neoverse-N1 cpus:
	$ ffmpeg -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null -
	before: t:0.019446 avg:0.019423 max:0.019493 min:0.019181
	after:  t:0.014015 avg:0.014096 max:0.015018 min:0.013971
	Tested with `make check` on aarch64-linux.
	Signed-off-by: Sebastian Pop <spop@amazon.com>
	Reviewed-by: Jean-Baptiste Kempf <jb@videolan.org>
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale/swscale_unscaled: add AV_PIX_FMT_GBRAP10 for LE and BE conversion wrapper	Limin Wang	2019-12-10
	Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	libswscale/swscale_unscaled.c: remove redundant code	Ting Fu	2019-12-06
	Signed-off-by: Ting Fu <ting.fu@intel.com>
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale/swscale_unscaled: fix gbrap10be md5 different on big endian system	Limin Wang	2019-11-01
	You can reproduce it with the command below:
	./ffmpeg -f lavfi -i "testsrc=duration=1:rate=30" -vf format=gbrap10 -vcodec rawvideo \
	-pix_fmt gbrap10le -flags +bitexact -sws_flags +accurate_rnd+bitexact -fflags +bitexact \
	-frames:v 1 -f nut md5:
	little-endian: f91e2edd8098276579c1929e5e160416
	big-endian: ba4d011dbbdc78ccbf6cc7d698630929
	Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
	Reviewed-by: Paul B Mahol <onemda@gmail.com>
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale/output: Avoid 64bit in Alpha in yuv2ya16_X_c_template()	Michael Niedermayer	2019-10-16
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale/output: Correct Alpha in yuv2ya16_X_c_template()	Michael Niedermayer	2019-10-16
	Untested, no testcase
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale/output: Implement Luma computation from yuv2ya16_X_c_template() without 64bit	Michael Niedermayer	2019-10-16
	This also reverts 21838cad2fc44023ad85e35d5c677e2f8d29a0ef
	The revert is in this commit to avoid 2 fate updates
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale: Fix AltiVec/VSX build with recent GCC	Daniel Kolesa	2019-10-04
	The argument to vec_splat_u16 must be a literal. By making the function always inline and marking the arguments const, gcc can turn those into literals, and avoid build errors like:
	swscale_vsx.c:165:53: error: argument 1 must be a 5-bit signed literal
	Fixes #7861.
	Signed-off-by: Daniel Kolesa <daniel@octaforge.org>
	Signed-off-by: Lauri Kasanen <cand@gmx.com>
*	swscale: Replace illegal vector keyword usage in altivec code	Daniel Kolesa	2019-10-04
	While this technically compiles in current ffmpeg, this is only because ffmpeg is compiled in strict ISO C mode, which disables the builtin 'vector' keyword for AltiVec/VSX. Instead this gets replaced with a macro inside altivec.h, which defines vector to be actually __vector, which accepts random types.
	Normally, the vector keyword should be used only with plain scalar non-typedef types, such as unsigned int. But we have the vec_(s|u)(8|16|32) macros, which can be used in a portable manner, in util_altivec.h in libavutil.
	This is also consistent with other AltiVec/VSX code elsewhere in the tree.
	Fixes #7861.
	Signed-off-by: Daniel Kolesa <daniel@octaforge.org>
	Signed-off-by: Lauri Kasanen <cand@gmx.com>
*	swscale/utils: Fix invalid left shifts of negative numbers	Andreas Rheinhardt	2019-09-28
	Affected the FATE-tests vsynth_lena-dv-411, vsynth1-dv-411, vsynth2-dv-411 and hevc-paramchange-yuv420p.yuv420p10.
	Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale/x86/swscale: Fix undefined left shifts of negative numbers	Andreas Rheinhardt	2019-09-28
	This affected many FATE-tests: The number of failing tests went down from 663 to 344. (Both numbers exclude tests that failed because of unaligned accesses in code that is inside #if HAVE_FAST_UNALIGNED.)
	Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
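For context on the two shift-related entries above: in C, left-shifting a negative signed value is undefined behavior, even where compilers usually produce the expected result. One common way to avoid it, shown here as an assumption-laden sketch rather than as the change these commits made, is to multiply by a power of two instead. The helper name `scale_pow2` is hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: scale v by 2^shift.
 * `v << shift` is undefined in C when v is negative; multiplying by
 * (1 << shift) is defined as long as the product fits in an int. */
static int scale_pow2(int v, unsigned shift)
{
    /* Keep 1 << shift itself representable as a positive int. */
    assert(shift < sizeof(int) * CHAR_BIT - 1);
    return v * (1 << shift);
}
```

Compilers typically turn the multiplication back into a shift instruction, so the rewrite is not expected to cost anything at run time.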
*	swscale/swscale: cosmetics	Limin Wang	2019-09-27
	Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale/output: fix signed integer overflow for ya16	Paul B Mahol	2019-09-26
	Fixes #7666.
*	swscale/swscale: delete unwanted assignments	Limin Wang	2019-09-09
	Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	swscale/output: fix some code indentations	Linjie Fu	2019-09-06
	Signed-off-by: Linjie Fu <linjie.fu@intel.com>
	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	lsws/ppc/yuv2rgb_altivec: Replace vec_lvsl/vec_perm with vec_xl	Chip Kerchner	2019-08-13
	gcc 6.x and 7.x generate wrong code for little endian machines for the vec_lvsl/vec_perm instruction combos in some cases. The bug was fixed in version 8.x.
	If these instructions are replaced with vec_xl, the problem goes away for all versions of the compilers.
	Fixes ticket #7124.