libav.git - [no description]

	Commit message	Author	Age
*	libavfilter/x86/vf_gblur: correct the order of loop step	Wu Jianhua	2021-09-18
	If the width of the processed block minus 1 is a multiple of the aligned number, the instruction jle .bscale_scalar skips the Optimized Loop Step, which leads to incorrect sampling when more than one step is specified. Move the Optimized Loop Step after .bscale_scalar to ensure the loop step is always applied.
	Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
*	libavfilter/x86/vf_gblur: fixed the fate-test failed on MacOS	Wu Jianhua	2021-09-18
	Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
*	libavfilter/x86/vf_gblur: add localbuf and ff_horiz_slice_avx2/512()	Wu Jianhua	2021-08-29
	Add ff_horiz_slice_avx2/512(), implemented with a new algorithm. In a nutshell, the new algorithm does three things: it gathers data from 8/16 rows, blurs the data, and scatters the data back to the image buffer. A customized 8x8/16x16 transpose is used to avoid the large overhead of gather and scatter instructions; it depends on a newly added temporary buffer called localbuf.
	Performance data:
	ff_horiz_slice_avx2(old): 109.89
	ff_horiz_slice_avx2(new): 666.67
	ff_horiz_slice_avx512: 1000
	Co-authored-by: Cheng Yanfei <yanfei.cheng@intel.com>
	Co-authored-by: Jin Jun <jun.i.jin@intel.com>
	Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
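The gather, blur, scatter flow described in this commit can be sketched in plain C. This is a minimal illustration under stated assumptions, not the actual AVX2/AVX-512 code: the recursive filter form and the names blur_row, blur_rows_transposed, nu, and ROWS are hypothetical choices for the example; only localbuf and the 8-row grouping come from the commit message.

```c
#define ROWS 8 /* rows processed together; 8 floats fit one 256-bit register */

/* Reference: one in-place recursive blur pass over a single row
 * (assumed filter form, for illustration only). */
static void blur_row(float *row, int width, float nu, float bscale)
{
    int x;

    row[0] *= bscale;
    for (x = 1; x < width; x++)          /* forward pass */
        row[x] += nu * row[x - 1];
    row[width - 1] *= bscale;
    for (x = width - 1; x > 0; x--)      /* backward pass */
        row[x - 1] += nu * row[x];
}

/* Sketch: blur ROWS rows at once through a transposed scratch buffer.
 * localbuf must hold at least width * ROWS floats. */
static void blur_rows_transposed(float *img, int width, float nu,
                                 float bscale, float *localbuf)
{
    int x, r;

    /* Gather: transpose so that column x of all rows is contiguous. */
    for (r = 0; r < ROWS; r++)
        for (x = 0; x < width; x++)
            localbuf[x * ROWS + r] = img[r * width + x];

    /* Blur: each inner loop over r touches contiguous memory, so it
     * maps onto a single vector operation in a SIMD version. */
    for (r = 0; r < ROWS; r++)
        localbuf[r] *= bscale;
    for (x = 1; x < width; x++)
        for (r = 0; r < ROWS; r++)
            localbuf[x * ROWS + r] += nu * localbuf[(x - 1) * ROWS + r];
    for (r = 0; r < ROWS; r++)
        localbuf[(width - 1) * ROWS + r] *= bscale;
    for (x = width - 1; x > 0; x--)
        for (r = 0; r < ROWS; r++)
            localbuf[(x - 1) * ROWS + r] += nu * localbuf[x * ROWS + r];

    /* Scatter: transpose back into the image buffer. */
    for (r = 0; r < ROWS; r++)
        for (x = 0; x < width; x++)
            img[r * width + x] = localbuf[x * ROWS + r];
}
```

The transposed layout is the design point: the data dependency runs along x, so rows are independent and can share one vector register per column.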
*	libavfilter/x86/vf_gblur: add ff_verti_slice_avx2/512()	Wu Jianhua	2021-08-29
	The new vertical slice with AVX2/512 acceleration can significantly improve the performance of Gaussian Filter 2D.
	Performance data:
	ff_verti_slice_c: 32.57
	ff_verti_slice_avx2: 476.19
	ff_verti_slice_avx512: 833.33
	Co-authored-by: Cheng Yanfei <yanfei.cheng@intel.com>
	Co-authored-by: Jin Jun <jun.i.jin@intel.com>
	Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
*	libavfilter/x86/vf_gblur: add ff_postscale_slice_avx512()	Wu Jianhua	2021-08-29
	Co-authored-by: Cheng Yanfei <yanfei.cheng@intel.com>
	Co-authored-by: Jin Jun <jun.i.jin@intel.com>
	Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
*	avfilter/avf_showcqt: switch to TX FFT from avutil	Paul B Mahol	2021-07-27
*	Remove unnecessary mem.h inclusions	Andreas Rheinhardt	2021-07-22
	Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
*	x86/vf_gblur: fix reg name in UNIX64 prologue	James Almer	2021-02-17
	Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/vf_gblur: fix postscale_slice prologue	James Almer	2021-02-17
	The x86_32 ABI does not pass float arguments directly in xmm regs, and the Win64 ABI uses only the first four regs for this purpose.
	Signed-off-by: James Almer <jamrial@gmail.com>
*	avfilter/x86/vf_gblur: add postscale SIMD	Paul B Mahol	2021-02-16
*	avfilter/vf_convolution: add 16-column operation for filter_column()	Paul B Mahol	2021-02-13
	Based on patch by Xu Jun <xujunzz@sjtu.edu.cn>
*	avfilter/vf_atadenoise: add sigma options	Paul B Mahol	2021-01-22
*	avfilter/vf_v360: add mitchell interpolation	Paul B Mahol	2020-10-04
*	avfilter/x86/vf_convolution_init: there is asm only for 8bit depth	Paul B Mahol	2020-09-15
*	Revert "avfilter/yadif: simplify the code for better readability"	Limin Wang	2020-08-27
	This reverts commit 2a9b934675b9e2d3850b46f8a618c19b03f02551.
*	avfilter/yadif: simplify the code for better readability	Limin Wang	2020-08-26
	Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
*	x86/vf_blend: fix warnings about trailing empty parameters	James Almer	2020-07-12
	Finishes fixing ticket #8771
	Signed-off-by: James Almer <jamrial@gmail.com>
*	avfilter/x86/vf_v360_init: add missing cases	Paul B Mahol	2020-04-02
*	avfilter/vf_v360: add SIMD for lagrange9 interpolation	Paul B Mahol	2020-04-02
*	vf_ssim: Fix loading doubles to float registers on i386	Martin Storsjö	2020-02-05
	This fixes the tests filter-refcmp-ssim-yuv and filter-refcmp-ssim-rgb on i386 after breaking in fcc0424c933742c8fc852371e985d16b6eb4bfe9.
	Signed-off-by: Martin Storsjö <martin@martin.st>
*	avfilter/vf_ssim: improve precision	Paul B Mahol	2020-02-04
	Use doubles for accumulating floats.
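Why a double accumulator helps can be shown with a short C sketch. This is only an illustration of the general technique, assuming a simple running sum; sum_float and sum_double are hypothetical names and do not appear in vf_ssim.

```c
#include <stddef.h>

/* Accumulate in single precision: small terms can be rounded away
 * once the running sum grows large. */
static float sum_float(const float *v, size_t n)
{
    float acc = 0.0f;
    size_t i;

    for (i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

/* Accumulate the same float inputs in double precision: the 53-bit
 * mantissa keeps the low-order bits that a float accumulator drops. */
static double sum_double(const float *v, size_t n)
{
    double acc = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        acc += v[i];
    return acc;
}
```

For instance, with the inputs 16777216, 1, 1, 1 the double accumulator holds 16777219 exactly, whereas a single-precision accumulator cannot represent 16777217 and may lose the small terms.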
*	avfilter/vf_v360: change remaps to int16_t type	Paul B Mahol	2020-01-19
*	avfilter/x86/vf_interlace: always use unaligned movs	Marton Balint	2019-12-15
	Fixes crashes in command lines such as:
	    ffmpeg -f lavfi -i testsrc2=704x576:r=50,interlace,pad=720:576:8 -f null none
	Related to ticket #6491.
	Signed-off-by: Marton Balint <cus@passwd.hu>
*	avfilter/vf_maskedclamp: add x86 SIMD	Paul B Mahol	2019-10-23
*	x86/vf_transpose: make ff_transpose_8x8_16_sse2 work on x86_32	James Almer	2019-10-22
	Reviewed-by: Paul B Mahol <onemda@gmail.com>
	Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/vf_transpose: fix cpuflags check	James Almer	2019-10-21
	Signed-off-by: James Almer <jamrial@gmail.com>
*	avfilter/vf_transpose: add x86 SIMD	Paul B Mahol	2019-10-21
*	avfilter/x86/vf_atadenoise: fix comment	Paul B Mahol	2019-10-21
*	avfilter/x86/vf_atadenoise: add SIMD for serial too	Paul B Mahol	2019-10-17
*	avfilter/vf_atadenoise: add option to use additional algorithm	Paul B Mahol	2019-10-17
*	avfilter/vf_adadenoise: add x86 SIMD	Paul B Mahol	2019-10-17
*	avfilter/vf_gblur: fix heap-buffer overflow	Paul B Mahol	2019-10-16
	Fixes #8282
*	avcodec/filter: Remove extra '; ' outside of functions	Andreas Rheinhardt	2019-10-07
	They are not allowed outside of functions. Fixes the warning "ISO C does not allow extra ‘;’ outside of a function [-Wpedantic]" when compiling with GCC and -pedantic.
	Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
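A common source of this warning is a macro that expands to a whole function definition and is then invoked with a trailing semicolon. The sketch below shows the pattern; DEFINE_SCALE_FUNC and the generated function names are hypothetical and are not taken from the affected files.

```c
/* Hypothetical macro that expands to a complete function definition. */
#define DEFINE_SCALE_FUNC(name, factor) \
static int name(int x)                  \
{                                       \
    return x * (factor);                \
}

/* Writing "DEFINE_SCALE_FUNC(scale_by_2, 2);" would leave a stray ';'
 * at file scope after the closing brace, which GCC reports under
 * -pedantic. Invoking the macro without the semicolon avoids it. */
DEFINE_SCALE_FUNC(scale_by_2, 2)
DEFINE_SCALE_FUNC(scale_by_3, 3)
```

The generated functions are called like any other, for example scale_by_2(3).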
*	avfilter/vf_eq: fix compilation with x86 asm disabled	James Almer	2019-09-26
	Signed-off-by: James Almer <jamrial@gmail.com>
*	avfilter/x86/vf_eq: add SSE2 version	Ting Fu	2019-09-26
	Signed-off-by: Ting Fu <ting.fu@intel.com>
*	avfilter/x86/vf_eq: Change inline assembly into nasm code	Ting Fu	2019-09-26
	Signed-off-by: Ting Fu <ting.fu@intel.com>
*	avfilter/x86/vf_360: add most of >8 depth asm	Paul B Mahol	2019-09-16
*	x86/vf_v360: use a faster horizontal add in remap4_8bit_line_avx2	James Almer	2019-09-06
	Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/vf_v360: make remap{1,2}_8bit_line_avx2 work on x86_32	James Almer	2019-09-06
	Signed-off-by: James Almer <jamrial@gmail.com>
*	avfilter/vf_v360: x86 SIMD for interpolations	Paul B Mahol	2019-09-06
*	avfilter/vf_convolution: add x86 SIMD for filter_3x3()	Ruiling Song	2019-08-07
	Tested using a simple command (apply edge enhance):
	    ./ffmpeg_g -i ~/Downloads/bbb_sunflower_1080p_30fps_normal.mp4 \
	        -vf convolution="0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:5:1:1:1:0:128:128:128" \
	        -an -vframes 1000 -f null /dev/null
	The fps increases from 151 to 270 on my local machine.
	Signed-off-by: Ruiling Song <ruiling.song@intel.com>
*	avfilter/vf_gblur: add missing preprocessor check	James Almer	2019-06-12
	Fixes compilation on x86_32
	Signed-off-by: James Almer <jamrial@gmail.com>
*	avfilter/vf_gblur: add x86 SIMD optimizations	Ruiling Song	2019-06-12
	The horizontal pass gets ~2x performance with the patch when running on a single thread. Tested overall performance using the commands (avx2 enabled):
	    ./ffmpeg -i 1080p.mp4 -vf gblur -f null /dev/null
	    ./ffmpeg -i 1080p.mp4 -vf gblur=threads=1 -f null /dev/null
	For single thread, the fps improves from 43 to 60, about 40%. For multi-thread, the fps improves from 110 to 130, about 20%.
	Signed-off-by: Ruiling Song <ruiling.song@intel.com>
*	avfilter: add anlmdn filter x86 SIMD optimizations	Paul B Mahol	2019-01-10
*	x86/af_afir: use three operand form for some instructions	James Almer	2019-01-03
	Fixes compilation with old yasm versions.
	Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/af_afir: add ff_fcmul_add_avx()	James Almer	2019-01-03
	fcmul_add_c: 1228.8
	fcmul_add_sse3: 334.3
	fcmul_add_avx: 186.3
	Tested on a Core i5 4460 @ 3.2GHz
	Reviewed-by: Paul B Mahol <onemda@gmail.com>
	Signed-off-by: James Almer <jamrial@gmail.com>
*	avfilter/af_afir: split off fcmul_add into a DSP context	James Almer	2019-01-03
	Reviewed-by: Paul B Mahol <onemda@gmail.com>
	Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/af_afir: fix processing the last element	James Almer	2019-01-03
	ff_fcmul_add_sse3() is now identical to the C version.
	Reviewed-by: Paul B Mahol <onemda@gmail.com>
	Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/scene_sad: fix link errors when HAVE_X86ASM is not defined	James Almer	2018-11-21
	Reviewed-by: Haihao Xiang <haihao.xiang@intel.com>
	Signed-off-by: James Almer <jamrial@gmail.com>
*	avfilter/vf_blend: add 10bit support	Paul B Mahol	2018-11-15