libav.git - [no description]

	Commit message	Author	Age
...
*	lavc/aarch64: new optimization for 8-bit hevc_epel_bi_h	Logan Lyu	2023-12-01
\|	put_hevc_epel_bi_h4_8_c: 96.0
\|	put_hevc_epel_bi_h4_8_neon: 36.3
\|	put_hevc_epel_bi_h6_8_c: 288.3
\|	put_hevc_epel_bi_h6_8_neon: 59.3
\|	put_hevc_epel_bi_h8_8_c: 358.5
\|	put_hevc_epel_bi_h8_8_neon: 61.5
\|	put_hevc_epel_bi_h12_8_c: 759.8
\|	put_hevc_epel_bi_h12_8_neon: 159.5
\|	put_hevc_epel_bi_h16_8_c: 1307.0
\|	put_hevc_epel_bi_h16_8_neon: 182.0
\|	put_hevc_epel_bi_h24_8_c: 2778.3
\|	put_hevc_epel_bi_h24_8_neon: 430.5
\|	put_hevc_epel_bi_h32_8_c: 4952.3
\|	put_hevc_epel_bi_h32_8_neon: 679.5
\|	put_hevc_epel_bi_h48_8_c: 11803.3
\|	put_hevc_epel_bi_h48_8_neon: 1443.5
\|	put_hevc_epel_bi_h64_8_c: 20654.8
\|	put_hevc_epel_bi_h64_8_neon: 2737.0
\|	put_hevc_qpel_bi_h4_8_c: 140.0
\|	put_hevc_qpel_bi_h4_8_neon: 111.5
\|	put_hevc_qpel_bi_h6_8_c: 318.0
\|	put_hevc_qpel_bi_h6_8_neon: 85.8
\|	put_hevc_qpel_bi_h8_8_c: 536.5
\|	put_hevc_qpel_bi_h8_8_neon: 95.3
\|	put_hevc_qpel_bi_h12_8_c: 1188.5
\|	put_hevc_qpel_bi_h12_8_neon: 291.3
\|	put_hevc_qpel_bi_h16_8_c: 2064.3
\|	put_hevc_qpel_bi_h16_8_neon: 365.3
\|	put_hevc_qpel_bi_h24_8_c: 4757.5
\|	put_hevc_qpel_bi_h24_8_neon: 1010.0
\|	put_hevc_qpel_bi_h32_8_c: 8351.8
\|	put_hevc_qpel_bi_h32_8_neon: 2917.8
\|	put_hevc_qpel_bi_h48_8_c: 19299.8
\|	put_hevc_qpel_bi_h48_8_neon: 2976.8
\|	put_hevc_qpel_bi_h64_8_c: 34182.5
\|	put_hevc_qpel_bi_h64_8_neon: 5236.3
\|	Co-Authored-By: J. Dekker <jdek@itanimul.li>
\|	Signed-off-by: Martin Storsjö <martin@martin.st>
*	lavc/aarch64: new optimization for 8-bit hevc_pel_bi_pixels	Logan Lyu	2023-12-01
\|	put_hevc_pel_bi_pixels4_8_c: 54.7
\|	put_hevc_pel_bi_pixels4_8_neon: 43.0
\|	put_hevc_pel_bi_pixels6_8_c: 94.7
\|	put_hevc_pel_bi_pixels6_8_neon: 37.0
\|	put_hevc_pel_bi_pixels8_8_c: 171.0
\|	put_hevc_pel_bi_pixels8_8_neon: 24.0
\|	put_hevc_pel_bi_pixels12_8_c: 354.0
\|	put_hevc_pel_bi_pixels12_8_neon: 68.7
\|	put_hevc_pel_bi_pixels16_8_c: 588.2
\|	put_hevc_pel_bi_pixels16_8_neon: 77.5
\|	put_hevc_pel_bi_pixels24_8_c: 1670.7
\|	put_hevc_pel_bi_pixels24_8_neon: 173.0
\|	put_hevc_pel_bi_pixels32_8_c: 2267.7
\|	put_hevc_pel_bi_pixels32_8_neon: 281.2
\|	put_hevc_pel_bi_pixels48_8_c: 5787.5
\|	put_hevc_pel_bi_pixels48_8_neon: 673.5
\|	put_hevc_pel_bi_pixels64_8_c: 9897.0
\|	put_hevc_pel_bi_pixels64_8_neon: 1159.5
\|	Co-Authored-By: J. Dekker <jdek@itanimul.li>
\|	Signed-off-by: Martin Storsjö <martin@martin.st>
*	checkasm/ac3dsp: add float_to_fixed24 test	sunyuechi	2023-12-01
\| \| \| \|	Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
*	avcodec/ac3dsp: add missing stddef.h include	James Almer	2023-12-01
\|	Should fix make checkheaders
\|	Signed-off-by: James Almer <jamrial@gmail.com>
*	avfilter/framesync: fix OOM case	Paul B Mahol	2023-11-30
\| \| \| \| \| \|	Fixes OOM when caller keeps adding frames into filtergraph that reached EOF by other means, for example EOF is signalled by other filter in filtergraph or by buffersink.
*	avfilter/arls_template: use defines for all constants	Paul B Mahol	2023-11-28
\|
*	avfilter: add Affine Projection adaptive audio filter	Paul B Mahol	2023-11-28
\|
*	lavc/hevcdsp_qpel_neon: using movi.16b instead of movi.2d	xufuji456	2023-11-28
\|	When building for the iOS platform on arm64, the compiler emits a warning: "instruction movi.2d with immediate #0 may not function correctly on this CPU, converting to movi.16b"
\|	Signed-off-by: xufuji456 <839789740@qq.com>
\|	Signed-off-by: Martin Storsjö <martin@martin.st>
*	avfilter/af_anlms: set output frame duration	Paul B Mahol	2023-11-28
\|
*	avfilter/af_arls: set output frame duration	Paul B Mahol	2023-11-28
\|
*	avfilter/af_amix: set output frame duration	Paul B Mahol	2023-11-28
\|
*	avfilter/af_amultiply: set output frame duration	Paul B Mahol	2023-11-28
\|
*	avfilter/af_amerge: use already provided outlink	Paul B Mahol	2023-11-28
\|
*	avfilter: no need to request more samples if internal frame is available	Paul B Mahol	2023-11-28
\|
*	tools/general_assembly: add newly voted-in extra GA members	Anton Khirnov	2023-11-28
\|	Cf.
\|	* https://vote.ffmpeg.org/cgi-bin/civs/results.pl?id=E_d0b225b9aa8d45d5
\|	* http://lists.ffmpeg.org/pipermail/ffmpeg-devel/2023-November/317496.html
\|	Message-Id <170115613784.8914.4950266152609138336@lain.khirnov.net>
*	avfilter/af_arls: add double sample format support	Paul B Mahol	2023-11-27
\|
*	avfilter/af_anlms: add double sample format support	Paul B Mahol	2023-11-27
\|
*	checkasm: test for dcmul_add	sunyuechi	2023-11-27
\| \| \| \|	Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
*	avcodec/videotoolboxenc: refactor dump encoder name	Zhao Zhili	2023-11-27
\| \| \| \|	Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
*	avcodec/videotoolboxenc: Fix build failure due to PropertyKey_EncoderID	Zhao Zhili	2023-11-27
\| \| \| \|	Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
*	fftools/ffplay_renderer: declare function argument as const	Leo Izen	2023-11-27
\|	Declaring the function argument as const fixes a warning down the line that the const parameter is stripped. We don't modify this argument.
\|	Signed-off-by: Leo Izen <leo.izen@gmail.com>
\|	Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
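The kind of warning described here can be illustrated with a minimal sketch. The `Frame` type and both function names below are hypothetical and are not taken from fftools/ffplay_renderer; they only show why a read-only pointer parameter is best declared `const` all the way down the call chain.

```c
#include <assert.h>

/* Hypothetical type, for illustration only. */
typedef struct Frame {
    int width;
    int height;
} Frame;

/* Read-only helper: it takes a pointer-to-const because it never
 * modifies the frame. */
static int frame_area(const Frame *f)
{
    assert(f);
    return f->width * f->height;
}

/* If this parameter were declared `Frame *f`, a caller holding a
 * `const Frame *` would discard the qualifier when passing it in, and
 * compilers typically warn about that. Declaring it const avoids the
 * warning and documents that the argument is not modified. */
static int frame_is_nonempty(const Frame *f)
{
    return frame_area(f) > 0;
}
```

A caller can then pass the address of a `const Frame` object to either function without a cast.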
*	avfilter/vf_colorcorrect: fix memory leaks	Paul B Mahol	2023-11-27
\|
*	avfilter/af_dialoguenhance: do output scaling once	Paul B Mahol	2023-11-27
\|
*	avfilter/af_afwtdn: fix crash with EOF handling	Paul B Mahol	2023-11-27
\|
*	avfilter/af_dialoguenhance: simplify channels copy	Paul B Mahol	2023-11-27
\|
*	doc/filters: restore entry for libvmaf option pool	Gyan Doshi	2023-11-27
\| \| \| \| \| \| \| \| \|	3d29724c00 removed the doc entry for the option pool while adding a parser function for it at the same time! The option remains available and undeprecated. Fixes trac #10693
*	avformat: add QOA demuxer	Paul B Mahol	2023-11-26
\|
*	avcodec: add QOA decoder	Paul B Mahol	2023-11-26
\|
*	libavcodec/mlpdec: add missing correction to ch_layout when downmixing	Geoffrey McRae	2023-11-26
\|	This fixes corrupted audio for applications relying on ch_layout when codec downmixing is active.
\|	Signed-off-by: Geoffrey McRae <geoff@hostfission.com>
\|	Signed-off-by: James Almer <jamrial@gmail.com>
*	libavcodec/dcadec: adjust the `ch_layout` when downmix is active	Geoffrey McRae	2023-11-26
\|	Applications making use of this codec with the `downmix` option are segfaulting unless the `ch_layout` is overridden after `avcodec_open2`, as can be seen in projects like MythTV[1].
\|	This patch fixes this by overriding the ch_layout as done in other decoders such as AC3.
\|	1: https://github.com/MythTV/mythtv/blob/af6f362a140cd59b9ed784a8c639fd456b5f6967/mythtv/libs/libmythtv/decoders/avformatdecoder.cpp#L4607
\|	Signed-off-by: Geoffrey McRae <geoff@hostfission.com>
\|	Signed-off-by: James Almer <jamrial@gmail.com>
*	libavfilter/vf_dnn_detect: Add yolo support	Wenbin Chen	2023-11-26
\|	Add yolo support. A yolo model doesn't output the final result. It outputs candidate boxes, so we need a post-processing step that removes overlapping boxes to get the final results. Also, the boxes' coordinates are relative to cells and anchors, so we need this information to calculate the boxes as well.
\|	For model details, please refer to: https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v2-tf
\|	Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
\|	Reviewed-by: Guo Yejun <yejun.guo@intel.com>
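Removing overlapping candidate boxes is commonly done with greedy non-maximum suppression (NMS). The following is a generic sketch of that technique, not the code added to vf_dnn_detect: the `Box` struct, the function names and the threshold parameter are all assumptions made for illustration, and the cell/anchor decoding step is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical box type: top-left corner, size and confidence score. */
typedef struct Box {
    float x, y, w, h;
    float score;
} Box;

/* Intersection over union of two axis-aligned boxes. */
static float box_iou(const Box *a, const Box *b)
{
    float x0 = a->x > b->x ? a->x : b->x;
    float y0 = a->y > b->y ? a->y : b->y;
    float x1 = a->x + a->w < b->x + b->w ? a->x + a->w : b->x + b->w;
    float y1 = a->y + a->h < b->y + b->h ? a->y + a->h : b->y + b->h;
    float iw = x1 - x0, ih = y1 - y0;

    if (iw <= 0.0f || ih <= 0.0f)
        return 0.0f;                      /* no overlap */

    float inter = iw * ih;
    float uni   = a->w * a->h + b->w * b->h - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

/* Greedy NMS. `boxes` must already be sorted by descending score.
 * keep[i] is set to 1 for surviving boxes and 0 for suppressed ones.
 * Returns the number of surviving boxes. */
static size_t nms(const Box *boxes, size_t n, float iou_thresh,
                  unsigned char *keep)
{
    size_t kept = 0;

    assert(n == 0 || (boxes && keep));
    for (size_t i = 0; i < n; i++)
        keep[i] = 1;

    for (size_t i = 0; i < n; i++) {
        if (!keep[i])
            continue;
        kept++;
        /* Suppress every lower-scored box that overlaps box i too much. */
        for (size_t j = i + 1; j < n; j++)
            if (keep[j] && box_iou(&boxes[i], &boxes[j]) > iou_thresh)
                keep[j] = 0;
    }
    return kept;
}
```

Sorting by score first means the most confident box in each cluster is the one that survives; the threshold trades duplicate detections against merging distinct neighbouring objects.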
*	libavfilter/vf_dnn_detect: Add model_type option.	Wenbin Chen	2023-11-26
\|	There are many kinds of detection DNN models, and they have different preprocessing and postprocessing methods. To support more models, the "model_type" option is added to help choose the preprocessing and postprocessing functions.
\|	Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
\|	Reviewed-by: Guo Yejun <yejun.guo@intel.com>
*	tools/general_assembly: restore printing HEAD	Anton Khirnov	2023-11-26
\|
*	tools/general_assembly: implement extra GA members	Anton Khirnov	2023-11-26
\|
*	avfilter/vsrc_gradients: allow zero speed	Paul B Mahol	2023-11-26
\|
*	avfilter/vsrc_gradients: add square type	Paul B Mahol	2023-11-26
\|
*	mips/ac3dsp_mips: add missing stddef.h header include	James Almer	2023-11-25
\|	Fixes compilation failures after 567c67c6c8cb9be083f56198bfa979e4bda84c99.
\|	Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
\|	Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/ac3dsp: add ff_float_to_fixed24_avx()	James Almer	2023-11-25
\| \| \| \|	Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/ac3dsp: reduce instruction count inside the float_to_fixed24 loop	James Almer	2023-11-25
\| \| \| \|	Signed-off-by: James Almer <jamrial@gmail.com>
*	avfilter/af_dialoguenhance: fix overreads	Paul B Mahol	2023-11-25
\|
*	avfilter/af_channelmap: do not override set channel layout	Paul B Mahol	2023-11-25
\|
*	Revert "avformat/rtmpproto: Pass rw_timeout to underlying transport protocol"	Zhao Zhili	2023-11-25
\|	This reverts commit bec6dfcd5c0b59dd6d947ec3074986aeffd525aa.
\|	The patch is a NOP since ffurl_open_whitelist copies options from the parent automatically.
\|	Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
*	checkasm/riscv: report an error upon SIGILL	Rémi Denis-Courmont	2023-11-23
\| \| \| \| \| \| \| \|	Terminating the whole checkasm process is not very helpful. This will report if an illegal instruction occurs while executing a tested function. This is a common occurrence whilst developing RISC-V assembler, due to the compatibility between vector configuration and instruction done at run-time.
*	checkasm: add helper to report a fatal signal	Rémi Denis-Courmont	2023-11-23
\|
*	lavc/llvidencdsp: add R-V V diff_bytes	Rémi Denis-Courmont	2023-11-23
\| \| \| \| \|	diff_bytes_c: 163.0 diff_bytes_rvv_i32: 52.7
*	lavc/aacpsdsp: use LMUL=2 and amortise strides	Rémi Denis-Courmont	2023-11-23
\|	The input is laid out in 16 segments, of which 13 actually need to be loaded. There are no really efficient ways to deal with this:
\|	1) If we load 8 segments with unit stride, then narrow to 16 segments with right shifts, we can only get one half-size vector per segment, or just 2 elements per vector (EMUL=1/2) - at least with 128-bit vectors. This ends up unsurprisingly about as fast as the C code.
\|	2) The current approach is to load with strides. We keep that approach, but improve it using three 4-segmented loads instead of 12 single-segment loads. This divides the number of distinct loaded addresses by 4.
\|	3) A potential third approach would be to avoid segmentation altogether and splat the scalar coefficient into vectors. Then we can use a unit-stride and maximum EMUL. But the downside then is that we have to multiply the 3 (of 16) unused segments with zero as part of the multiply-accumulate operations.
\|	In addition, we also reuse vectors mid-loop so as to increase the EMUL from 1 to 2, which also improves performance a little bit.
\|	Overall the gains are quite small with the device under test, as it does not deal with segmented loads very well. But at least the code is tidier, and should enjoy bigger speed-ups on better hardware implementation.
\|	Before:
\|	ps_hybrid_analysis_c: 1819.2
\|	ps_hybrid_analysis_rvv_f32: 1037.0 (before)
\|	ps_hybrid_analysis_rvv_f32: 990.0 (after)
*	lavc/g722dsp: optimise R-V V apply_qmf	Rémi Denis-Courmont	2023-11-23
\|	This stores the constant coefficients deinterleaved, so that they can be loaded directly with NF=0. Unfortunately, we cannot optimise loading the input, due to insufficient memory alignment (not 32-bit).
\|	Before:
\|	g722_apply_qmf_c: 82.5
\|	g722_apply_qmf_rvv_i32: 78.2
\|	After:
\|	g722_apply_qmf_c: 82.5
\|	g722_apply_qmf_rvv_i32: 65.2
*	lavu/fixed_dsp: R-V V fmul_window_scaled	Rémi Denis-Courmont	2023-11-23
\| \| \| \| \|	vector_fmul_window_scaled_fixed_c: 4393.7 vector_fmul_window_scaled_fixed_rvv_i64: 1642.7
*	lavu/float_dsp: optimise R-V V fmul_reverse & fmul_window	Rémi Denis-Courmont	2023-11-23
\|	Roll the loop to avoid slow gathers.
\|	Before:
\|	vector_fmul_reverse_c: 1561.7
\|	vector_fmul_reverse_rvv_f32: 2410.2
\|	vector_fmul_window_c: 2068.2
\|	vector_fmul_window_rvv_f32: 1879.5
\|	After:
\|	vector_fmul_reverse_c: 1561.7
\|	vector_fmul_reverse_rvv_f32: 916.2
\|	vector_fmul_window_c: 2068.2
\|	vector_fmul_window_rvv_f32: 1202.5
*	lavu/fixed_dsp: optimise R-V V fmul_reverse	Rémi Denis-Courmont	2023-11-23
\|	Gathers are (unsurprisingly) a notable exception to the rule that R-V V gets faster with larger group multipliers. So roll the function to speed it up.
\|	Before:
\|	vector_fmul_reverse_fixed_c: 2840.7
\|	vector_fmul_reverse_fixed_rvv_i32: 2430.2
\|	After:
\|	vector_fmul_reverse_fixed_c: 2841.0
\|	vector_fmul_reverse_fixed_rvv_i32: 962.2
\|	It might be possible to further optimise the function by moving the reverse-subtract out of the loop and adding ad-hoc tail handling.
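For readers unfamiliar with the operation being optimised, the sketch below shows the float variant in plain C, assuming the usual reference semantics dst[i] = src0[i] * src1[len - 1 - i]. Both function names are hypothetical. The second version reads src1 through a pointer that walks backwards, which is one way to express the reversal without loading forwards and then permuting (gathering) the elements; the log does not say that the R-V V code does exactly this.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar sketch: multiply src0 by src1 read in reverse order. */
static void fmul_reverse_ref(float *dst, const float *src0,
                             const float *src1, size_t len)
{
    assert(len == 0 || (dst && src0 && src1));
    for (size_t i = 0; i < len; i++)
        dst[i] = src0[i] * src1[len - 1 - i];
}

/* Same result, with src1 accessed through a descending pointer,
 * i.e. a stride of -1 element, instead of an index permutation. */
static void fmul_reverse_backwards(float *dst, const float *src0,
                                   const float *src1, size_t len)
{
    assert(len == 0 || (dst && src0 && src1));
    const float *rev = src1 + len;        /* one past the last element */
    for (size_t i = 0; i < len; i++)
        dst[i] = src0[i] * *--rev;
}
```

The fixed-point variant named in the benchmark follows the same access pattern but uses integer arithmetic with rounding, which is not shown here.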