libav.git - [no description]

	Commit message	Author	Age
*	avcodec/x86: add cfhdenc SIMD	Paul B Mahol	2021-02-27
\|
*	avcodec/cfhd: add x86 SIMD	Paul B Mahol	2020-08-26
\| \| \| \|	Overall speed changes for 1920x1080, yuv422p10le, 60fps from: 0.19x to 0.343x
*	avcodec/Makefile: add missing pngdsp dependency to the lscr decoder	James Almer	2019-05-14
\| \| \| \|	Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/opusdsp: implement FMA3 accelerated postfilter and deemphasis	Lynne	2019-04-01
\|	58893 decicycles in deemphasis_c, 130548 runs, 524 skips
\|	9475 decicycles in deemphasis_fma3, 130686 runs, 386 skips -> 6.21x speedup
\|
\|	24866 decicycles in postfilter_c, 65386 runs, 150 skips
\|	5268 decicycles in postfilter_fma3, 65505 runs, 31 skips -> 4.72x speedup
\|
\|	Total decoder speedup: ~14%
\|
\|	Deemphasis SIMD based on the following unrolling:
\|
\|	const float c1 = CELT_EMPH_COEFF, c2 = c1*c1, c3 = c2*c1, c4 = c3*c1;
\|	float state = coeff;
\|
\|	for (int i = 0; i < len; i += 4) {
\|	    y[0] = x[0] + c1*state;
\|	    y[1] = x[1] + c2*state + c1*x[0];
\|	    y[2] = x[2] + c3*state + c1*x[1] + c2*x[0];
\|	    y[3] = x[3] + c4*state + c1*x[2] + c2*x[1] + c3*x[0];
\|
\|	    state = y[3];
\|	    y += 4;
\|	    x += 4;
\|	}
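The unrolling quoted in the commit message above expands a one-pole recursion so that four outputs depend only on the previous state and the current inputs. The following is a minimal C sketch of that equivalence, not the FFmpeg code. It assumes the scalar form y[i] = x[i] + c1*y[i-1], which is what the four unrolled lines reduce to. `EMPH_COEFF` is a placeholder value standing in for `CELT_EMPH_COEFF`, and the function names are hypothetical.

```c
/* Placeholder coefficient for illustration only; the decoder uses CELT_EMPH_COEFF. */
#define EMPH_COEFF 0.85f

/* Assumed scalar reference: y[i] = x[i] + c1 * y[i-1]. Returns the final state. */
static float deemphasis_scalar(float *y, const float *x, float state, int len)
{
    for (int i = 0; i < len; i++) {
        y[i]  = x[i] + EMPH_COEFF * state;
        state = y[i];
    }
    return state;
}

/* Unrolled by four as in the commit message; len must be a multiple of 4. */
static float deemphasis_unrolled(float *y, const float *x, float state, int len)
{
    const float c1 = EMPH_COEFF, c2 = c1*c1, c3 = c2*c1, c4 = c3*c1;

    for (int i = 0; i < len; i += 4) {
        y[0] = x[0] + c1*state;
        y[1] = x[1] + c2*state + c1*x[0];
        y[2] = x[2] + c3*state + c1*x[1] + c2*x[0];
        y[3] = x[3] + c4*state + c1*x[2] + c2*x[1] + c3*x[0];

        state = y[3];
        y += 4;
        x += 4;
    }
    return state;
}

/* Largest absolute difference between two buffers, for comparing the two versions. */
static float max_abs_diff(const float *a, const float *b, int len)
{
    float max = 0.0f;
    for (int i = 0; i < len; i++) {
        float d = a[i] > b[i] ? a[i] - b[i] : b[i] - a[i];
        if (d > max)
            max = d;
    }
    return max;
}
```

Running both functions on the same input and starting state should give outputs that agree up to floating-point rounding. The unrolled form removes the sample-to-sample dependency inside each group of four, which is what lets a SIMD version compute the four outputs together.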
*	celt_pvq_init: only build when CONFIG_OPUS_ENCODER is enabled	Lynne	2019-03-31
\| \| \| \|	The entire function was defined away before.
*	x86/opus_dsp: rename to celt_pvq	Lynne	2019-03-31
\| \| \| \|	It's only used in the encoder and in CELT's PVQ.
*	sbcenc: add MMX optimizations	Aurelien Jacobs	2018-03-07
\|	This was originally based on libsbc, and was fully integrated into ffmpeg.
\|
\|	Rough speed test:
\|	C version:   speed= 592x
\|	MMX version: speed= 785x
*	libavcodec/exr : add X86 SIMD for reorder_pixels	Martin Vignali	2017-09-17
\| \| \| \|	Signed-off-by: James Almer <jamrial@gmail.com>
*	SIMD opus pvq_search implementation	Ivan Kalvachev	2017-08-18
\|	An explanation of the workings and methods used by the Pyramid Vector Quantization Search function can be found in the following Work-In-Progress mail threads:
\|	http://ffmpeg.org/pipermail/ffmpeg-devel/2017-June/212146.html
\|	http://ffmpeg.org/pipermail/ffmpeg-devel/2017-June/212816.html
\|	http://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213030.html
\|	http://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213436.html
\|
\|	Signed-off-by: Ivan Kalvachev <ikalvachev@gmail.com>
*	avcodec/utvideodec: add SIMD for restore_rgb_planes	Paul B Mahol	2017-06-27
\| \| \| \|	Signed-off-by: Paul B Mahol <onemda@gmail.com>
*	mdct15: add assembly optimizations for the 15-point FFT	Rostislav Pehlivanov	2017-06-23
\|	c:   1802 decicycles in fft15, 16774635 runs, 2581 skips
\|	avx: 865 decicycles in fft15, 16776378 runs, 838 skips
\|
\|	Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
*	build: Generalize yasm/nasm-related variable names	Diego Biurrun	2017-06-21
\| \| \| \| \| \| \| \|	None of them are specific to the YASM assembler. (Cherry-picked from libav commit 39e208f4d4756367c7cd2d581847e0c1b8a429c1) Signed-off-by: James Almer <jamrial@gmail.com>
*	avcodec/x86: move simple_idct to external assembly	James Darnley	2017-05-30
\|
*	cavs: convert idct from inline asm to yasm.	Ronald S. Bultje	2017-04-06
\|
*	lavc/x86/hevc: rename hevc_res_add to hevc_add_res	Clément Bœsch	2017-03-24
\| \| \| \|	This will simplify incoming merge.
*	Merge commit 'b57e38f52cc3f31a27105c28887d57cd6812c3eb'	Clément Bœsch	2017-03-22
\|\ \| \| \| \| \| \| \| \| \| \| \| \|	* commit 'b57e38f52cc3f31a27105c28887d57cd6812c3eb': ac3dsp: x86: Replace inline asm for in-decoder downmixing with standalone asm Merged-by: Clément Bœsch <u@pkh.me>
\| *	ac3dsp: x86: Replace inline asm for in-decoder downmixing with standalone asm	Justin Ruggles	2016-10-01
\| \|	Adds a wrapper function for downmixing which detects channel count changes and updates the selected downmix function accordingly.
\| \|
\| \|	Simplification and porting to current x86inc infrastructure by Diego Biurrun.
\| \|
\| \|	Signed-off-by: Diego Biurrun <diego@biurrun.de>
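The wrapper described in the commit message above caches a function pointer and re-selects it when the channel count changes. The following C sketch shows that general pattern only. Every name in it (`DownmixCtx`, `downmix_stereo`, `downmix_generic`, `downmix`) is hypothetical and does not reflect the actual ac3dsp API, and the mono-averaging downmix is an assumption chosen to keep the example short.

```c
#include <stddef.h>

/* Hypothetical context: remembers which channel count the pointer was chosen for. */
typedef struct DownmixCtx {
    int in_channels;
    void (*fn)(float **ch, int in_channels, int len);
} DownmixCtx;

/* Generic path: average all input channels into ch[0]. */
static void downmix_generic(float **ch, int in_channels, int len)
{
    for (int i = 0; i < len; i++) {
        float sum = 0.0f;
        for (int c = 0; c < in_channels; c++)
            sum += ch[c][i];
        ch[0][i] = sum / in_channels;
    }
}

/* Specialised path for exactly two channels (stands in for an optimised routine). */
static void downmix_stereo(float **ch, int in_channels, int len)
{
    (void)in_channels;
    for (int i = 0; i < len; i++)
        ch[0][i] = 0.5f * (ch[0][i] + ch[1][i]);
}

/* Wrapper: re-select the implementation only when the channel count changes. */
static void downmix(DownmixCtx *ctx, float **ch, int in_channels, int len)
{
    if (!ctx->fn || ctx->in_channels != in_channels) {
        ctx->in_channels = in_channels;
        ctx->fn = (in_channels == 2) ? downmix_stereo : downmix_generic;
    }
    ctx->fn(ch, in_channels, len);
}
```

A caller would zero-initialise the context once and call `downmix()` for every block. The selection cost is then paid only on the first call and whenever the stream changes its channel layout.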
\| *	audiodsp/x86: yasmify vector_clipf_sse	Anton Khirnov	2016-09-22
\| \|
\| *	vp9/x86: rename vp9dsp to vp9mc	Anton Khirnov	2016-08-03
\| \| \| \| \| \| \| \|	It only contains the MC SIMD, other SIMD will go into different files.
* \|	Merge commit '1dfc3cf89d0eb026af28be46294b85d79499ffb5'	James Almer	2017-01-31
\|\\| \| \| \| \| \| \| \| \| \| \| \| \|	* commit '1dfc3cf89d0eb026af28be46294b85d79499ffb5': x86: hpeldsp: Split off VP3-specific bits into a separate file Merged-by: James Almer <jamrial@gmail.com>
\| *	x86: hpeldsp: Split off VP3-specific bits into a separate file	Diego Biurrun	2016-07-20
\| \|
\| *	hevc: Add AVX2 DC IDCT	James Almer	2016-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Originally written by Pierre Edouard Lepere <pierre-edouard.lepere@insa-rennes.fr>. Integrated to Libav by Josh de Kock <josh@itanimul.li>. Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
\| *	build: miscellaneous cosmetics	Diego Biurrun	2016-04-07
\| \| \| \| \| \| \| \| \| \| \| \|	Restore alphabetical order in lists, break overly long lines, do some prettyprinting, add some explanatory section comments, group parts together that belong together logically.
\| *	fft: Split MDCT bits off from FFT	Diego Biurrun	2016-03-01
\| \|
* \|	huffyuvencdsp: move shared functions to a new lossless_videoencdsp context	James Almer	2017-01-12
\| \| \| \| \| \| \| \|	Signed-off-by: James Almer <jamrial@gmail.com>
* \|	aacenc: add SIMD optimizations for abs_pow34 and quantization	Rostislav Pehlivanov	2016-10-18
\| \|	Performance improvements:
\| \|
\| \|	quant_bands:
\| \|	with:    681 decicycles in quant_bands, 8388453 runs, 155 skips
\| \|	without: 1190 decicycles in quant_bands, 8388386 runs, 222 skips
\| \|	Around 42% for the function
\| \|
\| \|	Twoloop coder:
\| \|	abs_pow34: with/without: 7.82s/8.17s
\| \|	Around 4% for the entire encoder
\| \|	Both: with/without: 7.15s/8.17s
\| \|	Around 12% for the entire encoder
\| \|
\| \|	Fast coder:
\| \|	abs_pow34: with/without: 3.40s/3.77s
\| \|	Around 10% for the entire encoder
\| \|	Both: with/without: 3.02s/3.77s
\| \|	Around 20% faster for the entire encoder
\| \|
\| \|	Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
\| \|	Tested-by: Michael Niedermayer <michael@niedermayer.cc>
\| \|	Reviewed-by: James Almer <jamrial@gmail.com>
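The percentages quoted in the commit message above appear to be relative reductions in cycle count or run time. A small C sketch of that arithmetic follows, on that assumption. The helper names are hypothetical and are not part of FFmpeg's timing macros.

```c
/* Percentage by which `after` is lower than `before` (cycles or seconds). */
static double reduction_percent(double before, double after)
{
    return 100.0 * (before - after) / before;
}

/* Speedup factor, as used for the "Nx speedup" figures elsewhere in this log. */
static double speedup_factor(double before, double after)
{
    return before / after;
}
```

For example, `reduction_percent(1190, 681)` comes out at roughly 42.8, consistent with the "Around 42%" quoted for quant_bands, and `reduction_percent(8.17, 7.15)` at roughly 12.5 for the twoloop coder with both optimizations.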
* \|	x86/ttaenc: add ff_ttaenc_filter_process_{ssse3,sse4}	James Almer	2016-08-02
\| \| \| \| \| \| \| \|	Signed-off-by: James Almer <jamrial@gmail.com>
* \|	x86/vc1dsp: Split the file into MC and loopfilter	Timothy Gu	2016-02-29
\| \|
* \|	Merge commit '15a24614aef5836af3cd2c7cc3b2b737eee6bf3c'	Derek Buitenhuis	2016-02-24
\|\\| \| \| \| \| \| \| \| \| \| \| \| \|	* commit '15a24614aef5836af3cd2c7cc3b2b737eee6bf3c': build: Add vc1dsp component for more fine-grained dependencies Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
\| *	build: Add vc1dsp component for more fine-grained dependencies	Diego Biurrun	2016-02-19
\| \|
\| *	x86: build: Group all encoder objects together	Diego Biurrun	2016-01-18
\| \|
\| *	hevcdsp: add x86 SIMD for MC	Anton Khirnov	2015-12-05
\| \|
* \|	x86/dcadec: add ff_lfe_fir0_float_{sse,sse2,avx,fma3}	James Almer	2016-02-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Up to ~4 times faster on x86_64, ~8 times on x86_32 if compiling using x87 fp math. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
* \|	dirac_dwt: Make x86 files/functions names consistent	Timothy Gu	2016-02-05
\| \|
* \|	diracdsp: Make x86 files/functions names consistent	Timothy Gu	2016-02-05
\| \|
* \|	avcodec/dca: add new decoder based on libdcadec	foo86	2016-01-31
\| \|
* \|	avcodec/dca: remove old decoder	foo86	2016-01-31
\| \| \| \| \| \| \| \| \| \|	Remove all files and functions that are not going to be reused, and temporarily disable the functions and FATE tests that will be.
* \|	avcodec/synth_filter: split off remaining code from dcadec files	James Almer	2016-01-25
\| \| \| \| \| \| \| \|	Signed-off-by: James Almer <jamrial@gmail.com>
* \|	x86/Makefile: move decoder/encoder objects out of the subsystems section	James Almer	2015-10-22
\| \| \| \| \| \| \| \|	Signed-off-by: James Almer <jamrial@gmail.com>
* \|	huffyuvencdsp: Convert ff_diff_bytes_mmx to yasm	Timothy Gu	2015-10-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Heavily based upon ff_add_bytes by Christophe Gisquet. Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Timothy Gu <timothygu99@gmail.com>
* \|	vp9: add 10/12bpp mmxext-optimized iwht_iwht_4x4 function.	Ronald S. Bultje	2015-10-13
\| \|
* \|	x86: simple_idct(_put): 10bits versions	Christophe Gisquet	2015-10-13
\| \|	Modeled from the prores version. Clips to [0;1023] and is bitexact.
\| \|	Bitexactness requires adding offsets in different places compared to prores or C, and makes the function approximately 2% slower.
\| \|
\| \|	For 16 frames of a DNxHD 4:2:2 10bits test sequence:
\| \|	C:    60861 decicycles in idct, 1048205 runs, 371 skips
\| \|	sse2: 27567 decicycles in idct, 1048216 runs, 360 skips
\| \|	avx:  26272 decicycles in idct, 1048171 runs, 405 skips
\| \|
\| \|	The add version is not implemented, so the corresponding dsp function is set to NULL to make this obvious to any code that tries to call it.
\| \|
\| \|	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* \|	avcodec/takdec: add x86 SIMD for rest of decorrelation modes	Paul B Mahol	2015-10-09
\| \| \| \| \| \| \| \|	Signed-off-by: Paul B Mahol <onemda@gmail.com>
* \|	x86/alacdsp: add simd optimized functions	James Almer	2015-10-06
\| \| \| \| \| \| \| \|	Signed-off-by: James Almer <jamrial@gmail.com>
* \|	vp9: 16bpp tm/dc/h/v intra pred simd (mostly sse2) functions.	Ronald S. Bultje	2015-10-03
\| \|
* \|	vp9: sse2/ssse3/avx 16bpp loopfilter x86 simd.	Ronald S. Bultje	2015-10-03
\| \|
* \|	x86/hevc_sao: move 10/12bit functions into a separate file	James Almer	2015-09-30
\| \| \| \| \| \| \| \| \| \|	Tested-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>
* \|	vp9: add subpel MC SIMD for 10/12bpp.	Ronald S. Bultje	2015-09-16
\| \|
* \|	vp9: add fullpel (put) MC SIMD for 10/12bpp.	Ronald S. Bultje	2015-09-16
\| \|
* \|	Merge commit 'cad40a3833ad81a352e7657ec6f7d637cea3b798'	Hendrik Leppkes	2015-09-05
\|\\| \| \| \| \| \| \| \| \| \| \| \| \|	* commit 'cad40a3833ad81a352e7657ec6f7d637cea3b798': lavc: Drop deprecated deinterlace module Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>