libav.git - [no description]

	Commit message	Author	Age
*	avcodec/x86: allow future 8-bit simple idct to use slightly different coefficients	James Darnley	2017-06-20
\|
*	avcodec/x86: modify simple_idct10 macros to add an action parameter	James Darnley	2017-06-20
\|
*	avcodec/x86: cleanup simple_idct10	James Darnley	2017-06-20
\| \| \| \| \| \|	Use named arguments for the functions so we can remove a define. The stride/linesize argument is now ptrdiff_t type so we no longer need to sign extend the register.
*	avcodec/x86/mpegenc: support transpose permutation type	James Darnley	2017-06-20
\|
*	avcodec/x86/mpegenc: check IDCT permutation type is a valid value	James Darnley	2017-06-20
\|
*	avcodec/x86/mpegvideo: Use intra scantable in dct_unquantize_h263_intra_mmx()	Michael Niedermayer	2017-06-20
\| \| \| \|	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	x86/aacpsdsp: add ff_ps_hybrid_analysis_ileave_sse	James Almer	2017-06-18
\| \| \| \|	About 2x faster than the c version.
*	x86/aacpsdsp: add ff_ps_hybrid_synthesis_deint_{sse,sse4}	James Almer	2017-06-18
\| \| \| \|	About 2x faster than the c version.
*	avcodec/aacps: move checks for valid length outside the stereo_interpolate dsp function	James Almer	2017-06-15
\|	Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/vorbisdsp: optimize ff_vorbis_inverse_coupling_sse	James Almer	2017-06-15
\| \| \| \|	About 7% faster.
*	vp9: fix overwrite in ff_vp9_ipred_dr_16x16_16_avx2.	Ronald S. Bultje	2017-06-14
\| \| \| \|	Fixes trac issue 6459.
*	avcodec/vp9: ipred_dr_16x16_16 avx2 implementation	Ilia Valiakhmetov	2017-06-12
\| \| \| \| \|	Signed-off-by: Ilia Valiakhmetov <zakne0ne@gmail.com> Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
*	x86/aacpsdsp: fix output of ff_ps_stereo_interpolate_ipdopd_sse3	James Almer	2017-06-07
\| \| \| \|	The fate-aac-al_sbr_ps_04_ur test did not detect this mistake.
*	libavcodec/vp9: ipred_dl_32x32_16 avx2 implementation	Ilia Valiakhmetov	2017-06-06
\|	vp9_diag_downleft_32x32_8bpp_c: 580.2
\|	vp9_diag_downleft_32x32_8bpp_sse2: 75.6
\|	vp9_diag_downleft_32x32_8bpp_ssse3: 73.7
\|	vp9_diag_downleft_32x32_8bpp_avx: 72.7
\|	vp9_diag_downleft_32x32_10bpp_c: 1101.2
\|	vp9_diag_downleft_32x32_10bpp_sse2: 145.4
\|	vp9_diag_downleft_32x32_10bpp_ssse3: 137.5
\|	vp9_diag_downleft_32x32_10bpp_avx: 134.8
\|	vp9_diag_downleft_32x32_10bpp_avx2: 94.0
\|	vp9_diag_downleft_32x32_12bpp_c: 1108.5
\|	vp9_diag_downleft_32x32_12bpp_sse2: 145.5
\|	vp9_diag_downleft_32x32_12bpp_ssse3: 137.3
\|	vp9_diag_downleft_32x32_12bpp_avx: 135.2
\|	vp9_diag_downleft_32x32_12bpp_avx2: 94.0
\|
\|	~30% faster than avx implementation
\|
\|	Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
*	x86/aacpsdsp: optimize ff_ps_mul_pair_single_sse	James Almer	2017-06-04
\| \| \| \|	~2% faster.
*	x86/aacpsdsp: optimize ff_ps_stereo_interpolate_sse3	James Almer	2017-06-03
\| \| \| \| \| \| \|	Move the unpacking outside of the loop. 5% to 10% faster. Suggested-by: ubitux Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/aacps: add ff_ps_stereo_interpolate_ipdopd_sse3()	James Almer	2017-06-02
\| \| \| \| \| \|	About 2x faster than the c version. Signed-off-by: James Almer <jamrial@gmail.com>
*	avcodec/x86/idctdsp_init: reindent	James Darnley	2017-05-30
\|
*	avcodec/x86: move simple_idct to external assembly	James Darnley	2017-05-30
\|
*	lavc/mpegvideoenc: reformat inv_zigzag_direct16 so the zigzag pattern is visible	Clément Bœsch	2017-05-19
\|
*	Merge commit 'b4a911c189962e563a09fb0efaf6fa9ab56263a4'	Clément Bœsch	2017-05-19
\|\	* commit 'b4a911c189962e563a09fb0efaf6fa9ab56263a4':
\| \|	  mpegvideoenc: make a table const
\| \|
\| \|	Merged-by: Clément Bœsch <u@pkh.me>
\| *	mpegvideoenc: make a table const	Anton Khirnov	2017-01-19
\| \|
* \|	avcodec/h264: add sse2 versions of previous idct functions	James Darnley	2017-05-15
\| \|	Kaby Lake Pentium:
\| \|	- ff_h264_idct_add_8_sse2: ~1.18x faster than mmxext
\| \|	- ff_h264_idct_dc_add_8_sse2: ~1.07x faster than mmxext
* \|	avcodec/h264: add avx 8-bit h264_idct_dc_add	James Darnley	2017-05-15
\| \|	Haswell:
\| \|	- 1.02x faster (405±0.7 vs. 397±0.8 decicycles) compared with mmxext
\| \|	Skylake-U:
\| \|	- 1.06x faster (498±1.8 vs. 470±1.3 decicycles) compared with mmxext
* \|	avcodec/h264: add avx 8-bit h264_idct_add	James Darnley	2017-05-15
\| \|	Haswell:
\| \|	- 1.11x faster (522±0.4 vs. 469±1.8 decicycles) compared with mmxext
\| \|	Skylake-U:
\| \|	- 1.21x faster (671±5.5 vs. 555±1.4 decicycles) compared with mmxext
* \|	avcodec/h264: use some 3 operand forms	James Darnley	2017-05-15
\| \|
* \|	avcodec/h264: change RETs into REP_RETs where appropriate	James Darnley	2017-05-15
\| \|
* \|	avcodec/x86/vc1dsp_init: Fix build failure with --disable-optimizations and clang	Michael Niedermayer	2017-04-27
\| \|	compilers doing DCE at -O0 do not necessarily understand "complex" boolean expressions
\| \|
\| \|	Build succeeds with this change, this was the only failure
\| \|
\| \|	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* \|	Merge commit '0a35f128f3c6e0ae9a0a2236c557602c108da269'	Clément Bœsch	2017-04-08
\|\\|	* commit '0a35f128f3c6e0ae9a0a2236c557602c108da269':
\| \|	  cabac: x86: Give optimizations header a more meaningful name
\| \|
\| \|	Merged-by: Clément Bœsch <u@pkh.me>
\| *	cabac: x86: Give optimizations header a more meaningful name	Diego Biurrun	2016-12-01
\| \|
* \|	x86/idctdsp_init: reindent.	Ronald S. Bultje	2017-04-06
\| \|
* \|	x86/simple_idct: add explicit sse2 simple_idct_put/add versions.	Ronald S. Bultje	2017-04-06
\| \| \| \| \| \| \| \| \| \| \| \|	These use the mmx IDCT, but sse2 put/add_pixels_clamped implementations. This way we don't need to use the ff_put/add_pixels_clamped function pointers.
* \|	cavs: add a sse2 idct implementation.	Ronald S. Bultje	2017-04-06
\| \| \| \| \| \| \| \| \| \|	This makes using the function pointer ff_add_pixels_clamped() unnecessary, since we always know what the best implementation is at compile-time.
* \|	cavs: convert idct from inline asm to yasm.	Ronald S. Bultje	2017-04-06
\| \|
* \|	x86/xvididct: remove use of ff_put/add_pixels_clamped function pointer.	Ronald S. Bultje	2017-04-06
\| \| \| \| \| \| \| \| \| \|	Since there's separate SSE2 implementations of xvid_idct_put/add, this patch has no practical impact on performance.
* \|	x86/hevc_add_res: merge last remaining changes from 3d6535983282bea542dac2e568ae50da5796be34	James Almer	2017-03-31
\| \|	See https://lists.libav.org/pipermail/libav-devel/2016-October/079829.html
* \|	Merge commit '0361e4dcb4d394c88c33364415a3b8fe315b67d1'	Clément Bœsch	2017-03-31
\|\\|	* commit '0361e4dcb4d394c88c33364415a3b8fe315b67d1':
\| \|	  h264_qpel: x86: Move function with only one instance out of template macro
\| \|
\| \|	Note: warning is present with clang.
\| \|
\| \|	Merged-by: Clément Bœsch <cboesch@gopro.com>
\| *	h264_qpel: x86: Move function with only one instance out of template macro	Diego Biurrun	2016-11-08
\| \| \| \| \| \| \| \|	libavcodec/x86/h264_qpel.c:392:785: warning: unused function 'ff_avg_h264_qpel8or16_hv1_lowpass_mmxext' [-Wunused-function]
\| *	x86: Drop stray semicolons after function definitions	Diego Biurrun	2016-11-05
\| \|	libavcodec/x86/rv40dsp_init.c:97:2: warning: ISO C does not allow extra ‘;’ outside of a function [-Wpedantic]
\| \|	libavcodec/x86/vp9dsp_init.c:94:40: warning: ISO C does not allow extra ‘;’ outside of a function [-Wpedantic]
\| *	vp9: Flip the order of arguments in MC functions	Martin Storsjö	2016-11-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This makes it match the pattern already used for VP8 MC functions. This also makes the signature match ffmpeg's version of these functions, easing porting of code in both directions. Signed-off-by: Martin Storsjö <martin@martin.st>
* \|	vp9: re-split the decoder/format/dsp interface header files.	Ronald S. Bultje	2017-03-28
\| \| \| \| \| \| \| \| \| \|	The advantage here is that the internal software decoder interface is not exposed to the DSP functions or the hardware accelerations.
* \|	lavc/vp9: split into vp9{block,data,mvs}	Clément Bœsch	2017-03-27
\| \| \| \| \| \| \| \|	This is following Libav layout to ease merges.
* \|	avcodec/x86/idctdsp: Remove duplicate include	Michael Niedermayer	2017-03-26
\| \| \| \| \| \| \| \|	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* \|	x86/hevc_add_res: merge missing changes from 3d6535983282bea542dac2e568ae50da5796be34	James Almer	2017-03-24
\| \|	Unrolling the loops triplicates the size of the assembled output while not generating any gain in performance.
* \|	Merge commit '6d5636ad9ab6bd9bedf902051d88b7044385f88b'	Clément Bœsch	2017-03-24
\|\\|	* commit '6d5636ad9ab6bd9bedf902051d88b7044385f88b':
\| \|	  hevc: x86: Add add_residual() SIMD optimizations
\| \|
\| \|	See a6af4bf64dae46356a5f91537a1c8c5f86456b37
\| \|
\| \|	This merge is only cosmetics (renames, space shuffling, etc).
\| \|
\| \|	The functional changes in the ASM are not merged:
\| \|	- unrolling with %rep is kept
\| \|	- ADD_RES_MMX_4_8 is left untouched: this needs investigation
\| \|
\| \|	Merged-by: Clément Bœsch <u@pkh.me>
\| *	hevc: x86: Add add_residual() SIMD optimizations	Pierre Edouard Lepere	2016-10-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Initially written by Pierre Edouard Lepere <Pierre-Edouard.Lepere@insa-rennes.fr>, extended by James Almer <jamrial@gmail.com>. Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
\| *	audiodsp: x86: Remove pointless header file	Diego Biurrun	2016-10-19
\| \| \| \| \| \| \| \| \| \|	Its single forward declaration can be moved to the only place it is used, like is done for all other dsp init files.
* \|	lavc/x86/hevc: rename hevc_res_add to hevc_add_res	Clément Bœsch	2017-03-24
\| \| \| \| \| \| \| \|	This will simplify incoming merge.
* \|	Merge commit 'b89804da9bad2d94dd95bf20ac6187447e9c17e9'	James Almer	2017-03-23
\|\\|	* commit 'b89804da9bad2d94dd95bf20ac6187447e9c17e9':
\| \|	  x86: videodsp: Add parentheses to expression to work around warning
\| \|
\| \|	Merged-by: James Almer <jamrial@gmail.com>
\| *	x86: videodsp: Add parentheses to expression to work around warning	Diego Biurrun	2016-10-19
\| \| \| \| \| \| \| \|	libavcodec/x86/videodsp.asm:128: warning: signed dword value exceeds bounds