| Commit message (Collapse) | Author | Age |
| |
|
|
|
|
| |
Overall speed changes for 1920x1080, yuv422p10le, 60fps from: 0.19x to 0.343x
|
|
|
|
| |
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
58893 decicycles in deemphasis_c, 130548 runs, 524 skips
9475 decicycles in deemphasis_fma3, 130686 runs, 386 skips -> 6.21x speedup
24866 decicycles in postfilter_c, 65386 runs, 150 skips
5268 decicycles in postfilter_fma3, 65505 runs, 31 skips -> 4.72x speedup
Total decoder speedup: ~14%
Deemphasis SIMD based on the following unrolling:
const float c1 = CELT_EMPH_COEFF, c2 = c1*c1, c3 = c2*c1, c4 = c3*c1;
float state = coeff;
for (int i = 0; i < len; i += 4) {
y[0] = x[0] + c1*state;
y[1] = x[1] + c2*state + c1*x[0];
y[2] = x[2] + c3*state + c1*x[1] + c2*x[0];
y[3] = x[3] + c4*state + c1*x[2] + c2*x[1] + c3*x[0];
state = y[3];
y += 4;
x += 4;
}
|
|
|
|
| |
The entire function was defined away before.
|
|
|
|
| |
Its only used in the encoder and in CELT's PVQ.
|
|
|
|
|
|
|
|
| |
This was originally based on libsbc, and was fully integrated into ffmpeg.
Rough speed test:
C version: speed= 592x
MMX version: speed= 785x
|
|
|
|
| |
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Explanation on the workings and methods used by the
Pyramid Vector Quantization Search function
could be found in the following Work-In-Progress mail threads:
http://ffmpeg.org/pipermail/ffmpeg-devel/2017-June/212146.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2017-June/212816.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213030.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213436.html
Signed-off-by: Ivan Kalvachev <ikalvachev@gmail.com>
|
|
|
|
| |
Signed-off-by: Paul B Mahol <onemda@gmail.com>
|
|
|
|
|
|
|
| |
c: 1802 decicycles in fft15,16774635 runs, 2581 skips
avx: 865 decicycles in fft15,16776378 runs, 838 skips
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
|
|
|
|
|
|
|
|
| |
None of them are specific to the YASM assembler.
(Cherry-picked from libav commit 39e208f4d4756367c7cd2d581847e0c1b8a429c1)
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
|
| |
|
|
|
|
| |
This will simplify incoming merge.
|
|\
| |
| |
| |
| |
| |
| | |
* commit 'b57e38f52cc3f31a27105c28887d57cd6812c3eb':
ac3dsp: x86: Replace inline asm for in-decoder downmixing with standalone asm
Merged-by: Clément Bœsch <u@pkh.me>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Adds a wrapper function for downmixing which detects channel count changes
and updates the selected downmix function accordingly.
Simplification and porting to current x86inc infrastructure by Diego Biurrun.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
|
| | |
|
| |
| |
| |
| | |
It only contains the MC SIMD, other SIMD will go into different files.
|
|\|
| |
| |
| |
| |
| |
| | |
* commit '1dfc3cf89d0eb026af28be46294b85d79499ffb5':
x86: hpeldsp: Split off VP3-specific bits into a separate file
Merged-by: James Almer <jamrial@gmail.com>
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
Originally written by Pierre Edouard Lepere <pierre-edouard.lepere@insa-rennes.fr>.
Integrated to Libav by Josh de Kock <josh@itanimul.li>.
Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
|
| |
| |
| |
| |
| |
| | |
Restore alphabetical order in lists, break overly long lines, do some
prettyprinting, add some explanatory section comments, group parts
together that belong together logically.
|
| | |
|
| |
| |
| |
| | |
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Performance improvements:
quant_bands:
with: 681 decicycles in quant_bands, 8388453 runs, 155 skips
without: 1190 decicycles in quant_bands, 8388386 runs, 222 skips
Around 42% for the function
Twoloop coder:
abs_pow34:
with/without: 7.82s/8.17s
Around 4% for the entire encoder
Both:
with/without: 7.15s/8.17s
Around 12% for the entire encoder
Fast coder:
abs_pow34:
with/without: 3.40s/3.77s
Around 10% for the entire encoder
Both:
with/without: 3.02s/3.77s
Around 20% faster for the entire encoder
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Reviewed-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| | |
Signed-off-by: James Almer <jamrial@gmail.com>
|
| | |
|
|\|
| |
| |
| |
| |
| |
| | |
* commit '15a24614aef5836af3cd2c7cc3b2b737eee6bf3c':
build: Add vc1dsp component for more fine-grained dependencies
Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
|
| | |
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
Up to ~4 times faster on x86_64, ~8 times on x86_32 if compiling using x87 fp math.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
| | |
|
| | |
|
| | |
|
| |
| |
| |
| |
| | |
Remove all files and functions which are not going to be reused,
and disable all functions and FATE tests temporarily which will be.
|
| |
| |
| |
| | |
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| | |
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| |
| |
| |
| | |
Heavily based upon ff_add_bytes by Christophe Gisquet.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Modeled from the prores version. Clips to [0;1023] and is bitexact.
Bitexactness requires to add offsets in different places compared to
prores or C, and makes the function approximately 2% slower.
For 16 frames of a DNxHD 4:2:2 10bits test sequence:
C: 60861 decicycles in idct, 1048205 runs, 371 skips
sse2: 27567 decicycles in idct, 1048216 runs, 360 skips
avx: 26272 decicycles in idct, 1048171 runs, 405 skips
The add version is not implemented, so the corresponding dsp
function is set to NULL to make it clear in a code executing it.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
| |
| |
| |
| | |
Signed-off-by: Paul B Mahol <onemda@gmail.com>
|
| |
| |
| |
| | |
Signed-off-by: James Almer <jamrial@gmail.com>
|
| | |
|
| | |
|
| |
| |
| |
| |
| | |
Tested-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
|
| | |
|
| | |
|
|\|
| |
| |
| |
| |
| |
| | |
* commit 'cad40a3833ad81a352e7657ec6f7d637cea3b798':
lavc: Drop deprecated deinterlace module
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
|