path: root/libswresample/resample_template.c
Commit message | Author | Age
* swresample/resample_template: Add filter values in parallel | Michael Niedermayer | 2016-12-10
    This is faster 2871 -> 2189 cycles for int16 matrixbench -> 23456hz
    Fixes an integer overflow in an artificial corner case
    Fixes part of 668007-media
    Found-by: Matt Wolenetz <wolenetz@google.com>
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
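To illustrate what "adding filter values in parallel" can look like, here is a minimal sketch, not the code from the commit. The function name `dot_two_accumulators` and the choice of 64-bit accumulators are assumptions made for this example. The idea shown is splitting one running sum into two independent sums so that consecutive additions do not depend on each other.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for illustration only (not the libswresample code):
 * multiply-accumulate filter taps against int16 samples using two
 * independent accumulators. */
static int64_t dot_two_accumulators(const int16_t *src,
                                    const int16_t *filter, int len)
{
    int64_t acc0 = 0, acc1 = 0;
    int i;

    assert(len >= 0);

    /* Even and odd taps feed separate sums, so the two additions in one
     * iteration have no data dependency on each other. */
    for (i = 0; i + 1 < len; i += 2) {
        acc0 += (int64_t)src[i]     * filter[i];
        acc1 += (int64_t)src[i + 1] * filter[i + 1];
    }

    /* Odd length: one tap is left over. */
    if (i < len)
        acc0 += (int64_t)src[i] * filter[i];

    return acc0 + acc1;
}
```

Widening each product to 64 bits before accumulating is one way to keep the sums from overflowing; whether the commit fixed its overflow this way is not stated in the message above.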
* swresample/resample_template: Reorder operations to avoid one addition | Michael Niedermayer | 2016-12-10
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swresample: add exact_rational option | Muhammad Faiz | 2016-06-13
    give high quality resampling
    as good as with linear_interp=on
    as fast as without linear_interp=on

    tested visually with ffplay
    ffplay -f lavfi "aevalsrc='sin(10000*t*t)', aresample=osr=48000, showcqt=gamma=5"
    ffplay -f lavfi "aevalsrc='sin(10000*t*t)', aresample=osr=48000:linear_interp=on, showcqt=gamma=5"
    ffplay -f lavfi "aevalsrc='sin(10000*t*t)', aresample=osr=48000:exact_rational=on, showcqt=gamma=5"

    slight speed improvement
    for fair comparison with -cpuflags 0
    audio.wav is ~ 1 hour 44100 stereo 16bit wav file
    ffmpeg -i audio.wav -af aresample=osr=48000 -f null -

            old        new
    real    13.498s    13.121s
    user    13.364s    12.987s
    sys     0.131s     0.129s

    linear_interp=on
            old        new
    real    23.035s    23.050s
    user    22.907s    22.917s
    sys     0.119s     0.125s

    exact_rational=on
    real    12.418s
    user    12.298s
    sys     0.114s

    possibility to decrease memory usage if soft compensation is ignored
    Signed-off-by: Muhammad Faiz <mfcc64@gmail.com>
* swr/resample: use av_clip functions | James Almer | 2015-04-05
    Signed-off-by: James Almer <jamrial@gmail.com>
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swresample/resample_template: Add () to protect the arguments of the OUT() macro | Michael Niedermayer | 2015-02-17
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr: initialize only the necessary resample dsp functions | James Almer | 2014-07-04
    Signed-off-by: James Almer <jamrial@gmail.com>
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* Partially revert "swr: add prototypes for resample dsp functions" | James Almer | 2014-07-02
    Prototypes are not needed anymore now that the x86 functions don't include resample_template.c
    The DO_RESAMPLE_ONE macro is removed for that same reason as well.
    Signed-off-by: James Almer <jamrial@gmail.com>
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* x86/swr: convert resample_{common, linear}_double_sse2 to yasm | James Almer | 2014-07-01
    Signed-off-by: James Almer <jamrial@gmail.com>
    312531 -> 311528 dezicycles
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr: convert resample_common/linear_int16_mmx2/sse2 to yasm. | Ronald S. Bultje | 2014-06-30
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swresample/resample_template: move division out of loop for float/double swri_resample_linear() | Michael Niedermayer | 2014-06-30
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swresample/resample_template: flip order of operations in swri_resample_linear() for 32bit | Michael Niedermayer | 2014-06-29
    Fixes integer overflow
    Found-by: BBB
    Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr: rewrite resample_common/linear_float_sse/avx in yasm. | Ronald S. Bultje | 2014-06-28
    Linear interpolation goes from 63 (llvm) or 58 (gcc) to 48 (yasm) cycles/sample on 64bit, or from 66 (llvm/gcc) to 52 (yasm) cycles/sample on 32bit.
    Non-linear goes from 43 (llvm) or 38 (gcc) to 32 (yasm) cycles/sample on 64bit, or from 46 (llvm) or 44 (gcc) to 38 (yasm) cycles/sample on 32bit (all testing on OSX 10.9.2, llvm 5.1 and gcc 4.8/9).
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr: remove another forgotten division in DSP function. | Ronald S. Bultje | 2014-06-22
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr: remove div/mod from DSP functions. | Ronald S. Bultje | 2014-06-18
    Also fix a bug with resample_compensation resetting dst_incr.
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
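As a rough illustration of how a per-sample division and modulo can be taken out of an inner loop, the sketch below splits the step into a whole-sample part and a phase part once, then advances with an add and a conditional carry. The `Position` struct, the `advance` function and their field names are hypothetical and not taken from libswresample.

```c
#include <assert.h>

/* Hypothetical position state for illustration only. */
typedef struct {
    int index;   /* current filter phase, 0 <= index < phase_count */
    int sample;  /* current input sample position */
} Position;

/* Advance by one output sample without dividing.
 * step_samples and step_phases are assumed to be precomputed once,
 * outside the loop, as (step / phase_count) and (step % phase_count). */
static void advance(Position *p, int step_samples, int step_phases,
                    int phase_count)
{
    assert(phase_count > 0);
    assert(step_phases >= 0 && step_phases < phase_count);

    p->sample += step_samples;
    p->index  += step_phases;

    /* At most one carry is possible because both terms are < phase_count. */
    if (p->index >= phase_count) {
        p->index -= phase_count;
        p->sample++;
    }
}
```

For a 44100 Hz to 48000 Hz conversion reduced to 147/160, a caller would pass `step_samples = 0`, `step_phases = 147` and `phase_count = 160` on every iteration.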
* swr: reindent. | Ronald S. Bultje | 2014-06-16
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr: add prototypes for resample dsp functions | James Almer | 2014-06-15
    Should fix compilation failures with MSVC and any other compiler without inline asm support.
    Signed-off-by: James Almer <jamrial@gmail.com>
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr: split out DSP functions. | Ronald S. Bultje | 2014-06-14
    DSP bits of swri_resample go into their own mini-DSP functions;
    DSP init goes from a per-call branch in multiple_resample to a proper DSP init routine;
    x86 bits go into x86/;
    swri_resample() moves out of resample_template.c into resample.c because it's independent of DSP code or sample type;
    multiple_resample() is simplified.
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
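The move from a per-call branch to a DSP init routine can be pictured with the sketch below. Every name in it (`ResampleDSP`, `SampleFormat`, `resample_dsp_init`, `dot_s16`, `dot_flt`) is hypothetical and chosen for this example; it does not reproduce the libswresample structures. The sample-type decision is made once at init time and stored as a function pointer, so the hot path only does an indirect call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample formats and DSP context, for illustration only. */
typedef enum { FMT_S16, FMT_FLT } SampleFormat;

typedef struct {
    /* Selected once by resample_dsp_init(), then called per output sample. */
    double (*dot)(const void *src, const void *filter, int len);
} ResampleDSP;

static double dot_s16(const void *src, const void *filter, int len)
{
    const int16_t *s = src, *f = filter;
    double acc = 0.0;
    for (int i = 0; i < len; i++)
        acc += (double)s[i] * f[i];
    return acc;
}

static double dot_flt(const void *src, const void *filter, int len)
{
    const float *s = src, *f = filter;
    double acc = 0.0;
    for (int i = 0; i < len; i++)
        acc += (double)s[i] * f[i];
    return acc;
}

/* Pick the implementation once; returns 0 on success, -1 if unsupported. */
static int resample_dsp_init(ResampleDSP *dsp, SampleFormat fmt)
{
    assert(dsp != NULL);
    switch (fmt) {
    case FMT_S16: dsp->dot = dot_s16; break;
    case FMT_FLT: dsp->dot = dot_flt; break;
    default:      dsp->dot = NULL;    return -1;
    }
    return 0;
}
```

A platform-specific init step could later overwrite the same pointers with optimized versions, which is one common reason for this layout.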
* swr: handle initial negative sample index outside DSP function. | Ronald S. Bultje | 2014-06-14
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr: remove unnecessary assignment. | Ronald S. Bultje | 2014-06-14
    I don't see dst_incr/dst_incr_frac ever being changed from their initial value (which is the inverse of this operation), so it seems to me that this is a no-op.
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr: handle 64bit overflow check in multiple_resample(). | Ronald S. Bultje | 2014-06-09
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr: move compensation_distance handling to swri_resample caller. | Ronald S. Bultje | 2014-06-02
    I think there's an off-by-one in terms of the switchpoint where we switch from dst_incr to ideal_dst_incr, I don't think that's a massive issue, but just be aware of that. It's probably trivial to prevent but I don't care.
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
    I could not reproduce any off by 1 error, results are bit exact (michael)
* swr/resample_template: prevent end_index from overflowing and add check for delta_frac overflow | Michael Niedermayer | 2014-06-02
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* Rewrite main resampling loop (common and linear). | Ronald S. Bultje | 2014-06-02
    This removes a branch at a performance-sensitive point (in the middle of the loop). In fate-swr-resample-s32p-8000-2626, this makes the code about 10% faster. It also simplifies the loops, allowing us to rewrite it in yasm at some later point.

    The compensation_distance != 0 code and index < 0 code are still kind of hairy. For compensation_distance != 0, this should likely be handled in the caller, so that it calls swri_resample twice (once until the dst_incr switch-point, and once with the remainder of the samples). For index < 0, the code should probably be rewritten to break out of the loop once sample_index >= 0, and then resume (e.g. as a tail-call) to the common or linear resampling loops.
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swresample: add swri_resample_float_avx | James Almer | 2014-05-16
    Signed-off-by: James Almer <jamrial@gmail.com>
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swresample: add swri_resample_double_sse2 | James Almer | 2014-04-25
    Signed-off-by: James Almer <jamrial@gmail.com>
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swresample/resample_template: try to consider src_size more exactly | Michael Niedermayer | 2014-04-15
    This should avoid slight differences in the output caused by input size alignment differences between archs
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swresample/resample: simplify index/consumed calculation for the filter = 1 case | Michael Niedermayer | 2014-04-14
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swresample/resample: Fix fractional part of index in the filter_size = 1 filters = 1 case | Michael Niedermayer | 2014-04-14
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swresample/resample: sse float linear interpolation | James Almer | 2014-03-24
    About two times faster
    Signed-off-by: James Almer <jamrial@gmail.com>
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swresample/resample: mmx2/sse2 int16 linear interpolation | James Almer | 2014-03-24
    About three times faster
    Signed-off-by: James Almer <jamrial@gmail.com>
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swresample: add swri_resample_float_sse | James Almer | 2014-03-20
    At least two times faster than the C version.
    Signed-off-by: James Almer <jamrial@gmail.com>
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swresample: reuse COMMON_CORE asm where possible | James Almer | 2014-03-18
    Signed-off-by: James Almer <jamrial@gmail.com>
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swresample: change COMMON_CORE_INT16 asm from SSSE3 to SSE2 | James Almer | 2014-03-18
    pshuf+paddd is slightly faster than phaddd. The real gain is in pre-ssse3 processors like AMD K8 and K10, which get a big boost in performance compared to the mmxext version
    Signed-off-by: James Almer <jamrial@gmail.com>
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr/resample: fix integer overflow, add missing cast | Michael Niedermayer | 2013-02-04
    The effects of this are limited to numeric errors in the output
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* resample: remove disabled debug code | Michael Niedermayer | 2012-12-06
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr/resample: move templating parameters to template itself. | Clément Bœsch | 2012-11-15
    It has various benefits such as allowing some refactoring, clarifying the code in the inclusion part, and making the template understandable in standalone.
    This commit is based on the templating method used by Justin Ruggles for libavresample.
* swr: move if() block into the only branch where it can be true. | Michael Niedermayer | 2012-11-15
    This should make the code a tiny tiny bit faster.
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr: reorder/redesign operations to avoid integer overflow. | Michael Niedermayer | 2012-11-15
    This fixes an out of array read.
    Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr: MMX2 & SSSE3 int16 resample core | Michael Niedermayer | 2012-06-28
    about 4 times faster
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr: introduce filter_alloc in preparation of SIMD resample optimisations | Michael Niedermayer | 2012-06-19
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr/resample: optimize C code for the most common case | Michael Niedermayer | 2012-06-19
    15% speedup
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* resample_template: use av_assert | Michael Niedermayer | 2012-06-06
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* swr: support float & int32 in the resampler | Michael Niedermayer | 2012-04-10
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>