path: root/libavcodec/arm/fft_init_arm.c
Commit message (Author, Age)
* fft: Split MDCT bits off from FFT (Diego Biurrun, 2016-03-01)
* rdft: arm: Split RDFT initialization into a separate file (Diego Biurrun, 2016-02-26)
* fft: arm: Drop unnecessary #include, add missing ones (Diego Biurrun, 2016-02-26)
* arm: add a cpu flag for the VFPv2 vector mode (Janne Grunau, 2015-12-14)
    The vector mode was deprecated in ARMv7-A/VFPv3 and various cpu implementations do not support it in hardware. Depending on the OS, vector mode code will either be emulated in software or result in an illegal instruction on cpus which do not support it. This was not really a problem in practice since NEON implementations of the same functions are preferred. It will however become a problem for checkasm, which tests every cpu flag separately. Since this is a cpu feature that newer cpus no longer support, the behaviour of this flag differs from the other flags: it can only be activated by runtime cpu feature selection.
* armv6: Accelerate ff_fft_calc for general case (nbits != 4) (Ben Avison, 2014-07-18)
    The previous implementation targeted DTS Coherent Acoustics, which only requires nbits == 4 (fft16()). This case was (and still is) linked directly rather than being indirected through ff_fft_calc_vfp(), but now the full range from radix-4 up to radix-65536 is available. This benefits other codecs such as AAC and AC3.

    The implementation is based upon the C version, with each routine larger than radix-16 calling a hierarchy of smaller FFT functions, then performing a post-processing pass. This pass benefits a lot from loop unrolling to counter the long pipelines in the VFP. A relaxed calling standard also reduces the overhead of the call hierarchy, and avoiding the excessive inlining performed by GCC probably helps with I-cache utilisation too.

    I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in the FFT routines (fft4() to fft512() and pass()) for the same sample AAC stream:

                    Before           After
                    Mean    StdDev   Mean    StdDev  Confidence  Change
    Audio decode    2245.5  53.1     1599.6  43.8    100.0%      +40.4%
    FFT routines    940.6   22.0     348.1   20.8    100.0%      +170.2%

    Signed-off-by: Martin Storsjö <martin@martin.st>
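    The structure this commit message describes, in which a larger transform calls smaller transforms and then runs a combining pass, can be sketched as follows. This is only an illustration: it uses a plain radix-2 decimation-in-time recursion in Python, which is an assumption made for brevity and is not the decomposition, language, or calling convention of the VFP assembly. The names `fft_recursive`, `combine_pass`, and `naive_dft` are hypothetical.

```python
import cmath


def combine_pass(even, odd):
    """Post-processing pass: merge two half-size transforms using twiddle factors."""
    n = 2 * len(even)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft_recursive(x):
    """Hypothetical radix-2 FFT sketch; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    # Each size delegates to two half-size transforms, then combines them.
    return combine_pass(fft_recursive(x[0::2]), fft_recursive(x[1::2]))


def naive_dft(x):
    """O(n^2) reference transform, used only to check the recursive version."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]
```

    Comparing `fft_recursive(x)` against `naive_dft(x)` on a short power-of-two input is a quick way to check the recursion.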
* arm: dcadsp: Move synth filter initialization to dcadsp file (Diego Biurrun, 2013-08-29)
* arm: Add VFP-accelerated version of imdct_half (Martin Storsjö, 2013-07-22)
                     Before           After
                     Mean     StdDev  Mean     StdDev  Change
    This function    2653.0   28.5    1108.8   51.4    +139.3%
    Overall          17049.5  408.2   15973.0  223.2   +6.7%

    Signed-off-by: Martin Storsjö <martin@martin.st>
* arm: Add VFP-accelerated version of synth_filter_float (Ben Avison, 2013-07-22)
                     Before           After
                     Mean     StdDev  Mean     StdDev  Change
    This function    9295.0   114.9   4853.2   83.5    +91.5%
    Overall          23699.8  397.6   19285.5  292.0   +22.9%

    Signed-off-by: Martin Storsjö <martin@martin.st>
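    The Change figures in these benchmark tables appear consistent with dividing the "before" mean by the "after" mean and subtracting one. The commit messages do not state the formula, so the sketch below is an inference from the published numbers, and `percent_change` is a hypothetical helper name.

```python
def percent_change(before_mean, after_mean):
    """Speed-up in percent: how much longer the old code ran relative to the new code."""
    if after_mean <= 0:
        raise ValueError("after_mean must be positive")
    return (before_mean / after_mean - 1.0) * 100.0


# Means taken from the synth_filter_float table.
print(round(percent_change(9295.0, 4853.2), 1))    # 91.5
print(round(percent_change(23699.8, 19285.5), 1))  # 22.9
```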
* ARM: allow runtime masking of CPU features (Mans Rullgard, 2012-04-22)
    This allows masking CPU features with the -cpuflags avconv option, which is useful for testing different optimisations without rebuilding.

    Signed-off-by: Mans Rullgard <mans@mansr.com>
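    The idea of masking detected CPU features before choosing an implementation can be sketched as below. This is a simplified model in Python, not the libavutil API: the `CpuFlag` bit values, `effective_flags`, `pick_fft_impl`, and the preference order are all assumptions made for illustration.

```python
from enum import IntFlag


class CpuFlag(IntFlag):
    # Hypothetical flag bits; these are not the real libavutil constants.
    VFP = 1
    NEON = 2


def effective_flags(detected, mask):
    """Keep only the detected features that the user-supplied mask allows."""
    return detected & mask


def pick_fft_impl(flags):
    """Return the most specialised implementation whose flag is still set."""
    if flags & CpuFlag.NEON:
        return "neon"
    if flags & CpuFlag.VFP:
        return "vfp"
    return "c"
```

    With both features detected, passing a mask of `CpuFlag.VFP` alone makes `pick_fft_impl` fall back to the VFP path, which is the kind of testing without rebuilding that the commit message mentions.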
* ARM: fix build with FFT enabled and MDCT disabled (Felipe Contreras, 2012-01-20)
    Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
    Signed-off-by: Mans Rullgard <mans@mansr.com>
* Move dct and rdft definitions to separate files (Mans Rullgard, 2011-03-20)
    This leaves fft.h with only the core FFT and MDCT definitions, thus making it more manageable.

    Signed-off-by: Mans Rullgard <mans@mansr.com>
* Replace FFmpeg with Libav in licence headers (Mans Rullgard, 2011-03-19)
    Signed-off-by: Mans Rullgard <mans@mansr.com>
* FFT: factor a shuffle out of the inner loop and merge it into fft_permute. (Loren Merritt, 2011-02-13)
    6% faster SSE FFT on Conroe, 2.5% on Penryn.

    Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
* Remove unneeded add bias from 3 functions. (Justin Ruggles, 2011-01-31)
    DSPContext.vector_fmul_window()
    DCADSPContext.lfe_fir()
    SynthFilterContext.synth_filter_float()

    Signed-off-by: Mans Rullgard <mans@mansr.com>
* ARM: NEON optimised synth_filter_float (Måns Rullgård, 2010-04-10)
    2.7x faster DCA decoding on Cortex-A8

    Originally committed as revision 22828 to svn://svn.ffmpeg.org/ffmpeg/trunk
* ARM: NEON optimised RDFT (Måns Rullgård, 2010-03-23)
    Originally committed as revision 22641 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Move FFT parts from dsputil.h to fft.h (Måns Rullgård, 2010-03-06)
    Originally committed as revision 22235 to svn://svn.ffmpeg.org/ffmpeg/trunk
* ARM: interleave cos/sin tables for improved NEON MDCT (Måns Rullgård, 2009-09-21)
    Originally committed as revision 19940 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Merge FFTContext and MDCTContext (Måns Rullgård, 2009-09-20)
    Originally committed as revision 19931 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Move per-arch fft init bits into the corresponding subdirs (Måns Rullgård, 2009-09-15)
    Originally committed as revision 19864 to svn://svn.ffmpeg.org/ffmpeg/trunk