summaryrefslogtreecommitdiff
path: root/libavcodec/arm/rv34dsp_neon.S
Commit message (Collapse)AuthorAge
* Drop DCTELEM typedefDiego Biurrun2013-01-22
| | | | | | It does not help as an abstraction and adds dsputil dependencies. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* ARM: Move asm.S from libavcodec to libavutilJustin Ruggles2012-06-08
| | | | | This will allow for easier implementation of ARM-optimized functions in libraries other than libavcodec.
* rv34: add NEON rv34_idct_addJanne Grunau2012-01-16
| | | | | | | Overall almost 4% faster, idct_add down from 350 to 85 cycles, idct_dc_add down from 83 to 30 cycles. squash: rv34 idct rearrange partial register loads
* rv34: 1-pass inter MB reconstructionChristophe GISQUET2012-01-16
| | | | Implement 1-pass inverse transform and reconstruction for inter blocks.
* ARM: rv34: fix asm syntax in dc transform functionsMans Rullgard2012-01-12
| | | | | Signed-off-by: Mans Rullgard <mans@mansr.com> Signed-off-by: Janne Grunau <janne-libav@jannau.net>
* rv34: NEON optimised dc only inverse transformJanne Grunau2012-01-12
| | | | | 30-50% faster than the C implementation, 0.5% overall speedup on bourne.rmvb.
* rv34: joint coefficient decoding and dequantizationChristophe GISQUET2012-01-04
| | | | | | | | | | | Perform dequantization while decoding coefficients instead of performing it on the entire coefficients buffer. Since quantized coefficients are very sparse, this usually causes a small speedup. Speedup of around 1% on Panda board compared to the removed here neon code. Global speedup is probably around 3%. Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>
* rv34: NEON optimised 4x4 dequantMans Rullgard2011-12-13
| | | | Signed-off-by: Mans Rullgard <mans@mansr.com>
* rv34: NEON optimised inverse transform functionsJanne Grunau2011-12-06
Signed-off-by: Mans Rullgard <mans@mansr.com>