path: root/libavcodec/arm
Commit message  Author  Age
* lavc/arm: Use the neon vertical chroma loop filter also for H.264 4:2:2.  Carl Eugen Hoyos  2015-01-31
|
* Merge commit '4c81613df499ba81d64ea102b38d0c6686cc304c'  Michael Niedermayer  2014-12-10
|\
| |   * commit '4c81613df499ba81d64ea102b38d0c6686cc304c':
| |     arm: mlpdsp: handle pic offset calculation in a macro
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * arm: mlpdsp: handle pic offset calculation in a macro  Janne Grunau  2014-12-09
| |
| |   Makes the code easier to read since it hides different offset
| |   calculations for arm and thumb mode.
* | Merge commit '581c7f0e12b1fa39f73d683e54d6ecda0772c5a9'  Michael Niedermayer  2014-12-10
|\|
| |   * commit '581c7f0e12b1fa39f73d683e54d6ecda0772c5a9':
| |     arm: make ff_mlp_filter_channel_arm and ff_mlp_rematrix_channel_arm position independent
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * arm: make ff_mlp_filter_channel_arm and ff_mlp_rematrix_channel_arm position independent  Janne Grunau  2014-12-09
| |
| |   No significant difference in used cpu cycles on a cortex-a9.
* | Merge commit 'f963f80399deb1a2b44c1bac3af7123e8a0c9e46'  Michael Niedermayer  2014-12-09
|\|
| |   * commit 'f963f80399deb1a2b44c1bac3af7123e8a0c9e46':
| |     arm: Use .data.rel.ro for const data with relocations
| |
| |   Conflicts:
| |     configure
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * arm: Use .data.rel.ro for const data with relocations  Martin Storsjö  2014-12-09
| |
| |   Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit 'b280c6202b28b371a8d96850194fd69d7ad5dcc0'  Michael Niedermayer  2014-12-08
|\|
| |   * commit 'b280c6202b28b371a8d96850194fd69d7ad5dcc0':
| |     arm: fft_vfp: Unify the behaviour in ff_fft_calc_vfp between arm/thumb
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * arm: fft_vfp: Unify the behaviour in ff_fft_calc_vfp between arm/thumb  Martin Storsjö  2014-12-08
| |
| |   Don't include the function pointer table in the code segment in arm mode.
| |
| |   This shouldn't have any significant performance effect. It does end up
| |   as a few more instructions than before, for ARM, but only at the entry
| |   to this function, not within the fft functions themselves.
| |
| |   Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit 'ae81576414f2d2083d3118fb4abe1ebc5a7a4c54'  Michael Niedermayer  2014-12-08
|\|
| |   * commit 'ae81576414f2d2083d3118fb4abe1ebc5a7a4c54':
| |     arm: fft_vfp: Add a missing "endconst" when building in thumb mode
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * arm: fft_vfp: Add a missing "endconst" when building in thumb mode  Martin Storsjö  2014-12-08
| |
| |   Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit '9c12c6ff9539e926df0b2a2299e915ae71872600'  Michael Niedermayer  2014-11-24
|\|
| |   * commit '9c12c6ff9539e926df0b2a2299e915ae71872600':
| |     motion_est: convert stride to ptrdiff_t
| |
| |   Conflicts:
| |     libavcodec/me_cmp.c
| |     libavcodec/ppc/me_cmp.c
| |     libavcodec/x86/me_cmp_init.c
| |
| |   See: 9c669672c7fd45ef1cad782ab551be438ceac6cd
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * motion_est: convert stride to ptrdiff_t  Vittorio Giovara  2014-11-24
| |
| |   CC: libav-stable@libav.org
| |   Bug-Id: CID 700556 / CID 700557 / CID 700558
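The commit message does not say what the Coverity reports were. As a general illustration of why a stride is usually typed `ptrdiff_t`, here is a minimal sketch; `sad_8xh` and its parameters are hypothetical names, not FFmpeg functions. A `ptrdiff_t` stride is signed and pointer-sized, so it can be negative (walking a picture bottom-up) and is added to a pointer without any narrowing or sign conversion.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not an FFmpeg function): sum of absolute
 * differences over a block 8 pixels wide and h rows tall.
 * The stride is a ptrdiff_t, so it may be negative and is as wide
 * as a pointer on the target. */
int sad_8xh(const uint8_t *a, const uint8_t *b, ptrdiff_t stride, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < 8; x++)
            sum += abs(a[x] - b[x]);
        a += stride;   /* pointer + ptrdiff_t, no int conversion involved */
        b += stride;
    }
    return sum;
}
```

Calling it with a negative stride and a pointer to the last row visits the same rows in reverse order and yields the same sum.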
* | x86/flacdsp: add SSE2 and AVX decorrelate functions  James Almer  2014-11-13
| |
| |   Two to four times faster depending on instruction set, block size and
| |   channel count.
* | avcodec/idctdsp: change {put,add}_pixels_clamped to ptrdiff_t line_size  James Almer  2014-09-24
| |
| |   Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
| |   Signed-off-by: James Almer <jamrial@gmail.com>
* | Fix compile error on arm4/arm5 platform  Bernd Kuhls  2014-09-23
| |
| |   Since these commits
| |   http://git.videolan.org/?p=ffmpeg.git;a=commitdiff;h=adf8227cf4e7b4fccb2ad88e1e09b6dc00dd00ed
| |   http://git.videolan.org/?p=ffmpeg.git;a=commitdiff;h=db7f1c7c5a1d37e7f4da64a79a97bea1c4b6e9f8
| |   compilation on arm4/arm5 fails:
| |
| |   libavcodec/libavcodec.so: undefined reference to `ff_startcode_find_candidate_armv6'
| |
| |   Because libavcodec/arm/Makefile contains
| |
| |   ARMV6-OBJS-$(CONFIG_STARTCODE) += arm/startcode_armv6.o
| |
| |   function ff_startcode_find_candidate_armv6 is not included for older
| |   ARM archs.
| |
| |   The bug was found during automatic buildroot builds:
| |   http://autobuild.buildroot.net/results/ec7/ec71e4f16ee9106747dff5f15999cbd17903e76f//build-end.log
| |
| |   Quote from configure summary:
| |
| |   ARCH                      arm (armv4t)
| |   big-endian                no
| |   runtime cpu detection     yes
| |   ARMv5TE enabled           no
| |   ARMv6 enabled             no
| |   ARMv6T2 enabled           no
| |
| |   http://autobuild.buildroot.net/results/be7/be72eb182eaccf0064a32c9dfc2ac1c0d6555506/build-end.log
| |
| |   ARCH                      arm (armv5te)
| |   big-endian                no
| |   runtime cpu detection     yes
| |   ARMv5TE enabled           yes
| |   ARMv6 enabled             no
| |   ARMv6T2 enabled           no
| |
| |   This patch provides the necessary #if clauses as discussed with Michael:
| |   https://ffmpeg.org/pipermail/ffmpeg-devel/2014-September/163329.html
| |
| |   Signed-off-by: Bernd Kuhls <bernd.kuhls@t-online.de>
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
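The kind of `#if` guard this message describes can be sketched as follows. This is an illustration under assumptions, not the actual patch: `HAVE_ARMV6` stands in for a configure-generated macro (fixed to 0 here so the sketch builds anywhere), and `find_candidate_c`, `find_candidate_armv6` and `select_startcode_fn` are hypothetical names. The point is that a symbol that lives in an object file built only for ARMv6 may be referenced only when that object is actually linked.

```c
#include <stdint.h>

/* Assumption: HAVE_ARMV6 mimics a configure-generated macro that is 1
 * when ARMv6 objects are built and 0 otherwise. */
#ifndef HAVE_ARMV6
#define HAVE_ARMV6 0
#endif

typedef int (*startcode_fn)(const uint8_t *buf, int size);

/* Portable fallback: index of the first zero byte, or size if none. */
int find_candidate_c(const uint8_t *buf, int size)
{
    int i = 0;
    while (i < size && buf[i] != 0)
        i++;
    return i;
}

#if HAVE_ARMV6
/* Declared and referenced only when the ARMv6 object file is linked in. */
int find_candidate_armv6(const uint8_t *buf, int size);
#endif

/* Hypothetical init routine: pick the best implementation available. */
startcode_fn select_startcode_fn(void)
{
    startcode_fn fn = find_candidate_c;
#if HAVE_ARMV6
    fn = find_candidate_armv6;   /* never compiled on armv4t/armv5te builds */
#endif
    return fn;
}
```

Without the guard, the reference to the ARMv6 symbol survives into the final link on older targets and produces exactly the "undefined reference" error quoted in the message.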
* | Merge commit '95c0cec03acec0a80cc1c7db48f3b2355d9e767b'  Michael Niedermayer  2014-09-03
|\|
| |   * commit '95c0cec03acec0a80cc1c7db48f3b2355d9e767b':
| |     idctdsp: Add global function pointers for {add|put}_pixels_clamped functions
| |
| |   Conflicts:
| |     libavcodec/arm/idctdsp_init_arm.c
| |     libavcodec/dct.h
| |     libavcodec/idctdsp.c
| |     libavcodec/jrevdct.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * idctdsp: Add global function pointers for {add|put}_pixels_clamped functions  Diego Biurrun  2014-09-02
| |
| |   These function pointers already existed in the ARM code. Adding them
| |   globally allows calls to the function pointers to access arch-optimized
| |   versions of the functions transparently.
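A minimal sketch of the global-function-pointer pattern this message refers to. All names (`put_pixels_clamped_c`, `put_pixels_clamped_ptr`, `idct_put_stub`) are illustrative assumptions rather than the real FFmpeg symbols: a pointer defaults to the C implementation, an architecture init routine may overwrite it, and callers go through the pointer without knowing which version is installed.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an int to the 0..255 pixel range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Generic C version: store an 8x8 block of coefficients as clamped pixels. */
void put_pixels_clamped_c(const int16_t *block, uint8_t *pixels,
                          ptrdiff_t line_size)
{
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++)
            pixels[x] = clip_u8(block[y * 8 + x]);
        pixels += line_size;
    }
}

/* Global pointer (hypothetical name). It starts out as the C version;
 * an arch-specific init function could replace it with an optimized one. */
void (*put_pixels_clamped_ptr)(const int16_t *, uint8_t *, ptrdiff_t) =
    put_pixels_clamped_c;

/* A caller that transparently uses whichever version is installed. */
void idct_put_stub(uint8_t *dst, ptrdiff_t line_size, const int16_t *block)
{
    put_pixels_clamped_ptr(block, dst, line_size);
}
```

The design choice is that the caller never names an architecture; swapping implementations is a single pointer assignment at init time.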
* | Merge commit 'efd26bedec9a345a5960dbfcbaec888418f2d4e6'  Michael Niedermayer  2014-08-15
|\|
| |   * commit 'efd26bedec9a345a5960dbfcbaec888418f2d4e6':
| |     build: Add explanatory comments to (optimization) blocks in the Makefiles
| |
| |   Conflicts:
| |     libavcodec/ppc/Makefile
| |     libavcodec/x86/Makefile
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * build: Add explanatory comments to (optimization) blocks in the Makefiles  Diego Biurrun  2014-08-15
| |
* | Merge commit '835f798c7d20bca89eb4f3593846251ad0d84e4b'  Michael Niedermayer  2014-08-15
|\|
| |   * commit '835f798c7d20bca89eb4f3593846251ad0d84e4b':
| |     mpegvideo: cosmetics: Lowercase ugly uppercase MPV_ function name prefixes
| |
| |   Conflicts:
| |     libavcodec/h261dec.c
| |     libavcodec/intrax8.c
| |     libavcodec/mjpegenc.c
| |     libavcodec/mpeg12dec.c
| |     libavcodec/mpeg12enc.c
| |     libavcodec/mpeg4videoenc.c
| |     libavcodec/mpegvideo.c
| |     libavcodec/mpegvideo.h
| |     libavcodec/mpegvideo_enc.c
| |     libavcodec/rv10.c
| |     libavcodec/x86/mpegvideoenc.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * mpegvideo: cosmetics: Lowercase ugly uppercase MPV_ function name prefixes  Diego Biurrun  2014-08-15
| |
* | avcodec/idctdsp: make add/put_pixels_clamped_c internal functions  James Almer  2014-08-13
| |
| |   This reduces code duplication and differences with the fork.
| |
| |   Signed-off-by: James Almer <jamrial@gmail.com>
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | avcodec: Change get_pixels() to ptrdiff_t linesize  Michael Niedermayer  2014-08-06
| |
| |   Found-by: ubitux
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge commit 'adf8227cf4e7b4fccb2ad88e1e09b6dc00dd00ed'  Michael Niedermayer  2014-08-05
|\|
| |   * commit 'adf8227cf4e7b4fccb2ad88e1e09b6dc00dd00ed':
| |     vc-1: Add platform-specific start code search routine to VC1DSPContext.
| |
| |   Conflicts:
| |     configure
| |     libavcodec/arm/vc1dsp_init_arm.c
| |     libavcodec/vc1dsp.c
| |     libavcodec/vc1dsp.h
| |
| |   See: 9d8ecdd8ca6d248e7439e8fdf255e39eda14e0f2
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * vc-1: Add platform-specific start code search routine to VC1DSPContext.  Ben Avison  2014-08-04
| |
| |   Initialise VC1DSPContext for parser as well as for decoder.
| |   Note, the VC-1 code doesn't actually use the function pointer yet.
| |
| |   Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* | Merge commit 'db7f1c7c5a1d37e7f4da64a79a97bea1c4b6e9f8'  Michael Niedermayer  2014-08-05
|\|
| |   * commit 'db7f1c7c5a1d37e7f4da64a79a97bea1c4b6e9f8':
| |     h264: Move start code search functions into separate source files.
| |
| |   Conflicts:
| |     libavcodec/arm/Makefile
| |     libavcodec/arm/h264dsp_init_arm.c
| |     libavcodec/h264_parser.c
| |     libavcodec/h264dsp.c
| |     libavcodec/startcode.c
| |     libavcodec/startcode.h
| |
| |   See: 270cede3f3772117454a14b620803d731036942d
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * h264: Move start code search functions into separate source files.  Ben Avison  2014-08-04
| |
| |   This permits re-use with parsers for codecs which use similar start codes.
| |
| |   Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
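To make "similar start codes" concrete: H.264 byte streams and VC-1 both delimit units with the three-byte prefix 00 00 01, so one search routine can serve several parsers. The following is a plain, unoptimized sketch of such a search; `find_start_code` is a hypothetical name and this is not the routine moved by the commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the offset of the first 00 00 01 start code
 * prefix in buf, or -1 if the buffer contains none. */
ptrdiff_t find_start_code(const uint8_t *buf, size_t size)
{
    for (size_t i = 0; i + 3 <= size; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
            return (ptrdiff_t)i;
    }
    return -1;
}
```

Because the routine knows nothing about the codec that follows the prefix, keeping it in its own source file lets any parser link against it.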
* | avcodec/arm/idctdsp_init_arm*: Only select non bitexact IDCTs by default when bitexact is not set  Michael Niedermayer  2014-07-27
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
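The selection rule in this title can be sketched as below. The flag and enum names are hypothetical stand-ins, not the FFmpeg identifiers: a fast IDCT whose output is not bit-identical to the reference is chosen by default only when the caller has not asked for bit-exact output.

```c
/* Hypothetical stand-ins for a codec flag and the available IDCT kinds. */
enum { FLAG_BITEXACT = 1 << 0 };
enum idct_kind { IDCT_REFERENCE_C, IDCT_FAST_ASM };

/* Pick the default IDCT. The fast, non-bit-exact version is allowed only
 * when it is available and the bit-exact flag is clear. */
enum idct_kind select_default_idct(int flags, int have_fast_asm)
{
    if (have_fast_asm && !(flags & FLAG_BITEXACT))
        return IDCT_FAST_ASM;
    return IDCT_REFERENCE_C;
}
```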
* | Merge commit '7fb993d338d88f2f62e0a358b6c9f3eb9a3a08ac'  Michael Niedermayer  2014-07-25
|\|
| |   * commit '7fb993d338d88f2f62e0a358b6c9f3eb9a3a08ac':
| |     qpeldsp: Mark source pointer in qpel_mc_func function pointer const
| |
| |   Conflicts:
| |     libavcodec/h264qpel_template.c
| |     libavcodec/x86/cavsdsp.c
| |     libavcodec/x86/rv40dsp_init.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * qpeldsp: Mark source pointer in qpel_mc_func function pointer const  Diego Biurrun  2014-07-25
| |
* | Merge commit '6869612f5c7d4d2f20f69a5658328a761deadb1c'  Michael Niedermayer  2014-07-22
|\|
| |   * commit '6869612f5c7d4d2f20f69a5658328a761deadb1c':
| |     arm: Macroize the test for 'setend' CPU instruction support
| |
| |   Conflicts:
| |     libavcodec/arm/h264dsp_init_arm.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * arm: Macroize the test for 'setend' CPU instruction support  Ben Avison  2014-07-21
| |
| |   Signed-off-by: Diego Biurrun <diego@biurrun.de>
* | Merge commit '81b9bf319226fe03436c80aaa8a2c91767cab7ce'  Michael Niedermayer  2014-07-21
|\|
| |   * commit '81b9bf319226fe03436c80aaa8a2c91767cab7ce':
| |     dct-test: Move arch-specific bits into arch-specific subdirectories
| |
| |   Conflicts:
| |     libavcodec/dct-test.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * dct-test: Move arch-specific bits into arch-specific subdirectories  Diego Biurrun  2014-07-21
| |
* | Merge commit '4de8b60684ce13dff3e3d372dae4f49b9e53f755'  Michael Niedermayer  2014-07-21
|\|
| |   * commit '4de8b60684ce13dff3e3d372dae4f49b9e53f755':
| |     idct: Move arm-specific declarations to a header in the arm directory
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * idct: Move arm-specific declarations to a header in the arm directory  Diego Biurrun  2014-07-20
| |
* | Merge commit '8b0dd4942aac320d1ca3c40fa7ea1be342c71273'  Michael Niedermayer  2014-07-18
|\|
| |   * commit '8b0dd4942aac320d1ca3c40fa7ea1be342c71273':
| |     idctdsp: prettyprinting cosmetics
| |
| |   Conflicts:
| |     libavcodec/idctdsp.c
| |     libavcodec/ppc/idctdsp.c
| |     libavcodec/x86/idctdsp_init.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * idctdsp: prettyprinting cosmetics  Diego Biurrun  2014-07-18
| |
* | Merge commit 'b4987f72197e0c62cf2633bf835a9c32d2a445ae'  Michael Niedermayer  2014-07-18
|\|
| |   * commit 'b4987f72197e0c62cf2633bf835a9c32d2a445ae':
| |     idct: Convert IDCT permutation #defines to an enum
| |
| |   Conflicts:
| |     libavcodec/idctdsp.c
| |     libavcodec/x86/cavsdsp.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * idct: Convert IDCT permutation #defines to an enum  Diego Biurrun  2014-07-18
| |
| |   Also rename the enum values to be consistent with other DCT permutations.
* | Merge commit '7e18a727d2c2a19f22fcf68875d1b05fd2eafcef'  Michael Niedermayer  2014-07-18
|\|
| |   * commit '7e18a727d2c2a19f22fcf68875d1b05fd2eafcef':
| |     arm: cosmetics: Consistently use lowercase for shift operators
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * arm: cosmetics: Consistently use lowercase for shift operators  Martin Storsjö  2014-07-18
| |
| |   Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit 'fe67f3fbb5f9f6a6b60f837f6bc5e087ac11f3bf'  Michael Niedermayer  2014-07-18
|\|
| |   * commit 'fe67f3fbb5f9f6a6b60f837f6bc5e087ac11f3bf':
| |     arm: cosmetics: Fix a misaligned asm operand
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * arm: cosmetics: Fix a misaligned asm operand  Martin Storsjö  2014-07-18
| |
| |   Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit '87552d54d3337c3241e8a9e1a05df16eaa821496'  Michael Niedermayer  2014-07-18
|\|
| |   * commit '87552d54d3337c3241e8a9e1a05df16eaa821496':
| |     armv6: Accelerate ff_fft_calc for general case (nbits != 4)
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * armv6: Accelerate ff_fft_calc for general case (nbits != 4)  Ben Avison  2014-07-18
| |
| |   The previous implementation targeted DTS Coherent Acoustics, which only
| |   requires nbits == 4 (fft16()). This case was (and still is) linked
| |   directly rather than being indirected through ff_fft_calc_vfp(), but now
| |   the full range from radix-4 up to radix-65536 is available. This benefits
| |   other codecs such as AAC and AC3.
| |
| |   The implementation is based upon the C version, with each routine larger
| |   than radix-16 calling a hierarchy of smaller FFT functions, then
| |   performing a post-processing pass. This pass benefits a lot from loop
| |   unrolling to counter the long pipelines in the VFP. A relaxed calling
| |   standard also reduces the overhead of the call hierarchy, and avoiding
| |   the excessive inlining performed by GCC probably helps with I-cache
| |   utilisation too.
| |
| |   I benchmarked the result by measuring the number of gperftools samples
| |   that hit anywhere in the AAC decoder (starting from aac_decode_frame())
| |   or specifically in the FFT routines (fft4() to fft512() and pass()) for
| |   the same sample AAC stream:
| |
| |                   Before          After
| |                   Mean    StdDev  Mean    StdDev  Confidence  Change
| |   Audio decode    2245.5  53.1    1599.6  43.8    100.0%      +40.4%
| |   FFT routines    940.6   22.0    348.1   20.8    100.0%      +170.2%
| |
| |   Signed-off-by: Martin Storsjö <martin@martin.st>
| * armv6: Accelerate ff_imdct_half for general case (mdct_bits != 6)  Ben Avison  2014-07-18
| |
| |   The previous implementation targeted DTS Coherent Acoustics, which only
| |   requires mdct_bits == 6. This relatively small size lent itself to
| |   unrolling the loops a small number of times, and encoding offsets
| |   calculated at assembly time within the load/store instructions of each
| |   iteration.
| |
| |   In the more general case (codecs such as AAC and AC3) much larger arrays
| |   are used - mdct_bits == [8, 9, 11]. The old method does not scale for
| |   these cases, so more integer registers are used with non-unrolled
| |   versions of the loops (and with some stack spillage). The postrotation
| |   filter loop is still unrolled by a factor of 2 to permit the
| |   double-buffering of some VFP registers to facilitate overlap of
| |   neighbouring iterations.
| |
| |   I benchmarked the result by measuring the number of gperftools samples
| |   that hit anywhere in the AAC decoder (starting from aac_decode_frame())
| |   or specifically in ff_imdct_half_c / ff_imdct_half_vfp, for the same
| |   example AAC stream:
| |
| |                       Before          After
| |                       Mean    StdDev  Mean    StdDev  Confidence  Change
| |   aac_decode_frame    2368.1  35.8    2117.2  35.3    100.0%      +11.8%
| |   ff_imdct_half_*     457.5   22.4    251.2   16.2    100.0%      +82.1%
| |
| |   Signed-off-by: Martin Storsjö <martin@martin.st>
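The "Change" column in the two benchmark tables is not defined in the messages. The published figures are consistent with a speed-up computed as before/after minus one, expressed as a percentage; this is an inference from the numbers, not something the author states. A small sketch under that assumption, with `change_percent` as a hypothetical helper name:

```c
/* Assumed definition of the "Change" column: how much faster the new code
 * is, as a percentage, given mean sample counts before and after. */
double change_percent(double before, double after)
{
    return (before / after - 1.0) * 100.0;
}
```

For example, the FFT routines row (940.6 samples before, 348.1 after) gives a value close to the published +170.2%, and the audio decode row (2245.5 before, 1599.6 after) gives a value close to +40.4%. Small differences in the last digit are expected because the published means are themselves rounded.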
* | Merge commit '2d60444331fca1910510038dd3817bea885c2367'  Michael Niedermayer  2014-07-17
|\|
| |   * commit '2d60444331fca1910510038dd3817bea885c2367':
| |     dsputil: Split motion estimation compare bits off into their own context
| |
| |   Conflicts:
| |     configure
| |     libavcodec/Makefile
| |     libavcodec/arm/Makefile
| |     libavcodec/dvenc.c
| |     libavcodec/error_resilience.c
| |     libavcodec/h264.h
| |     libavcodec/h264_slice.c
| |     libavcodec/me_cmp.c
| |     libavcodec/me_cmp.h
| |     libavcodec/motion_est.c
| |     libavcodec/motion_est_template.c
| |     libavcodec/mpeg4videoenc.c
| |     libavcodec/mpegvideo.c
| |     libavcodec/mpegvideo_enc.c
| |     libavcodec/x86/Makefile
| |     libavcodec/x86/me_cmp_init.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * dsputil: Split motion estimation compare bits off into their own context  Diego Biurrun  2014-07-17
| |