path: root/libavutil/arm
Commit message (Author, Date)
* arm: Fix vfp dead code elimination with have_vfp_vm (Martin Storsjö, 2016-01-08)
  This fixes builds with --disable-vfp. Checking for the armv6 cpu flag is incorrect, since vfpv2 isn't armv6 specific.
  Signed-off-by: Martin Storsjö <martin@martin.st>
* arm: add a cpu flag for the VFPv2 vector mode (Janne Grunau, 2015-12-14)
  The vector mode was deprecated in ARMv7-A/VFPv3 and various cpu implementations do not support it in hardware. Depending on the OS, vector mode code will either be emulated in software or result in an illegal instruction on cpus which do not support it. This was not really a problem in practice since NEON implementations of the same functions are preferred. It will however become a problem for checkasm, which tests every cpu flag separately.
  Since this is a cpu feature that newer cpus no longer support, the behaviour of this flag differs from the other flags: it can only be activated by runtime cpu feature selection.
* arm: Suppress tags about used cpu arch and extensions (Martin Storsjö, 2015-03-07)
  When all the codepaths using manually set .arch/.fpu code are behind runtime detection, the elf attributes should be suppressed. This allows tools to know that the final built binary doesn't strictly require these extensions.
  Signed-off-by: Martin Storsjö <martin@martin.st>
* libavutil: Add ARM av_clip_intp2_arm (Peter Meerwald, 2015-02-21)
  Add ARM code implementing av_clip_intp2 using the ssat instruction. On Cortex-A8, av_clip_intp2_arm() is faster than av_clip_intp2_c() and the generic av_clip() (reported as about -19%).
  Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
  Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
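For orientation, here is a minimal portable sketch of what clipping to a signed power-of-two range means. It assumes the range is [-(2^p), 2^p - 1], which is the range an ssat to p+1 bits would produce. The name `clip_intp2_ref` is hypothetical and is not the library's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference: clamp a to the signed range [-(2^p), 2^p - 1].
 * On ARM, one saturate instruction (ssat) can do this without branches;
 * this portable version only illustrates the arithmetic. */
static int32_t clip_intp2_ref(int32_t a, int p)
{
    assert(p >= 0 && p < 31);           /* keep the shifts well defined */
    const int32_t lo = -((int32_t)1 << p);
    const int32_t hi = ((int32_t)1 << p) - 1;
    if (a < lo)
        return lo;
    if (a > hi)
        return hi;
    return a;
}
```

With p = 7, the sketch clamps values to the int8 range: 300 becomes 127 and -300 becomes -128.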
* arm: Use .data.rel.ro for const data with relocations (Martin Storsjö, 2014-12-09)
  Signed-off-by: Martin Storsjö <martin@martin.st>
* arm: Macroize the test for 'setend' CPU instruction support (Ben Avison, 2014-07-21)
  Signed-off-by: Diego Biurrun <diego@biurrun.de>
* armv6: Accelerate butterflies_float (Ben Avison, 2014-07-18)
  I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in butterflies_float_c() / ff_butterflies_float_vfp() for the same sample AAC stream:

                         Before            After
                       Mean    StdDev    Mean    StdDev   Confidence  Change
    Audio decode       1542.8  43.7      1470.5  41.5     100.0%       +4.9%
    butterflies_float   130.0  11.9        70.2  12.1     100.0%      +85.2%

  Signed-off-by: Martin Storsjö <martin@martin.st>
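As a reminder of what the accelerated routine computes, here is a minimal scalar sketch of a float butterfly: each pair of elements is replaced by its sum and difference. The name `butterflies_float_ref` is hypothetical, and the sum/difference ordering is an assumption about the C reference rather than something stated in the commit message.

```c
#include <stddef.h>

/* Hypothetical scalar reference: in-place butterfly over two vectors.
 * After the call, v1[i] holds the sum and v2[i] holds the difference
 * (old v1[i] - old v2[i]). */
static void butterflies_float_ref(float *v1, float *v2, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        float diff = v1[i] - v2[i];
        v1[i] += v2[i];
        v2[i]  = diff;
    }
}
```

A VFP or NEON version would process several elements per iteration. This loop only fixes the expected result.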
* armv6: Accelerate vector_fmul_window (Ben Avison, 2014-07-18)
  I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in vector_fmul_window_c() / ff_vector_fmul_window_vfp() for the same sample AAC stream:

                          Before            After
                        Mean    StdDev    Mean    StdDev   Confidence  Change
    Audio decode        1598.2  47.4      1529.2  25.4     100.0%       +4.5%
    vector_fmul_window   244.0  22.1       188.9  22.3     100.0%      +29.2%

  Signed-off-by: Martin Storsjö <martin@martin.st>
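The Change figures in these benchmark tables appear consistent with computing (before / after) - 1 on the mean sample counts. For example, 244.0 / 188.9 - 1 is roughly 0.292, matching +29.2%. The helper below is a hypothetical sketch of that arithmetic; the commit message does not state the formula explicitly.

```c
/* Hypothetical helper: relative change in percent, computed as
 * (before / after - 1) * 100. Fewer profiler samples after the change
 * yields a positive number, i.e. a speedup. */
static double relative_change_percent(double before, double after)
{
    return (before / after - 1.0) * 100.0;
}
```

The same formula reproduces the +4.5% figure for the whole decoder (1598.2 before, 1529.2 after).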
* arm: Detect 32 bit cpu features on ARMv8 when running on a 64 bit kernel (Martin Storsjö, 2014-06-28)
  When running on a 64 bit kernel, /proc/cpuinfo lists different optional features than on 32 bit kernels (because some of them are mandatory in the 64 bit implementations).

  The kernel does list the old features properly if they are queried via /proc/self/auxv though - however this file is not always readable (e.g. on most android systems). The getauxval function could also provide the same info as /proc/self/auxv even if this file isn't readable, but this function is not always available (and thus would need to be loaded with dlsym for compatibility with older android versions).

  The android cpufeatures library does this slightly differently, by assuming that these are available if the "CPU architecture" line is >= 8, see [1] for details.

  It has been suggested to include the old, non-optional features in /proc/cpuinfo as well, but that suggested patch never was merged. See [2] for the discussion around this suggestion.

  [1] https://android-review.googlesource.com/91380
  [2] http://marc.info/?l=linux-arm-kernel&m=139087240101974

  Signed-off-by: Martin Storsjö <martin@martin.st>
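The "CPU architecture" heuristic attributed above to the android cpufeatures library can be sketched as follows. This is an illustration only: the helper names are hypothetical, the input is the text of /proc/cpuinfo already read into a string, and the sketch does not claim to match the code in this commit.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the number on the "CPU architecture" line
 * of /proc/cpuinfo-style text, or -1 if no such line is found. */
static int cpuinfo_architecture(const char *cpuinfo)
{
    const char *key = "CPU architecture";
    const char *p = strstr(cpuinfo, key);
    if (!p)
        return -1;
    p = strchr(p + strlen(key), ':');
    if (!p)
        return -1;
    return (int)strtol(p + 1, NULL, 10);
}

/* Hypothetical policy: treat the 32-bit optional features as present
 * when the reported architecture is 8 or later. */
static int implies_armv8_baseline(const char *cpuinfo)
{
    return cpuinfo_architecture(cpuinfo) >= 8;
}
```

A caller would OR the implied flags into whatever the "Features" line already reported, so older kernels and ARMv7 systems are unaffected.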
* build: check if AS supports the '.func' directive (Janne Grunau, 2014-06-03)
  Not supported by Clang's integrated assembler. Since it just adds debug information, it can safely be omitted.
* Update dsputil- and SIMD-related comments to match reality more closely (Diego Biurrun, 2014-03-13)
* arm: hpeldsp: prevent overreads in armv6 asm (Janne Grunau, 2014-03-05)
  Based on a patch by Russel King <rmk+libav@arm.linux.org.uk>
  Bug-Id: 646
  CC: libav-stable@libav.org
* arm: Mark the stack as non-executable (Martin Storsjö, 2014-02-19)
  If linking in an object file without this attribute set, the linker will assume that an executable stack might be needed.
  Signed-off-by: Martin Storsjö <martin@martin.st>
* arm: Add EXTERN_ASM to the .func and .type declarations for exported symbols (Martin Storsjö, 2014-02-07)
  This makes the generated assembly more internally consistent, avoiding declaring two labels for the same function (for cases where EXTERN_ASM is empty) and not declaring a separate unprefixed label in other cases.
  This also makes sure the .func and .type declarations have the same prefix. They have previously not been used on the platforms that have prefixed symbols on arm (iOS), but gas-preprocessor has recently started using the .func declarations for adding .thumb_func declarations for such functions.
  Signed-off-by: Martin Storsjö <martin@martin.st>
* arm: Add an option for making sure NEON registers aren't clobbered (Martin Storsjö, 2014-01-11)
  This is pretty much based on the same test for XMM registers.
  Signed-off-by: Martin Storsjö <martin@martin.st>
* arm: Allow overriding the alignment set in the function macro (Martin Storsjö, 2014-01-07)
  The function macro always sets .align 2 before declaring the function label (since 5c5e1ea3) and always sets the section to .text (since 278caa6a).
  The .align 5 directives before certain functions, introduced in fc252eba, predate the addition of .text and .align to the function macro and thus became useless/unused when the function macro got them.
  This restores the original intention, to align the loop entry points.
  Signed-off-by: Martin Storsjö <martin@martin.st>
* arm: float_dsp: Propagate cpu_flags to vfp initialization function (Diego Biurrun, 2013-08-29)
* avutil: Refactor CPU extension availability macros (Diego Biurrun, 2013-08-28)
* avutil: Move internal CPU detection function declarations to private header (Diego Biurrun, 2013-08-28)
* Employ consistent LIBAV_COMPAT_ multiple inclusion guards in compat/ (Diego Biurrun, 2013-07-18)
  Also fix a comment and an #endif comment.
* arm: Only output eabi attributes if building for ELF (Martin Storsjö, 2013-05-27)
  This matches the other eabi attribute in the same file. This is required in order to build for arm/hardfloat with other object file formats than ELF.
  Signed-off-by: Martin Storsjö <martin@martin.st>
* avutil: Add av_cold attributes to init functions missing them (Diego Biurrun, 2013-05-04)
* arm: Fall back to runtime cpu feature detection via /proc/cpuinfo (Martin Storsjö, 2013-02-11)
  On recent android versions, /proc/self/auxv is unreadable (unless the process is running under the shell uid or in debuggable mode, which makes it hard to notice). See http://b.android.com/43055 and https://android-review.googlesource.com/51271 for more information about the issue.
  This makes sure e.g. neon optimizations are enabled at runtime in android apps even when built in release mode, if configured to use the runtime detection.
  CC: libav-stable@libav.org
  Signed-off-by: Martin Storsjö <martin@martin.st>
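A fallback of this kind amounts to scanning the "Features" line of /proc/cpuinfo for whole-word feature names. The sketch below works on text that has already been read into a string. The function name is hypothetical and the matching rules are an assumption, not the code from this commit.

```c
#include <string.h>

/* Hypothetical helper: return 1 if `name` appears as a whole word on the
 * "Features" line of /proc/cpuinfo-style text, else 0. */
static int cpuinfo_has_feature(const char *cpuinfo, const char *name)
{
    const char *line = strstr(cpuinfo, "Features");
    if (!line)
        return 0;
    const char *end = strchr(line, '\n');
    if (!end)
        end = line + strlen(line);
    const char *colon = memchr(line, ':', (size_t)(end - line));
    size_t n = strlen(name);
    if (!colon || n == 0)
        return 0;
    for (const char *p = colon + 1; p + n <= end; p++) {
        /* A match must be delimited on both sides, so that "vfp"
         * does not match inside "vfpv3". */
        int starts = p[-1] == ' ' || p[-1] == '\t' || p[-1] == ':';
        int ends   = p + n == end || p[n] == ' ' || p[n] == '\t';
        if (starts && ends && memcmp(p, name, n) == 0)
            return 1;
    }
    return 0;
}
```

The whole-word check matters because feature names share prefixes (vfp, vfpv3, vfpv4), so a plain substring search would over-report.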
* floatdsp: move scalarproduct_float from dsputil to avfloatdsp. (Ronald S. Bultje, 2013-01-22)
  This makes the aac decoder and all voice codecs independent of dsputil.
* floatdsp: move butterflies_float from dsputil to avfloatdsp. (Ronald S. Bultje, 2013-01-22)
  This makes wmadec/enc, twinvq and mpegaudiodec (i.e. mp2/mp3) independent of dsputil.
* floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp. (Ronald S. Bultje, 2013-01-22)
  Now, nellymoserenc and aacenc no longer depend on dsputil. Independent of this patch, wmaprodec also does not depend on dsputil, so I removed it from there also.
* floatdsp: move vector_fmul_add from dsputil to avfloatdsp. (Ronald S. Bultje, 2013-01-22)
* lavc: Move vector_fmul_window to AVFloatDSPContext (Justin Ruggles, 2013-01-16)
  Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* arm: detect cpu features at runtime on Linux (Mans Rullgard, 2012-12-07)
  This allows compiling optimised functions for features not enabled in the core build and selecting these at runtime if the system has the necessary support.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
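Runtime selection of this kind usually comes down to choosing a function pointer from detected cpu flags. The sketch below illustrates the pattern. The flag names, the function names and the stand-in "NEON" routine are all hypothetical; real optimised versions would live in assembly files.

```c
#include <stddef.h>

/* Hypothetical flag bits; real code would use the library's own flags. */
enum { DEMO_CPU_FLAG_VFP = 1 << 0, DEMO_CPU_FLAG_NEON = 1 << 1 };

typedef void (*vector_fmul_fn)(float *dst, const float *a,
                               const float *b, size_t len);

/* Portable baseline, always available. */
static void vector_fmul_c(float *dst, const float *a,
                          const float *b, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = a[i] * b[i];
}

/* Stand-in for an optimised routine that is compiled in even when the
 * core build does not enable NEON. */
static void vector_fmul_neon_standin(float *dst, const float *a,
                                     const float *b, size_t len)
{
    vector_fmul_c(dst, a, b, len);
}

/* Pick the best implementation the running cpu reports support for. */
static vector_fmul_fn select_vector_fmul(int cpu_flags)
{
    vector_fmul_fn fn = vector_fmul_c;
    if (cpu_flags & DEMO_CPU_FLAG_NEON)
        fn = vector_fmul_neon_standin;
    return fn;
}
```

The selection runs once at init time, so the per-call cost is a single indirect call.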
* arm: rename ARMVFP config symbol to VFP (Mans Rullgard, 2012-12-07)
  This is consistent with usual ARM nomenclature as well as with the VFPV3 and NEON symbols which both lack the ARM prefix.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* arm: use HAVE*_INLINE/EXTERNAL macros for conditional compilation (Mans Rullgard, 2012-12-07)
  These macros reflect the actual capabilities required here.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* dsputil: move vector_fmul_scalar() to AVFloatDSPContext in libavutil (Justin Ruggles, 2012-11-26)
* Move avutil tables only used in libavcodec to libavcodec. (Diego Biurrun, 2012-10-11)
* ARM: use numeric ID for Tag_ABI_align_preserved (Mans Rullgard, 2012-10-03)
  Some old assemblers still in use do not support named tags.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* ARM: bswap: drop armcc version of av_bswap16() (Mans Rullgard, 2012-10-02)
  This function causes several versions of armcc to miscompile code, and the performance impact is small.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
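Once a compiler-specific version is dropped, a build falls back to a generic byte swap. Here is a minimal portable sketch of a 16-bit swap. The name `bswap16_ref` is hypothetical, and this is not claimed to be the library's fallback code.

```c
#include <stdint.h>

/* Hypothetical portable 16-bit byte swap: exchange the high and low
 * bytes. The cast truncates the promoted int result back to 16 bits. */
static uint16_t bswap16_ref(uint16_t x)
{
    return (uint16_t)((x >> 8) | (x << 8));
}
```

Compilers commonly recognise this pattern and emit a single byte-reverse instruction where one exists, which fits the remark that the performance impact is small.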
* ARM: set Tag_ABI_align_preserved in all asm files (Mans Rullgard, 2012-10-02)
  All our ARM asm preserves alignment so setting this attribute in a common location is simpler. This removes numerous warnings when linking with armcc.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* ARM: fix Thumb PIC on Apple (Mans Rullgard, 2012-10-02)
  LDR with register offset and PC as base register is not available in the Thumb instruction set so the addition must be done separately.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* ARM: use 2-operand syntax for ADD Rd, PC in Apple PIC code (Mans Rullgard, 2012-09-21)
  The Apple assembler refuses to assemble the 3-operand form in Thumb2 even though it is valid syntax.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* ARM: align PIC offset pools to 4 bytes (Mans Rullgard, 2012-09-21)
  When building Thumb2 code, the end of a function, where the PIC offsets are placed, need not be aligned. Although the values are only accessed with instructions allowing unaligned addresses, keeping them aligned is preferable.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* ARM: swap source operands in some add instructions (Mans Rullgard, 2012-09-20)
  This allows using a 16-bit opcode when generating Thumb2 code.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* flacdsp: arm optimised lpc filter (Mans Rullgard, 2012-09-15)
* ARM: intmath: use native-size return types for clipping functions (Mans Rullgard, 2012-08-13)
  This avoids having the compiler redundantly mask the values to the smaller size.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* libavutil: add saturating addition functions (Mans Rullgard, 2012-08-13)
  Fixed-point audio codecs often use saturating arithmetic, and special instructions for these operations are common.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
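Saturating addition clamps the result to the representable range instead of wrapping. A minimal portable sketch follows. The name `sat_add32_ref` is hypothetical and is not the function this commit adds.

```c
#include <stdint.h>

/* Hypothetical reference: add two 32-bit values in 64-bit arithmetic,
 * then clamp the sum to [INT32_MIN, INT32_MAX]. */
static int32_t sat_add32_ref(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + b;
    if (sum > INT32_MAX)
        return INT32_MAX;
    if (sum < INT32_MIN)
        return INT32_MIN;
    return (int32_t)sum;
}
```

On ARM, a saturating add instruction such as qadd gives the same result in one step, which is the kind of special instruction the message refers to.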
* ARM: add missing "cc" clobber in av_clipl_int32_arm() (Mans Rullgard, 2012-08-10)
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* ARM: use Q/R inline asm operand modifiers only if supported (Mans Rullgard, 2012-08-07)
  Some compilers do not support the Q/R modifiers used to access the low/high parts of a 64-bit register pair. Check for this and disable all uses of it when not supported.
  Fixes bug #337.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* ARM: generate position independent code to access data symbols (Mans Rullgard, 2012-07-01)
  This creates proper position independent code when accessing data symbols if CONFIG_PIC is set.
  References to external symbols should now use the movrelx macro. Some additional code changes are required since this macro may need a register to hold the GOT pointer.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* cosmetics: do not use full path for local headers (Diego Biurrun, 2012-06-22)
* float_dsp: Move vector_fmac_scalar() from libavcodec to libavutil (Justin Ruggles, 2012-06-18)
* ARM: fix float_dsp breakage from d5a7229 (Mans Rullgard, 2012-06-08)
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* Add a float DSP framework to libavutil (Justin Ruggles, 2012-06-08)
  Move vector_fmul() from DSPContext to AVFloatDSPContext.