path: root/libavcodec/aarch64
Commit message (Author, Age)
* avcodec: fix arguments on xmm/neon clobber test wrappers (James Almer, 2016-10-02)
| Signed-off-by: James Almer <jamrial@gmail.com>
* avcodec: add missing xmm/neon clobber test wrappers for the new encode API (James Almer, 2016-10-01)
| Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
| Signed-off-by: James Almer <jamrial@gmail.com>
* avcodec: fix vc1dsp dependencies (Xiaolei Yu, 2016-09-25)
|
* avcodec: add missing xmm/neon clobber test wrappers for the new decode API (James Almer, 2016-07-03)
| Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
| Signed-off-by: James Almer <jamrial@gmail.com>
* lavc/neontest: fix constness in arm/aarch64 avcodec_open2() wrappers (Clément Bœsch, 2016-06-25)
|
* Merge commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb' (Clément Bœsch, 2016-06-21)
|\
| | * commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb':
| |   cosmetics: Fix spelling mistakes
| |
| | Merged-by: Clément Bœsch <u@pkh.me>
| * cosmetics: Fix spelling mistakes (Vittorio Giovara, 2016-05-04)
| | Signed-off-by: Diego Biurrun <diego@biurrun.de>
* | aarch64/synth_filter: fix compilation (James Almer, 2016-05-10)
| | Signed-off-by: James Almer <jamrial@gmail.com>
* | Merge commit '01621202aad7e27b2a05c71d9ad7a19dfcbe17ec' (Derek Buitenhuis, 2016-05-09)
|\|
| | * commit '01621202aad7e27b2a05c71d9ad7a19dfcbe17ec':
| |   build: miscellaneous cosmetics
| |
| | Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
| * build: miscellaneous cosmetics (Diego Biurrun, 2016-04-07)
| | Restore alphabetical order in lists, break overly long lines, do some prettyprinting, add some explanatory section comments, group parts together that belong together logically.
* | Merge commit 'cdb1665f70def544ddab3e3ed3763ef99c8b3873' (Derek Buitenhuis, 2016-04-24)
|\|
| | * commit 'cdb1665f70def544ddab3e3ed3763ef99c8b3873':
| |   aarch64: Make transpose_4x4H do a regular transpose
| |
| | Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
| * aarch64: Make transpose_4x4H do a regular transpose (Martin Storsjö, 2016-03-26)
| | Previously, ff_h264_idct_add_neon (originally in the arm version) used a non-regular transpose in order to be able to use more instructions that deal with registers as 128 bit register pairs. The aarch64 translation doesn't do it to the same extent, but brought along the same structure since it was a straight translation.
| |
| | This reshuffles ff_h264_idct_add_neon, bringing it closer to the C implementation, making the transpose_4x4H macro do a regular transpose, usable for other algorithms as well.
| |
| | Previously, the third and fourth output from transpose_4x4H were swapped, and prior to cc29d96d5a, the same inputs as well.
| |
| | In addition to just swapping the outputs, also renumber the intermediate registers for better readability (making the register order match transpose_4x8B).
| |
| | This runs with the same number of cycles as before.
| |
| | Signed-off-by: Martin Storsjö <martin@martin.st>
| * fft: Split MDCT bits off from FFT (Diego Biurrun, 2016-03-01)
| |
* | Merge commit '97aec6e75ef36ed0402653519daa8e1fc8ddb555' (Derek Buitenhuis, 2016-04-12)
|\|
| | * commit '97aec6e75ef36ed0402653519daa8e1fc8ddb555':
| |   fft: arm: Drop unnecessary #include, add missing ones
| |
| | Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
| * fft: arm: Drop unnecessary #include, add missing ones (Diego Biurrun, 2016-02-26)
| |
* | avcodec/dca: add new decoder based on libdcadec (foo86, 2016-01-31)
| |
* | avcodec/dca: remove old decoder (foo86, 2016-01-31)
| | Remove all files and functions which are not going to be reused, and disable all functions and FATE tests temporarily which will be.
* | avcodec/synth_filter: split off remaining code from dcadec files (James Almer, 2016-01-25)
| | Signed-off-by: James Almer <jamrial@gmail.com>
* | Merge commit '2008f76054906e9ff6bf744800af0e5a5bfe61be' (Hendrik Leppkes, 2016-01-02)
|\|
| | * commit '2008f76054906e9ff6bf744800af0e5a5bfe61be':
| |   dca: remove unused decode_hf function and quant_d tables
| |
| | Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
| * dca: remove unused decode_hf function and quant_d tables (Alexandra Hájková, 2015-12-24)
| | They were superseded with their integer equivalents. Rename integer decode_hf to decode_hf.
| * arm64: fix inverted register order in transpose_4x4H (Janne Grunau, 2015-12-21)
| | Fix related register order issue in ff_h264_idct_add_neon.
| |
| | Found-by: zjh8890 <243186085@qq.com>
* | Merge commit 'a0fc780a2093784e8664f88205ee1b215e109cee' (Hendrik Leppkes, 2016-01-02)
|\|
| | * commit 'a0fc780a2093784e8664f88205ee1b215e109cee':
| |   arm64: int32_to_float_fmul neon asm
| |
| | Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
| * arm64: int32_to_float_fmul neon asm (Janne Grunau, 2015-12-14)
| | 3% faster dts decoding on a cortex-a57.
| |
| |                                  cortex-a57  cortex-a53
| | int32_to_float_fmul_array8_c:        1270.9      4475.6
| | int32_to_float_fmul_array8_neon:      328.6       569.2
| | int32_to_float_fmul_scalar_c:         928.5      4119.6
| | int32_to_float_fmul_scalar_neon:      309.1       524.1
* | Merge commit '705f5e5e155f6f280a360af220fc5b30cfcee702' (Hendrik Leppkes, 2016-01-02)
|\|
| | * commit '705f5e5e155f6f280a360af220fc5b30cfcee702':
| |   arm64: port synth_filter_float_neon from arm
| |
| | Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
| * arm64: port synth_filter_float_neon from arm (Janne Grunau, 2015-12-14)
| | ~25% faster dts decoding overall. The checkasm CPU cycles numbers are not that useful since synth_filter_float() calls FFTContext.imdct_half().
| |
| |                          cortex-a57  cortex-a53
| | synth_filter_float_c:        1866.2      3490.9
| | synth_filter_float_neon:      915.0      1531.5
| |
| | With fftc.imdct_half forced to imdct_half_neon:
| |
| |                          cortex-a57  cortex-a53
| | synth_filter_float_c:        1718.4      3025.3
| | synth_filter_float_neon:      926.2      1530.1
* | Merge commit 'c33c1fa8af2b2e82418a06901b6ad17b3d61b73e' (Hendrik Leppkes, 2016-01-02)
|\|
| | * commit 'c33c1fa8af2b2e82418a06901b6ad17b3d61b73e':
| |   arm64: convert dcadsp neon asm from arm
| |
| | Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
| * arm64: convert dcadsp neon asm from arm (Janne Grunau, 2015-12-14)
| | ~2% faster dts decoding overall.
| |
| |                     cortex-a57  cortex-a53
| | dca_decode_hf_c:         474.8      1659.9
| | dca_decode_hf_neon:      225.2       301.1
| | dca_lfe_fir0_c:          913.2      1537.7
| | dca_lfe_fir0_neon:       286.8       451.9
| | dca_lfe_fir1_c:          848.7      1711.5
| | dca_lfe_fir1_neon:       387.1       506.4
* | avcodec/arm64: fix inverted register order in transpose_4x4H (Janne Grunau, 2015-12-19)
| | Fix related register order issue in ff_h264_idct_add_neon.
| |
| | Found-by: zjh8890 <243186085@qq.com>
| | Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* | Revert "avcodec/aarch64/neon.S: Update neon.s for transpose_4x4H" (Michael Niedermayer, 2015-12-17)
| | The change was not correct and broke H264
| |
| | This reverts commit cd83f899c94f691b045697d12efa21f83eb2329f.
* | avcodec/aarch64/neon.S: Update neon.s for transpose_4x4H (zjh8890, 2015-12-12)
| | The transpose_4x4H is wrong which cost me much time to find this bug. The orders of r2 and r3 are wrong, this bug waste me much time while I make aarch64 arm instruction which used the function.
* | Merge commit 'f56d8d8dd72b1ab52aa814c5a0fccabf8040ef68' (Michael Niedermayer, 2015-07-21)
|\|
| | * commit 'f56d8d8dd72b1ab52aa814c5a0fccabf8040ef68':
| |   h264: aarch64: intra prediction optimisations
| |
| | Conflicts:
| |   libavcodec/h264pred.c
| |
| | Merged-by: Michael Niedermayer <michael@niedermayer.cc>
| * h264: aarch64: intra prediction optimisations (Janne Grunau, 2015-07-20)
| |
| * arm64: constify src in h264qpel dsp function definitions (Janne Grunau, 2015-06-24)
| |
* | Merge commit '3d5d46233cd81f78138a6d7418d480af04d3f6c8' (Michael Niedermayer, 2015-02-02)
|\|
| | * commit '3d5d46233cd81f78138a6d7418d480af04d3f6c8':
| |   opus: Factor out imdct15 into a standalone component
| |
| | Conflicts:
| |   configure
| |   libavcodec/opus_celt.c
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * opus: Factor out imdct15 into a standalone component (Diego Biurrun, 2015-02-02)
| | It will be reused by the AAC decoder.
* | lavc/aarch64: Do not use the neon horizontal chroma loop filter for H.264 4:2:2. (Carl Eugen Hoyos, 2015-01-31)
| |
* | Merge commit '780cd20b00a69e26bbfffbb8eec16fbe999ea793' (Michael Niedermayer, 2014-12-09)
|\|
| | * commit '780cd20b00a69e26bbfffbb8eec16fbe999ea793':
| |   aarch64: Use .data.rel.ro for const data with relocations
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * aarch64: Use .data.rel.ro for const data with relocations (Martin Storsjö, 2014-12-09)
| | This reverts commit c00365b46d464ce47716315c1801818d811bdb9a in addition to using a different section.
| |
| | Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit 'c00365b46d464ce47716315c1801818d811bdb9a' (Michael Niedermayer, 2014-11-16)
|\|
| | * commit 'c00365b46d464ce47716315c1801818d811bdb9a':
| |   aarch64: Make the function pointer tables position independent
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * aarch64: Make the function pointer tables position independent (Martin Storsjö, 2014-11-16)
| | This allows running the code on android, where 64 bit binaries with text relocations aren't allowed to be loaded.
| |
| | Signed-off-by: Martin Storsjö <martin@martin.st>
* | avcodec/aarch64/h264qpel_init_aarch64: mark src as const (Michael Niedermayer, 2014-08-30)
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge commit 'ac6b95dbc0b53b3ea461bd5e5e7f7f31d2983733' (Michael Niedermayer, 2014-08-04)
|\|
| | * commit 'ac6b95dbc0b53b3ea461bd5e5e7f7f31d2983733':
| |   aarch64: add ',' between assembler macro arguments where missing
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * aarch64: add ',' between assembler macro arguments where missing (Janne Grunau, 2014-08-04)
| | llvm's integrated assembler does not accept spaces as macro argument delimiter when targeting darwin. Using a explicit delimiter is a good idea in principle since it makes case like 'macro 4 -2' vs 'macro 4 - 2' clear.
* | Merge commit 'f23d26a6864128001b03876b0b92fffe131f2060' (Michael Niedermayer, 2014-06-23)
|\|
| | * commit 'f23d26a6864128001b03876b0b92fffe131f2060':
| |   h264: avoid using uninitialized memory in NEON chroma mc
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * h264: avoid using uninitialized memory in NEON chroma mc (Janne Grunau, 2014-06-23)
| | Adapt commit 982b596ea6640bfe218a31f6c3fc542d9fe61c31 for the arm and aarch64 NEON asm.
| |
| | 5-10% faster on Cortex-A9.
* | Merge commit 'd3f5b94762fb803c0f3b29f9ad6c5eaa813998ba' (Michael Niedermayer, 2014-05-15)
|\|
| | * commit 'd3f5b94762fb803c0f3b29f9ad6c5eaa813998ba':
| |   aarch64: opus NEON iMDCT and FFT
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * aarch64: opus NEON iMDCT and FFT (Janne Grunau, 2014-05-15)
| | Opus celt decoding 11% faster and the iMDCT over 2.5 times faster on Apple's A7.
* | Merge commit '9aa4592076d4dbb29d1198b0e258f9f85c0c00b5' (Michael Niedermayer, 2014-05-13)
|\|
| | * commit '9aa4592076d4dbb29d1198b0e258f9f85c0c00b5':
| |   aarch64: assembler in clang-3.4 ignores the division by two
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * aarch64: assembler in clang-3.4 ignores the division by two (Janne Grunau, 2014-05-13)
| | Values are positive powers of two, so just replace it with right shift.
* | Merge commit '3956a5e0ea46ed7e27ca888fe11c47986ad99261' (Michael Niedermayer, 2014-04-22)
|\|
| | * commit '3956a5e0ea46ed7e27ca888fe11c47986ad99261':
| |   aarch64: NEON vorbis_inverse_coupling
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>