summaryrefslogtreecommitdiff
path: root/libavcodec/x86/mdct15_init.c
Commit message (Collapse)AuthorAge
* all: Replace if (ARCH_FOO) checks by #if ARCH_FOOAndreas Rheinhardt2022-06-15
| | | | | | | | | | | | | | | | | | This is more spec-compliant because it does not rely on dead-code elimination by the compiler. Especially MSVC has problems with this, as can be seen in https://ffmpeg.org/pipermail/ffmpeg-devel/2022-May/296373.html or https://ffmpeg.org/pipermail/ffmpeg-devel/2022-May/297022.html This commit does not eliminate every instance where we rely on dead code elimination: It only tackles branching to the initialization of arch-specific dsp code, not e.g. all uses of CONFIG_ and HAVE_ checks. But maybe it is already enough to compile FFmpeg with MSVC with whole-programm-optimizations enabled (if one does not disable too many components). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avutil/avassert: Don't include avutil.hAndreas Rheinhardt2022-02-24
| | | | | Reviewed-by: Martin Storsjö <martin@martin.st> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* Include attributes.h directlyAndreas Rheinhardt2021-04-19
| | | | | | | | Some files currently rely on libavutil/cpu.h to include it for them; yet said file won't use include it any more after the currently deprecated functions are removed, so include attributes.h directly. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* mdct15: simplify x86 exptab permutationRostislav Pehlivanov2018-05-07
| | | | | | Removes an unneeded copy and does the 5-point permute in-place. Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
* mdct15: add inverse transform postrotation SIMDRostislav Pehlivanov2017-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | 2.5ms frames: Before (c): 2638 decicycles in postrotate, 2097040 runs, 112 skips After (sse3): 1467 decicycles in postrotate, 2097083 runs, 69 skips After (avx2): 1244 decicycles in postrotate, 2097085 runs, 67 skips 5ms frames: Before (c): 4987 decicycles in postrotate, 1048371 runs, 205 skips After (sse3): 2644 decicycles in postrotate, 1048509 runs, 67 skips After (avx2): 2031 decicycles in postrotate, 1048523 runs, 53 skips 10ms frames: Before (c): 9153 decicycles in postrotate, 523575 runs, 713 skips After (sse3): 5110 decicycles in postrotate, 523726 runs, 562 skips After (avx2): 3738 decicycles in postrotate, 524223 runs, 65 skips 20ms frames: Before (c): 17857 decicycles in postrotate, 261866 runs, 278 skips After (sse3): 10041 decicycles in postrotate, 261746 runs, 398 skips After (avx2): 7050 decicycles in postrotate, 262116 runs, 28 skips Improves total decoding performance for real world content by 9% with avx2. Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
* mdct15: add assembly optimizations for the 15-point FFTRostislav Pehlivanov2017-06-23
c: 1802 decicycles in fft15,16774635 runs, 2581 skips avx: 865 decicycles in fft15,16776378 runs, 838 skips Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>