| | Commit message | Author | Age |
---|---|---|---|
* | all: Replace if (ARCH_FOO) checks by #if ARCH_FOO | Andreas Rheinhardt | 2022-06-15 |
| | This is more spec-compliant because it does not rely on dead-code elimination by the compiler. MSVC in particular has problems with this; see https://ffmpeg.org/pipermail/ffmpeg-devel/2022-May/296373.html or https://ffmpeg.org/pipermail/ffmpeg-devel/2022-May/297022.html for examples. This commit does not eliminate every instance where we rely on dead-code elimination: it only tackles branching to the initialization of arch-specific DSP code, not e.g. all uses of CONFIG_ and HAVE_ checks. But it may already be enough to compile FFmpeg with MSVC with whole-program optimizations enabled (if one does not disable too many components). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> | | |
* | avutil/avassert: Don't include avutil.h | Andreas Rheinhardt | 2022-02-24 |
| | Reviewed-by: Martin Storsjö <martin@martin.st> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> | | |
* | Include attributes.h directly | Andreas Rheinhardt | 2021-04-19 |
| | Some files currently rely on libavutil/cpu.h to include it for them; yet said file won't include it any more after the currently deprecated functions are removed, so include attributes.h directly. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> | | |
* | mdct15: simplify x86 exptab permutation | Rostislav Pehlivanov | 2018-05-07 |
| | Removes an unneeded copy and does the 5-point permute in-place. Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com> | | |
* | mdct15: add inverse transform postrotation SIMD | Rostislav Pehlivanov | 2017-07-30 |
| | 2.5ms frames: Before (c): 2638 decicycles in postrotate, 2097040 runs, 112 skips; After (sse3): 1467 decicycles in postrotate, 2097083 runs, 69 skips; After (avx2): 1244 decicycles in postrotate, 2097085 runs, 67 skips. 5ms frames: Before (c): 4987 decicycles in postrotate, 1048371 runs, 205 skips; After (sse3): 2644 decicycles in postrotate, 1048509 runs, 67 skips; After (avx2): 2031 decicycles in postrotate, 1048523 runs, 53 skips. 10ms frames: Before (c): 9153 decicycles in postrotate, 523575 runs, 713 skips; After (sse3): 5110 decicycles in postrotate, 523726 runs, 562 skips; After (avx2): 3738 decicycles in postrotate, 524223 runs, 65 skips. 20ms frames: Before (c): 17857 decicycles in postrotate, 261866 runs, 278 skips; After (sse3): 10041 decicycles in postrotate, 261746 runs, 398 skips; After (avx2): 7050 decicycles in postrotate, 262116 runs, 28 skips. Improves total decoding performance for real world content by 9% with avx2. Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com> | | |
* | mdct15: add assembly optimizations for the 15-point FFT | Rostislav Pehlivanov | 2017-06-23 |
| | c: 1802 decicycles in fft15, 16774635 runs, 2581 skips; avx: 865 decicycles in fft15, 16776378 runs, 838 skips. Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com> | | |
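
The change described in the 2022-06-15 entry (replacing `if (ARCH_FOO)` checks by `#if ARCH_FOO`) can be illustrated with a minimal sketch. This is not code from FFmpeg: `DemoDSPContext`, `demo_sum_c`, `demo_dsp_init` and `demo_dsp_init_x86` are hypothetical names, and `ARCH_X86` here only stands in for a build-generated macro that is assumed to always be defined to 0 or 1.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stands in for a build-generated macro that is always
 * defined to 0 or 1. It defaults to 0 here (target is not x86). */
#ifndef ARCH_X86
#define ARCH_X86 0
#endif

/* Hypothetical DSP context holding a single function pointer. */
typedef struct DemoDSPContext {
    int (*sum)(const int *v, size_t n);
} DemoDSPContext;

/* Portable fallback, always compiled. */
static int demo_sum_c(const int *v, size_t n)
{
    int acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

#if ARCH_X86
/* Hypothetical arch-specific initializer, defined in an x86-only file. */
void demo_dsp_init_x86(DemoDSPContext *c);
#endif

void demo_dsp_init(DemoDSPContext *c)
{
    assert(c != NULL);
    c->sum = demo_sum_c;

    /* Old style: `if (ARCH_X86) demo_dsp_init_x86(c);`
     * The call is still compiled when ARCH_X86 is 0, so linking only
     * succeeds if the compiler removes the dead branch. */
#if ARCH_X86
    /* New style: the preprocessor drops the call before compilation. */
    demo_dsp_init_x86(c);
#endif
}
```

With the `#if` form, a build for a non-x86 target never references `demo_dsp_init_x86`, so no undefined symbol can reach the linker regardless of how the compiler optimizes.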