path: root/libavcodec/x86
Commit message (Author, Date)
* mdct15: simplify x86 exptab permutation (Rostislav Pehlivanov, 2018-05-07)
|   Removes an unneeded copy and does the 5-point permute in-place.
|   Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
* mdct15: simplify the fft15 x86 SIMD (Rostislav Pehlivanov, 2018-05-07)
|   Saves 1 gpr and 2 instructions and simplifies the macros a bit.
|   Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
* mpeg4video: Add support for MPEG-4 Simple Studio Profile. (Kieran Kunhya, 2018-04-02)
|   This is a profile supporting > 8-bit video and has a higher quality DCT
* sbcenc: add MMX optimizations (Aurelien Jacobs, 2018-03-07)
|   This was originally based on libsbc, and was fully integrated into ffmpeg.
|   Rough speed test:
|   C version:   speed= 592x
|   MMX version: speed= 785x
* h264_idct: enable unmacro on newer NASM versions (Rostislav Pehlivanov, 2018-02-12)
|   Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
* avcodec/utvideoenc : add SIMD (avx) for sub_left_prediction (Martin Vignali, 2018-01-28)
|   asm code by Henrik Gramner
* avcodec: increase AV_INPUT_BUFFER_PADDING_SIZE to 64 (James Almer, 2018-01-11)
|   AVX-512 support has been introduced, and even if no functions currently use zmm registers (able to load as much as 64 bytes of consecutive data per instruction), they will be added eventually.
|   Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
|   Tested-by: Michael Niedermayer <michael@niedermayer.cc>
|   Signed-off-by: James Almer <jamrial@gmail.com>
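The reasoning in this message (a single load may read up to 64 consecutive bytes, so input buffers need that much slack after the payload) can be sketched in plain C. This is a minimal illustration under stated assumptions, not libavcodec code: `PADDING_SIZE` stands in for AV_INPUT_BUFFER_PADDING_SIZE, and `alloc_padded` and `load_is_safe` are hypothetical helpers.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for AV_INPUT_BUFFER_PADDING_SIZE after this commit (assumption:
 * the real constant is defined in libavcodec's headers). */
#define PADDING_SIZE 64

/* Hypothetical helper: copy `size` payload bytes into a fresh buffer that
 * has PADDING_SIZE zeroed bytes after the payload. Returns NULL on failure;
 * the caller frees the result with free(). */
static uint8_t *alloc_padded(const uint8_t *src, size_t size)
{
    if (size > SIZE_MAX - PADDING_SIZE)
        return NULL;
    uint8_t *buf = malloc(size + PADDING_SIZE);
    if (!buf)
        return NULL;
    if (size)
        memcpy(buf, src, size);
    memset(buf + size, 0, PADDING_SIZE);
    return buf;
}

/* Hypothetical check: a load of `width` bytes starting at payload offset
 * `offset` stays inside the allocation if it ends no later than
 * size + PADDING_SIZE. */
static int load_is_safe(size_t size, size_t offset, size_t width)
{
    return offset < size && width <= size + PADDING_SIZE - offset;
}
```

With 64 bytes of padding, a 64-byte load that starts at the last payload byte still ends inside the allocation, which is the case the message anticipates for zmm registers.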
* x86/lossless_videodsp: rename ff_add_left_pred_int16_sse4 to ff_add_left_pred_int16_unaligned_ssse3 (James Almer, 2017-12-10)
|   SSSE3_FAST is the proper check for it.
|   Signed-off-by: James Almer <jamrial@gmail.com>
* x86/lossless_videodsp: don't overread the dst buffer in ff_add_left_pred_unaligned_avx2 (James Almer, 2017-12-10)
|   Fixes valgrind
|   Signed-off-by: James Almer <jamrial@gmail.com>
* avcodec/utvideodec : add SIMD (SSSE3 and AVX2) for gradient_pred (Martin Vignali, 2017-12-09)
* avcodec/x86/lossless_videodsp : add avx2 version for add_left_pred (Martin Vignali, 2017-12-09)
* avcodec/x86/lossless_videodsp.asm : make macro for add_left_pred_unaligned in order to add avx2 version (Martin Vignali, 2017-12-09)
* avcodec/x86/bswapdsp : use macro for 128 bits constants loading in xmm or ymm (Martin Vignali, 2017-12-02)
* avcodec/fft: fix INTERL macro on 3dnow (Mikulas Patocka, 2017-11-25)
|   The commit b7c16a3f2c4921f613319938b8ee0e3d6fa83e8d ("x86: fft: Port to cpuflags") breaks the opus decoder in ffmpeg when compiling for 3dnow. The output is audible, but there's a lot of noise. The reason for the breakage is that the commit unintentionally changed the INTERL macro so that it is empty when compiling for 3dnow.
|   This patch fixes it.
|   Signed-off-by: Mikulas Patocka <mikulas@twibright.com>
|   Signed-off-by: James Almer <jamrial@gmail.com>
* avcodec/x86/exrdsp : use ymm constant for pb_80 (Martin Vignali, 2017-11-23)
|   speed seems to be similar, but simplify code
* x86/utvideodsp: reuse shared constants (James Almer, 2017-11-21)
|   Remove the broadcast instructions as well now that they are wide enough.
|   Signed-off-by: James Almer <jamrial@gmail.com>
* x86/constants: make pb_80 32 byte wide (James Almer, 2017-11-21)
|   Signed-off-by: James Almer <jamrial@gmail.com>
* avcodec/huffyuvdspenc : add diff_int16 AVX2 func (Martin Vignali, 2017-11-21)
* avcodec/huffyuvdspenc : reorganize diff_int16 (Martin Vignali, 2017-11-21)
* avcodec/huffyuvdsp : add add_int16 AVX2 func (Martin Vignali, 2017-11-21)
* avcodec/huffyuvdsp : reorganize add_int16 asm (Martin Vignali, 2017-11-21)
* avcodec/huffyuvdsp(enc) : move duplicate macro to a template file (Martin Vignali, 2017-11-21)
* avcodec/x86/utvideodsp.asm : cosmetic (Martin Vignali, 2017-11-21)
|   better func separator and add comment for the restore rgb planes10 declaration
* avcodec/utvideodsp : add avx2 version for the dsp (Martin Vignali, 2017-11-21)
* avcodec/x86/utvideodsp : make macro for func (Martin Vignali, 2017-11-21)
* x86/jpeg2000dsp: add ff_ict_float_{fma3,fma4} (James Almer, 2017-11-20)
|   jpeg2000_ict_float_c:    2296.0
|   jpeg2000_ict_float_sse:   628.0
|   jpeg2000_ict_float_avx:   317.0
|   jpeg2000_ict_float_fma3:  262.0
|   Signed-off-by: James Almer <jamrial@gmail.com>
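To read the figures in this message as relative gains, a small helper is enough. This is a sketch under two assumptions that the log does not state: that the numbers are timings where lower is better, and that all four were measured the same way. The function name `speedup` is hypothetical.

```c
/* Ratio of a baseline timing to an optimized timing; values above 1.0 mean
 * the optimized version is faster. Returns 0.0 for a non-positive divisor. */
static double speedup(double baseline, double optimized)
{
    return optimized > 0.0 ? baseline / optimized : 0.0;
}
```

Under those assumptions, `speedup(2296.0, 262.0)` puts the fma3 version at roughly 8.8 times the C version, and `speedup(317.0, 262.0)` puts it at roughly 1.2 times the avx version.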
* avcodec/x86/mpegvideodsp: Fix signedness bug in need_emu (Michael Niedermayer, 2017-11-14)
|   Fixes: out of array read
|   Fixes: 3516/attachment-311488.dat
|   Found-by: Insu Yun, Georgia Tech.
|   Tested-by: wuninsu@gmail.com
|   Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
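The log does not show the change itself. As an illustration of the class of bug the subject line names, the sketch below shows how a bounds check that mixes signed and unsigned values can silently pass when the block is larger than the picture. The function names and the parameters `ix`, `width`, and `w` are hypothetical and are not taken from the patch.

```c
/* Buggy form: when width < w, (width - w) is negative and converts to a
 * huge unsigned value, so the comparison is false and the out-of-range
 * case is not flagged. */
static int needs_emu_buggy(int ix, int width, int w)
{
    return (unsigned)ix >= (unsigned)(width - w);
}

/* Safer form: reject width < w explicitly before relying on the unsigned
 * comparison, which then also catches negative ix. */
static int needs_emu_fixed(int ix, int width, int w)
{
    return width < w || (unsigned)ix >= (unsigned)(width - w);
}
```

For `ix = 0`, `width = 4`, `w = 8` the first version reports that no edge emulation is needed, while the second one flags it.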
* Fix missing used attribute for inline assembly variables (Thomas Köppe, 2017-11-13)
|   Variables used in inline assembly need to be marked with attribute((used)). Static constants already were, via the define of DECLARE_ASM_CONST. But DECLARE_ALIGNED does not add this attribute, and some of the variables defined with it are const only used in inline assembly, and therefore appeared dead.
|   This change adds a macro DECLARE_ASM_ALIGNED that marks variables as used.
|   This change makes FFMPEG work with Clang's ThinLTO.
|   Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
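A minimal sketch of the idea, assuming GCC or Clang attribute syntax: the `used` attribute tells the compiler to keep a symbol even when no C code references it, which matters when the only reference is a name inside an inline assembly string. `SKETCH_ASM_ALIGNED`, `sketch_pw_1`, and `sketch_is_aligned` are hypothetical names for illustration, not the macro or constants from the commit.

```c
#include <stdint.h>

/* Hypothetical macro in the spirit of DECLARE_ASM_ALIGNED; assumes GCC or
 * Clang. `used` keeps the symbol alive without a C-level reference, and
 * `aligned(n)` sets its alignment. */
#define SKETCH_ASM_ALIGNED(n, t, v) t __attribute__((used, aligned(n))) v

/* A constant meant to be referenced only by name from inline assembly. */
SKETCH_ASM_ALIGNED(16, static const uint64_t, sketch_pw_1)[2] = {
    0x0001000100010001ULL, 0x0001000100010001ULL
};

/* Helper for checking the resulting alignment from C. */
static int sketch_is_aligned(const void *p, uintptr_t n)
{
    return ((uintptr_t)p % n) == 0;
}
```

Without `used`, an optimizer that cannot see into the assembly string may treat such a constant as dead, which is the situation the message describes for link-time optimization.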
* libavcodec/lossless_video_dsp : cosmetic add better separator for each function, in order to make reading of the asm file easier (Martin Vignali, 2017-11-07)
* libavcodec/lossless_videodsp : add add_bytes avx2 version (Martin Vignali, 2017-11-07)
* x86/bswapdsp: add missing preprocessor wrappers for AVX2 functions (James Almer, 2017-10-29)
|   Fixes build with old nasm/yasm.
|   Signed-off-by: James Almer <jamrial@gmail.com>
* libavcodec/bswapdsp : add AVX2 func for bswap_buf (swap uint32_t) (Martin Vignali, 2017-10-29)
* Merge commit '681a86aba6cb09b98ad716d986182060c7795d20' (James Almer, 2017-10-21)
|\
| |   * commit '681a86aba6cb09b98ad716d986182060c7795d20':
| |     x86: fft: Port to cpuflags
| |   Merged-by: James Almer <jamrial@gmail.com>
| * x86: fft: Port to cpuflags (Diego Biurrun, 2017-03-14)
* | Merge commit 'e9bb77fb1012cba1951a82136df7071f71bce8fb' (James Almer, 2017-10-21)
|\|
| |   * commit 'e9bb77fb1012cba1951a82136df7071f71bce8fb':
| |     x86: h264: Simplify DEQUANT macro with cpuflags
| |   Merged-by: James Almer <jamrial@gmail.com>
| * x86: h264: Simplify DEQUANT macro with cpuflags (Diego Biurrun, 2017-03-14)
* | Merge commit '307eb1a8ee363db1fcf869e427a8deb6d9538881' (James Almer, 2017-10-21)
|\|
| |   * commit '307eb1a8ee363db1fcf869e427a8deb6d9538881':
| |     x86: vp8dsp: port FILTER_BILINEAR macro to cpuflags
| |   Merged-by: James Almer <jamrial@gmail.com>
| * x86: vp8dsp: port FILTER_BILINEAR macro to cpuflags (Diego Biurrun, 2017-03-14)
* | Merge commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2' (James Almer, 2017-10-21)
|\|
| |   * commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2':
| |     x86util: Port all macros to cpuflags
| |   See d5f8a642f6eb1c6e305c41dabddd0fd36ffb3f77
| |   Merged-by: James Almer <jamrial@gmail.com>
| * x86util: Port all macros to cpuflags (Diego Biurrun, 2017-03-14)
| |   Also do some small cosmetic changes: Drop pointless _MMX suffix from ABSD2 macro name, drop pointless check for MMX support, we always assume MMX is available in our SIMD code, fix spelling.
* | Merge commit '6eef263aca281fb582e1fa3d841ac20ef747a252' (James Almer, 2017-10-12)
|\|
| |   * commit '6eef263aca281fb582e1fa3d841ac20ef747a252':
| |     x86: Merge align directives into SECTION_RODATA declarations where possible
| |   Merged-by: James Almer <jamrial@gmail.com>
| * x86: Merge align directives into SECTION_RODATA declarations where possible (Diego Biurrun, 2017-03-05)
| * build: Generalize yasm/nasm-related variable names (Diego Biurrun, 2017-03-01)
| |   None of them are specific to the YASM assembler.
| * x86: hevc: Add missing colons after assembly labels (Diego Biurrun, 2017-03-01)
| |   This fixes several warnings of the sort
| |   warning: label alone on a line without a colon might be in error
* | x86/blockdsp: use three operand form for an instruction (James Almer, 2017-10-04)
| |   Fixes assembling with old yasm.
* | avcodec/x86/lossless_videoencdsp: Fix warning: signed dword value exceeds bounds (Michael Niedermayer, 2017-10-05)
| |   Add () to regsize define
| |   Suggested-by: Henrik Gramner <henrik@gramner.com>
| |   Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* | avcodec/x86/lossless_videoencdsp: Fix handling of small widths (Michael Niedermayer, 2017-10-05)
| |   Fixes out of array access
| |   Fixes: crash-huf.avi
| |   Regression since: 6b41b4414934cc930468ccd5db598dd6ef643987
| |   This could also be fixed by adding checks in the C code that calls the dsp
| |   Found-by: Zhibin Hu and 连一汉 <lianyihan@360.cn>
| |   Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* | libavcodec/blockdsp : add AVX version (Martin Vignali, 2017-10-03)
| |   Also modify the required alignment, to 32 instead of 16 for several codecs
| |   Signed-off-by: James Almer <jamrial@gmail.com>
* | libavcodec/exr : add x86 SIMD for predictor (Martin Vignali, 2017-10-01)
| |   Signed-off-by: James Almer <jamrial@gmail.com>
* | Merge commit '7abdd026df6a9a52d07d8174505b33cc89db7bf6' (James Almer, 2017-09-26)
|\|
| |   * commit '7abdd026df6a9a52d07d8174505b33cc89db7bf6':
| |     asm: Consistently uppercase SECTION markers
| |   Merged-by: James Almer <jamrial@gmail.com>