path: root/libavfilter/x86
Commit message  Author  Age
* Merge commit '6e9f8d6a7d7392a236df19fef6f4eba41f18167e'  Michael Niedermayer  2013-05-09
|\
| |   * commit '6e9f8d6a7d7392a236df19fef6f4eba41f18167e':
| |     x86: vf_yadif: Remove stray dsputil_mmx #include
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: vf_yadif: Remove stray dsputil_mmx #include  Diego Biurrun  2013-05-08
| |
* | Merge commit '093804a93cc5da3f95f98265a5df116912443cec'  Michael Niedermayer  2013-05-05
|\|
| |   * commit '093804a93cc5da3f95f98265a5df116912443cec':
| |     avfilter: Add av_cold attributes to init/uninit functions
| |
| |   Conflicts:
| |     libavfilter/af_ashowinfo.c
| |     libavfilter/af_volume.c
| |     libavfilter/src_movie.c
| |     libavfilter/vf_lut.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * avfilter: Add av_cold attributes to init/uninit functions  Diego Biurrun  2013-05-04
| |
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2013-04-23
|\|
| |   * qatar/master:
| |     x86: Move some conditional code around to avoid unused variable warnings
| |
| |   Conflicts:
| |     libavcodec/x86/dsputil_mmx.c
| |     libavfilter/x86/vf_yadif_init.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: Move some conditional code around to avoid unused variable warnings  Diego Biurrun  2013-04-22
| |
| * lavfi/gradfun: remove rounding to match C and SSE code.  Clément Bœsch  2013-03-28
| |   There is no noticeable benefit for such precision.
| |
| |   Signed-off-by: Anton Khirnov <anton@khirnov.net>
| * lavfi/gradfun: fix dithering in MMX code.  Clément Bœsch  2013-03-28
| |   Current dithering only uses the first 4 instead of the whole 8 random
| |   values.
| |
| |   Signed-off-by: Anton Khirnov <anton@khirnov.net>
| * lavfi/gradfun: fix rounding in MMX code.  Clément Bœsch  2013-03-28
| |   Current code divides before increasing precision.
| |
| |   Also reduce upper bound for strength from 255 to 64. This will prevent
| |   an overflow in the SSSE3 and MMX filter_line code: delta is expressed as
| |   a u16 being shifted by 2 to the left. If it overflows, having a strength
| |   not above 64 will make sure that m is set to 0 (making the
| |   m*m*delta >> 14 expression void).
| |
| |   A value above 64 should not make any sense unless gradfun is used as a
| |   blur filter.
| |
| |   Signed-off-by: Anton Khirnov <anton@khirnov.net>
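The overflow argument in the message above can be checked with a little arithmetic. The following is a minimal sketch, not the actual filter code: the weight formula (`127 - (|delta| * thresh >> 16)`, clamped at 0), the `(1 << 15) / strength` scaling, and all function names are assumptions made for illustration. It also reads "overflows" as |delta| reaching 2^14, the point where a 16-bit value shifted left by 2 no longer fits.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: threshold derived from the user-facing strength.
 * The (1 << 15) / strength scaling is an assumption of this sketch. */
static int thresh_from_strength(double strength)
{
    assert(strength > 0.0);
    return (int)((1 << 15) / strength);
}

/* Weight m for one pixel: shrinks as |delta| grows and is clamped at 0.
 * The result would then feed the m*m*delta >> 14 expression. */
static int gradfun_weight(int delta, int thresh)
{
    int m = 127 - (abs(delta) * thresh >> 16);
    return m > 0 ? m : 0;
}

/* Smallest |delta| for which a 16-bit unsigned (|delta| << 2) would wrap. */
enum { DELTA_OVERFLOW = 1 << 14 };
```

Under these assumptions, strength 64 gives a threshold of 512, and any |delta| at or above `DELTA_OVERFLOW` yields a weight of 0, so the wrapped product is multiplied by zero. With strength 255 the threshold drops to 128 and the same delta still gets a non-zero weight, which is the case the reduced upper bound rules out.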
| * hqdn3d: Fix out of array read in LOWPASS  Loren Merritt  2013-03-13
| |   CC:libav-stable@libav.org
| |   Signed-off-by: Anton Khirnov <anton@khirnov.net>
* | yadif: remove an 'm' from the LOAD macro definition  James Darnley  2013-03-16
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | yadif: remove repeated check on width  James Darnley  2013-03-16
| |   The filter already checks that width (and height) are greater than 3.
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | yadif: cosmetic indentation from previous commits  James Darnley  2013-03-16
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | yadif: x86 assembly for 9 to 14-bit samples  James Darnley  2013-03-16
| |   These smaller samples do not need to be unpacked to double words
| |   allowing the code to process more pixels every iteration (still 2 in MMX
| |   but 6 in SSE2). It also avoids emulating the missing double word
| |   instructions on older instruction sets.
| |
| |   Like with the previous code for 16-bit samples this has been tested on
| |   an Athlon64 and a Core2Quad.
| |
| |   Athlon64:
| |   1809275 decicycles in C,     32718 runs, 50 skips
| |    911675 decicycles in mmx,   32727 runs, 41 skips, 2.0x faster
| |    495284 decicycles in sse2,  32747 runs, 21 skips, 3.7x faster
| |
| |   Core2Quad:
| |    921363 decicycles in C,     32756 runs, 12 skips
| |    486537 decicycles in mmx,   32764 runs,  4 skips, 1.9x faster
| |    293296 decicycles in sse2,  32759 runs,  9 skips, 3.1x faster
| |    284910 decicycles in ssse3, 32759 runs,  9 skips, 3.2x faster
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
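The "x faster" figures in the benchmark above appear to be the C baseline divided by the optimized cost, rounded to one decimal. A minimal sketch of that arithmetic follows; the function name is hypothetical and the rounding convention is an assumption.

```c
#include <assert.h>

/* Speedup of an optimized routine relative to a baseline, given the
 * average cost of each in decicycles (lower cost means faster code). */
static double speedup(double baseline_decicycles, double optimized_decicycles)
{
    assert(optimized_decicycles > 0.0);
    return baseline_decicycles / optimized_decicycles;
}
```

For the Athlon64 numbers, `speedup(1809275, 495284)` is roughly 3.65, consistent with the reported 3.7x for sse2. For the Core2Quad, `speedup(921363, 284910)` is roughly 3.23, consistent with the reported 3.2x for ssse3.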
* | yadif: x86 assembly for 16-bit samples  James Darnley  2013-03-16
| |   This is a fairly dumb copy of the assembly for 8-bit samples but it
| |   works and produces identical output to the C version. The options have
| |   been tested on an Athlon64 and a Core2Quad.
| |
| |   Athlon64:
| |   1810385 decicycles in C,     32726 runs, 42 skips
| |   1080744 decicycles in mmx,   32744 runs, 24 skips, 1.7x faster
| |    818315 decicycles in sse2,  32735 runs, 33 skips, 2.2x faster
| |
| |   Core2Quad:
| |    924025 decicycles in C,     32750 runs, 18 skips
| |    623995 decicycles in mmx,   32767 runs,  1 skips, 1.5x faster
| |    406223 decicycles in sse2,  32764 runs,  4 skips, 2.3x faster
| |    387842 decicycles in ssse3, 32767 runs,  1 skips, 2.4x faster
| |    307726 decicycles in sse4,  32763 runs,  5 skips, 3.0x faster
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | yadif: restore speed of the C filtering code  James Darnley  2013-03-13
| |   Always use the special filter for the first and last 3 columns (only).
| |
| |   Changes made in 64ed397 slowed the filter to just under 3/4 of what it
| |   was. This commit restores the speed while maintaining identical output.
| |
| |   For reference, on my Athlon64:
| |   1733222 decicycles in old
| |   2358563 decicycles in new
| |   1727558 decicycles in this
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
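The idea of confining the slower, bounds-safe filter to the first and last 3 columns can be sketched generically. The code below is not yadif's filter: it uses a simple 7-tap sum as a stand-in, and every name in it is hypothetical. It only illustrates how splitting a row into edge and interior regions keeps the output identical while letting the interior skip the bounds checks.

```c
#include <assert.h>
#include <stdint.h>

enum { EDGE = 3 };  /* columns handled by the bounds-safe path on each side */

/* Bounds-safe path: clamps every tap to [0, w - 1]. Slower, always valid. */
static void filter_edge(uint16_t *dst, const uint8_t *src, int w, int from, int to)
{
    for (int x = from; x < to; x++) {
        int sum = 0;
        for (int k = -EDGE; k <= EDGE; k++) {
            int i = x + k;
            if (i < 0)  i = 0;
            if (i >= w) i = w - 1;
            sum += src[i];
        }
        dst[x] = (uint16_t)sum;
    }
}

/* Fast path: no clamping, so callers must keep x within [EDGE, w - EDGE). */
static void filter_interior(uint16_t *dst, const uint8_t *src, int from, int to)
{
    for (int x = from; x < to; x++) {
        int sum = 0;
        for (int k = -EDGE; k <= EDGE; k++)
            sum += src[x + k];
        dst[x] = (uint16_t)sum;
    }
}

/* Edge path for the first and last EDGE columns only, fast path in between. */
static void filter_row(uint16_t *dst, const uint8_t *src, int w)
{
    assert(w > 2 * EDGE);
    filter_edge(dst, src, w, 0, EDGE);
    filter_interior(dst, src, EDGE, w - EDGE);
    filter_edge(dst, src, w, w - EDGE, w);
}
```

Running `filter_edge` over a whole row gives the same result as `filter_row`, since clamping never triggers in the interior. The split only removes per-tap checks from the columns that do not need them.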
* | Merge commit '64ed397635ef2666b0ca0c8d8c60a8bc44581d82'  Michael Niedermayer  2013-02-16
|\|
| |   * commit '64ed397635ef2666b0ca0c8d8c60a8bc44581d82':
| |     vf_yadif: fix out-of line reads
| |
| |   Conflicts:
| |     libavfilter/vf_yadif.c
| |     tests/ref/fate/filter-yadif-mode0
| |     tests/ref/fate/filter-yadif-mode1
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * vf_yadif: fix out-of line reads  Anton Khirnov  2013-02-15
| |   Some changes in the border pixels, visually indistinguishable.
* | Merge commit '238614de679a71970c20d7c3fee08a322967ec40'  Michael Niedermayer  2013-02-06
|\|
| |   * commit '238614de679a71970c20d7c3fee08a322967ec40':
| |     cdgraphics: do not rely on get_buffer() initializing the frame.
| |     svq1: replace struct svq1_frame_size with an array.
| |     vf_yadif: silence a warning.
| |
| |   Conflicts:
| |     libavcodec/svq1dec.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * vf_yadif: silence a warning.  Anton Khirnov  2013-02-06
| |   clang says:
| |   libavfilter/vf_yadif.c:192:28: warning: incompatible pointer types assigning to
| |         'void (*)(uint8_t *, uint8_t *, uint8_t *, uint8_t *, int, int, int, int, int)' from
| |         'void (uint16_t *, uint16_t *, uint16_t *, uint16_t *, int, int, int, int, int)'
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2013-02-05
|\|
| |   * qatar/master:
| |     avfilter: x86: consistent filenames for filter optimizations
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * avfilter: x86: consistent filenames for filter optimizations  Diego Biurrun  2013-02-04
| |
* | avfilter/x86/vf_hqdn3d_init: fix author attribution & project name  Michael Niedermayer  2013-02-02
| |   Reference: 7a1944b907179212e7073b821fdc60e27d536e4a
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2013-02-02
|\|
| |   * qatar/master:
| |     vf_hqdn3d: x86: Add proper arch optimization initialization
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * vf_hqdn3d: x86: Add proper arch optimization initialization  Diego Biurrun  2013-02-01
| |
* | Merge commit 'a1c525f7eb0783d31ba7a653865b6cbd3dc880de'  Michael Niedermayer  2013-01-14
|\|
| |   * commit 'a1c525f7eb0783d31ba7a653865b6cbd3dc880de':
| |     pcx: return meaningful error codes.
| |     tmv: return meaningful error codes.
| |     msrle: return meaningful error codes.
| |     cscd: return meaningful error codes.
| |     yadif: x86: fix build for compilers without aligned stack
| |     lavc: introduce the convenience function init_get_bits8
| |     lavc: check for overflow in init_get_bits
| |
| |   Conflicts:
| |     libavcodec/cscd.c
| |     libavcodec/pcx.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * yadif: x86: fix build for compilers without aligned stack  Daniel Kang  2013-01-14
| |   Manually load registers to avoid using 8 registers on x86_32 with
| |   compilers that do not align the stack (e.g. MSVC).
| |
| |   Signed-off-by: Diego Biurrun <diego@biurrun.de>
* | Merge commit 'f7bf72a4a1146a7583577c9bdc066767e1ba3c6a'  Michael Niedermayer  2013-01-10
|\|
| |   * commit 'f7bf72a4a1146a7583577c9bdc066767e1ba3c6a':
| |     idcinvideo: correctly set AVFrame defaults
| |     yadif: Port inline assembly to yasm
| |     au: remove unnecessary casts
| |     au: return AVERROR codes instead of -1
| |
| |   Conflicts:
| |     libavcodec/idcinvideo.c
| |     libavfilter/x86/yadif_template.c
| |     libavformat/au.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * yadif: Port inline assembly to yasm  Daniel Kang  2013-01-09
| |   Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* | lavfi/gradfun: remove rounding to match C and SSE code.  Clément Bœsch  2012-12-19
| |   There is no noticeable benefit for such precision.
* | lavfi/gradfun: fix dithering in MMX code.  Clément Bœsch  2012-12-19
| |   Current dithering only uses the first 4 instead of the whole 8 random
| |   values.
* | lavfi/gradfun: fix rounding in MMX code.  Clément Bœsch  2012-12-19
| |   Current code divides before increasing precision.
* | Fix compilation with yasm 0.6.2.  Carl Eugen Hoyos  2012-12-07
| |
* | Merge commit 'b519298a1578e0c895d53d4b4ed8867b1c031a56'  Michael Niedermayer  2012-12-06
|\|
| |   * commit 'b519298a1578e0c895d53d4b4ed8867b1c031a56':
| |     pixdesc: fix yuva 10bit bit depth
| |     avconv: deprecate the -vol option
| |     x86: af_volume: add SSE2/SSSE3/AVX-optimized s32 volume scaling
| |     x86: af_volume: add SSE2-optimized s16 volume scaling
| |
| |   Conflicts:
| |     ffmpeg.c
| |     tests/ref/lavfi/pixdesc
| |     tests/ref/lavfi/pixfmts_copy
| |     tests/ref/lavfi/pixfmts_null
| |     tests/ref/lavfi/pixfmts_scale
| |     tests/ref/lavfi/pixfmts_vflip
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: af_volume: add SSE2/SSSE3/AVX-optimized s32 volume scaling  Justin Ruggles  2012-12-05
| |
| * x86: af_volume: add SSE2-optimized s16 volume scaling  Justin Ruggles  2012-12-05
| |
* | Merge commit 'fa8fcab1e0d31074c0644c4ac5194474c6c26415'  Michael Niedermayer  2012-11-01
|\|
| |   * commit 'fa8fcab1e0d31074c0644c4ac5194474c6c26415':
| |     x86: h264_chromamc_10bit: drop pointless PAVG %define
| |     x86: mmx2 ---> mmxext in function names
| |     swscale: do not forget to swap data in formats with different endianness
| |
| |   Conflicts:
| |     libavcodec/x86/dsputil_mmx.c
| |     libavfilter/x86/gradfun.c
| |     libswscale/input.c
| |     libswscale/utils.c
| |     libswscale/x86/swscale.c
| |     tests/ref/lavfi/pixfmts_scale
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: mmx2 ---> mmxext in function names  Diego Biurrun  2012-10-31
| |
* | Merge commit '04581c8c77ce779e4e70684ac45302972766be0f'  Michael Niedermayer  2012-10-31
|\|
| |   * commit '04581c8c77ce779e4e70684ac45302972766be0f':
| |     x86: yasm: Use complete source path for macro helper %includes
| |
| |   Conflicts:
| |     Makefile
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: yasm: Use complete source path for macro helper %includes  Diego Biurrun  2012-10-31
| |   This is more consistent with the way we handle C #includes and
| |   it simplifies the build system.
* | Merge commit '6860b4081d046558c44b1b42f22022ea341a2a73'  Michael Niedermayer  2012-10-31
|\|
| |   * commit '6860b4081d046558c44b1b42f22022ea341a2a73':
| |     x86: include x86inc.asm in x86util.asm
| |     cng: Reindent some incorrectly indented lines
| |     cngdec: Allow flushing the decoder
| |     cngdec: Make the dbov variable have the right unit
| |     cngdec: Fix the memset size to cover the full array
| |     cngdec: Update the LPC coefficients after averaging the reflection coefficients
| |     configure: fix print_config() with broke awks
| |
| |   Conflicts:
| |     libavcodec/x86/ac3dsp.asm
| |     libavcodec/x86/dct32.asm
| |     libavcodec/x86/deinterlace.asm
| |     libavcodec/x86/dsputil.asm
| |     libavcodec/x86/dsputilenc.asm
| |     libavcodec/x86/fft.asm
| |     libavcodec/x86/fmtconvert.asm
| |     libavcodec/x86/h264_chromamc.asm
| |     libavcodec/x86/h264_deblock.asm
| |     libavcodec/x86/h264_deblock_10bit.asm
| |     libavcodec/x86/h264_idct.asm
| |     libavcodec/x86/h264_idct_10bit.asm
| |     libavcodec/x86/h264_intrapred.asm
| |     libavcodec/x86/h264_intrapred_10bit.asm
| |     libavcodec/x86/h264_weight.asm
| |     libavcodec/x86/vc1dsp.asm
| |     libavcodec/x86/vp3dsp.asm
| |     libavcodec/x86/vp56dsp.asm
| |     libavcodec/x86/vp8dsp.asm
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: include x86inc.asm in x86util.asm  Diego Biurrun  2012-10-31
| |   This is necessary to allow refactoring some x86util macros with cpuflags.
* | Merge commit 'f6c38c5f4ed6683a6a61db2ed418a68bbe5f5507'  Michael Niedermayer  2012-10-13
|\|
| |   * commit 'f6c38c5f4ed6683a6a61db2ed418a68bbe5f5507':
| |     avfilter: call x86 init functions under if (ARCH_X86), not if (HAVE_MMX)
| |     rtspdec: Set the default port for listen mode, if none is specified
| |     tscc2: Fix an out of array access
| |     rtmpproto: Fix an out of array write
| |     rtspdec: Fix use of uninitialized byte
| |     vp8: reset loopfilter delta values at keyframes.
| |     avutil: add yuva422p and yuva444p formats
| |
| |   Conflicts:
| |     libavutil/pixdesc.c
| |     libavutil/pixfmt.h
| |     tests/ref/lavfi/pixdesc
| |     tests/ref/lavfi/pixfmts_copy
| |     tests/ref/lavfi/pixfmts_null
| |     tests/ref/lavfi/pixfmts_scale
| |     tests/ref/lavfi/pixfmts_vflip
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * avfilter: call x86 init functions under if (ARCH_X86), not if (HAVE_MMX)  Diego Biurrun  2012-10-12
| |
* | hqdn3d: Fix out of array read in LOWPASS  Loren Merritt  2012-09-22
| |   Fixes ticket1752
| |
| |   Commit message by committer
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2012-08-31
|\|
| |   * qatar/master:
| |     MSS1 and MSS2: set final pixel format after common stuff has been initialised
| |     MSS2 decoder
| |     configure: handle --disable-asm before check_deps
| |     x86: Split inline and external assembly #ifdefs
| |     configure: x86: Separate inline from standalone assembler capabilities
| |     pktdumper: Use a custom define instead of PATH_MAX for buffers
| |     pktdumper: Use av_strlcpy instead of strncpy
| |     pktdumper: Use sizeof(variable) instead of the direct buffer length
| |
| |   Conflicts:
| |     Changelog
| |     configure
| |     libavcodec/allcodecs.c
| |     libavcodec/avcodec.h
| |     libavcodec/codec_desc.c
| |     libavcodec/dct-test.c
| |     libavcodec/imgconvert.c
| |     libavcodec/mss12.c
| |     libavcodec/version.h
| |     libavfilter/x86/gradfun.c
| |     libswscale/x86/yuv2rgb.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: Split inline and external assembly #ifdefs  Diego Biurrun  2012-08-31
| |
* | Merge commit 'ec36aa69448f20a78d8c4588265022e0b2272ab5'  Michael Niedermayer  2012-08-31
|\|
| |   * commit 'ec36aa69448f20a78d8c4588265022e0b2272ab5':
| |     x86: Fix linking with some or all of yasm, mmx, optimizations disabled
| |     configure: Add more fine-grained SSE CPU capabilities flags
| |     avfilter: x86: Use more precise compile template names
| |     x86: cosmetics: Comment some #endifs for better readability
| |     g723_1: add comfort noise generation
| |     utvideoenc: Switch to dsputils' median prediction
| |     utvideoenc: Avoid writing into the input picture
| |     avtools: remove the distinction between func_arg and func2_arg.
| |     avconv: make the -passlogfile option per-stream.
| |     avconv: make the -pass option per-stream.
| |     cmdutils: make -codecs print lossy/lossless flags.
| |     lavc: add lossy/lossless codec properties.
| |
| |   Conflicts:
| |     Changelog
| |     cmdutils.c
| |     configure
| |     doc/APIchanges
| |     ffmpeg.h
| |     ffmpeg_opt.c
| |     ffprobe.c
| |     libavcodec/codec_desc.c
| |     libavcodec/g723_1.c
| |     libavcodec/utvideoenc.c
| |     libavcodec/version.h
| |     libavcodec/x86/mpegaudiodec.c
| |     libavcodec/x86/rv40dsp_init.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * avfilter: x86: Use more precise compile template names  Diego Biurrun  2012-08-30
| |
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2012-08-26
|\|
| |   * qatar/master:
| |     audio_frame_queue: Clean up ff_af_queue_log_state debug function
| |     dwt: Remove unused code.
| |     cavs: convert cavsdata.h to a .c file
| |     cavs: Move inline functions only used in one file out of the header
| |     cavs: Move data tables used in only one place to that file
| |     fate: Add a single symbol Ut Video decoder test
| |     vf_hqdn3d: x86 asm
| |     vf_hqdn3d: support 16bit colordepth
| |     avconv: prefer user-forced input framerate when choosing output framerate
| |
| |   Conflicts:
| |     ffmpeg.c
| |     libavcodec/audio_frame_queue.c
| |     libavcodec/dwt.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>