path: root/libswscale/x86/scale.asm
Commit message (Author, Age)
* swscale/x86/swscale: Remove obsolete and harmful MMX(EXT) functions (Andreas Rheinhardt, 2022-06-22)
|
| x64 always has MMX, MMXEXT, SSE and SSE2, which means that some functions for MMX, MMXEXT, SSE and 3dnow are always overridden by other functions (unless one e.g. explicitly disables SSE2). Given that the only systems that benefit from these functions are truly ancient 32-bit x86s, they are removed.
|
| Moreover, some of the removed code was buggy/not bitexact and led to failures involving the f32le and f32be versions of gray, gbrp and gbrap on x86-32 when SSE2 was not disabled. See e.g. https://fate.ffmpeg.org/report.cgi?time=20220609221253&slot=x86_32-debian-kfreebsd-gcc-4.4-cpuflags-mmx
|
| Notice that yuv2yuvX_mmx is not removed, because it is used by SSE3 and AVX2 as fallback in case of unaligned data and also for tail processing. I don't know why yuv2yuvX_mmxext isn't being used for this; an earlier version [1] of 554c2bc7086f49ef5a6a989ad6bc4bc11807eb6f used it, but the version that was eventually applied does not.
|
| [1]: https://ffmpeg.org/pipermail/ffmpeg-devel/2020-November/272124.html
|
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* Merge commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2' (James Almer, 2017-10-21)
|\
| | * commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2':
| |   x86util: Port all macros to cpuflags
| |
| | See d5f8a642f6eb1c6e305c41dabddd0fd36ffb3f77
| |
| | Merged-by: James Almer <jamrial@gmail.com>
| * x86util: Port all macros to cpuflags (Diego Biurrun, 2017-03-14)
| |
| | Also do some small cosmetic changes: drop the pointless _MMX suffix from the ABSD2 macro name, drop the pointless check for MMX support (we always assume MMX is available in our SIMD code), and fix spelling.
| * swscale: x86: Add some forgotten 12-bit planar YUV cases (Michael Niedermayer, 2016-10-12)
| |
| | Signed-off-by: Diego Biurrun <diego@biurrun.de>
* | Merge commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb' (Clément Bœsch, 2016-06-21)
|\|
| | * commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb':
| |   cosmetics: Fix spelling mistakes
| |
| | Merged-by: Clément Bœsch <u@pkh.me>
| * cosmetics: Fix spelling mistakes (Vittorio Giovara, 2016-05-04)
| |
| | Signed-off-by: Diego Biurrun <diego@biurrun.de>
* | x86/scale: fix xmm register count for hscale*_sse2 (James Almer, 2014-06-09)
| |
| | xmm6 was being clobbered in ff_hscale8to{15,19}_8_sse2 on Win64
| |
| | Signed-off-by: James Almer <jamrial@gmail.com>
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Reinstate proper FFmpeg license for all files. (Thilo Borgmann, 2013-08-30)
| |
* | Merge commit '04581c8c77ce779e4e70684ac45302972766be0f' (Michael Niedermayer, 2012-10-31)
|\|
| | * commit '04581c8c77ce779e4e70684ac45302972766be0f':
| |   x86: yasm: Use complete source path for macro helper %includes
| |
| | Conflicts:
| |   Makefile
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: yasm: Use complete source path for macro helper %includes (Diego Biurrun, 2012-10-31)
| |
| | This is more consistent with the way we handle C #includes and it simplifies the build system.
* | Merge commit '6860b4081d046558c44b1b42f22022ea341a2a73' (Michael Niedermayer, 2012-10-31)
|\|
| | * commit '6860b4081d046558c44b1b42f22022ea341a2a73':
| |   x86: include x86inc.asm in x86util.asm
| |   cng: Reindent some incorrectly indented lines
| |   cngdec: Allow flushing the decoder
| |   cngdec: Make the dbov variable have the right unit
| |   cngdec: Fix the memset size to cover the full array
| |   cngdec: Update the LPC coefficients after averaging the reflection coefficients
| |   configure: fix print_config() with broke awks
| |
| | Conflicts:
| |   libavcodec/x86/ac3dsp.asm
| |   libavcodec/x86/dct32.asm
| |   libavcodec/x86/deinterlace.asm
| |   libavcodec/x86/dsputil.asm
| |   libavcodec/x86/dsputilenc.asm
| |   libavcodec/x86/fft.asm
| |   libavcodec/x86/fmtconvert.asm
| |   libavcodec/x86/h264_chromamc.asm
| |   libavcodec/x86/h264_deblock.asm
| |   libavcodec/x86/h264_deblock_10bit.asm
| |   libavcodec/x86/h264_idct.asm
| |   libavcodec/x86/h264_idct_10bit.asm
| |   libavcodec/x86/h264_intrapred.asm
| |   libavcodec/x86/h264_intrapred_10bit.asm
| |   libavcodec/x86/h264_weight.asm
| |   libavcodec/x86/vc1dsp.asm
| |   libavcodec/x86/vp3dsp.asm
| |   libavcodec/x86/vp56dsp.asm
| |   libavcodec/x86/vp8dsp.asm
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: include x86inc.asm in x86util.asm (Diego Biurrun, 2012-10-31)
| |
| | This is necessary to allow refactoring some x86util macros with cpuflags.
* | sws/x86: add some forgotten 12bit planar yuv cases (Michael Niedermayer, 2012-07-05)
| |
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-04-13)
|\|
| | * qatar/master:
| |   libxvid: remove disabled code
| |   qdm2: make a table static const
| |   qdm2: simplify bitstream reader setup for some subpacket types
| |   qdm2: use get_bits_left()
| |   build: Consistently handle conditional compilation for all optimization OBJS.
| |   avpacket, bfi, bgmc, rawenc: K&R prettyprinting cosmetics
| |   msrle: convert MS RLE decoding function to bytestream2.
| |   x86inc improvements for 64-bit
| |
| | Conflicts:
| |   common.mak
| |   libavcodec/avpacket.c
| |   libavcodec/bfi.c
| |   libavcodec/msrledec.c
| |   libavcodec/qdm2.c
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86inc improvements for 64-bit (Henrik Gramner, 2012-04-11)
| |
| | Add support for all x86-64 registers
| | Prefer caller-saved register over callee-saved on WIN64
| | Support up to 15 function arguments
| |
| | Also (by Ronald S. Bultje):
| | Fix up our asm to work with new x86inc.asm.
| |
| | Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| | Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-03-16)
|\|
| | * qatar/master:
| |   dxa: remove useless code
| |   lavf: don't select an attached picture as default stream for seeking.
| |   avconv: remove pointless checks.
| |   avconv: check for get_filtered_frame() failure.
| |   avconv: remove a pointless check.
| |   swscale: convert hscale() to use named arguments.
| |   x86inc: add *mp named argument support to DEFINE_ARGS.
| |   swscale: convert hscale to cpuflags().
| |
| | Conflicts:
| |   ffmpeg.c
| |   libswscale/x86/scale.asm
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * swscale: convert hscale() to use named arguments. (Ronald S. Bultje, 2012-03-14)
| |
| * swscale: convert hscale to cpuflags(). (Ronald S. Bultje, 2012-03-14)
| |
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-03-07)
|\|
| | * qatar/master:
| |   SBR DSP: fix SSE code to not use SSE2 instructions.
| |   cpu: initialize mask to -1, so that by default, optimizations are used.
| |   error_resilience: initialize s->block_index[].
| |   svq3: protect against negative quantizers.
| |   Don't use ff_cropTbl[] for IDCT.
| |   swscale: make filterPos 32bit.
| |   FATE: add CPUFLAGS variable, mapping to -cpuflags avconv option.
| |   avconv: add -cpuflags option for setting supported cpuflags.
| |   cpu: add av_set_cpu_flags_mask().
| |   libx264: Allow overriding the sliced threads option
| |   avconv: fix counting encoded video size.
| |
| | Conflicts:
| |   doc/APIchanges
| |   doc/fate.texi
| |   doc/ffmpeg.texi
| |   ffmpeg.c
| |   libavcodec/h264idct_template.c
| |   libavcodec/svq3.c
| |   libavutil/avutil.h
| |   libavutil/cpu.c
| |   libavutil/cpu.h
| |   libswscale/swscale.c
| |   tests/Makefile
| |   tests/fate-run.sh
| |   tests/regression-funcs.sh
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * swscale: make filterPos 32bit. (Ronald S. Bultje, 2012-03-06)
| |
| | Fixes overflows for large image sizes.
| |
| | Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
| | CC: libav-stable@libav.org
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-01-28)
|\|
| | * qatar/master: (71 commits)
| |   movenc: Allow writing to a non-seekable output if using empty moov
| |   movenc: Support adding isml (smooth streaming live) metadata
| |   libavcodec: Don't crash in avcodec_encode_audio if time_base isn't set
| |   sunrast: Document the different Sun Raster file format types.
| |   sunrast: Add a check for experimental type.
| |   libspeexenc: use AVSampleFormat instead of deprecated/removed SampleFormat
| |   lavf: remove disabled FF_API_SET_PTS_INFO cruft
| |   lavf: remove disabled FF_API_OLD_INTERRUPT_CB cruft
| |   lavf: remove disabled FF_API_REORDER_PRIVATE cruft
| |   lavf: remove disabled FF_API_SEEK_PUBLIC cruft
| |   lavf: remove disabled FF_API_STREAM_COPY cruft
| |   lavf: remove disabled FF_API_PRELOAD cruft
| |   lavf: remove disabled FF_API_NEW_STREAM cruft
| |   lavf: remove disabled FF_API_RTSP_URL_OPTIONS cruft
| |   lavf: remove disabled FF_API_MUXRATE cruft
| |   lavf: remove disabled FF_API_FILESIZE cruft
| |   lavf: remove disabled FF_API_TIMESTAMP cruft
| |   lavf: remove disabled FF_API_LOOP_OUTPUT cruft
| |   lavf: remove disabled FF_API_LOOP_INPUT cruft
| |   lavf: remove disabled FF_API_AVSTREAM_QUALITY cruft
| |   ...
| |
| | Conflicts:
| |   doc/APIchanges
| |   libavcodec/8bps.c
| |   libavcodec/avcodec.h
| |   libavcodec/libx264.c
| |   libavcodec/mjpegbdec.c
| |   libavcodec/options.c
| |   libavcodec/sunrast.c
| |   libavcodec/utils.c
| |   libavcodec/version.h
| |   libavcodec/x86/h264_deblock.asm
| |   libavdevice/libdc1394.c
| |   libavdevice/v4l2.c
| |   libavformat/avformat.h
| |   libavformat/avio.c
| |   libavformat/avio.h
| |   libavformat/aviobuf.c
| |   libavformat/dv.c
| |   libavformat/mov.c
| |   libavformat/utils.c
| |   libavformat/version.h
| |   libavformat/wtv.c
| |   libavutil/Makefile
| |   libavutil/file.c
| |   libswscale/x86/input.asm
| |   libswscale/x86/swscale_mmx.c
| |   libswscale/x86/swscale_template.c
| |   tests/ref/lavf/ffm
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * config.asm: change %ifdef directives to %if directives. (Ronald S. Bultje, 2012-01-27)
| |
| | This allows combining multiple conditionals in a single statement.
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-01-05)
|\|
| | * qatar/master: (46 commits)
| |   mtv: Make sure audio_subsegments is not 0
| |   v4l2: use V4L2_FMT_FLAG_EMULATED only if it is defined
| |   avconv: add symbolic names for -vsync parameters
| |   flvdec: Fix compiler warning for uninitialized variables
| |   rtsp: Fix compiler warning for uninitialized variable
| |   ulti: convert to new bytestream API.
| |   swscale: Use standard multiple inclusion guards in ppc/ header files.
| |   Place some START_TIMER invocations in separate blocks.
| |   v4l2: list available formats
| |   v4l2: set the proper codec_tag
| |   v4l2: refactor device_open
| |   v4l2: simplify away io_method
| |   v4l2: cosmetics
| |   v4l2: uniform and format options
| |   v4l2: do not force interlaced mode
| |   avio: exit early in fill_buffer without read_packet
| |   vc1dec: fix invalid memory access for small video dimensions
| |   rv34: fix invalid memory access for small video dimensions
| |   rv34: joint coefficient decoding and dequantization
| |   avplay: Don't call avio_set_interrupt_cb(NULL)
| |   ...
| |
| | Conflicts:
| |   Changelog
| |   avconv.c
| |   doc/APIchanges
| |   doc/indevs.texi
| |   libavcodec/adxenc.c
| |   libavcodec/dnxhdenc.c
| |   libavcodec/h264.c
| |   libavdevice/v4l2.c
| |   libavformat/flvdec.c
| |   libavformat/mtv.c
| |   libswscale/utils.c
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * swscale: split scale.asm. (Ronald S. Bultje, 2012-01-03)
| |
| | scale.asm keeps horizontal scaling functions, whereas output.asm gets the vertical scaling/output functions.
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2011-12-14)
|\|
| | * qatar/master: (23 commits)
| |   applehttp: Properly clean up if unable to probe a segment
| |   applehttp: Avoid reading uninitialized memory
| |   fate: Replace misleading "aac" in the name of an ADTS test with "adts".
| |   fate: Drop pointless "-an" from pictor test command.
| |   fate: split off image codec FATE tests into their own file
| |   fate: split off WMA codec FATE tests into their own file
| |   fate: split off lossless video and audio FATE tests into their own files
| |   fate: split off qtrle codec FATE tests into their own file
| |   fate: split off Ut Video codec FATE tests into their own file
| |   fate: split off screen codec FATE tests into their own file
| |   fate: split off Real Inc. codec FATE tests into their own file
| |   fate: split off AC-3 codec FATE tests into their own file
| |   mpegvideo: remove abort() in ff_find_unused_picture()
| |   rv40: NEON optimised loop filter strength selection
| |   rv40: rearrange loop filter functions
| |   configure: cosmetics: sort some lists where appropriate
| |   swscale_mmx: drop no longer required parameters from VSCALEX macros
| |   swscale: Mark yuv2planeX_8_mmx as MMX2; it contains MMX2 instructions.
| |   build: conditionally compile x86 H.264 chroma optimizations
| |   v410 encoder and decoder
| |   ...
| |
| | Conflicts:
| |   Changelog
| |   configure
| |   doc/developer.texi
| |   doc/general.texi
| |   libavcodec/arm/asm.S
| |   libavcodec/avcodec.h
| |   libavcodec/v410dec.c
| |   libavcodec/v410enc.c
| |   libavcodec/version.h
| |   libavcodec/x86/Makefile
| |   libavcodec/x86/dsputil_mmx.c
| |   libswscale/x86/swscale_mmx.c
| |   tests/Makefile
| |   tests/fate2.mak
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * swscale: Mark yuv2planeX_8_mmx as MMX2; it contains MMX2 instructions. (Diego Biurrun, 2011-12-14)
| |
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2011-11-14)
|\|
| | * qatar/master:
| |   lavf: pass options from AVFormatContext to avio.
| |   avformat: Use avio_open2, pass the AVFormatContext interrupt_callback onwards
| |   avio: add avio_open2, taking an interrupt callback and options
| |   avio: add support for passing options to protocols.
| |   avio: add and use ffurl_protocol_next().
| |   avformat: Pass the interrupt callback on to chained muxers/demuxers
| |   avio: Add an AVIOInterruptCB parameter to ffurl_open/ffurl_alloc
| |   avformat: Use ff_check_interrupt
| |   avio: Add an internal utility function for checking the new interrupt callback
| |   avio: Add AVIOInterruptCB
| |   texi2html: remove stray \n
| |   doc: prettyfy the texi2html documentation
| |   swscale: handle unaligned buffers in yuv2plane1
| |
| | Conflicts:
| |   libavformat/avformat.h
| |   libavformat/avio.c
| |   libavformat/mov.c
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * swscale: handle unaligned buffers in yuv2plane1 (Ronald S. Bultje, 2011-11-13)
| |
| | The issue had been introduced in c435653627529e22d74214c2266f571255e404d6
| |
| | Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2011-11-07)
|\|
| | * qatar/master: (23 commits)
| |   x86inc: use sse versions of common macros instead of sse2 when applicable
| |   doc/APIchanges: add missing dates and hashes
| |   lavf: don't return from void av_update_cur_dts()
| |   Changelog: add more entries.
| |   Changelog: update ffmpeg/avconv incompatibility list.
| |   avconv: remove some redundant temporary variables.
| |   avconv: fix broken indentation
| |   avconv: move copy_initial_nonkeyframes to the options context.
| |   avconv: use file:stream instead of file.stream in log messages.
| |   doc/avconv: elaborate on basic functionality.
| |   doc/avconv: -sample_fmts, not -help sample_fmts prints the sample formats
| |   openssl: Only use CRYPTO_set_id_callback on OpenSSL < 1.0.0
| |   Call avformat_network_init/deinit in the programs
| |   Remove leftover includes of strings.h
| |   avutil: Don't allow using strcasecmp/strncasecmp
| |   Replace all usage of strcasecmp/strncasecmp
| |   avstring: Add locale independent implementations of strcasecmp/strncasecmp
| |   avstring: Add locale independent implementations of toupper/tolower
| |   cosmetics: insert some spaces in explicit enum value assignments
| |   move 8SVX audio codecs to the audio codec list part on the next bump
| |   ...
| |
| | Conflicts:
| |   avprobe.c
| |   doc/APIchanges
| |   ffplay.c
| |   ffserver.c
| |   libavcodec/avcodec.h
| |   libavdevice/bktr.c
| |   libavdevice/v4l.c
| |   libavdevice/v4l2.c
| |   libavformat/matroskaenc.c
| |   libavformat/wtv.c
| |   libavutil/avstring.c
| |   libavutil/avstring.h
| |   libavutil/avutil.h
| |   libswscale/x86/swscale_template.c
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * swscale: write yuv2plane1 MMX/SSE2/SSE4/AVX functions. (Ronald S. Bultje, 2011-11-05)
| |
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2011-10-24)
|\|
| | * qatar/master:
| |   Move id3v2 tag writing to a separate file.
| |   swscale: add missing colons to x86 assembly yuv2planeX.
| |   g722: split decoder and encoder into separate files
| |   cosmetics: remove extra spaces before end-of-statement semi-colons
| |   vorbisdec: check output buffer size before writing output
| |   wavpack: calculate bpp using av_get_bytes_per_sample()
| |   ac3enc: Set max value for mode options correctly
| |   lavc: move get_b_cbp() from h263.h to mpeg4videoenc.c
| |   mpeg12: move closed_gop from MpegEncContext to Mpeg1Context
| |   mpeg12: move full_pel from MpegEncContext to Mpeg1Context
| |   mpeg12: move Mpeg1Context from mpeg12.c to mpeg12.h
| |   mpegvideo: remove some unused variables from MpegEncContext.
| |
| | Conflicts:
| |   libavcodec/mpeg12.c
| |   libavformat/mp3enc.c
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * swscale: add missing colons to x86 assembly yuv2planeX. (Ronald S. Bultje, 2011-10-23)
| |
| | This fixes assembling using "nasm".
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2011-10-23)
|\|
| | * qatar/master:
| |   id3v2: fix doxy comment - 'machine byte order' makes no sense on char arrays
| |   VC1: restore mistakenly removed code
| |   twinvq: check output buffer size before decoding
| |   twinvq: return an error when the packet size is too small
| |   lavf: export some forgotten symbols with non-av prefixes.
| |   swscale: update altivec yuv2planeX asm to new per-plane API.
| |   swscale: make yuv2yuvX_10_sse2/avx 8/9/16-bits aware.
| |   yuv2planeX10 SIMD
| |   swscale: decide whether to use yuv2plane1/X on a per-plane basis.
| |   swscale: reintroduce full precision in 16-bit output.
| |   Split up yuv2yuvX functions
| |   Split out yuv2yuv1 luma and chroma in order to make them generic DSP functions
| |   lavc: replace references to deprecated AVCodecContext.error_recognition to use AVCodecContext.err_recognition
| |   lavc: translate non-flag-based er options into flag-based ef options at codec open
| |   add -err_filter AVOptions to access flag-based error recognition
| |   h264_weight: initialize "height" function argument properly.
| |   presets: spelling error in libvpx 1080p50_60
| |   avplay: fix fullscreen behaviour with SDL 1.2.14 on Mac OS X
| |
| | Conflicts:
| |   ffplay.c
| |   libavformat/libavformat.v
| |   libswscale/swscale.c
| |   libswscale/x86/swscale_template.c
| |   tests/ref/lavfi/pixfmts_scale
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * swscale: make yuv2yuvX_10_sse2/avx 8/9/16-bits aware. (Ronald S. Bultje, 2011-10-22)
| |
| | Also implement MMX/MMX2 versions and SSE4 versions.
| * yuv2planeX10 SIMD (Kieran Kunhya, 2011-10-22)
| |
| | Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2011-10-12)
|\|
| | * qatar/master: (23 commits)
| |   fix AC3ENC_OPT_MODE_ON/OFF
| |   h264: fix HRD parameters parsing
| |   prores: implement multithreading.
| |   prores: idct sse2/sse4 optimizations.
| |   swscale: use aligned move for storage into temporary buffer.
| |   prores: extract idct into its own dspcontext and merge with put_pixels.
| |   h264: fix invalid shifts in init_cavlc_level_tab()
| |   intfloat_readwrite: fix signed addition overflows
| |   mov: do not misreport empty stts
| |   mov: cosmetics, fix for and if spacing
| |   id3v2: fix NULL pointer dereference
| |   mov: read album_artist atom
| |   mov: fix disc/track numbers and totals
| |   doc: fix references to obsolete presets directories for avconv/ffmpeg
| |   flashsv: return more meaningful error value
| |   flashsv: fix typo in av_log() message
| |   smacker: validate channels and sample format.
| |   smacker: check buffer size before reading output size
| |   smacker: validate number of channels
| |   smacker: Separate audio flags from sample rates in smacker demuxer.
| |   ...
| |
| | Conflicts:
| |   cmdutils.h
| |   doc/ffmpeg.texi
| |   libavcodec/Makefile
| |   libavcodec/motion_est_template.c
| |   libavformat/id3v2.c
| |   libavformat/mov.c
| |
| | Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * swscale: use aligned move for storage into temporary buffer. (Ronald S. Bultje, 2011-10-11)
| |
| | The intermediate buffer is always aligned.
* | swscale: add 14bit support to the "MMX/SSE2/SSSE3/SSE4 versions for horizontal scaling" (Michael Niedermayer, 2011-09-14)
|/
| Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* sws: implement MMX/SSE2/SSSE3/SSE4 versions for horizontal scaling. (Ronald S. Bultje, 2011-09-13)
Speed: 3.9x to 9.6x faster than C, with some small (up to 15%) improvements over the existing MMX code (particularly for bigger filters).