path: root/libswscale/x86
Commit message | Author | Age
* swscale: yuv2planeX 8bit >=sse2 functions need aligned stack on x86-32. | Martin Storsjö | 2012-07-04
  Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* build: Move all arch OBJS declarations into arch subdirectory Makefiles. | Diego Biurrun | 2012-04-12
* x86inc improvements for 64-bit | Henrik Gramner | 2012-04-11
  Add support for all x86-64 registers
  Prefer caller-saved register over callee-saved on WIN64
  Support up to 15 function arguments
  Also (by Ronald S. Bultje)
  Fix up our asm to work with new x86inc.asm.
  Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
  Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
* swscale: convert hscale() to use named arguments. | Ronald S. Bultje | 2012-03-14
* swscale: convert hscale to cpuflags(). | Ronald S. Bultje | 2012-03-14
* swscale: make filterPos 32bit. | Ronald S. Bultje | 2012-03-06
  Fixes overflows for large image sizes.
  Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
  CC: libav-stable@libav.org
* swscale: make %rep unconditional. | Ronald S. Bultje | 2012-03-03
  Fixes pre-processing with latest versions of nasm.
* swscale: remove now unnecessary hack. | Ronald S. Bultje | 2012-03-03
* swscale: take first/lastline over/underflows into account for MMX. | Ronald S. Bultje | 2012-02-23
  Fixes crashes for extremely large resizes (several 100-fold).
  Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
  CC: libav-stable@libav.org
* Revert two swscale commits. | Ronald S. Bultje | 2012-02-19
  Revert "swscale: update context offsets after removal of AlpMmxFilter." (commit a95e3fa90b4190381b65d180eec5a4027075e2da) and
  Revert "swscale: Remove some write-only variables related to alpha handling." (commit 9d03cb9fc5ddf914920ab0dbe13f19a34c754966).
  They broke alpha handling - it's the evil inline asm that still uses that variable, so it's not truly write-only.
* swscale: make access to filter data conditional on filter type. | Ronald S. Bultje | 2012-02-17
  Prevents crashes on 1-tap filter (unscaled).
  Also rename "bguf" argument to "vbuf", seems that was a typo.
* swscale: update context offsets after removal of AlpMmxFilter. | Ronald S. Bultje | 2012-02-17
* swscale: Remove some write-only variables related to alpha handling. | Diego Biurrun | 2012-02-14
* swscale: fix crashes in yuv2yuvX on x86-32. | Ronald S. Bultje | 2012-02-13
  They were introduced in an earlier commit that introduced use of named arguments. One cause was a typo, a second cause appears to be a bug in x264asm that I work around by not using named arguments.
* swscale: convert yuv2yuvX() to using named arguments. | Ronald S. Bultje | 2012-02-12
* swscale: rename "dstw" to "w" to prevent name collisions. | Ronald S. Bultje | 2012-02-12
  "dstw" can collide with the word-version of the "dst" argument, causing all kinds of weird stuff down the pipe.
* swscale: use named registers in yuv2yuv1_plane() place. | Ronald S. Bultje | 2012-02-12
  Most of the function had been converted before, but I forgot this particular location.
* swscale: sign-extend integer function argument to qword on x86-64. | Ronald S. Bultje | 2012-02-08
* swscale: make yuv2yuv1 use named registers. | Ronald S. Bultje | 2012-02-07
* swscale: fix V plane memory location in bilinear/unscaled RGB/YUYV case. | Ronald S. Bultje | 2012-02-07
  Fixes bug 221.
  CC: libav-stable@libav.org
* win64: add a XMM clobber test configure option. | Ronald S. Bultje | 2012-02-02
  This will be useful to test more aggressively for failures to mark XMM registers as clobbered in Win64 builds, and prevent regressions thereof.
  Based on a patch by Ramiro Polla <ramiro.polla@gmail.com>
* swscale: implement MMX, SSE2 and AVX functions for RGB32 input. | Ronald S. Bultje | 2012-02-01
* swscale: enable dithering in MMX functions. | Ronald S. Bultje | 2012-02-01
  This was accidentally disabled.
  Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* swscale: make rgb24 function macros slightly smaller. | Ronald S. Bultje | 2012-02-01
* swscale: convert rgb/bgr24ToY/UV_mmx functions from inline asm to yasm. | Ronald S. Bultje | 2012-01-27
  Also implement sse2/ssse3/avx versions.
* config.asm: change %ifdef directives to %if directives. | Ronald S. Bultje | 2012-01-27
  This allows combining multiple conditionals in a single statement.
* swscale: change yuv2yuvX code to use cpuflag(). | Ronald S. Bultje | 2012-01-13
* swscale: fix crash in fast_bilinear code when compiled with -mred-zone. | Ronald S. Bultje | 2012-01-10
  Additional comments from Måns Rullgard have been integrated by Reinhard Tartler.
  Signed-off-by: Reinhard Tartler <siretart@tauware.de>
* swscale: specify register type. | Oka Motofumi | 2012-01-10
  Fixes a compilation failure on win64.
  Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* swscale: convert yuy2/uyvy/nv12/nv21ToY/UV from inline asm to yasm. | Ronald S. Bultje | 2012-01-08
  Also implement SSE2/AVX variants.
* swscale: split scale.asm. | Ronald S. Bultje | 2012-01-03
  scale.asm keeps horizontal scaling functions, whereas output.asm gets the vertical scaling/output functions.
* swscale_mmx: drop no longer required parameters from VSCALEX macros | Diego Biurrun | 2011-12-14
* swscale: Mark yuv2planeX_8_mmx as MMX2; it contains MMX2 instructions. | Diego Biurrun | 2011-12-14
* Remove extraneous semicolons | Mans Rullgard | 2011-12-11
  These semicolons cause invalid empty top-level declarations.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* swscale: handle unaligned buffers in yuv2plane1 | Ronald S. Bultje | 2011-11-13
  The issue had been introduced in c435653627529e22d74214c2266f571255e404d6
  Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* swscale: write yuv2plane1 MMX/SSE2/SSE4/AVX functions. | Ronald S. Bultje | 2011-11-05
* swscale: add missing colons to x86 assembly yuv2planeX. | Ronald S. Bultje | 2011-10-23
  This fixes assembling using "nasm".
* swscale: make yuv2yuvX_10_sse2/avx 8/9/16-bits aware. | Ronald S. Bultje | 2011-10-22
  Also implement MMX/MMX2 versions and SSE4 versions.
* yuv2planeX10 SIMD | Kieran Kunhya | 2011-10-22
  Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* Split out yuv2yuv1 luma and chroma in order to make them generic DSP functions | Kieran Kunhya | 2011-10-22
  Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* swscale: use aligned move for storage into temporary buffer. | Ronald S. Bultje | 2011-10-11
  The intermediate buffer is always aligned.
* sws: implement MMX/SSE2/SSSE3/SSE4 versions for horizontal scaling. | Ronald S. Bultje | 2011-09-13
  Speed: from 3.9x to 9.6x speed improvement over C, and some small (up to 15%) speed improvements over existing MMX code (particularly for bigger filters).
* swscale: split hScale() function pointer into h[cy]Scale(). | Ronald S. Bultje | 2011-08-17
  This allows using more specific implementations for chroma/luma, e.g. we can make assumptions on filterSize being constant, thus avoiding that test at runtime.
* swscale: use 15-bit intermediates for 9/10-bit scaling. | Ronald S. Bultje | 2011-08-12
* swscale: rename uv_off/uv_off2 to uv_off_px/byte. | Ronald S. Bultje | 2011-07-08
* swscale: error dithering for 16/9/10-bit to 8-bit. | Ronald S. Bultje | 2011-07-08
  Based on a somewhat similar idea in FFmpeg's swscale copy.
* swscale: fix 16-bit scaling when output is 8-bits. | Ronald S. Bultje | 2011-07-08
  We would use the second half of the U plane buffer, rather than the V plane buffer, to output the V plane pixels.
* swscale: for >8bit scaling, read in native bit-depth. | Ronald S. Bultje | 2011-07-01
  For 9/10bit, it means we don't have to upscale to 16bit before actual scaling or pixel format conversion, and thus a performance gain.
* swscale: implement >8bit scaling support. | Ronald S. Bultje | 2011-06-29
  This means that precision is retained when scaling between sample formats with >8 bits per component (48bit RGB, 16bit grayscale, 9/10/16bit YUV).
* swscale: change prototypes of scaled YUV output functions. | Ronald S. Bultje | 2011-06-27
  Remove unused variables "flags" and "dstFormat" in yuv2packed1, merge source rows per plane for yuv2packed[12], and make every source argument int16_t (some were invalidly set to uint16_t). This prevents stack pollution and is part of the Great Evil Plan to simplify swscale.