| Commit message (Collapse) | Author | Age |
| |
Add support for all x86-64 registers
Prefer caller-saved register over callee-saved on WIN64
Support up to 15 function arguments
Also (by Ronald S. Bultje)
Fix up our asm to work with new x86inc.asm.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
| |
Fixes overflows for large image sizes.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
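
For illustration only, a minimal sketch of the kind of overflow guard such a fix usually involves. The helper name `checked_image_bytes` and its parameters are hypothetical and are not taken from the commit itself; the idea is to compute the size in a wider type and reject anything that does not fit in an `int`.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper, not the code from this commit.
 * Stores width * height * bytes_per_pixel in *out and returns 0,
 * or returns -1 if an input is non-positive or the product would
 * not fit in an int. */
static int checked_image_bytes(int width, int height,
                               int bytes_per_pixel, int *out)
{
    int64_t pixels;

    if (width <= 0 || height <= 0 || bytes_per_pixel <= 0)
        return -1;

    /* Two positive ints multiplied in 64 bits cannot overflow. */
    pixels = (int64_t)width * height;

    /* Reject before multiplying, so the final product stays <= INT_MAX. */
    if (pixels > INT_MAX / bytes_per_pixel)
        return -1;

    *out = (int)(pixels * bytes_per_pixel);
    return 0;
}
```

A caller would treat a return value of -1 as "image too large" instead of allocating a buffer from a wrapped-around size.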
| |
Fixes pre-processing with latest versions of nasm.
| |
Fixes crashes for extremely large resizes (by a factor of several hundred).
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
| |
Revert "swscale: update context offsets after removal of AlpMmxFilter."
(commit a95e3fa90b4190381b65d180eec5a4027075e2da)
and
Revert "swscale: Remove some write-only variables related to alpha handling."
(commit 9d03cb9fc5ddf914920ab0dbe13f19a34c754966).
They broke alpha handling - it's the evil inline asm that still uses that
variable, so it's not truly write-only.
| |
Prevents crashes with a 1-tap filter (unscaled). Also rename the "bguf"
argument to "vbuf"; that appears to have been a typo.
| |
They were introduced in an earlier commit that introduced use of named
arguments. One cause was a typo, a second cause appears to be a bug in
x264asm that I work around by not using named arguments.
| |
"dstw" can collide with the word-version of the "dst" argument, causing
all kinds of weird behavior further down the line.
| |
Most of the function had been converted before, but I forgot this
particular location.
| |
Fixes bug 221.
CC: libav-stable@libav.org
| |
This will be useful to test more aggressively for failures to mark XMM
registers as clobbered in Win64 builds, and prevent regressions thereof.
Based on a patch by Ramiro Polla <ramiro.polla@gmail.com>
| |
This was accidentally disabled.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| |
Also implement sse2/ssse3/avx versions.
| |
This allows combining multiple conditionals in a single statement.
| |
Additional comments from Måns Rullgard have been integrated
by Reinhard Tartler.
Signed-off-by: Reinhard Tartler <siretart@tauware.de>
| |
Fixes a compilation failure on win64.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| |
Also implement SSE2/AVX variants.
| |
scale.asm keeps horizontal scaling functions, whereas output.asm gets
the vertical scaling/output functions.
| |
These semicolons cause invalid empty top-level declarations.
Signed-off-by: Mans Rullgard <mans@mansr.com>
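
As an illustration (an assumption about the general pattern, not the code touched by this commit): a macro that expands to a complete function definition must not be followed by a semicolon, because the stray `;` is then an empty declaration at file scope, which ISO C does not allow. The macro name `DEFINE_DOUBLER` below is hypothetical.

```c
/* Hypothetical macro that expands to a full function definition. */
#define DEFINE_DOUBLER(name) \
    static int name(int x) { return 2 * x; }

/* Correct: no trailing semicolon, the definition is already complete. */
DEFINE_DOUBLER(twice)

/* Problematic form, left commented out on purpose:
 *
 *     DEFINE_DOUBLER(twice_bad);
 *
 * The ';' after the closing brace becomes an empty top-level
 * declaration, which strict compilers reject or warn about. */
```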
| |
The issue had been introduced in
c435653627529e22d74214c2266f571255e404d6
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
| |
This fixes assembling using "nasm".
| |
Also implement MMX/MMX2 versions and SSE4 versions.
| |
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| |
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| |
The intermediate buffer is always aligned.
| |
Speed: 3.9x to 9.6x faster than C, plus some small (up to 15%) speed
improvements over the existing MMX code (particularly for bigger
filters).
| |
This allows using more specific implementations for chroma/luma, e.g.
we can make assumptions on filterSize being constant, thus avoiding
that test at runtime.
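
A rough sketch of the idea, under stated assumptions: the function names, the 12-bit coefficient scale, and the signatures below are hypothetical and do not reproduce swscale's real interfaces. A generic routine loops over a runtime `filter_size`, while a specialized one hard-codes the tap count so the loop and its test disappear.

```c
#include <stdint.h>

/* Hypothetical generic filter: tap count is only known at runtime. */
static int filter_generic(const int16_t *src, const int16_t *coeff,
                          int filter_size)
{
    int sum = 0;
    for (int i = 0; i < filter_size; i++)
        sum += src[i] * coeff[i];
    return sum >> 12; /* assumed 12-bit fixed-point coefficients */
}

/* Hypothetical specialization: exactly two taps, no loop, no size test. */
static int filter_2tap(const int16_t *src, const int16_t *coeff)
{
    return (src[0] * coeff[0] + src[1] * coeff[1]) >> 12;
}
```

With separate entry points for luma and chroma, a caller could select such a fixed-size variant once at setup time instead of checking the size for every sample.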
| |
Based on a somewhat similar idea in FFmpeg's swscale copy.
| |
We would use the second half of the U plane buffer, rather than the
V plane buffer, to output the V plane pixels.
| |
For 9/10bit, it means we don't have to upscale to 16bit before
actual scaling or pixel format conversion, which gives a performance
gain.
| |
This means that precision is retained when scaling between sample
formats with >8 bits per component (48bit RGB, 16bit grayscale,
9/10/16bit YUV).
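
A toy example of why this matters, not swscale code: both function names are hypothetical, and simple averaging stands in for real scaling. Reducing 10-bit samples to 8 bits before processing discards the two low bits, and they cannot be recovered afterwards.

```c
/* Hypothetical: average two 10-bit samples through an 8-bit intermediate. */
static int avg_via_8bit(int a10, int b10)
{
    int a8 = a10 >> 2;            /* drop the two low bits */
    int b8 = b10 >> 2;
    return ((a8 + b8) >> 1) << 2; /* scale back up to 10 bits */
}

/* Hypothetical: average the same samples at their native precision. */
static int avg_direct(int a10, int b10)
{
    return (a10 + b10) >> 1;
}
```

For the inputs 513 and 515, the direct path gives 514 while the 8-bit path gives 512.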
| |
Remove unused variables "flags" and "dstFormat" in yuv2packed1,
merge source rows per plane for yuv2packed[12], and make every
source argument int16_t (some were invalidly set to uint16_t).
This prevents stack pollution and is part of the Great Evil Plan
to simplify swscale.
| |
This prevents a crash when converting to NV12/21 without the bitexact
flags enabled.