| Commit message (Collapse) | Author | Age |
|
Same principle as the previous commit: with sufficiently huge rgb2yuv table
values, the signed arithmetic produces wrong results and undefined behavior.
The unsigned arithmetic produces the same incorrect results, but that is
probably OK, as such huge values do not seem to occur in any real
use case.
Fixes: signed integer overflow
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
Large rgb2yuv tables and high pixel values cause the intermediate
int32_t of ru*r + gu*g + bu*b to exceed INT_MAX, which is undefined
behavior. This causes libswscale built with LLVM -fsanitize=undefined to
assert. Using unsigned integers instead has defined behavior and
produces identical results, and makes rgb64ToUV_c_template match
rgb64ToY_c_template.
Fixes: signed integer overflow
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
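A minimal sketch of the idea in this commit message, not the actual libswscale code: the weighted sum is accumulated in uint32_t, so wraparound is well defined instead of being signed overflow. The function name chroma_fixed_point, the shift of 15 and the bias constant are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: one chroma sample from 16-bit RGB components.
 * Coefficients may be negative; converting them to uint32_t keeps the
 * arithmetic modulo 2^32, which is defined behavior in C. */
static inline uint32_t chroma_fixed_point(int32_t ru, int32_t gu, int32_t bu,
                                          uint32_t r, uint32_t g, uint32_t b)
{
    const unsigned shift = 15;                        /* assumed fixed-point shift */
    const uint32_t bias  = 0x10001u << (shift - 1);   /* assumed offset plus rounding */
    uint32_t sum = (uint32_t)ru * r + (uint32_t)gu * g + (uint32_t)bu * b + bias;

    return sum >> shift;
}
```

As long as the true sum is non-negative and fits in 32 bits, the result equals what a wider signed computation would give, including sums above INT_MAX.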
|
Up until now, libswscale/input.c used a macro to read
an input pixel which involved a call to av_pix_fmt_desc_get()
to find out whether the input pixel format is BE or LE
despite this being known at compile-time (there are templates
per pixfmt). Even worse, these calls are made in a loop,
so that e.g. there are six calls to av_pix_fmt_desc_get()
for every pair of UV pixels processed in
rgb64ToUV_half_c_template().
This commit modifies these macros to ensure that isBE()
is evaluated at compile-time. This saved 9743B of .text
for me (GCC 11.2, -O3). For a simple RGB64LE->YUV420P
transformation like
ffmpeg -f lavfi -i haldclutsrc,format=rgba64le -pix_fmt yuv420p \
-threads 1 -t 1:00 -f null -
the amount of decicycles spent in rgb64LEToUV_half_c
(which is created via the template mentioned above)
decreases from 19751 to 5341; for RGBA64BE the number
went down from 11945 to 5393. For shared builds (where
the call to av_pix_fmt_desc_get() is indirect) the old numbers
are 15230 for RGBA64BE and 27502 for RGBA64LE, whereas
the numbers with this patch are indistinguishable from
the numbers from a static build.
Also make the touched macros conform to the usual convention
of using uppercase names while at it.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
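The approach described above can be sketched roughly as follows. This is an illustration only, under the assumption that the endianness is passed to an inlined template as a constant. The names rd16_be, rd16_le, INPUT_PIXEL, gray16_to_y_template and GRAY16_FUNCS are hypothetical and are not the macros in libswscale/input.c.

```c
#include <stdint.h>

/* Hypothetical byte-order readers for one 16-bit sample. */
static inline uint16_t rd16_be(const uint8_t *p) { return (uint16_t)((p[0] << 8) | p[1]); }
static inline uint16_t rd16_le(const uint8_t *p) { return (uint16_t)((p[1] << 8) | p[0]); }

/* The endianness is a macro argument, so no runtime lookup is needed. */
#define INPUT_PIXEL(is_be, p) ((is_be) ? rd16_be(p) : rd16_le(p))

/* Template: is_be is a constant at every call site below, so after
 * inlining the compiler can drop the branch that is never taken. */
static inline void gray16_to_y_template(uint16_t *dst, const uint8_t *src,
                                        int width, int is_be)
{
    for (int i = 0; i < width; i++)
        dst[i] = INPUT_PIXEL(is_be, src + 2 * i);
}

/* One concrete function per pixel format variant. */
#define GRAY16_FUNCS(name, is_be)                                          \
static void gray16 ## name ## _to_y(uint16_t *dst, const uint8_t *src,     \
                                    int width)                             \
{                                                                          \
    gray16_to_y_template(dst, src, width, is_be);                          \
}

GRAY16_FUNCS(le, 0)
GRAY16_FUNCS(be, 1)
```

Calling gray16le_to_y() or gray16be_to_y() then reads each sample with a fixed byte order, without consulting a format descriptor inside the loop.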
|
These macros are definitions, not just declarations, and therefore
should not end with a semicolon. Such a semicolon does not actually
conform to the C standard, but compilers happen to accept it.
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
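To illustrate the point about semicolons, here is a small sketch with a hypothetical function-defining macro (DEFINE_SCALE_FUNC is made up for this example). The expansion ends with the closing brace of the function body, so the invocation is written without a trailing semicolon. A semicolon there would be a stray one at file scope, which GCC and Clang typically only report in pedantic mode.

```c
/* Hypothetical macro that expands to a complete function definition. */
#define DEFINE_SCALE_FUNC(name, factor)     \
static int scale_ ## name(int x)            \
{                                           \
    return x * (factor);                    \
}

/* No semicolon after the invocations: each one is already a full definition. */
DEFINE_SCALE_FUNC(twice, 2)
DEFINE_SCALE_FUNC(thrice, 3)
```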
|
As we now have three of these formats, I added macros to generate the
conversion functions.
|
As we already have support for VUYA, I figured I should do the small
amount of work to support VUYX as well. That means a little refactoring
to share code.
|
This is by no means perfect, since at least ddagrab will return scRGB
data with values outside of 0.0f to 1.0f for HDR values.
Its primary purpose is to be able to work with the format at all.
|
Reviewed-by: Philip Langdale <philipl@overt.org>
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
|
|
| |
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
Some of these were made possible by moving several common macros to
libavutil/macros.h.
While just at it, also improve the other headers a bit.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
If the float pixel * 65535.0f > 2147483647.0f,
lrintf may overflow and return negative values, depending on the implementation.
The results for NaN and +/-inf values may also be implementation-defined.
Clip the value first so lrintf always works:
values < 0.0f, -inf, NaN = 0.0f
values > 65535.0f, +inf = 65535.0f
old timings
195960 decicycles in planar_rgbf32le_to_uv, 1 runs, 0 skips
186120 decicycles in planar_rgbf32le_to_uv, 2 runs, 0 skips
188645 decicycles in planar_rgbf32le_to_uv, 4 runs, 0 skips
183625 decicycles in planar_rgbf32le_to_uv, 8 runs, 0 skips
181157 decicycles in planar_rgbf32le_to_uv, 16 runs, 0 skips
177533 decicycles in planar_rgbf32le_to_uv, 32 runs, 0 skips
175689 decicycles in planar_rgbf32le_to_uv, 64 runs, 0 skips
232960 decicycles in planar_rgbf32be_to_uv, 1 runs, 0 skips
221380 decicycles in planar_rgbf32be_to_uv, 2 runs, 0 skips
216640 decicycles in planar_rgbf32be_to_uv, 4 runs, 0 skips
213505 decicycles in planar_rgbf32be_to_uv, 8 runs, 0 skips
211558 decicycles in planar_rgbf32be_to_uv, 16 runs, 0 skips
210596 decicycles in planar_rgbf32be_to_uv, 32 runs, 0 skips
210202 decicycles in planar_rgbf32be_to_uv, 64 runs, 0 skips
161680 decicycles in planar_rgbf32le_to_y, 1 runs, 0 skips
153540 decicycles in planar_rgbf32le_to_y, 2 runs, 0 skips
148255 decicycles in planar_rgbf32le_to_y, 4 runs, 0 skips
140600 decicycles in planar_rgbf32le_to_y, 8 runs, 0 skips
132935 decicycles in planar_rgbf32le_to_y, 16 runs, 0 skips
128531 decicycles in planar_rgbf32le_to_y, 32 runs, 0 skips
140933 decicycles in planar_rgbf32le_to_y, 64 runs, 0 skips
190980 decicycles in planar_rgbf32be_to_y, 1 runs, 0 skips
176080 decicycles in planar_rgbf32be_to_y, 2 runs, 0 skips
167980 decicycles in planar_rgbf32be_to_y, 4 runs, 0 skips
164685 decicycles in planar_rgbf32be_to_y, 8 runs, 0 skips
162751 decicycles in planar_rgbf32be_to_y, 16 runs, 0 skips
162404 decicycles in planar_rgbf32be_to_y, 32 runs, 0 skips
167849 decicycles in planar_rgbf32be_to_y, 64 runs, 0 skips
new timings
183320 decicycles in planar_rgbf32le_to_uv, 1 runs, 0 skips
175700 decicycles in planar_rgbf32le_to_uv, 2 runs, 0 skips
179570 decicycles in planar_rgbf32le_to_uv, 4 runs, 0 skips
172932 decicycles in planar_rgbf32le_to_uv, 8 runs, 0 skips
168707 decicycles in planar_rgbf32le_to_uv, 16 runs, 0 skips
165224 decicycles in planar_rgbf32le_to_uv, 32 runs, 0 skips
163423 decicycles in planar_rgbf32le_to_uv, 64 runs, 0 skips
184940 decicycles in planar_rgbf32be_to_uv, 1 runs, 0 skips
185150 decicycles in planar_rgbf32be_to_uv, 2 runs, 0 skips
185790 decicycles in planar_rgbf32be_to_uv, 4 runs, 0 skips
185472 decicycles in planar_rgbf32be_to_uv, 8 runs, 0 skips
185277 decicycles in planar_rgbf32be_to_uv, 16 runs, 0 skips
185813 decicycles in planar_rgbf32be_to_uv, 32 runs, 0 skips
185332 decicycles in planar_rgbf32be_to_uv, 64 runs, 0 skips
145400 decicycles in planar_rgbf32le_to_y, 1 runs, 0 skips
145100 decicycles in planar_rgbf32le_to_y, 2 runs, 0 skips
143490 decicycles in planar_rgbf32le_to_y, 4 runs, 0 skips
136687 decicycles in planar_rgbf32le_to_y, 8 runs, 0 skips
131271 decicycles in planar_rgbf32le_to_y, 16 runs, 0 skips
128698 decicycles in planar_rgbf32le_to_y, 32 runs, 0 skips
127170 decicycles in planar_rgbf32le_to_y, 64 runs, 0 skips
156020 decicycles in planar_rgbf32be_to_y, 1 runs, 0 skips
146990 decicycles in planar_rgbf32be_to_y, 2 runs, 0 skips
142020 decicycles in planar_rgbf32be_to_y, 4 runs, 0 skips
141052 decicycles in planar_rgbf32be_to_y, 8 runs, 0 skips
138973 decicycles in planar_rgbf32be_to_y, 16 runs, 0 skips
138027 decicycles in planar_rgbf32be_to_y, 32 runs, 0 skips
143939 decicycles in planar_rgbf32be_to_y, 64 runs, 0 skips
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
|
This is meant to be a cosmetic change.
old timings:
42780 UNITS in grayf32le, 1 runs, 0 skips
56720 UNITS in grayf32le, 2 runs, 0 skips
67265 UNITS in grayf32le, 4 runs, 0 skips
58082 UNITS in grayf32le, 8 runs, 0 skips
63512 UNITS in grayf32le, 16 runs, 0 skips
52720 UNITS in grayf32le, 32 runs, 0 skips
46491 UNITS in grayf32le, 64 runs, 0 skips
68500 UNITS in grayf32be, 1 runs, 0 skips
66930 UNITS in grayf32be, 2 runs, 0 skips
62305 UNITS in grayf32be, 4 runs, 0 skips
55510 UNITS in grayf32be, 8 runs, 0 skips
50216 UNITS in grayf32be, 16 runs, 0 skips
44480 UNITS in grayf32be, 32 runs, 0 skips
42394 UNITS in grayf32be, 64 runs, 0 skips
new timings:
46660 UNITS in grayf32le, 1 runs, 0 skips
51830 UNITS in grayf32le, 2 runs, 0 skips
53390 UNITS in grayf32le, 4 runs, 0 skips
50910 UNITS in grayf32le, 8 runs, 0 skips
44968 UNITS in grayf32le, 16 runs, 0 skips
40349 UNITS in grayf32le, 32 runs, 0 skips
38330 UNITS in grayf32le, 64 runs, 0 skips
39980 UNITS in grayf32be, 1 runs, 0 skips
49630 UNITS in grayf32be, 2 runs, 0 skips
53540 UNITS in grayf32be, 4 runs, 0 skips
59767 UNITS in grayf32be, 8 runs, 0 skips
51206 UNITS in grayf32be, 16 runs, 0 skips
44743 UNITS in grayf32be, 32 runs, 0 skips
41468 UNITS in grayf32be, 64 runs, 0 skips
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
|
|
|
| |
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
|
|
|
|
| |
Signed-off-by: Manuel Stoeckl <code@mstoeckl.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
These inclusions are not necessary, as cpu.h is already included
wherever it is needed (via direct inclusion or via the arch-specific
headers).
Also remove other unnecessary cpu.h inclusions from ordinary
non-headers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
These conversions appear to be exhibiting the same rounding error that the rgbf32 formats were.
I separated the rounding value from the 16 and 128 offsets; I think that makes it a little clearer.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
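As a generic illustration of keeping the rounding term separate from the level offset in a fixed-point conversion, consider the sketch below. It is not taken from the commit: the helper name luma_fixed_point, the shift of 15 and the 8-bit input range are all assumptions.

```c
enum { SHIFT = 15 };   /* assumed number of fractional bits */

/* Hypothetical 8-bit RGB to luma conversion with coefficients in
 * fixed point. The offset of 16 and the rounding term are kept as two
 * separate constants instead of one merged magic number. */
static inline int luma_fixed_point(int ry, int gy, int by, int r, int g, int b)
{
    const int offset = 16 << SHIFT;         /* black level of 16 */
    const int rnd    = 1 << (SHIFT - 1);    /* 0.5 in fixed point */

    return (ry * r + gy * g + by * b + offset + rnd) >> SHIFT;
}
```

With the two terms apart, it is easier to see that a weighted sum of exactly one half rounds up to the next integer while anything below it rounds down.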
|
|
|
|
| |
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
This fixes the FATE failure reported by filter-pixfmts* for x2rgb10le on
big-endian hardware.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
|
|
|
| |
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
|
|
|
|
| |
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
|
|
|
| |
Fixes ticket #8509.
|
Add swscale input support for Y210LE. Output support and a FATE
test could be added later if there is a requirement for software
CSC to this packed format.
Signed-off-by: Linjie Fu <linjie.fu@intel.com>
|
Fixes: Invalid shifts
Fixes: #8140
Fixes: #8146
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
The implementation is pretty straightforward. Most of the existing
NV12 codepaths work regardless of subsampling and are re-used as is.
Where necessary I wrote the slightly different NV24 versions.
Finally, the one thing that confused me for a long time was the
asm specific x86 path that did an explicit exclusion check for NV12.
I replaced that with a semi-planar check and also updated the
equivalent PPC code, which Lauri kindly checked.
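To illustrate why an NV12 code path can be independent of chroma subsampling, here is a minimal sketch of splitting an interleaved UV line into separate planes. The function name deinterleave_uv is hypothetical and this is not the libswscale implementation. The loop only depends on the number of chroma samples in the line, so the same code would serve NV12 (half-width chroma) and NV24 (full-width chroma) if the caller passes the right width.

```c
#include <stdint.h>

/* Hypothetical helper: src holds width pairs of interleaved U,V bytes. */
static void deinterleave_uv(uint8_t *dst_u, uint8_t *dst_v,
                            const uint8_t *src, int width)
{
    for (int i = 0; i < width; i++) {
        dst_u[i] = src[2 * i];       /* even bytes are U */
        dst_v[i] = src[2 * i + 1];   /* odd bytes are V */
    }
}
```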
|
| |
|
|
|
|
| |
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
|
|
|
| |
Signed-off-by: Paul B Mahol <onemda@gmail.com>
|
| |
|
|\
| |
| |
| |
| |
| |
| | |
* commit 'de8e096c7eda2bce76efd0a1c1c89d37348c2414':
swscale: Consistently order input YUV pixel formats
Merged-by: Clément Bœsch <u@pkh.me>
|
| |
| |
| |
| |
| |
| |
| | |
Follow a 420, 422, 444 order instead of a random one.
This simplifies double-checking additions of new formats.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
|
| |
| |
| |
| | |
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| | |
|
| | |
|
| |
| |
| |
| | |
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
|
| |
| |
| |
| |
| | |
Fixes warnings like the following:
libswscale/input.c:951:13: warning: ‘planar_rgb14be_to_a’ defined but not used
|
| | |
|
| |
| |
| |
| | |
Based on 19be5fb7 by Luca Barbato.
|
| |
| |
| |
| | |
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
| |
| |
| |
| | |
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
| |
| |
| |
| | |
Signed-off-by: Paul B Mahol <onemda@gmail.com>
|
| |
| |
| |
| |
| | |
Removed previous swscale code under '#ifndef NEW_FILTER'
and removed unused fields of SwsContext
|
| |
| |
| |
| | |
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
| |
| |
| |
| |
| |
| | |
Fixes part of Ticket5264
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
| | |
|
| |
| |
| |
| | |
Signed-off-by: Paul B Mahol <onemda@gmail.com>
|