path: root/libswscale
Commit message  Author  Age
* swscale/swscale_unscaled: fix gbrap10be md5 different on big endian system  Limin Wang  2019-11-01
|
| You can reproduce it with the command below:
| ./ffmpeg -f lavfi -i "testsrc=duration=1:rate=30" -vf format=gbrap10 -vcodec rawvideo \
|     -pix_fmt gbrap10le -flags +bitexact -sws_flags +accurate_rnd+bitexact -fflags +bitexact \
|     -frames:v 1 -f nut md5:
|
| little-endian: f91e2edd8098276579c1929e5e160416
| big-endian: ba4d011dbbdc78ccbf6cc7d698630929
|
| Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
| Reviewed-by: Paul B Mahol <onemda@gmail.com>
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/output: Avoid 64bit in Alpha in yuv2ya16_X_c_template()  Michael Niedermayer  2019-10-16
|
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/output: Correct Alpha in yuv2ya16_X_c_template()  Michael Niedermayer  2019-10-16
|
| Untested, no testcase
|
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/output: Implement Luma computation from yuv2ya16_X_c_template() without 64bit  Michael Niedermayer  2019-10-16
|
| This also reverts 21838cad2fc44023ad85e35d5c677e2f8d29a0ef
| The revert is in this commit to avoid 2 fate updates
|
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale: Fix AltiVec/VSX build with recent GCC  Daniel Kolesa  2019-10-04
|
| The argument to vec_splat_u16 must be a literal. By making the function
| always inline and marking the arguments const, gcc can turn those into
| literals, and avoid build errors like:
|
| swscale_vsx.c:165:53: error: argument 1 must be a 5-bit signed literal
|
| Fixes #7861.
|
| Signed-off-by: Daniel Kolesa <daniel@octaforge.org>
| Signed-off-by: Lauri Kasanen <cand@gmx.com>
* swscale: Replace illegal vector keyword usage in altivec code  Daniel Kolesa  2019-10-04
|
| While this technically compiles in current ffmpeg, this is only because
| ffmpeg is compiled in strict ISO C mode, which disables the builtin
| 'vector' keyword for AltiVec/VSX. Instead this gets replaced with a macro
| inside altivec.h, which defines vector to be actually __vector, which
| accepts arbitrary types.
|
| Normally, the vector keyword should be used only with plain scalar
| non-typedef types, such as unsigned int. But we have the vec_(s|u)(8|16|32)
| macros, which can be used in a portable manner, in util_altivec.h in
| libavutil.
|
| This is also consistent with other AltiVec/VSX code elsewhere in the tree.
|
| Fixes #7861.
|
| Signed-off-by: Daniel Kolesa <daniel@octaforge.org>
| Signed-off-by: Lauri Kasanen <cand@gmx.com>
* swscale/utils: Fix invalid left shifts of negative numbers  Andreas Rheinhardt  2019-09-28
|
| Affected the FATE-tests vsynth_lena-dv-411, vsynth1-dv-411,
| vsynth2-dv-411 and hevc-paramchange-yuv420p.yuv420p10.
|
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/x86/swscale: Fix undefined left shifts of negative numbers  Andreas Rheinhardt  2019-09-28
|
| This affected many FATE-tests: The number of failing tests went down
| from 663 to 344. (Both numbers exclude tests that failed because of
| unaligned accesses in code that is inside #if HAVE_FAST_UNALIGNED.)
|
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
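Left-shifting a negative signed value is undefined behavior in C, which is the class of bug the two commits above address. The following is a minimal sketch of one common remedy, multiplying by a power of two instead of shifting. The helper name `shl_signed` is hypothetical and is not taken from the FFmpeg tree; the commits themselves may have used a different rewrite.

```c
#include <assert.h>

/* Hypothetical helper (not FFmpeg code): scale a possibly negative value
 * by 2^bits without left-shifting a negative number. The shift is applied
 * only to the positive constant 1, so it is well defined; the caller must
 * still make sure the product fits in an int. */
static int shl_signed(int value, unsigned bits)
{
    assert(bits < 31); /* 1 << bits must stay representable in int */
    return value * (1 << bits);
}
```

For example, `shl_signed(-3, 4)` yields -48, whereas the expression `-3 << 4` has undefined behavior and is what sanitizer builds flag.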
* swscale/swscale: cosmetics  Limin Wang  2019-09-27
|
| Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/output: fix signed integer overflow for ya16  Paul B Mahol  2019-09-26
|
| Fixes #7666.
* swscale/swscale: delete unwanted assignments  Limin Wang  2019-09-09
|
| Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/output: fix some code indentations  Linjie Fu  2019-09-06
|
| Signed-off-by: Linjie Fu <linjie.fu@intel.com>
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* lsws/ppc/yuv2rgb_altivec: Replace vec_lvsl/vec_perm with vec_xl  Chip Kerchner  2019-08-13
|
| gcc 6.x and 7.x generate wrong code for little endian machines for the
| vec_lvsl/vec_perm instruction combos in some cases. The bug was fixed in
| version 8.x.
|
| If these instructions are replaced with vec_xl, the problem goes away for
| all versions of the compilers.
|
| Fixes ticket #7124.
* Bump minor versions again on master to keep 4.2 versions separate from master  Michael Niedermayer  2019-07-21
|
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* Bump minor versions to separate 4.2 from master  Michael Niedermayer  2019-07-21
|
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/tests/swscale: Lengthen pixfmt name buffer to 21 bytes  Michael Niedermayer  2019-05-13
|
| Some formats use longer names than 12.
|
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* libswscale: Fix possible string overflow in test.  Adam Richter  2019-05-13
|
| In libswscale/tests/swscale.c, the function fileTest() calls sscanf with
| a format of "%12s" on the character arrays srcStr[] and dstStr[], which
| are only 12 bytes. So, if the input string is 12 characters, a terminating
| null byte can be written past the end of these arrays.
|
| This bug was found by cppcheck.
|
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
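The overflow described above follows from a general rule: a `%Ns` conversion stores up to N characters plus a terminating null byte, so the destination needs at least N + 1 bytes. The sketch below illustrates the rule with a 21-byte buffer and a width of 20. The function `parse_names` and the macro `NAME_BUF` are hypothetical names for illustration, not the code in the test program.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical size for illustration: a buffer of NAME_BUF bytes holds at
 * most NAME_BUF - 1 characters plus the terminating null byte. */
#define NAME_BUF 21

/* Reads two whitespace-separated names from line into src and dst.
 * Returns 1 if both names were read, 0 otherwise. */
static int parse_names(const char *line, char src[NAME_BUF], char dst[NAME_BUF])
{
    assert(line && src && dst);
    /* "%20s" stores at most 20 characters, then the null byte: 21 bytes. */
    return sscanf(line, "%20s %20s", src, dst) == 2;
}
```

With this pairing of width and buffer size, an overlong token is cut off at 20 characters rather than written past the end of the array.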
* swscale: Add test for isSemiPlanarYUV to pixdesc_query  Philip Langdale  2019-05-12
|
| Lauri had asked me what the semi planar formats were and that reminded
| me that we could add it to pixdesc_query so we know exactly what the
| list is.
* swscale: Add support for NV24 and NV42  Philip Langdale  2019-05-12
|
| The implementation is pretty straightforward. Most of the existing NV12
| codepaths work regardless of subsampling and are re-used as is. Where
| necessary I wrote the slightly different NV24 versions.
|
| Finally, the one thing that confused me for a long time was the
| asm specific x86 path that did an explicit exclusion check for NV12.
| I replaced that with a semi-planar check and also updated the
| equivalent PPC code, which Lauri kindly checked.
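To illustrate why NV12 code can often be reused for NV24: both are semi-planar layouts with a luma plane followed by a plane of interleaved U and V samples, and only the chroma dimensions differ. The sketch below is a hypothetical helper, not code from libswscale, that splits one interleaved chroma row; the same loop serves both formats once the caller passes the right width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not FFmpeg code): split one row of interleaved
 * chroma (U0 V0 U1 V1 ...) into separate U and V rows.
 * chroma_width is the number of U/V pairs in the row: half the luma
 * width, rounded up, for NV12, and the full luma width for NV24. */
static void split_uv_row(const uint8_t *uv, uint8_t *u, uint8_t *v,
                         size_t chroma_width)
{
    assert(uv && u && v);
    for (size_t i = 0; i < chroma_width; i++) {
        u[i] = uv[2 * i];     /* even bytes carry U */
        v[i] = uv[2 * i + 1]; /* odd bytes carry V */
    }
}
```

For NV42, which stores V before U, the same helper can be called with the `u` and `v` destination pointers swapped.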
* swscale/ppc: Shorten power8 tests via a var  Lauri Kasanen  2019-05-07
|
* swscale/ppc: VSX-optimize hScale16To*  Lauri Kasanen  2019-05-07
|
| ./ffmpeg -loop 1 -s 1200x1440 -i tux16.png \
|     -s 2400x720 -f rawvideo -y -vframes 5 -pix_fmt yuv420p16le -nostats test.raw
|
| ./ffmpeg -loop 1 -s 1200x1440 -i tux16.png \
|     -s 2400x720 -f rawvideo -y -vframes 5 -pix_fmt yuv420p -nostats test.raw
|
| 32-bit mul, power8 only
|
| 2x speedup for hScale8To19_vsx (x86 SSE2 is 2.37):
|   30896 UNITS in hscale, 8192 runs, 0 skips
|   63956 UNITS in hscale, 8192 runs, 0 skips
|
| 2.06 for hScale16To15_vsx:
|   30531 UNITS in hscale, 8192 runs, 0 skips
|   63161 UNITS in hscale, 8192 runs, 0 skips
* swscale/ppc: Indent  Lauri Kasanen  2019-05-07
|
* swscale/ppc: VSX-optimize hScale8To19  Lauri Kasanen  2019-05-07
|
| ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \
|     -s 2400x720 -f rawvideo -y -vframes 5 -pix_fmt yuv420p16le -nostats test.raw
|
| 2.26 speedup (x86 SSE2 is 2.32):
|   23772 UNITS in hscale, 4096 runs, 0 skips
|   53862 UNITS in hscale, 4096 runs, 0 skips
* swscale/ppc: VSX-optimize hscale_fast  Lauri Kasanen  2019-04-30
|
| ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags fast_bilinear \
|     -s 2400x720 -f rawvideo -vframes 5 -pix_fmt abgr -nostats test.raw
|
| 4.27 speedup for hyscale_fast:
|   24796 UNITS in hyscale_fast, 4096 runs, 0 skips
|   5797 UNITS in hyscale_fast, 4096 runs, 0 skips
|
| 4.48 speedup for hcscale_fast:
|   19911 UNITS in hcscale_fast, 4095 runs, 1 skips
|   4437 UNITS in hcscale_fast, 4096 runs, 0 skips
* swscale/ppc: VSX-optimize non-full-chroma yuv2rgb_2  Lauri Kasanen  2019-04-11
|
| ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags fast_bilinear \
|     -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \
|     -cpuflags 0 -v error -
|
| 32-bit mul, power8 only.
|
| ~2x speedup:
|
| rgb24
|   24431 UNITS in yuv2packed2, 16384 runs, 0 skips
|   13783 UNITS in yuv2packed2, 16383 runs, 1 skips
| bgr24
|   24396 UNITS in yuv2packed2, 16384 runs, 0 skips
|   14059 UNITS in yuv2packed2, 16384 runs, 0 skips
| rgba
|   26815 UNITS in yuv2packed2, 16383 runs, 1 skips
|   12797 UNITS in yuv2packed2, 16383 runs, 1 skips
| bgra
|   27060 UNITS in yuv2packed2, 16384 runs, 0 skips
|   13138 UNITS in yuv2packed2, 16384 runs, 0 skips
| argb
|   26998 UNITS in yuv2packed2, 16384 runs, 0 skips
|   12728 UNITS in yuv2packed2, 16381 runs, 3 skips
| bgra
|   26651 UNITS in yuv2packed2, 16384 runs, 0 skips
|   13124 UNITS in yuv2packed2, 16384 runs, 0 skips
|
| This is a low speedup, but the x86 mmx version also gets only ~2x. The mmx
| version is also heavily inaccurate, while the vsx version has high accuracy.
* swscale/ppc: VSX-optimize yuv2rgb_full_X  Lauri Kasanen  2019-04-07
|
| ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \
|     -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \
|     -cpuflags 0 -v error -
|
| 32-bit mul, power8 only.
|
| ~6.4x speedup:
|
| rgb24
|   214278 UNITS in yuv2packedX, 16384 runs, 0 skips
|   33249 UNITS in yuv2packedX, 16384 runs, 0 skips
| bgr24
|   214616 UNITS in yuv2packedX, 16384 runs, 0 skips
|   33233 UNITS in yuv2packedX, 16384 runs, 0 skips
| rgba
|   214517 UNITS in yuv2packedX, 16384 runs, 0 skips
|   33271 UNITS in yuv2packedX, 16384 runs, 0 skips
| bgra
|   214973 UNITS in yuv2packedX, 16384 runs, 0 skips
|   33397 UNITS in yuv2packedX, 16384 runs, 0 skips
| argb
|   214613 UNITS in yuv2packedX, 16384 runs, 0 skips
|   33310 UNITS in yuv2packedX, 16384 runs, 0 skips
| bgra
|   214637 UNITS in yuv2packedX, 16384 runs, 0 skips
|   33330 UNITS in yuv2packedX, 16384 runs, 0 skips
* swscale/ppc: VSX-optimize yuv2rgb_full_2  Lauri Kasanen  2019-04-07
|
| ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags area \
|     -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \
|     -cpuflags 0 -v error -
|
| 32-bit mul, power8 only.
|
| ~4x speedup:
|
| rgb24
|   52763 UNITS in yuv2packed2, 16384 runs, 0 skips
|   13453 UNITS in yuv2packed2, 16384 runs, 0 skips
| bgr24
|   53144 UNITS in yuv2packed2, 16384 runs, 0 skips
|   13616 UNITS in yuv2packed2, 16384 runs, 0 skips
| rgba
|   52796 UNITS in yuv2packed2, 16384 runs, 0 skips
|   12904 UNITS in yuv2packed2, 16384 runs, 0 skips
| bgra
|   52732 UNITS in yuv2packed2, 16384 runs, 0 skips
|   13262 UNITS in yuv2packed2, 16384 runs, 0 skips
| argb
|   52661 UNITS in yuv2packed2, 16384 runs, 0 skips
|   12879 UNITS in yuv2packed2, 16384 runs, 0 skips
| bgra
|   52662 UNITS in yuv2packed2, 16384 runs, 0 skips
|   12932 UNITS in yuv2packed2, 16384 runs, 0 skips
* swscale/ppc: VSX-optimize non-full-chroma yuv2rgb_1  Lauri Kasanen  2019-04-07
|
| ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags fast_bilinear \
|     -s 1200x1440 -f null -vframes 100 -pix_fmt $i -nostats \
|     -cpuflags 0 -v error -
|
| 32-bit mul, power8 only.
|
| 1.8-2.3x speedup:
|
| rgb24
|   18192 UNITS in yuv2packed1, 32767 runs, 1 skips
|   9983 UNITS in yuv2packed1, 32760 runs, 8 skips
| bgr24
|   18665 UNITS in yuv2packed1, 32766 runs, 2 skips
|   9925 UNITS in yuv2packed1, 32763 runs, 5 skips
| rgba
|   20239 UNITS in yuv2packed1, 32767 runs, 1 skips
|   8794 UNITS in yuv2packed1, 32759 runs, 9 skips
| bgra
|   20354 UNITS in yuv2packed1, 32768 runs, 0 skips
|   8770 UNITS in yuv2packed1, 32761 runs, 7 skips
| argb
|   20185 UNITS in yuv2packed1, 32768 runs, 0 skips
|   8761 UNITS in yuv2packed1, 32761 runs, 7 skips
| bgra
|   20360 UNITS in yuv2packed1, 32766 runs, 2 skips
|   8759 UNITS in yuv2packed1, 32764 runs, 4 skips
|
| This is a low speedup, but the x86 mmx version also gets only ~2x. The mmx
| version is also heavily inaccurate, while the vsx version has high accuracy.
* swscale/ppc: VSX-optimize yuv2422_X  Lauri Kasanen  2019-03-31
|
| ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \
|     -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \
|     -cpuflags 0 -v error -
|
| 7.2x speedup:
|
| yuyv422
|   126354 UNITS in yuv2packedX, 16384 runs, 0 skips
|   16383 UNITS in yuv2packedX, 16382 runs, 2 skips
| yvyu422
|   117669 UNITS in yuv2packedX, 16384 runs, 0 skips
|   16271 UNITS in yuv2packedX, 16379 runs, 5 skips
| uyvy422
|   117310 UNITS in yuv2packedX, 16384 runs, 0 skips
|   16226 UNITS in yuv2packedX, 16382 runs, 2 skips
* swscale/ppc: VSX-optimize yuv2422_2  Lauri Kasanen  2019-03-31
|
| ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags area \
|     -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \
|     -cpuflags 0 -v error -
|
| 5.1x speedup:
|
| yuyv422
|   19339 UNITS in yuv2packed2, 16384 runs, 0 skips
|   3718 UNITS in yuv2packed2, 16383 runs, 1 skips
| yvyu422
|   19438 UNITS in yuv2packed2, 16384 runs, 0 skips
|   3800 UNITS in yuv2packed2, 16380 runs, 4 skips
| uyvy422
|   19128 UNITS in yuv2packed2, 16384 runs, 0 skips
|   3721 UNITS in yuv2packed2, 16380 runs, 4 skips
* swscale/ppc: VSX-optimize yuv2422_1  Lauri Kasanen  2019-03-31
|
| ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \
|     -s 1200x1440 -f null -vframes 100 -pix_fmt $i -nostats \
|     -cpuflags 0 -v error -
|
| 15.3x speedup:
|
| yuyv422
|   14513 UNITS in yuv2packed1, 32768 runs, 0 skips
|   949 UNITS in yuv2packed1, 32767 runs, 1 skips
| yvyu422
|   14516 UNITS in yuv2packed1, 32767 runs, 1 skips
|   943 UNITS in yuv2packed1, 32767 runs, 1 skips
| uyvy422
|   14530 UNITS in yuv2packed1, 32767 runs, 1 skips
|   941 UNITS in yuv2packed1, 32766 runs, 2 skips
* swscale/swscale_unscaled: Fix chroma slice height  Michael Niedermayer  2019-03-28
|
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/swscale_unscaled: fixed the issue that when width/height is not 2-multiple, transition of nv12 to u/v planes is not completed.  Dong, Jerry  2019-03-28
|
| Signed-off-by: Dong, Jerry <jerry.dong@intel.com>
| Signed-off-by: Decai Lin <decai.lin@intel.com>
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
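The commit message does not show the change itself. One plausible reading, stated here as an assumption, is that chroma dimensions were rounded down for odd luma sizes, which leaves the last column or row unconverted. The sketch below shows the rounding-up computation with a hypothetical helper `chroma_dim` that is not FFmpeg code.

```c
#include <assert.h>

/* Hypothetical helper (not FFmpeg code): size of a chroma plane dimension
 * for a given luma dimension and log2 subsampling factor (1 in both
 * directions for 4:2:0 formats such as NV12). The result is rounded up so
 * that the last odd luma column or row still gets a chroma sample. */
static int chroma_dim(int luma_dim, int log2_subsample)
{
    assert(luma_dim >= 0 && log2_subsample >= 0 && log2_subsample < 16);
    return (luma_dim + (1 << log2_subsample) - 1) >> log2_subsample;
}
```

For a luma width of 5, a plain `5 >> 1` gives 2 chroma samples, while `chroma_dim(5, 1)` gives the 3 that are needed to cover every pixel.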
* swscale/ppc: VSX-optimize yuv2rgb_full  Lauri Kasanen  2019-03-27
|
| ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \
|     -s 1200x1440 -f null -vframes 100 -pix_fmt $i -nostats \
|     -cpuflags 0 -v error -
|
| This uses 32-bit mul, so POWER8 only.
|
| The following output formats get about 4.5x speedup:
|
| rgb24
|   39980 UNITS in yuv2packed1, 32768 runs, 0 skips
|   8774 UNITS in yuv2packed1, 32768 runs, 0 skips
| bgr24
|   40069 UNITS in yuv2packed1, 32768 runs, 0 skips
|   8772 UNITS in yuv2packed1, 32766 runs, 2 skips
| rgba
|   39759 UNITS in yuv2packed1, 32768 runs, 0 skips
|   8681 UNITS in yuv2packed1, 32767 runs, 1 skips
| bgra
|   39729 UNITS in yuv2packed1, 32768 runs, 0 skips
|   8696 UNITS in yuv2packed1, 32766 runs, 2 skips
| argb
|   39766 UNITS in yuv2packed1, 32768 runs, 0 skips
|   8672 UNITS in yuv2packed1, 32766 runs, 2 skips
| bgra
|   39784 UNITS in yuv2packed1, 32768 runs, 0 skips
|   8659 UNITS in yuv2packed1, 32767 runs, 1 skips
* swscale: Remove duplicated code  Lauri Kasanen  2019-03-27
|
| In this function, the exact same clamping happens both in the if and
| unconditionally.
* swscale/ppc: Add av_unused to template vars only used in one includer  Lauri Kasanen  2019-03-20
|
* swscale/ppc: Clean up some mixed decl warnings  Lauri Kasanen  2019-03-20
|
* libswscale/ppc: VSX-optimize 9-16 bit yuv2planeX  Lauri Kasanen  2019-02-05
|
| ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p16be \
|     -s 1920x1728 -f null -vframes 100 -v error -nostats -
|
| 9-14 bit funcs get about 6x speedup, 16-bit gets about 15x.
| Fate passes, each format tested with an image to video conversion.
|
| Only POWER8 includes 32-bit vector multiplies, so POWER7 is locked out
| of the 16-bit function. This includes the vec_mulo/mule functions too,
| not just vmuluwm.
|
| With TIMER_REPORT skips disabled:
|
| yuv420p9le
|   12412 UNITS in planarX, 131072 runs, 0 skips
|   73136 UNITS in planarX, 131072 runs, 0 skips
| yuv420p9be
|   12481 UNITS in planarX, 131072 runs, 0 skips
|   73410 UNITS in planarX, 131072 runs, 0 skips
| yuv420p10le
|   12322 UNITS in planarX, 131072 runs, 0 skips
|   72546 UNITS in planarX, 131072 runs, 0 skips
| yuv420p10be
|   12291 UNITS in planarX, 131072 runs, 0 skips
|   72935 UNITS in planarX, 131072 runs, 0 skips
| yuv420p12le
|   12316 UNITS in planarX, 131072 runs, 0 skips
|   72708 UNITS in planarX, 131072 runs, 0 skips
| yuv420p12be
|   12319 UNITS in planarX, 131072 runs, 0 skips
|   72577 UNITS in planarX, 131072 runs, 0 skips
| yuv420p14le
|   12259 UNITS in planarX, 131072 runs, 0 skips
|   72516 UNITS in planarX, 131072 runs, 0 skips
| yuv420p14be
|   12440 UNITS in planarX, 131072 runs, 0 skips
|   72962 UNITS in planarX, 131072 runs, 0 skips
| yuv420p16le
|   10548 UNITS in planarX, 131072 runs, 0 skips
|   73429 UNITS in planarX, 131072 runs, 0 skips
| yuv420p16be
|   10634 UNITS in planarX, 131072 runs, 0 skips
|   150959 UNITS in planarX, 131072 runs, 0 skips
|
| Signed-off-by: Lauri Kasanen <cand@gmx.com>
* swscale/yuv2rgb: Return a more specific error code from ff_yuv2rgb_c_init_tables()  Michael Niedermayer  2019-01-01
|
| Reviewed-by: Paul B Mahol <onemda@gmail.com>
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/output: Altivec-optimize float yuv2plane1  Lauri Kasanen  2018-12-26
|
| This function wouldn't benefit from VSX instructions, so I put it under
| altivec.
|
| ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt grayf32le \
|     -f null -vframes 100 -v error -nostats -
|
| 3743 UNITS in planar1, 65495 runs, 41 skips
|
| -cpuflags 0
|
| 23511 UNITS in planar1, 65530 runs, 6 skips
|
| grayf32be
|
| 4647 UNITS in planar1, 65449 runs, 87 skips
|
| -cpuflags 0
|
| 28608 UNITS in planar1, 65530 runs, 6 skips
|
| The native speedup is 6.28133, and the bswapping one 6.15623.
|
| Fate passes, each format tested with an image to video conversion.
|
| Signed-off-by: Lauri Kasanen <cand@gmx.com>
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
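The speedup figures quoted in these benchmark commits appear to be the ratio of the two timer readings: the UNITS value measured with `-cpuflags 0` divided by the value measured with the optimized path. A minimal sketch of that arithmetic follows; the function name `speedup` is hypothetical.

```c
#include <assert.h>

/* Speedup ratio from two timer readings: units spent by the baseline
 * (-cpuflags 0) divided by units spent by the optimized path. */
static double speedup(double baseline_units, double optimized_units)
{
    assert(optimized_units > 0.0);
    return baseline_units / optimized_units;
}
```

With the numbers above, `speedup(23511, 3743)` is about 6.2813 for grayf32le and `speedup(28608, 4647)` is about 6.1562 for grayf32be, matching the values reported in the commit message.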
* swscale/output: VSX-optimize 16-bit yuv2plane1  Lauri Kasanen  2018-12-14
|
| ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p16le \
|     -f null -vframes 100 -v error -nostats -
|
| 2120 UNITS in planar1, 65393 runs, 143 skips
|
| -cpuflags 0
|
| 19157 UNITS in planar1, 65512 runs, 24 skips
|
| 9.03632 speedup, 16be similarly.
|
| Fate passes, each format tested with an image to video conversion.
|
| Signed-off-by: Lauri Kasanen <cand@gmx.com>
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/output: VSX-optimize nbps yuv2plane1  Lauri Kasanen  2018-12-12
|
| ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p9le \
|     -f null -vframes 100 -v error -nostats -
|
| Speedups:
| yuv2plane1_9BE_vsx 11.2042
| yuv2plane1_9LE_vsx 11.156
| yuv2plane1_10BE_vsx 9.89428
| yuv2plane1_10LE_vsx 10.3637
| yuv2plane1_12BE_vsx 9.71923
| yuv2plane1_12LE_vsx 11.0404
| yuv2plane1_14BE_vsx 10.1763
| yuv2plane1_14LE_vsx 11.2728
|
| Fate passes, each format tested with an image to video conversion.
|
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/ppc: Move VSX-using code to its own file  Lauri Kasanen  2018-12-04
|
| Passes fate on LE (with "lavc/jrevdct: Avoid an aliasing violation" applied).
|
| Signed-off-by: Lauri Kasanen <cand@gmx.com>
| Tested-by: Michael Kostylev on BE
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/output: Altivec-optimize yuv2plane1_8  Lauri Kasanen  2018-11-26
|
| ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p \
|     -f null -vframes 100 -v error -nostats -
|
| 1158 UNITS in planar1, 65528 runs, 8 skips
|
| -cpuflags 0
|
| 19082 UNITS in planar1, 65533 runs, 3 skips
|
| 16.48 speedup ratio. On x86, SSE2 is ~7. Curiously, the Power C version
| takes as many cycles as the x86 SSE2 version, yikes it's fast.
|
| Note that this function uses VSX instructions, but is not marked so.
| This is because several existing functions also make that mistake.
| I'll submit a patch moving them once this is reviewed.
|
| Signed-off-by: Lauri Kasanen <cand@gmx.com>
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale : add support for YUVA444P12 and YUVA422P12  Martin Vignali  2018-11-24
|
* Bump minor version for master after 4.1 branchpoint  Michael Niedermayer  2018-11-02
|
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* Bump minor versions for branching 4.1  Michael Niedermayer  2018-11-02
|
| Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/swscale_unscaled : rename packed_16bpc_bswap  Martin Vignali  2018-10-24
|
| is used for packed and planar format
* swscale/unscaled : add grayf32 le to be  Martin Vignali  2018-10-24
|
* swscale/utils : simplify unscaled initial test for float pixfmt  Martin Vignali  2018-10-24
|