| Commit message (Collapse) | Author | Age |
| |
|
| |
|
|
|
|
|
|
|
| |
Also disable ff_rv34_idct_dc_add_mmx on x86_64 as the presence of sse2
is guaranteed in such builds.
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
|
|
|
|
|
| |
Also disable ff_vp8_idct_dc_add_mmx on x86_64 as the presence of sse2
is guaranteed in such builds.
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
|
|
|
|
|
|
|
| |
The assumption this is based on is wrong, the code is not always run with bitexact flags
This reverts commit a956164e1eb3418922cae949f02ad4035f013213, reversing
changes made to f6005907fdeb9e4de37568ed5c1a8e7b869126f6.
Approved-by: James Almer <jamrial@gmail.com>
|
|\
| |
| |
| |
| |
| |
| | |
* commit 'd06dfaa5cbdd20acfd2364b16c0f4ae4ddb30a65':
x86: huffyuv: Use EXTERNAL_SSSE3_FAST convenience macro where appropriate
Merged-by: James Almer <jamrial@gmail.com>
|
| | |
|
|\|
| |
| |
| |
| |
| |
| | |
* commit '4efab89332ea39a77145e8b15562b981d9dbde68':
x86: Use *_FAST/*_SLOW CPU feature detection macros where appropriate
Merged-by: James Almer <jamrial@gmail.com>
|
| | |
|
|\|
| |
| |
| |
| |
| |
| | |
* commit '0a39c9ac0bfd7345fe676b4e2707d9cec3cbb553':
x86: hpeldsp: Don't check for bitexact flag when initializing VP3-specific code
Merged-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| | |
That code is only ever initialized with that flag set.
|
|\|
| |
| |
| |
| |
| |
| | |
* commit '95c1df929b92d81454656c222a35ec5f7db576b4':
x86: hpeldsp: Drop unused function parameters
Merged-by: James Almer <jamrial@gmail.com>
|
| | |
|
|\|
| |
| |
| |
| |
| |
| | |
* commit 'c3e83ad3b7d75f3597f47ada2616ba4479665009':
x86: hpeldsp: Use EXTERNAL_SSE2_FAST where appropriate
Merged-by: James Almer <jamrial@gmail.com>
|
| | |
|
|\|
| |
| |
| |
| |
| |
| | |
* commit '1dfc3cf89d0eb026af28be46294b85d79499ffb5':
x86: hpeldsp: Split off VP3-specific bits into a separate file
Merged-by: James Almer <jamrial@gmail.com>
|
| | |
|
| | |
|
|\|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* commit 'fca3c3b61952aacc45e9ca54d86a762946c21942':
hevc: Add AVX2 DC IDCT
Mostly noop as we already have that code.
In the ASM, code is merged with the exception of SECTION which is kept
uppercase for consistency with the rest of the codebase.
Still in the ASM, the prototype comment is fixed to honor the '_' added
from the original commit.
idct_dc_proto() is dropped as it's not used anymore here.
Merged-by: Clément Bœsch <cboesch@gopro.com>
|
| |
| |
| |
| |
| |
| |
| | |
Originally written by Pierre Edouard Lepere <pierre-edouard.lepere@insa-rennes.fr>.
Integrated to Libav by Josh de Kock <josh@itanimul.li>.
Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
|
|\|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d':
hevc: Separate adding residual to prediction from IDCT
This commit should be a noop but isn't because of the following renames:
- transform_add → add_residual
- transform_skip → dequant
- idct_4x4_luma → transform_4x4_luma
Merged-by: Clément Bœsch <cboesch@gopro.com>
|
| |
| |
| |
| | |
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| | |
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| | |
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| | |
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| |
| |
| | |
Several codecs other than huffyuv use them.
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| |
| |
| |
| | |
make fate passes
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
fixes `operation size not specified` errors as described here:
http://stackoverflow.com/questions/36854583/compiling-ffmpeg-for-kali-linux-2
I rebuilt again with yasm and made sure it didn't break that.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
| |
| |
| |
| | |
Signed-off-by: Paul B Mahol <onemda@gmail.com>
|
| |
| |
| |
| | |
32-bit msvc.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Yorkfield:
- mmx2: 2.53x (504 vs. 199 cycles)
- sse2: 3.83x (504 vs. 131 cycles)
Nehalem:
- mmx2: 2.42x (365 vs. 151 cycles)
- sse2: 3.56x (365 vs. 103 cycles)
Skylake:
- mmx2: 1.81x (308 vs. 170 cycles)
- sse2: 2.84x (308 vs. 108 cycles)
- avx: 2.93x (308 vs. 105 cycles)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Yorkfield:
- mmx2: 2.45x (279 vs. 114 cycles)
- sse2: 3.36x (279 vs. 83 cycles)
Nehalem:
- mmx2: 2.10x (192 vs. 92 cycles)
- sse2: 2.84x (192 vs. 68 cycles)
Skylake:
- mmx2: 1.75x (170 vs. 97 cycles)
- sse2: 2.47x (170 vs. 69 cycles)
- avx: 2.47x (170 vs. 69 cycles)
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Yorkfield:
- sse2:
- complex: 4.13x faster (1514 vs. 367 cycles)
- simple: 4.38x faster (1836 vs. 419 cycles)
Skylake:
- sse2:
- complex: 3.61x faster ( 936 vs. 260 cycles)
- simple: 3.97x faster (1126 vs. 284 cycles)
- avx (versus sse2):
- complex: 1.07x faster (260 vs. 244 cycles)
- simple: 1.03x faster (284 vs. 274 cycles)
|
| |
| |
| |
| | |
2.87 times faster (1830 vs. 638 cycles)
|
| |
| |
| |
| | |
2.1 times faster (401 vs. 194 cycles)
|
| |
| |
| |
| |
| |
| | |
Fixes compilation with Yasm 1.1.0 and older.
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| |
| |
| | |
Also a small cosmetic change to the avx2 idct16 version to make it
explicit that one of the arguments to the write-out macros is unused
for >=avx2 (it uses pmovzxbw instead of punpcklbw).
|
|\|
| |
| |
| |
| |
| |
| | |
* commit '4a081f224e12f4227ae966bcbdd5384f22121ecf':
libavcodec: fix constness in clobber test avcodec_open2() wrappers
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
|
| |
| |
| |
| | |
Signed-off-by: Martin Storsjö <martin@martin.st>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Thanks to Mathieu Malaterre <malat@debian.org> for reporting the
Que/Queue typo. (https://bugs.debian.org/839542)
Reviewed-by: Lou Logan <lou@lrcd.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Performance improvements:
quant_bands:
with: 681 decicycles in quant_bands, 8388453 runs, 155 skips
without: 1190 decicycles in quant_bands, 8388386 runs, 222 skips
Around 42% for the function
Twoloop coder:
abs_pow34:
with/without: 7.82s/8.17s
Around 4% for the entire encoder
Both:
with/without: 7.15s/8.17s
Around 12% for the entire encoder
Fast coder:
abs_pow34:
with/without: 3.40s/3.77s
Around 10% for the entire encoder
Both:
with/without: 3.02s/3.77s
Around 20% faster for the entire encoder
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Reviewed-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| | |
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| |
| | |
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| |
| |
| |
| | |
Fixes trac 5579
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Acked-by: Michael Niedermayer <michael@niedermayer.cc>
|
| |
| |
| |
| | |
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
| |
| |
| |
| |
| |
| |
| | |
Clean some header includes and use the same naming scheme as
in ttaencdsp
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| | |
Signed-off-by: James Almer <jamrial@gmail.com>
|
|\|
| |
| |
| |
| |
| |
| | |
* commit '9df889a5f116c1ee78c2f239e0ba599c492431aa':
h264: rename h264.[ch] to h264dec.[ch]
Merged-by: Clément Bœsch <u@pkh.me>
|