| Commit message (Collapse) | Author | Age |
|
|
|
|
|
| |
In preparation for implementation of skip_frame.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
|
|
|
|
|
| |
Continues where commit 52c75d486ed5f75cbb79e5dbd07b7aef24f3071f left off.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
|
|
|
|
|
|
|
|
| |
Fixes: runtime error: left shift of negative value -1
Fixes: 2299/clusterfuzz-testcase-minimized-4843509351710720
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
|
|\
| |
| |
| |
| |
| |
| | |
* commit '4abe3b049d987420eb891f74a35af2cebbf52144':
hevc: rename hevc.[ch] to hevcdec.[ch]
Merged-by: Clément Bœsch <u@pkh.me>
|
| |
| |
| |
| |
| | |
This is more consistent with the rest of libav and frees up the hevc.h
name for decoder-independent shared declarations.
|
|\|
| |
| |
| |
| |
| |
| | |
* commit 'ba479f3daafc7e4359ec1212164569ebe59f0bb7':
hevc: Change type of array stride parameters to ptrdiff_t
Merged-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| | |
ptrdiff_t is the correct type for array strides and similar.
|
|\|
| |
| |
| |
| |
| |
| | |
* commit '4f81f8dba735c212efae077c4fec8ad4fe53b352':
Drop unnecessary golomb.h #includes
Merged-by: Clément Bœsch <clement@stupeflix.com>
|
| | |
|
|\|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* commit '059a934806d61f7af9ab3fd9f74994b838ea5eba':
lavc: Consistently prefix input buffer defines
Conflicts:
doc/examples/decoding_encoding.c
libavcodec/4xm.c
libavcodec/aac_adtstoasc_bsf.c
libavcodec/aacdec.c
libavcodec/aacenc.c
libavcodec/ac3dec.h
libavcodec/asvenc.c
libavcodec/avcodec.h
libavcodec/avpacket.c
libavcodec/dvdec.c
libavcodec/ffv1enc.c
libavcodec/g2meet.c
libavcodec/gif.c
libavcodec/h264.c
libavcodec/h264_mp4toannexb_bsf.c
libavcodec/huffyuvdec.c
libavcodec/huffyuvenc.c
libavcodec/jpeglsenc.c
libavcodec/libxvid.c
libavcodec/mdec.c
libavcodec/motionpixels.c
libavcodec/mpeg4videodec.c
libavcodec/mpegvideo.c
libavcodec/noise_bsf.c
libavcodec/nuv.c
libavcodec/nvenc.c
libavcodec/options.c
libavcodec/parser.c
libavcodec/pngenc.c
libavcodec/proresenc_kostya.c
libavcodec/qsvdec.c
libavcodec/svq1enc.c
libavcodec/tiffenc.c
libavcodec/truemotion2.c
libavcodec/utils.c
libavcodec/utvideoenc.c
libavcodec/vc1dec.c
libavcodec/wmalosslessdec.c
libavformat/adxdec.c
libavformat/aiffdec.c
libavformat/apc.c
libavformat/apetag.c
libavformat/avidec.c
libavformat/bink.c
libavformat/cafdec.c
libavformat/flvdec.c
libavformat/id3v2.c
libavformat/isom.c
libavformat/matroskadec.c
libavformat/mov.c
libavformat/mpc.c
libavformat/mpc8.c
libavformat/mpegts.c
libavformat/mvi.c
libavformat/mxfdec.c
libavformat/mxg.c
libavformat/nutdec.c
libavformat/oggdec.c
libavformat/oggparsecelt.c
libavformat/oggparseflac.c
libavformat/oggparseopus.c
libavformat/oggparsespeex.c
libavformat/omadec.c
libavformat/rawdec.c
libavformat/riffdec.c
libavformat/rl2.c
libavformat/rmdec.c
libavformat/rtpdec_latm.c
libavformat/rtpdec_mpeg4.c
libavformat/rtpdec_qdm2.c
libavformat/rtpdec_svq3.c
libavformat/sierravmd.c
libavformat/smacker.c
libavformat/smush.c
libavformat/spdifenc.c
libavformat/takdec.c
libavformat/tta.c
libavformat/utils.c
libavformat/vqf.c
libavformat/westwood_vqa.c
libavformat/xmv.c
libavformat/xwma.c
libavformat/yop.c
Merged-by: Michael Niedermayer <michael@niedermayer.cc>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
+~9% speed on Core i5 on test sample.
All frames are treated as ref frames, skipping only applies
at level "all". The following mail contains information on
how to improve that:
http://ffmpeg.org/pipermail/ffmpeg-devel/2015-July/176116.html
|
|\|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* commit 'b11acd57326db6c2cc1475dd0bea2a06fbc85aa2':
hevc: remove HEVCContext usage from hevc_ps
Conflicts:
libavcodec/hevc.c
libavcodec/hevc_cabac.c
libavcodec/hevc_filter.c
libavcodec/hevc_mvs.c
libavcodec/hevc_ps.c
libavcodec/hevc_refs.c
libavcodec/hevcpred_template.c
Merged-by: Michael Niedermayer <michael@niedermayer.cc>
|
| |
| |
| |
| |
| |
| | |
Factor out the parameter sets into a separate struct and use it instead.
This will allow us to reuse this code in the parser.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
hevc seems to be the only place where the C implementation
of the av_clip function is explicitly selected, precluding
platform-specific optimizations
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
| |
| |
| |
| | |
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| |
| |
| | |
The copies are only needed when data must be restored, so skip them
when it must not be.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Original x86 intrinsics code and initial yasm port by Pierre-Edouard Lepere.
Refactoring and optimizations by James Almer.
Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U
Width 32
158583 decicycles in edge, sao_edge_filter_8 runs, 0 skips
5205 decicycles in ff_hevc_sao_edge_filter_32_8_ssse3, 32767 runs, 1 skips
2942 decicycles in ff_hevc_sao_edge_filter_32_8_avx2, 32767 runs, 1 skips
Width 64
705639 decicycles in sao_edge_filter_8, 262144 runs, 0 skips
19224 decicycles in ff_hevc_sao_edge_filter_64_8_ssse3, 262111 runs, 33 skips
10433 decicycles in ff_hevc_sao_edge_filter_64_8_avx2, 262115 runs, 29 skips
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| |
| |
| | |
The stride_src parameter is always 2 * MAX_PB_SIZE + FF_INPUT_BUFFER_PADDING_SIZE.
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
As with sao_band_filter, pass instead the two variables from the struct needed in the function.
This simplifies writing asm optimized versions.
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| |
| | |
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
|
| |
| |
| |
| |
| |
| | |
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
requirements in FFmpeg
Use edge emu buffers
And enable the code unconditionally
Speed difference without USE_SAO_SMALL_BUFFER and with the new code:
Decicycles: 26772->26220 (BO32), 83803->80942 (BO64)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
cherry picked from commit 5d9f79edef2c11b915bdac3a025b59a32082f409
SAO edge filter uses pre-SAO pixel data on the left and top of the ctb, so
this data must be kept available. This was done previously by having 2
copies of the frame, one before and one after SAO.
This commit reduces the storage to just that, instead of the previous whole
frame.
Commit message taken from patch by Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| | |
Found-by: Timothy Gu <timothygu99@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| |
| | |
cherry picked from commit 8e50557707d2ec11ccad657470b2e140f314348e
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
For band filter, source and destination are aligned (except for 16x16 ctbs),
and otherwise, they are most often aligned. Overall, the total width is also
too small for amortizing memcpy.
Timings (using an intrinsic version of edge filters):
B/32 B/64 E/32 E/64
Before: 32045 93952 38925 126896
After: 26772 83803 33942 117182
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Original x86 intrinsics code and initial 8bit yasm port by Pierre-Edouard Lepere.
10/12bit yasm ports, refactoring and optimizations by James Almer
Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U
width 32
40338 decicycles in sao_band_filter_0_8, 2048 runs, 0 skips
8056 decicycles in ff_hevc_sao_band_filter_8_32_sse2, 2048 runs, 0 skips
7458 decicycles in ff_hevc_sao_band_filter_8_32_avx, 2048 runs, 0 skips
4504 decicycles in ff_hevc_sao_band_filter_8_32_avx2, 2048 runs, 0 skips
width 64
136046 decicycles in sao_band_filter_0_8, 16384 runs, 0 skips
28576 decicycles in ff_hevc_sao_band_filter_8_32_sse2, 16384 runs, 0 skips
26707 decicycles in ff_hevc_sao_band_filter_8_32_avx, 16384 runs, 0 skips
14387 decicycles in ff_hevc_sao_band_filter_8_32_avx2, 16384 runs, 0 skips
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| |
| |
| |
| | |
Pass instead the two variables from the struct needed in the function.
This simplifies writing asm optimized versions of the function
Signed-off-by: James Almer <jamrial@gmail.com>
|
| |
| |
| |
| | |
Signed-off-by: James Almer <jamrial@gmail.com>
|
|\|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* commit '7acdd3a1275bcd9cad48f9632169f6bbaeb39d84':
hevc_filter: avoid excessive calls to ff_hevc_get_ref_list()
Conflicts:
libavcodec/hevc_filter.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| |
| |
| |
| | |
1) each of the loops run within a single CTB, so the relevant reference
list is constant
2) when that CTB is, or lies on the same slice as, the current one, we
can use a simple access instead of a relatively expensive call to
ff_hevc_get_ref_list()
|
|\|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* commit 'a7a17e3f1915ce69b787dc58c5d8dba0910fc0a4':
hevc_filter: move some conditions out of loops
Conflicts:
libavcodec/hevc_filter.c
This is possibly less readable than the variant used before.
Thus please take a look and if people agree its worse, dont
hesitate to revert.
See: 83976e40e89655162e5394cf8915d9b6d89702d9
Merged-by: Michael Niedermayer <michaelni@gmx.at>
|
| | |
|
| |
| |
| |
| |
| | |
Use named constants instead of magic numbers, avoid using variables with
inverse meaning from what their name implies.
|
| |
| |
| |
| | |
The if() around those loops ensures this condition is always false.
|
| |
| |
| |
| |
| | |
ff_hevc_deblocking_boundary_strengths() is never called if the
deblocking filter is disabled for the slice.
|
| |
| |
| |
| |
| |
| |
| | |
The x86 asm expects int32_t so use that type.
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| |
| |
| |
| | |
This should help cache locality. On win64:
Before: 1397x cycles, 16216 bytes
After: 1369x cycles, 16040 bytes
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| |
| |
| | |
Signed-off-by: Mickaël Raulet <mraulet@insa-rennes.fr>
cherry picked from commit 348bebedc0012aae201419669fca1eb61ec93ca6
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| |
| | |
cherry picked from commit 6f58c111ad9920d983bb18eacf901193bac5d937
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
|\|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* commit '73bb8f61d48dbf7237df2e9cacd037f12b84b00a':
hevcdsp: remove an unneeded variable in the loop filter
Conflicts:
libavcodec/hevc_filter.c
See: d7e162d46b4a0fc03ca5161cdcac840152f048cb
Merged-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| | |
beta0 and beta1 will always be the same
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
beta0 and beta1 will always be the same within a CU
Signed-off-by: Mickaël Raulet <mraulet@insa-rennes.fr>
cherry picked from commit 4a23d824741a289c7d2d2f2871d1e2621b63fa1b
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| |
| | |
cherry picked from commit 7d05c95ac5a63d7675bf645e74b4cf1fffff4796
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There's a lag of one CTB line for SAO behind deblocking filter, except for
last line. However, once SAO has been completed on a line, all its pixels,
i.e. up to y+ctb_size are filtered and ready to be used as reference.
Without SAO, when deblocking filter finishes a CTB line, only the bottom
bottom 4 pixels may be filtered when next CTB is process by the deblocing.
The await_progess for hevc then checks whether the bottom pixels of a PU
requires access beyond that point, so the reporting should effectively
report up to the the above limits.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| | |
cherry picked from commit 4a16cb2c70728a55d2fd723aff01b13ea259c4df
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
- adding one extra pixel all around the frame
- do not copy when SAO is not applied
5% improvement
cherry picked from commit 10fc29fc19a12c4d8168fbe1a954b76386db12d0
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| |
| |
| |
| | |
- support for 4:2:2 and 4:4:4 up to 12 bits
- add a new profile for range extension
(cherry picked from commit d3c067fa65bbc871758d28aa07f54123430ca346)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| |
| | |
(cherry picked from commit 8fafc96a9805d11bfe32537c8f78a294a5844065)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|
| |
| |
| |
| |
| |
| | |
(cherry picked from commit f2c5f647cec786df26f442a85e6d685a131a50c9)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
|