| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
| |
This fixes out of array reads and/or infinite loops.
30 is the maximum number of bits that can be read into
coeff_abs below.
CC: libav-stable@libav.org
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Martin Storsjö <martin@martin.st>
|
| |
|
|
|
|
| |
Bug-Id: CID 1087088
|
| |
|
| |
|
|
|
|
| |
Added libavutil/timer.h include to all files with {START,STOP}_TIMER.
|
|
|
|
|
| |
This ensures that decode_cabac_residual_internal actually does get inlined,
which it otherwise does not, even though it is marked as always_inline.
|
| |
|
|
|
|
| |
This way it does not look like a constant.
|
|
|
|
| |
This way it does not look like a constant.
|
|
|
|
| |
This way it does not look like a constant.
|
|
|
|
| |
This way it does not look like a constant.
|
|
|
|
| |
This way it does not look like a constant.
|
|
|
|
| |
This way it does not look like a constant.
|
|
|
|
| |
This way it does not look like a constant.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
Instead, keep them in the bitstream buffer until we read them verbatim,
this saves a memcpy() and a subsequent clearing of the target buffer.
decode_cabac+decode_mb for a sample file (CAPM3_Sony_D.jsv) goes from
6121.4 to 6095.5 cycles, i.e. 26 cycles faster.
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most of the changes are just trivial are just trivial replacements of
fields from MpegEncContext with equivalent fields in H264Context.
Everything in h264* other than h264.c are those trivial changes.
The nontrivial parts are:
1) extracting a simplified version of the frame management code from
mpegvideo.c. We don't need last/next_picture anymore, since h264 uses
its own more complex system already and those were set only to appease
the mpegvideo parts.
2) some tables that need to be allocated/freed in appropriate places.
3) hwaccels -- mostly trivial replacements.
for dxva, the draw_horiz_band() call is moved from
ff_dxva2_common_end_frame() to per-codec end_frame() callbacks,
because it's now different for h264 and MpegEncContext-based
decoders.
4) svq3 -- it does not use h264 complex reference system, so I just
added some very simplistic frame management instead and dropped the
use of ff_h264_frame_start(). Because of this I also had to move some
initialization code to svq3.
Additional fixes for chroma format and bit depth changes by
Janne Grunau <janne-libav@jannau.net>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
|
|
|
|
|
|
| |
It does not help as an abstraction and adds dsputil dependencies.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
|
|
|
|
|
|
|
|
|
| |
The variable is copied to subsequent threads at the same time, so this
may cause wrong ref_count[] values to be copied to subsequent threads.
This bug was found using TSAN.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
This removes a dependency on implementation details from generic
code and allows easy addition of the equivalent optimisation for
other architectures than x86.
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
|
|
|
|
|
|
|
|
|
| |
This adds a hand-optimized assembly version for get_cabac much like the
existing one, but it works if the table offsets are RIP-relative.
Compared to the non-RIP-relative version this adds 2 lea instructions
and it needs one extra register. get_cabac() gets about 40% faster, for
an overall speedup of about 5%.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The reason is this is easier for PIC code (in particular on darwin...).
Keep the old names as pointers (static in cabac_functions.h so gcc
knows these are just immediate offsets) so the c code can nicely stay the same
(alternatively could use offsets directly in the functions needing the
tables). This should produce the same code as before with non-pic and better
code (confirmed) with pic.
The assembly uses the new table but still won't work for PIC case.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
|
| |
|
| |
|
|
|
|
|
| |
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
|
|
|
|
|
|
|
|
|
|
|
| |
Conversion of the luma intra prediction mode to one of the constrained
("alzheimer") ones can happen by crafting special bitstreams, causing
a crash because we'll call a NULL function pointer for 16x16 block intra
prediction, since constrained intra prediction functions are only
implemented for chroma (8x8 blocks).
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes standalone compilation of some decoders with --disable-optimizations.
cabac.h defines some inline functions that use symbols from cabac.c. Without
optimizations these inline functions are not eliminated and linking fails with
references to non-existing symbols.
Splitting the inline functions off into their own header and only #including
it in the places where the inline functions are used allows #including cabac.h
from anywhere without ill effects.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Originally, prior to 8742a4ff8, the caller code was compiled
within this condition:
ARCH_X86 && HAVE_7REGS && HAVE_EBX_AVAILABLE && !defined(BROKEN_RELOCATIONS)
Since HAVE_7REGS is defined as
(ARCH_X86_64 || (HAVE_EBX_AVAILABLE && HAVE_EBP_AVAILABLE))
the subcondition HAVE_7REGS && HAVE_EBX_AVAILABLE is equal
to HAVE_7REGS (for 32 bit at least). The correct simplification
of the original condition thus is HAVE_7REGS, not
HAVE_EBX_AVAILABLE.
This fixes compilation in some cases where HAVE_EBP_AVAILABLE = 0
and HAVE_EBX_AVAILABLE = 1.
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
|
|
|
| |
On 32-bit OS X with gcc 4.0/4.2 and shared libraries enabled, the ebx register
is not available, but required to assemble the functions.
This reverts commit 8742a4f to a simplified version of the original constraints.
|
|
|
|
| |
The definition and the call site where under different #ifdefs.
|
|
|
|
|
|
| |
Add the namespace to {AC_,DC_,MV_}{END,ERROR} macros
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
|
| |
|
|
|
|
|
| |
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
|
| |
|
|
|
|
|
| |
FF_COMMON_FRAME holds the contents of the AVFrame structure and is also copied
to struct Picture. Replace by an embedded AVFrame structure in struct Picture.
|
| |
|
|
|
|
| |
Faster H.264 decoding with ALLOW_INTERLACE off.
|
|
|
|
| |
Avoid aliasing, unroll loops, and inline more functions.
|
|
|
|
| |
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
|
|
|
|
| |
Needs some ARM/PPC asm modifications.
|
|
|
|
| |
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
|
|
|
|
|
|
|
|
| |
In high bit depth, the QP values may now be up to (51 + 6*(bit_depth-8)).
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
In high bit depth the pixels will not be stored in uint8_t like in the
normal case, but in uint16_t. The pixel size is thus 1 in normal bit
depth and 2 in high bit depth.
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
|
|
|
|
| |
Signed-off-by: Diego Biurrun <diego@biurrun.de>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <mans@mansr.com>
|
| |
|