| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
| |
The typedefs also exist in the avfft.h header and since typedefs cannot be
legally redefined in C, the code fails to compile with some compilers.
This reverts commits 11c7155cce and 57f1b1dcc7.
|
|
|
|
| |
Signed-off-by: Diego Biurrun <diego@biurrun.de>
|
|
|
|
| |
This fixes a number of incompatible pointer type warnings.
|
|
|
|
| |
Also rearrange #includes into canonical order.
|
| |
|
| |
|
|
|
|
| |
Also merge variable declaration and initialization.
|
|
|
|
| |
Signed-off-by: Diego Biurrun <diego@biurrun.de>
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
Without the typedefs there can be trouble depending on #include order.
|
| |
|
|
|
|
| |
It is doubtful if the hack (still) works and Xvid had ten years to fix it.
|
| |
|
|
|
|
| |
Signed-off-by: Diego Biurrun <diego@biurrun.de>
|
|
|
|
|
|
|
|
|
| |
Intrinsics only used on aarch64 since the existing ARMv7 NEON asm
is slightly faster (Cortex-A9, gcc-4.8, micro-benchmarks and full
decoding time).
Signed-off-by: James Yu <james.yu@linaro.org>
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
|
| |
|
| |
|
| |
|
|
|
|
| |
Signed-off-by: Diego Biurrun <diego@biurrun.de>
|
| |
|
|
|
|
|
| |
This reverts commit b31d76e45fc3c6529dd7109e721676f3ec376d00 as it
uses an unkown pixel format.
|
|
|
|
|
|
|
|
| |
Such files can be created using the --bff x264 option.
Sample-Id: h264_direct_temporal_mvs_bff.mkv
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
|
|
|
|
| |
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
|
| |
|
|
|
|
| |
Signed-off-by: Diego Biurrun <diego@biurrun.de>
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
Signed-off-by: Diego Biurrun <diego@biurrun.de>
|
| |
|
|
|
|
| |
Also rename the enum values to be consistent with other DCT permutations.
|
| |
|
| |
|
| |
|
|
|
|
| |
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
| |
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous implementation targeted DTS Coherent Acoustics, which only
requires nbits == 4 (fft16()). This case was (and still is) linked directly
rather than being indirected through ff_fft_calc_vfp(), but now the full
range from radix-4 up to radix-65536 is available. This benefits other codecs
such as AAC and AC3.
The implementaion is based upon the C version, with each routine larger than
radix-16 calling a hierarchy of smaller FFT functions, then performing a
post-processing pass. This pass benefits a lot from loop unrolling to
counter the long pipelines in the VFP. A relaxed calling standard also
reduces the overhead of the call hierarchy, and avoiding the excessive
inlining performed by GCC probably helps with I-cache utilisation too.
I benchmarked the result by measuring the number of gperftools samples that
hit anywhere in the AAC decoder (starting from aac_decode_frame()) or
specifically in the FFT routines (fft4() to fft512() and pass()) for the
same sample AAC stream:
Before After
Mean StdDev Mean StdDev Confidence Change
Audio decode 2245.5 53.1 1599.6 43.8 100.0% +40.4%
FFT routines 940.6 22.0 348.1 20.8 100.0% +170.2%
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous implementation targeted DTS Coherent Acoustics, which only
requires mdct_bits == 6. This relatively small size lent itself to
unrolling the loops a small number of times, and encoding offsets
calculated at assembly time within the load/store instructions of each
iteration.
In the more general case (codecs such as AAC and AC3) much larger arrays
are used - mdct_bits == [8, 9, 11]. The old method does not scale for
these cases, so more integer registers are used with non-unrolled versions
of the loops (and with some stack spillage). The postrotation filter loop
is still unrolled by a factor of 2 to permit the double-buffering of some
VFP registers to facilitate overlap of neighbouring iterations.
I benchmarked the result by measuring the number of gperftools samples
that hit anywhere in the AAC decoder (starting from aac_decode_frame())
or specifically in ff_imdct_half_c / ff_imdct_half_vfp, for the same
example AAC stream:
Before After
Mean StdDev Mean StdDev Confidence Change
aac_decode_frame 2368.1 35.8 2117.2 35.3 100.0% +11.8%
ff_imdct_half_* 457.5 22.4 251.2 16.2 100.0% +82.1%
Signed-off-by: Martin Storsjö <martin@martin.st>
|
| |
|
|
|
|
| |
This makes the init files match the structure of the dsputil split.
|
|
|
|
|
| |
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
|
| |
|
| |
|
| |
|