path: root/libavcodec/x86/dsputil_mmx.c
Commit message (Author, Age)
...
* dsputil: x86: Convert mpeg4 qpel and dsputil avg to yasm (Daniel Kang, 2013-01-27)
  Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* x86: h264qpel: Move stray comment to the right spot and clarify it (Diego Biurrun, 2013-01-26)
* dsputil: Separate h264 qpel (Mans Rullgard, 2013-01-24)
  The sh4 optimizations are removed, because the code is 100% identical to the C code, so it is unlikely to provide any real practical benefit.
  Signed-off-by: Diego Biurrun <diego@biurrun.de>
  Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
  Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* dsputil: remove one array dimension from avg_no_rnd_pixels_tab. (Ronald S. Bultje, 2013-01-22)
* dsputil: remove avg_no_rnd_pixels8. (Ronald S. Bultje, 2013-01-22)
  This is never used.
* Drop DCTELEM typedef (Diego Biurrun, 2013-01-22)
  It does not help as an abstraction and adds dsputil dependencies.
  Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* vorbisdsp: convert x86 simd functions from inline asm to yasm. (Ronald S. Bultje, 2013-01-22)
* floatdsp: move scalarproduct_float from dsputil to avfloatdsp. (Ronald S. Bultje, 2013-01-22)
  This makes the aac decoder and all voice codecs independent of dsputil.
* floatdsp: move vector_fmul_reverse from dsputil to avfloatdsp. (Ronald S. Bultje, 2013-01-22)
  Now, nellymoserenc and aacenc no longer depend on dsputil. Independent of this patch, wmaprodec also does not depend on dsputil, so I removed it from there also.
* floatdsp: move vector_fmul_add from dsputil to avfloatdsp. (Ronald S. Bultje, 2013-01-22)
* dsputil: remove butterflies_float_interleave. (Ronald S. Bultje, 2013-01-20)
  The function is unused.
* dsputil: drop non-compliant "fast" qpel mc functions (Mans Rullgard, 2013-01-20)
  Signed-off-by: Diego Biurrun <diego@biurrun.de>
* Move vorbis_inverse_coupling from dsputil to vorbisdspcontext. (Ronald S. Bultje, 2013-01-19)
  Conveniently (together with Justin's earlier patches), this makes our vorbis decoder entirely independent of dsputil.
* x86: dsputil: Drop some unused macro definitions (Diego Biurrun, 2013-01-18)
* lavc: Move vector_fmul_window to AVFloatDSPContext (Justin Ruggles, 2013-01-16)
  Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* lavc: introduce VideoDSPContext (Ronald S. Bultje, 2012-12-20)
  Move some functions from dsputil. The idea is that videodsp contains functions that are useful for a large and varied set of video decoders. Currently, it contains emulated_edge_mc() and prefetch().
  Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* x86: fix build without inline asm (Diego Biurrun, 2012-11-26)
  The qpel functions referenced here are not related to h264 and should thus never have been under CONFIG_H264QPEL.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
  Signed-off-by: Diego Biurrun <diego@biurrun.de>
* x86: h264: Convert 8-bit QPEL inline assembly to YASM (Daniel Kang, 2012-11-25)
  Signed-off-by: Diego Biurrun <diego@biurrun.de>
* x86: h264: Remove 3dnow QPEL code (Daniel Kang, 2012-11-25)
  The only CPUs that have 3dnow and don't have mmxext are 12 years old. Moreover, AMD has dropped 3dnow extensions from newer CPUs.
  Signed-off-by: Diego Biurrun <diego@biurrun.de>
* x86: dsputil: port to cpuflags (Diego Biurrun, 2012-11-16)
* x86: mmx2 ---> mmxext in asm constructs (Diego Biurrun, 2012-11-14)
* x86: Move optimization suffix to end of function names (Diego Biurrun, 2012-10-31)
  This simplifies cpuflags porting.
* x86: mmx2 ---> mmxext in function names (Diego Biurrun, 2012-10-31)
* x86: MMX2 ---> MMXEXT in macro names (Diego Biurrun, 2012-10-31)
* x86: mmx2 ---> mmxext in comments and messages (Diego Biurrun, 2012-10-31)
* x86: dsputil: kill VLA in gmc_mmx() (Mans Rullgard, 2012-10-05)
  Instead of using an evil VLA, fall back to C version when edge emulation is needed. MPEG4 GMC is a rarely used fringe feature so the speed loss is an acceptable cost for safer code.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
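The fallback pattern described in the gmc_mmx() entry can be sketched in C as follows. This is an illustration only, not the code from the commit: `block_inside_picture`, `choose_gmc_path`, and the `gmc_path` enum are hypothetical names, and the bounds test is a simplified stand-in for whatever condition gmc_mmx() actually evaluates.

```c
#include <assert.h>

/* Hypothetical names throughout; this only illustrates the decision,
 * not the real gmc_mmx() logic. */
enum gmc_path { GMC_PATH_SIMD, GMC_PATH_C };

/* Returns 1 when the block to be read lies entirely inside the picture,
 * so no edge emulation (and no temporary edge buffer) is required. */
static int block_inside_picture(int x, int y, int block_w, int block_h,
                                int pic_w, int pic_h)
{
    assert(block_w > 0 && block_h > 0);
    return x >= 0 && y >= 0 &&
           x + block_w <= pic_w && y + block_h <= pic_h;
}

/* Take the optimized path only when no edge emulation is needed;
 * otherwise hand the work to the plain C implementation instead of
 * allocating a variable-length scratch array on the stack. */
static enum gmc_path choose_gmc_path(int x, int y, int block_w, int block_h,
                                     int pic_w, int pic_h)
{
    return block_inside_picture(x, y, block_w, block_h, pic_w, pic_h)
               ? GMC_PATH_SIMD
               : GMC_PATH_C;
}
```

The point of the design is that the rare case pays the cost: blocks that reach past the picture border take the slower C path, and the optimized path never needs a buffer whose size is only known at run time.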
* x86: dsputil: Move Xvid IDCT put/add functions to a more suitable place (Diego Biurrun, 2012-09-14)
* ac3: move ac3_downmix() from dsputil to ac3dsp (Mans Rullgard, 2012-09-12)
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* x86: dsputil: Move specific optimization settings out of global init function (Diego Biurrun, 2012-09-11)
  They belong in the init functions specific to each CPU capability.
* x86: avcodec: Drop silly "_mmx" suffix from dsputil template names (Diego Biurrun, 2012-09-07)
* cavsdsp: set idct permutation independently of dsputil (Mans Rullgard, 2012-09-07)
  CAVS uses its own idct so using dsputil to set the permutation is fragile.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* x86: allow using add_hfyu_median_prediction_cmov on any cpu with cmov (Mans Rullgard, 2012-09-07)
  For some reason add_hfyu_median_prediction_cmov is only selected on 3Dnow-capable CPUs, even though it uses no 3Dnow instructions. This patch allows it to be selected on any cpu with cmov with the possibility of being overridden by the mmxext version.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
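The selection order described in the cmov entry (pick the cmov version on any CPU with cmov, let the mmxext version override it) can be sketched as a chain of capability checks. The flag values, the `median_pred_impl` enum, and `select_median_pred` are assumptions made for this sketch and do not reproduce the actual init code.

```c
/* Hypothetical capability bits; the real flag names and values differ. */
enum {
    CPU_FLAG_CMOV   = 1 << 0,
    CPU_FLAG_3DNOW  = 1 << 1,
    CPU_FLAG_MMXEXT = 1 << 2
};

/* Stand-ins for the function pointers an init routine would assign. */
enum median_pred_impl { IMPL_C, IMPL_CMOV, IMPL_MMXEXT };

/* Later checks override earlier ones, so the most specific version wins.
 * Note that 3DNow! plays no part in the decision: the cmov version is
 * chosen on any CPU that reports cmov. */
static enum median_pred_impl select_median_pred(int cpu_flags)
{
    enum median_pred_impl impl = IMPL_C;

    if (cpu_flags & CPU_FLAG_CMOV)
        impl = IMPL_CMOV;
    if (cpu_flags & CPU_FLAG_MMXEXT)
        impl = IMPL_MMXEXT;

    return impl;
}
```

With these definitions, a CPU reporting only `CPU_FLAG_CMOV` gets `IMPL_CMOV`, one reporting both cmov and mmxext gets `IMPL_MMXEXT`, and one reporting only `CPU_FLAG_3DNOW` stays on `IMPL_C`.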
* x86: dsputil: Do not redundantly check for CPU caps before calling init funcs (Diego Biurrun, 2012-09-06)
  The init functions check for CPU capabilities on their own already.
* x86: Split inline and external assembly #ifdefs (Diego Biurrun, 2012-08-31)
* x86: cosmetics: Comment some #endifs for better readability (Diego Biurrun, 2012-08-30)
* x86: avcodec: Drop silly "_mmx" suffixes from filenames (Diego Biurrun, 2012-08-28)
* x86: rename libavutil/x86_cpu.h to libavutil/x86/asm.h (Mans Rullgard, 2012-08-09)
  This puts x86-specific things in the x86/ subdirectory where they belong.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* Replace all CODEC_ID_* with AV_CODEC_ID_* (Anton Khirnov, 2012-08-07)
* x86: build: replace mmx2 by mmxext (Diego Biurrun, 2012-08-03)
  Refactoring mmx2/mmxext YASM code with cpuflags will force renames. So switching to a consistent naming scheme beforehand is sensible. The name "mmxext" is more official and widespread and also the name of the CPU flag, as reported e.g. by the Linux kernel.
* x86: Use consistent 3dnowext function and macro name suffixes (Diego Biurrun, 2012-08-03)
  Currently there is a wild mix of 3dn2/3dnow2/3dnowext. Switching to "3dnowext", which is a more common name of the CPU flag, as reported e.g. by the Linux kernel, unifies this.
* x86: remove libmpeg2 mmx(ext) idct functions (Mans Rullgard, 2012-08-02)
  These functions are not faster than other mmx implementations on any hardware I have been able to test on, and they are horribly inaccurate. There is thus no reason to ever use them.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* h264_chromamc_10bit: port x86 simd to cpuflags. (Ronald S. Bultje, 2012-07-27)
* x86/dsputil: put inline asm under HAVE_INLINE_ASM. (Ronald S. Bultje, 2012-07-25)
  This allows compiling with compilers that don't support gcc-style inline assembly.
  Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
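A minimal sketch of the guard described in the HAVE_INLINE_ASM entry is shown here. `HAVE_INLINE_ASM` is the macro named in the commit title; the fallback definition to 0, the architecture check, and the `add_ints` function are assumptions made so the example compiles on its own with any compiler.

```c
/* Assumption: a build system would normally define this to 0 or 1.
 * Default to 0 here so the portable path is used when it is absent. */
#ifndef HAVE_INLINE_ASM
#define HAVE_INLINE_ASM 0
#endif

/* Hypothetical function with an inline-asm path and a C path. */
static int add_ints(int a, int b)
{
#if HAVE_INLINE_ASM && (defined(__i386__) || defined(__x86_64__))
    /* gcc-style inline assembly, only seen by compilers that support it. */
    __asm__ ("addl %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
#else
    /* Portable C path, used when inline assembly is unavailable. */
    return a + b;
#endif /* HAVE_INLINE_ASM */
}
```

Because the assembly sits entirely inside the preprocessor conditional, a compiler without gcc-style inline assembly never has to parse it.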
* dsputil_mmx: fix incorrect assembly code (Yang Wang, 2012-07-25)
  In ff_put_pixels_clamped_mmx(), there are two assembly code blocks. In the first block (in the unrolled loop), the instructions "movq 8%3, %%mm1 \n\t", and so forth, have problems.
  From the above instruction, it is clear what the programmer wants: a load from p + 8. But this assembly code doesn’t guarantee that. It only works if the compiler puts p in a register to produce an instruction like this: "movq 8(%edi), %mm1". During compiler optimization, it is possible that the compiler will be able to constant propagate into p. Suppose p = &x[10000]. Then operand 3 can become 10000(%edi), where %edi holds &x. And the instruction becomes "movq 810000(%edi)". That is, it will stride by 810000 instead of 8. This will cause a segmentation fault.
  This error was fixed in the second block of the assembly code, but not in the unrolled loop.
  How to reproduce: This error is exposed when we build using Intel C++ Compiler, with IPO+PGO optimization enabled. It crashed when decoding an MJPEG video.
  Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
  Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
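The failure described in this entry is purely textual: the literal "8" in the template is glued in front of whatever text the compiler substitutes for the memory operand. The sketch below imitates that substitution with string formatting instead of real inline assembly, so it runs on any platform. `expand_template` and `effective_displacement` are hypothetical helpers written for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: mimics how the textual form of a memory operand
 * is pasted into a template such as "movq 8%3, %%mm1". */
static void expand_template(char *out, size_t size, const char *operand)
{
    assert(out != NULL && operand != NULL);
    /* The literal "8" is simply concatenated with the operand text. */
    snprintf(out, size, "movq 8%s, %%mm1", operand);
}

/* Returns the displacement the assembler would see after expansion. */
static long effective_displacement(const char *operand)
{
    char insn[64];

    expand_template(insn, sizeof insn, operand);
    /* Skip the "movq " mnemonic (5 characters) and parse the number. */
    return strtol(insn + 5, NULL, 10);
}
```

Called with the operand text "(%edi)", the expansion is "movq 8(%edi), %mm1" and the displacement is 8, as intended. Called with "10000(%edi)", the expansion is "movq 810000(%edi), %mm1" and the displacement is 810000, where p + 8 would have required 10008. One common way to avoid this class of bug in gcc-style inline assembly is to pass the base pointer in a register operand and write the offset explicitly, for example `8(%0)`; whether the commit used exactly that form is not stated in the log.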
* x86: dsputil: drop some unused CPU flag debug code (Diego Biurrun, 2012-07-19)
* vp3: move idct and loop filter pointers to new vp3dsp context (Mans Rullgard, 2012-07-18)
  This moves all VP3-specific function pointers from dsputil to a new vp3dsp context. There is no reason to ever use the VP3 IDCT where an MPEG2 IDCT is expected or vice versa.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* x86: Only use optimizations with cmov if the CPU supports the instruction (Diego Biurrun, 2012-06-23)
* x86: move some inline asm macros to the only places they are used (Mans Rullgard, 2012-06-23)
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* Add a float DSP framework to libavutil (Justin Ruggles, 2012-06-08)
  Move vector_fmul() from DSPContext to AVFloatDSPContext.
* Convert vector_fmul range of functions to YASM and add AVX versions (Kieran Kunhya, 2012-05-21)
  Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>