path: root/libavcodec/x86
Commit message (author, date)
* mpegaudiodsp: fix x86 and ppc makefiles (Mans Rullgard, 2011-05-19)
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* Move some mpegaudio functions to new mpegaudiodsp subsystem (Mans Rullgard, 2011-05-19)
  This separation allows these functions to be used in a cleaner fashion from other codecs (e.g. qdm2) and simplifies creating optimised versions of them.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* 10l: wrap float_interleave functions in HAVE_YASM. (Justin Ruggles, 2011-05-18)
  Fixes compilation with --disable-yasm.
* Add float_interleave() to FmtConvertContext with x86-optimized versions. (Justin Ruggles, 2011-05-18)
  Partially based on patches by clsid2 in ffdshow-tryout.
  ff_float_interleave6() x86 improvements by Loren Merritt.
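For readers unfamiliar with the operation, interleaving takes one buffer per channel (planar layout) and writes the samples out in packed order: sample 0 of every channel, then sample 1 of every channel, and so on. The following is a minimal scalar sketch of that idea only. The name `float_interleave_sketch` and its exact signature are assumptions made for illustration, not the function added by this commit, and the SIMD versions are not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar reference (not the committed code): interleave
 * `channels` planar buffers of `len` samples each into the packed
 * buffer `dst`, which must hold len * channels floats. */
void float_interleave_sketch(float *dst, const float *const *src,
                             size_t len, int channels)
{
    assert(dst && src && channels > 0);
    for (int c = 0; c < channels; c++) {
        const float *in = src[c];
        /* Sample i of channel c lands at packed index i * channels + c. */
        for (size_t i = 0; i < len; i++)
            dst[i * (size_t)channels + (size_t)c] = in[i];
    }
}
```

With two channels `{1, 2}` and `{3, 4}`, the packed result is `{1, 3, 2, 4}`.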
* Modify x86util.asm to ease transitioning to 10-bit H.264 assembly. (Daniel Kang, 2011-05-17)
  Arguments for variable size instructions are added to many macros, along with other various changes. The x86util.asm code was ported from x264.
  Signed-off-by: Diego Biurrun <diego@biurrun.de>
* h264dsp_mmx: Add #ifdefs around some mmxext functions on x86_64. (Gil Pedersen, 2011-05-16)
  This fixes linking errors due to undefined symbols on x86_64 OS X.
  Signed-off-by: Diego Biurrun <diego@biurrun.de>
* Fix FSF address copy paste error in some license headers. (Diego Biurrun, 2011-05-14)
* 10-bit H.264 x86 chroma v loopfilter asm (Jason Garrett-Glaser, 2011-05-11)
  Also delete some unused deblock asm macros.
* Port x86 10-bit H.264 deblock asm from x264 (Jason Garrett-Glaser, 2011-05-10)
* Update x86 H.264 deblock asm (Jason Garrett-Glaser, 2011-05-10)
  Includes AVX versions from x264.
* h264dsp_mmx: place bracket outside #if/#endif block. (Ronald S. Bultje, 2011-05-10)
  Should fix compile on systems missing yasm/nasm.
* Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder. (Oskar Arvidsson, 2011-05-10)
  This patch lets e.g. dsputil_init choose dsp functions with respect to the bit depth to decode. The naming scheme of bit depth dependent functions is <base name>_<bit depth>[_<prefix>] (i.e. the old clear_blocks_c is now named clear_blocks_8_c).
  Note: Some of the functions for high bit depth are not dependent on the bit depth, but only on the pixel size. This leaves some room for optimizing binary size.
  Preparatory patch for high bit depth h264 decoding support.
  Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* Remove disabled non-optimized code variants. (Diego Biurrun, 2011-04-29)
* Add AVX FFT implementation. (Vitor Sessak, 2011-04-26)
  Signed-off-by: Reinhard Tartler <siretart@tauware.de>
* Update x86inc.asm from x264 to allow AVX emulation using SSE and MMX. (Vitor Sessak, 2011-04-26)
  Signed-off-by: Reinhard Tartler <siretart@tauware.de>
* dsputil: allow to skip drawing of top/bottom edges. (Alexander Strange, 2011-03-26)
* Add apply_window_int16() to DSPContext with x86-optimized versions and use it in the ac3_fixed encoder. (Justin Ruggles, 2011-03-22)
* Move dct and rdft definitions to separate files (Mans Rullgard, 2011-03-20)
  This leaves fft.h with only the core FFT and MDCT definitions, thus making it more manageable.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* Replace FFmpeg with Libav in licence headers (Mans Rullgard, 2011-03-19)
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* ac3enc: add float_to_fixed24() with x86-optimized versions to AC3DSPContext and use in scale_coefficients() for the floating-point AC-3 encoder. (Justin Ruggles, 2011-03-17)
* mathops: fix MULL() when the compiler does not inline the function. (Justin Ruggles, 2011-03-15)
  If the function is not inlined, an immediate cannot be used for the shift parameter, so the %cl register must be used instead in that case. This fixes compilation for x86-32 using gcc with --disable-optimizations.
* mathops: change "g" constraint to "rm" in x86-32 version of MUL64(). (Justin Ruggles, 2011-03-15)
  The 1-arg imul instruction cannot take an immediate argument, only a register or memory argument.
* mathops: convert MULL/MULH/MUL64 to inline functions rather than macros. (Justin Ruggles, 2011-03-15)
  This fixes unexpected name collisions that were occurring with variables declared within the macros. It also fixes the fate-acodec-ac3_fixed regression test on x86-32.
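To illustrate why inline functions avoid the collisions mentioned above: a statement-style macro that declares temporaries shares the caller's scope, so a caller variable with the same name can be shadowed or clash, whereas a function gets its own scope. The sketch below shows portable C equivalents of the three operations under that assumption. The `_sketch` names are hypothetical, and the x86-32 versions touched by these commits use inline assembly that is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable stand-ins (not the x86 inline-asm versions).
 * Locals and parameters live in the function's own scope, so they
 * cannot collide with identically named variables in the caller. */

/* (a * b) >> shift, computed in 64 bits. Right-shifting a negative
 * product is implementation-defined in C (arithmetic on common ABIs). */
static inline int mull_sketch(int a, int b, unsigned shift)
{
    assert(shift < 64);
    return (int)(((int64_t)a * b) >> shift);
}

/* Upper 32 bits of the 64-bit product. */
static inline int mulh_sketch(int a, int b)
{
    return (int)(((int64_t)a * b) >> 32);
}

/* Full 64-bit product of two 32-bit operands. */
static inline int64_t mul64_sketch(int a, int b)
{
    return (int64_t)a * b;
}
```

Because `shift` is an ordinary parameter here, the compiler is free to pass it in a register when the function is not inlined, which is the situation the later MULL() fix addresses for the assembly version.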
* ac3enc: add SIMD-optimized shifting functions for use with the fixed-point AC3 encoder. (Justin Ruggles, 2011-03-14)
* Add CONFIG_AC3DSP symbol to simplify makefiles (Mans Rullgard, 2011-03-12)
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* dsputil_mmx.c: remove ff_vector128. (Ronald S. Bultje, 2011-02-19)
  Remove ff_vector128, it is identical to ff_pb_80.
* dsputil: move VC1-specific stuff into VC1DSPContext. (Ronald S. Bultje, 2011-02-17)
* ac3dsp: Change punpckhqdq to movhlps in ac3_max_msb_abs_int16(). (Justin Ruggles, 2011-02-16)
  Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* ac3enc: Add x86-optimized function to speed up log2_tab(). (Justin Ruggles, 2011-02-13)
  AC3DSPContext.ac3_max_msb_abs_int16() finds the maximum MSB of the absolute value of each element in an array of int16_t.
  Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
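One way to obtain the maximum MSB without tracking a true maximum is to OR together the absolute values: the highest set bit of the OR equals the highest set bit of the largest magnitude. The scalar sketch below assumes that approach. The name `max_msb_abs_int16_sketch` is hypothetical, and this is not presented as the committed C or SIMD code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar sketch: returns a value whose most significant
 * set bit matches that of the largest |src[i]|. The return value is an
 * OR of magnitudes, not necessarily the maximum magnitude itself. */
int max_msb_abs_int16_sketch(const int16_t *src, int len)
{
    assert(src && len >= 0);
    int v = 0;
    for (int i = 0; i < len; i++)
        v |= abs(src[i]); /* int16_t promotes to int, so -32768 is safe */
    return v;
}
```

A caller that needs the bit position can then take the base-2 logarithm of the result once, instead of once per element.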
* FFT: factor a shuffle out of the inner loop and merge it into fft_permute. (Loren Merritt, 2011-02-13)
  6% faster SSE FFT on Conroe, 2.5% on Penryn.
  Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
* Add x86-optimized versions of exponent_min(). (Justin Ruggles, 2011-02-10)
  Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* Fix ff_emu_edge_core_sse() on Win64. (Ronald S. Bultje, 2011-02-08)
  Fix emu_edge_v_extend_15 to be <128 bytes on Win64, by being more strict on the size of registers and which registers are being used for operations where multiple are available. This fixes segfaults in emulated_edge() function calls on Win64.
* Separate format conversion DSP functions from DSPContext. (Justin Ruggles, 2011-02-02)
  This will be beneficial for use with the audio conversion API without requiring it to depend on all of dsputil.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* Fix ff_imdct_calc_sse() on gcc-4.6 (Alex Converse, 2011-02-02)
  Gcc 4.6 only preserves the first value when using an array with an "m" constraint.
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* Implement a SIMD version of emulated_edge_mc() for x86. (Ronald S. Bultje, 2011-01-31)
  From ~550 cycles (C version) to 170 (SSE/x86-64), 206 (MMX/x86-32) and 196 (SSE2/x86-32) cycles.
* cosmetics: indentation (Justin Ruggles, 2011-01-31)
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* Remove unneeded add bias from 3 functions. (Justin Ruggles, 2011-01-31)
  DSPContext.vector_fmul_window()
  DCADSPContext.lfe_fir()
  SynthFilterContext.synth_filter_float()
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* x86: fix overflow in h264 8x8 planar prediction (Mans Rullgard, 2011-01-24)
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* Change DSPContext.vector_fmul() from dst=dst*src to dst=src0*src1. (Justin Ruggles, 2011-01-22)
  Signed-off-by: Mans Rullgard <mans@mansr.com>
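The change of form from dst=dst*src to dst=src0*src1 means the element-wise product no longer has to overwrite one of its inputs. A minimal scalar sketch of the three-operand form follows. The name `vector_fmul_sketch` and its parameter list are assumptions for illustration, and the x86-optimized versions are not shown.

```c
#include <assert.h>

/* Hypothetical scalar sketch of the three-operand form:
 * dst[i] = src0[i] * src1[i] for i in [0, len). */
void vector_fmul_sketch(float *dst, const float *src0,
                        const float *src1, int len)
{
    assert(dst && src0 && src1 && len >= 0);
    for (int i = 0; i < len; i++)
        dst[i] = src0[i] * src1[i];
}
```

In this scalar sketch, passing the same buffer as `dst` and `src0` reproduces the old in-place behaviour, so callers that relied on dst=dst*src lose nothing.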
* cosmetics related to LPC changes. (Justin Ruggles, 2011-01-21)
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* Separate window function from autocorrelation. (Justin Ruggles, 2011-01-21)
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* Move lpc_compute_autocorr() from DSPContext to a new struct LPCContext. (Justin Ruggles, 2011-01-21)
  Signed-off-by: Mans Rullgard <mans@mansr.com>
* Fix horizontal/horizontal_up 8x8l intra prediction x86/simd functions. (Ronald S. Bultje, 2011-01-19)
  The original functions did not work correctly for edge pixels, e.g. when CODEC_FLAG_EMU_EDGE is set, leading to corrupt output in e.g. VLC.
  Based on a patch by Daniel Kang <daniel d kang gmail com>.
  Signed-off-by: Ronald S. Bultje <rsbultje gmail com>
* Replace ASMALIGN() with .p2align (Mans Rullgard, 2011-01-18)
  This macro has unconditionally used .p2align for a long time and serves no useful purpose.
* x86: remove VLA in ac3_downmix_sse (Mans Rullgard, 2011-01-18)
* consolidate .gitignore patterns into a single file (Janne Grunau, 2011-01-18)
  Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
* convert svn:ignore properties to .gitignore files (Janne Grunau, 2011-01-17)
  Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
* Fix overflow in pred16x16_plane x86 simd code. Fixes issue 2547. (Ronald S. Bultje, 2011-01-15)
  Originally committed as revision 26381 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Fix ff_pw_3 alignment. (Ronald S. Bultje, 2011-01-14)
  Originally committed as revision 26344 to svn://svn.ffmpeg.org/ffmpeg/trunk
* H.264: split luma dc idct out and implement MMX/SSE2 versions (Jason Garrett-Glaser, 2011-01-14)
  About 2.5x the speed.
  NOTE: the way that the asm code handles large qmuls is a bit suboptimal. If x264-style dequant was used (separate shift and qmul values), it might be possible to get some extra speed.
  Originally committed as revision 26336 to svn://svn.ffmpeg.org/ffmpeg/trunk