path: root/libavcodec/x86/h264dsp_mmx.c
Commit message | Author | Age
* Move static inline function to a macro, so that constant propagation in | Ronald S. Bultje | 2010-09-29
  inline asm works for gcc-3.x also (hopefully). Should fix gcc-3.x FATE breakage after r25254. Originally committed as revision 25262 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Merge b_idx and edge variables, and optimize the ASM to directly load variables | Ronald S. Bultje | 2010-09-29
  from memory locations/offsets depending on b_idx plus constants, rather than having gcc do this. This saves several lea calls and together saves about 10 cycles in h264_loop_filter_strength_mmx2(). Originally committed as revision 25256 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Remove mv_mask variable. Replace the related pand -1/0 instructions by either | Ronald S. Bultje | 2010-09-29
  a pxor, or remove the instruction altogether. Altogether, this saves 1 instruction. Originally committed as revision 25255 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Remove d_idx as a variable, and instead load it as a constant in the asm. | Ronald S. Bultje | 2010-09-29
  This has no measurable speed effect because the surrounding code doesn't take advantage of this yet. Originally committed as revision 25254 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Unroll inner bidir loop in h264_loop_filter_strength_mmx2(), which gets rid | Ronald S. Bultje | 2010-09-29
  of the d_idx variable and therefore allows for future optimizations. No speed difference by this commit itself. Originally committed as revision 25253 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Unloop the outer loop in h264_loop_filter_strength_mmx2(), which allows | Ronald S. Bultje | 2010-09-29
  inlining various constants within the loop code. 20 cycles faster on cathedral sample. Originally committed as revision 25252 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Remove unused variable. | Ronald S. Bultje | 2010-09-24
  Originally committed as revision 25173 to svn://svn.ffmpeg.org/ffmpeg/trunk
* x86: disable SSE functions using stack when stack is not aligned | Måns Rullgård | 2010-09-21
  This fixes crashes with ICC 10.1. Originally committed as revision 25153 to svn://svn.ffmpeg.org/ffmpeg/trunk
* x86: remove hack disabling sse2 h264 loop filter with 32-bit icc | Måns Rullgård | 2010-09-18
  Originally committed as revision 25146 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Rename h264_idct_sse2.asm to h264_idct.asm; move inline IDCT asm from | Ronald S. Bultje | 2010-09-14
  h264dsp_mmx.c to h264_idct.asm (as yasm code). Because the loops are now coded in asm instead of C, this is (depending on the function) up to 50% faster for cases where gcc didn't do a great job at looping. Since h264_idct_add8() is now faster than the manual loop setup in h264.c, in-asm idct calling can now be enabled for chroma as well (see r16207). For MMX, this is 5% faster. For SSE2 (which isn't done for chroma if h264.c does the looping), this makes it up to 50% faster. Speed gain overall is ~0.5-1.0%. Originally committed as revision 25119 to svn://svn.ffmpeg.org/ffmpeg/trunk
* LGPL SSE2 H.264 iDCT | Jason Garrett-Glaser | 2010-09-10
  This leaves no more GPL-only H.264 decoding asm code. Approved by Loren. Originally committed as revision 25092 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Move mm_support() from libavcodec to libavutil, make it a public | Stefano Sabatini | 2010-09-08
  function and rename it to av_get_cpu_flags(). Originally committed as revision 25076 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Rename FF_MM_ symbols related to CPU features flags as AV_CPU_FLAG_ | Stefano Sabatini | 2010-09-04
  symbols, and move them from libavcodec/avcodec.h to libavutil/cpu.h. Originally committed as revision 25040 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Port latest x264 deblock asm (before they moved to using NV12 as internal | Ronald S. Bultje | 2010-09-03
  format), LGPL'ed with permission from Jason and Loren. This includes mmx2 code, so remove inline asm from h264dsp_mmx.c accordingly. Originally committed as revision 25031 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Rename h264_weight_sse2.asm to h264_weight.asm; add 16x8/8x16/8x4 non-square | Ronald S. Bultje | 2010-09-01
  biweight code to sse2/ssse3; add sse2 weight code; and use that same code to create mmx2 functions also, so that the inline asm in h264dsp_mmx.c can be removed. OK'ed by Jason on IRC. Originally committed as revision 25019 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Split h264dsp_mmx.c (which was #included in dsputil_mmx.c) in h264_qpel_mmx.c, | Ronald S. Bultje | 2010-09-01
  still #included in dsputil_mmx.c and is part of DSPContext, and h264dsp_mmx.c, which represents H264DSPContext and is now compiled on its own. Originally committed as revision 25018 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Split intra prediction initialization (i.e. assigning of function pointers) | Ronald S. Bultje | 2010-08-30
  into its own file, it doesn't belong in h264dsp_mmx.c (much less so in dsputil_mmx.c). Originally committed as revision 24990 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Move H264 chroma MC from inline asm to yasm. This fixes VP3/5/6 and VC-1 | Ronald S. Bultje | 2010-08-30
  fate failures on Win64. Originally committed as revision 24989 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Put ff_ prefix on non-static {put_signed,put,add}_pixels_clamped_mmx() | Ronald S. Bultje | 2010-08-30
  functions. Originally committed as revision 24987 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Remove global mm_flags variable | Måns Rullgård | 2010-08-24
  Originally committed as revision 24909 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Split h264dsp and h264pred in configure. | Jason Garrett-Glaser | 2010-08-07
  Many H.264 derivatives, like RV40 and VP8, use the H.264 prediction functions but not the weight/loopfilter functions. This should reduce the size of builds with one of these derivatives but without H.264 decoding itself. Originally committed as revision 24741 to svn://svn.ffmpeg.org/ffmpeg/trunk
* H.264: SSE2/SSSE3 weighted prediction asm | Eli Friedman | 2010-08-05
  Patch by Eli Friedman <eli.friedman at gmail dot com> Originally committed as revision 24702 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Fix h264/vp8 intra pred on Athlon XP | Jason Garrett-Glaser | 2010-07-01
  Whose idea was it to have a CPU that didn't SIGILL on an invalid instruction? Originally committed as revision 23927 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Add missing mm_support call to ff_h264_pred_init_x86. | Jason Garrett-Glaser | 2010-06-29
  I'm not sure if this is supposed to be here, but it can't hurt. Originally committed as revision 23885 to svn://svn.ffmpeg.org/ffmpeg/trunk
* MMXEXT version of vp8 4x4 vertical pred | Jason Garrett-Glaser | 2010-06-29
  Originally committed as revision 23876 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Add mmx/mmxext/ssse3 4x4 TM intra pred functions for vp8 | Jason Garrett-Glaser | 2010-06-28
  Originally committed as revision 23875 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Fix some intra pred MMX functions that used MMXEXT instructions | Jason Garrett-Glaser | 2010-06-28
  Also add predict_4x4_dc MMXEXT function for vp8/h264. Originally committed as revision 23873 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Change MMXEXT to MMX2, MMXEXT is deprecated | Baptiste Coudurier | 2010-06-28
  Originally committed as revision 23865 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Fix x86 build with h264dsp disabled | Måns Rullgård | 2010-06-28
  Originally committed as revision 23844 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Cosmetics: Fix indentation. | Carl Eugen Hoyos | 2010-06-25
  Originally committed as revision 23785 to svn://svn.ffmpeg.org/ffmpeg/trunk
* 16x16 and 8x8c x86 SIMD intra pred functions for VP8 and H.264 | Jason Garrett-Glaser | 2010-06-25
  Originally committed as revision 23783 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Replace more "m" constraints with MANGLE to fix compilation issues | Reimar Döffinger | 2010-05-10
  with x86_32 gcc 4.4.4 and -fPIC. Originally committed as revision 23082 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Convert two "m" constraints to MANGLE to fix compilation with some compilers. | Reimar Döffinger | 2010-04-01
  Originally committed as revision 22760 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Remove DECLARE_ALIGNED_{8,16} macros | Måns Rullgård | 2010-03-06
  These macros are redundant. All uses are replaced with the generic DECLARE_ALIGNED macro instead. Originally committed as revision 22233 to svn://svn.ffmpeg.org/ffmpeg/trunk
* optimize h264_loop_filter_strength_mmx2 | Loren Merritt | 2010-01-26
  244->160 cycles on core2 Originally committed as revision 21462 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Move array specifiers outside DECLARE_ALIGNED() invocations | Måns Rullgård | 2010-01-22
  Originally committed as revision 21377 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Use two separate memory arguments since 8+() is invalid gas syntax | David Conrad | 2010-01-21
  Originally committed as revision 21360 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Attempt to fix asm compilation failure. | Michael Niedermayer | 2010-01-20
  Only tested on gcc 4 & x86_64. Originally committed as revision 21355 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Use constant offsets for memory operands since gcc is unable to | David Conrad | 2010-01-20
  This fixes gcc failing to fit 6 memory locations into 7 registers on x86-32 Originally committed as revision 21337 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Fix h264_loop_filter_strength_mmx2() so it works with b frames. | Michael Niedermayer | 2010-01-19
  Originally committed as revision 21327 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Remove -2 -> -1 remapping, it's not needed anymore as we must remap all | Michael Niedermayer | 2010-01-19
  references per LUT anyway. Originally committed as revision 21323 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Replace more uses of __attribute__((aligned)) by DECLARE_ALIGNED. | Ramiro Polla | 2009-06-04
  Originally committed as revision 19089 to svn://svn.ffmpeg.org/ffmpeg/trunk
* H264: Fix out of bounds reads in SSSE3 MC | Alexander Strange | 2009-05-30
  Reading above src[-2] isn't safe, so move loads and palignr ahead 3 pixels to load starting at the first pixel actually used. Fixes issue941. Originally committed as revision 18999 to svn://svn.ffmpeg.org/ffmpeg/trunk
* VC1: add and use avg_no_rnd chroma MC functions | David Conrad | 2009-04-14
  Originally committed as revision 18518 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Rename put_no_rnd_h264_chroma* to reflect its usage in VC1 only | David Conrad | 2009-04-14
  Originally committed as revision 18517 to svn://svn.ffmpeg.org/ffmpeg/trunk
* fix typo in h264dsp_mmx (no effect currently as the function is not used), | Baptiste Coudurier | 2009-02-08
  approved by Dark Shikari on IRC Originally committed as revision 17046 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Change semantic of CONFIG_*, HAVE_* and ARCH_*. | Aurelien Jacobs | 2009-01-13
  They are now always defined to either 0 or 1. Originally committed as revision 16590 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Use H264 MMX chroma functions to accelerate RV40 decoding. | Mathieu Velten | 2009-01-04
  Patch by Mathieu Velten (matmaul A gmail) Originally committed as revision 16419 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Add x264 SSE2 iDCT functions to H.264 decoder. | Jason Garrett-Glaser | 2009-01-03
  Originally committed as revision 16409 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Rename libavcodec/i386/ --> libavcodec/x86/. | Diego Biurrun | 2008-12-22
  It contains optimizations that are not specific to i386 and libavutil uses this naming scheme already. Originally committed as revision 16270 to svn://svn.ffmpeg.org/ffmpeg/trunk