path: root/libavcodec/x86/vp8dsp.asm
Commit message | Author | Age
* Use "d" suffix for general-purpose registers used with movd. | Reimar Döffinger | 2010-09-05
* Mark xmm registers as clobbered in simple loopfilter. Should fix the last | Ronald S. Bultje | 2010-08-24
* Fix segfaults in VP8 SIMD code on Win64 (and FATE/win64 failures). | Ronald S. Bultje | 2010-08-23
* VP8: move zeroing of luma DC block into the WHT | Jason Garrett-Glaser | 2010-08-02
* Use word-writing instead of dword-writing (with two cached but otherwise | Ronald S. Bultje | 2010-07-31
* Use pmaddubsw for the mbedge_filter (>=ssse3), 6-10 cycles faster. | Ronald S. Bultje | 2010-07-26
* VP8: Much faster SSE2 MC | Jason Garrett-Glaser | 2010-07-26
* Enable no-loop memory/register saving for ssse3/sse4 also. | Ronald S. Bultje | 2010-07-26
* Save a register (or regsize of stackspace for x86-32) for the no-loop | Ronald S. Bultje | 2010-07-26
* Use nested ifs instead of &&, which appears to not work with %ifidn (i.e. this | Ronald S. Bultje | 2010-07-26
* Split pextrw macro-spaghetti into several opt-specific macros, this will make | Ronald S. Bultje | 2010-07-26
* Fix obvious bug in assignment. Somehow, the test vectors don't test this... | Ronald S. Bultje | 2010-07-25
* Fix SPLATB_REG mess. Used to be a if/elseif/elseif/elseif spaghetti, so this | Ronald S. Bultje | 2010-07-24
* VP8: optimize DC-only chroma case in the same way as luma. | Jason Garrett-Glaser | 2010-07-23
* VP8 asm: cosmetics (spacing) | Jason Garrett-Glaser | 2010-07-23
* VP8: 30% faster idct_mb | Jason Garrett-Glaser | 2010-07-23
* VP8: clear DCT blocks in iDCT instead of using clear_blocks. | Jason Garrett-Glaser | 2010-07-23
* Use pextrw for SSE4 mbedge filter result writing, speedup 5-10cycles on | Ronald S. Bultje | 2010-07-22
* Fix and enable horizontal >=SSE2 mbedge loopfilter. | Ronald S. Bultje | 2010-07-22
* Eliminate one instruction in VP8 dc_add_sse4 | Jason Garrett-Glaser | 2010-07-21
* Various VP8 x86 deblocking speedups | Jason Garrett-Glaser | 2010-07-21
* Make mmx VP8 WHT faster | Jason Garrett-Glaser | 2010-07-21
* VP8 MBedge loopfilter MMX/MMX2/SSE2 functions for both luma (width=16) | Ronald S. Bultje | 2010-07-20
* Chroma (width=8) inner loopfilter MMX/MMX2/SSE2 for VP8 decoder. | Ronald S. Bultje | 2010-07-20
* Revert r24339 (it causes fate failures on x86-64) - I'll figure out what's | Ronald S. Bultje | 2010-07-19
* Implement chroma (width=8) inner loopfilter MMX/MMX2/SSE2 functions. | Ronald S. Bultje | 2010-07-19
* Be more efficient with registers or stack memory. Saves 8/16 bytes stack | Ronald S. Bultje | 2010-07-19
* Change function prototypes for width=8 inner and mbedge loopfilter functions | Ronald S. Bultje | 2010-07-19
* Attempt to fix x86-64 testsuite on fate. | Ronald S. Bultje | 2010-07-16
* Remove duplicate define. | Ronald S. Bultje | 2010-07-16
* Revert 24270, it contained some stuff that shouldn't have been in there. | Ronald S. Bultje | 2010-07-16
* Remove duplicate define. | Ronald S. Bultje | 2010-07-16
* Give x86 r%d registers names, this will simplify implementation of the chroma | Ronald S. Bultje | 2010-07-16
* Change return statement, the REP_RET is a mistake since the else case (x86-64, | Ronald S. Bultje | 2010-07-16
* VP8 H/V inner loopfilter MMX/MMXEXT/SSE2 optimizations. | Ronald S. Bultje | 2010-07-15
* Simple H/V loopfilter for VP8 in MMX, MMX2 and SSE2 (yay for yasm macros). | Ronald S. Bultje | 2010-07-03
* SSSE3 versions of vp8 width4 bilinear MC functions | Jason Garrett-Glaser | 2010-07-03
* SSSE3 versions of width4 VP8 6-tap MC functions | Jason Garrett-Glaser | 2010-07-02
* Use add instead of lshift in mmxext vp8 idct | Jason Garrett-Glaser | 2010-06-29
* Remove unused macros (duplicates from the now-LGPL x86util.asm). | Ronald S. Bultje | 2010-06-29
* MMX idct_add for VP8. | Ronald S. Bultje | 2010-06-29
* Add mmxext version of VP8 DC Hadamard transform | Jason Garrett-Glaser | 2010-06-29
* Fix VP8 bilinear mc on x86_64 | Jason Garrett-Glaser | 2010-06-28
* Add x86 asm functions for VP8 put_pixels | Jason Garrett-Glaser | 2010-06-28
* Add MMX, SSE2, SSSE3 asm for VP8 bilinear MC | Jason Garrett-Glaser | 2010-06-28
* First shot at VP8 optimizations: | Jason Garrett-Glaser | 2010-06-27