path: root/libavcodec/h264_loopfilter.c
Commit message | Author | Age
...
* Restructure if() in check_mv() | Michael Niedermayer | 2010-01-28
  Quite a bit faster. Originally committed as revision 21504 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Unroll loops in check_mv() | Michael Niedermayer | 2010-01-28
  ~6% faster (slow path) loopfilter (should be ~2% overall). Originally committed as revision 21503 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Factor mv/ref compare code out. | Michael Niedermayer | 2010-01-28
  This is a hair slower (0.15% maybe), but I really don't want to have the identical code duplicated 3 times, because gcc adds odd threaded jumps with register reshuffling and register save/restore. Originally committed as revision 21502 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Simplify first edge filter condition. | Michael Niedermayer | 2010-01-28
  Originally committed as revision 21497 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Cosmetics, mostly indentation, 2 or so new fixme comments that I was too lazy to split out | Michael Niedermayer | 2010-01-28
  Originally committed as revision 21496 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Make the fast loop filter path work with unavailable left MBs. | Michael Niedermayer | 2010-01-28
  This prevents the issue with having to switch between slow and fast code paths in each row. 0.5% faster loopfilter for cathedral. Originally committed as revision 21495 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Get rid of the start variable. | Michael Niedermayer | 2010-01-28
  A few cycles faster. Originally committed as revision 21494 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Unroll main loop so the edge==0 case is separate. | Michael Niedermayer | 2010-01-28
  This allows many things to be simplified away. The h264 decoder is overall 1% faster with an mbaff sample and 0.1% slower with the cathedral sample, probably because the slow loop filter code must be loaded into the code cache for each first MB of each row but isn't used for the following MBs. Originally committed as revision 21493 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Update comment. | Michael Niedermayer | 2010-01-27
  Originally committed as revision 21479 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Use table to speed up access to non_zero_count in MBAFF with differing interlacing. | Michael Niedermayer | 2010-01-27
  ~4 cpu cycles speedup. Originally committed as revision 21474 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Optimize loop filtering of the left edge in MBAFF. | Michael Niedermayer | 2010-01-26
  60 cpu cycles speedup. Originally committed as revision 21467 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Remove unneeded check | Michael Niedermayer | 2010-01-26
  Originally committed as revision 21460 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Use left_mb_xy from fill_caches instead of recalculating it. | Michael Niedermayer | 2010-01-26
  Originally committed as revision 21459 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Simplify loop filter a little by using top/left_type. | Michael Niedermayer | 2010-01-26
  Originally committed as revision 21457 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Remove all uses of slice_type* from the loop filter, also remove its initialization before the loop filter. | Michael Niedermayer | 2010-01-24
  Originally committed as revision 21416 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Move +52 from the loop filter to the alpha/beta offsets in the context. | Michael Niedermayer | 2010-01-23
  This should fix a segfault; it might also be faster on systems where the +52 wasn't free. Originally committed as revision 21406 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Set edges based on cbp and mv partitioning, not just skipped MBs. | Michael Niedermayer | 2010-01-23
  This is faster for videos that have lots of MBs that fall in this category. Originally committed as revision 21400 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Optimize filter_mb_mbaff_edge*() | Michael Niedermayer | 2010-01-23
  Originally committed as revision 21397 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Optimize 8x8dct check used to skip some borders in the loop filter. | Michael Niedermayer | 2010-01-23
  4 cpu cycles faster. Originally committed as revision 21396 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Move array specifiers outside DECLARE_ALIGNED() invocations | Måns Rullgård | 2010-01-22
  Originally committed as revision 21377 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Gcc idiocy fixes related to filter_mb_edge*. | Michael Niedermayer | 2010-01-22
  Change order of operands, as gcc seems to use a hardcoded register per operand even for static functions, thus reducing unneeded moves (functions now try to pass the same argument in the same spot). Change signed int to unsigned int for array indexes, as signed requires sign extension while unsigned is free. Move the +52 up and merge it where it will end up as a lea instruction; otherwise gcc always splits the 52 out there, turning the free +52 into an expensive one. The changed code becomes a little faster. Originally committed as revision 21375 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Make calculation of mask_edge free of branches, faster of course but probably little effect overall as this is not that often executed. | Michael Niedermayer | 2010-01-21
  Originally committed as revision 21366 to svn://svn.ffmpeg.org/ffmpeg/trunk
* H.264: Declare bS with DECLARE_ALIGNED_8 for uint64_t casts. | Alexander Strange | 2010-01-20
  Originally committed as revision 21345 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Simplify/Optimize another of the mbaff loop filter cases. | Michael Niedermayer | 2010-01-20
  It's faster but too rarely used to make a difference. Originally committed as revision 21344 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Only calculate the second chroma qp if it differs from the first in the main loop filter. (a little faster for the common case where they are equal) | Michael Niedermayer | 2010-01-20
  Originally committed as revision 21342 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Set bS with 64 bits at a time. | Michael Niedermayer | 2010-01-20
  Originally committed as revision 21341 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Merge multiple IS_* macro uses where possible. | Michael Niedermayer | 2010-01-20
  Originally committed as revision 21340 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Simplify and optimize intra code in h264_loopfilter.c | Michael Niedermayer | 2010-01-20
  Originally committed as revision 21339 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Slightly simplify initialization of int start. | Michael Niedermayer | 2010-01-20
  No real speed change. Originally committed as revision 21336 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Reenable ff_h264_filter_mb_fast() for all slices it supported before. | Michael Niedermayer | 2010-01-19
  Originally committed as revision 21328 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Fix compilation with -O0. | Michael Niedermayer | 2010-01-18
  Originally committed as revision 21308 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Rather call filter_mb_mbaff_edge*v() more often than do extra calculations in the innermost loop. | Michael Niedermayer | 2010-01-18
  ~150 cpu cycles faster. Originally committed as revision 21299 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Use h->slice_num where possible. | Michael Niedermayer | 2010-01-18
  Originally committed as revision 21292 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Enable filter_mb_fast for CAVLC P slices. | Michael Niedermayer | 2010-01-18
  Originally committed as revision 21291 to svn://svn.ffmpeg.org/ffmpeg/trunk
* PAFF CABAC P slices seem to work as well, so enable them for ff_h264_filter_mb_fast() too. | Michael Niedermayer | 2010-01-18
  Originally committed as revision 21289 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Reenable filter_mb_fast for I slices and progressive CABAC P slices. | Michael Niedermayer | 2010-01-18
  Originally committed as revision 21288 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Move CAVLC 8x8 DCT special case from ff_h264_filter_mb() to fill_caches; that way it is also available for ff_h264_filter_mb_fast(). | Michael Niedermayer | 2010-01-18
  Originally committed as revision 21283 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Perform reference remapping at fill_cache() time instead of in the loop filter. | Michael Niedermayer | 2010-01-18
  This removes one obstacle to getting ff_h264_filter_mb_fast() bitexact. Code is maybe 0.1% faster. Originally committed as revision 21280 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Move the qp check to skip the loop filter up. | Michael Niedermayer | 2010-01-18
  Originally committed as revision 21274 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Reorganize how values are stored in h->non_zero_count. | Michael Niedermayer | 2010-01-17
  ~1% faster. Originally committed as revision 21273 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Rearchitecturing the stitched up goose part 1 | Michael Niedermayer | 2010-01-17
  Run loop filter per row instead of per MB; this should also make it much easier to switch to per frame filtering, and to doing so in a separate thread in the future if some volunteer wants to try. Overall decoding speedup of 1.7% (single thread on pentium dual / cathedral sample). This change also allows some optimizations to be tried that would not have been possible before. Originally committed as revision 21270 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Comment for() ; out | Michael Niedermayer | 2010-01-16
  ~200 bytes smaller ff_h264_filter_mb(). Please everyone, NEVER add code with the assumption that gcc will remove it without checking that gcc actually does. Chances are it does not. Originally committed as revision 21251 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Mark a few functions as noinline, this makes ff_h264_filter_mb() a bit smaller and 5% faster. | Michael Niedermayer | 2010-01-16
  ff_h264_filter_mb_fast() stays the same size, as gcc decided not to inline these functions there in the first place. Originally committed as revision 21250 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Apply last 2 optimizations to similar code I forgot. | Michael Niedermayer | 2010-01-16
  Originally committed as revision 21249 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Another microopt, 4 cpu cycles for avoidance of FFABS(). | Michael Niedermayer | 2010-01-16
  Originally committed as revision 21248 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Minor (2 cpu cycles) optimization ||->|. | Michael Niedermayer | 2010-01-16
  Originally committed as revision 21246 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Avoid wasting 4 cpu cycles per MB in redundantly calculating qp_thresh. | Michael Niedermayer | 2010-01-16
  Originally committed as revision 21243 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Split h264 loop filter off h264.c. | Michael Niedermayer | 2010-01-12
  No measurable speed difference on pentium dual & cathedral sample. Originally committed as revision 21159 to svn://svn.ffmpeg.org/ffmpeg/trunk
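
Three of the micro-optimizations named in this log (avoiding FFABS(), the `||` to `|` change, and setting bS 64 bits at a time) may be easier to follow with a small sketch. The C below is an illustration only and is not taken from h264_loopfilter.c: the helper names, the threshold of 4, and the `int16_t bS[4]` layout are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference form: "the two values differ by 4 or more". */
static int mv_differs_abs(int a, int b)
{
    return abs(a - b) >= 4;
}

/* Same test without an abs(): adding 3 maps the "close enough" range
 * [-3, 3] onto [0, 6], so one unsigned compare rejects it. Negative
 * results wrap to large unsigned values and compare as >= 7. */
static int mv_differs_unsigned(int a, int b)
{
    return (unsigned)(a - b + 3) >= 7U;
}

/* Bitwise | instead of logical ||: both operands are 0 or 1 and have no
 * side effects, so the result is the same, but no short-circuit branch
 * is needed between the two comparisons. */
static int mv_pair_differs(int ax, int ay, int bx, int by)
{
    return mv_differs_unsigned(ax, bx) | mv_differs_unsigned(ay, by);
}

/* Fill four 16-bit strength values with one 64-bit store (assumed
 * layout). memcpy keeps the sketch free of aliasing and alignment
 * problems; compilers typically lower it to a single store. */
static void set_bs_all(int16_t bS[4], int16_t strength)
{
    uint64_t v = (uint16_t)strength * 0x0001000100010001ULL;

    assert(bS != NULL);
    memcpy(bS, &v, sizeof(v));
}
```

Because the 64-bit pattern repeats the same 16-bit value four times, the result of `set_bs_all()` does not depend on byte order. Whether any of these forms is actually faster depends on the compiler and target, which is the point several of the commit messages above make about gcc.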