| Commit message (Collapse) | Author | Age |
... | |
|
|
|
|
|
| |
quite a bit faster
Originally committed as revision 21504 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
| |
~6% faster (slow path) loopfilter (should be ~2% overall)
Originally committed as revision 21503 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
|
|
| |
This is a hair slower (0.15% maybe) but i really dont want to have the
identical code duplicated 3 times because gcc adds odd threaded jumps with
register reshuffling and register safe/restore.
Originally committed as revision 21502 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21497 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
| |
to split out
Originally committed as revision 21496 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
|
|
| |
This prevents the issue with having to switch between slow and
fast code paths in each row.
0.5% faster loopfilter for cathedral
Originally committed as revision 21495 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
| |
a few cycles faster
Originally committed as revision 21494 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
|
|
|
|
| |
This allows many things to be simplified away.
h264 decoder is overall 1% faster with a mbaff sample and
0.1% slower with the cathedral sample, probably because the slow loop
filter code must be loaded into the code cache for each first MB of each
row but isnt used for the following MBs.
Originally committed as revision 21493 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21479 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
|
|
| |
interlacing.
~4 cpu cycles speedup
Originally committed as revision 21474 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
| |
60 cpu cycles speedup
Originally committed as revision 21467 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21460 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21459 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21457 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
| |
initialization befre the loop filter.
Originally committed as revision 21416 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
|
| |
This should fix a segfault, also it might be faster on systems where the
+52 wasnt free.
Originally committed as revision 21406 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
| |
This is faster for videos that have lots of MBs that fall in this category.
Originally committed as revision 21400 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21397 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
| |
4 cpu cycles faster.
Originally committed as revision 21396 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21377 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change order of operands as gcc uses a hardcoded register per operand it seems
even for static functions
thus reducing unneeded moved (now functions try to pass the same argument in
the same spot).
Change signed int to unsigned int for array indexes as signed requires signed
extension while unsigned is free.
move the +52 up and merge it where it will end as a lea instruction, gcc always
splits the 52 out there turning the free +52 into an expensive one otherwise.
The changed code becomes a little faster.
Originally committed as revision 21375 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
| |
little effect overall as this is not that often executed.
Originally committed as revision 21366 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21345 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
| |
Its faster but too rarely used to make a differnce.
Originally committed as revision 21344 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
| |
loop filter. (a little faster for the common case where they are equal)
Originally committed as revision 21342 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21341 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21340 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21339 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
| |
No real speed change.
Originally committed as revision 21336 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21328 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21308 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
| |
in the innerst loop. ~150 cpu cycles faster
Originally committed as revision 21299 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21292 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21291 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
| |
ff_h264_filter_mb_fast() too.
Originally committed as revision 21289 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21288 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
| |
that way it is also available for ff_h264_filter_mb_fast().
Originally committed as revision 21283 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
|
| |
loop filter. This removes one obstacle of getting ff_h264_filter_mb_fast()
bitexact. code is maybe 0.1% faster
Originally committed as revision 21280 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21274 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
| |
~1% faster
Originally committed as revision 21273 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
|
|
|
|
|
| |
Run loop filter per row instead of per MB, this also should make it
much easier to switch to per frame filtering and also doing so in a
seperate thread in the future if some volunteer wants to try.
Overall decoding speedup of 1.7% (single thread on pentium dual / cathedral sample)
This change also allows some optimizations to be tried that would not have
been possible before.
Originally committed as revision 21270 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
|
|
| |
~200 bytes smaller ff_h264_filter_mb()
please everyone, NEVER add code with the assumtation that gcc will remove it
without checking gcc actually does. Chances are it does not.
Originally committed as revision 21251 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
|
|
|
|
| |
and 5% faster.
ff_h264_filter_mb_fast() stay the same size as gcc decided not to inline these
functions there in the first place.
Originally committed as revision 21250 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21249 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21248 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21246 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
|
|
| |
Originally committed as revision 21243 to svn://svn.ffmpeg.org/ffmpeg/trunk
|
|
No meassureable speed difference on pentium dual & cathedral sample.
Originally committed as revision 21159 to svn://svn.ffmpeg.org/ffmpeg/trunk
|