| Commit message | Author | Age |
|
Plus an offset parameter that gives the distance between different
coefficients. This avoids passing so many pointers around, which
reduces register pressure and simplifies writing SIMD. It also seems
to be a little faster.
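
A minimal sketch of the base-pointer-plus-offset idea, assuming all
coefficient arrays sit in one contiguous buffer. The names apply_coeffs,
coeff_offset and NB_COEFFS are hypothetical and not taken from the
actual code; the sketch only illustrates why one pointer and one offset
can replace a list of per-coefficient pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical number of coefficient arrays, chosen for illustration. */
enum { NB_COEFFS = 3 };

/*
 * Compute sum_k coeff_k[idx] * vals[k] at a single grid point.
 * Coefficient array k starts at coeffs + k * coeff_offset, so the caller
 * passes one base pointer and one offset instead of NB_COEFFS pointers.
 */
static double apply_coeffs(const double *coeffs, ptrdiff_t coeff_offset,
                           ptrdiff_t idx, const double vals[NB_COEFFS])
{
    double ret = 0.0;

    /* Assumption of this sketch: arrays are laid out at increasing addresses. */
    assert(coeff_offset > 0);

    for (int k = 0; k < NB_COEFFS; k++)
        ret += coeffs[k * coeff_offset + idx] * vals[k];

    return ret;
}
```

With this layout a SIMD kernel needs a single address register for the
coefficients and can reach each array by adding a multiple of the offset.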
|
Make parameter names clearer and more consistent, document them, and
implement the missing 1U boundary.
|
Also, merge the reflect boundary condition into residual calc+add.
Improves performance due to better locality.
|
Do not do it at every residual calc, which also allows us to get rid of
an extra parameter (and reduce the number of registers used in x86
SIMD).
|
Also, allocate all the diff coeffs together.
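
Allocating the coefficients together could look roughly like the sketch
below. The CoeffBlock type and the coeff_block_* helpers are
hypothetical names introduced for illustration, and the layout (one
zero-initialised buffer, arrays nb_points elements apart) is an
assumption rather than the actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical container: one buffer holding every coefficient array. */
typedef struct CoeffBlock {
    double   *data;      /* start of the single allocation */
    ptrdiff_t offset;    /* distance in elements between consecutive arrays */
    int       nb_coeffs; /* number of arrays stored in data */
} CoeffBlock;

/* Allocate nb_coeffs arrays of nb_points elements in one calloc() call. */
static int coeff_block_alloc(CoeffBlock *cb, int nb_coeffs, size_t nb_points)
{
    cb->data = calloc((size_t)nb_coeffs * nb_points, sizeof(*cb->data));
    if (!cb->data)
        return -1;

    cb->offset    = (ptrdiff_t)nb_points;
    cb->nb_coeffs = nb_coeffs;
    return 0;
}

/* Return the start of coefficient array k inside the shared buffer. */
static double *coeff_block_get(const CoeffBlock *cb, int k)
{
    assert(k >= 0 && k < cb->nb_coeffs);
    return cb->data + k * cb->offset;
}

/* Release the single allocation and reset the block. */
static void coeff_block_free(CoeffBlock *cb)
{
    free(cb->data);
    cb->data      = NULL;
    cb->offset    = 0;
    cb->nb_coeffs = 0;
}
```

A single allocation means one failure path to handle and one free(), and
the offset stored alongside the buffer is all a kernel needs to reach
each array.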
|