path: root/residual_calc.asm
Commit message (Author, Age)
* x86: add a misc utility header (Anton Khirnov, 13 days)
* x86inc.asm: update to current master 04f14f43 (Anton Khirnov, 14 days)
  Requires changing residual calc functions to AVX2. Also, supply the private prefix via nasm -D option rather than modifying x86inc.asm.
* residual_calc: accept all diff coefficients in a single array (Anton Khirnov, 2024-04-15)
  Plus an offset parameter that signals the distance between different coefficients. This avoids passing so many pointers around, which reduces register pressure and simplifies writing SIMD. It also seems to be a little faster.
* residual_calc: rename stride to u_stride (Anton Khirnov, 2024-04-15)
  Make it explicit that it only applies to u, as other arrays are not indexed beyond the current line.
* egs: merge residual calc and correct when possible (Anton Khirnov, 2019-04-24)
  Also, merge the reflect boundary condition into residual calc+add. Improves performance due to better locality.
* egs: premultiply diff_coeffs with the denominator in init (Anton Khirnov, 2019-04-19)
  Do not do it at every residual calc, which also allows us to get rid of an extra parameter (and reduce the number of registers used in x86 SIMD).
* residual_calc.asm: use the correct coefficients for y derivatives (Anton Khirnov, 2019-02-02)
* ell_relax: compute the residual norm in residual_calc() (Anton Khirnov, 2019-01-13)
  It is cheap and avoids an extra step in mg2d.
* residual_calc.asm: fix partial stores (Anton Khirnov, 2019-01-13)
  .store1 and .store3 were switched
* residual_calc.asm: calculate x*=16 by x*=8; x+=x (Anton Khirnov, 2019-01-13)
  Frees up one mm register for future use.
* residual_calc.asm: implement writing partial blocks (Anton Khirnov, 2019-01-10)
  Avoid overwriting anything over the specified line size.
* residual_calc.asm: templatize the entire residual computation (Anton Khirnov, 2019-01-10)
* residual_calc.asm: templatize computing the mixed derivative (Anton Khirnov, 2019-01-10)
* residual_calc.asm: templatize computing non-mixed derivatives (Anton Khirnov, 2019-01-10)
* residual_calc.asm: make mm register use more consistent between s1 and s2 (Anton Khirnov, 2019-01-10)
* residual_calc.asm: make register use in s1 more similar to s2 (Anton Khirnov, 2019-01-10)
* residual_calc.asm: reduce the use of magic constants (Anton Khirnov, 2019-01-10)
* residual_calc.asm: reduce register use in the s1 variant (Anton Khirnov, 2019-01-10)
  Make it similar to the s2 version, which should make it easier to templatize the code in the future.
* ell_relax: add AVX SIMD for residual_calc (Anton Khirnov, 2018-12-27)
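
The 2024-04-15 entry "residual_calc: accept all diff coefficients in a single array" describes replacing many coefficient pointers with one base pointer plus an offset. The following is a minimal C sketch of that idea only, not the project's actual code: the function name `line_apply`, the `COEFF_*` indices, the 1D operator, and the unit grid spacing are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical plane indices; the project's real enum may differ. */
enum { COEFF_00, COEFF_10, COEFF_20, COEFF_NB };

/*
 * Apply a 1D second-order operator to one line of n points:
 *   dst[i] = c00[i] * u[i] + c10[i] * u_x[i] + c20[i] * u_xx[i]
 *
 * All coefficient planes live in one array; plane k starts at
 * diff_coeffs + k * coeffs_offset, so only one pointer and one
 * offset are passed instead of one pointer per coefficient.
 *
 * u[-1] and u[n] must be readable (boundary points).
 * Unit grid spacing is assumed for the finite differences.
 */
static void line_apply(double *dst, ptrdiff_t n, const double *u,
                       const double *diff_coeffs, ptrdiff_t coeffs_offset)
{
    /* The planes must not overlap within one line. */
    assert(n > 0 && coeffs_offset >= n);

    for (ptrdiff_t i = 0; i < n; i++) {
        /* Second-order centred differences. */
        const double u_x  = 0.5 * (u[i + 1] - u[i - 1]);
        const double u_xx = u[i + 1] - 2.0 * u[i] + u[i - 1];

        dst[i] = diff_coeffs[COEFF_00 * coeffs_offset + i] * u[i]
               + diff_coeffs[COEFF_10 * coeffs_offset + i] * u_x
               + diff_coeffs[COEFF_20 * coeffs_offset + i] * u_xx;
    }
}
```

In a SIMD version the same layout means a single base register plus a scaled offset can address every coefficient plane, which is consistent with the reduced register pressure the commit message mentions.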