Commit message | Author | Age | |
---|---|---|---|
* | x86: add a misc utility header | Anton Khirnov | 13 days |
* | x86inc.asm: update to current master 04f14f43 | Anton Khirnov | 14 days |
| | Requires changing residual calc functions to AVX2. Also, supply the private prefix via the nasm -D option rather than modifying x86inc.asm. | | |
* | residual_calc: accept all diff coefficients in a single array | Anton Khirnov | 2024-04-15 |
| | Plus an offset parameter that gives the distance between different coefficients. This avoids passing so many pointers around, which reduces register pressure and simplifies writing SIMD. It also seems to be a little faster. | | |
* | residual_calc: rename stride to u_stride | Anton Khirnov | 2024-04-15 |
| | Make it explicit that it only applies to u, as other arrays are not indexed beyond the current line. | | |
* | egs: merge residual calc and correct when possible | Anton Khirnov | 2019-04-24 |
| | Also, merge the reflect boundary condition into residual calc+add. Improves performance due to better locality. | | |
* | egs: premultiply diff_coeffs with the denominator in init | Anton Khirnov | 2019-04-19 |
| | Do not do it at every residual calc, which also allows us to get rid of an extra parameter (and reduce the number of registers used in x86 SIMD). | | |
* | residual_calc.asm: use the correct coefficients for y derivatives | Anton Khirnov | 2019-02-02 |
* | ell_relax: compute the residual norm in residual_calc() | Anton Khirnov | 2019-01-13 |
| | It is cheap and avoids an extra step in mg2d. | | |
* | residual_calc.asm: fix partial stores | Anton Khirnov | 2019-01-13 |
| | .store1 and .store3 were switched. | | |
* | residual_calc.asm: calculate x*=16 by x*=8; x+=x | Anton Khirnov | 2019-01-13 |
| | Frees up one mm register for future use. | | |
* | residual_calc.asm: implement writing partial blocks | Anton Khirnov | 2019-01-10 |
| | Avoid writing anything beyond the specified line size. | | |
* | residual_calc.asm: templatize the entire residual computation | Anton Khirnov | 2019-01-10 |
* | residual_calc.asm: templatize computing the mixed derivative | Anton Khirnov | 2019-01-10 |
* | residual_calc.asm: templatize computing non-mixed derivatives | Anton Khirnov | 2019-01-10 |
* | residual_calc.asm: make mm register use more consistent between s1 and s2 | Anton Khirnov | 2019-01-10 |
* | residual_calc.asm: make register use in s1 more similar to s2 | Anton Khirnov | 2019-01-10 |
* | residual_calc.asm: reduce the use of magic constants | Anton Khirnov | 2019-01-10 |
* | residual_calc.asm: reduce register use in the s1 variant | Anton Khirnov | 2019-01-10 |
| | Make it similar to the s2 version, which should make it easier to templatize the code in the future. | | |
* | ell_relax: add AVX SIMD for residual_calc | Anton Khirnov | 2018-12-27 |
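
The 2024-04-15 entry "residual_calc: accept all diff coefficients in a single array" describes passing one array plus an offset that gives the distance between different coefficients. The following is a minimal C sketch of that kind of layout, not the project's actual code: the enum names, the number of coefficient kinds, and both helper functions are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical coefficient kinds; names and count are illustrative only. */
enum { COEFF_00, COEFF_10, COEFF_20, COEFF_NB };

/*
 * Return the coefficient of kind `kind` at index `idx` on the current line.
 * All kinds live in one array; `offset` is the distance (in elements)
 * between the start of one kind and the start of the next.
 */
static double coeff_at(const double *coeffs, ptrdiff_t offset,
                       int kind, ptrdiff_t idx)
{
    return coeffs[kind * offset + idx];
}

/*
 * Toy residual for a single point: rhs - (c00 * u + c10 * du + c20 * d2u).
 * Only one base pointer and one offset are passed, instead of one pointer
 * per coefficient kind.
 */
static double toy_residual(const double *coeffs, ptrdiff_t offset,
                           ptrdiff_t idx,
                           double u, double du, double d2u, double rhs)
{
    return rhs - (coeff_at(coeffs, offset, COEFF_00, idx) * u +
                  coeff_at(coeffs, offset, COEFF_10, idx) * du +
                  coeff_at(coeffs, offset, COEFF_20, idx) * d2u);
}
```

With this layout a caller hands over a single base pointer and an integer, so the number of pointer arguments stays constant however many coefficient kinds there are. That is the register-pressure benefit the commit message mentions.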