path: root/src/vectors-8-SSE2.h
Commit message | Author | Age
* Don't use <x86intrin.h>; this does not exist everywhere | eschnett | 2011-12-15
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@40 105869f7-3296-0410-a4ea-f4349344b45a
* Support FMA4 instructions (AMD's fused multiply-add) | eschnett | 2011-12-14
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@39 105869f7-3296-0410-a4ea-f4349344b45a
* LSUThorns/Vectors: Remove pos, add sin/cos/tan functions | eschnett | 2011-12-02
  Remove kpos, because it is not used (it is a no-op, i.e. the arithmetic + operator).
  Add sin, cos, and tan.
  Begin to implement (still commented out) integer vector operations.
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@38 105869f7-3296-0410-a4ea-f4349344b45a
* Use "andnot" instruction when vectorising | eschnett | 2011-09-26
  Use the "andnot" instruction to reduce the number of different bit masks that are required. Using fewer different bit masks may require fewer registers to hold them, or fewer load instructions to access them, thus potentially improving performance.
  Do not scalarize ifpos when SSE 4.1 is not available; instead, use logical operations to create a bit mask.
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@31 105869f7-3296-0410-a4ea-f4349344b45a
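The mask-sharing idea in the "andnot" commit above can be sketched as follows. This is an illustration under stated assumptions, not the code in vectors-8-SSE2.h: every `sketch_*` name is hypothetical, the target is assumed to be x86 with SSE2 enabled, and the select for `ifpos` (take `y` where the sign bit of `x` is clear, otherwise `z`) is only one plausible way to build the mask without an SSE 4.1 blend.

```cpp
#include <cassert>
#include <emmintrin.h>  // SSE2 intrinsics; assumes an x86 target with SSE2 enabled

// All sketch_* names are hypothetical; none is taken from vectors-8-SSE2.h.

// The single mask shared by abs and neg: only the sign bit set in each 64-bit lane.
static inline __m128d sketch_sign_mask() { return _mm_set1_pd(-0.0); }

// abs: clear the sign bit. _mm_andnot_pd(a, b) computes (~a) & b, so the
// sign mask itself can be used and no separate "all bits but sign" mask is needed.
static inline __m128d sketch_abs(__m128d x) {
  return _mm_andnot_pd(sketch_sign_mask(), x);
}

// neg: flip the sign bit with the same mask.
static inline __m128d sketch_neg(__m128d x) {
  return _mm_xor_pd(sketch_sign_mask(), x);
}

// ifpos without a blend instruction: y where the sign bit of x is clear, else z.
static inline __m128d sketch_ifpos(__m128d x, __m128d y, __m128d z) {
  // Copy the upper 32 bits of each 64-bit lane into both halves of that lane.
  const __m128i hi =
      _mm_shuffle_epi32(_mm_castpd_si128(x), _MM_SHUFFLE(3, 3, 1, 1));
  // An arithmetic right shift by 31 replicates the sign bit across the lane,
  // giving all ones for negative lanes and all zeros otherwise.
  const __m128d neg = _mm_castsi128_pd(_mm_srai_epi32(hi, 31));
  return _mm_or_pd(_mm_andnot_pd(neg, y), _mm_and_pd(neg, z));
}

// Small usage example.
static inline void sketch_self_check() {
  const __m128d x = _mm_set_pd(-2.0, 3.0);  // lane 0 = 3.0, lane 1 = -2.0
  double out[2];
  _mm_storeu_pd(out, sketch_abs(x));
  assert(out[0] == 3.0 && out[1] == 2.0);
  _mm_storeu_pd(out, sketch_ifpos(x, _mm_set1_pd(1.0), _mm_set1_pd(-1.0)));
  assert(out[0] == 1.0 && out[1] == -1.0);
}
```

Here `sketch_abs` and `sketch_neg` share one constant, which is the register and load saving the commit message describes; without "andnot", `abs` would need a second, inverted mask.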
* Suggest asm statements to support SSE4a with Intel compilers. | eschnett | 2011-08-25
  Indent vector architecture definitions.
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@30 105869f7-3296-0410-a4ea-f4349344b45a
* Make more C++ compilers understand the signbit function | eschnett | 2011-08-20
  Several C++ compilers cannot handle std::signbit; use a work-around instead.
  Correct a namespace problem when using the same identifier Vectors_SGN for different precisions (real*4 and real*8).
  Correct the kifpos implementation, which was incorrect on several architectures.
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@29 105869f7-3296-0410-a4ea-f4349344b45a
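The signbit commit above does not say which work-around was chosen. One common approach, shown here purely as an assumption, reads the sign bit directly from the bit pattern of the value. The name `sketch_signbit` is hypothetical, and the sketch assumes `double` is an IEEE 754 binary64 value.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for std::signbit; not the work-around used in the thorn.
// Assumes double is IEEE 754 binary64, so the sign is bit 63.
static inline bool sketch_signbit(double x) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "sketch assumes a 64-bit double");
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits);  // well-defined way to inspect the bits
  return (bits >> 63) != 0;
}

// Small usage example.
static inline void sketch_signbit_check() {
  assert(!sketch_signbit(1.0));
  assert(sketch_signbit(-1.0));
}
```

Unlike a comparison such as `x < 0`, a bit-level test looks only at the sign bit, so it also reports a set sign on negative zero.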
* Use a macro name which is less likely to conflict with an existing macro. | svn_bwardell | 2011-08-08
  This macro is only used internally anyway.
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@28 105869f7-3296-0410-a4ea-f4349344b45a
* Add more vectorisation tests. Add test case. | eschnett | 2011-08-08
  Add vectorisation test for vector creation, load, and store statements.
  Convert C to C++ since vectorisation requires C++.
  Add test case.
  Beautify vectorisation templates.
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@26 105869f7-3296-0410-a4ea-f4349344b45a
* Rename kifthen to kifpos as it more accurately reflects what it actually does. | svn_bwardell | 2011-08-07
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@24 105869f7-3296-0410-a4ea-f4349344b45a
* Make definition of vec_architecture for SSE and default more explicit. | svn_bwardell | 2011-08-07
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@22 105869f7-3296-0410-a4ea-f4349344b45a
* Fix definition of kifthen for architectures where blend instructions are not available. | svn_bwardell | 2011-08-07
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@20 105869f7-3296-0410-a4ea-f4349344b45a
* Fix typo which caused k8ifthen to not compile if SSE4.1 was not available. | svn_bwardell | 2011-07-21
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@15 105869f7-3296-0410-a4ea-f4349344b45a
* Fix error in definition of k8abs_mask for SSE2 architectures. | svn_bwardell | 2011-07-21
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@14 105869f7-3296-0410-a4ea-f4349344b45a
* Add new API elements "kifthen" and "vec_architecture" | eschnett | 2011-06-20
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@12 105869f7-3296-0410-a4ea-f4349344b45a
* Introduce Cactus options for vectorisation | eschnett | 2011-06-06
  Introduce configuration-time options for vectorisation, including options to allow architecture-specific choices that may influence performance.
  Introduce "middle" masked stores for large vector sizes and small loops.
  Clean up and simplify some of the implementation code.
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@10 105869f7-3296-0410-a4ea-f4349344b45a
* Change naming scheme of architecture files | eschnett | 2011-01-20
  Add support for AVX (next-generation SSE)
  Add support for Double Hummer (Blue Gene/P)
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@7 105869f7-3296-0410-a4ea-f4349344b45a