path: root/src/vectors-4-SSE.h
Commit message (author, date)
* Correct syntax error (eschnett, 2013-07-19)
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@86 105869f7-3296-0410-a4ea-f4349344b45a
* Do not use type punning any more (eschnett, 2013-07-19)
  Do not cast between different pointer types. This is illegal in C/C++, and modern compilers (such as gcc 4.8) then generate wrong code. Instead, use memcpy to re-interpret the bit patterns of values with a different type.
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@85 105869f7-3296-0410-a4ea-f4349344b45a
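The memcpy approach described in this commit can be sketched as follows. This is a minimal illustration, not the code from vectors-4-SSE.h: the helper names (`double_to_bits`, `bits_to_double`, `abs_via_bits`) are hypothetical, and the example assumes an IEEE 754 64-bit `double`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper names; they do not come from vectors-4-SSE.h.

// Return the bit pattern of a double as a 64-bit unsigned integer.
// memcpy copies the object representation, which is well defined,
// unlike dereferencing a pointer cast such as *(std::uint64_t*)&x.
static inline std::uint64_t double_to_bits(double x) {
  static_assert(sizeof(std::uint64_t) == sizeof(double), "size mismatch");
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  return bits;
}

// Inverse operation: build a double from a 64-bit pattern.
static inline double bits_to_double(std::uint64_t bits) {
  double x;
  std::memcpy(&x, &bits, sizeof x);
  return x;
}

// Example use: clear the sign bit to obtain |x|
// (assumes IEEE 754 binary64, where bit 63 is the sign).
static inline double abs_via_bits(double x) {
  const std::uint64_t sign_mask = std::uint64_t(1) << 63;
  return bits_to_double(double_to_bits(x) & ~sign_mask);
}

// Small self-check of the helpers above.
static inline void demo_bit_reinterpretation() {
  assert(bits_to_double(double_to_bits(-2.5)) == -2.5);
  assert(abs_via_bits(-3.0) == 3.0);
}
```

Compilers typically reduce a fixed-size memcpy like this to a plain register move, so the well-defined form usually carries no runtime cost.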
* Major update (eschnett, 2013-01-16)
  Disable AVX emulation
  Set default for streaming stores to "no"
  Correct QPX vectorisation (IBM Blue Gene/Q)
  Add MIC vectorisation (Intel Xeon Phi)
  Convert SSE and AVX vectorisation to using inline functions instead of macros for code clarity
  Define CCTK_BOOLEAN, CCTK_INTEGER and CCTK_BOOLEAN_VEC, CCTK_INTEGER_VEC to make boolean and integer vectors explicit
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@77 105869f7-3296-0410-a4ea-f4349344b45a
* Use ~0 instead of -1 for intmax (eschnett, 2012-10-22)
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@69 105869f7-3296-0410-a4ea-f4349344b45a
* Add support for (dynamic) if-then expressions (eschnett, 2012-09-14)
  Add types for holding integers and booleans, and vectors thereof.
  Add if-then expressions.
  Add floating point comparisons.
  Update tests.
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@66 105869f7-3296-0410-a4ea-f4349344b45a
* Add ksgn function (vectorised version of Kranc's Sign) (eschnett, 2012-08-11)
  All architectures: Add copysign and sgn functions.
  Remove pos function (which does nothing).
  Add support for Blue Gene/Q (QPX instructions).
  Correct errors in AVX instructions.
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@62 105869f7-3296-0410-a4ea-f4349344b45a
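For orientation, a scalar reference for a sign function built on copysign might look like the sketch below. The name `sgn_ref` is hypothetical, and the convention that zero maps to 0.0 is an assumption; the log does not state how the header's sgn or ksgn treat zero or NaN.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar reference; sgn_ref is not a name from the source.
// Returns -1.0 for negative x, +1.0 for positive x, and 0.0 for zero.
// The zero case is an assumed convention. A NaN input yields +1.0 or
// -1.0 depending on the NaN's sign bit.
static inline double sgn_ref(double x) {
  return x == 0.0 ? 0.0 : std::copysign(1.0, x);
}

// Small self-check of the reference function.
static inline void demo_sgn_ref() {
  assert(sgn_ref(-4.0) == -1.0);
  assert(sgn_ref(2.5) == 1.0);
}
```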
* Implement asin, sinh, asinh, and friends (eschnett, 2012-04-02)
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@55 105869f7-3296-0410-a4ea-f4349344b45a
* Add missing casts. (barry.wardell, 2012-03-02)
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@54 105869f7-3296-0410-a4ea-f4349344b45a
* Implement missing functionality (eschnett, 2012-02-06)
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@51 105869f7-3296-0410-a4ea-f4349344b45a
* Various changes (eschnett, 2012-02-05)
  1. Implement a simplified partial store interface
     Implement vec_store_nta_partial, which offers a simpler interface, similar to the one used in OpenCL.
  2. Add kifmsg function, and implement kifpos and kifneg in terms of this.
  3. Update (and make safer) Kranc-specific code
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@47 105869f7-3296-0410-a4ea-f4349344b45a
* Make vectorisation macros safer (eschnett, 2011-12-22)
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@44 105869f7-3296-0410-a4ea-f4349344b45a
* Simplify setting architecture description strings (eschnett, 2011-12-21)
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@42 105869f7-3296-0410-a4ea-f4349344b45a
* Don't use <x86intrin.h>; this does not exist everywhere (eschnett, 2011-12-15)
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@40 105869f7-3296-0410-a4ea-f4349344b45a
* Support FMA4 instructions (AMD's fused multiply-add) (eschnett, 2011-12-14)
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@39 105869f7-3296-0410-a4ea-f4349344b45a
* Use "andnot" instruction when vectorising (eschnett, 2011-09-26)
  Use the "andnot" instruction to reduce the number of different bit masks that are required. Using fewer different bit masks may require fewer registers to hold them, or fewer load instructions to access them, thus potentially improving performance.
  Do not scalarize ifpos when SSE 4.1 is not available; instead, use logical operations to create a bit mask.
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@31 105869f7-3296-0410-a4ea-f4349344b45a
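The idea of getting by with a single mask can be illustrated with a scalar stand-in. The sketch below is not the header's code: all names are hypothetical, a 64-bit integer plays the role of one vector lane, and `andnot_bits(a, b)` mirrors the operand order of the SSE2 intrinsic `_mm_andnot_pd`, which computes `(~a) & b`. Both operations need only the sign mask; the complementary "everything but the sign" mask never has to be stored or loaded.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical names throughout; a scalar model of one 64-bit vector lane.

static inline std::uint64_t bits_of(double x) {
  std::uint64_t b;
  std::memcpy(&b, &x, sizeof b);
  return b;
}

static inline double double_of(std::uint64_t b) {
  double x;
  std::memcpy(&x, &b, sizeof x);
  return x;
}

// Bitwise and-not with the operand order of _mm_andnot_pd: (~a) & b.
static inline std::uint64_t andnot_bits(std::uint64_t a, std::uint64_t b) {
  return ~a & b;
}

// The only mask constant: the IEEE 754 binary64 sign bit.
static const std::uint64_t kSignMask = std::uint64_t(1) << 63;

// |x|: and-not with the sign mask clears the sign bit, so no separate
// 0x7FFF... mask is needed.
static inline double fabs_one_mask(double x) {
  return double_of(andnot_bits(kSignMask, bits_of(x)));
}

// copysign(x, y): magnitude bits of x combined with the sign bit of y,
// again using only kSignMask.
static inline double copysign_one_mask(double x, double y) {
  return double_of(andnot_bits(kSignMask, bits_of(x)) |
                   (kSignMask & bits_of(y)));
}

// Small self-check of the two operations.
static inline void demo_one_mask() {
  assert(fabs_one_mask(-1.5) == 1.5);
  assert(copysign_one_mask(2.0, -1.0) == -2.0);
}
```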
* Suggest asm statements to support SSE4a with Intel compilers. (eschnett, 2011-08-25)
  Indent vector architecture definitions.
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@30 105869f7-3296-0410-a4ea-f4349344b45a
* Make more C++ compilers understand the signbit function (eschnett, 2011-08-20)
  Several C++ compilers cannot handle std::signbit; use a work-around instead.
  Correct a namespace problem when using the same identifier Vectors_SGN for different precisions (real*4 and real*8).
  Correct the kifpos implementation, which was incorrect on several architectures.
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@29 105869f7-3296-0410-a4ea-f4349344b45a
* Use a macro name which is less likely to conflict with an existing macro. (svn_bwardell, 2011-08-08)
  This macro is only used internally anyway.
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@28 105869f7-3296-0410-a4ea-f4349344b45a
* Add more vectorisation tests. Add test case. (eschnett, 2011-08-08)
  Add vectorisation test for vector creation, load, and store statements.
  Convert C to C++ since vectorisation requires C++.
  Add test case.
  Beautify vectorisation templates.
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@26 105869f7-3296-0410-a4ea-f4349344b45a
* Rename kifthen to kifpos as it more accurately reflects what it actually does. (svn_bwardell, 2011-08-07)
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@24 105869f7-3296-0410-a4ea-f4349344b45a
* Make definition of vec_architecture for SSE and default more explicit. (svn_bwardell, 2011-08-07)
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@22 105869f7-3296-0410-a4ea-f4349344b45a
* Fix definition of kifthen for architectures where blend instructions are not available. (svn_bwardell, 2011-08-07)
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@20 105869f7-3296-0410-a4ea-f4349344b45a
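When no blend instruction is available, a select can be assembled from a sign-derived bit mask and plain logical operations. The scalar sketch below shows the pattern for one 64-bit lane. The function name and the exact semantics (return `y` when the sign bit of `x` is clear, otherwise `z`) are assumptions made for illustration, not the definition used in the header.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical names; a scalar model of one 64-bit vector lane.

static inline std::uint64_t as_bits(double x) {
  std::uint64_t b;
  std::memcpy(&b, &x, sizeof b);
  return b;
}

static inline double as_double(std::uint64_t b) {
  double x;
  std::memcpy(&x, &b, sizeof x);
  return x;
}

// Select y if the sign bit of x is clear, z if it is set (assumed
// semantics). The mask is all ones when the sign bit is set and all
// zeros otherwise: unsigned 0 - 1 wraps around to 0xFFFF...FFFF.
static inline double ifpos_no_blend(double x, double y, double z) {
  const std::uint64_t mask = std::uint64_t(0) - (as_bits(x) >> 63);
  return as_double((~mask & as_bits(y)) | (mask & as_bits(z)));
}

// Small self-check of the select.
static inline void demo_ifpos_no_blend() {
  assert(ifpos_no_blend(1.0, 10.0, 20.0) == 10.0);
  assert(ifpos_no_blend(-1.0, 10.0, 20.0) == 20.0);
}
```

Because the decision uses only the sign bit, +0.0 selects `y` and -0.0 selects `z` in this sketch.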
* Add new API elements "kifthen" and "vec_architecture" (eschnett, 2011-06-20)
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@12 105869f7-3296-0410-a4ea-f4349344b45a
* Introduce Cactus options for vectorisation (eschnett, 2011-06-06)
  Introduce configuration-time options for vectorisation, including options to allow architecture-specific choices that may influence performance.
  Introduce "middle" masked stores for large vector sizes and small loops.
  Clean up and simplify some of the implementation code.
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@10 105869f7-3296-0410-a4ea-f4349344b45a
* Change naming scheme of architecture files (eschnett, 2011-01-20)
  Add support for AVX (next-generation SSE)
  Add support for Double Hummer (Blue Gene/P)
  git-svn-id: https://svn.cct.lsu.edu/repos/numrel/LSUThorns/Vectors/trunk@7 105869f7-3296-0410-a4ea-f4349344b45a