% *======================================================================* % Cactus Thorn template for ThornGuide documentation % Author: Ian Kelley % Date: Sun Jun 02, 2002 % $Header$ % % Thorn documentation in the latex file doc/documentation.tex % will be included in ThornGuides built with the Cactus make system. % The scripts employed by the make system automatically include % pages about variables, parameters and scheduling parsed from the % relevant thorn CCL files. % % This template contains guidelines which help to assure that your % documentation will be correctly added to ThornGuides. More % information is available in the Cactus UsersGuide. % % Guidelines: % - Do not change anything before the line % % START CACTUS THORNGUIDE", % except for filling in the title, author, date, etc. fields. % - Each of these fields should only be on ONE line. % - Author names should be separated with a \\ or a comma. % - You can define your own macros, but they must appear after % the START CACTUS THORNGUIDE line, and must not redefine standard % latex commands. % - To avoid name clashes with other thorns, 'labels', 'citations', % 'references', and 'image' names should conform to the following % convention: % ARRANGEMENT_THORN_LABEL % For example, an image wave.eps in the arrangement CactusWave and % thorn WaveToyC should be renamed to CactusWave_WaveToyC_wave.eps % - Graphics should only be included using the graphicx package. % More specifically, with the "\includegraphics" command. Do % not specify any graphic file extensions in your .tex file. This % will allow us to create a PDF version of the ThornGuide % via pdflatex. % - References should be included with the latex "\bibitem" command. % - Use \begin{abstract}...\end{abstract} instead of \abstract{...} % - Do not use \appendix, instead include any appendices you need as % standard sections. % - For the benefit of our Perl scripts, and for future extensions, % please use simple latex. % % *======================================================================* % % Example of including a graphic image: % \begin{figure}[ht] % \begin{center} % \includegraphics[width=6cm]{MyArrangement_MyThorn_MyFigure} % \end{center} % \caption{Illustration of this and that} % \label{MyArrangement_MyThorn_MyLabel} % \end{figure} % % Example of using a label: % \label{MyArrangement_MyThorn_MyLabel} % % Example of a citation: % \cite{MyArrangement_MyThorn_Author99} % % Example of including a reference % \bibitem{MyArrangement_MyThorn_Author99} % {J. Author, {\em The Title of the Book, Journal, or periodical}, 1 (1999), % 1--16. {\tt http://www.nowhere.com/}} % % *======================================================================* % If you are using CVS use this line to give version information % $Header$ \documentclass{article} % Use the Cactus ThornGuide style file % (Automatically used from Cactus distribution, if you have a % thorn without the Cactus Flesh download this from the Cactus % homepage at www.cactuscode.org) \usepackage{../../../../doc/latex/cactus} \begin{document} % The author of the documentation \author{Erik Schnetter \textless eschnetter@perimeterinstitute.ca\textgreater} % The title of the document (not necessarily the name of the Thorn) \title{OpenCL} % the date your document was last changed, if your document is in CVS, % please use: % \date{$ $Date: 2004-01-07 14:12:39 -0600 (Wed, 07 Jan 2004) $ $} \date{May 9 2012} \maketitle % Do not delete next line % START CACTUS THORNGUIDE % Add all definitions used in this documentation here % \def\mydef etc % Add an abstract for this thorn's documentation \begin{abstract} \emph{OpenCL} is a programming standard for heterogeneous systems, i.e.\ for programming CPUs, GPUs, and other types of accelerators. OpenCL is implemented as a library, and OpenCL codes are compiled at run time by passing OpenCL routines, as strings, to the OpenCL library. This is different e.g.\ from \texttt{CUDA}, which is implemented as a language such as C or C++. This thorn \texttt{OpenCL} provides the configuration bits that ensure that Cactus applications can use OpenCL libraries. \end{abstract} \section{Introduction} OpenCL describes itself as: \begin{quote} OpenCL is the first open, royalty-free standard for cross-platform, parallel programming of modern processors found in personal computers, servers and handheld/embedded devices. OpenCL (Open Computing Language) greatly improves speed and responsiveness for a wide spectrum of applications in numerous market categories from gaming and entertainment to scientific and medical software. \end{quote} More information is available at \url{http://www.khronos.org/opencl/}. \section{Availability} There seem to be four OpenCL implementations available at this time. Unfortunately, they each have their drawbacks: \begin{description} \item[AMD] Available at \url{http://developer.amd.com/zones/openclzone/pages/default.aspx}. This supports both CPUs and ATI GPUs. Unfortunately, the OpenCL compiler seems to produce code with a low quality. \item[Apple] Included with the operating system, available by default. This supports both CPU and GPU\@. The compiler is based on LLVM\@. Unfortunately, there seem to be serious bugs -- for example, I can't get the $cos$ function to provide correct results. \item[Intel] Available at \url{http://software.intel.com/en-us/articles/opencl-sdk/}. This supports only (Intel?) CPUs. The compiler is based on LLVM, and the implementation is also based on Intel's TBB (Threading Building Blocks). \item[Nvidia] Available at \url{http://developer.nvidia.com/opencl}, included in their CUDA distribution. This supports only GPUs. \item[pocl] Open source, available at \url{https://launchpad.net/pocl}. This OpenCL implementation has not yet been released (current version is 0.6), and is based on LLVM\@. \end{description} In addition, Wikipedia \url{http://en.wikipedia.org/wiki/OpenCL} lists two IBM implementations for their Power processor and for Intel compatible CPUs, respectively. The latter may be identical with or similar to AMD's implementation. Since OpenCL can run on CPUs, good OpenCL implementation are available at no cost for virtually all platforms. It is possible to install several OpenCL implementations (\emph{platforms}) at the same time, to build against any one of them, and then to choose at run time which devices from which platforms to use. For example, it is possible to build an application using the Intel implementation, and then at run time use the Nvidia platform to access a GPU (assuming that both Intel and Nvidia implementations are installed). On Unix, this is implemented via a system-wide configuration directory \texttt{/etc/OpenCL/vendors} that lists all OpenCL platforms that will be available at run time. \section{OpenCL Programming} OpenCL is very similar to C\@. However, it differs from C in several key aspects: \begin{itemize} \item much smaller run-time library, consisting mostly of mathematical functions (such as sqrt) and printf; \item built-in support for fine-grained and coarse-grainded multi-threading; \item built-in support for vectorisation. \end{itemize} Given this, it is not possible to write a whole application in OpenCL\@. Instead, only the expensive parts (so-called \emph{compute kernels}) are written in OpenCL, and are launched e.g.\ from C or C++. In addition, the hardware architecture of GPUs and other accelerators differs from CPUs in one key aspect: \begin{itemize} \item memory is separate from the host (regular CPU) memory. \end{itemize} That means that one has to explicitly copy data between the host memory and the device memory before and/or after calling compute kernels. \section{OpenCL Programming in Cactus} Cactus supports OpenCL programming at several levels. At the lowest level, one can use this thorn \texttt{OpenCL} directly. While this works fine, it is somewhat tedious because one has to write a certain amount of boilerplate code to detect and initialise the device, to copy data between host and device, and to build and run compute kernels. Since OpenCL is implemented as a library, the flesh knows only little about OpenCL\@. For example, there are no configuration options to spedify an OpenCL compiler, since code is compiled at run time via a library call to which the source code is passed as string. There is, however, one way in which the flesh supports OpenCL: Files with a \texttt{.cl} suffix are converted into a string and placed into the executable. These strings have the type \texttt{char~const~*} in C, and can be accessed at run time under a (globally visible) name \texttt{OpenCL\_source\_THORN\_FILE}, where \texttt{THORN} and \texttt{FILE} and are the thorn name and file name, respectively. (This is also explained in the users' guide.) \section{High-Level OpenCL Programming in Cactus} Cactus also offers a higher-level way of OpenCL programming, implemented in the thorns \texttt{OpenCLRunTime} and \texttt{Accelerator}. Thorn \texttt{OpenCLRunTime} provides a convenient function for executing OpenCL code. This function expects, as input, a string containing the OpenCL kernel code, and then calls this code. Lower-level tasks such as identifying available compute devices, initialising them, compiling the kernel (once, and then remembering it), and handling arguments and parameters are taken care of automatically. Details are described in this thorn's documentation. Thorn \texttt{Accelerator} simplifies memory management for GPUs and other types of devices. One declares in the thorn's schedule which routines read and write what variables, and \texttt{Accelerator} then keeps track which variables need to be copied at what time. It keeps track where (host and/or device) a variable has valid values, and copies data only when necessary, taking time level cycling, synchronisation, and I/O into account. Details are described in that thorn's documentation. % \begin{thebibliography}{9} % % \end{thebibliography} % Do not delete next line % END CACTUS THORNGUIDE \end{document}