From this section onward, we will start learning the OpenCL programming basics by building and running actual code. Since we have not yet gone over the OpenCL grammar, you should concentrate on the general flow of OpenCL programming.
Lists 3.3 and 3.4 show the familiar "Hello, World!" program written in OpenCL. Since standard input/output cannot be used within a kernel, the kernel is used only to fill a char array. The string set by the kernel is then copied over to the host side, which prints it. (The code can be downloaded from http://www.fixstars.com/books/opencl)
__kernel void hello(__global char* string)
{
    string[0] = 'H';
    string[1] = 'e';
    string[2] = 'l';
    string[3] = 'l';
    string[4] = 'o';
    string[5] = ',';
    string[6] = ' ';
    string[7] = 'W';
    string[8] = 'o';
    string[9] = 'r';
    string[10] = 'l';
    string[11] = 'd';
    string[12] = '!';
    string[13] = '\0';
}
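Apart from the __kernel and __global qualifiers, the kernel body is ordinary C, so its logic can be checked on the host without any OpenCL device. The following is a minimal sketch under that assumption; hello_host is a hypothetical name that does not appear in the sample code, and it uses a loop instead of the kernel's individual assignments.

```c
#include <stddef.h>

/* Hypothetical host-side stand-in for the hello kernel (not part of the
   sample code). It writes the same 14 bytes as the kernel: the 13
   characters of "Hello, World!" followed by the terminating '\0'. */
static void hello_host(char *string)
{
    static const char message[] = "Hello, World!";
    size_t i;

    /* sizeof(message) counts the terminating '\0', so it is 14. */
    for (i = 0; i < sizeof(message); i++) {
        string[i] = message[i];
    }
}
```

The kernel writes to indices 0 through 13, so the buffer passed to it must hold at least 14 bytes. The host code allocates MEM_SIZE (128) bytes, which is more than enough.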
#include <stdio.h>
#include <stdlib.h>
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif
#define MEM_SIZE (128)
#define MAX_SOURCE_SIZE (0x100000)
int main()
{
    cl_device_id device_id = NULL;
    cl_context context = NULL;
    cl_command_queue command_queue = NULL;
    cl_mem memobj = NULL;
    cl_program program = NULL;
    cl_kernel kernel = NULL;
    cl_platform_id platform_id = NULL;
    cl_uint ret_num_devices;
    cl_uint ret_num_platforms;
    cl_int ret;

    char string[MEM_SIZE];

    FILE *fp;
    char fileName[] = "./hello.cl";
    char *source_str;
    size_t source_size;

    /* Load the source code containing the kernel */
    fp = fopen(fileName, "r");
    if (!fp) {
        fprintf(stderr, "Failed to load kernel.\n");
        exit(1);
    }
    source_str = (char*)malloc(MAX_SOURCE_SIZE);
    source_size = fread(source_str, 1, MAX_SOURCE_SIZE, fp);
    fclose(fp);

    /* Get Platform and Device Info */
    ret = clGetPlatformIDs(1, &platform_id, &ret_num_platforms);
    ret = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_DEFAULT, 1, &device_id, &ret_num_devices);

    /* Create OpenCL context */
    context = clCreateContext(NULL, 1, &device_id, NULL, NULL, &ret);

    /* Create Command Queue */
    command_queue = clCreateCommandQueue(context, device_id, 0, &ret);

    /* Create Memory Buffer */
    memobj = clCreateBuffer(context, CL_MEM_READ_WRITE, MEM_SIZE * sizeof(char), NULL, &ret);

    /* Create Kernel Program from the source */
    program = clCreateProgramWithSource(context, 1, (const char **)&source_str,
                                        (const size_t *)&source_size, &ret);

    /* Build Kernel Program */
    ret = clBuildProgram(program, 1, &device_id, NULL, NULL, NULL);

    /* Create OpenCL Kernel */
    kernel = clCreateKernel(program, "hello", &ret);

    /* Set OpenCL Kernel Parameters */
    ret = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&memobj);

    /* Execute OpenCL Kernel */
    ret = clEnqueueTask(command_queue, kernel, 0, NULL, NULL);

    /* Copy results from the memory buffer */
    ret = clEnqueueReadBuffer(command_queue, memobj, CL_TRUE, 0,
                              MEM_SIZE * sizeof(char), string, 0, NULL, NULL);

    /* Display Result */
    puts(string);

    /* Finalization */
    ret = clFlush(command_queue);
    ret = clFinish(command_queue);
    ret = clReleaseKernel(kernel);
    ret = clReleaseProgram(program);
    ret = clReleaseMemObject(memobj);
    ret = clReleaseCommandQueue(command_queue);
    ret = clReleaseContext(context);

    free(source_str);

    return 0;
}
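The host code reads the kernel source into a fixed buffer of MAX_SOURCE_SIZE bytes and does not check the result of malloc. As an alternative, the loading step could size the buffer from the file itself. The following is a minimal sketch, not the method used by the sample code; load_source is a hypothetical helper name, and it assumes the kernel file is a regular file small enough to fit in memory.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the sample code): reads a whole file
   into a newly allocated, NUL-terminated buffer. On success it returns
   the buffer and stores the number of bytes read in *size_out (if
   size_out is not NULL). On failure it returns NULL. The caller must
   free() the returned buffer. */
static char *load_source(const char *path, size_t *size_out)
{
    FILE *fp;
    char *buf;
    long len;
    size_t n;

    fp = fopen(path, "rb");
    if (!fp) {
        return NULL;
    }

    /* Determine the file length by seeking to the end. */
    if (fseek(fp, 0, SEEK_END) != 0 || (len = ftell(fp)) < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    /* One extra byte for the terminating '\0'. */
    buf = (char *)malloc((size_t)len + 1);
    if (!buf) {
        fclose(fp);
        return NULL;
    }

    n = fread(buf, 1, (size_t)len, fp);
    fclose(fp);
    buf[n] = '\0';

    if (size_out) {
        *size_out = n;
    }
    return buf;
}
```

With such a helper, the returned pointer and size would take the place of source_str and source_size in the call to clCreateProgramWithSource, and the buffer would still be released with free() at the end of main.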
The include header is located in a different directory depending on the environment (Table 3.1). Make sure to specify the correct location.
OpenCL implementation | Include Header |
AMD | CL/cl.h |
Apple | OpenCL/opencl.h |
FOXC | CL/cl.h |
NVIDIA | CL/cl.h |
The sample code uses the following preprocessor conditional so that the correct header is included in either environment.
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif
Once the program is written, we are ready to build and run it. This section describes the procedure under Linux/Mac OS X. The kernel and host code are assumed to be in the same directory.
The procedures for building vary depending on the OpenCL implementation. "path-to-..." should be replaced with the corresponding OpenCL SDK path. The default SDK path is as shown in Figure 3.2.
SDK | Path |
AMD Stream SDK 2.0 beta4 (32-bit Linux) | Path-to-AMD/ati-stream-sdk-v2.0-beta4-lnx32 |
AMD Stream SDK 2.0 beta4 (64-bit Linux) | Path-to-AMD/ati-stream-sdk-v2.0-beta4-lnx64 |
FOXC | Path-to-foxc/foxc-install |
NVIDIA GPU Computing SDK | $(HOME)/NVIDIA_GPU_Computing_SDK (Linux) |
The build commands on Linux/Mac OS X are as follows:
> gcc -I /path-to-AMD/include -L/path-to-AMD/lib/x86 -o hello hello.c -Wl,-rpath,/path-to-AMD/lib/x86 -lOpenCL (AMD, 32-bit Linux)
> gcc -I /path-to-AMD/include -L/path-to-AMD/lib/x86_64 -o hello hello.c -Wl,-rpath,/path-to-AMD/lib/x86_64 -lOpenCL (AMD, 64-bit Linux)
> gcc -I /path-to-foxc/include -L /path-to-foxc/lib -o hello hello.c -Wl,-rpath,/path-to-foxc/lib -lOpenCL (FOXC, Linux)
> gcc -o hello hello.c -framework OpenCL (Apple, Mac OS X)
> gcc -I /path-to-NVIDIA/OpenCL/common/inc -L /path-to-NVIDIA/OpenCL/common/lib/Linux32 -o hello hello.c -lOpenCL (NVIDIA, 32-bit Linux)
> gcc -I /path-to-NVIDIA/OpenCL/common/inc -L /path-to-NVIDIA/OpenCL/common/lib/Linux64 -o hello hello.c -lOpenCL (NVIDIA, 64-bit Linux)
Alternatively, you can use the Makefile included with the sample code to build the OpenCL code on the various platforms, as shown below.
> make amd (Linux)
> make apple (Mac OS X)
> make foxc (Linux)
> make nvidia (Linux)
This should create an executable named "hello" in the working directory. Run the executable as follows. If successful, you should get "Hello, World!" on the screen.
> ./hello
Hello, World!
This section walks through the build and execution process using Visual C++ 2008 Express in a 32-bit Windows Vista environment. The OpenCL header file and library can be added to a project using the following steps.
1. From the project page, go to "C/C++" -> "General", then add the following in the box for "Additional Include Directories":
C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK\OpenCL\common\inc
C:\Program Files\ATI Stream\include
2. From the project page, go to "Linker" -> "General", and in the box for "Additional Library Directories", add the following.
C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK\OpenCL\common\lib\Win32
C:\Program Files\ATI Stream\lib\x86
3. From the project page, go to "Linker" -> "Input", and in the box for "Additional Dependencies", type the following.
OpenCL.lib
These settings should apply to All Configurations, which can be selected from the pull-down menu located in the top left corner.
The environment should now be set up to build OpenCL code. Build and run the sample code, and make sure you get the correct output.