
I finally have a working example for CUDA. Technically it was not very complicated: it is just a DLL that calls the appropriate CUDA functions.

 

You will need the following tools:

 

I. Hardware.

 

Of course you will need a GPU. I have an NVidia GeForce 8600 GT adapter. It is not the fastest, but it is good enough for experiments.

If you have a different card, check that it is CUDA-enabled.

 

 

II. Software.

 

1. CUDA compiler and libraries.

This toolkit can be downloaded for free from CUDA Zone.

I have version 1.1 (version 2.x is available right now). I did not use the latest version because of my second video card, an NVidia GeForce FX 5200: the latest driver is not compatible with it.

In general you need three components:

- CUDA Driver

- CUDA Toolkit

- CUDA Code Samples.

 

2. Microsoft Visual Studio.

I have Microsoft Visual Studio 2005. MSVC is required for CUDA. Version 6 is not supported.

 

In general, the two tools above are enough to build a CUDA-enabled DLL that can be called from LabVIEW, but I will use NI CVI as the environment (it is not as heavy as MSVC, and I just love it).

 

Installation

 

Well, first you need to install CUDA. I assume you are smart and experienced enough to get CUDA installed; it is not very complicated. Start by installing a video driver with CUDA support. As far as I know, the latest NVidia drivers are delivered with CUDA support. In my case I downloaded and installed NVidia driver 169.21.

Then I installed CUDA Toolkit 1.1 and CUDA SDK 1.1. All of these tools were obtained from NVidia. You can also play with version 2.x, of course.

 

Installation path - C:\CUDA

 

When you are ready, check that CUDA is properly installed. Just run C:\CUDA\SDK\bin\win32\Release\deviceQuery.exe. I got the following output:

Nvidia device.png

Well, the test passed. Now you can play with the other examples in the C:\CUDA\SDK\bin\win32\Release folder.

 

Device Query from LabVIEW

 

I will try to get it running with the CVI IDE. It would probably be a little easier to build the DLL directly from MSVC, but the simple way is not interesting...

 

Well, step by step: first start CVI, then select New Project from Template:

CUDA CVI 01.png

 

Now we need a DLL, of course, so I will create a new project here:

 

CUDA CVI 02.png

 

The result of the wizard is an almost empty project:

 

CUDA CVI 03.png

 

Well, now I will copy the original deviceQuery code from the CUDA samples:

c:\CUDA\SDK\projects\deviceQuery\deviceQuery.cu to c:\CUDA\SDK\projects\deviceQueryCVI\deviceQuery_cuda.c

 

Note that I have renamed the *.cu file to a *.c file! Otherwise CVI will not highlight the syntax of this file.

 

Now I will add this file to the project:

 

CUDA CVI 04.png

 

Here is one important point: this file will be compiled with the CUDA compiler, so I will exclude it from CVI compilation:

CUDA CVI 06.png

 

Now I will modify this file. I will create two functions, GetDevCount and DevQuery. The first returns the number of supported GPUs, and the second returns some of their parameters:

 

CUDA CVI 07.png

Both functions are declared in deviceQuery_cuda.h.
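Since the listing above is only available as a screenshot, here is a minimal sketch of what two such functions might look like. It uses the CUDA runtime calls cudaGetDeviceCount and cudaGetDeviceProperties; the parameter lists, the return codes and the USE_CUDA_RUNTIME switch with its stand-in runtime are assumptions made for illustration, not the code from the attachment.

```c
#include <assert.h>
#include <string.h>

#ifdef USE_CUDA_RUNTIME
#include <cuda_runtime.h>   /* real runtime, when built with nvcc */
#else
/* Stand-in for the few runtime pieces used below (assumption, lets the
 * sketch compile on a machine without the CUDA toolkit). */
typedef int cudaError_t;
#define cudaSuccess 0
struct cudaDeviceProp { char name[256]; size_t totalGlobalMem; int major; int minor; };
static cudaError_t cudaGetDeviceCount(int *count) { *count = 1; return cudaSuccess; }
static cudaError_t cudaGetDeviceProperties(struct cudaDeviceProp *p, int dev)
{
    if (dev != 0) return 1;
    memset(p, 0, sizeof *p);
    strcpy(p->name, "Stub GPU");
    p->totalGlobalMem = 256u * 1024u * 1024u;
    p->major = 1;
    p->minor = 1;
    return cudaSuccess;
}
#endif

#ifdef __cplusplus
extern "C" {   /* nvcc compiles .cu files as C++; keep C linkage for the C wrapper */
#endif

/* Returns the number of CUDA devices, or -1 if the runtime call fails. */
int GetDevCount(void)
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) return -1;
    return count;
}

/* Copies a few properties of device `dev` into caller-provided outputs.
 * Returns 0 on success, -1 if the device cannot be queried. */
int DevQuery(int dev, char *name, int nameLen,
             int *major, int *minor, unsigned int *memMB)
{
    struct cudaDeviceProp prop;
    assert(name && nameLen > 0 && major && minor && memMB);
    if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) return -1;
    strncpy(name, prop.name, (size_t)nameLen - 1);
    name[nameLen - 1] = '\0';          /* strncpy does not always terminate */
    *major = prop.major;
    *minor = prop.minor;
    *memMB = (unsigned int)(prop.totalGlobalMem / (1024u * 1024u));
    return 0;
}

#ifdef __cplusplus
}
#endif
```

All outputs go through pointers and a caller-sized buffer, which keeps the functions easy to call from a plain C wrapper.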

 

Now I need a small wrapper that calls both functions and passes the values to LabVIEW:

 

CUDA CVI 08.png
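The wrapper is also shown only as a screenshot, so the following is a rough sketch of the idea, not the original code. It assumes the two functions have signatures like int GetDevCount(void) and int DevQuery(dev, name, nameLen, major, minor, memMB); the wrapper name CudaDeviceInfo, the return codes and the LINK_WITH_CUDA_OBJ switch are hypothetical. The key point is that LabVIEW pre-allocates the string buffer and passes its size, and the DLL only writes into it.

```c
#include <stdio.h>
#include <string.h>

/* Prototypes as they might appear in deviceQuery_cuda.h (assumed signatures). */
int GetDevCount(void);
int DevQuery(int dev, char *name, int nameLen,
             int *major, int *minor, unsigned int *memMB);

#ifndef LINK_WITH_CUDA_OBJ
/* Stand-ins so this sketch links on its own; in the real project the
 * definitions come from deviceQuery_cuda.obj. */
int GetDevCount(void) { return 1; }
int DevQuery(int dev, char *name, int nameLen,
             int *major, int *minor, unsigned int *memMB)
{
    if (dev != 0 || nameLen < 9) return -1;
    strcpy(name, "Stub GPU");
    *major = 1;
    *minor = 1;
    *memMB = 256;
    return 0;
}
#endif

/* Exported wrapper (hypothetical name). Returns 0 on success, -1 if the
 * device is missing or cannot be queried, -2 on invalid arguments.
 * `text` must be pre-allocated by the caller with `textLen` bytes. */
int CudaDeviceInfo(int dev, int *devCount, char *text, int textLen)
{
    char name[256];
    int major = 0, minor = 0;
    unsigned int memMB = 0;

    if (!devCount || !text || textLen <= 0) return -2;
    text[0] = '\0';                       /* always hand back a valid string */

    *devCount = GetDevCount();
    if (*devCount <= 0 || dev < 0 || dev >= *devCount) return -1;
    if (DevQuery(dev, name, (int)sizeof name, &major, &minor, &memMB) != 0)
        return -1;

    /* snprintf is C99; older compilers may need their own bounded formatter. */
    snprintf(text, (size_t)textLen, "%s, compute %d.%d, %u MB",
             name, major, minor, memMB);
    return 0;
}
```

With a module definition file passed to the linker through /DEF, the export is declared there, so the sketch carries no export keyword in the source.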

 

Almost done.

 

Now I need to prepare the environment.

 

Remember, we excluded one file from the build because it must be built by the CUDA compiler.

So, select Build->Build Steps...:

 

CUDA CVI 09.png

 

Then edit the Custom build action:

 

CUDA CVI 10.png

 

I entered the following lines:

 

copy deviceQuery_cuda.c deviceQuery_cuda.cu
"c:\cuda\bin\nvcc.exe" -ccbin "C:\Program Files\Microsoft Visual Studio 8\VC\bin"  -I"C:\CUDA\SDK\common\inc"  -c -o deviceQuery_cuda.obj -arch sm_11 deviceQuery_cuda.cu

The first line copies the *.c file to a *.cu file (unfortunately, the *.cu extension matters to the CUDA compiler).

The second line calls the CUDA compiler and compiles deviceQuery_cuda.cu to an object file.

 

Now it is time for the first build, to check that the custom build action works:

 

CUDA CVI 11.png

 

You should see something like this:

 

CUDA CVI 12.png

 

Almost done. Now we need to link the object files into a DLL.

The problem is that the CVI linker does not work properly for me: the DLL is generated, but it does not work with CUDA.

So I will add a post-build step with the Microsoft linker:

 

CUDA CVI 13.png

 

The command line for the post-build action is the following:

"C:\Program Files\Microsoft Visual Studio 8\VC\bin\link.exe" deviceQuery_cuda.obj cvibuild.deviceQuery\deviceQuery.niobj /OUT:"deviceQuery.dll" /NODEFAULTLIB:"LIBC.LIB" /INCREMENTAL:NO /NOLOGO /LIBPATH:"C:\CUDA\lib" /LIBPATH:"C:\CUDA\SDK\common\lib" /LIBPATH:"C:\Program Files\Microsoft Visual Studio 8\VC\lib" /LIBPATH:"C:\Program Files\Microsoft Visual Studio 8\VC\PlatformSDK\Lib" /LIBPATH:"C:\Program Files\Intel\Compiler\11.0\066\cpp\lib\ia32" /DEBUG /PDB:"c:\CUDA\SDK\bin\win32\Release\deviceQuery.pdb" /SUBSYSTEM:CONSOLE /DLL /DEF:deviceQuery.def /OPT:REF /OPT:NOICF /MACHINE:X86 /ERRORREPORT:PROMPT cudart.lib cutil32.lib kernel32.lib user32.lib gdi32.lib

 

 

If everything was done accurately and you are a bit lucky, you will get the following output after pressing Ctrl+M:

 

CUDA CVI 14.png

 

And the DLL will be created.

 

Now I can call both functions from LabVIEW:

 

CUDA CVI 15.png

 

And finally it works:

 

CUDA CVI 16.png

 

If you take a look into the c:\cuda\sdk\projects\ folder, you will find more interesting code samples.

 

For example, histogram calculation (located in the histogram64 and histogram256 subfolders).

 

Exactly as described above, I have created a CVI project for histogram computation:

 

CUDA CVI 17.png

The source code looks like this:

 

CUDA CVI 18.png

 

In the code above we need to initialize the GPU, upload our image data to the GPU, and then perform the computation.

 

What I have learned: the GPU is fast. For example, the 8600 GT is nearly twice as fast as an Intel Dual Core.

I have compared different histogram implementations and got the following results:

 

CUDA CVI 19.png

 

As you can see, the GPU needed 7 ms to compute a 64-bin histogram versus 15-20 ms on the PC (these are pure computation times, without image transfer).
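For reference, the CPU side of such a comparison can be as small as the sketch below. It assumes 8-bit grayscale pixels and picks the bin from the top six bits of each pixel; the function name Histogram64CPU and the binning rule are assumptions for illustration, not the code that produced the timings above.

```c
#include <stddef.h>
#include <string.h>

#define HIST_BINS 64

/* CPU reference: 64-bin histogram of 8-bit pixels.
 * Bin index = pixel value with its two low bits dropped (assumption). */
void Histogram64CPU(const unsigned char *pixels, size_t count,
                    unsigned int hist[HIST_BINS])
{
    size_t i;
    memset(hist, 0, HIST_BINS * sizeof hist[0]);   /* start from empty bins */
    for (i = 0; i < count; ++i)
        hist[pixels[i] >> 2]++;                    /* 0..255 maps to 0..63 */
}
```

A loop like this is memory-bound and has no data transfer cost, which is why a fair comparison has to state whether the GPU numbers include the upload.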

 

The test project described above is in the attachment.

 

Thanks for reading,

With best regards,

 

Andrey Dmitriev.


Jul 30, 2009 5:46 AM Balze says:

Andrey,

 

you're unbelievable

 

Very nice !

 

Did you get it running in a (commercial) application yet?

 

Best regards, ex-colleague,

 

Balze

Jul 30, 2009 6:38 AM Andrey Dmitriev says in response to Balze:

Thank you, Rainer,

 

No, not yet for commercial applications.

 

There are different reasons:

First, it was tested with the old CUDA 1.1 (I can't install the latest CUDA because it is incompatible with my hardware).
Second, it makes no sense for simple calculations, because the penalties for transferring images to and from the GPU are pretty big (not measured yet). And complicated algorithms need too much low-level programming work.
Third, it may be much easier to use the high-level NVPP library instead of low-level CUDA programming. (I have NVPP, but have not tested it yet.)
And finally, OpenCL (not to be confused with OpenGL; they are completely different things) seems interesting, because theoretically we could be compatible not only with Nvidia but also with ATI GPUs.
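The transfer argument can be put as simple arithmetic: the GPU only pays off when its compute time plus the upload and download time is below the CPU time. The sketch below uses the histogram figures quoted in the post (7 ms on the GPU, 15 ms as the lower CPU figure); the function names are hypothetical and the transfer time itself was not measured, so it stays a parameter.

```c
/* Largest round-trip transfer time (ms) at which the GPU path only ties
 * the CPU path. A negative result means the GPU loses even with free
 * transfers. */
double BreakEvenTransferMs(double cpuMs, double gpuComputeMs)
{
    return cpuMs - gpuComputeMs;
}

/* Returns 1 if GPU compute plus transfer is strictly faster than the CPU,
 * otherwise 0. */
int GpuPaysOff(double cpuMs, double gpuComputeMs, double transferMs)
{
    return gpuComputeMs + transferMs < cpuMs;
}
```

With 15 ms on the CPU and 7 ms on the GPU, BreakEvenTransferMs gives 8 ms: any upload plus download that takes 8 ms or longer cancels the gain for this operation.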

 

So I will continue playing when time permits...

 

Andrey.

Aug 17, 2009 8:09 PM sumg says:

There is now a CUDA library for LabVIEW available on the NI Labs website:

http://decibel.ni.com/content/docs/DOC-6064