Multifunction DAQ


Where are nidaqmx low level functions?

Hi,
After many tests with NI-DAQmx on a 6220 (M Series) board, one issue came up that seems to be critical.

Let's talk about AI.
1) In DAQmx, you set the input range when creating the task.

2) It seems that on the M Series boards (all of them?) there is no per-channel gain setting as on the E Series boards (all of them?), and the functions:
    DAQmxSetAIGain(TaskHandle taskHandle, const char channel[], float64 data);
    and
    DAQmxGetAIGain(TaskHandle taskHandle, const char channel[], float64 *data);
    return an error.

3) In our software using the Traditional driver and a 6023E board, we run FOUR acquisition threads at the same time, each one for a pair of AI channels (ai0:1, ai2:3, ai4:5 and ai6:7), and each one with possibly DIFFERENT gains (or input ranges).

4) If we create four tasks, each one for a pair of input channels, in order to set different input ranges, we CANNOT run the four tasks at the same time, because there are not enough hardware resources to run more than one task at a time. If you try to start a second task, it returns an error saying the resources are in use.

5) And if you set up one task for all channels (and pick the data pairs out of the buffer), you are restricted to the SAME input range for all channels.

Conclusion: the 6220 + NI-DAQmx cannot do what we do with the 6023E + Traditional NI-DAQ. Probably NI-DAQmx + an E Series board could.

Am I wrong on something? (I would like to be...)

LMP


Message 21 of 62
Hi MarkFrot, please read the message above, if you can help.
Thanks,
LMP

Message 22 of 62

Yes, you are wrong on something. ;)

Look at the post http://forums.ni.com/ni/board/message?board.id=250&message.id=34247. With a single task, you can specify a different gain (by using a different RangeMin, RangeMax) for each DAQmxCreateAIVoltageChan call. So, if you had 4 channels with different range settings, you would call DAQmxCreateAIVoltageChan 4 different times, pointing to the same task, and then start the task.
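
For illustration (a sketch, not Dennis's exact code; the device name "Dev1" and the specific ranges are just assumptions, and error checking is omitted), the single-task, multi-range setup in C could look like this:

    #include <NIDAQmx.h>

    int main(void)
    {
        TaskHandle task = 0;
        DAQmxCreateTask("", &task);

        /* One task, four channel pairs, each pair with its own input range (min/max in volts). */
        DAQmxCreateAIVoltageChan(task, "Dev1/ai0:1", "", DAQmx_Val_Cfg_Default, -10.0, 10.0, DAQmx_Val_Volts, NULL);
        DAQmxCreateAIVoltageChan(task, "Dev1/ai2:3", "", DAQmx_Val_Cfg_Default,  -5.0,  5.0, DAQmx_Val_Volts, NULL);
        DAQmxCreateAIVoltageChan(task, "Dev1/ai4:5", "", DAQmx_Val_Cfg_Default,  -1.0,  1.0, DAQmx_Val_Volts, NULL);
        DAQmxCreateAIVoltageChan(task, "Dev1/ai6:7", "", DAQmx_Val_Cfg_Default,  -0.2,  0.2, DAQmx_Val_Volts, NULL);

        /* One sample clock for the whole task; all channels share the same rate. */
        DAQmxCfgSampClkTiming(task, "", 20000.0, DAQmx_Val_Rising, DAQmx_Val_ContSamps, 1000);
        DAQmxStartTask(task);

        /* ...read data, then DAQmxStopTask/DAQmxClearTask when done... */
        return 0;
    }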

Message 23 of 62
I think the documentation (especially the C Function Reference) should state such things, or at least make them clearer.
Thanks Dennis,
LMP
 
 


Message 24 of 62
I can't comment that much on the C documentation. I do refer to it occasionally, but since the vast majority of my work is with LabVIEW, I can take advantage of that and the large number of examples that come with LabVIEW. The type of questions you have are also very common with LabVIEW programmers making the switch. You may not be able to view the code, but maybe some of the explanations can help.
Message 25 of 62
Hi, another one...

In the Traditional driver, to be really fast, we use a callback and pick the SPECIFIC data we need directly out of the circular buffer the board uses to store the data (we do not use the NI-DAQ functions to copy the data). Doing so we can scan faster, run multiple acquisitions and get ONLY the data we need. It is simple and very fast.
Because in the Traditional driver we set up the buffer ourselves, we have its address.
And the callback function gives us the current scan number (position, index, whatever...).
But... in the "beloved" NI-DAQmx... I did not find a way to get the buffer address. It seems I can use DAQmxGetReadCurrReadPos(TaskHandle taskHandle, uInt64 *data); to get the position, but without the buffer address we have to rely only on the NI-DAQmx functions to get the data, which is not as good as the method we use in the Traditional driver.

Is there a way to know the circular * buffer address * where the board automatically writes the data, so that we can avoid slow functions like DAQmxReadBinaryI16(task1, NSCANS, 1.0, DAQmx_Val_GroupByScanNumber, @indata[pint], maxIndata, @read, nil)?

Thanks,
LMP

Message 26 of 62

To get the current write position, I think this is the function: DAQmxGetReadTotalSampPerChanAcquired().
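
For reference, a small C sketch of these read-position properties (assuming a running task handle); they give sample counts only, not a pointer into the buffer:

    #include <stdio.h>
    #include <NIDAQmx.h>

    /* Report how far the acquisition and the application reads have progressed. */
    void ReportReadPosition(TaskHandle task)
    {
        uInt64 totalAcquired = 0, readPos = 0;
        uInt32 backlog = 0;

        DAQmxGetReadTotalSampPerChanAcquired(task, &totalAcquired); /* samples/channel written by the device so far */
        DAQmxGetReadCurrReadPos(task, &readPos);                    /* samples/channel already read by the application */
        DAQmxGetReadAvailSampPerChan(task, &backlog);               /* samples/channel still waiting in the buffer */

        printf("acquired=%llu read=%llu backlog=%lu\n",
               (unsigned long long)totalAcquired,
               (unsigned long long)readPos,
               (unsigned long)backlog);
    }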

But still I do not have the buffer address.

LMP

 

Message 27 of 62
Hi,

In the Traditional driver, I am able to use a callback to get samples at 20 kHz continuously while reading 8 channels (4 acquisitions of 2 channels each).
As I wrote in the previous message, I do not use NI-DAQ functions to get the data, but pick the data up directly from the buffer, because in the Traditional driver we have the buffer address and the scan position. We use this method because it is faster.

In addition to the lack of a buffer address (I didn't find a way to get it -- a big fault?), the maximum callback rate in NI-DAQmx is much slower than in the Traditional driver! Doing some tests on a 2 GHz machine, I can barely handle callbacks at 7 or 8 kHz, with a useful maximum of 2 or 3 kHz (6220, M Series), much less than the 20 kHz I get easily in the Traditional driver (6023E), with valuable spare time left over to do useful work inside the callback.

And now? What can the NI gurus tell us, please?
Is this speed difference because of NI-DAQmx or the boards? Aren't the M Series sold as "faster boards"? Or are the NI-DAQmx drivers really much slower than the Traditional drivers?
Until now, except for the easier task configuration, I haven't seen any advantage in using NI-DAQmx over the Traditional driver for the applications we build. Only problems.

LMP

Message 28 of 62
 

Hi Merlin,
You're right - DAQmx doesn't support allowing the application to allocate the acquisition buffer, which, like you said, has the benefit of allowing zero-copy reads. There are a couple of good reasons we haven't implemented this in DAQmx. The most important is this: after some benchmarking, we determined that DAQmx gets a big boost in performance if it allocates the buffer itself. If the DAQmx driver allocates the buffer, it can ensure the buffer is page-locked, page-aligned, and allocated in the lower 32 bits of system memory; all these qualities ensure that the underlying DMA transfer is very fast.
 
Like you, we were initially concerned that requiring applications to "copy" the data using the Read call would add significant overhead. However, after more benchmarking, we demonstrated that the driver can sustain very high acquisition rates, up to 80 Mbytes/s, even without supporting "zero copy". Since 80 Mbytes/s is faster than any device supported by DAQmx, we have never bothered to provide a way for applications to get the buffer pointer. This wasn't a trivial decision. With DAQmx, supporting direct access to the buffer would require a lot more documentation than you might think - for example, for some device combinations, the driver uses more than one buffer per task. Also, without "locking" a region of the buffer during the zero-copy operation, it is very tricky for applications to prevent an accidental overrun of the accessed buffer area. And we don't like the idea of applications getting bad data without noticing it.
 
The good news is that, if you are using a single M-Series board, I suspect the built-in DAQmxReadBinaryI16 (or even one of the scaled functions) allows plenty of throughput for your application. 
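
As an illustration of that pattern (a sketch, not Jonathan's or LMP's code; the 8-channel layout, the block size and the names are assumptions, and error checking is omitted), an Every N Samples callback that copies each block out with DAQmxReadBinaryI16 might look like this in C:

    #include <NIDAQmx.h>

    #define NCHANS 8
    #define NSCANS 1000   /* scans per callback; larger blocks amortize the per-read overhead */

    static int16 indata[NSCANS * NCHANS];

    /* Called by DAQmx every NSCANS samples per channel; copy the block out with one Read. */
    int32 CVICALLBACK EveryNCallback(TaskHandle task, int32 eventType,
                                     uInt32 nSamples, void *callbackData)
    {
        int32 read = 0;
        DAQmxReadBinaryI16(task, NSCANS, 1.0, DAQmx_Val_GroupByScanNumber,
                           indata, NSCANS * NCHANS, &read, NULL);
        /* ...pick out only the channel pairs of interest from indata here... */
        return 0;
    }

    /* Register after configuring the task and before DAQmxStartTask. */
    void RegisterEveryN(TaskHandle task)
    {
        DAQmxRegisterEveryNSamplesEvent(task, DAQmx_Val_Acquired_Into_Buffer,
                                        NSCANS, 0, EveryNCallback, NULL);
    }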

One gotcha you should know (it sounds like you have already run into it) is that each DAQmx Read call for a PCI M-Series device has an overhead of about 50 microseconds (on my 1.6GHz test machine), regardless of how many samples you acquire.  Compared with Traditional, DAQmx does more setup work before the read call; however, the copy and scaling code runs very fast (faster than Traditional DAQ in many cases). 
 
An important implication of this per-read overhead is that the more samples you read at a time, the better you amortize the cost of each Read call, so the number of samples you read per call determines the maximum acquisition rate you can sustain. The maximum rate you should expect to be able to call Read is about 2500 times a second on a computer like mine.
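
To put rough numbers on that, using the ~50 microsecond figure above: at 20 kS/s per channel, reading 1000 samples per channel per call means 20 Read calls and about 1 ms of Read overhead per second of data, whereas calling Read once per scan (20,000 calls a second) would cost roughly 1 s of overhead per second, which is why a few thousand calls per second is the practical ceiling.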
 
It sounds like you have written your application such that you want to call Read very often (specifically, you benchmarked that with Traditional you can read 20,000 times a second). Can you describe what you are doing in your application that would benefit from calling Read more often than 2500 times a second?
 
(My hidden agenda here is that if I can understand better how applications like yours will benefit from frequent calls to Read, then I can convince my manager that it's worth the time to implement some optimizations - we already know some things we can do to reduce the overhead.)
 
Hope this helps,
-- Jonathan

PS, if you are using the unscaled (e.g. BinaryI16) read calls with M Series, be sure you correctly apply the scaling coefficients for the device. M Series uses a polynomial scaling function, not a linear one like E Series. The polynomial scaling coefficients are a channel property, and you can read them back.
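
For example, a sketch of reading back and applying those coefficients (the channel name is an assumption, and in a real program you would cache the coefficients rather than query them for every sample):

    #include <NIDAQmx.h>

    /* Convert one unscaled M Series sample to volts using the device scaling polynomial. */
    float64 UnscaledToVolts(TaskHandle task, int16 raw)
    {
        float64 c[4] = {0};   /* coefficients, ordered by increasing power of the raw value */
        DAQmxGetAIDevScalingCoeff(task, "Dev1/ai0", c, 4);
        float64 x = (float64)raw;
        return c[0] + c[1]*x + c[2]*x*x + c[3]*x*x*x;
    }
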
Message 30 of 62