08-19-2009 02:44 PM
I hate to push the point, but it is a bit more than "not officially supported". It actually doesn't work or do anything except be a zero-sample read. A value of -1 is useless and indeterminate in its behavior. I believe that, treated as an error input, the system will just return the configured number of samples. It is somewhat irrelevant since I don't use it.
Second, we have gotten off the case I am interested in: I am using hardware timing. I want hardware timing. However, in the real world, hardware failures etc. cause this timing to not be achievable. I would like to use hardware timing and, when that fails, empty the buffer so I can get back to hardware timing. I do not believe that NI-DAQmx Base can do this. That is why this is a challenge! Anyone out there who can post code?
As a test harness I have my read and the standard read compared in the attached zip file.
Run the analog buffer test VI.
If you use the "standard read", it will return 2500 scans each iteration, as it should. Now use the boolean control to introduce a 1.3-second delay as some sort of OS distraction (or other hardware timeout). It will still return 2500 pts, but they become more and more stale, and finally the 20-second buffer will overflow and you will get the infamous error 42 RLP failure. It may take a few minutes for this to happen. This is the buffer overflow, not the ultimate answer to life, the universe, and everything.
For the above case, if I turn on the delay, leave it for only 20 seconds or so, and then turn it off, the system will run fast and return the stale data. This makes synchronization with other devices impossible.
Now, on my slow G4 (it used to be top of the line), if I use my "read until empty" version, the system will return 2550 pts each iteration. If I turn on the 1.3-second delay, it will then return about 3400 pts and keep the buffer empty. I will never get a buffer overflow; thus, not only am I ending up with the latest data I want instead of stale data, I have a system that doesn't fail. If I turn off the 1.3-second delay, the system will go back to happily using hardware timing and returning 2550 pts or so.
What I would like is the above behavior *except* that, in the case without the delay, the system returns exactly 2500 points. With the delay, I want the 3250 points, but that second number can be approximate, since in real life the delay is never exact. The 1.3-second figure is just an example; in real life the delay can be a few seconds, but I think it will never be more than 10 seconds (which is why I put in a 20-second buffer!).
I feel like I haven't explained this well; hopefully the attached code will show it. It is designed to work with a 32-differential-channel card (6330), but I think you can change the channel spec array for other hardware and it will still show this error.
I am fairly specific about the need here, which is for a reliable system that makes periodic measurements in the face of other equipment problems.
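To make the requirement concrete, here is a rough simulation of the dynamics described above (this is not NI driver code; all names and numbers are illustrative, loosely following the 2500 samples/iteration, 20-second-buffer example):

```python
def simulate(drain, stall_iters, iters,
             rate=2500, read=2500, cap=50000, stall_extra=3250):
    """Return (final_backlog, overflowed) after `iters` one-second loops.

    During the first `stall_iters` iterations the OS is distracted for an
    extra 1.3 s, so `stall_extra` additional samples pile up each loop.
    """
    backlog = 0  # samples sitting in the DMA buffer
    for i in range(iters):
        backlog += rate + (stall_extra if i < stall_iters else 0)
        if backlog > cap:
            return backlog, True   # buffer overflow: the RLP failure case
        if drain:
            backlog %= read        # "read until empty": whole blocks out
        else:
            backlog -= read        # standard read: one block per loop
    return backlog, False
```

With `drain=False` and a sustained stall, the backlog grows by 3250 samples per loop and overflows the 50,000-sample buffer in under 20 iterations; with `drain=True` the backlog never exceeds one read block, matching the behavior sth reports.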
08-19-2009 03:25 PM
08-19-2009 03:30 PM
09-03-2009 06:24 PM
Hi sth,
I'm looking at the timeout code for the different HW families and realized a 6330 isn't an NI card. We have a 6030E, but that doesn't quite match your description of 32 channels. To make sure we find a solution for your HW, what card are you using?
Looking at E and M Series, I can see why you're getting that timeout (especially in the E Series case) as well as the data. I'm talking to the developers about that. My initial thoughts are along the lines of your solution: do an extra read with a timeout of 0, requesting a large number of samples. If there are extra samples in the buffer, read them and toss them; if not, just time out and move on. I think you could still do this, though you'd always get that error. I also think there may be a better way, but it depends on the HW. With M Series it looks like it is possible to implement a "number of samples available" property; the other HW lines will take some more research.
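The drain-and-toss idea above can be sketched as a small loop (hypothetical names throughout; `read_block` stands in for DAQmx Base Read, and the fake below only mimics its timeout behavior):

```python
def drain_and_toss(read_block, block=1000):
    """Keep issuing zero-timeout reads and discard the extra samples;
    stop at the first timeout, the expected error mentioned above."""
    tossed = 0
    while True:
        try:
            tossed += len(read_block(block, timeout=0.0))
        except TimeoutError:
            return tossed

def make_fake_read(samples_in_buffer):
    """Hypothetical stand-in for the driver read: return n samples if
    available, otherwise raise TimeoutError immediately (zero timeout)."""
    buf = list(range(samples_in_buffer))
    def read_block(n, timeout):
        if len(buf) < n:
            raise TimeoutError("fewer than n samples available")
        out = buf[:n]
        del buf[:n]
        return out
    return read_block
```

For example, a 2500-sample backlog drains two full 1000-sample blocks before the leftover 500 triggers the timeout that ends the loop.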
Thanks,
Andrew S.
MIO DAQ Product Support Engineer
09-03-2009 08:19 PM
I'm looking at the timeout code for the different HW families and realized a 6330 isn't an NI card. We have a 6030E, but that doesn't quite match your description of 32 channels.
09-04-2009 05:18 PM
Hi Scott,
Your "DAQmx Base Read Until Empty" VI is nearly correct given the limitations of the driver; the only change you need to make is to the timeout value for the second and subsequent iterations of the while loop. Instead of 0 or near zero, you should use a larger value that is still less than the length of time it would take the requested number of samples to arrive.
I think you have idealized the processes involved in general DAQ programming in this case, so let me lay out a thought experiment to make my case. To simplify comparison between trials, I'll choose a simple AI task that each trial will use. This task is part of a larger system that has been written to read data once every second. The task has one channel, is sampled at 1 kHz, and has no external triggering. We thus expect that in one second, 1,000 samples will be available. We will be reading 1,000 samples each time, so each read will contain one second's worth of data.
Experiment: The Perfect Situation
The OS executes every function call and every processor instruction in zero time.
Variation 1: Start the task and then immediately read.
Since the OS responds instantaneously, it is always able to service the task exactly on time. After the task is started, it enters DAQmx Base Read and begins to wait for 1,000 samples to be available in memory. Once they arrive, the read call returns the 1,000 samples. As a result, the buffer never has excess samples left over. If we set a timeout of 1.000 second, we will never get a timeout error, since everything happens exactly on time every time. If we set a timeout of anything less than 1.000 second, the task will return a timeout error, because the DAQ hardware is not returning samples faster than 1,000/sec.
Variation 2: Start the task and then wait three seconds for the first read.
The buffer is now holding three seconds' worth of data, but will never accumulate more left-over samples. On the first call to DAQmx Base Read, the buffer will return 1,000 samples (the first second of data). Each call will return the next group of 1,000 every second, but that group of data will lag reality by three seconds, potentially causing problems in the larger application. If we used your "DAQmx Base Read Until Empty" VI here in place of DAQmx Base Read, it would work as you intended: it would read all of the data from the buffer and return it to you. The OS would instantaneously query how many samples were in the buffer, see that more than 1,000 were available, and pull them into LabVIEW memory, all in zero time. It would continue to do this until fewer than 1,000 samples remained in the buffer. At that point, with a zero timeout, the DAQ hardware would not be able to fill the buffer back up to at least 1,000 samples, and the read call would report a timeout error since 1,000 samples weren't available. This causes your VI to exit and return the data, and you have successfully cleared the data backlog. Subsequent reads will return the most recently acquired second's worth of data.
I think you may see where I'm going with this 'instantaneous OS' -- a zero timeout is impossible in the real world since even a no-op instruction consumes time on a processor. Querying the samples available and comparing them with how many are requested takes a non-zero amount of time. Moreover, copying the data from the DMA buffer to LabVIEW takes a non-zero amount of time.
I'll summarize variation two's perfect behavior in a small table. These parameters apply to the situation when DAQmx Base Read Until Empty is called, at the moment when the OS first enters the call.
| time (sec) | timeout | samples in DMA buffer | data returned? | error? |
|------------|---------|-----------------------|----------------|--------|
| 3.000000   | 1.000   | 3,000                 | yes            | no     |
| 3.000000   | 0.000   | 2,000                 | yes            | no     |
| 3.000000   | 0.000   | 1,000                 | yes            | no     |
| 3.000000   | 0.000   | 0                     | no             | yes    |
Here's the distinction I'm trying to make: when the buffer has fewer samples than requested, DAQmx Base Read and the OS must wait for the DAQ hardware to make samples become available. When the buffer has more samples than requested, the OS doesn't need to wait on the DAQ hardware and can retrieve the data immediately, and thus timeout is irrelevant if it's 1, 10, or 100 seconds: the only time taken by the call to get the data is the time used to query, compare, and copy.
Your timeout in the DAQmx Base Read Until Empty VI needs to be greater than this time. Once the timeout is greater than this, DAQmx Base Read will retrieve the data as quickly as the OS can shuttle bytes in RAM; the DAQ board does not affect the length of the call.
On the other hand, there is an upper bound for the timeout, as you know. Our goal is to empty the buffer, and that means eventually calling DAQmx Base Read when there are fewer samples than we expect (0-999). This means the OS has caught up to the DAQ board to the point where it has less than a one-second lag. Once we hit this situation, the OS will poll the DMA buffer for samples available. If the timeout is too large, say two seconds, then the DAQ board will have two seconds to add more samples to the DMA buffer; even if there were 0 left-over samples, the DAQ board will have added another 1,000 before two seconds have passed, and the read will succeed. With this timeout value, your Read Until Empty VI will never encounter an error, and it will only stop once LabVIEW has consumed all of your RAM 😉
So the key is to pick a timeout value such that the OS can have its overhead but the DAQ board cannot keep pace once the OS catches up to it. The window of values that work begins at the maximum amount of time it takes the OS to query, compare, and copy; the window ends at the amount of time it takes the DAQ board to fill the buffer with the number of samples you want.
The value that you choose determines how many samples will be left behind when you finally error out on a timeout. If you choose a half-second timeout, then the read will succeed if there are 501-999 samples in the buffer, leaving 1-499 samples in the buffer on exit. Upon the next read, the timeout error will happen, since at most 501-999 samples will be in the buffer at the end of the polling.
The shorter the timeout, the fresher your data. Here's another chart comparing a timeout of 500 ms versus 100 ms, beginning at the call to read in which the OS catches up to the DAQ hardware:
| Samples avail when...                 | 500 ms timeout | 100 ms timeout |
|---------------------------------------|----------------|----------------|
| ...entering read                      | 501..999       | 901..999       |
| ...completing read                    | 1..499         | 1..99          |
| ...entering read again and timing out | 501..999       | 101..199       |
With a 500 ms timeout, it is possible for 999 samples to be left in the buffer after the read times out. With a 100 ms timeout, at most 199 samples can be left behind. Since the OS can't act instantaneously, the timeout must be greater than zero. Indeed, even if it could, the DAQ board would have pushed more samples into the buffer by the time the OS completed the copy to LabVIEW memory. There will always be left-over samples on a non-RT system 🙂
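Joe's window arithmetic can be reproduced numerically. This sketch uses his simplification that the board adds `timeout * rate` samples while the read polls (hypothetical function, not driver code); it recovers the 500 ms and 100 ms ranges in the chart above, and also shows the runaway case where a too-large timeout never errors:

```python
def read_outcome(entering, timeout, rate=1000, request=1000):
    """Return (succeeded, samples_left_in_buffer) for one read call,
    assuming the board adds timeout*rate samples during the wait."""
    added = int(timeout * rate)
    if entering + added > request:
        return True, entering + added - request   # read got its 1,000
    return False, entering + added                # timed out; backlog kept

# Entering-sample counts for which a 500 ms read succeeds:
succeeds = [n for n in range(1000) if read_outcome(n, 0.5)[0]]
# per the model, these span 501..999, matching the chart
```

With `timeout=2.0`, `read_outcome(0, 2.0)` still succeeds, which is Joe's never-erroring, RAM-consuming scenario.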
Joe Friedchicken
NI Configuration Based Software
Get with your fellow OS users: [ Linux ] [ macOS ]
Principal Software Engineer :: Configuration Based Software
Senior Software Engineer :: Multifunction Instruments Applications Group (until May 2018)
Software Engineer :: Measurements RLP Group (until Mar 2014)
Applications Engineer :: High Speed Product Group (until Sep 2008)
09-04-2009 05:21 PM
I have a little more to add. It looks like I ran into the maximum post length 🙂
Obviously, this problem wouldn't exist if there were a way to directly ask DAQmx Base how many samples there were in the buffer before reading, and then just request that amount. If you poke around the read VIs, there is a mechanism we use to do just that.
Drill down to "ESeries -- AI DMA Read Data DMA 2D.vi" and you'll find an invoke node for "DMA Read u16". There's an unwired output called "Samples Left in Buffer". This output indicates the total number of samples in the buffer for all channels. To determine how many samples are available for each channel, you would need to divide by the number of channels in your scan list. It's possible to bring this indicator up to the umbrella DAQmx Base Read VI for E and M Series, but our other hardware (namely USB) doesn't communicate so directly.
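The divide-by-channel-count step is simple but easy to get backwards, so here it is spelled out (a hypothetical helper, since DAQmx Base itself does not expose this value):

```python
def samples_per_channel(samples_left_in_buffer, num_channels):
    """Convert the total 'Samples Left in Buffer' count to a per-channel
    count for an interleaved scan list: every scan contributes one
    sample per channel, and a partial scan rounds down."""
    if num_channels <= 0:
        raise ValueError("need at least one channel")
    return samples_left_in_buffer // num_channels
```

For a 32-channel scan list, 3,200 total samples left in the buffer means 100 complete scans are available per channel.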
All in all, I would say there's a feature request for Base in this mess. My preference is to tell DAQmx Base to read and return the entire buffer by passing '-1' as the number of samples to read. I'm not sure if this is something we can do, since USB doesn't play so nicely, but it's worth tracking at the very least.
What are your thoughts?
Joe Friedchicken
09-10-2009 11:33 AM
Joe,
Thanks. It took me a while to chew through all that information. Having written some DAQ drivers for the old PDP-11 series, it is not that I have idealized the situation to zero processing time. It may be a difference in how we account for that processing time that has led to my misunderstanding.
I am used to a driver not polling the hardware. Not ever. This is a wasteful, inefficient, and bad way to write a driver. It may be the easy way that NI has used for the NI-DAQmx Base driver, but it is a fundamental flaw that shows up here. The driver should pass the request to the PCI(e) card with a DMA address, a timeout, and a count. All that polling of the DMA engine should go away, notifying the user only when the interrupt occurs telling you that the count items have been retrieved.
Thus the driver never sees any data past the count number of points. All this race condition with the driver waiting a full timeout to tell you that the buffer has overflowed while claiming there aren't enough samples goes away. Polling is bad because of race conditions, inefficiencies, and just general kludginess!!!
That being said, there is not much chance that a more efficient driver will appear in the near future.
Where we are disagreeing on behavior can be traced to the "time to transfer data" being counted as part of that timeout. I figured that timeout was passed to the DMA engine itself. So there is now a problem where the timeout can occur even if the samples were available but there wasn't time to transfer the data to the user buffer. Thus my request with a 0-second timeout should have two cases. I am using your 1000-sample blocks as an example.
This didn't work, since I got a timeout with 0 seconds and more than 1000 items in the buffer. So I added a 1 ms timeout as a minimal time. On a modern GHz machine this should allow enough time for the transfer of 10,000 samples even with a huge amount of overhead. Now, in the two cases
# of samples in buffer < 1000, read 1000 with 2 second timeout
# of samples in buffer > 1000, read all the samples and return immediately to try to get back on track.
the problem is that everything I add to handle the second case adds overhead in the first case, unless I know beforehand which case I am dealing with. I am not sure how to phrase a feature request for this without getting it kicked out immediately because of the problem with your USB-based devices, which don't have the peek-ahead ability (or for which it would be very hard to do).
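The two-case policy above can be sketched as follows. Both `samples_available` and `read_block` are hypothetical stand-ins: the first is exactly the peek that DAQmx Base lacks, the second plays the role of DAQmx Base Read:

```python
def adaptive_read(samples_available, read_block, request=2500):
    """Pick the read strategy from the backlog size.

    read_block(n, timeout) returns up to n samples; samples_available()
    returns the per-channel count currently buffered."""
    avail = samples_available()
    if avail < request:
        # case 1: on schedule -- one hardware-timed block, generous timeout
        return read_block(request, timeout=2.0)
    # case 2: backlog -- grab everything available to get back on track
    return read_block(avail, timeout=0.001)
```

The single `samples_available()` call is the only overhead case 2 imposes on case 1, which is the crux of the complaint: without a real peek, that overhead becomes a full extra read-and-timeout cycle.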
I can drill down to the E Series DMA calls and get that number. It may be possible to ask for 0 samples with a 0 timeout and then get that number through the LV application server. I have found that it is hard to maintain a modification to NI-DAQmx Base through system upgrades. This is both good and bad. :-)
09-10-2009 03:23 PM
Joe Friedchicken
09-10-2009 04:26 PM
Joe F. wrote:
Your critique of the DMA architecture for E Series under DAQmx Base is well founded. The driver uses a software loop to poll the DMA chip and ask how many samples it has pushed to host memory. However, perhaps unlike PDP-11 boards, the E Series boards do not have a concept of timeout in their hardware.
Just to correct a misstatement in my original post: trying to remember back to that earlier incarnation of OS programming, I believe the timeout was set as a system timer in the driver. You would launch a DMA transfer, set the timer, and then relinquish the CPU waiting for either event. If the DMA completed, you cancelled the timer request; if the timer fired, you cancelled the DMA transfer. The PDP-11 series was probably discontinued around 198? and before NI had analog boards. I think the boards I used were mainly from Data Translation (they are still around), but these were very rudimentary boards.
I believe that OS X kernel programming uses a "WorkLoop" concept that is similar. Nowadays you can use the nanosecond clocks for timing (not that you really get nanosecond precision on these), but there are mechanisms to do a similar operation.
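The launch-DMA-then-wait-for-either-event pattern described above can be sketched in user space with a `threading.Event` standing in for the kernel work loop (hypothetical names; a real driver would do this with interrupts and a kernel timer, not threads):

```python
import threading

def wait_dma_or_timeout(start_dma, timeout_s):
    """start_dma(done) launches the transfer and returns a cancel
    callable; the transfer sets `done` on completion. Returns True if
    the DMA finished first, False if the timer won and the transfer
    was cancelled -- exactly one of the two events 'wins'."""
    done = threading.Event()
    cancel = start_dma(done)
    if done.wait(timeout_s):
        return True        # DMA completed; the timer is implicitly dropped
    cancel()               # timer fired first; cancel the pending transfer
    return False
```

The point of the pattern is that the CPU sleeps between the two events instead of polling the DMA engine, which is the efficiency argument made earlier in the thread.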