FPGA DMA transfers....1 channel or three?

tartan5 · 09-18-2007

Hello all,

I am developing an application on a PXI-7811R FPGA. I am acquiring 96 digital inputs at a 4 MHz rate, for a data rate of approx. 46 MB/sec.
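As a quick sanity check of the figure above, here is a minimal sketch of the arithmetic, using only the numbers given in the post (96 bits per sample, 4 MHz). The quoted "approx 46 MB/sec" matches if MB is read as 2^20 bytes; in units of 10^6 bytes the same rate is 48 MB/s.

```python
# Figures taken from the post: 96 digital inputs sampled at 4 MHz.
BITS_PER_SAMPLE = 96
SAMPLE_RATE_HZ = 4_000_000

# 96 bits = 12 bytes = three U32 words per sample.
bytes_per_second = (BITS_PER_SAMPLE // 8) * SAMPLE_RATE_HZ
u32_per_second = (BITS_PER_SAMPLE // 32) * SAMPLE_RATE_HZ

print(f"{bytes_per_second / 1e6:.1f} MB/s (10^6 bytes)")
print(f"{bytes_per_second / 2**20:.1f} MiB/s (2^20 bytes)")
print(f"{u32_per_second / 1e6:.0f} million U32 elements/s in total")
```

Whether the data goes over one channel or three, the total number of U32 elements per second is the same; only how they are divided among the FIFOs changes.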

Currently, I have set up 3 DMA channels. During each sample I shove 32 bits into each DMA channel. I haven't finished yet, but it seems like it is going to work.

Would there be any benefit to using only 1 channel, and doing three writes? This is a little more difficult, as I am inside a single cycle loop, but once my data is acquired I can pipeline the data and do the shoves as I am setting up for the next acquisition.

Does this save any overhead? If so, any idea how much? Would I have fewer problems (if I have any...) with DMA channels waiting for one another?

On another note, does anyone know if I'll be able to stream the data from the host FIFO to disk at this rate?

Thanks!

Wiebe@CARYA · 09-19-2007

I don't think this will gain much speed, but that's just based on a gut
feeling.

You have to be careful if you do this. My guess is you want to know which
process (1, 2, or 3) wrote the value. So you don't have one I32, but two.
Since all things on the FPGA happen in parallel, two processes might be
writing at the same time, so you get ID1, ID2, Value1, Value2, instead of
ID1, Value1, ID2, Value2. You could solve this by passing 64-bit integers,
but that won't benefit the speed (although it could probably still be done
in one cycle).
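A minimal sketch of the 64-bit idea, seen from the host side: the writer ID goes in the upper 32 bits and the value in the lower 32 bits, so each FIFO element carries its own tag. The function names `pack` and `unpack` are hypothetical, and the value is treated as a raw 32-bit pattern (a signed I32 would need reinterpreting after unpacking).

```python
MASK32 = 0xFFFFFFFF

def pack(writer_id, value):
    """Combine a 32-bit writer ID and a 32-bit value into one 64-bit word."""
    return ((writer_id & MASK32) << 32) | (value & MASK32)

def unpack(word):
    """Split a 64-bit word back into (writer_id, value)."""
    return (word >> 32) & MASK32, word & MASK32

word = pack(2, 0xDEADBEEF)
writer_id, value = unpack(word)
print(hex(word), writer_id, hex(value))
```

Because the ID and the value travel in the same element, they cannot be separated by another process writing in between. The cost is that every value takes twice the FIFO bandwidth.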

I had to send some information to my host a while ago. This information was
a number of u64's, so I figured I might get the above problem. Also, I had 5
processes, so I couldn't use DMA for each process.

I allocated a block of memory, where each process could write to its own
part. And when done, I'd send a message on a queue. Then the host could read
it. My repetition rate for each process was slow, so I was sure that the
host could read it before the FPGA would overwrite it.

Just thinking out loud..

Regards,

Wiebe.

Spectre_Dave · 09-19-2007

tartan5

I believe that you should be fine with a single DMA channel.

I have streamed 1,500,000 U32s from the host to the FPGA and output them at 1 MHz while simultaneously reading 187,500 result U32s. The results were then DMAed to the host on a second DMA channel. I have also interleaved data onto a DMA channel and then decimated the data at the host / FPGA.


Mr._Jim · 09-19-2007

Tartan5,

I'm pretty sure I agree with VADave.
I faced similar decisions recently with my application. If it were me, and all of the data were coming from the same place and timed in the same loop, I'd use one DMA channel, especially because DMA channels tend to be limited. (I'm not sure how many you have on the PXI-7811R, but I'm working with a 7813R that only has three.) I guess it would depend on what else you're doing.

The bottom line is that I think one DMA FIFO would probably be a lot easier to implement unless you start having bandwidth problems. It sounds like you shouldn't based on what the others have said? (fingers crossed)

Interleaving is no problem. I initially had concerns that synchronization might be an issue, but it really hasn't been one. (i.e. channels being misordered on the receiving end of the FIFO)

I hope I haven't missed something somehow - I'm also thinking aloud and I hope it helps.

For what it's worth...
Jim

tartan5 · 09-19-2007

Hi All,

Thanks for the input. The only complicating issue is that all of this is happening inside a single-cycle timed loop (SCTL), so to get the 96 bits out I would need to pipeline the data, since (I'm pretty sure) I can't do three writes to the same DMA FIFO inside the SCTL. I could set up some on-board (VI-scoped, I think it is called?) FIFOs to another parallel loop, which would then handle the feeding of the DMA FIFO from outside the SCTL.

Any caveats to doing it this way?
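One caveat that can be checked on paper is whether the parallel loop drains the on-board FIFO fast enough. The sketch below is a tick-by-tick model under stated assumptions, not a description of the actual hardware: it assumes a 40 MHz FPGA clock (so one 4 MHz sample every 10 ticks), three U32 words pushed per sample, and a hypothetical drain loop that moves one word to the DMA FIFO every `drain_period` ticks. The function name and all parameters are illustrative.

```python
def max_fifo_occupancy(ticks, sample_period, words_per_sample, drain_period):
    """Return the peak number of words waiting in the on-board FIFO."""
    occupancy = 0
    peak = 0
    for t in range(ticks):
        # Producer: the acquisition loop pushes one sample's worth of words.
        if t % sample_period == 0:
            occupancy += words_per_sample
        peak = max(peak, occupancy)
        # Consumer: the parallel loop moves one word to the DMA FIFO.
        if t % drain_period == 0 and occupancy > 0:
            occupancy -= 1
    return peak

# Drain loop takes 3 ticks per word: it keeps up, backlog stays bounded.
print(max_fifo_occupancy(10_000, 10, 3, 3))
# Drain loop takes 4 ticks per word: it falls behind, backlog keeps growing.
print(max_fifo_occupancy(10_000, 10, 3, 4))
```

Under these assumptions the drain loop needs to average at least three words per ten ticks. If it is slower, no FIFO depth is enough and the on-board FIFO eventually overflows.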

Thanks!

Howard

Wiebe@CARYA · 09-21-2007

It is definitely not fine to do this. In a lot of situations it might be
ok though.

In my situation, I had 5 processes that are triggered at will. If two of
them are triggered more or less at the same time, their data can't be
decimated on the host, because the data is mixed at random.

Regards,

Wiebe.

tartan5 · 09-21-2007

wiebe@CARYA wrote:
It is definitely not fine to do this. In a lot of situations it might be
ok though.

In my situation, I had 5 processes that are triggered at will. If two of
them are triggered more or less at the same time, their data can't be
decimated on the host, because the data is mixed at random.

Regards,

Wiebe.

Hi Wiebe,

Can you clarify? It is not fine to do what? Stream the data to another parallel loop in the FPGA? Or run 3 DMA channels?

Thanks!

Howard

Wiebe@CARYA · 09-21-2007

It is not fine if you are going to send multiple data elements, in multiple
loops, over one DMA channel.

If each loop precedes its data with an ID, and then, let's say, 5 elements,
there is no guarantee that the data won't be mangled like this:

Loop 1 sends:
ID1, V1, V2, V3, V4, V5

Loop 2 sends:
ID2, V1, V2, V3, V4, V5

What might be received by the host:
ID1, V1, V2, ID2, V3, V1, V4, V2, V5, V3, V4, V5

So you won't be able to decimate the data, because there is no way to
distinguish between the IDs and the data.
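The hazard can be reproduced on the host side with a small simulation. This is only a sketch: `interleave` is a hypothetical stand-in for two FPGA loops writing to the same FIFO in an unpredictable order, and `demux_tagged` shows one possible remedy, which is to tag every single element with its writer ID rather than sending one ID followed by several bare values.

```python
import random

def interleave(streams, rng):
    """Merge word streams in random order, preserving each stream's own order."""
    pending = [list(s) for s in streams]
    merged = []
    while any(pending):
        source = rng.choice([p for p in pending if p])
        merged.append(source.pop(0))
    return merged

def demux_tagged(words):
    """Group (writer_id, value) elements by writer ID."""
    per_writer = {}
    for writer_id, value in words:
        per_writer.setdefault(writer_id, []).append(value)
    return per_writer

rng = random.Random(0)  # fixed seed so the run is repeatable
loop1 = [(1, v) for v in [10, 11, 12, 13, 14]]
loop2 = [(2, v) for v in [20, 21, 22, 23, 24]]

merged = interleave([loop1, loop2], rng)
recovered = demux_tagged(merged)
print(merged)
print(recovered)
```

However the two loops end up interleaved, each writer's values come back in their original order, because no element depends on its neighbours to be identified. With a single leading ID per block, the same merge leaves the host unable to tell which values belong to which loop.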

Regards,

Wiebe.

tartan5 · 09-21-2007

Hi Wiebe,

Thanks for the clarification. I don't think I would have that problem. I am acquiring 96 bits at once (3 U32s) inside a single cycle timed loop. I would pipeline the data, then during the setup of my next acquisition (ACQ1 thru ACQ4) I would do my FIFO writes (ACQ1 write first U32, ACQ2 write second U32, etc). I'm assuming that since my states are hard-coded in this order the FIFO must contain the data in the correct order, or am I horribly mistaken?

Perhaps you mean if I had three independent loops acquiring data, and each one did its own FIFO write? I could see how this might cause a problem.....
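For the single-writer case described above, where one loop always writes the three U32s of a sample in the same fixed order, the host-side split is straightforward. A minimal sketch follows, with hypothetical function names; it assumes the host always reads a whole number of samples and that no element is ever dropped (for example by a FIFO overflow), since a single lost word would shift the alignment of everything after it.

```python
def interleave_samples(samples):
    """Flatten [(w0, w1, w2), ...] into the order the words enter the FIFO."""
    return [word for sample in samples for word in sample]

def deinterleave(words, words_per_sample=3):
    """Regroup a flat word list into per-sample tuples."""
    if len(words) % words_per_sample:
        raise ValueError("read length is not a whole number of samples")
    return [
        tuple(words[i:i + words_per_sample])
        for i in range(0, len(words), words_per_sample)
    ]

samples = [(0x0A, 0x0B, 0x0C), (0x1A, 0x1B, 0x1C)]
fifo_words = interleave_samples(samples)
print(fifo_words)
print(deinterleave(fifo_words))
```

No ID words are needed here because the position of each word within the group of three identifies which 32 inputs it holds.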

jayde · 09-21-2007

Hi,

I'm new at using LabVIEW (coming from the FPGA design world) so please bear with me.

My problem may relate to the original posting, so I thought I should add it to this thread.

I have three channels of data. All are being written at different times, so I assigned each to a unique FPGA-to-host FIFO (using a 7833R FPGA board); call them FIFO1, FIFO2 and FIFO3.

Everything seems OK at the general level, but I find that FIFO3 returns only every other word to the host. The data rate for FIFO3 is 2X the data rate for FIFOs 1 and 2. I'm now wondering if all the FIFOs are written to by the FIFO1 write pulse instead of their own.

In order to write to the LabVIEW FPGA FIFOs, I put each FIFO into its own single-cycle timed loop. The write strobe, one clock cycle in duration, triggers the single-cycle timed loop. The loop is exited upon deassertion of the write strobe. If there's a better way to do this, please let me know.

I know the write-strobe frequency for each channel is correct because I fed them to I/O and have monitored them with the logic analyzer.

Shouldn't there be three free-running FIFOs written to at their own cadence? Am I missing something fundamental?

Thanks for reading this. Any comments would be most appreciated.

Jayde

Message Edited by jayde on 09-21-2007 03:10 PM
