High-Speed Digitizers


PXIe-5171R DRAM Write/Read Access


Hi All,

 

I'm having a bit of an issue with my code. While this is specifically for the 5171, the question likely generalizes to any FPGA device with configurable DRAM. My issue is as follows:

 

I have an application in which the ADC output is written to a continuous ring buffer in DRAM, and segments are then read out according to logic on the FPGA and host CPU. Given that the word width for the DRAM banks is 384 bits and there are 2^24 addresses, we operate in a scheme where we write 336 bits to each of the two banks every 3 ticks of Data Clock (8 channels x 14 bits x 3 samples = 336 bits; each tick yields 2 ADC samples per channel, so 6 per channel over the 3 ticks, and we write the even samples to DRAM0 and the odd samples to DRAM1). This process must be deterministic - we cannot miss a write or fall behind on the data stream.
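The even/odd split described above can be sketched as a small model. This is purely illustrative: the LSB-first bit layout and the helper names are assumptions, not the 5171's actual DRAM word format.

```python
# Illustrative model of the write scheme: every 3 Data Clock ticks, the even
# samples from all 8 channels go to DRAM0 and the odd samples to DRAM1,
# 336 payload bits per bank (of the 384-bit word). Bit layout is assumed.

CHANNELS = 8
BITS_PER_SAMPLE = 14
TICKS_PER_WRITE = 3

def pack_word(samples):
    """Pack 14-bit samples into one integer, LSB-first (assumed layout)."""
    word = 0
    for i, s in enumerate(samples):
        word |= (s & 0x3FFF) << (i * BITS_PER_SAMPLE)
    return word

def split_banks(ticks):
    """ticks: 3 ticks, each a list of 8 channels, each an (even, odd) sample pair.
    Returns the two 336-bit payloads for DRAM0 (even) and DRAM1 (odd)."""
    even, odd = [], []
    for tick in ticks:
        for ch in tick:
            even.append(ch[0])
            odd.append(ch[1])
    return pack_word(even), pack_word(odd)
```

Each bank receives 3 samples x 8 channels x 14 bits = 336 bits per write, which fits in the 384-bit word with 48 bits to spare.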

 

In a different clock domain (167 MHz, though we would like to go faster if possible), we read out some length of the DRAM for the waveforms of interest (so long as we know it hasn't been overwritten, which we verify by checking that we're within the 400 ms time limit). When we read the DRAM, we use the Request Data and Retrieve Data nodes as suggested. The read access needs to be as fast as possible for maximum throughput, but should not interfere with the write access at all.
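As a sanity check on the 400 ms window: the ring duration follows from the address count and write cadence. The 125 MHz Data Clock figure below is an assumption (it is not stated in this thread), but it makes the arithmetic come out to roughly 400 ms.

```python
# Back-of-envelope check of the ~400 ms validity window for the ring buffer.
# Assumes a 125 MHz Data Clock (NOT stated in the thread) and one DRAM
# address consumed every 3 ticks, as described above.

ADDRESSES = 2 ** 24
TICKS_PER_ADDRESS = 3
DATA_CLOCK_HZ = 125e6          # assumption

ring_duration_s = ADDRESSES * TICKS_PER_ADDRESS / DATA_CLOCK_HZ  # ~0.403 s

def segment_still_valid(age_s, margin_s=0.0):
    """A requested segment is safe to read only if it is younger than the
    ring duration (minus an optional safety margin)."""
    return age_s < ring_duration_s - margin_s
```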

 

In our initial version of the code, there was no handshaking between write and read access, and our testing revealed that when the DRAM is being read, some fraction of the write accesses are blocked (either both banks were written, only one bank was written, or neither bank was written; there's a fairly even balance between these cases). The portions of memory written when no read occurs come out fine, so I suspect the write logic itself is okay.

 

All of this being said, I have some fairly general questions regarding DRAM use on the FPGA. I've read some NI documentation (http://www.ni.com/white-paper/14571/en/ and http://www.ni.com/tutorial/14652/en/) on best practices and LabVIEW FPGA implementations, but it leaves some questions open.

 

  1. Is the DRAM clock rate for the 5171 publicly available information? If so, is there something about the clock, or about how I do my read/write access, that is not optimal for this particular application?
  2. I know that you can't read and write to the same memory resource simultaneously, and that a hardware arbiter exists to explicitly prevent this - how much latency is there between the arbiter's response and the resource-availability Boolean indicator? In other words, if the Ready for Input Boolean is true, can I trust it to be truly real-time?
  3. Aside from handshaking, are there any other obvious tricks of the trade that should be used here? For example, would it help to place the write and read access in the same loop to make the process more deterministic?

I appreciate the feedback. I'm really hoping to clarify some misconceptions I might have about the workings of DRAM in LabVIEW FPGA. If more specific information is necessary, please let me know and I'll gladly oblige.

Aaron

Message 1 of 5

1) The DRAM clock rate at the 384-bit interface level is 100 MHz. That is, regardless of which loop the access methods are used in, under the hood of the VI there is a transfer to a 100 MHz DRAM-specific clock domain. For example, read requests made from a 40 MHz loop would not be able to achieve maximum throughput. Similarly, above 100 MHz there is no throughput improvement from increasing the loop rate from, say, 125 MHz to 150 MHz.
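One way to read this answer: the 100 MHz x 384-bit interface bounds total bandwidth per bank, and the writes consume only part of it. A rough sketch, where the 125 MHz Data Clock rate is an assumption (the 100 MHz and 384-bit figures come from the reply above):

```python
# Back-of-envelope: fraction of the DRAM interface bandwidth consumed by the
# write stream, leaving the remainder for reads. Data Clock rate is assumed.

IFACE_BITS = 384
IFACE_HZ = 100e6               # per the reply above
DATA_CLOCK_HZ = 125e6          # assumption
TICKS_PER_WRITE = 3

iface_bps = IFACE_BITS * IFACE_HZ                         # 38.4 Gb/s per bank
write_bps = IFACE_BITS * DATA_CLOCK_HZ / TICKS_PER_WRITE  # one word per 3 ticks
write_fraction = write_bps / iface_bps                    # ~42% of the interface
```

Under these assumptions writes use about 42% of the interface, so reads can in principle use the remaining headroom, minus the direction-switching overhead discussed below.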

 

2) There is no additional latency to consider when using the Ready for Input. If Ready for Input is asserted, then the request or write may be made.
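In other words, the handshake reduces to gating each write (or read request) on the Ready for Input signal in the same cycle. A minimal sketch of that gating, with hypothetical names:

```python
# Minimal model of the Ready for Input handshake described above: only issue
# a command on ticks where the method reports ready. Names are illustrative.

def try_write(ready_for_input, issue_fn, payload):
    """Issue the write only when the method is ready; report whether it went out."""
    if ready_for_input:
        issue_fn(payload)
        return True
    return False
```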

 

3) Regardless of which loop the write and read accesses are in, there will still be a FIFO on each side of the process: one to send the requests and writes to the DRAM, and one to receive the read data.

 

Due to the way the DRAM operates, a longer delay is incurred when switching from writing to reading, from reading to writing, or when changing addresses. When you say that sometimes banks don't get written, does that mean data is getting dropped/lost while reads are occurring? If so, a few thoughts come to mind.

 

First, if these DRAM latencies are stalling the memory enough that you're losing write data, adding a FIFO on the write side to ride out periods where the write access method temporarily deasserts Ready for Input may help. You can put something on the diagram to monitor Ready for Input on your write method to see whether it deasserts during these cases; if it does, the FIFO addition may help.
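The effect of such an elastic FIFO can be sketched with a toy simulation: one word arrives per tick, and the FIFO absorbs stalls where Ready for Input is deasserted. Depths and the stall pattern here are illustrative, not measured on hardware.

```python
# Toy model: an elastic FIFO in front of the write method absorbs short
# Ready-for-Input stalls. Too-shallow a FIFO drops words from the
# deterministic stream; a deeper one rides the stall out.

from collections import deque

def simulate(ready_pattern, fifo_depth):
    """One word is produced per tick; it is forwarded whenever ready is True.
    Returns the number of words lost to FIFO overflow."""
    fifo, lost = deque(), 0
    for ready in ready_pattern:
        if len(fifo) < fifo_depth:
            fifo.append(1)          # new word from the ADC stream
        else:
            lost += 1               # deterministic stream: a dropped write
        if ready and fifo:
            fifo.popleft()          # write method accepted a word
    return lost
```

The sizing rule of thumb this suggests: the FIFO must hold at least as many words as arrive during the longest expected stall.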

 

Second, both the read and write commands travel along the same path from the VI to the actual memory controller, and ordering is enforced. In other words, if a large number of reads are requested, a subsequent write has to wait for those reads to complete before it can complete itself. In a scenario like this, throttling your read requests slightly can allow for more write traffic, obviously at the expense of read throughput.
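The ordering point can be made concrete with a toy calculation: with a single in-order command path, a write queued behind a read burst waits for the whole burst, and throttling the reads shrinks that wait proportionally. The one-command-per-tick service rate is an illustrative assumption.

```python
# Toy model of the shared in-order command path: a write issued behind a read
# burst waits for every queued read; throttling bounds that wait.
# Service time of one command per tick is an assumption for illustration.

def write_delay(read_burst, throttle_every=None):
    """Ticks a write waits behind `read_burst` queued reads. With throttling,
    only every `throttle_every`-th read request is actually issued."""
    if throttle_every:
        read_burst = (read_burst + throttle_every - 1) // throttle_every
    return read_burst  # in-order path: the write drains after all kept reads
```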

Message 2 of 5
(4,831 Views)

Thanks for the quick response! This information is very helpful. As a follow-up, I have an additional question:

 

Given the latency incurred when switching between read and write access, would it make sense to maximize throughput by restricting write access to bursts? In such a scheme, I could imagine an intermediate FIFO accumulating some number of writes (let's say 10), which are then written out continuously. Once they've been written, we open up for read access until the next 10 elements have accumulated for writing. This way, the controller can stay focused on just reading or just writing for an extended period, and hopefully we minimize the dead time due to switching latency.
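That batching scheme can be sketched as a simple scheduler: accumulate a batch, burst it out, then open a read window until the next batch is ready. The batch size of 10 is the value floated above; the phase names are illustrative.

```python
# Sketch of the proposed batching scheme: alternate a burst of BATCH writes
# with a read window, so the controller stays in one direction at a time.

BATCH = 10

def schedule(n_words):
    """Turn a stream of n_words incoming words into alternating
    ('write', BATCH) and ('read_window',) phases."""
    phases, pending = [], 0
    for _ in range(n_words):
        pending += 1                       # word lands in the intermediate FIFO
        if pending == BATCH:
            phases.append(("write", pending))   # burst the batch to DRAM
            phases.append(("read_window",))     # then let reads through
            pending = 0
    return phases
```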


Nonetheless, the insight you've provided has certainly given me a significant amount to work with moving forward!

Message 3 of 5
Solution
Accepted by Sprow

Absolutely! Separating the write traffic from the read traffic in this manner should improve overall throughput. The same is true for non-sequential accesses: if you were, say, reading waveforms from both 0x10000 and 0x20000, it would be better to read all of one consecutively first and then switch, rather than bouncing back and forth interleaving the accesses.
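A toy cost model makes the grouped-versus-interleaved point visible: charging a fixed penalty per region switch, the grouped order pays it twice while the interleaved order pays it on every access. The costs are illustrative, not measured DRAM timings.

```python
# Toy cost model for grouped vs interleaved access to two address regions
# (e.g. 0x10000 and 0x20000). Costs are illustrative placeholders.

SWITCH_COST = 10   # extra ticks per region change (assumed)
ACCESS_COST = 1    # ticks per access within a region (assumed)

def total_ticks(access_pattern):
    """access_pattern: sequence of region ids, e.g. [0,0,0,1,1,1]."""
    ticks, prev = 0, None
    for region in access_pattern:
        if region != prev:
            ticks += SWITCH_COST   # pay the direction/address switch penalty
            prev = region
        ticks += ACCESS_COST
    return ticks
```

Reading one region completely before switching minimizes the number of penalties paid, which is exactly the recommendation above.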

Message 4 of 5

Thanks again for discussing this issue in detail - it's been really helpful. I've successfully implemented the changes mentioned above (explicitly controlling read/write access and batching the reads and writes), and the issue is resolved.

Message 5 of 5