NI Linux Real-Time Discussions

Best things to look at for a project that fails intermittently

Hello Community,

I have a rather large FPGA and RT project running on a 2013 SP1 install of the cRIO-9068.  We have a few custom C libraries that we have written and compiled into shared objects.  We call into these from RT successfully, without issue most of the time.  When running tests on the project, the unit locks up after somewhere between 2 and 6 hours (the interval appears to be random).

We're starting in on the 'slowly remove parts until it doesn't crash' process, however I thought I would post here to see if anyone had some good tips/tricks on what to possibly log and/or look at to give us more insight into what might be happening.

We have a sneaking suspicion that it could be a NULL reference within our C libraries; however, it seems odd that it would be random.
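One cheap way to test the NULL suspicion would be to guard each exported entry point and record the bad call instead of dereferencing the pointer. The sketch below is only an illustration: `scale_samples` and its counters are hypothetical names, not functions from our actual libraries.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-function counters, exposed through small accessors. */
static unsigned long g_call_count = 0;
static unsigned long g_null_rejects = 0;

/* Hypothetical exported function: scales n samples in place.
 * Returns 0 on success, -1 if it was handed a NULL buffer. */
int scale_samples(double *buf, size_t n, double gain)
{
    size_t i;

    g_call_count++;

    if (buf == NULL) {
        /* Record the bad call instead of crashing on the dereference. */
        g_null_rejects++;
        fprintf(stderr, "scale_samples: NULL buffer (call %lu)\n",
                g_call_count);
        return -1;
    }

    for (i = 0; i < n; i++) {
        buf[i] *= gain;
    }
    return 0;
}

/* Accessors so the caller can poll and log the counters. */
unsigned long scale_samples_call_count(void)
{
    return g_call_count;
}

unsigned long scale_samples_null_rejects(void)
{
    return g_null_rejects;
}
```

If the reject counter ever goes non-zero before a lockup, that would point at the caller passing a bad pointer rather than at the library internals. The counters are not thread-safe as written, so this assumes calls come from a single thread or that approximate counts are acceptable.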

Things we are thinking of logging:

  • CPU usage
  • Memory usage
  • Number of calls to shared object functions
  • What shared object functions are being called
  • Disk usage
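
For the CPU and memory items, a minimal sketch of what a periodic sampler might look like, assuming the target exposes the usual Linux `/proc/meminfo` and `/proc/loadavg` files. The function names and the log line format are made up for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Read one "Key:   value kB" field from a /proc/meminfo-style file.
 * Returns the value in kB, or -1 if the file or key is missing. */
long meminfo_field_kb(const char *path, const char *key)
{
    FILE *f = fopen(path, "r");
    char line[256];
    long value = -1;
    size_t keylen = strlen(key);

    if (f == NULL) {
        return -1;
    }
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':') {
            if (sscanf(line + keylen + 1, "%ld", &value) != 1) {
                value = -1;
            }
            break;
        }
    }
    fclose(f);
    return value;
}

/* Read the 1-minute load average (first field of /proc/loadavg).
 * Returns -1.0 on failure. */
double load_average_1min(const char *path)
{
    FILE *f = fopen(path, "r");
    double load = -1.0;

    if (f == NULL) {
        return -1.0;
    }
    if (fscanf(f, "%lf", &load) != 1) {
        load = -1.0;
    }
    fclose(f);
    return load;
}

/* Append one timestamped sample to log_path. Returns 0 on success. */
int log_health_sample(const char *log_path)
{
    FILE *log = fopen(log_path, "a");

    if (log == NULL) {
        return -1;
    }
    fprintf(log, "%ld mem_free_kb=%ld load1=%.2f\n",
            (long)time(NULL),
            meminfo_field_kb("/proc/meminfo", "MemFree"),
            load_average_1min("/proc/loadavg"));
    fclose(log); /* close after every sample so buffered data is not held in the process */
    return 0;
}
```

Calling `log_health_sample` every few seconds from a low-priority loop would leave a trail up to the lockup. A steadily falling `mem_free_kb` over the 2 to 6 hours would suggest a leak rather than a one-off bad pointer.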

Any ideas for additional things to log?

Thanks!

-TD

Message 1 of 2

tduffy,

When you say the "unit locks up", can you be a bit more specific about how it appears to be locked up? What still works on the device? Does either the serial console or the ssh console still respond (i.e., is the device only inaccessible from LabVIEW)? This will help narrow down the approaches that I would recommend.

Message 2 of 2