Issue with Priority Dequeue

cirrusio · 11-04-2015

Hello all,

I have a rather large system that recently suffered a significant failure, and I have been struggling to recover from it (mostly trying to recover a MAX configuration on an RT system, which was rather stale at the time of the failure). During the recovery, I had to reformat the drive and reinstall hardware (on the RT system). Ever since then, I have been unable to run code that ran successfully before. It seems that the controller is not handling any of the messages it is sent; it simply sits in Priority Dequeue indefinitely. I have probed the rest of the system that sends messages to this controller, and the messages appear to be sent, but they apparently never make it to the controller. Does anyone have any thoughts as to why this might happen? Any help would be great.

(Unfortunately, I cannot post the code, as the system in question is quite large.)

Cheers, Matt

AristosQueue (NI) · 11-05-2015

Weird.

Well, those VIs are all open (no passwords), so you can debug into them. First obvious thing to check is whether the copies of the files on your disk have been modified from the shipping versions.

Assuming that is not the problem, then put a probe on the Enqueuer and on the Dequeuer and see if the internal refnum that they're using is the same refnum. If it is not, then you've got a problem where you're not sending to the actor you think you're sending to.

I'm guessing that you're using the Linked Network Actor to talk to the RT box? Have you checked the "error out" terminals on the Network Stream nodes therein? Maybe they're signaling something but the error is somehow getting dropped instead of propagated.
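For readers more used to text languages, the refnum check suggested above can be sketched as an analogy in Python. This is not Actor Framework code: the `Actor` class, the `enqueuer` method, and `same_queue` are hypothetical names invented for illustration, and a Python object identity test merely stands in for comparing refnums on a probe.

```python
import queue


class Actor:
    """Toy stand-in for an actor: owns one priority queue of incoming messages."""

    def __init__(self, name):
        self.name = name
        self.inbox = queue.PriorityQueue()

    def enqueuer(self):
        # Send-side handle. Senders must hold exactly this object
        # for their messages to reach this actor.
        return self.inbox


def same_queue(send_handle, actor):
    """True when the sender's handle is the very queue the actor dequeues from."""
    return send_handle is actor.inbox


controller = Actor("controller")
stale = Actor("controller")  # hypothetical: a handle left over from an earlier launch

good_handle = controller.enqueuer()
bad_handle = stale.enqueuer()

good_handle.put((1, "hello"))  # arrives in the controller's inbox
bad_handle.put((1, "lost"))    # "sent" without error, but to a different queue

print(same_queue(good_handle, controller))
print(same_queue(bad_handle, controller))
print(controller.inbox.qsize())
```

The point of the sketch is that a send to the wrong queue succeeds silently, so the sender looks healthy while the receiver waits forever. Comparing the two references directly is the quickest way to rule that out.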

cirrusio · 11-05-2015

Thanks, AQ.

AristosQueue wrote: "Well, those VIs are all open (no passwords), so you can debug into them. First obvious thing to check is whether the copies of the files on your disk have been modified from the shipping versions."

I don't believe this is the case - I am using the same source code on two separate computers and am getting the same result. One thing I should state is that this is written in LabVIEW 2014.

AristosQueue wrote: "Assuming that is not the problem, then put a probe on the Enqueuer and on the Dequeuer and see if the internal refnum that they're using is the same refnum. If it is not, then you've got a problem where you're not sending to the actor you think you're sending to."

The dequeuer is only exposed in priority dequeue? I will check this. I have verified that the self-enqueuer in the controller has the same refnum as the nested actors that are attempting to send messages to the controller.

AristosQueue wrote: "I'm guessing that you're using the Linked Network Actor to talk to the RT box? Have you checked the 'error out' terminals on the Network Stream nodes therein? Maybe they're signaling something but the error is somehow getting dropped instead of propagated."

Nope. This application uses web services.

Thanks for all of your help. I have a horrible suspicion that one of the DAQmx tasks or a device name in the config is wrong and that this is breaking something without an error being thrown, but I have been unable to verify this. I will report back what I find.

Cheers, Matt

AristosQueue (NI) · 11-05-2015

Turn on automatic error handling. As much as some people don't like it, it is useful for finding unwired error terminals. A quick scripting VI can turn on the option on all of your VIs.

cirrusio · 11-06-2015

Praise the Lord! I am the problem!

But, in response to your response, AQ - this is a real-time system, i.e. these VIs run without FPs. How does automatic error handling work in this case?

AristosQueue (NI) · 11-06-2015

As far as I know, it works the exact same as it does on desktop, but if it ever goes off, it will blow your real-time determinism... at which point you fix it up so that the error isn't being dropped and then try again. The dialog appears back on the host (you have to have a host attached and monitoring).

cirrusio · 11-06-2015

So, as it turns out (and as I predicted), this was an issue with one of my actors causing a shutdown of the controller (for whatever reason, I missed this in my own logs!). The reason it appeared that I was not getting messages is that it is quite difficult to debug actors on an RT machine, given that they are reentrant. To actually see what is going on in the actor cores themselves (without writing a ton of code to wrap around them), you would have to switch off the reentrancy, thus making any actor that is not currently receiving messages a blocking actor. Live and learn - this is why I have a log (though it seems to be of limited usefulness, given that I didn't look it over well enough).

Thanks for your help, AQ.

Cheers, Matt
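The symptom described in this thread (messages are sent, yet the receiver appears never to handle them) can be reproduced in a small Python analogy. This is only a sketch under stated assumptions, not how the Actor Framework is implemented: `run_controller`, the `STOP` marker, and the `(priority, sequence, message)` tuple layout are all hypothetical. It shows a message loop that exits on a stop request and leaves later messages sitting in the queue, with the stop reason written to a log.

```python
import logging
import queue

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("controller")

STOP = "stop"  # hypothetical marker for a shutdown request


def run_controller(inbox):
    """Handle messages until a stop request arrives; return the handled payloads."""
    handled = []
    while True:
        _priority, _seq, (kind, payload) = inbox.get()
        if kind == STOP:
            # Record why the loop is ending so the cause is visible in the log.
            log.error("controller stopping: %s", payload)
            break
        handled.append(payload)
    return handled


inbox = queue.PriorityQueue()
# Entries are (priority, sequence, message); the sequence number keeps
# equal-priority messages in send order.
inbox.put((1, 0, ("data", "reading 1")))
inbox.put((1, 1, (STOP, "nested actor reported an error")))
inbox.put((1, 2, ("data", "reading 2")))  # sent successfully, never handled

handled = run_controller(inbox)
print(handled)
print(inbox.qsize())
```

From the sender's side nothing looks wrong: the last `put` succeeds. Only the log line shows that the receiver had already stopped, which is why checking the log first is worth the habit.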
