LabVIEW

Wait at Rendezvous bugs

Bug #1 - Deadlock due to improper error handling

 

There is a bug, IMO, in Wait at Rendezvous.vi that can cause deadlock. I'll explain.

 

As you can see in "Wait at Rendezvous.vi", an error that flows out of "Release Waiting Procs.vi" will cause the Enqueue Element function to fail -- it will not enqueue the single element (rendezvous object data).  This means that the rendezvous object is locked and cannot be accessed anywhere else.

 

Rendezvous Bug - Dequeue Deadlocks2.png

 

 

For example, if one tries to call "Destroy Rendezvous.vi", the call to Dequeue Element in "Destroy A Rendezvous.vi" will wait forever.

 

 Rendezvous Bug - Dequeue Deadlocks.png
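
Since LabVIEW diagrams can't be pasted as text, here is a rough Python analogy of the deadlock, not the actual VI. A single-element `queue.Queue` stands in for the rendezvous object, and a raised exception stands in for the error that makes Enqueue Element skip its work. The names `release_waiting_procs`, `wait_at_rendezvous_buggy` and `destroy_rendezvous` are hypothetical, chosen only to mirror the VIs above.

```python
import queue

# Hypothetical stand-in for the rendezvous object: a single-element queue
# used as a lock. Whoever dequeues the element owns the data.
rendezvous = queue.Queue(maxsize=1)
rendezvous.put({"waiting": []})


def release_waiting_procs(data):
    """Hypothetical helper that fails, like an error out of the subVI."""
    raise RuntimeError("error while releasing waiters")


def wait_at_rendezvous_buggy(q):
    data = q.get()               # take the element (lock acquired)
    release_waiting_procs(data)  # raises, so the next line is skipped
    q.put(data)                  # never reached: the element is lost


def destroy_rendezvous(q, timeout_s=0.1):
    """Return True if the element could be dequeued, False on timeout."""
    try:
        q.get(timeout=timeout_s)  # without a timeout this would block forever
        return True
    except queue.Empty:
        return False


try:
    wait_at_rendezvous_buggy(rendezvous)
except RuntimeError:
    pass

destroyed = destroy_rendezvous(rendezvous)
print(destroyed)  # False: the element was never put back
```

The timeout is only there so the sketch terminates. With an infinite wait, the second dequeue hangs exactly as described.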

Message 1 of 19

Bug #1 - The Fix

 

The fix for issue #1 is to always re-enqueue the data, regardless of errors that occur while operating on the data.

 

Rendezvous Bug - Enqueue Fix.png
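
In text form, the same idea can be sketched in Python with `try`/`finally`. This is an analogy under the assumption that a single-element `queue.Queue` plays the role of the rendezvous object and an exception plays the role of the LabVIEW error. The function names are hypothetical.

```python
import queue


def release_waiting_procs(data):
    """Hypothetical helper that fails, like an error out of the subVI."""
    raise RuntimeError("error while releasing waiters")


def wait_at_rendezvous_fixed(q):
    data = q.get()                   # take the element (lock acquired)
    try:
        release_waiting_procs(data)  # may fail
    finally:
        q.put(data)                  # always re-enqueue, error or not


rendezvous = queue.Queue(maxsize=1)
rendezvous.put({"waiting": []})

try:
    wait_at_rendezvous_fixed(rendezvous)
except RuntimeError:
    pass                             # the error still propagates to the caller

elements_left = rendezvous.qsize()
```

The error is not swallowed. It still reaches the caller, but the element is back in the queue so later dequeues do not block.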

 

It should be noted that if a Data Value Reference were used with an In Place Element Structure, this problem would be naturally avoided, since there is no Error In terminal on the DVR Write Element node:

 

DVR Write.png
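
A loose Python parallel to the DVR plus In Place Element Structure is a lock-guarded value accessed through a context manager: the release step is part of the structure itself, so an error in the body cannot skip it. The class `DataValueRef` and its `in_place` method are hypothetical names for this sketch, not a LabVIEW or Python API.

```python
import threading
from contextlib import contextmanager


class DataValueRef:
    """Hypothetical analogue of a DVR: a value guarded by a lock."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    @contextmanager
    def in_place(self):
        with self._lock:       # released on exit even if the body raises
            yield self._value


ref = DataValueRef({"waiting": 0})

try:
    with ref.in_place() as data:
        data["waiting"] += 1
        raise RuntimeError("error while operating on the data")
except RuntimeError:
    pass

# The lock was released despite the error, so the data is still reachable.
still_usable = ref._lock.acquire(blocking=False)
if still_usable:
    ref._lock.release()
```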

Message 2 of 19

Bug #2 - Unsent messages due to improper error handling in a For Loop

 

In "Release Waiting Procs.vi" (which is used to send messages to all waiting instances of "Wait at Rendezvous.vi"), if there is an error when calling Enqueue Element in any iteration of the For Loop, subsequent calls to Enqueue Element will fail. (I'll explain why there are invalid queue references later -- yep, more bugs.)

 

Error in Loop.png

Message 3 of 19

Bug #2 - The Fix

 

The way to avoid having an error in one iteration prevent other iterations from succeeding is to not use an error shift register, but use a tunnel, and build an array of output errors.  Note that it is critical to merge in the upstream error, in case the For Loop iterates zero (0) times (due to an empty array).

 

Error Array in Loop.png
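
For readers without the images, here is a rough Python rendering of both loop styles. It is a sketch under assumptions: an invalid queue reference is modelled as `None`, the chained variant skips a send whenever an earlier error exists (as Enqueue Element does with an error on its error in terminal), and merging keeps the first error found. All function names are hypothetical.

```python
import queue


def send(q, msg):
    """Hypothetical enqueue; an invalid reference is modelled as None."""
    if q is None:
        raise ValueError("invalid queue reference")
    q.put(msg)


def release_chained(waiters, msg, upstream_error=None):
    """Shift-register style: the first error blocks every later send."""
    error = upstream_error
    for q in waiters:
        if error is None:              # a send is skipped once an error exists
            try:
                send(q, msg)
            except ValueError as exc:
                error = exc
    return error


def release_independent(waiters, msg, upstream_error=None):
    """Tunnel style: every send is attempted, errors are collected."""
    errors = []
    for q in waiters:
        try:
            send(q, msg)
            errors.append(None)
        except ValueError as exc:
            errors.append(exc)
    # Merge the upstream error first so it survives a zero-iteration loop.
    candidates = [upstream_error, *errors]
    return next((e for e in candidates if e is not None), None)


a, c = queue.Queue(), queue.Queue()
release_chained([a, None, c], "go")
chained_counts = (a.qsize(), c.qsize())          # third waiter gets nothing

a2, c2 = queue.Queue(), queue.Queue()
merged = release_independent([a2, None, c2], "go")
independent_counts = (a2.qsize(), c2.qsize())    # both valid waiters notified

upstream = RuntimeError("upstream failure")
empty_result = release_independent([], "go", upstream_error=upstream)
```

With an empty `waiters` list the loop body never runs, which is why the upstream error has to be merged in explicitly.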

Message 4 of 19

Bug #3 - Queue reference going invalid while waiting

 

Now, I have no idea why this bug is occurring -- I think it's a bug inside the queue primitives (inside core LabVIEW), related to CAR 136680 (see "What can kill a queue?" on lavag.org).

 

Here's what I'm seeing.  I'm getting an Error 1122 (reference became invalid while waiting) inside the Dequeue Element function used to wait on the rendezvous notification, as shown below.

 

 Invalid Queue Ref.png

 

The only thing I can think of is that the reference became invalid because the top-level VI that created the reference went idle.  But that doesn't make sense, because "Wait at Rendezvous.vi" is the same VI that created the queue and it's still running!

 

Or, is there a possible way the top-level VI could have gone idle, even though this VI is still running...

 

This application uses a lot of top-level VIs, reentrancy, and dynamic dispatch.  Here's a possible scenario.  This particular reentrant instance of "Wait at Rendezvous.vi" is inside a reentrant dynamic dispatch VI. So, it's possible that the dynamic method (and therefore this particular reentrant instance) could be called from any one of several top-level VIs.  So, the recycled queue reference (in the shift register) starts out as valid, and then becomes invalid when the top-level VI (the one that actually made the dynamic call that created the recycled queue reference) goes idle -- BOOM, invalid queue reference.

Message 5 of 19

Bug #3 - The Fix

 

I was able to fix this issue by never trying to recycle queue references in "Wait at Rendezvous.vi" -- instead, a new queue is created each time "Wait at Rendezvous.vi" is called.

 

First, always create a new queue -- don't recycle it (regardless of any supposed "performance" boost).

 

Create New Queue.png

 

And always destroy the queue -- don't cache it in a shift register.

 

Destroy Queue.png
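
As a text-only illustration of "new queue per call, always cleaned up", here is a minimal Python sketch. It is not the vi.lib implementation: the `Rendezvous` class, its `wait` method and the `"go"` message are assumptions made for the example. Python has no explicit queue destroy, so the cleanup step is modelled by removing the per-call queue from the waiting list in a `finally` block.

```python
import queue
import threading


class Rendezvous:
    """Hypothetical rendezvous that releases callers once `size` are waiting."""

    def __init__(self, size):
        self.size = size
        self._lock = threading.Lock()
        self._waiting = []

    def wait(self, timeout_s=1.0):
        my_queue = queue.Queue(maxsize=1)   # always a fresh queue, never cached
        with self._lock:
            self._waiting.append(my_queue)
            if len(self._waiting) == self.size:
                waiters, self._waiting = self._waiting, []
                for q in waiters:
                    q.put("go")             # notify every waiter, including us
        try:
            return my_queue.get(timeout=timeout_s)
        finally:
            # Always clean up the per-call queue, even on timeout or error.
            with self._lock:
                if my_queue in self._waiting:
                    self._waiting.remove(my_queue)


r = Rendezvous(size=2)
results = []


def worker():
    results.append(r.wait())


threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because nothing is kept between calls, there is no stale reference left over from an earlier caller that could go invalid later.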

Message 6 of 19

Because an error on the error in terminal will prevent enqueueing and dequeueing, I never wire error in to queue functions.  If you want to collect errors, it is OK to wire error out with indexing enabled and then merge the errors, as Jim pointed out in one of his posts.

NI should fix this inside their queue functions.  Ignore error in, but create error out.

 

- tbob

Inventor of the WORM Global
Message 7 of 19
These have been reported to R&D for further investigation (CAR 222274, 222275 and 222276 respectively).
Regards,

Jon S.
National Instruments
LabVIEW NXG Product Owner
Message 8 of 19

Hi Jon,

 

Thanks for the CAR numbers!

 

-Jim

Message 9 of 19

"NI should fix this inside their queue functions.  Ignore error in, but create error out."

For what it's worth, I totally disagree... I think the current behaviour is correct and consistent with other components in the environment. If an error occurs before the Enqueue Element primitive, then it shouldn't do anything, as the data coming from the previous function could be defective... resulting in an unintended message being sent to some other location in the program and causing more problems. No no no, this would really suck.

 

If the program needs an element to be enqueued (e.g. in an SEQ implementation), then it should be the responsibility of the programmer to handle it appropriately. That is, I agree 100% with your initial statement :)

 

Here's my approach to #1. Basically, my rationale is that if the subVI returns an error, its output can't be trusted and I don't want it to do any more damage (i.e. stop the bleeding). Since I dequeue outside the error bypass, I know that the queue reference is valid, so I just re-enqueue the existing element.
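
In Python terms, that approach might look roughly like the sketch below. It assumes a single-element `queue.Queue` as the SEQ and an exception in place of the subVI's error out. The name `update_guarded` is hypothetical, and the deep copy is only there to imitate LabVIEW's by-value wires so the failing operation cannot touch the original element.

```python
import copy
import queue


def update_guarded(q, operation):
    """Apply `operation` to the SEQ element; on failure keep the old element."""
    original = q.get()                     # dequeue outside the error bypass
    try:
        updated = operation(copy.deepcopy(original))
    except Exception:
        q.put(original)                    # untrusted output: restore the original
        raise
    q.put(updated)                         # success: store the new element


seq = queue.Queue(maxsize=1)
seq.put({"count": 0})


def bad_operation(data):
    data["count"] = 999                    # partial change before failing
    raise RuntimeError("subVI failed")


try:
    update_guarded(seq, bad_operation)
except RuntimeError:
    pass

after_error = seq.get()
seq.put(after_error)

update_guarded(seq, lambda d: {**d, "count": d["count"] + 1})
after_success = seq.get()
```

Either way the queue ends up holding exactly one element, so nothing downstream can deadlock on it.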

Message 10 of 19