MGI Panel Actor- potential crash?

BertMcMahan · 12-13-2023

I'm hoping someone here can take a look and see if this makes sense. Apologies for the details- tl;dr at the bottom.

For the last few months I've been chasing my tail with a seemingly random hard crash of my program when it was a built executable. Every once in a blue moon I could get a crash with the dev environment, but it was mainly the exe, on multiple computers, built in both LV2020 and LV2023.

After about a million hours of debugging, I started to suspect it had something to do with unloading Actors and replacing them. The crash would happen sometime around when new actors were launched and old ones were disposed, but I couldn't ever pin it down. Finally, I got a somewhat minimal reproducible program running, and started sticking popups in the code throughout to say "I'm here" when it got to that point. Interestingly enough, when I did that... no crash. Ah- race condition then! But where?

I had one VI that would send an e-stop to one Actor while launching another one (the new one to replace the old). When I added my popups, I wound up serializing the "Send Emergency Stop" VI and the Launch Nested Actor VI, and after some playing I wound up adding a Stall Dataflow.vim between the two. With a 250 ms wait between Stop and Launch, I didn't see a crash. Reducing it to 1 ms DID let me see the crash, and was very repeatable.

So, lots of Googling later, I saw a reference to "Open VI Reference" having the potential to crash LabVIEW if it's used with a clone name. I was getting "writing out of bounds" access violations, which sounded like what could happen if a clone VI name was used after the original was closed.

So, I started poking around, and sure enough in the Panel Actor Actor Core, I found "Read Active VI".

tl;dr: Could the following MGI Panel Actor code be the thing responsible for my crashes?

This function gets the reference to the Actor Core and keeps it around. The AF Debug library does the same thing, but it stores the Actor Core's name, not the VI refnum itself. It includes the foreboding note "NI recommends you avoid doing this in applications meant for your end users."

Anyone here familiar with this library? The copyright says it was written by Derek Trepanier who has an account here, but he hasn't been active in a long time. I haven't reached out to MGI directly yet.

As to my issue, for now I can of course work around it by delaying the Close operation, but that doesn't seem like a safe way to do this.

BertMcMahan · 12-13-2023

I did a little more reading, specifically this thread. AQ explains that the second reference to the clone won't keep the clone in memory, so if you operate on that second reference *after* the original one is out of memory, you can crash LabVIEW.

So, in this case, I'd think that the usage is somewhat safe. The reference is acquired in Actor Core, and it's used in messages sent to the actor. If those messages are operated upon, then they're being operated on inside Actor Core, so the reference is still valid.

The potential issues come up if the reference is used in the toolkit anywhere else. I need to dig into the toolkit tomorrow, but I could see it potentially being used in some cleanup code- though why anything would operate on a panel reference that the toolkit knows is closed, I can't say. The toolkit does do some shenanigans regarding subpanels though...

My code relies heavily upon launching and closing actors by the user, meaning shared clones get used repeatedly. All this leads to some questions:

1- Would a "Close reference" function called on a stale clone reference cause LV to crash?

2- When do shared clones leave memory, i.e., when would a reference to an Actor Core clone go stale? Specifically in the RTE- I suspect this may be different in the dev environment.

3- Do references persist across shared clone launches? In other words, if an actor launches, then (while it's running) I call Open VI Reference and get a secondary reference, then close the first actor, then launch it again, is my secondary reference stale? It's referring to the same clone, but I'm not sure the "This VI" function would return the exact same number both times.

1 and 3 should be somewhat easy to test tomorrow once I'm back at my dev machine, but there are certainly nuances there that I may not test correctly. I can also dig into the toolkit.

drjdpowell · 12-14-2023

To me, that use of Open VI looks safe, as the reference is being created inside the call chain of the clone being referenced. Thus the reference will be invalidated when the clone goes idle, which is before it could possibly be unloaded. If instead the clone name had been passed to another async-running VI, which created the reference, then that reference could possibly remain alive and pointing to a not-in-memory clone. Is there anywhere else in the MGI code that might do that?

Dhakkan · 12-14-2023

@BertMcMahan wrote:

I had one VI that would send an e-stop to one Actor while launching another one (the new one to replace the old).

Does your design allow use of the Handle Last Ack Core override so that you can re-launch the e-stopped actor only once it has sent its last ack?

I have never used MGI Panel Actor, nor delved deep into AF's innards - so am of no help with respect to your specific questions on the cloned VI references.
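Since LabVIEW is graphical, here is a rough text-language sketch of the sequencing idea: instead of a fixed delay between stopping the old actor and launching the new one, wait for the old actor's last ack and only then launch. This is a Python analogy, not Actor Framework code; `ToyActor`, `STOP`, `last_ack`, and `replace_actor` are all hypothetical names invented for illustration, and the `threading.Event` merely stands in for what a Handle Last Ack Core override would signal.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel standing in for an emergency-stop message


class ToyActor:
    """Hypothetical stand-in for an actor; not the Actor Framework API."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.inbox = queue.Queue()
        self.last_ack = threading.Event()  # set once the actor has shut down
        self._thread = threading.Thread(target=self._run)

    def launch(self):
        self.log.append(f"{self.name} launched")
        self._thread.start()

    def send_stop(self):
        self.inbox.put(STOP)

    def _run(self):
        # Discard messages until the stop sentinel arrives.
        while self.inbox.get() is not STOP:
            pass  # a real actor would handle messages here
        self.log.append(f"{self.name} stopped")
        self.last_ack.set()  # analogous to the last ack reaching the caller


def replace_actor(old, new_name, log, timeout=5.0):
    """Stop `old`, wait for its last ack, and only then launch the replacement."""
    old.send_stop()
    if not old.last_ack.wait(timeout):
        raise TimeoutError(f"{old.name} never acknowledged shutdown")
    new = ToyActor(new_name, log)
    new.launch()
    return new


log = []
first = ToyActor("A", log)
first.launch()
second = replace_actor(first, "B", log)
second.send_stop()
second.last_ack.wait(5.0)
```

The point of the sketch is only the ordering guarantee: the replacement cannot start until the old actor has reported that it is finished, so the result does not depend on how long shutdown happens to take. Whether that ordering would actually avoid the crash described above is untested.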

BertMcMahan · 12-14-2023

@drjdpowell wrote:

To me, that use of Open VI looks safe, as the reference is being created inside the call chain of the clone being referenced. Thus the reference will be invalidated when the clone goes idle, which is before it could possibly be unloaded. If instead the clone name had been passed to another async-running VI, which created the reference, then that reference could possibly remain alive and pointing to a not-in-memory clone. Is there anywhere else in the MGI code that might do that?

I just spent a while digging through the MGI code. The "Get reference" function is called in Actor Core and is written to the class's private data. It gets used a few times in a lot of messages, and then is cleaned up after the parent Actor Core returns (still within the Panel Actor's Actor Core). I don't see any async processes starting, and I don't see any overrides to any Actor Core VIs that would be called after Actor Core finished calling.

The only thing I see is that the reference may not be explicitly closed (I'm a little fuzzy on that logic). Would the garbage collector that auto-closes unused references cause a crash if it was given a stale clone reference?

drjdpowell · 12-14-2023

There's no garbage collector; references die as part of stopping the top-level VI that created them.

BertMcMahan · 12-14-2023

@drjdpowell wrote:

There's no garbage collector; references die as part of stopping the top-level VI that created them.

I see, thanks. So it sounds like a dangling "stale" reference wouldn't cause any issues, even if the top level VI closed, right?

BertMcMahan · 12-14-2023

Looks like my celebration was premature... adding the delay made the crash slightly less frequent, but didn't eliminate it. Sure seems like something's happening with creating new clones, reusing old memory, something like that.

BertMcMahan · 12-14-2023

@BertMcMahan wrote:

1- Would a "Close reference" function called on a stale clone reference cause LV to crash?

2- When do shared clones leave memory, i.e., when would a reference to an Actor Core clone go stale? Specifically in the RTE- I suspect this may be different in the dev environment.

3- Do references persist across shared clone launches? In other words, if an actor launches, then (while it's running) I call Open VI Reference and get a secondary reference, then close the first actor, then launch it again, is my secondary reference stale? It's referring to the same clone, but I'm not sure the "This VI" function would return the exact same number both times.

No solution yet but I did some poking and can provide a bit of data that was interesting to me.

The "This VI" primitive will always return the same reference, even if the VI is started and stopped.

If you insert a clone into a subpanel, then ask the subpanel what VI is in it, it will return a different reference. As far as I can tell, it always returns the same different reference, even if you call the property node from multiple places.

For example: within an inserted clone inside a subpanel, the This VI primitive returns X.

From inside the inserted clone, the subpanel's Inserted VI is Y.

From the owner of the subpanel, the subpanel's Inserted VI is still Y.

If you close the reference to Y, then check Inserted VI, it's still Y, though that refnum is now invalid. (In other words- you don't get a new refnum to the "Inserted VI" ever, it's always the same).

If you remove the VI, then reinsert it into the subpanel, the Inserted VI property WILL change.

And a bug (I think) that doesn't cause LV to crash, but does break things:

Inside the VI within the subpanel:

If you read the Inserted VI property, close the reference, call Remove VI, then reinsert the VI into the subpanel (using the This VI primitive), then the subpanel glitches out.

I suspect it's because Insert VI creates a new reference to the clone, and if you close it and try to Remove the VI it will fail internally, putting the subpanel in a weird state.

So no fix for my crash, but I learned all of the above. Maybe that will lead me to an issue down the road.

drjdpowell · 12-15-2023

Thanks, that helped me solve an issue I was having with a Reference returned from "InsertedVI".

I was Closing it before calling "RemoveVI".
