I'm thinking of formalizing verification of the Stop message in next rev of AF

FabiolaDelaCueva · ‎04-21-2015

drjdpowell wrote:
AristosQueue wrote:

Another way that I've heard suggested to achieve this is for an actor to send a message to its nested actors saying, "Here's a new queue... use this to talk to me instead of the old one." Then close the old queue.
Side note: if one uses User Events to pass messages instead of a Queue, then one can have more than one User Event, each with a separate registration. Use one as a "Public" or "From Caller" message channel, and the second as a "Private" or "From Nested" channel. On receiving stop, kill the Public registration, dropping all remaining messages, but keep handling the Private one open till shutdown is complete.

Also User Events offer the option to unregister for a selection of messages while remaining registered for the rest.
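
As a rough illustration of the two-channel idea quoted above, here is a minimal text-language sketch (Python, since LabVIEW diagrams can't be pasted as text). Everything in it is hypothetical: the class name, the "stop" and "nested done" strings, and the choice to serve the private channel first are assumptions for the example, not part of AF or of User Events.

```python
import queue

STOP = "stop"                # hypothetical message from the caller
NESTED_DONE = "nested done"  # hypothetical final message from a nested actor


class TwoChannelActor:
    """Toy actor with a public (from caller) and a private (from nested) channel."""

    def __init__(self):
        self.public = queue.Queue()
        self.private = queue.Queue()
        self.stopping = False
        self.handled = []

    def step(self):
        """Handle at most one message; return False once shutdown is complete."""
        # The private channel stays open for the whole life of the actor.
        try:
            msg = self.private.get_nowait()
        except queue.Empty:
            msg = None
        if msg is not None:
            self.handled.append(("private", msg))
            return msg != NESTED_DONE

        if self.stopping:
            return True  # public channel is closed; keep waiting on nested actors

        try:
            msg = self.public.get_nowait()
        except queue.Empty:
            return True
        if msg == STOP:
            self.stopping = True
            # Equivalent of killing the public registration: drop what is left.
            while not self.public.empty():
                self.public.get_nowait()
        else:
            self.handled.append(("public", msg))
        return True
```

After the stop arrives, anything still waiting on the public channel is discarded, while messages from nested actors keep being handled until the last one reports in.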

For an opportunity to learn from experienced developers / entrepreneurs (Steve, Joerg, and Brian amongst them):
Check out DSH Pragmatic Software Development Workshop!

DQMH Lead Architect * DQMH Trusted Advisor * Certified LabVIEW Architect * Certified LabVIEW Embedded Developer * Certified Professional Instructor * LabVIEW Champion * Code Janitor

Have you been nice to future you?

FabiolaDelaCueva · ‎04-21-2015

AristosQueue wrote:
True. But you lose all of the priority queue aspects when you do that. Priority messaging is the major reason the queues are used in the AF. I don't know of any way to apply priority ordering to user events to the degree that AF would need to operate correctly.

Yep, User Events only have two priorities. I believe it was in 2014 that the high priority was added, which enqueues the event at the front of the event queue instead of the back.


AristosQueue (NI) · ‎04-21-2015

FabiolaDelaCueva wrote:
Yep, User Events only have two priorities. I believe it was in 2014 that the high priority was added, which enqueues the event at the front of the event queue instead of the back.

Yep. Two priorities. Not the four that AF provides, one of which is off limits to everyone except the framework itself (used for Emergency Stop messages and Last Ack messages).
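
For readers who don't have the AF source open, the shape of a four-level priority queue with one reserved level might look roughly like the Python sketch below. The level names, the class, and the method names are assumptions made for the illustration; this is not the AF implementation.

```python
import heapq
import itertools
from enum import IntEnum


class Priority(IntEnum):
    # Hypothetical names; a lower value is dequeued first.
    CRITICAL = 0  # reserved for the framework in this sketch
    HIGH = 1
    NORMAL = 2
    LOW = 3


class PriorityMailbox:
    """Toy priority queue: FIFO within a level, higher levels always first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker that keeps FIFO order

    def enqueue(self, msg, priority=Priority.NORMAL):
        """Public entry point: the reserved level is refused."""
        if priority == Priority.CRITICAL:
            raise ValueError("CRITICAL is reserved for the framework")
        self._push(msg, priority)

    def _framework_enqueue(self, msg):
        """Framework-only entry point (e.g. an emergency stop)."""
        self._push(msg, Priority.CRITICAL)

    def _push(self, msg, priority):
        heapq.heappush(self._heap, (int(priority), next(self._counter), msg))

    def dequeue(self):
        """Return the oldest message of the highest pending priority."""
        return heapq.heappop(self._heap)[2]
```

With only two levels, as User Events provide, there is no room left for a level that ordinary senders cannot reach.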

drjdpowell · ‎04-22-2015

AristosQueue wrote:
True. But you lose all of the priority queue aspects when you do that. Priority messaging is the major reason the queues are used in the AF. I don't know of any way to apply priority ordering to user events to the degree that AF would need to operate correctly.

Well, personally, I'm against having priority in a message queue.

drjdpowell · ‎04-22-2015

Brainstorms wrote:
A child crawls into the storage location, and the system detects this -- but can no longer message the digging machine actor to abort this action. No bueno.

I know this is just an example, but it's important to always stress: human safety must NEVER rely on software. Safety systems must be simple (and often redundant) hardware systems, such as a safety cage with a power kill switch on the door.

Additional safety systems can be software, but this is more to protect hardware from damage, rather than humans. In the case where a "Manager" actor controls both a "safety sensor" actor and a "Digger" actor, that Manager is responsible for putting the digger in a safe state before sending shutdown to the Digger.

In fact, the Digger shouldn't even be able to move if it isn't receiving a "heartbeat" of "safety OK" messages, since the safety principle is "do something only if you know it is safe", not "do something unless you know it is not safe".
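
A minimal sketch of that heartbeat principle, in Python for want of a textual LabVIEW. The class name, the timeout parameter, and the injectable clock are all assumptions for the example, and, per the point above, something like this only protects hardware; it is not a substitute for a hardware interlock.

```python
import time


class HeartbeatGate:
    """Permit motion only while "safety OK" messages keep arriving."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock    # injectable so the gate can be tested without sleeping
        self._last_ok = None   # no heartbeat seen yet

    def safety_ok(self):
        """Call this whenever a "safety OK" message is received."""
        self._last_ok = self._clock()

    def may_move(self):
        """True only if a heartbeat arrived within the timeout window."""
        if self._last_ok is None:
            return False  # never told it is safe, so it is not safe
        return (self._clock() - self._last_ok) <= self._timeout_s
```

The default answer is "no": silence, whether from a crashed sensor actor or a closed queue, stops the digger without anyone having to deliver an abort message.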

kegghead · ‎04-22-2015

FabiolaDelaCueva wrote:

Also User Events offer the option to unregister for a selection of messages while remaining registered for the rest.

The decoupling of the user event, registration, and event structure is a wonderfully useful mechanism. The syntax is perhaps a little cumbersome but very powerful and flexible for a native feature set.

Brainstorms · ‎04-22-2015

I shouldn't have given in to the temptation of continuing AQ's example, but I agree that it's an important issue on its own. (Fodder for its own thread.)

You did touch on an element of the original issue, though: 'the Digger shouldn't even be able to move if it isn't receiving a "heartbeat" of "safety OK" messages'. (Pre-empt: Yes, this should be done via hardware...)

Let's abstract this somewhat so that details don't derail, and posit a system where messaging must come from other actors to either provide needed guidance or to inform the subsystem to abort the shutdown gracefully, repair a problem preventing graceful shutdown, or to change the direction of how it's shutting down. (You might think about shutting down Microsoft Windows when you have unsaved documents open.)

Those abilities/remedies are lost (or greatly hindered) if an actor that's shutting down a subsystem merely cuts off communications with the rest of the system -- especially its superiors. For one actor with a tiny piece of the pie, that's probably acceptable. For a large subsystem with complex behaviors, responsibilities, and delegate actors, it could be problematic. Even dangerous (to data, if not humans).

AristosQueue (NI) · ‎04-22-2015

I'm glad you did continue the Digger example because it is one of the reasons that no such "controlled shutdown" exists in the AF today. Consider the "Shutdown" command for Windows 98 (if you remember it) and "Shutdown" for Windows 7. With 98, once you told the OS to shut down, it started closing programs. If it couldn't close programs, it aborted them. One way or another, it was shutting down because that's what you told it to do. With Windows 7 (and probably something earlier), you might have programs with unsaved files. Not only do you get a chance nowadays to save those files, you can cancel the whole shutdown process.

The "Stop" command in the AF is a primitive command. It means, unambiguously, "kill the receiving actor." Tell an actor to stop and it stops. In my vision of the AF when I started, users would then build higher level commands out of that. That left it to each application and each actor to define the shutdown process, and use the Stop only for that final "actually quit now" moment. In some applications, users might abandon the Stop command entirely in favor of another shutdown message.

But as we have been discussing libraries of actors for reuse across applications, it has become more obvious to me that a "Stop" message for some actors is more involved but needs, somehow, to be the same Stop message for callers so that callers do not have to read the documentation to figure out how to correctly stop a given nested actor. This implies that the reaction to the Stop message needs to be re-definable on a per-actor basis. Even Emergency Stop may need to go through a controlled shutdown (i.e. even in an emergency, there's a correct way to shut down a nuclear reactor -- you don't just kill the control system).
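
To make the "same Stop message, per-actor reaction" idea concrete, one possible shape is sketched below in Python. It is only an illustration of the design desire, not a proposal for how AF should do it: the class names, the handle_stop method, and the logged steps are all hypothetical.

```python
class Actor:
    """Toy base actor: by default, Stop means stop right now."""

    def __init__(self):
        self.running = True
        self.log = []

    def handle_stop(self):
        # Primitive behavior: no cleanup, just quit.
        self.running = False


class ControlledShutdownActor(Actor):
    """Toy actor that redefines its reaction to the same Stop message."""

    def handle_stop(self):
        # Hypothetical cleanup steps that must run before the actor quits.
        self.log.append("park hardware")
        self.log.append("flush data")
        super().handle_stop()


def send_stop(actor):
    """Caller side: identical for every actor, no documentation required."""
    actor.handle_stop()
```

The caller's code is the same in both cases; only the receiving actor knows whether stopping is a one-step or a multi-step affair. What the sketch does not address is the other half of the tension: a Stop that can always be relied on to actually stop.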

I'm unsure how to wed these two design desires.

Brainstorms · ‎04-22-2015

I think a good part of the problem was mentioned by Casey in #6: "How a program stops is just as important as how it starts."

Yet how many engineers quickly move past that issue because they want to "get to the fun parts", and more or less pay lip service to the details of properly "safing, securing, stowing, and shutting down"?

It needs emphasis as a part of any project that must be carefully designed and properly coded. It needs to be given its due. One does not merely "flip the off switch" and assume "well, then it stops". Design it!

drjdpowell · ‎04-28-2015

Brainstorms wrote:
Let's abstract this somewhat so that details don't derail, and posit a system where messaging must come from other actors to either provide needed guidance or to inform the subsystem to abort the shutdown gracefully, repair a problem preventing graceful shutdown, or to change the direction of how it's shutting down. (You might think about shutting down Microsoft Windows when you have unsaved documents open.)
Those abilities/remedies are lost (or greatly hindered) if an actor that's shutting down a subsystem merely cuts off communications with the rest of the system -- especially its superiors. For one actor with a tiny piece of the pie, that's probably acceptable. For a large subsystem with complex behaviors, responsibilities, and delegate actors, it could be problematic. Even dangerous (to data, if not humans).

I think one should see a difference between the shutdown of a system and a defined "shutdown" message to a subcomponent of the system. The latter is useful as an abstraction: it covers all the details that the higher-level system doesn't need to worry about in transitioning the subcomponent from "running" to "stopped". Anything that requires connection with other parts of the system cannot be abstracted away in this manner and thus cannot be part of a useful definition of a "Shutdown" action. It's still part of the shutdown of the system, but not part of the "shutdown routine" of the subcomponent.
