Implementing a common/shared resource as an actor

jlokanis · ‎09-18-2012

I am still in the early stages of learning the AF so please excuse common errors in my questions...

I have read through all the posts regarding sibling messaging and I still do not know how to implement an actor to drive a common resource. This could be many things (shared hardware, a log file, database) but in my case it is a database. Let me begin by describing the problem:

(note: I have an existing implementation created in the pre-OOP days that implements this as a QMH/QSM)

My system has a central database that is used to supply the application with instructions (think: test script) and store all test results (on the fly). As a result of this, several asynchronous ‘actors’ in my system will need to access this database.

The actor responsible for loading the test script needs to request the information from the database and await an answer.
The actor responsible for starting the test needs to send the device under test (DUT) information to the database and await a test session ID to use with each ‘start test’, ‘end test’ and log message.
The actor responsible for launching each test needs to request a test ID to use with logging data from that specific test.
The actor responsible for logging the test results as they are returned (asynchronously) from running tests needs to write the data to the database (fire and forget).

These are just a few of the cases, but I think they represent all the types of interaction between the siblings and the database actor.

The database actor needs to be responsible for attempting each command with some number of retries (databases and networks, in my experience, are not very reliable). It needs to implement a process to shut down and reestablish a connection to the database if there is a certain class of error (timeout, bad connection, etc.). It needs to distinguish between retryable errors and syntax-related errors (the parameter values are invalid, according to the called database API).

Each database API call has those common needs.
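
To make those common needs concrete, here is a minimal sketch in Python (not LabVIEW) of a retry wrapper that every database call could share. The exception names, `call_with_retry`, and the `reconnect` callback are all hypothetical and only illustrate the retryable versus non-retryable split described above.

```python
import time


class RetryableError(Exception):
    """Hypothetical: timeout, bad connection, and similar transient errors."""


class InvalidParameterError(Exception):
    """Hypothetical: the database API rejected the parameter values."""


class DatabaseUnavailable(Exception):
    """Raised once every retry has been used up."""


def call_with_retry(call, reconnect, max_retries=3, delay_s=0.0):
    """Run call(); on a retryable error, reconnect and try again.

    InvalidParameterError is deliberately not caught, so it propagates
    to the caller on the first attempt without any retry.
    """
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RetryableError:
            if attempt == max_retries:
                # Nothing left to try: the owner can use this to ask the
                # parent for a graceful shutdown.
                raise DatabaseUnavailable("all retries exhausted")
            reconnect()  # shut down and reestablish the connection
            time.sleep(delay_s)
```

Each specific API call would then be passed in as `call`, so the error handling lives in one place and the per-API code stays small.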

Some calls accept input data but do not return anything. Others need to return query data to the caller. (as noted above)

So, my question is how best to architect something like this. As described above, multiple siblings need to communicate with this actor. The actor needs to implement some common error processing code and specific code for each database API. It also needs to be able to return data to the caller in some but not all instances.

I am thinking about something along these lines:

Top level parent actor implements a ‘call database’ generic method.

Database class does not override actor core. Instead, it overrides ‘call database’ with the implementation of the error handler for the database. Somehow, inside this I need to dynamic dispatch the specific method for each database API. But that is handled by the actor core already. How do I wrap the actor core messaging with my retry mechanism?

Do I create each database API call as a separate method of the Database API or as a child class (so I can dynamic dispatch based on the class)? If child class, I would have to generate a message for each of these. That seems a bit complex (there are a lot of API calls).

I envision the siblings using this by calling the parent’s ‘call database’ method and passing the appropriate message (with parameters). Not sure how to handle the replies. Should I have the database message the caller back separately (and async) or should I wait for a reply? Normally, I need the reply data in the caller to continue, but that would block it from receiving other messages while it waits.
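
One way to picture the asynchronous option is the following Python sketch (again, not LabVIEW). Each request optionally carries the caller's own inbox; the database worker posts the result there as an ordinary message, so the caller is never blocked waiting. The message layout, `fake_execute`, and the `"db_reply"` tag are assumptions made up for this illustration.

```python
import queue
import threading

STOP = object()  # sentinel that tells the worker to exit


def fake_execute(name, params):
    """Stand-in for a real stored-procedure call (hypothetical)."""
    if name == "get_test_id":
        return 42
    return None


def database_actor(inbox, execute):
    """Handle requests in arrival order; reply only if a reply queue is given."""
    while True:
        msg = inbox.get()
        if msg is STOP:
            break
        name, params, reply_to = msg
        result = execute(name, params)
        if reply_to is not None:
            # Request/reply case: the answer arrives in the caller's inbox
            # like any other message.
            reply_to.put(("db_reply", name, result))
        # Fire-and-forget case: nothing is sent back.


def demo():
    db_inbox = queue.Queue()
    caller_inbox = queue.Queue()
    worker = threading.Thread(target=database_actor, args=(db_inbox, fake_execute))
    worker.start()
    db_inbox.put(("log_result", {"value": 1.5}, None))   # no reply wanted
    db_inbox.put(("get_test_id", {}, caller_inbox))      # reply wanted
    db_inbox.put(STOP)
    worker.join()
    return caller_inbox.get(timeout=5)
```

The trade-off is that the caller has to remember what it was doing when the reply message shows up, instead of simply continuing after a blocking call.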

I would like to avoid cross tree communication but that might be a better way to implement this.

Also, maybe the database handler (actor/ controller/etc.) should not be an actor at all? I tend to think it should but I can be convinced otherwise.

(BTW: why am I centralizing database communication? Well, I could open a connection each time I call an API and then close it, but there are a lot of calls going on and there is overhead to the open/close operation. Also, the above implementation is one of many instances running on my test machine. There could be 100 or more clones of this actor tree running at once. By limiting each one to a single database handle, I keep the number manageable. Otherwise, I could have 1000s of handles open at once instead of only 100s.)

Finally, the Database actor needs to be able to tell the parent to gracefully shut down if the database becomes completely inaccessible (all retries exhausted). I think this is another advantage of using the AF for the database vs. something else.

I think that covers it. Thanks for helping guide me in the right direction.

-John
------------------------
Certified LabVIEW Architect

AristosQueue (NI) · ‎09-18-2012

jlokanis wrote:
(BTW: why am I centralizing database communication? Well, I could open a connection each time I call an API and then close it, but there are a lot of calls going on and there is overhead to the open/close operation. Also, the above implementation is one of many instances running on my test machine. There could be 100 or more clones of this actor tree running at once. By limiting each one to a single database handle, I keep the number manageable. Otherwise, I could have 1000s of handles open at once instead of only 100s.)

I want to focus on this option first.

My hand is broken, so I am attempting to type this with Dragon dictation for the first time. Here we go:

If you attempt to create a single actor that represents all of your database accesses, you will end up with a traffic jam at your database. Database software is typically very expert at managing multiple connections simultaneously. You are usurping that role when you have one of your actors play the role of the database. You don't need to open a connection for every individual access to the database. Instead, you need each actor that is going to access the database to open one connection. Yes, you could have 1000 handles open at once, but how many of those are going to be active simultaneously? Multiple open handles are not a problem for most database software. Multiple simultaneous writes are where problems occur. And although in theory all of your actors could attempt to write simultaneously, I suspect that is not the case in practice. Even if writes do occur simultaneously, I suspect those writes are not to the same table.

Before you attempt any architecture that centralizes the database connections, I would see how your database software actually performs in practice with all the connections open. If you do have to go to a central database actor, you're going to need a lot of infrastructure for sending a message to that central actor, having it processed in order, and reporting the results of that message back to the appropriate originating actor. It's going to become very messy.

Completely off-topic, the dictation software works pretty well, but it will need more training.

jlokanis · ‎09-19-2012

Sorry to hear about your hand. I wish you a quick recovery.

Just to flesh out some more details: this will eventually be part of a much larger application. In that application, I will be spawning 100s of copies of this 'tree' of actors within the same app instance. Each of these copies would have its own database interface 'actor', resulting in 100s of simultaneous open connections to the database. If I were to let each actor in the tree have its own connection for its database access purposes, that number would increase by a factor of 10, resulting in 1000s of open connections. Also, since my interface to the database is via stored procedures, I don't have direct knowledge of what tables are being read or written internally when I call one of these APIs. Additionally, each 'tree' of actors operates under the context of a test session ID, which is global to that tree and required as a parameter for nearly all database transactions. So, each actor that wanted to execute a transaction would need to know what that ID value was. Since this ID would be returned by the database during an initialization process in a single actor, passing it to all other actors seems like a way of creating additional coupling that I would like to avoid. The database 'actor', on the other hand, could store this value locally and apply it as needed, since it is the entity that retrieved it in the first place. Finally, I am uncertain how the retry mechanism would work with multiple simultaneous actors executing database transactions for the same session ID. It seems to me that each actor would need to have an intimate understanding that it is accessing a database, that it needs to trap and retry errors, and that it must shut down in the case of a catastrophic error.

However, if I were to not use an actor for database access, how would I go about achieving my goals? Here is one possible way:

I could create a class for the database API. That class could store a database connection handle as private data. It could also store the test session ID as private data. I could store the instance of the class in the private data of each actor that requires database access. I would need to create a constructor method that would read my database parameters from my config file and open the connection. I would also need a destructor to clean up the connection. Each actor would be responsible for calling these at initialization. (or the parent actor could do this prior to launching the actor).

One downside I see is database transaction issues could lock up an actor, preventing it from being shut down. The actor could be stuck in a retry loop or external lockup (due to the .NET api that we use to get to the database) and would not be responsive to other messages (even the stop message). Another downside is I don't see a clean way to pass the Test Session ID from the actor that retrieves it to all the other actors that need to access the database. They will all have their own instance of the database class and no visibility to each other’s private data. Any mechanism to work around this (like storing the ID in the private data of the top level parent) seems to increase coupling. Maybe that is bad, maybe not. Is there a better way? Keep in mind that this ID is global only to the instance of this tree of actors. I need to solve this issue because there will be a lot more data values that fall into this category and I want a clean mechanism to share them within the 'tree'. I could see my parent private data becoming quite large.

So, perhaps you are right and it does not make sense for the database interface to be an actor. I suppose that since it is a QSM now, I just figured I would make it an actor when refactoring.

Anyone have any arguments to support one path or the other?

Thanks for the help!

-John
------------------------
Certified LabVIEW Architect

jlokanis · ‎09-26-2012

What about constructing the database object in the parent, opening the connection as part of initialization before starting any child actors, and then storing the database object as part of the private data of the parent, so all children can have access to it to call methods? The session ID can be part of the private data of the database object.

I think that will work. Not sure what pattern that is exactly.
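
Roughly, in Python terms (not LabVIEW), the idea looks like the sketch below. It uses the standard-library `sqlite3` module purely as a stand-in for the real database; the table layout, the `Database` class, and the lock that serializes access from the children are all assumptions added for illustration.

```python
import sqlite3
import threading


class Database:
    """Owns one connection plus the test session ID (hypothetical schema)."""

    def __init__(self, path=":memory:"):
        # check_same_thread=False lets child threads reuse this connection;
        # the lock below serializes their access to it.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()
        self._session_id = None
        self._conn.execute(
            "CREATE TABLE sessions (id INTEGER PRIMARY KEY, dut TEXT)")
        self._conn.execute(
            "CREATE TABLE results (session_id INTEGER, name TEXT, value REAL)")

    def start_session(self, dut):
        """Register the DUT and remember the session ID the database returns."""
        with self._lock:
            cur = self._conn.execute(
                "INSERT INTO sessions (dut) VALUES (?)", (dut,))
            self._conn.commit()
            self._session_id = cur.lastrowid
        return self._session_id

    def log_result(self, name, value):
        """Children never pass the session ID; the object applies it."""
        with self._lock:
            self._conn.execute(
                "INSERT INTO results VALUES (?, ?, ?)",
                (self._session_id, name, value))
            self._conn.commit()

    def count_results(self):
        with self._lock:
            row = self._conn.execute(
                "SELECT COUNT(*) FROM results WHERE session_id = ?",
                (self._session_id,)).fetchone()
        return row[0]

    def close(self):
        with self._lock:
            self._conn.close()


def child(db, name):
    """A child 'actor' that only logs one result through the shared object."""
    db.log_result(name, 1.0)


def run_parent():
    db = Database()                      # parent opens the connection first
    db.start_session("DUT-001")
    children = [threading.Thread(target=child, args=(db, f"test{i}"))
                for i in range(5)]
    for t in children:
        t.start()
    for t in children:
        t.join()
    logged = db.count_results()
    db.close()                           # parent cleans up after the children
    return logged
```

Note that the lock means a slow call from one child still holds up the others, which is the same traffic-jam concern raised earlier in the thread.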

-John
------------------------
Certified LabVIEW Architect
