Possible strange bug with parallel loops in LV2010

I have a specific piece of code which does not execute correctly in parallel, but making almost any small change to it causes it to work.  Even more bizarre, I can change it so that it sometimes works correctly, and sometimes doesn't. Interested?  Read on...

 

I have a relatively simple VI (to compute 2D Wavelet Denoising on planes of a 3D array) with the following block diagram (debugging not allowed):

ParallelBug.png

The parallel loop is set to 8 automatically-partitioned instances, and I've tested this on 2-core and 8-core machines.  When the #cores wired into the loop is not one, several of the iterations do not compute (for example, 4 out of 24).  I can check this by putting a breakpoint in the VI called inside the loop and counting how many times it is called.
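To make the symptom concrete, here is a rough Python sketch (not the actual LabVIEW code; all names are hypothetical) of what the parallelized loop is supposed to do: hand each z-plane of a 3D array to a denoising subVI and count how many iterations actually execute, like a breakpoint hit-counter.

```python
# Hypothetical Python analogue of the parallelized For loop: process each
# z-plane of a 3D array in parallel and count completed iterations.
import threading
from concurrent.futures import ThreadPoolExecutor

def denoise_plane(plane):
    # Stand-in for the 2D wavelet-denoising subVI: add 1 to every element.
    return [[v + 1 for v in row] for row in plane]

def process(volume, n_workers=8):
    out = list(volume)            # working buffer, one entry per z-plane
    count = 0
    lock = threading.Lock()

    def work(z):
        nonlocal count
        out[z] = denoise_plane(volume[z])
        with lock:                # count completed iterations, analogous to
            count += 1            # a breakpoint hit-count in the subVI

    with ThreadPoolExecutor(max_workers=n_workers) as ex:
        list(ex.map(work, range(len(volume))))
    return out, count

volume = [[[0.0] * 4 for _ in range(4)] for _ in range(24)]
result, n = process(volume)
print(n)  # a correct run executes all 24 iterations
```

In this analogue `n` is always 24; the bug described above is that the LabVIEW loop sometimes completes fewer iterations than it was given.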

 

I can do a number of things to make all iterations run:

- put an indicator inside the loop on the iteration number, or anything else (except the P node)

- wire the iteration number outside the loop to an indicator

- remove the case structure

- remove the loop in the other case (at the moment it's identical except that it calls a different VI, which uses the Discrete WT instead of the Undecimated WT -- if I change the other case to match this one, the other case runs fine but this one still fails)

- change code outside the loop

- turn on debugging!

- change the loop setup to a number of instances other than 8

- change the loop setup to set the number of partitions

 

I can also do a number of things that do not solve the problem:

- rewriting the loop from scratch

- copying the loop from the other case

- saving for LV2009 (in which everything runs as expected) and then reloading (and presumably recompiling) in 2010

 

Stranger still, if I replace the called VI with a dummy VI (which simply adds one to the input 2D array so I can tell whether it executes), then as long as the other wires are still wired in, it sometimes executes all iterations and sometimes doesn't (roughly a 50% split)!!!  Any other change and it always works fine.

 

I've managed to replace as much as possible with dummy code and still keep the results the same (i.e. sometimes executing all iterations and sometimes not).

LoopTest.png

If you want to check it out (on a multicore machine with LV2010), run LoopTest.vi; the result at the index shown should sometimes be 13 and sometimes 14. Changing #cores to 1 always gives a result of 14, and changing WT to DWT always gives 14, even though the code inside both cases is identical!

 

Hope it's not just me, otherwise I'll be sure that LV hates me!

Message 1 of 11

I thought you couldn't have shift registers on a parallel for loop. Shouldn't that break your Run Arrow? Shift registers don't make sense on a parallel for loop because you can't control in what order the loop iterations execute.

Jarrod S.
National Instruments
Message 2 of 11

@jarrod S. wrote:

I thought you couldn't have shift registers on a parallel for loop. Shouldn't that break your Run Arrow? Shift registers don't make sense on a parallel for loop because you can't control in what order the loop iterations execute.


Shift registers appear to be fine, as long as the same index is accessed and replaced (as in this code) -- such an arrangement shouldn't conflict with a variable order of execution. The Run Arrow is broken, however, if the indices don't match.  I'm using the shift registers here to ensure that memory is not duplicated for what can be very large arrays.
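For readers less familiar with the pattern, here is a small Python sketch (hypothetical names, not LabVIEW) of why indexing and replacing the *same* plane of a shared buffer is order-independent: each iteration writes a disjoint index, so no two iterations touch the same element.

```python
# Sketch of the shift-register pattern: each parallel iteration indexes
# plane i out of a shared buffer, processes it, and replaces the same
# plane. Written indices are disjoint, so iteration order cannot matter.
from concurrent.futures import ThreadPoolExecutor

buf = [[[float(i)] * 2 for _ in range(2)] for i in range(24)]
expected = [[[float(i) + 1] * 2 for _ in range(2)] for i in range(24)]

def iteration(i):
    plane = buf[i]                                     # "Index Array"
    buf[i] = [[v + 1 for v in row] for row in plane]   # "Replace Array Subset"

with ThreadPoolExecutor(max_workers=8) as ex:
    list(ex.map(iteration, range(24)))

assert buf == expected  # correct for any ordering of the 24 iterations
```

This also illustrates the memory point: the buffer is updated in place rather than copied per iteration, which is the reason for the shift registers with large arrays.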

 

Message 3 of 11

Greg,

 

What does the 13 or 14 represent?  Is it possible for you to simplify the example so we can confirm that the number of loop iterations changes without having to know or dig through your application-specific code?

 

Regards,

 

Sam Kristoff

Applications Engineer

National Instruments

Message 4 of 11

Sammy --

 

The code I've attached is as simple as I can make it while still retaining the problems mentioned - hopefully there's not much to actually dig through.  I'll try to describe what happens:

 

The 3D array "Image (z,y,x)" has a value of 13 at index [6,0,0].  The VI called inside the parallel For loop simply adds one to the array passed to it.  So if the parallel loop runs correctly, the output value (i.e. the 3D array "Denoised (z,y,x)") at index [6,0,0] should be 14. If it doesn't run correctly, then subarray [6,*,*] is one of those that is not executed (on two of my machines, anyway), and so the output value is the same as the input, i.e. 13.
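The diagnostic can be sketched in a few lines of Python (hypothetical names; the real check is done on the VI's front panel): seed one element, run the add-one step over every plane, then inspect that element.

```python
# Python sketch of the 13-vs-14 diagnostic from the attached VI:
# seed element [6][0][0], add one to every plane, inspect it afterward.
def add_one_to_plane(plane):
    return [[v + 1 for v in row] for row in plane]

image = [[[0.0] * 4 for _ in range(4)] for _ in range(8)]
image[6][0][0] = 13.0

denoised = [add_one_to_plane(p) for p in image]  # every plane executed

print(denoised[6][0][0])  # 14.0 when plane 6 runs; a skipped plane would
                          # leave the input value 13.0 in the output
```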

 

All the other code is just dummy routines, though if I change or simplify any of it further, the parallel loop starts running correctly.  For example, if I remove any of the other wires to the VI that adds one, things work OK.

 

Thanks for having a look at it.

Cheers ~ Greg

 

Message 5 of 11

Here's a new screenshot that might help (just new icons for the subVIs):

 

LoopTest_BD.png

Message 6 of 11

OK, here's a new version that exhibits the same behaviour, but is hopefully simpler to understand.

ParallelLoop_BD.png

 

The expected result is to add one to the whole array; however, sometimes not all of the parallel iterations execute.  Run ParallelLoopTest.vi from the attached ZIP file on a multi-core machine to verify.  This screenshot shows that sometimes it works correctly, and sometimes not:

ParallelLoopTest.png

 

The code in the other case is identical (created by duplicating the case) but always works correctly.

Message 7 of 11

On my Mac (two quad-core processors) both cases always produce the correct result.  This could be due to recompiling for a different platform.

 

Lynn

Message 8 of 11

Thanks, Lynn.  I've now replicated the bug on Win32 2-core, Win32 8-core, and Win64 4-core machines with LV 2010f4.

Message 9 of 11

The issue reproduces on my quad-core Windows machine. This was reported to R&D (# 278100) for further investigation. Thanks for bringing this to our attention.

Message 10 of 11