Interpolate Pixel


@Andrey_Dmitriev wrote:

@paul_a_cardinale wrote:

To be extra fancy, it should take into account gamma (that would probably slow it down to a crawl).


Interesting, what exactly do you mean? I don't have much experience with color images, so my idea could be wrong, but would 2D interpolation in a color model other than RGB, such as HSL, help?


In the human visual system, perceived brightness is approximately proportional to the cube root of light intensity.  In order to make the best use of the available dynamic range, digital cameras use a non-linear transfer function.  Probably each brand has its own proprietary function; but they should roughly approximate the human visual system.  Thus, with an 8-bit value, 127 should represent a light level that looks half as bright as the light level represented by 255; which is to say a light level of 1/8 as much as the 255 light level.  Many image editing programs (even expensive ones) don't take this into account, with the effect that downscaling an image causes it to darken.  For example, the average of black (0) and white (255) should be 202, not 127.
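To make the arithmetic above concrete, here is a minimal Python sketch (the thread itself is LabVIEW, so this is only an illustration). It assumes the pure cube-root model described in the post; real camera and display encodings use their own curves, and the function names are hypothetical.

```python
# Cube-root model from the post: code value is proportional to light ** (1/3).
# This is an approximation for illustration, not any camera's actual curve.

def code_to_light(code: float, max_code: float = 255.0) -> float:
    """Relative light level (0..1) represented by a code value."""
    return (code / max_code) ** 3


def light_to_code(light: float, max_code: float = 255.0) -> float:
    """Code value that represents a relative light level (0..1)."""
    return max_code * light ** (1.0 / 3.0)


def blend_naive(a: float, b: float) -> float:
    """Average the numbers directly, ignoring the transfer function."""
    return (a + b) / 2.0


def blend_light(a: float, b: float) -> float:
    """Average the light levels, then convert back to a code value."""
    return light_to_code((code_to_light(a) + code_to_light(b)) / 2.0)


print(round(code_to_light(127), 4))   # 0.1235, i.e. roughly 1/8 of full light
print(blend_naive(0, 255))            # 127.5
print(round(blend_light(0, 255)))     # 202
```

The naive average lands at a code that represents only about 1/8 of the light, which is why downscaled images come out darker.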

Message 21 of 91

I always thought that the display devices take care of gamma correction, but this is definitely not my field of expertise. 😄

 

(Most monitors have a gamma adjustment available. The same goes for Windows itself.)

 

Message 22 of 91

@altenbach wrote:

I always thought that the display devices take care of gamma correction, but this is definitely not my field of expertise. 😄

 

(Most monitors have a gamma adjustment available. The same goes for Windows itself.)

 


Yes, they do.  So a display will present a value of 127 with about 1/8 of the light level of a value of 255.

The problem is that when you blend pixels, you want to blend light levels.  But the numeric values are not proportional to the light levels.  So just blending the numbers gives inaccurate results.

Message 23 of 91

@paul_a_cardinale wrote:


In the human visual system, perceived brightness is approximately proportional to the cube root of light intensity.  In order to make the best use of the available dynamic range, digital cameras use a non-linear transfer function. 


Thank you very much! In general, we use medical grayscale monitors in DICOM mode for viewing industrial X-ray images. These monitors must meet the dedicated Grayscale Standard Display Function, which dictates the luminance (measured in cd/m²) that must be emitted at each Digital Driving Level on the monitor. They are pre-adjusted and perform self-calibration from time to time. I had never really thought about it before, but with your simple explanation I understand this characteristic curve much better. An interesting fact is that on these monitors the human eye can easily recognize even a single-level difference (such as between 200 and 201, when the areas are large enough or on smooth gradients). Therefore we can display the image at 10 bits, and the banding effect disappears, because we have 1024 gray levels (or 1024 levels per color channel on a color monitor) on the screen simultaneously.

Message 24 of 91

@altenbach wrote:

Replacing the detour via DBL multiplication with a LUT approach gives me about a 3-7x speedup over the original code.

(Since the LUT is quantized to U8 values, it does not really need to be very big.)

 

I'll attach some code later.

 

altenbach_0-1713459364110.png

 

 

For example, the following 8x expansion takes about 100 microseconds on my laptop (same result as yours). I am sure there is some slack left:

 

altenbach_1-1713459690943.png

 

The LUT constant is created as follows:

 

altenbach_0-1713464641823.png

 

 


It took me a while to realize why the final compound add works with a packed triplet of bytes.

In the code to create the LUT, how did you get the multiply function to output U8s?

Message 25 of 91

@paul_a_cardinale wrote:

 

In the code to create the LUT, how did you get the multiply function to output U8s?


Right-click the node, select Properties, then Output Configuration.

Message 26 of 91

 


What I like about the LUT approach in Message 16 is that parallelization can apparently be turned on for the two nested For Loops, and it appears to be significantly faster than without it.

 

alexderjuengere_8-1713531164922.png

 

 


@altenbach wrote:

@paul_a_cardinale wrote:
To be extra fancy, it should take into account gamma (that would probably slow it down to a crawl).

In my code, all you probably need is a nonlinear LUT to take gamma into account, but I have not tried that.


- gamma correction is often just an implementation of Output = Input**(x), where ** means "to the power of"
- we use this to brighten or darken images, so a user can better interpret an image

- So, in a very naive way, I would just put in the following, with the exponent y limited to the range 1 to +Inf in order to "darken" the image

alexderjuengere_0-1713531892959.png

 

- values between 0 and 1 would "brighten" the image, but you have to take extra care that the maximum value does not overflow, and I can't remember at the moment how I have done this before
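One way to sidestep the overflow, sketched in Python rather than LabVIEW, is to normalize the input to 0..1 before raising it to the power and to scale back afterwards. Any positive exponent then maps 0..1 onto 0..1, so the result always fits in a U8. This is only an assumption about how it could be done, not a reconstruction of the attached VI; `gamma_lut` and `apply_lut` are hypothetical names.

```python
def gamma_lut(exponent: float, max_code: int = 255) -> list:
    """Build a LUT for Output = Input ** exponent on normalized values.

    exponent > 1 darkens, 0 < exponent < 1 brightens.
    """
    if exponent <= 0:
        raise ValueError("exponent must be positive")
    return [
        round(max_code * (code / max_code) ** exponent)
        for code in range(max_code + 1)
    ]


def apply_lut(pixels, lut):
    """Map every 8-bit pixel value through the LUT."""
    return [lut[p] for p in pixels]


darken = gamma_lut(2.0)
brighten = gamma_lut(0.5)
print(apply_lut([0, 64, 128, 255], darken))
print(apply_lut([0, 64, 128, 255], brighten))
```

Because the curve depends only on the 8-bit input, a 256-entry LUT is enough and no per-pixel power function is needed.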

 

alexderjuengere_3-1713530938455.png alexderjuengere_5-1713531052398.png  
alexderjuengere_4-1713530950736.png alexderjuengere_6-1713531101063.png  
  alexderjuengere_7-1713531137693.png  


 

 

Attached as LabVIEW 2020.

Message 27 of 91

@altenbach wrote:

@paul_a_cardinale wrote:
To be extra fancy, it should take into account gamma (that would probably slow it down to a crawl).

In my code, all you probably need is a nonlinear LUT to take gamma into account, but I have not tried that.


That was my initial thought too. But it seems to be more complicated than that. First the gamma has to be backed out, then the weighted average taken, then the gamma reapplied. I think that would take 3 LUTs, and my guess is that they would have to be wider than 8 bits in order to prevent quantization artifacts.
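A rough Python sketch of that idea, under several assumptions: the cube-root model from earlier in the thread, a 16-bit linear intermediate, and an integer multiply for the weighting instead of a third LUT. None of this is taken from the LabVIEW code; all names and sizes are hypothetical.

```python
LINEAR_MAX = 65535  # 16-bit linear-light intermediate (an assumption)

# LUT 1: back out the gamma, 8-bit code -> 16-bit linear light.
DECODE = [round(LINEAR_MAX * (c / 255) ** 3) for c in range(256)]

# LUT 2: reapply the gamma, 16-bit linear light -> 8-bit code.
ENCODE = bytes(
    round(255 * (v / LINEAR_MAX) ** (1.0 / 3.0)) for v in range(LINEAR_MAX + 1)
)


def interpolate(a: int, b: int, weight_256: int) -> int:
    """Blend codes a and b in linear light.

    weight_256 is the weight of b in 1/256 steps (0 = all a, 256 = all b).
    """
    linear = (DECODE[a] * (256 - weight_256) + DECODE[b] * weight_256 + 128) >> 8
    return ENCODE[linear]


print(interpolate(0, 255, 128))  # 202
print(DECODE[:8])                # the darkest codes collapse: [0, 0, 0, 0, 0, 0, 1, 1]
```

Even at 16 bits, codes 0 through 5 all decode to 0 with a pure cube law, which is the kind of quantization artifact to watch for. A wider intermediate avoids that, but makes the re-encoding table correspondingly larger.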

Message 28 of 91

@alexderjuengere wrote:

- gamma correction is often just an implementation of Output = Input**(x), where ** means "to the power of"
- we use this to brighten or darken images, so a user can better interpret an image

But we're not talking about applying a gamma.

I want to interpolate pixels.  But simply interpolating the tristimulus values is not accurate because those values are not directly proportional to the light levels.   The gamma must be backed out, the interpolation performed, then the gamma reapplied; and this must be done for each pixel.
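As a minimal illustration of those three steps applied per pixel and per channel, here is a floating-point Python sketch. It assumes a simple power-law transfer function with exponent 3 (the cube-root model from earlier in the thread); real encodings such as sRGB use a different curve. The function names are hypothetical, and speed is ignored entirely.

```python
GAMMA = 3.0  # assumed exponent, matching the cube-root model in this thread


def to_light(code: int) -> float:
    """Back out the gamma: 8-bit code -> relative light level (0..1)."""
    return (code / 255.0) ** GAMMA


def to_code(light: float) -> int:
    """Reapply the gamma: relative light level -> 8-bit code."""
    return round(255.0 * light ** (1.0 / GAMMA))


def interpolate_pixel(p0, p1, t: float):
    """Interpolate two (R, G, B) pixels at fraction t (0..1) in linear light."""
    return tuple(
        to_code((1.0 - t) * to_light(c0) + t * to_light(c1))
        for c0, c1 in zip(p0, p1)
    )


def expand_row(row, factor: int):
    """Linearly upscale one row of pixels by an integer factor."""
    out = []
    for left, right in zip(row, row[1:]):
        for step in range(factor):
            out.append(interpolate_pixel(left, right, step / factor))
    out.append(row[-1])
    return out


print(expand_row([(0, 0, 0), (255, 255, 255)], 2))
# [(0, 0, 0), (202, 202, 202), (255, 255, 255)]
```

Interpolating the raw values instead would give a midpoint of about 127 per channel, which represents far less than half the light.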

Message 29 of 91

@alexderjuengere wrote:

 


What I like about the LUT approach in Message 16 is that parallelization can apparently be turned on for the two nested For Loops, and it appears to be significantly faster than without it.

 

alexderjuengere_8-1713531164922.png

 


I would strongly recommend parallelizing only the outer FOR loop. Once every core works on one of the inner loop stacks, there is nothing left to further parallelize.

The parallelization overhead is smallest on the outermost loop, but we still need to carefully test everything to make sure it is really faster. For example, parallelization creates many copies of the LUT, which puts more pressure on the CPU cache.

(For example, during testing, I had a version that did the three colors in a parallel FOR loop and it was an order of magnitude slower than the unrolled version.)

 

As a further comment, you can eliminate the local variable and sequence frame in the caller by just enabling "clear indicators when called".

 

altenbach_0-1713537170733.png

 

 

 

 

 

Message 30 of 91