How should I specify the operation range of MatrixOp

Hi,

I am trying to compute the following: multiplication of part of two waves --> mean of the resultant wave.
Say for example, wave1 and wave2 both have 1000 pnts, I want to do point-wise multiplication like resultWave[m] = wave1[m]*wave[m+offset]; return mean(resultWave), where offset will be called by another outer function. The below is my attempt to use MatrixOp + deletePoints to do so.

While it works, the duplicate wave and deletePoints were unavoidably repeated and I guess they have used quite a bit of computing power.
Is there a way I can use MatrixOp by letting it know which range of the wave I want it to operate on?

Otherwise, could you suggest a better way to implement the same?
Thanks a lot in advance for your advice.

function covariance_atOffset(offset, wave1, wave2)
    wave wave1, wave2
    variable offset
   
    variable output=0
    duplicate/o wave1, wave1temp
    duplicate/o wave2, wave2temp
    variable waveSize1 = numpnts(wave1)
    variable waveSize2 = numpnts(wave2)
    variable inner_offset

    variable i
    if( offset >= 0)
        //positive offset
        inner_offset = offset
//      for(i=0;i<waveSize1-inner_offset;i++)
//          output += wave1[i] * wave2[i+inner_offset]
//          endfor
        deletePoints 0,abs(inner_offset),wave2temp
        deletePoints waveSize1-abs(inner_offset), inner_offset,wave1temp
        MatrixOp/o resultWave = wave1temp*wave2temp
        MatrixOp/o theMean = mean(resultWave)
        else
        //negative offset
        inner_offset = Offset-1
//      for(i=0;i<waveSize1-inner_offset;i++)
//          output += wave1[i+inner_offset] * wave2[i]
//          endfor
        deletePoints 0,abs(inner_offset),wave1temp
        deletePoints waveSize2-abs(inner_offset), abs(inner_offset),wave2temp
        MatrixOp/o resultWave = wave1temp*wave2temp
        MatrixOp/o theMean = mean(resultWave)
        endif
   
    //killwaves/z   wave1temp, wave2temp, resultWave
    return theMean[0]
    //return output/(waveSize1-inner_offset-1)
   
end

function outer(...)
        ...
        ...
        Covar = covariance_atOffset(p-PlusMinusLags,wave1,wave2)

end

 

I tried myself at writing a function which produces the same output as above code, but with using only MatrixOp and without generating or deleting any data. I am not really sure what your code is for, so I just tried to do the same things as in above code. Here is my result (disclaimer: I have not much experience in using MatrixOp, so there might be a much more efficient method).

function covariance_atOffset_MOp(offset, wave1, wave2)
    wave wave1, wave2
    variable offset
   
    variable inner_offset = offset >= 0 ? abs(offset) : abs(offset-1)
    if( offset >= 0)
        MatrixOp/FREE theMean = mean( subRange(wave1,0,numPoints(wave1)-1-inner_offset,0,0) * subRange(wave2,inner_offset,numPoints(wave2)-1,0,0) )
    else
        MatrixOp/FREE theMean = mean( subRange(wave2,0,numPoints(wave2)-1-inner_offset,0,0) * subRange(wave1,inner_offset,numPoints(wave1)-1,0,0) )    
    endif
   
    return theMean[0]
end

See my use of subRange to extract the relevant parts of the wave, instead of using deletePoints

Thanks for the code, interestingly your code is actually performing a little worst then my original code, although it didn't use any duplicate/deletePoints.

On testing:

My code took 0.15 sec, and your code took 0.2 sec.
I have attached the complete experiment file if you would like to take a look.

Test experiment file (19.08 MB)

Ok, I see. It seems subRange gets too involved since it is more powerful than deletePoints. In your current approach I can get a slight time bonus by combining the MatrixOp calls:

MatrixOp/FREE theMean = mean(wave1temp * wave2temp)

But you can get rid of the MatrixOp call all together. I realized that what you are actually doing here is calculating the Pearson's correlation coefficient. There is a built-in function for that called StatsCorrelation (this does not give you much of a speed boost though):

function covariance_atOffset_Stats(offset, wave1, wave2)
    wave wave1, wave2
    variable offset

    duplicate/o wave1, wave1temp
    duplicate/o wave2, wave2temp
   
    if( offset >= 0)
        deletePoints 0,offset,wave2temp
        deletePoints numpnts(wave1)-offset, offset, wave1temp
    else
        deletePoints 0,abs(offset-1),wave1temp
        deletePoints numpnts(wave2)-abs(offset-1), abs(offset-1), wave2temp
    endif
 
    return StatsCorrelation(wave1temp,wave2temp)
end

 

I found a way to get rid of the deletePoints command, which shaves off yet another small amount of time. Why not duplicate out just the range of interest in the first place (with the r flag for a subrange). Here's my optimized version:

function covariance_atOffset_opt(offset, wave1, wave2)
    wave wave1, wave2
    variable offset

    variable last1 = numpnts(wave1)-1
    variable last2 = numpnts(wave2)-1
    if( offset >= 0)
        duplicate/o/r=[0,last1-offset] wave1, wave1temp
        duplicate/o/r=[offset,last2] wave2, wave2temp
    else
        duplicate/o/r=[abs(offset-1),last1] wave1, wave1temp
        duplicate/o/r=[0,last2-abs(offset-1)] wave2, wave2temp
    endif

    return StatsCorrelation(wave1temp,wave2temp)
end

 

Hello Sandbo,

You might want to try to use MatrixOP with rotateRows(wave,offset) which will allow you to keep the fast multiplication as long as you make sure to remove the edge effects from your results.  That can be done via subRange() or Duplicate/R.

 

A.G.

 

In reply to by Larry Hutchinson

Larry Hutchinson wrote:

You should use duplicate/free in the above unless you want the temp waves to stick around.

The problem is that the duplicate/free gives a rather noticeable time penalty and it seems Sandbo is aiming for quick calculations. Also, while the suggestion of using rotateRows from Igor works it is slower, too.

The only reason non-free might be faster is if the temp waves are reused; perhaps if the function above is used in a loop.

 

Thanks for all the help above, I really appreciate it.

I just realized I never mentioned the motivation:
The code was written to calculate the correlation between two sets of time-series samples. The inner loop perform the correlation computation, while the outer loop applied a time difference between the two sets of data digitally, this is to capture the temporal correlation as if the data was captured at a different time, as well as to catch correlation which has its maximum off zero lag due to a jitter in the digitization.

One reason I wanted to make this fast because the correlation is computed as data is being captured. So let's say we have 1 sec of acquisition time, then we can overlap this computation with the next 1 sec acquisition. By having this correlation calculated within 1 sec (or ideally less) allows us to have "zero" dead time for the overall data acquisition and analysis. In reality, possibly 4 such correlation or more will have to be done so with time of order 0.1 sec I am on the edge (there are other processing).

As we are computing the correlation for (number of lag) times, it can be intensive so I am trying to multi-threaded it. My supervisor actually initially wrote it as a multi-threaded function in C with XOP, so this is just trying to make a Igor only version to increase the portability. At the very end, I might again use GPU to compute this part as it should be much faster given a single data transfer is needed for (number of lag) element-wise multiplication and additions.

You can use simple multi-threading within Igor, as both Duplicate and StatsCorrelation are 'threadsafe' functions. Just declare your function as ThreadSafe:

ThreadSafe function covariance_atOffset_opt(offset, wave1, wave2)
    wave wave1, wave2
    variable offset

    variable last1 = numpnts(wave1)-1
    variable last2 = numpnts(wave2)-1
    if( offset >= 0)
        duplicate/o/r=[0,last1-offset] wave1, wave1temp
        duplicate/o/r=[offset,last2] wave2, wave2temp
    else
        duplicate/o/r=[abs(offset-1),last1] wave1, wave1temp
        duplicate/o/r=[0,last2-abs(offset-1)] wave2, wave2temp
    endif

    return StatsCorrelation(wave1temp,wave2temp)
end

Then you can call your covariance calculation with the MultiThread prefix to (hopefully) get a speed-up:

MultiThread Covar = covariance_atOffset_opt(p-PlusMinusLags,wave1,wave2)

With your test experiment and on my machine this gave a 50% speed-up. This may be enough to push the calculation below the 0.1 sec mark on your machine. Just note that you are limited in the things you do inside a threadsafe function. You may want to read more about it here:

DisplayHelpTopic "ThreadSafe Functions"

Also, note that your (and my) current approach does not work with delta shift values (declared in your function as dx) other than integer values. If you want to calculate the covariance with sub-point shift values (there should not be much reason to do so though), you need to think of a different approach.

I ran the original code in the experiment on my Windows machine using Igor 8.01 and got similar results to Sandbo (0.146 seconds). I then tried chozo's version and got about a 50% speed up (0.073 seconds). This is on a machine with 8 cores (16 threads). Igor's ThreadProcessorCount function returns 16 on my machine.

I then tried using the /NT flag with the MultiThread keyword (this is new to Igor 8). It seems that on my machine, I get the best performance if I set /NT=4 (0.047 seconds). You may or may not get similar results on your system, depending on your hardware. Also, if the size of the waves used in the calculation changes, you might find that a different value of /NT gives the best performance. Since the optimal number of threads is based on so many factors and needs to be empirically determined, I wouldn't necessarily recommend that you use the /NT flag in your code, since it's possible it could artificially limit the performance. But if you need highly tuned code, it could come in handy.

In reply to by chozo

Thanks chozo for the multithreaded code, it is now running double as fast comparing to my original code!
I had experience in the past crashing Igor when declaring function with thread-safe so I didn't really explore it much, seems like I should spend more time into learning it.