How should I specify the operation range of MatrixOp
Hi,
I am trying to compute the following: a point-wise multiplication of parts of two waves, followed by the mean of the resulting wave.
Say, for example, wave1 and wave2 both have 1000 points. I want to do a point-wise multiplication like resultWave[m] = wave1[m]*wave2[m+offset] and then return mean(resultWave), where offset is supplied by an outer function. Below is my attempt to do this with MatrixOp + DeletePoints.
While it works, the Duplicate and DeletePoints calls are unavoidably repeated, and I suspect they consume quite a bit of computing time.
Is there a way I can tell MatrixOp which range of the wave to operate on?
Otherwise, could you suggest a better way to implement the same thing?
Thanks a lot in advance for your advice.
function covariance_atOffset(offset, wave1, wave2)
	variable offset
	wave wave1, wave2

	variable output = 0
	duplicate/O wave1, wave1temp
	duplicate/O wave2, wave2temp
	variable waveSize1 = numpnts(wave1)
	variable waveSize2 = numpnts(wave2)
	variable inner_offset
	variable i
	if( offset >= 0 )
		// positive offset
		inner_offset = offset
		// for(i=0; i<waveSize1-inner_offset; i+=1)
		//	output += wave1[i] * wave2[i+inner_offset]
		// endfor
		deletePoints 0, abs(inner_offset), wave2temp
		deletePoints waveSize1-abs(inner_offset), inner_offset, wave1temp
		MatrixOp/O resultWave = wave1temp * wave2temp
		MatrixOp/O theMean = mean(resultWave)
	else
		// negative offset
		inner_offset = offset - 1
		// for(i=0; i<waveSize1-inner_offset; i+=1)
		//	output += wave1[i+inner_offset] * wave2[i]
		// endfor
		deletePoints 0, abs(inner_offset), wave1temp
		deletePoints waveSize2-abs(inner_offset), abs(inner_offset), wave2temp
		MatrixOp/O resultWave = wave1temp * wave2temp
		MatrixOp/O theMean = mean(resultWave)
	endif
	// killwaves/Z wave1temp, wave2temp, resultWave
	return theMean[0]
	// return output/(waveSize1-inner_offset-1)
end
function outer(...)
	...
	...
	Covar = covariance_atOffset(p-PlusMinusLags, wave1, wave2)
end
I tried writing a function myself which produces the same output as the above code, but using only MatrixOp and without generating or deleting any data. I am not really sure what your code is for, so I just tried to do the same things as the code above. Here is my result (disclaimer: I don't have much experience with MatrixOp, so there might be a much more efficient method).
// function name is illustrative
function covariance_atOffset_MatrixOp(offset, wave1, wave2)
	variable offset
	wave wave1, wave2

	variable inner_offset = offset >= 0 ? abs(offset) : abs(offset-1)
	if( offset >= 0 )
		MatrixOp/FREE theMean = mean( subRange(wave1,0,numPoints(wave1)-1-inner_offset,0,0) * subRange(wave2,inner_offset,numPoints(wave2)-1,0,0) )
	else
		MatrixOp/FREE theMean = mean( subRange(wave2,0,numPoints(wave2)-1-inner_offset,0,0) * subRange(wave1,inner_offset,numPoints(wave1)-1,0,0) )
	endif
	return theMean[0]
end
See my use of subRange to extract the relevant parts of the wave, instead of using deletePoints.
August 21, 2018 at 12:13 am - Permalink
Thanks for the code. Interestingly, your code actually performs a little worse than my original code, even though it doesn't use any Duplicate/DeletePoints calls.
On testing:
My code took 0.15 sec, and your code took 0.2 sec.
I have attached the complete experiment file if you would like to take a look.
August 21, 2018 at 08:29 am - Permalink
Ok, I see. It seems subRange gets too involved, since it is more powerful than deletePoints. With your current approach I can get a slight time bonus by combining the MatrixOp calls:
MatrixOp/FREE theMean = mean(wave1temp * wave2temp)
But you can get rid of the MatrixOp calls altogether. I realized that what you are actually doing here is calculating Pearson's correlation coefficient. There is a built-in function for that called StatsCorrelation (it does not give you much of a speed boost, though):
// function name is illustrative
function covariance_atOffset_Stats(offset, wave1, wave2)
	variable offset
	wave wave1, wave2

	duplicate/O wave1, wave1temp
	duplicate/O wave2, wave2temp
	if( offset >= 0 )
		deletePoints 0, offset, wave2temp
		deletePoints numpnts(wave1)-offset, offset, wave1temp
	else
		deletePoints 0, abs(offset-1), wave1temp
		deletePoints numpnts(wave2)-abs(offset-1), abs(offset-1), wave2temp
	endif
	return StatsCorrelation(wave1temp, wave2temp)
end
August 21, 2018 at 07:37 pm - Permalink
I found a way to get rid of the DeletePoints commands, which shaves off yet another small amount of time: why not duplicate just the range of interest in the first place (with the /R flag for a subrange)? Here's my optimized version:
function covariance_atOffset_opt(offset, wave1, wave2)
	variable offset
	wave wave1, wave2

	variable last1 = numpnts(wave1)-1
	variable last2 = numpnts(wave2)-1
	if( offset >= 0 )
		duplicate/O/R=[0, last1-offset] wave1, wave1temp
		duplicate/O/R=[offset, last2] wave2, wave2temp
	else
		duplicate/O/R=[abs(offset-1), last1] wave1, wave1temp
		duplicate/O/R=[0, last2-abs(offset-1)] wave2, wave2temp
	endif
	return StatsCorrelation(wave1temp, wave2temp)
end
August 21, 2018 at 10:57 pm - Permalink
You should use duplicate/free in the above unless you want the temp waves to stick around.
August 22, 2018 at 10:08 am - Permalink
Hello Sandbo,
You might want to try to use MatrixOP with rotateRows(wave,offset) which will allow you to keep the fast multiplication as long as you make sure to remove the edge effects from your results. That can be done via subRange() or Duplicate/R.
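In case it helps, here is a minimal sketch of what such a rotateRows-based version might look like for a non-negative offset (untested; the function name is made up for illustration, and the sign convention of the rotation should be double-checked against the MatrixOp help):

```igor
// Sketch only: circularly rotate wave2 so that point i lines up with
// wave1[i+offset], then average the product, excluding the wrapped-around
// "edge" points at the end of the wave.
function covariance_atOffset_rotate(offset, wave1, wave2)	// offset >= 0
	variable offset
	wave wave1, wave2

	variable last = numpnts(wave1) - 1 - offset
	// the sign of the rotation may need flipping for the desired lag direction
	MatrixOp/FREE prod = wave1 * rotateRows(wave2, -offset)
	MatrixOp/FREE theMean = mean(subRange(prod, 0, last, 0, 0))
	return theMean[0]
end
```

A negative offset would be handled analogously, trimming the edge points from the start of the product instead of the end.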
A.G.
August 22, 2018 at 10:54 am - Permalink
In reply to You should use duplicate… by Larry Hutchinson
The problem is that Duplicate/FREE gives a rather noticeable time penalty, and it seems Sandbo is aiming for quick calculations. Also, while the rotateRows suggestion from Igor works, it is slower, too.
August 22, 2018 at 07:04 pm - Permalink
The only reason the non-free version might be faster is if the temp waves are reused, for example if the function above is called in a loop.
August 23, 2018 at 08:57 am - Permalink
Thanks for all the help above, I really appreciate it.
I just realized I never mentioned the motivation:
The code was written to calculate the correlation between two sets of time-series samples. The inner loop performs the correlation computation, while the outer loop applies a time difference between the two data sets digitally. This captures the temporal correlation as if the data had been captured at a different time, and also catches correlations whose maximum is away from zero lag due to jitter in the digitization.
One reason I want to make this fast is that the correlation is computed while data is being captured. Say we have 1 sec of acquisition time; then we can overlap this computation with the next 1 sec acquisition. Having the correlation calculated within 1 sec (or ideally less) gives us "zero" dead time for the overall data acquisition and analysis. In reality, possibly 4 such correlations or more will have to be done, so with times of order 0.1 sec I am on the edge (there is other processing, too).
As we compute the correlation for (number of lags) offsets, it can be intensive, so I am trying to multithread it. My supervisor actually first wrote it as a multi-threaded C function in an XOP, so this is an attempt at an Igor-only version to increase portability. At the very end, I might use a GPU to compute this part, as it should be much faster given that only a single data transfer is needed for (number of lags) element-wise multiplications and additions.
August 23, 2018 at 09:17 am - Permalink
You can use simple multi-threading within Igor, as both Duplicate and StatsCorrelation are 'threadsafe' functions. Just declare your function as ThreadSafe:
ThreadSafe function covariance_atOffset_opt(offset, wave1, wave2)
	variable offset
	wave wave1, wave2

	variable last1 = numpnts(wave1)-1
	variable last2 = numpnts(wave2)-1
	if( offset >= 0 )
		duplicate/O/R=[0, last1-offset] wave1, wave1temp
		duplicate/O/R=[offset, last2] wave2, wave2temp
	else
		duplicate/O/R=[abs(offset-1), last1] wave1, wave1temp
		duplicate/O/R=[0, last2-abs(offset-1)] wave2, wave2temp
	endif
	return StatsCorrelation(wave1temp, wave2temp)
end
Then you can call your covariance calculation with the MultiThread prefix to (hopefully) get a speed-up:
MultiThread Covar = covariance_atOffset_opt(p-PlusMinusLags,wave1,wave2)
With your test experiment and on my machine this gave a 50% speed-up, which may be enough to push the calculation below the 0.1 sec mark on your machine. Just note that you are limited in what you can do inside a threadsafe function. You may want to read more about it here:
DisplayHelpTopic "ThreadSafe Functions"
Also, note that your (and my) current approach does not work with delta shift values (declared in your function as dx) other than integers. If you want to calculate the covariance with sub-point shift values (though there should not be much reason to do so), you need to think of a different approach.
August 23, 2018 at 09:46 pm - Permalink
I ran the original code in the experiment on my Windows machine using Igor 8.01 and got similar results to Sandbo (0.146 seconds). I then tried chozo's version and got about a 50% speed up (0.073 seconds). This is on a machine with 8 cores (16 threads). Igor's ThreadProcessorCount function returns 16 on my machine.
I then tried using the /NT flag with the MultiThread keyword (this is new to Igor 8). It seems that on my machine, I get the best performance if I set /NT=4 (0.047 seconds). You may or may not get similar results on your system, depending on your hardware. Also, if the size of the waves used in the calculation changes, you might find that a different value of /NT gives the best performance. Since the optimal number of threads is based on so many factors and needs to be empirically determined, I wouldn't necessarily recommend that you use the /NT flag in your code, since it's possible it could artificially limit the performance. But if you need highly tuned code, it could come in handy.
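For reference, a sketch of such a call with the thread count pinned, using the names from the earlier posts (/NT requires Igor 8 or later):

```igor
// Limit the MultiThread wave assignment to 4 threads
MultiThread/NT=4 Covar = covariance_atOffset_opt(p - PlusMinusLags, wave1, wave2)
```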
August 24, 2018 at 08:13 am - Permalink
In reply to You can use simple multi… by chozo
Thanks chozo for the multithreaded code; it now runs twice as fast compared to my original code!
In the past I crashed Igor when declaring functions as ThreadSafe, so I didn't really explore it much. It seems I should spend more time learning it.
August 24, 2018 at 09:27 am - Permalink