How should I specify the operation range of MatrixOp
Hi,
I am trying to compute the following: a point-wise multiplication of parts of two waves, followed by the mean of the resulting wave.
Say, for example, wave1 and wave2 both have 1000 points. I want to do a point-wise multiplication like resultWave[m] = wave1[m]*wave2[m+offset] and then return mean(resultWave), where offset is supplied by an outer function. Below is my attempt to do this with MatrixOp + DeletePoints.
While it works, the Duplicate and DeletePoints calls are unavoidably repeated, and I suspect they consume quite a bit of computing time.
Is there a way I can tell MatrixOp which range of the wave to operate on?
Otherwise, could you suggest a better way to implement the same thing?
Thanks a lot in advance for your advice.
function covariance_atOffset(offset, wave1, wave2)
	variable offset
	wave wave1, wave2

	variable output = 0
	duplicate/O wave1, wave1temp
	duplicate/O wave2, wave2temp
	variable waveSize1 = numpnts(wave1)
	variable waveSize2 = numpnts(wave2)
	variable inner_offset
	variable i
	if( offset >= 0 )
		// positive offset
		inner_offset = offset
		// for(i=0; i<waveSize1-inner_offset; i+=1)
		//	output += wave1[i] * wave2[i+inner_offset]
		// endfor
		deletePoints 0, abs(inner_offset), wave2temp
		deletePoints waveSize1-abs(inner_offset), inner_offset, wave1temp
		MatrixOp/O resultWave = wave1temp * wave2temp
		MatrixOp/O theMean = mean(resultWave)
	else
		// negative offset
		inner_offset = offset - 1
		// for(i=0; i<waveSize1-inner_offset; i+=1)
		//	output += wave1[i+inner_offset] * wave2[i]
		// endfor
		deletePoints 0, abs(inner_offset), wave1temp
		deletePoints waveSize2-abs(inner_offset), abs(inner_offset), wave2temp
		MatrixOp/O resultWave = wave1temp * wave2temp
		MatrixOp/O theMean = mean(resultWave)
	endif
	// killwaves/Z wave1temp, wave2temp, resultWave
	return theMean[0]
	// return output/(waveSize1-inner_offset-1)
end
function outer(...)
	...
	...
	Covar = covariance_atOffset(p-PlusMinusLags, wave1, wave2)
end
I tried writing a function myself which produces the same output as the above code, but using only MatrixOp and without generating or deleting any data. I am not really sure what your code is for, so I just tried to do the same things as the code above. Here is my result (disclaimer: I don't have much experience with MatrixOp, so there might be a much more efficient method).
// function name is illustrative
function covariance_atOffset_MatrixOp(offset, wave1, wave2)
	variable offset
	wave wave1, wave2

	variable inner_offset = offset >= 0 ? abs(offset) : abs(offset-1)
	if( offset >= 0 )
		MatrixOp/FREE theMean = mean( subRange(wave1,0,numPoints(wave1)-1-inner_offset,0,0) * subRange(wave2,inner_offset,numPoints(wave2)-1,0,0) )
	else
		MatrixOp/FREE theMean = mean( subRange(wave2,0,numPoints(wave2)-1-inner_offset,0,0) * subRange(wave1,inner_offset,numPoints(wave1)-1,0,0) )
	endif
	return theMean[0]
end
See my use of subRange to extract the relevant parts of the wave, instead of using deletePoints.
August 21, 2018 at 12:13 am - Permalink
Thanks for the code. Interestingly, your code actually performs a little worse than my original code, even though it doesn't use any Duplicate/DeletePoints calls.
On testing:
My code took 0.15 sec, and your code took 0.2 sec.
I have attached the complete experiment file if you would like to take a look.
August 21, 2018 at 08:29 am - Permalink
Ok, I see. It seems subRange gets too involved, since it is more powerful than deletePoints. With your current approach I can get a slight time bonus by combining the MatrixOp calls:
MatrixOp/FREE theMean = mean(wave1temp * wave2temp)
But you can get rid of the MatrixOp calls altogether. I realized that what you are actually doing here is calculating Pearson's correlation coefficient. There is a built-in function for that called StatsCorrelation (it does not give you much of a speed boost, though):
// function name is illustrative
function covariance_atOffset_Stats(offset, wave1, wave2)
	variable offset
	wave wave1, wave2

	duplicate/O wave1, wave1temp
	duplicate/O wave2, wave2temp
	if( offset >= 0 )
		deletePoints 0, offset, wave2temp
		deletePoints numpnts(wave1)-offset, offset, wave1temp
	else
		deletePoints 0, abs(offset-1), wave1temp
		deletePoints numpnts(wave2)-abs(offset-1), abs(offset-1), wave2temp
	endif
	return StatsCorrelation(wave1temp, wave2temp)
end
August 21, 2018 at 07:37 pm - Permalink
I found a way to get rid of the DeletePoints commands, which shaves off yet another small amount of time: why not duplicate just the range of interest in the first place (with the /R flag for a subrange)? Here's my optimized version:
function covariance_atOffset_opt(offset, wave1, wave2)
	variable offset
	wave wave1, wave2

	variable last1 = numpnts(wave1)-1
	variable last2 = numpnts(wave2)-1
	if( offset >= 0 )
		duplicate/O/R=[0, last1-offset] wave1, wave1temp
		duplicate/O/R=[offset, last2] wave2, wave2temp
	else
		duplicate/O/R=[abs(offset-1), last1] wave1, wave1temp
		duplicate/O/R=[0, last2-abs(offset-1)] wave2, wave2temp
	endif
	return StatsCorrelation(wave1temp, wave2temp)
end
August 21, 2018 at 10:57 pm - Permalink
You should use duplicate/free in the above unless you want the temp waves to stick around.
August 22, 2018 at 10:08 am - Permalink
Hello Sandbo,
You might want to try to use MatrixOP with rotateRows(wave,offset) which will allow you to keep the fast multiplication as long as you make sure to remove the edge effects from your results. That can be done via subRange() or Duplicate/R.
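In case it helps, here is a minimal sketch of what such a rotateRows-based version might look like for a non-negative offset (untested; the function name is made up for illustration, and the sign convention of the rotation should be double-checked against the MatrixOp help):

```igor
// Sketch only: circularly rotate wave2 so that point i lines up with
// wave1[i+offset], then average the product, excluding the wrapped-around
// "edge" points at the end of the wave.
function covariance_atOffset_rotate(offset, wave1, wave2)	// offset >= 0
	variable offset
	wave wave1, wave2

	variable last = numpnts(wave1) - 1 - offset
	// the sign of the rotation may need flipping for the desired lag direction
	MatrixOp/FREE prod = wave1 * rotateRows(wave2, -offset)
	MatrixOp/FREE theMean = mean(subRange(prod, 0, last, 0, 0))
	return theMean[0]
end
```

A negative offset would be handled analogously, trimming the edge points from the start of the product instead of the end.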
A.G.
August 22, 2018 at 10:54 am - Permalink
In reply to You should use duplicate… by Larry Hutchinson
The problem is that Duplicate/FREE gives a rather noticeable time penalty, and it seems Sandbo is aiming for quick calculations. Also, while the rotateRows suggestion from Igor works, it is slower, too.
August 22, 2018 at 07:04 pm - Permalink
The only reason the non-free version might be faster is if the temp waves are reused, for example if the function above is called in a loop.
August 23, 2018 at 08:57 am - Permalink
Thanks for all the help above, I really appreciate it.
I just realized I never mentioned the motivation:
The code was written to calculate the correlation between two sets of time-series samples. The inner loop performs the correlation computation, while the outer loop applies a time difference between the two data sets digitally. This captures the temporal correlation as if the data had been captured at a different time, and also catches correlations whose maximum is away from zero lag due to jitter in the digitization.
One reason I want to make this fast is that the correlation is computed while data is being captured. Say we have 1 sec of acquisition time; then we can overlap this computation with the next 1 sec acquisition. Having the correlation calculated within 1 sec (or ideally less) gives us "zero" dead time for the overall data acquisition and analysis. In reality, possibly 4 such correlations or more will have to be done, so with times of order 0.1 sec I am on the edge (there is other processing, too).
As we compute the correlation for (number of lags) offsets, it can be intensive, so I am trying to multithread it. My supervisor actually first wrote it as a multi-threaded C function in an XOP, so this is an attempt at an Igor-only version to increase portability. At the very end, I might use a GPU to compute this part, as it should be much faster given that only a single data transfer is needed for (number of lags) element-wise multiplications and additions.
August 23, 2018 at 09:17 am - Permalink
You can use simple multi-threading within Igor, as both Duplicate and StatsCorrelation are 'threadsafe' functions. Just declare your function as ThreadSafe:
ThreadSafe function covariance_atOffset_opt(offset, wave1, wave2)
	variable offset
	wave wave1, wave2

	variable last1 = numpnts(wave1)-1
	variable last2 = numpnts(wave2)-1
	if( offset >= 0 )
		duplicate/O/R=[0, last1-offset] wave1, wave1temp
		duplicate/O/R=[offset, last2] wave2, wave2temp
	else
		duplicate/O/R=[abs(offset-1), last1] wave1, wave1temp
		duplicate/O/R=[0, last2-abs(offset-1)] wave2, wave2temp
	endif
	return StatsCorrelation(wave1temp, wave2temp)
end
Then you can call your covariance calculation with the MultiThread prefix to (hopefully) get a speed-up:
MultiThread Covar = covariance_atOffset_opt(p-PlusMinusLags,wave1,wave2)
With your test experiment and on my machine this gave a 50% speed-up, which may be enough to push the calculation below the 0.1 sec mark on your machine. Just note that you are limited in what you can do inside a threadsafe function. You may want to read more about it here:
DisplayHelpTopic "ThreadSafe Functions"
Also, note that your (and my) current approach does not work with delta shift values (declared in your function as dx) other than integers. If you want to calculate the covariance with sub-point shift values (though there should not be much reason to do so), you need to think of a different approach.
August 23, 2018 at 09:46 pm - Permalink
I ran the original code in the experiment on my Windows machine using Igor 8.01 and got similar results to Sandbo (0.146 seconds). I then tried chozo's version and got about a 50% speed up (0.073 seconds). This is on a machine with 8 cores (16 threads). Igor's ThreadProcessorCount function returns 16 on my machine.
I then tried using the /NT flag with the MultiThread keyword (this is new to Igor 8). It seems that on my machine, I get the best performance if I set /NT=4 (0.047 seconds). You may or may not get similar results on your system, depending on your hardware. Also, if the size of the waves used in the calculation changes, you might find that a different value of /NT gives the best performance. Since the optimal number of threads is based on so many factors and needs to be empirically determined, I wouldn't necessarily recommend that you use the /NT flag in your code, since it's possible it could artificially limit the performance. But if you need highly tuned code, it could come in handy.
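For reference, a sketch of such a call with the thread count pinned, using the names from the earlier posts (/NT requires Igor 8 or later):

```igor
// Limit the MultiThread wave assignment to 4 threads
MultiThread/NT=4 Covar = covariance_atOffset_opt(p - PlusMinusLags, wave1, wave2)
```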
August 24, 2018 at 08:13 am - Permalink
In reply to You can use simple multi… by chozo
Thanks chozo for the multithreaded code; it now runs twice as fast compared to my original code!
In the past I crashed Igor when declaring functions as ThreadSafe, so I didn't really explore it much. It seems I should spend more time learning it.
August 24, 2018 at 09:27 am - Permalink