As a beginning Igor Pro user, I'm kind of stuck on the following problem: I have a dataset in which 512 x 512 waves each have to be fitted with an exponential curve to extract a time constant. I therefore wrote a small procedure, which works well. However, although the /NTHR=0 flag is included in the CurveFit command, only one of my 4 processors is being used...
I'm running Igor Pro 6.21 on a Windows 7 computer with an i5 processor.
Any suggestions that might activate the other 3 CPUs and hopefully bring down the computing time of my little procedure?
During testing I found that parallelization of the built-in fit functions didn't help, and often actually hurt. Consequently, only standard user-defined fits are currently parallelized.
The built-in parallelism of the curve fitting operation only deals with a single curve fit at a time. So within a single curve fit, Igor will try to use multiple processors. This can lead to a speedup if your model function is computationally expensive, so that Igor can independently compute values for e.g. different x coordinates.
Fitting an exponential function is probably not computationally expensive enough to show a net benefit from parallelization. Perhaps Igor even explicitly disables it in the case of exponential fitting.
To benefit from multiple processors here, I believe that you will have to parallelize at a higher level, so that several instances of the curve fitting operation run simultaneously. This is a rather involved topic, for which you will have to resort to creating threadsafe worker functions that are dispatched from the Igor function using ThreadGroupCreate, ThreadStart, and ThreadGroupWait. It's not that difficult, but it takes a while to wrap your head around the concepts.
I'd suggest starting by reading up on "ThreadSafe Functions and Multitasking" in the Igor manual.
I'd write some more explicit help, but I'm afraid I'm in a rush right now. Someone else might chime in with more detailed comments.
April 1, 2011 at 11:54 am - Permalink
Since you are doing many fits, you would probably benefit from running multiple fits at the same time using Igor's support for threaded programming. To see an example of this applied to curve fitting, see File->Example Experiments->Curve Fitting->MultipleFitsInThreads.
My advice would be to spawn a thread for each processor, and have each thread run some part of the loop of 512 fits. My point being, don't do four fits in four threads, return to your main program and do another four fits. Rather, do 128 fits in a loop in each thread. You will have some bookkeeping to do to distribute data and collect results.
It's by no means trivial to do, but will get you better speed-up than the built-in threading in CurveFit.
To learn more about threaded programming in Igor, execute this command:
DisplayHelpTopic "ThreadSafe Functions and Multitasking"
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
April 1, 2011 at 01:50 pm - Permalink
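The advice above (one thread per processor, each looping over its own share of the fits) could be sketched roughly as follows. This is only an illustration under several assumptions: the data is a 3D stack as described later in the thread, the built-in exp fit (y = K0 + K1*exp(-K2*x)) is the model, and the names FitRowBlock, FitStackInThreads, and M_tau are hypothetical, not taken from the MultipleFitsInThreads example experiment.

```igor
// Hypothetical worker: fits every beam in rows firstRow..lastRow of the stack
// and writes the time constant (1/K2 of the built-in exp fit) into tauWave.
ThreadSafe Function FitRowBlock(dataWave, tauWave, firstRow, lastRow)
	WAVE dataWave, tauWave
	Variable firstRow, lastRow

	Variable nCols = DimSize(dataWave, 1)
	Variable row, col
	Variable V_FitError	// when this variable exists, a failed fit does not abort the function
	for (row = firstRow; row <= lastRow; row += 1)
		for (col = 0; col < nCols; col += 1)
			V_FitError = 0
			CurveFit/Q/NTHR=1 exp, dataWave[row][col][]
			WAVE W_coef
			if (V_FitError == 0)
				tauWave[row][col] = 1 / W_coef[2]
			endif
		endfor
	endfor
	return 0
End

// Hypothetical dispatcher: one thread per processor, one block of rows per thread.
Function FitStackInThreads(dataWave)
	WAVE dataWave

	Variable nRows = DimSize(dataWave, 0)
	Variable nCols = DimSize(dataWave, 1)
	Make/O/D/N=(nRows, nCols) M_tau = NaN	// entries stay NaN where a fit failed

	Variable nThreads = ThreadProcessorCount
	Variable rowsPerThread = ceil(nRows / nThreads)
	Variable tgID = ThreadGroupCreate(nThreads)

	Variable i, firstRow, lastRow
	for (i = 0; i < nThreads; i += 1)
		firstRow = i * rowsPerThread
		lastRow = min(firstRow + rowsPerThread, nRows) - 1
		ThreadStart tgID, i, FitRowBlock(dataWave, M_tau, firstRow, lastRow)
	endfor

	// Wait until all threads have finished, checking every 100 ms
	Variable status
	do
		status = ThreadGroupWait(tgID, 100)
	while (status != 0)

	Variable dummy = ThreadGroupRelease(tgID)
End
```

With 512 rows and 4 processors, each thread would handle 128 rows of 512 fits each. Every thread writes to different points of M_tau, so the bookkeeping for collecting results reduces to passing the output wave to each worker.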
I happen to know that the OP's data is a stack of images of dimensions 512x512xn, where n is the number of planes. The goal here is to perform a fit on each of the beams in the stack, i.e., the 262144 traces of n points. I haven't looked at the example experiment John mentioned, but here's how I would implement this:
Create a threadsafe function that takes the stack, a row index, and a column index, and that returns a wave. The prototype of this function would be
ThreadSafe Function/WAVE MyWorker(dataWave, row, column)
	// note the ThreadSafe identifier
	wave dataWave
	variable row, column
	CurveFit <function type> dataWave[row][column][] /NTHR=1 <whatever flags you want>
	wave W_Coef
	// make a copy of W_Coef and return it
	Duplicate /FREE /O W_Coef, fitResults // note the /FREE flag
	return fitResults
End
This worker function can then be called from a higher-level function using the MultiThread keyword. The cool thing here is that we can use MultiThread to take most of the pain out of threading, and do away with explicit thread constructs entirely. This function could be along these lines:
Function DoAllFits(dataWave) // hypothetical name, pick whatever suits you
	wave dataWave
	Variable nRows = DimSize(dataWave, 0)
	Variable nCols = DimSize(dataWave, 1)
	Make /O/N=(nRows, nCols) /WAVE /FREE M_fitResults
	// M_fitResults is a wave containing wave references. Each point of this wave
	// is a reference to a wave where we will store the fit results
	// of that particular x,y coordinate.
	// Note that M_fitResults (and the waves it references) will cease to exist when this function
	// exits. So you will want to copy the data to some other format when the calculation is done.
	// now do the actual curve fitting, multithreaded
	MultiThread M_fitResults = MyWorker(dataWave, p, q)
	// here you will want to add code that loops over all the wave references in M_fitResults and processes
	// them to your convenience or copies them in some format that you can store persistently.
End
For more background on this technique you can take a look at the "Wave Reference MultiThread Example" in the Igor manual.
April 1, 2011 at 07:04 pm - Permalink
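The final step in the post above, copying the free fit results into something persistent, might look like the sketch below. It assumes the worker used the built-in exp fit (y = K0 + K1*exp(-K2*x)), so that the time constant is 1/K2. The names StoreTimeConstants and M_tau are hypothetical.

```igor
// Hypothetical helper: call it from the dispatching function, after the
// MultiThread line and before that function exits (the free waves vanish afterwards).
Function StoreTimeConstants(M_fitResults)
	WAVE/WAVE M_fitResults	// wave of wave references, one coefficient wave per pixel

	Variable nRows = DimSize(M_fitResults, 0)
	Variable nCols = DimSize(M_fitResults, 1)
	Make/O/D/N=(nRows, nCols) M_tau = NaN	// persistent result in the current data folder

	Variable i, j
	for (i = 0; i < nRows; i += 1)
		for (j = 0; j < nCols; j += 1)
			WAVE/Z coefs = M_fitResults[i][j]
			if (WaveExists(coefs))
				M_tau[i][j] = 1 / coefs[2]	// K2 is the decay rate for the built-in exp fit
			endif
		endfor
	endfor
End
```

Storing only the time constant keeps the result as a single 512 x 512 image. If the other coefficients matter as well, a 3D wave with one layer per coefficient would be a natural alternative.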
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
April 4, 2011 at 10:03 am - Permalink
By the way, now that we're discussing this, I'd like to make a related suggestion: as we all know CurveFit and derivatives offer a /N flag that suppresses updates between iterations, in order to speed up fitting.
Back when I did lots of curve fitting, I remember that the real speedup came from setting bit 2 of V_fitOptions to get rid of the progress window. It's great that this option is included, but it's pretty well hidden in the documentation. I only became aware of it after you mentioned it in a support email back in 2008.
So what do you think about additionally moving this functionality into the /N flag, where it is easy to find and makes logical sense? For instance /N=1 (still identical to just /N) suppresses updates. /N=2 suppresses updates and the progress window, too.
April 4, 2011 at 12:34 pm - Permalink
If you're running version 6.21, use /W=2 to suppress the progress window.
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
April 5, 2011 at 09:48 am - Permalink
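For reference, the two ways of suppressing the progress window mentioned in the last two posts could be written roughly as follows. This is a sketch that assumes the built-in exp fit and a 1D trace. The function names are hypothetical, and /W=2 requires a version that supports it (6.21 according to the post above).

```igor
// Option 1: set bit 2 of V_FitOptions (value 4) before calling CurveFit
Function QuietFitWithOptions(traceWave)
	WAVE traceWave

	Variable V_FitOptions = 4	// bit 2 set: no progress window
	CurveFit/Q/N exp, traceWave
	WAVE W_coef
	return 1 / W_coef[2]	// time constant for y = K0 + K1*exp(-K2*x)
End

// Option 2: use the /W=2 flag instead
Function QuietFitWithFlag(traceWave)
	WAVE traceWave

	CurveFit/Q/N/W=2 exp, traceWave
	WAVE W_coef
	return 1 / W_coef[2]
End
```

Both variants also use /Q to keep fit reports out of the history area and /N to suppress updates between iterations.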