Wave / Variable Access in Thread Groups
HJDrescher
I am stuck with some code and need help:
I am running a lot (~2500) of simple fits on each of many data sets (~3000 with 2500 points each). The fit function is always the same and I only need one coefficient. Unfortunately each fit depends on the one before it. The data sets themselves are independent. A single-threaded version takes about 6 hours to run (about 7 s per data set). Processor load is about 18% on an i7 8-core system. The bottleneck is the many calls to FuncFit.
The typical approach would be to parallelize along the data sets.
Here is my struggle:
-- The fit function takes a while to compute (and is normalized anyway, I only need the amplitude coefficient). It is about 2 times faster to compute it once beforehand and then just read the values from the precomputed wave (threading would still bring some improvement if I gave that up).
-- The fitfunction might be measured data, hence there is no way to compute it.
Is there a way to provide the same wave to a lot of threads without copying it all the time?
I could get around the computed ("$") wave reference by introducing more fitfunctions.
Threadsafe Function DeconvolutionFuncPCOdd(wc,wy,wx):Fitfunc
	Wave wc, wy, wx
	// wc[3] is the Fitted flag (FitCoef has 4 points: offset, amplitude, shift, Fitted)
	Wave DCO=$"root:NAC:Experiment:FitDeconvolutionOdd"+SelectString(wc[3],"","Fitted")
	wy=wc[0]+wc[1]*DCO(x-wc[2])
End
The other relevant fragments are (and ThreadGroupPutDF does not work the way I hoped it would):
NVar Period=root:NAC:Experiment:ChopperPeriod, Win=root:NAC:Machine:DeconvolutionWindow
NewDataFolder root:NAC:MT
Duplicate /O root:NAC:Experiment:FitDeconvolutionOdd root:NAC:MT:FitDeconvolutionOdd
Duplicate /O root:NAC:Experiment:FitDeconvolutionEven root:NAC:MT:FitDeconvolutionEven
Duplicate /O root:NAC:Experiment:FitDeconvolutionOddFitted root:NAC:MT:FitDeconvolutionOddFitted
Duplicate /O root:NAC:Experiment:FitDeconvolutionEvenFitted root:NAC:MT:FitDeconvolutionEvenFitted
Variable /G root:NAC:MT:Chopper=Period, root:NAC:MT:DeconvolutionWindow=Win
Variable TID, NThr=1
TID=ThreadGroupCreate(NThr)
ThreadGroupPutDF TID, root:NAC:MT
For (i=0;i<DimSize(DeconPrep,1)-1;i+=NThr)
	For (j=0;j<NThr;j+=1)
		If (i+j < DimSize(DeconPrep,1)-1)
			// Use i+j so that each thread gets its own data set once NThr>1
			If (!DeconFlag[i+j])
				ThreadStart TID, j, DeconvolutePulseMT(DeconPrep, i+j, Deconvolution, SelectString(Mod(i+j,2), "Even", "Odd"), Fitted)
				//DeconvolutePulseMT(DeconPrep, i+j, Deconvolution, SelectString(Mod(i+j,2), "Even", "Odd"), Fitted)
			EndIf
		EndIf
	EndFor
	Do
	While (ThreadGroupWait(TID, 50)!=0)
	ProgressValue=i
	DoUpdate /W=NAC_Control
EndFor
TID=ThreadGroupRelease(TID)
and (the NVARs can be passed as variables to the function)
ThreadSafe Static Function DeconvolutePulseMT(Data, Index, Result, Parity, Fitted)
	Wave Data
	Variable Index
	Wave Result
	String Parity
	Variable Fitted
	NVar Win=root:DeconvolutionWindow, Period=root:ChopperPeriod
	//Variable Win=0.025, Period=2
	Variable YOffset=0
	Variable Step, Shift
	Make /FREE /N=4 /O FitCoef
	Make /FREE /N=(DimSize(Data,0)) Process
	SetScale /P x, DimOffset(Data,0), DimDelta(Data,0), "", Process
	Duplicate /FREE Process Diff, Int, Subtr
	Process=Data[p][Index]
	Step=DimDelta(Data,0)
	Diff=0
	Int=0
	Subtr=0
	SetScale /P x, 0, Step, "", Diff, Int
	For (Shift=0;Shift<Period+9*Win;Shift+=Step)
		FitCoef={0, 1, 0, Fitted} // Offset, Amp, Shift
		SetScale /P x, -Shift, Step, "", Process, Subtr
		FuncFit /N /NTHR=1 /Q /W=2 /H="1011" $"DeconvolutionFuncPC"+Parity FitCoef Process(-Win,+Win)
		StrSwitch(Parity)
			Case "Odd":
				DeconvolutionFuncPCOdd(FitCoef,Subtr,Subtr)
				Break
			Case "Even":
				DeconvolutionFuncPCEven(FitCoef,Subtr,Subtr)
				Break
		EndSwitch
		Process-=Subtr[p]
		Diff[X2Pnt(Diff,Shift)]+=FitCoef[1]
	EndFor
	Integrate /P Diff /D=Int
	SetScale /P x, -5*Win, Step, "", Int
	Result[][Index]=Int[p]
	Killwaves Process, Int, Diff, Subtr
	Return NoError
End
Maybe I can pass the fitting wave as an x-wave to the fitfunction, but I don't like these botched constructs.
The goal is to reconstruct the energy input into a detector by measuring a response function and the actual experiment.
The data is too noisy to perform this task with Fourier transformations (and I'm a little bit sad about this).
I know that shared data in a multithreaded environment is dangerous and usually should be avoided, but since this is read-only access it should be safe.
Long post: potato available on request.
Thanks a lot,
Hans J Drescher
But couldn't you just move the datasets around?
That is, you partition the datasets and let each worker thread do all the fits on one dataset, then on the next dataset, and so on.
So the steps would be something like:
- Partition data sets
- Start worker threads
- Let worker threads initialize themselves. The idea would be that you move all one-time initialization, especially the Make parts, to the beginning and do that only once before the very first fit.
- Start the inner loop in the worker thread, which waits for input data in the queue. As soon as it gets data it will perform all 2500 fits on one dataset.
- Start sending data folders to the threads, one dataset at a time. You can move the data folders together with the data, as long as you remember that the worker threads have to return them.
- Collect the results from the output queues of the worker threads.
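The steps above could be sketched roughly as follows. This is only a minimal skeleton, modeled on the input/output queue pattern from the Igor help on multitasking as I understand it, not a tested implementation. FitWorker, RunAllDatasets and the names "forThread", "outDF", "dataset", "index" and "amp" are made up for illustration, and the line that fills amp is placeholder arithmetic standing in for the real loop of FuncFit calls. The response wave is passed as a parameter to the worker. My understanding is that such a wave is shared with the threads rather than copied, which should be fine as long as nothing writes to it or resizes it while the threads run.

```igor
// Hypothetical worker thread. All names here are illustrative assumptions.
ThreadSafe Function FitWorker(response)
	Wave response	// shared response wave, only read here

	// One-time initialization: all Make calls happen once, before the first dataset.
	Make /FREE /N=4 FitCoef

	Do
		// Inside a worker, thread group ID 0 refers to the worker's own input queue.
		DFREF dfIn = ThreadGroupGetDFR(0, 1000)
		If (DataFolderRefStatus(dfIn) == 0)
			If (GetRTError(2))	// abort flag set: the group is being released
				Return 0
			EndIf
			// Otherwise nothing is queued yet, so loop and wait again.
		Else
			SetDataFolder dfIn
			Wave dataset = dfIn:dataset
			NVAR inIndex = dfIn:index
			NewDataFolder /S outDF
			Variable /G index = inIndex
			Duplicate dataset, amp
			// Placeholder only: the sequential FuncFit loop for this dataset goes here.
			amp = dataset[p] * response[0]
			WAVEClear dataset, amp	// clear wave references before moving the folder
			ThreadGroupPutDF 0, :	// post the current folder (outDF) to the output queue
			KillDataFolder dfIn	// done with the input folder
		EndIf
	While (1)
	Return 0
End

// Hypothetical driver: one data folder per dataset (column of DeconPrep).
Function RunAllDatasets(DeconPrep, response, Result)
	Wave DeconPrep, response, Result

	Variable i, nSets = DimSize(DeconPrep, 1)
	Variable nThr = ThreadProcessorCount
	Variable tgID = ThreadGroupCreate(nThr)

	For (i = 0; i < nThr; i += 1)
		ThreadStart tgID, i, FitWorker(response)
	EndFor

	// Send one dataset per data folder to the input queue.
	For (i = 0; i < nSets; i += 1)
		NewDataFolder /S forThread
		Variable /G index = i
		Make /D /N=(DimSize(DeconPrep, 0)) dataset
		dataset = DeconPrep[p][i]
		WAVEClear dataset	// so that ThreadGroupPutDF can move the folder
		ThreadGroupPutDF tgID, :
	EndFor

	// Collect one result folder per dataset. They can arrive in any order.
	For (i = 0; i < nSets; i += 1)
		Do
			DFREF dfOut = ThreadGroupGetDFR(tgID, 1000)
		While (DataFolderRefStatus(dfOut) == 0)
		Wave amp = dfOut:amp
		NVAR idx = dfOut:index
		Result[][idx] = amp[p]
		WAVEClear amp
		KillDataFolder dfOut
	EndFor

	Variable status = ThreadGroupRelease(tgID)
	Return status
End
```

Because results come back in whatever order the threads finish, the sketch carries the column index along in each data folder instead of relying on the order of arrival.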
January 22, 2015 at 03:18 am - Permalink
I was hoping that I could avoid queues.
I'm implementing it at the moment. When I am done I'll share my results.
HJ
January 22, 2015 at 11:37 am - Permalink