Minimization problem: Match parts of two waves via scale and offset

I search a solution for the, at first glance probably simple, problem of generating a close-to-zero difference for only parts of two waves, i.e., automatically bring them to match by optimization of some modification parameters. I have attached some demo data for everyone to play with (you can see in the history area how the data was generated). There are two waves, one with a distinct Gaussian feature near the center and one with plain ‘background’ data. Now, the outer ~30% of the two waves can be brought to match by simply adjusting the offset (75 in this demo) and the scale (0.72 here).

I want to find the offset and scale values for the first wave to match programmatically, i.e., a function which calculates these values from the two waves (and optionally how much of the outer values can be used in the calculation).

My ideas are to use a temporary wave to calculate the difference and use some complicated iteration until the difference is very small or abuse functions like 'Optimize' in some way (maybe I’m just too tired today, but I can't find a way ;). Since noise is included, there has to be some criteria for this. Also there may be cases where there is no exact match possible, which also needs to be considered. In Excel there is the solver add-in, where the program fiddles with some values until a minimum/maximum etc. is achieved for some result value. Basically I need this for the sum of the difference for these two waves (in parts). I hope someone can point me to an elegant solution for this.
Minimization Test.pxp (29.42 KB)
Optimize looks good to me.

A first rough sketch with
#pragma rtGlobals=3     // Use modern global access method and strict wave access.
#include <KBColorizeTraces>

Function func(pw, slope, offset)
    Wave pw
    variable slope, offset
   
    Wave wave1, wave2

    variable pSkipStart = 200
    variable pSkipEnd   = 320
   
    Duplicate/O/FREE/D wave1, sq
   
    sq[] = (p <= pSkipStart || p >= pSkipEnd ? wave1[p] * slope + offset - wave2[p] : 0.0)
    sq = sq[p]^2
   
    variable ret = Sum(sq)
    print ret
    return ret
End

Function doStuff()

    Make/O/N=0/FREE pw
    Optimize/A=0/X={1,0}/T=1e-2 func, pw
   
    Wave wave1, wave2, W_Extremum
    Duplicate/O wave1, result

    variable slope  = W_Extremum[0]
    variable offset = W_Extremum[1]
    result  = wave1[p] * slope + offset
    Display wave1, wave2, result
End


results in following graph.

It is not completely automatized yet as you still need to find pSkipStart and pSkipEnd manually.

Graph4_1.png (110.48 KB) Graph4_1.png (110.48 KB) Graph4_1.png (110.48 KB)
Thanks a lot for the solution. I haven't used Optimize before. It is no problem that you have to give the omitted range manually. I guess there is no way to find this automatically for a wide range of data.
I have modified the code a bit, and giving Optimize a rough parameter size using the /R flag gives a great result. Now I only need to find an elegant way to sneak two arbitrary named waves into the worker function.

Function OptWorker(pw, scale, offset)
    Wave pw
    variable scale, offset
 
    Wave wave1, wave2
    Duplicate/O/D/FREE wave1, sq

    sq      = wave1 * scale + offset - wave2
    sq[pw[0],pw[1]] = 0
    sq      = sq^2
    return  Sum(sq)
End
 
Function doStuff()
    Make/O/N=0/FREE pw={170,350}
    Optimize/Q/A=0/X={1,0}/R = {1, 100} OptWorker, pw
 
    Wave wave1, wave2, W_Extremum
    Duplicate/O wave1, result

    print "Scaling:",W_Extremum[0], "Offset:", W_Extremum[1]
    result = wave1 * W_Extremum[0] + W_Extremum[1]
    KillWaves/Z W_Extremum, W_OptGradient
End
Graph.png (19.92 KB)
Well for automatically finding the omitted range, you could try FindAPeak and assume some width.

For the sneak in part, unfortunately Optimize does not accept wave reference waves as parameter wave.
So we have to resort to wave notes:
Function OptWorker(pw, scale, offset)
    Wave pw
    variable scale, offset
 
    string wvNote = note(pw)
    Wave wave1 = $StringFromList(0,wvNote)
    Wave wave2 = $StringFromList(1,wvNote)
   
    Duplicate/D/FREE wave1, sq
 
    sq      = wave1 * scale + offset - wave2
    sq[pw[0],pw[1]] = 0
    sq      = sq^2
    return  Sum(sq)
End
 
Function doStuff()
    Make/FREE pw={170,350}
    Note pw, "root:wave1;root:wave2"
    Optimize/Q/A=0/X={1,0}/R = {1, 100} OptWorker, pw
 
    Wave wave1, wave2, W_Extremum
    Duplicate/O wave1, result
 
    print "Scaling:",W_Extremum[0], "Offset:", W_Extremum[1]
    result = wave1 * W_Extremum[0] + W_Extremum[1]
    KillWaves/Z W_Extremum, W_OptGradient
End
I have a different solution, which is a bit left-field.
I have fitted one wave to the other using an offset and scale, then excluded the points that are most extreme (positive peak). Then repeated the fit and exclusion a few times.
Example code as follows - I hope it is self explanatory:
Function FitDriver()
    wave wave1 // assume wave1 and wave2 already exist
    wave wave2
    variable i,vNum,vNumIter,vIter,vFrac
    vFrac=0.1 // fraction of points to 'mask' out at each iteration
    vNumIter=3 // number of iterations to run
   
    vNum=vFrac*DimSize(wave1,0) // number of pint to mask out each iteration
   
    duplicate/O wave1,waveFit,waveRes
    duplicate/O wave1,wave1iter
   
    Make/D/O/N=(2) W_coef  // make a coeficient wave
    W_coef[0]=0 // offset
    W_coef[1]=1 // scale
   
    for(vIter=0;vIter<vNumIter;vIter+=1) // loop round each iteration
        // run the fit
        FuncFit/Q myFitFunc, W_coef, wave2 /X=wave1iter
       
        waveFit[]=W_coef[0]+W_coef[1]*wave1iter[p]
        waveRes[]=wave2[p]-waveFit[p] // calculate a residuals wave
       
        for(i=0;i<vNum;i+=1) // loop to exclude extreme points from next fit
            waveStats/Q/Z waveRes
            wave1iter[V_maxloc]=NaN
            waveRes[V_maxloc]=NaN
        endfor
    endfor
    // run the fit one last time
    FuncFit/Q myFitFunc,W_coef,wave2 /X=wave1iter
    // calc fit
    waveFit[]=W_coef[0]+W_coef[1]*wave1[p]
    // plot graph
    Display wave1,wave2,waveFit
    ModifyGraph rgb(wave2)=(0,52224,0),rgb(waveFit)=(0,15872,65280)
    TextBox/C/N=text0 "Fit Parameters:\rOffset = "+num2str(W_coef[0])+"\rScale = "+num2str(W_coef[1])
End
Function myFitFunc(pw, yw, xw) : FitFunc
    WAVE pw, yw, xw
    yw = pw[0]+pw[1]*xw
End

Hope this makes sense.
I tried to get this to work using a mask wave, but failed, hence my odd use of 'wave1iter'.
Regards,
Kurt
Wow, that's also a nice solution. Thanks a lot. I never thought that you could take an arbitrary wave like that and just do a fit. Also, this solution doesn't need a 'hint' the Optimize approach (the result magnitude is needed). Maybe we should run a small contest one in while with problems like these to discover new ways of working. Also, "... number of pint ..." made my day. Cheers. XD
I have realised that the choice to only exclude positive points is not great when the data is noisy as it makes the resultant fit biased to more negative points (you can clearly see this if you run the fit after changing the variables vFrac=0.01 and vNumIter=30 at the start of my old code).
A better solution (at least for this shape of data) is to exclude the most extreme points irrespective of sign.
One could also extend this algorithm to auto stop excluding points when, for example, the largest outlier is 2 x StDev of the residual (say) - this removes another 'human' parameter from the algorithm. (Perhaps this factor should be scaled by number of data points - but my head starts to hurt when I start thinking about that).
New example code as follows:
Function FitDriver()
    wave wave1 // assume wave1 and wave2 already exist
    wave wave2
    variable i,vNum,vMaxNumIter,vIter,vFrac,vMaxSD
    vFrac=0.01 // fraction of points to 'mask' out at each iteration
    vMaxNumIter=50 // maximum number of iterations to run - a safety net
    vMaxSD=2 // Stop excluding points when largest outlier exceeds this number of StdDev's
 
    vNum=vFrac*DimSize(wave1,0) // number of points to mask out each iteration
 
    duplicate/O wave1,waveFit,waveRes
    duplicate/O wave1,wave1iter
 
    Make/D/O/N=(2) W_coef  // make a coefficient wave
    W_coef[0]=0 // offset
    W_coef[1]=1 // scale
 
    vIter=0
    do
        // run the fit
        FuncFit/Q myFitFunc, W_coef, wave2 /X=wave1iter
 
        myFitFunc(W_coef,waveFit,wave1iter)
        waveRes[]=wave2[p]-waveFit[p] // calculate a residuals wave
 
        for(i=0;i<vNum;i+=1) // loop to exclude extreme points from next fit
            waveStats/Q/Z waveRes
            if(abs(V_max)>abs(V_min))
                wave1iter[V_maxloc]=NaN
                waveRes[V_maxloc]=NaN
            else
                wave1iter[V_minloc]=NaN
                waveRes[V_minloc]=NaN
            endif
        endfor
        if(max(abs(V_max),abs(V_min))<vMaxSD*V_sdev) // extreme point not too large...
            break // ... so stop here
        endif
       
        vIter+=1
    while(vIter<vMaxNumIter)
   
    // run the fit one last time
    FuncFit/Q myFitFunc,W_coef,wave2 /X=wave1iter
    // calc fit
    myFitFunc(W_coef,waveFit,wave1)
    // plot graph
    Display wave1,wave2,waveFit
    ModifyGraph rgb(wave2)=(0,52224,0),rgb(waveFit)=(0,15872,65280)
    TextBox/C/N=text0 "Fit Parameters:\rOffset = "+num2str(W_coef[0])+"\rScale = "+num2str(W_coef[1])
End

Function myFitFunc(pw, yw, xw) : FitFunc
    WAVE pw, yw, xw
    yw = pw[0]+pw[1]*xw
End


Apologies for the pint-point typo... perhaps my fingers were hinting at where I should go after work ;)

Regards,
Kurt
Thanks a lot Kurt for the update. Funnily, both versions of your approach give almost the exact same result in my real-world problem unlike the test case. Usually it's the other way around, where you think it works until you discover the solution utterly fails for the real thing. I wonder if there is a more straight forward solution for the excluding-points part. Maybe 'Extract' in some way?