Minimization problem: Match parts of two waves via scale and offset

I search a solution for the, at first glance probably simple, problem of generating a close-to-zero difference for only parts of two waves, i.e., automatically bring them to match by optimization of some modification parameters. I have attached some demo data for everyone to play with (you can see in the history area how the data was generated). There are two waves, one with a distinct Gaussian feature near the center and one with plain ‘background’ data. Now, the outer ~30% of the two waves can be brought to match by simply adjusting the offset (75 in this demo) and the scale (0.72 here).

I want to find the offset and scale values for the first wave to match programmatically, i.e., a function which calculates these values from the two waves (and optionally how much of the outer values can be used in the calculation).

My ideas are to use a temporary wave to calculate the difference and use some complicated iteration until the difference is very small or abuse functions like 'Optimize' in some way (maybe I’m just too tired today, but I can't find a way ;). Since noise is included, there has to be some criteria for this. Also there may be cases where there is no exact match possible, which also needs to be considered. In Excel there is the solver add-in, where the program fiddles with some values until a minimum/maximum etc. is achieved for some result value. Basically I need this for the sum of the difference for these two waves (in parts). I hope someone can point me to an elegant solution for this.
Minimization Test.pxp (29.42 KB)
Optimize
looks good to me.

A first rough sketch with

#pragma rtGlobals=3		// Use modern global access method and strict wave access.
#include <KBColorizeTraces>

Function func(pw, slope, offset)
	Wave pw
	variable slope, offset
	
	Wave wave1, wave2

	variable pSkipStart = 200
	variable pSkipEnd   = 320
	
	Duplicate/O/FREE/D wave1, sq
	
	sq[] = (p <= pSkipStart || p >= pSkipEnd ? wave1[p] * slope + offset - wave2[p] : 0.0)
	sq = sq[p]^2
	
	variable ret = Sum(sq)
	print ret
	return ret
End

Function doStuff()

	Make/O/N=0/FREE pw
	Optimize/A=0/X={1,0}/T=1e-2 func, pw
	
	Wave wave1, wave2, W_Extremum
	Duplicate/O wave1, result

	variable slope  = W_Extremum[0]
	variable offset = W_Extremum[1]
	result  = wave1[p] * slope + offset
	Display wave1, wave2, result
End


results in following graph.

It is not completely automatized yet as you still need to find pSkipStart and pSkipEnd manually.

Graph4_1.png (110.48 KB) Graph4_1.png (110.48 KB) Graph4_1.png (110.48 KB)
Thanks a lot for the solution. I haven't used
Optimize
before. It is no problem that you have to give the omitted range manually. I guess there is no way to find this automatically for a wide range of data.
I have modified the code a bit, and giving Optimize a rough parameter size using the /R flag gives a great result. Now I only need to find an elegant way to sneak two arbitrary named waves into the worker function.


Function OptWorker(pw, scale, offset)
	Wave pw
	variable scale, offset
 
	Wave wave1, wave2
	Duplicate/O/D/FREE wave1, sq

 	sq		= wave1 * scale + offset - wave2
 	sq[pw[0],pw[1]]	= 0
	sq		= sq^2
	return  Sum(sq)
End
 
Function doStuff()
	Make/O/N=0/FREE pw={170,350}
	Optimize/Q/A=0/X={1,0}/R = {1, 100} OptWorker, pw
 
	Wave wave1, wave2, W_Extremum
	Duplicate/O wave1, result

	print "Scaling:",W_Extremum[0], "Offset:", W_Extremum[1]
	result = wave1 * W_Extremum[0] + W_Extremum[1]
	KillWaves/Z W_Extremum, W_OptGradient
End
Graph.png (19.92 KB)
Well for automatically finding the omitted range, you could try
FindAPeak
and assume some width.

For the sneak in part, unfortunately
Optimize
does not accept wave reference waves as parameter wave.
So we have to resort to wave notes:

Function OptWorker(pw, scale, offset)
	Wave pw
	variable scale, offset
 
 	string wvNote = note(pw)
 	Wave wave1 = $StringFromList(0,wvNote)
 	Wave wave2 = $StringFromList(1,wvNote)
 	
	Duplicate/D/FREE wave1, sq
 
 	sq		= wave1 * scale + offset - wave2
 	sq[pw[0],pw[1]]	= 0
	sq		= sq^2
	return  Sum(sq)
End
 
Function doStuff()
	Make/FREE pw={170,350}
	Note pw, "root:wave1;root:wave2"
	Optimize/Q/A=0/X={1,0}/R = {1, 100} OptWorker, pw
 
	Wave wave1, wave2, W_Extremum
	Duplicate/O wave1, result
 
	print "Scaling:",W_Extremum[0], "Offset:", W_Extremum[1]
	result = wave1 * W_Extremum[0] + W_Extremum[1]
	KillWaves/Z W_Extremum, W_OptGradient
End
I have a different solution, which is a bit left-field.
I have fitted one wave to the other using an offset and scale, then excluded the points that are most extreme (positive peak). Then repeated the fit and exclusion a few times.
Example code as follows - I hope it is self explanatory:

Function FitDriver()
	wave wave1 // assume wave1 and wave2 already exist
	wave wave2
	variable i,vNum,vNumIter,vIter,vFrac
	vFrac=0.1 // fraction of points to 'mask' out at each iteration
	vNumIter=3 // number of iterations to run
	
	vNum=vFrac*DimSize(wave1,0) // number of pint to mask out each iteration
	
	duplicate/O wave1,waveFit,waveRes
	duplicate/O wave1,wave1iter
	
	Make/D/O/N=(2) W_coef  // make a coeficient wave
	W_coef[0]=0 // offset
	W_coef[1]=1 // scale
	
	for(vIter=0;vIter<vNumIter;vIter+=1) // loop round each iteration
		// run the fit
		FuncFit/Q myFitFunc, W_coef, wave2 /X=wave1iter
		
		waveFit[]=W_coef[0]+W_coef[1]*wave1iter[p]
		waveRes[]=wave2[p]-waveFit[p] // calculate a residuals wave
		
		for(i=0;i<vNum;i+=1) // loop to exclude extreme points from next fit
			waveStats/Q/Z waveRes
			wave1iter[V_maxloc]=NaN
			waveRes[V_maxloc]=NaN
		endfor
	endfor
	// run the fit one last time
	FuncFit/Q myFitFunc,W_coef,wave2 /X=wave1iter
	// calc fit
	waveFit[]=W_coef[0]+W_coef[1]*wave1[p]
	// plot graph
	Display wave1,wave2,waveFit
	ModifyGraph rgb(wave2)=(0,52224,0),rgb(waveFit)=(0,15872,65280)
	TextBox/C/N=text0 "Fit Parameters:\rOffset = "+num2str(W_coef[0])+"\rScale = "+num2str(W_coef[1])
End
Function myFitFunc(pw, yw, xw) : FitFunc
	WAVE pw, yw, xw
	yw = pw[0]+pw[1]*xw
End

Hope this makes sense.
I tried to get this to work using a mask wave, but failed, hence my odd use of 'wave1iter'.
Regards,
Kurt
Wow, that's also a nice solution. Thanks a lot. I never thought that you could take an arbitrary wave like that and just do a fit. Also, this solution doesn't need a 'hint' the Optimize approach (the result magnitude is needed). Maybe we should run a small contest one in while with problems like these to discover new ways of working. Also, "... number of pint ..." made my day. Cheers. XD
I have realised that the choice to only exclude positive points is not great when the data is noisy as it makes the resultant fit biased to more negative points (you can clearly see this if you run the fit after changing the variables vFrac=0.01 and vNumIter=30 at the start of my old code).
A better solution (at least for this shape of data) is to exclude the most extreme points irrespective of sign.
One could also extend this algorithm to auto stop excluding points when, for example, the largest outlier is 2 x StDev of the residual (say) - this removes another 'human' parameter from the algorithm. (Perhaps this factor should be scaled by number of data points - but my head starts to hurt when I start thinking about that).
New example code as follows:

Function FitDriver()
	wave wave1 // assume wave1 and wave2 already exist
	wave wave2
	variable i,vNum,vMaxNumIter,vIter,vFrac,vMaxSD
	vFrac=0.01 // fraction of points to 'mask' out at each iteration
	vMaxNumIter=50 // maximum number of iterations to run - a safety net
	vMaxSD=2 // Stop excluding points when largest outlier exceeds this number of StdDev's
 
	vNum=vFrac*DimSize(wave1,0) // number of points to mask out each iteration
 
	duplicate/O wave1,waveFit,waveRes
	duplicate/O wave1,wave1iter
 
	Make/D/O/N=(2) W_coef  // make a coefficient wave
	W_coef[0]=0 // offset
	W_coef[1]=1 // scale
 
	vIter=0
	do
		// run the fit
		FuncFit/Q myFitFunc, W_coef, wave2 /X=wave1iter
 
		myFitFunc(W_coef,waveFit,wave1iter)
		waveRes[]=wave2[p]-waveFit[p] // calculate a residuals wave
 
		for(i=0;i<vNum;i+=1) // loop to exclude extreme points from next fit
			waveStats/Q/Z waveRes
			if(abs(V_max)>abs(V_min))
				wave1iter[V_maxloc]=NaN
				waveRes[V_maxloc]=NaN
			else
				wave1iter[V_minloc]=NaN
				waveRes[V_minloc]=NaN
			endif
		endfor
		if(max(abs(V_max),abs(V_min))<vMaxSD*V_sdev) // extreme point not too large...
			break // ... so stop here
		endif
		
		vIter+=1
	while(vIter<vMaxNumIter)
	
	// run the fit one last time
	FuncFit/Q myFitFunc,W_coef,wave2 /X=wave1iter
	// calc fit
	myFitFunc(W_coef,waveFit,wave1)
	// plot graph
	Display wave1,wave2,waveFit
	ModifyGraph rgb(wave2)=(0,52224,0),rgb(waveFit)=(0,15872,65280)
	TextBox/C/N=text0 "Fit Parameters:\rOffset = "+num2str(W_coef[0])+"\rScale = "+num2str(W_coef[1])
End

Function myFitFunc(pw, yw, xw) : FitFunc
	WAVE pw, yw, xw
	yw = pw[0]+pw[1]*xw
End


Apologies for the pint-point typo... perhaps my fingers were hinting at where I should go after work ;)

Regards,
Kurt
Thanks a lot Kurt for the update. Funnily, both versions of your approach give almost the exact same result in my real-world problem unlike the test case. Usually it's the other way around, where you think it works until you discover the solution utterly fails for the real thing. I wonder if there is a more straight forward solution for the excluding-points part. Maybe 'Extract' in some way?