Removing row of matrix that contains specific value in first column
arnold.downey
Hello. I'm stuck on an issue that seems trivial, so I must be missing some simple mistake. I want to write a function that takes a 2D wave and a specific value as input and deletes all rows of the 2D wave that contain this value in the first column.
For example take the test matrix below:
I want to remove the rows containing 8.129 in the first column, so rows 2 and 4. So I have written a function:
#pragma TextEncoding = "UTF-8"
#pragma rtGlobals=3
//(there are other functions in the procedure that use these specs, including here in case it makes a difference)
Function RemoveRowsContainingValue(Data_Matrix, Val)
	Wave Data_Matrix
	Variable Val
	Variable i
	For(i=0; i<(DimSize(Data_Matrix, 0)); i+=1)
		If(Data_Matrix[i][0] == Val)
			DeletePoints/M=0 i, 1, Data_Matrix
		EndIf
	EndFor
End
When I run RemoveRowsContainingValue(PM_Data_Test_Matrix, 8.129) though, it does not remove the rows. What is it I'm missing here?
Thanks,
Hi,
It could be an issue with precision. The value shown in the table may be truncated and not be precisely equal to the value you pass in. That said, I just tested your function on a table where I entered exactly 8.129, and it worked as intended.
Also, I would work backwards through your matrix, because once you delete a row, the indices of all later rows shift down by one. As an example, I put 8.129 into the first two rows and only the first row was deleted. The first time through the loop, i = 0, so the match is caught and the row deleted. The second 8.129 is now in row 0 because the initial one was deleted, but the index i has moved on to 1, so the loop never looks at row 0, which was formerly row 1 before the first deletion.
Andy
December 8, 2022 at 10:12 am - Permalink
You are comparing two floating point numbers for equality.
Don't do that; use a comparison within a small delta.
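Function RemoveRowsContainingValue(Data_Matrix, Val)
	Wave Data_Matrix
	Variable Val
	Variable i
	// Relative tolerance; abs() keeps it positive for negative Val.
	// (Val == 0 would need an absolute tolerance instead.)
	Variable Epsilon = abs(Val)*1e-5
	// Iterate backwards, as suggested above, so that a deletion does not
	// shift rows that have not been checked yet.
	For(i=DimSize(Data_Matrix, 0)-1; i>=0; i-=1)
		//If(Data_Matrix[i][0] == Val)
		If( abs(Data_Matrix[i][0] - Val) < Epsilon)
			DeletePoints/M=0 i, 1, Data_Matrix
		EndIf
	EndFor
End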
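One common way an exact comparison can fail, sketched below as an illustration only: the thread does not say what number type the test matrix has, so the single-precision wave here is an assumption. Make without /D creates a single-precision wave, while a Variable in a function is double precision, and 8.129 has no exact binary representation, so the two stored values would be expected to differ slightly. DemoFloatCompare is a hypothetical name.

```igor
Function DemoFloatCompare()	// hypothetical name, for illustration only
	Variable val = 8.129			// function variables are double precision

	Make/FREE/N=1 wSP = 8.129		// no /D flag: single-precision wave (assumed case)
	Make/FREE/D/N=1 wDP = 8.129		// /D flag: double-precision wave

	// Exact comparisons against the double-precision variable
	Variable exactSP = (wSP[0] == val)
	Variable exactDP = (wDP[0] == val)

	// Comparison within a relative tolerance, as in the function above
	Variable tol = abs(val) * 1e-5
	Variable closeSP = (abs(wSP[0] - val) < tol)

	Printf "exact match, single-precision wave: %d\r", exactSP
	Printf "exact match, double-precision wave: %d\r", exactDP
	Printf "match within tolerance, single-precision wave: %d\r", closeSP
End
```

If the wave really is single precision, the first comparison should report no match while the tolerance-based one succeeds. Creating the wave with Make/D, or redimensioning it to double precision, is another possible remedy, but comparing within a tolerance is the more robust habit.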
December 8, 2022 at 10:18 am - Permalink
In reply to Hi, It could be an issue… by hegedus
I did already verify that the values in the matrix weren't just being truncated, so this wasn't the issue. But thank you for the suggestion to work backwards, this is smart for removing rows in a loop, so I have implemented this.
December 8, 2022 at 10:34 am - Permalink
In reply to You are comparing two… by JimProuty
Thank you for this, JimProuty. I didn't realize this kind of comparison is problematic for floating points, so I implemented your approach and the function now works as expected!
December 8, 2022 at 10:36 am - Permalink
This function will simply set the row to NaN.
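// Note: the declaration line is missing from the post. The name SetRowsToNaN is a
// placeholder, and the parameter list is inferred from the body below.
Function SetRowsToNaN(Data_Matrix, value, [epsilon])
	wave Data_Matrix
	variable value, epsilon
	variable ic, tmp
	if (ParamIsDefault(epsilon))
		epsilon = 0.1	// default relative tolerance
	endif
	for(ic=0; ic<(DimSize(Data_Matrix, 0)); ic+=1)
		tmp = abs(Data_Matrix[ic][0] - value)
		// abs(value) keeps the tolerance positive when value is negative
		Data_Matrix[ic][] = (tmp < epsilon*abs(value)) ? NaN : Data_Matrix[p][q]
	endfor
	return 0
end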
I had hopes to create this in a collapsed notation to avoid the for-endfor loop. I also might imagine a ZapNAN option that is multi-dimensional aware would remove the offending NaN rows.
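A possible loop-free version of the idea above, offered as a sketch rather than a tested solution. It uses a single wave assignment over the whole matrix. The first column is copied into a free wave beforehand, on the assumption that assigning in place could overwrite column 0 with NaN before the remaining columns are evaluated. The function name, the default tolerance, and the helper wave col0 are all made up for this example, and Data_Matrix is assumed to be a floating-point wave so that it can hold NaN.

```igor
Function SetMatchingRowsToNaN(Data_Matrix, value, [epsilon])	// hypothetical name
	Wave Data_Matrix
	Variable value, epsilon

	if (ParamIsDefault(epsilon))
		epsilon = 1e-5		// assumed default relative tolerance
	endif
	Variable tol = epsilon * abs(value)

	// Snapshot of the first column, so the test below does not depend on
	// values that the assignment has already replaced with NaN.
	Make/FREE/D/N=(DimSize(Data_Matrix, 0)) col0 = Data_Matrix[p][0]

	// One assignment over all rows (p) and columns (q):
	// matching rows become NaN, all other points keep their value.
	Data_Matrix = (abs(col0[p] - value) < tol) ? NaN : Data_Matrix[p][q]
	return 0
End
```

This only marks the rows. Removing them afterwards still needs a separate step.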
December 8, 2022 at 04:47 pm - Permalink
With a for loop it's better to iterate backwards through the rows:
Variable i
For(i=DimSize(Data_Matrix, 0)-1; i>=0; i-=1)
	// possibly delete ith row
EndFor
otherwise you will skip a row when i increments. An additional advantage is that DimSize is calculated only once.
EDIT: I should read all the comments before posting :)
December 9, 2022 at 12:15 am - Permalink
Here is a slightly different approach for the main task: eliminating rows (or columns) according to some criterion. The criterion needs to be converted into an index wave with 1 for "keep" or 0 for "delete". This may be more efficient for very large 2D waves compared to using DeletePoints (not tested though).
Requires IP9; otherwise MatrixOP zapNaNs would need to be replaced by WaveTransform.
// w2d is an n x m matrix
// idx is either 1D with numPnts(idx) = n or 2D with 1 x m columns
// idx has values of either 1 or 0, if 0 at p or q, rows or cols of w2d will be eliminated
// if dl is specified as non-zero integer DimLabels are preserved
// Note: the declaration line is missing from the post. The name RemoveRowsOrCols is a
// placeholder, and the parameter list is inferred from the comments and the body.
function RemoveRowsOrCols(wave w2d, wave idx, [int dl])
	int nRows = DimSize(w2d,0)
	int nCols = DimSize(w2d,1)
	int dim = DimSize(idx,1) != 0 ? 1 : 0
	int nPoints = DimSize(idx,dim)
	int nKeeps = sum(idx)
	int i
	Duplicate/FREE idx temp
	if(dim == 0)
		// eliminate rows
		if(nRows != nPoints)
			print "Incompatible dimensions"
			return 0
		endif
		MultiThread temp = temp[p] == 1 ? p : NaN
		MatrixOP/FREE temp = zapNans(temp)
		Make/FREE/N=(nKeeps, nCols) out
		MultiThread Out = w2d[temp[p]][q]
		if(!paramIsDefault(dl))
			CopyDimlabels/Cols=1 w2d, out
			for(i=0; i<nKeeps; i++)
				SetDimlabel 0, i, $GetDimLabel(w2d, 0, temp[i]), out
			endfor
		endif
	else
		// eliminate columns
		if(nCols != nPoints)
			print "Incompatible dimensions"
			return 0
		endif
		MultiThread temp = temp[0][q] == 1 ? q : NaN
		MatrixOP/FREE temp = zapNans(temp)^t
		Make/FREE/N=(nRows, nKeeps) out
		MultiThread Out = w2d[p][temp[q]]
		if(!paramIsDefault(dl))
			CopyDimlabels/Rows=0 w2d, out
			for(i=0; i<nKeeps; i++)
				SetDimlabel 1, i, $GetDimLabel(w2d, 1, temp[i]), out
			endfor
		endif
	endif
	Duplicate/O out, w2D
end
December 9, 2022 at 02:48 am - Permalink
First, I'd like to assume that you know how to identify the rows that you want to eliminate. Next, it is useful to remember that it is more efficient to eliminate columns than rows so:
1. transpose your input matrix (say inMat). The new matrix inMat^t dimensions are Nr by Nc.
2. Create a 1D wave w1d of Nc points set to 1 for cols that you want to keep and NaN for cols that you want to delete.
3. Execute MatrixOP scaleCols() to set the cols to be deleted to NaN
4. Execute MatrixOP zapNaNs() to remove the NaNs
5. Redimension to the new rows and cols
6. Transpose the final matrix.
All this can be done in one line of code:
MatrixOP/O newMat=Redimension(zapNaNs(scaleCols((inMat^t),w1d)),numCols(inMat),sum(zapNaNs(w1d)))^t
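To connect this with the original question, here is a sketch of how the keep/delete wave w1d might be built from the "value in the first column" criterion and then passed to the one-liner above. The MatrixOP line is copied from the post; the function name, the tolerance, and the variable names are assumptions made for this example. It also assumes inMat is a floating-point wave that contains no NaNs of its own, since zapNaNs would remove those too and the Redimension sizes would then no longer fit.

```igor
Function RemoveRowsByValue_MatrixOP(inMat, val)	// hypothetical name
	Wave inMat
	Variable val

	Variable tol = abs(val) * 1e-5		// assumed relative tolerance
	Variable nRows = DimSize(inMat, 0)

	// One point per row of inMat (= per column of inMat^t):
	// NaN marks a row to delete, 1 marks a row to keep.
	Make/FREE/D/N=(nRows) w1d = (abs(inMat[p][0] - val) < tol) ? NaN : 1

	// The one-liner from the post: transpose, blank the unwanted columns,
	// drop the NaNs, restore the shape, and transpose back.
	MatrixOP/O newMat = Redimension(zapNaNs(scaleCols((inMat^t), w1d)), numCols(inMat), sum(zapNaNs(w1d)))^t
End
```

The result is written to newMat in the current data folder and inMat is left unchanged. The case where every row matches is not handled here.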
December 9, 2022 at 05:35 pm - Permalink
Hello A.G.,
thanks for the one-liner, I need to remove cols/rows quite often and your version is about 30-40% faster!
And I realised that I can use a 1-row wave where I assumed I need to provide a variable, here e.g. sum(w) in Redimension.
Still much to learn about MatrixOP!
December 13, 2022 at 11:33 pm - Permalink
Hello CharLie,
In IP10 you will have a MatrixOP removeCol() function.
I have not decided if it is worth the time to implement removeRow() or use the transpose operator.
A.G.
December 14, 2022 at 05:27 pm - Permalink