 
    
    
    
Identifying and deleting columns from a 2D wave that contain the same value in all rows
Hi,
This probably has a pretty simple solution, but I just couldn't find it. I have a 2D wave in which some columns hold the same value in every row. In my case, I need to delete every column whose rows all read zero or NaN. I'm looking for a simple way to do this.
In a different situation, I had to set to NaN every row in which all columns had the same value. I am attaching the code I put together for that row case, but I'm pretty sure a more elegant way exists using some built-in function. Kindly advise.
Thanks!
Function raw2Dwavecheck()
    // This function sets all such rows to nan where the entire row has the same value.
    wave cantons_all_pmf, cantons_all_err_pmf
    variable i, j, n, temp, val
    variable rowcount = dimsize(cantons_all_pmf, 0)
    variable colcount = dimsize(cantons_all_pmf, 1)

    for(i = 0; i < rowcount; i += 1)
        n = 0
        temp = 0
        val = cantons_all_pmf[i][0]  // taking the first value of the row and check the remaining columns against it
        for(j = 0; j < colcount; j += 1)
            temp = cantons_all_pmf[i][j] / val  // checking the quotient for each column in a row
            if(temp == 1)
                n = n + 1
            endif
        endfor
        print n / colcount
        if(n / colcount > 0.8)  // if more than 80% columns give a quotient of 1, nan the row
            cantons_all_pmf[i][] = nan
            cantons_all_err_pmf[i][] = nan
        endif
    endfor
end

I wrote this. It works but is not the most elegant since I am running two for loops. Wondering if there is a simpler non-loopy way to perform this task.
October 25, 2021 at 08:46 am
In reply to I wrote this. It compiles… by Peeyush Khare
Hi Peeyush,
One technique is to use the FindDuplicates operation.
FindDuplicates [flags ] srcWave
The FindDuplicates operation identifies duplicate values in a wave and optionally creates various output waves. srcWave can be either numeric or text.
When srcWave is numeric, the /DN, /INDX, /RN and /SN flags create output waves as described below. If you omit all of these flags then FindDuplicates does nothing.
Loop through the columns, preferably in reverse order, and check whether the /RN wave for each column has exactly one point and whether that point equals zero. If so, delete that column and move on to the next lower one.
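The reverse-order loop described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name DeleteZeroOrNaNColumns is hypothetical, and the per-column test is written as an explicit check of every row instead of a FindDuplicates call, so that columns consisting entirely of NaN are handled in the same pass. The case where every column qualifies for deletion is not treated specially.

```igor
// Hypothetical helper (not from the thread): deletes every column of the
// 2D wave w whose entries are all zero or NaN.
Function DeleteZeroOrNaNColumns(w)
    Wave w

    Variable nRows = DimSize(w, 0)
    Variable nCols = DimSize(w, 1)
    Variable r, c, val, allEmpty

    // Walk from the last column to the first so that deleting a column
    // does not shift the indices of the columns still to be checked.
    for(c = nCols - 1; c >= 0; c -= 1)
        allEmpty = 1
        for(r = 0; r < nRows; r += 1)
            val = w[r][c]
            // numtype returns 2 for NaN; any other non-zero value
            // means the column must be kept.
            if(numtype(val) != 2 && val != 0)
                allEmpty = 0
                break
            endif
        endfor
        if(allEmpty)
            DeletePoints/M=1 c, 1, w   // remove one column at index c
        endif
    endfor
End
```

It would be called with the wave to clean, for example DeleteZeroOrNaNColumns(cantons_all_pmf). Because the wave is modified in place, working on a duplicate first may be prudent.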
Andy
October 25, 2021 at 09:50 am
A simple way to detect if columns have constant value is, e.g., for the matrix 'ddd':
MatrixOP/O aa=varCols(ddd)
This should give you a 0 entry for each column that has a constant value.
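As a minimal sketch of how the varCols result might be used, the function below deletes each column whose variance is exactly zero. It uses a plain reverse loop with DeletePoints, not the scaling approach described in this reply, and the function name DeleteConstantColumns is hypothetical. It assumes the data contain no NaNs, since a NaN in a column would make that column's variance NaN. For constant values other than zero, rounding could leave a tiny non-zero variance, so a small tolerance may be needed in place of the exact comparison.

```igor
// Hypothetical helper (not from the thread): deletes every column of the
// 2D wave w that has zero variance. Assumes w contains no NaNs.
Function DeleteConstantColumns(w)
    Wave w

    // varCols returns a 1 x nCols wave holding the variance of each column.
    MatrixOP/FREE colVar = varCols(w)

    Variable c
    // Go from the last column to the first so that the indices in colVar
    // stay aligned with the columns remaining in w.
    for(c = DimSize(w, 1) - 1; c >= 0; c -= 1)
        if(colVar[0][c] == 0)
            DeletePoints/M=1 c, 1, w   // remove one column at index c
        endif
    endfor
End
```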
The choice of what to do next depends on whether or not your data contain NaNs. If not, it is fairly easy to accomplish what you want in three lines of code...
You start by creating a scaling wave that has NaN for each 0 entry in the varCols wave:
Next you scale the input matrix:
So the code you need to execute is:
I hope this helps,
AG
October 25, 2021 at 10:03 am
Thanks a ton, Andy and AG! Great learning experience. I very much appreciate it!
October 26, 2021 at 04:21 am