FindDuplicates improvements for text waves
KZarzana
Apologies if this has been posted already.
FindDuplicates is pretty handy, but it seems to be missing some features when used on text waves.
1. It would be nice if there were a flag to turn off case sensitivity. StringMatch is already case-insensitive, and strsearch and cmpstr can be either, so adding this feature would make the behavior of FindDuplicates more consistent with the other functions.
2. It would also be nice if there was a text wave equivalent of the UN and UNC flags for numeric waves. (Also, a description of those flags isn't in either the pdf of the manual or in the help files in Igor).
Maybe you can use Extract?
•junk[0,;2]="w"+num2str(p)
•junk[1,;2]="W"+num2str(p)
•Extract junk, lowercase, CmpStr((junk)[0,1], "w1", 1)==0
•print lowercase
lowercase[0]= {"w10","w12","w14","w16","w18"}
August 27, 2019 at 09:40 am - Permalink
I should have mentioned that I have workarounds using lowerstr and upperstr, and adding a case-insensitivity flag is more about convenience and having nice compact code than not being able to do something.
August 27, 2019 at 02:12 pm - Permalink
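For readers unfamiliar with the workaround mentioned above, here is a minimal sketch of the idea, written in Python rather than Igor purely to illustrate the logic. It assumes plain ASCII text; `unique_ignoring_case` is a hypothetical helper name, and `str.lower` stands in for Igor's lowerstr. It is not how FindDuplicates is implemented.

```python
def unique_ignoring_case(values):
    """Return the first occurrence of each string, comparing case-insensitively."""
    seen = set()      # lower-cased keys already encountered
    unique = []       # original spellings, in order of first appearance
    for value in values:
        key = value.lower()  # normalize case before comparing (ASCII assumed)
        if key not in seen:
            seen.add(key)
            unique.append(value)
    return unique


names = ["w0", "W1", "w2", "W2", "w0"]
print(unique_ignoring_case(names))  # ['w0', 'W1', 'w2']
```

The comparison happens on the lower-cased key while the original spelling is kept, which is the part that a built-in flag would make more compact.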
I second the request of KZarzana. Both would be nice improvements.
@KZarzana: What do the UN/UNC flags do?
August 28, 2019 at 03:05 am - Permalink
The description of those flags isn't in the help but can be found here: https://www.wavemetrics.com/products/igorpro/newfeatures/whatsnew8
"Added /UN flag to FindDuplicates that generates a wave containing the unique numerical values in the input wave.
Added /UNC flag to FindDuplicates that generates a wave containing the count of occurrences in the input wave of each unique numerical value."
August 28, 2019 at 06:59 am - Permalink
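To make the requested text-wave equivalent concrete, the sketch below shows, in Python and only as an illustration, what "unique values" and "count of occurrences of each unique value" would mean for a list of strings. The helper name `unique_with_counts` is hypothetical, and the output order (first appearance) is an assumption of this sketch, not a statement about how FindDuplicates orders its results.

```python
from collections import Counter


def unique_with_counts(values):
    """Return (unique values, occurrence counts) as two parallel lists."""
    counts = Counter(values)   # Counter remembers first-insertion order
    unique = list(counts)      # analogous to a /UN-style output
    return unique, [counts[v] for v in unique]  # analogous to a /UNC-style output


labels = ["a", "b", "a", "c", "a"]
unique, occurrences = unique_with_counts(labels)
print(unique)       # ['a', 'b', 'c']
print(occurrences)  # [3, 1, 1]
```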
@KZarzana: Thx.
August 28, 2019 at 01:26 pm - Permalink
The conversion of arbitrarily encoded text to upper or lower case is not simple. If you happen to know that the contents of your wave are ASCII characters, it is simple enough for you to convert before calling FindDuplicates, as mentioned above. I will add your request to the wish list.
A.G.
August 29, 2019 at 12:42 pm - Permalink
Great, thanks!
September 3, 2019 at 08:27 am - Permalink
FWIW, case insensitivity support was added for IP9.
A.G.
September 9, 2019 at 10:23 am - Permalink
Hi,
I noticed today that if the input wave has a length of 1, it throws an error of insufficient number of points. Since I am only looking for unique values, a wave with length 1 should return that one point. I have coded around it, but it would be nice if FindDuplicates could handle a length of 1.
Andy
March 19, 2020 at 08:29 am - Permalink
Hi Andy,
I find it logically impossible to look for duplicates when you have less than two points in the wave.
A.G.
March 19, 2020 at 12:54 pm - Permalink
Yes, this is true. But I was just suggesting that the algorithm be robust so I do not have to trap the incoming wave for that condition.
Andy
March 19, 2020 at 01:16 pm - Permalink
We will have to disagree on what it means to be "robust".
My programming philosophy is that I want an operation or function to return an error as soon as one is encountered. Otherwise you may find that something did not work 15 steps later and you would have to trace it all the way to an operation or function that silently returned a zero point wave or a NaN. In other words, sooner or later you need to implement some tests in your code. They could be before you use bad input data or after.
March 19, 2020 at 01:59 pm - Permalink
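The two positions in this exchange can be sketched side by side. The Python below is only an illustration under stated assumptions: `unique_strict` is a hypothetical stand-in for an operation that rejects inputs with fewer than two points, and `unique_with_guard` is the kind of caller-side test that has to be written around it. Neither is Igor code.

```python
def unique_strict(values):
    """Fail fast: refuse inputs that cannot contain duplicates."""
    if len(values) < 2:
        raise ValueError(f"insufficient number of points: {len(values)}")
    return list(dict.fromkeys(values))  # keep first occurrences, in order


def unique_with_guard(values):
    """Caller-side guard: a 0- or 1-point input is already unique."""
    if len(values) < 2:
        return list(values)
    return unique_strict(values)


print(unique_with_guard(["a"]))            # ['a']
print(unique_with_guard(["a", "b", "a"]))  # ['a', 'b']
```

With the strict version, a bad input surfaces at the call that received it; with the guard, the caller decides explicitly what a one-point input should mean.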
Can we compromise and at least make a note in the documentation? It took some troubleshooting to figure out that that specific entry had only 1 point. I was scanning 6000+ input sets and it barfed in the middle.
Andy
March 19, 2020 at 02:04 pm - Permalink
It's always a good idea to improve the documentation. Please email me a specific suggestion.
A.G.
March 19, 2020 at 02:37 pm - Permalink