How to identify whether a list item is a number or a string?
Hello, I have a list of items with a keyword=value; structure. In principle the value can be a number (most common), a string (sometimes), or, rarely, a date or time. Unfortunately, I cannot create a list of keyword names describing how to treat each of them.
I want to convert numbers to numbers and store them in numeric waves, strings in string waves, etc.
An example list is "Name=Sample1_128C_20min;StartTime=2019-06-20 08:00:00;Transmission=0.9;" etc. Obviously, the first value is a string (even though it contains digits), the second is a time, and the third is a number.
Unfortunately, there is also a possible pathological case such as "Name=12SampleA;". I was going to test with str2num(value) and, if it returns NaN, assume the value is a string, but in this pathological case str2num returns a number. I do not care much about the times, but I need to distinguish strings from numbers. Any clever ideas, please?
Hi,
That is odd.
One idea is to use strlen to look at the number of characters and compare it to the numeric version.
Andy
April 5, 2020 at 08:45 am - Permalink
This GREP string matches any character that is not a digit or a "dot" ([^0-9.]). It fails to return anything for inputs that consist only of digits and dots. When it does return something, the input must be a string.
Presuming that you have a way to extract just the values for the keys, you could design a SplitString option and test whether the output is empty. In that case, the value is a pure number. Otherwise, it is a string.
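To make the logic concrete: a value with no character outside the digits and the dot is treated as a number, and anything else as a string. Here is a rough sketch in Python rather than Igor, so it is not a drop-in solution; `looks_numeric` is a hypothetical helper name. Note that this simple character class also rejects signed and exponent forms such as "-1.5" or "1e-6" and accepts things like "1.2.3", so it is only a first approximation.

```python
import re

# Any character that is NOT a digit or a dot
NON_NUMERIC = re.compile(r"[^0-9.]")

def looks_numeric(value: str) -> bool:
    """True when value is non-empty and contains only digits and dots."""
    return bool(value) and NON_NUMERIC.search(value) is None

for v in ["0.9", "12SampleA", "Sample1_128C_20min", "1e-6"]:
    print(v, looks_numeric(v))
```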
April 5, 2020 at 10:11 am - Permalink
GrepString seems to be the way to go, but the above did not work. This seems to mostly work:
print GrepString("12Sample_35C", "(?i)[a-z]" )
This returns 1 if there is at least one letter in the string.
However, it fails on exponents: "1.256e-6" is a number, but the expression above returns 1 because it sees the e. I wonder if one could improve the regular expression to accept a single e (or E) character. Does anyone know how to fix the grep syntax for that?
April 5, 2020 at 10:47 am - Permalink
Hm, if lost, ask Mr. Google...
https://www.regular-expressions.info/floatingpoint.html
•print GrepString("1235.3226", "^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$")
1
•print GrepString("1235.32Sample26", "^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$")
0
•print GrepString("-1235.3226", "^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$")
1
•print GrepString("Sample1", "^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$")
0
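For anyone puzzling over the pattern, here is the same expression split into commented pieces. This is a sketch in Python, not Igor code; the name `is_number` is made up for illustration, and `fullmatch` is used in place of the ^ and $ anchors.

```python
import re

FLOAT_RE = re.compile(r"""
    [-+]?                # optional leading sign
    [0-9]*\.?[0-9]+      # optional digits, optional dot, at least one digit
    ([eE][-+]?[0-9]+)?   # optional exponent such as e-6 or E+10
""", re.VERBOSE)

def is_number(text: str) -> bool:
    """True when the whole string is a plain or exponent-form number."""
    return FLOAT_RE.fullmatch(text) is not None

for s in ["1235.3226", "1235.32Sample26", "-1235.3226", "Sample1", "1.256e-6", "12SampleA"]:
    print(s, is_number(s))
```

One caveat of this pattern: it requires at least one digit after the dot, so a value written as "12." is not accepted, while ".5" is.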
April 5, 2020 at 10:58 am - Permalink
Well, I thought also
print str2num(cleanupname("12SampleA",0))
--> NaN
print str2num("12SampleA")
--> 12
Maybe something could be done here???
April 5, 2020 at 11:21 am - Permalink
In reply to Even better print str2num… by jjweimer
I don't think this satisfies the basic request of getting a number if the string is such.
print str2num(cleanupname("12",0))
also returns NaN where 12 would be desired.
Andy
April 5, 2020 at 11:26 am - Permalink
Can you address this when writing the keyword=value string by using a different keyword for numeric values versus string values?
April 6, 2020 at 04:39 am - Permalink
No, not really. Some of it is history (20 years of it by now), and some names are human-readable names (from upstream data sources) over which I have little to no control. But this is more or less solved well enough (regular expressions are amazing!):
GrepString(StringToTest, "^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$")
This returns 1 when StringToTest is a number, and it has identified all the test cases I have thrown at it so far. I check first for a number; if that fails, I test for a date-time, using some code I found here and modified to parse common date-time formats (so many possibilities!); if that does not work either, I assume it is a string. I gave up on cases where I have a number with a unit, which, unfortunately, is also an option. I may have to add some special-case code for those if needed. This is solved for now. Not pretty, but done.
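The order of checks described here (number first, then date-time, then string) could be sketched roughly as follows. This is Python rather than Igor code and not the actual procedure; `classify`, `parse_list`, and `DATE_FORMATS` are hypothetical names, and only the single date-time format from the example list is assumed, whereas real data would need more formats.

```python
import re
from datetime import datetime

FLOAT_RE = re.compile(r"[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?")
# Assumption: only the format seen in the example list
DATE_FORMATS = ("%Y-%m-%d %H:%M:%S",)

def classify(value):
    """Return (kind, converted_value); kind is number, datetime or string."""
    if FLOAT_RE.fullmatch(value):
        return "number", float(value)
    for fmt in DATE_FORMATS:
        try:
            return "datetime", datetime.strptime(value, fmt)
        except ValueError:
            pass  # not this format, try the next one
    return "string", value

def parse_list(text):
    """Split a 'key=value;key=value;' list into {key: (kind, value)}."""
    result = {}
    for item in text.split(";"):
        key, sep, value = item.partition("=")
        if not sep:
            continue  # skip empty items and items without '='
        result[key] = classify(value)
    return result

example = "Name=Sample1_128C_20min;StartTime=2019-06-20 08:00:00;Transmission=0.9;"
for key, (kind, value) in parse_list(example).items():
    print(key, kind, value)
```

Values with a unit, such as "0.9 mm", fall through to the string branch in this sketch, which matches the limitation mentioned above.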
April 9, 2020 at 06:45 pm - Permalink