Speeding up StringFromList on a lengthy string
ankit7540
Hello all,
We are trying to convert samples provided by an oscilloscope as a string of comma-separated ASCII values. The number of samples is over 60K.
To convert the values in this string to a wave of floating-point values we are using `StringFromList`, as described in the manual.
// This function processes a string (read buffer in this case) and
// returns a wave of floating numbers
Function Wave_from_StringFromList ( sample_str )
String sample_str // A comma-separated string list
String separator = ","
Variable separatorLen = strlen(separator)
Variable numItems = ItemsInList( sample_str , ",")
Variable offset = 0
Variable i
print "\t number of items in the list : ", numItems
Variable timerRefNum
variable microSeconds
timerRefNum = StartMSTimer
make /FREE /d /n=(numItems) temp
string item
for(i=0; i<numItems; i+=1)
// When using offset, the index parameter is always 0
item = StringFromList(0, sample_str , separator, offset)
temp [i] = str2num(item)
offset += strlen(item) + separatorLen
endfor
microSeconds = StopMSTimer(timerRefNum)
Print /d microSeconds /1e6 , " seconds "
duplicate /o temp out
End
////////////////////////////////////////////////////////////////////////
This conversion takes far longer than the actual measurement and data transfer, which typically take <0.1 s.
How can we improve the speed of this conversion?
// output when running the above function
Wave_from_StringFromList ( root:RTO_str )
number of items in the list : 61954
22.6564711 seconds
See the attached pxp file, which has the sample string saved as a global string variable for testing purposes.
How are you capturing the string? Can you capture it into a string wave rather than a string list? Can you force the conversion to a number at the input reading?
In the meantime, I tested an implicit versus explicit for loop.
variable timerRefNum, microSeconds, nitems, ic
nitems = ItemsInList(instr,",")
make/D/FREE/N=(nitems) temp
timerRefNum = StartMSTimer
temp = str2num(StringFromList(p,instr,","))
microSeconds = StopMSTimer(timerRefNum)
Print/D microSeconds/1e6 , " seconds with implicit ", nitems
temp = 0
timerRefNum = StartMSTimer
for (ic=0;ic<nitems;ic+=1)
temp[ic] = str2num(StringFromList(ic,instr,","))
endfor
microSeconds = StopMSTimer(timerRefNum)
Print/D microSeconds/1e6 , " seconds with for loop ", nitems
return 0
end
15.970029476 seconds with implicit 61954
15.926372869 seconds with for loop 61954
Perhaps the str2num conversion is the slow step.
May 30, 2023 at 10:05 am - Permalink
No, StringFromList() is definitely the bottleneck here. I guess this is because the function searches through the list from the start on every call. Do this instead:
variable timerRefNum, microSeconds
timerRefNum = StartMSTimer
wave/T w = ListToTextWave(instr, ",")
make/d/o/n=(numpnts(w)) out = str2num(w[p])
microSeconds = StopMSTimer(timerRefNum)
Print/D microSeconds/1e6, "seconds"
return 0
end
0.0661931 seconds
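For reuse, the fragment above could be wrapped into a self-contained function. This is only a minimal sketch: the names NumericWaveFromList and DemoNumericWaveFromList are made up for illustration, and it assumes Igor Pro 7 or later, where ListToTextWave is available.

```igor
// Hypothetical helper (name is illustrative, not from the thread):
// converts a separated list of ASCII numbers into a free double-precision wave.
Function/WAVE NumericWaveFromList(listStr, sepStr)
	String listStr		// e.g. "1.5,2.5,-3e-2"
	String sepStr		// e.g. ","

	// Split the list once into a free text wave instead of
	// calling StringFromList for each item.
	Wave/T items = ListToTextWave(listStr, sepStr)

	// Convert every text element to a number with an implicit loop over p.
	Make/D/FREE/N=(numpnts(items)) result = str2num(items[p])

	return result
End

// Hypothetical usage example on a short test list.
Function DemoNumericWaveFromList()
	Wave w = NumericWaveFromList("1.5,2.5,-3e-2", ",")
	Print numpnts(w), w[0], w[2]
End
```

Returning a free wave leaves it to the caller to decide whether to keep the data, for example with Duplicate/O inside the calling function.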
May 30, 2023 at 10:41 am - Permalink
In Igor Pro 7 or later, StringFromList has an optional offset parameter that can dramatically speed up iterating through lists of many strings. See the DemoQuickStringFromList function in the help for StringFromList.
ListToTextWave is also a good solution.
May 30, 2023 at 05:47 pm - Permalink
As described by chozo, StringFromList() is the slow step in this issue. String-to-number conversion is not the bottleneck.
Converting to a text wave with `ListToTextWave` is extremely fast and practically solves the issue.
Thank you all, in particular chozo, for the great inputs.
May 30, 2023 at 10:33 pm - Permalink