Replacestring for a text wave
MGHuber
make /n=10 /t TxtWave
TxtWave[0,4]="Psi"
TxtWave[5,9]="PSI "
//In the form of replacestring
Replace4txtWave("PSI ",TxtWave,"Psi")
function Replace4txtWave(replaceThisStr, TxtWave, withThisStr)
	string replaceThisStr
	wave /t TxtWave
	string withThisStr
	variable i
	for(i=0;i<numpnts(TxtWave);i+=1)
		if(cmpstr(replaceThisStr,TxtWave[i])==0)
			TxtWave[i]=withThisStr
		endif
	endfor
end
Some of my text waves are long and my Replace4txtWave function is slowing my code down quite a bit.
Note that you can't use the <expression> ? <TRUE> : <FALSE> construction for strings.
February 19, 2015 at 07:15 am - Permalink
TxtWave[] = cmpstr(replaceThisStr, TxtWave[p]) == 0 ? withThisStr : TxtWave[p]
February 19, 2015 at 07:19 am - Permalink
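For anyone reading along, here is a minimal sketch of the SelectString form of the same wave assignment, which avoids the ? : construction. The function name and the demo wave are made up for illustration and are not from the posts above. Note that cmpstr ignores case by default, so the match below relies on the trailing space in "PSI ".

```igor
Function DemoSelectStringReplace()
	// Hypothetical demo wave; these names are not from the thread.
	Make/O/T demoTxtWave = {"Psi", "PSI ", "Psi", "PSI "}

	String replaceThisStr = "PSI "
	String withThisStr = "Psi"

	// SelectString returns its second argument when the first argument is zero
	// and its third argument otherwise, so matching points get withThisStr
	// and all other points keep their current text.
	demoTxtWave = SelectString(cmpstr(demoTxtWave[p], replaceThisStr) == 0, demoTxtWave[p], withThisStr)

	Print demoTxtWave
End
```

This trades the explicit loop for a single wave assignment. Whether it is actually faster depends on the Igor version and the wave size.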
What are your typical text wave sizes, and do you know the ratio between matching and non-matching entries?
February 19, 2015 at 11:19 am - Permalink
Use FindValue in a do-while loop, with the V_Value output of the FindValue operation specifying where to start searching again. This operation has the advantage of being case sensitive, etc.
February 19, 2015 at 02:55 pm - Permalink
// Header line reconstructed; the function name is an assumption.
function Replace4txtWave(replaceThisStr, TxtWave, withThisStr)
	string replaceThisStr
	wave /t TxtWave
	string withThisStr
	variable V_Value = 0
	do
		//Look at documentation for what TXOP does.
		FindValue/S=(V_Value) /Text=replaceThisStr/TXOP=1/z TxtWave
		if (V_Value == -1)
			break
		endif
		TxtWave[V_Value] = replacestring(replaceThisStr, TxtWave[V_Value], withThisStr)
	while(1)
end
February 19, 2015 at 03:02 pm - Permalink
February 20, 2015 at 06:41 am - Permalink
February 20, 2015 at 06:49 am - Permalink
The "programmer notes" in DisplayHelpTopic "Text Waves" indicate a bottleneck in the memory management for text waves (which is hard/expensive to avoid).
HJ
February 20, 2015 at 07:21 am - Permalink
Function test(waveSize)
	Variable waveSize
	String targetString = "PSI"
	String replacementString = "Psi"
	Make/N=(waveSize)/T/FREE/O originalTextWave
	originalTextWave[0,(waveSize/2) - 1] = replacementString
	originalTextWave[(waveSize/2), *] = targetString
	Variable method
	For (method = 0; method < 4; method +=1)
		Duplicate/O/FREE originalTextWave, testTextWave
		Variable start = StopMSTimer(-2)
		Switch (method)
			case 0: // Original method
				Replace4txtWave(targetString, testTextWave, replacementString)
				break;
			case 1: // SelectString method
				Replace4txtWave1(targetString, testTextWave, replacementString)
				break;
			case 2: // FindValue method
				Replace4txtWave2(targetString, testTextWave, replacementString)
				break;
			case 3: // MultiThread text wave assignment method
#if IgorVersion() >= 7
				Replace4txtWave3(targetString, testTextWave, replacementString)
#else
				print "Method 3 requires Igor 7."
#endif
				break;
		EndSwitch
		printf "Execution took %g ms using method %d.\r", (StopMSTimer(-2) - start)/1e3, method
	EndFor
End
function Replace4txtWave(replaceThisStr, TxtWave, withThisStr)
	string replaceThisStr
	wave /t TxtWave
	string withThisStr
	variable i
	Variable waveNumPnts = numpnts(TxtWave)
	for(i=0;i<waveNumPnts;i+=1)
		if(cmpstr(replaceThisStr,TxtWave[i])==0)
			TxtWave[i]=withThisStr
		endif
	endfor
end

function Replace4txtWave1(replaceThisStr, TxtWave, withThisStr)
	string replaceThisStr
	wave /t TxtWave
	string withThisStr
	TxtWave = SelectString( cmpstr(TxtWave[p],replaceThisStr)==0, TxtWave[p], withThisStr)
end

function Replace4txtWave2(replaceThisStr, TxtWave, withThisStr)
	string replaceThisStr
	wave /t TxtWave
	string withThisStr
	variable startPoint = 0
	Variable waveNumPnts = numpnts(TxtWave)
	do
		FindValue/S=(startPoint)/TEXT=replaceThisStr/TXOP=4 TxtWave
		if (V_value == -1) // Value not found
			break
		endif
		TxtWave[V_value]=withThisStr
		startPoint = V_value + 1
	while (startPoint < waveNumPnts)
end

#if IgorVersion() >= 7
function Replace4txtWave3(replaceThisStr, TxtWave, withThisStr)
	string replaceThisStr
	wave /t TxtWave
	string withThisStr
	// MultiThread with a text wave requires Igor 7 built 21Feb2015 or later.
	MultiThread TxtWave = Replace4txtWave3_worker(TxtWave[p], replaceThisStr, withThisStr)
end

ThreadSafe Function/S Replace4txtWave3_worker(actualString, replaceThisStr, withThisStr)
	String actualString
	string replaceThisStr
	string withThisStr
	if(cmpstr(replaceThisStr,actualString)==0)
		return withThisStr
	endif
	return actualString // No replacement necessary
End
#endif // IgorVersion() >= 7
Here are the results:
Igor 6
test(200000)
Execution took 341.875 ms using method 0.
Execution took 383.267 ms using method 1.
Execution took 330.678 ms using method 2.
Method 3 requires Igor 7.
Execution took 15.8403 ms using method 3.
Igor 7
test(200000)
Execution took 108.052 ms using method 0.
Execution took 220.391 ms using method 1.
Execution took 328.766 ms using method 2.
Execution took 88.5857 ms using method 3.
If any Igor 7 preview testers want to reproduce these results, you must use the latest build dated 21Feb2015 or later in order to use method 3.
For even larger waves, method 3 performs even better relative to the other methods.
One note: if you're using Igor 6, the following three lines take longer to execute than the rest of the code:
Make/N=(waveSize)/T/FREE/O originalTextWave
originalTextWave[0,(waveSize/2) - 1] = replacementString
originalTextWave[(waveSize/2), *] = targetString
Igor 7 has an optimization for text waves that apparently makes the creation of the text wave much faster.
February 20, 2015 at 02:15 pm - Permalink
Here is the test code:
Function test(waveSize)
	Variable waveSize
	String targetString = "PSI"
	String replacementString = "Psi"
	Make/N=(waveSize)/T/FREE/O originalTextWave
	originalTextWave[0,(waveSize/2) - 1] = replacementString
	originalTextWave[(waveSize/2), *] = targetString
	Variable method
	For (method = 0; method < 6; method +=1)
		Duplicate/O/FREE originalTextWave, testTextWave
		// print "before:", testTextWave
		Variable start = StopMSTimer(-2)
		Switch (method)
			case 0: // Original method
				Replace4txtWave(targetString, testTextWave, replacementString)
				break;
			case 1: // SelectString method
				Replace4txtWave1(targetString, testTextWave, replacementString)
				break;
			case 2: // FindValue method
				Replace4txtWave2(targetString, testTextWave, replacementString)
				break;
			case 3: // MultiThread text wave assignment method
#if IgorVersion() >= 7
				Replace4txtWave3(targetString, testTextWave, replacementString)
#else
				print "Method 3 requires Igor 7."
#endif
				break;
			case 4: // Extract method
				Replace4txtWave4(targetString, testTextWave, replacementString)
				break;
			case 5: // Grep method
				Replace4txtWave5(targetString, testTextWave, replacementString)
				break;
		EndSwitch
		// print "after:", testTextWave
		printf "Execution took %g ms using method %d.\r", (StopMSTimer(-2) - start)/1e3, method
	EndFor
End
function Replace4txtWave(replaceThisStr, TxtWave, withThisStr)
	string replaceThisStr
	wave /t TxtWave
	string withThisStr
	variable i
	Variable waveNumPnts = numpnts(TxtWave)
	for(i=0;i<waveNumPnts;i+=1)
		if(cmpstr(replaceThisStr,TxtWave[i])==0)
			TxtWave[i]=withThisStr
		endif
	endfor
end

function Replace4txtWave1(replaceThisStr, TxtWave, withThisStr)
	string replaceThisStr
	wave /t TxtWave
	string withThisStr
	TxtWave = SelectString( cmpstr(TxtWave[p],replaceThisStr)==0, TxtWave[p], withThisStr)
end

function Replace4txtWave2(replaceThisStr, TxtWave, withThisStr)
	string replaceThisStr
	wave /t TxtWave
	string withThisStr
	variable startPoint = 0
	Variable waveNumPnts = numpnts(TxtWave)
	do
		FindValue/S=(startPoint)/TEXT=replaceThisStr/TXOP=4 TxtWave
		if (V_value == -1) // Value not found
			break
		endif
		TxtWave[V_value]=withThisStr
		startPoint = V_value + 1
	while (startPoint < waveNumPnts)
end

#if IgorVersion() >= 7
function Replace4txtWave3(replaceThisStr, TxtWave, withThisStr)
	string replaceThisStr
	wave /t TxtWave
	string withThisStr
	// MultiThread with a text wave requires Igor 7 built 21Feb2015 or later.
	MultiThread TxtWave = Replace4txtWave3_worker(TxtWave[p], replaceThisStr, withThisStr)
end

ThreadSafe Function/S Replace4txtWave3_worker(actualString, replaceThisStr, withThisStr)
	String actualString
	string replaceThisStr
	string withThisStr
	if(cmpstr(replaceThisStr,actualString)==0)
		return withThisStr
	endif
	return actualString // No replacement necessary
End
#endif // IgorVersion() >= 7
function Replace4txtWave4(replaceThisStr, TxtWave, withThisStr)
	string replaceThisStr
	wave /t TxtWave
	string withThisStr
	Extract/FREE/O/INDX/T TxtWave, ExtractedWave, (cmpstr(replaceThisStr,TxtWave[p])==0)
	variable i
	Variable numToReplace = numpnts(ExtractedWave)
	for(i=0;i<numToReplace;i+=1)
		TxtWave[ExtractedWave[i]]=withThisStr
	endfor
end

function Replace4txtWave5(replaceThisStr, TxtWave, withThisStr)
	string replaceThisStr
	wave /t TxtWave
	string withThisStr
	Grep/INDX/Q/E="^" + replaceThisStr + "$" TxtWave
	WAVE W_Index
	variable i
	Variable numToReplace = numpnts(W_Index)
	for(i=0;i<numToReplace;i+=1)
		TxtWave[W_Index[i]]=withThisStr
	endfor
end
Here are results on my Windows machine (the previous test results were on my Macintosh machine):
Igor 6
•test(200000)
Execution took 246.57 ms using method 0.
Execution took 274.317 ms using method 1.
Execution took 258.047 ms using method 2.
Method 3 requires Igor 7.
Execution took 0.846398 ms using method 3.
Execution took 272.347 ms using method 4.
Execution took 180.707 ms using method 5.
Igor 7
•test(200000)
Execution took 104.597 ms using method 0.
Execution took 87.317 ms using method 1.
Execution took 219.459 ms using method 2.
Execution took 47.842 ms using method 3.
Execution took 125.289 ms using method 4.
Execution took 166.079 ms using method 5.
February 23, 2015 at 07:21 am - Permalink
In that special case here one can speed up the initial wave assignment with preallocated storage
Nice to see IP7 perform so quickly with the new text wave memory layout mentioned in [1]. Looking at your code the new memory layout seems also to be the default in IP7.
I'd extend the grep code as follows
so that any funny characters in replaceThisStr are taken literally.
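The code for that suggestion did not survive in this copy of the thread. As a rough sketch only, one way to get that effect is PCRE's \Q...\E quoting, which should apply here assuming Igor's PCRE-based Grep honors it. The function name Replace4txtWave5Literal is hypothetical; the rest mirrors Replace4txtWave5 from the test code.

```igor
Function Replace4txtWave5Literal(replaceThisStr, TxtWave, withThisStr)
	String replaceThisStr
	WAVE/T TxtWave
	String withThisStr

	// Assumption: \Q...\E makes the regular expression engine treat the
	// enclosed text literally. The backslashes are doubled because this
	// is an Igor string literal.
	String regExp = "^\\Q" + replaceThisStr + "\\E$"

	Grep/INDX/Q/E=regExp TxtWave
	WAVE W_Index

	Variable i
	Variable numToReplace = numpnts(W_Index)
	for(i = 0; i < numToReplace; i += 1)
		TxtWave[W_Index[i]] = withThisStr
	endfor
End
```

One caveat with this approach: if replaceThisStr itself contains the sequence \E, the quoting ends early, so truly arbitrary input would need extra handling.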
[1]: http://wavemetrics.com/search/viewmlid.php?mid=27741
February 25, 2015 at 04:03 am - Permalink
Using Igor 6, using /T=size does decrease the time it takes to execute the Make statement that creates the originalTextWave wave, but it doesn't change the performance of the text replacement by more than a few ms one way or another.
Using Igor 7, the text wave is created quickly whether or not /T=size is used. However, if it's created using /T=size, the replacement methods themselves execute more slowly. In particular, method 3 (Igor 7 only), which uses a MultiThread wave assignment statement, is slower because the MultiThread keyword is ignored if the target of a text wave assignment stores the text data in contiguous bytes instead of as an array of pointers. Using /T=size forces the wave's text data to be stored as contiguous bytes. So I don't recommend its use in this situation unless the real-life bottleneck is in the initial creation of the wave and not in the replacement.
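To check on your own machine whether wave creation is the bottleneck, a sketch along these lines times the creation step alone, with and without /T=size. The function name TimeTextWaveCreation and the wave names are made up for illustration; the Make flags and the StopMSTimer(-2) timing pattern are taken from the test code in this thread.

```igor
Function TimeTextWaveCreation(waveSize)
	Variable waveSize

	String targetString = "PSI"
	Variable start

	// Create and fill a text wave with the default Make/T storage.
	start = StopMSTimer(-2)
	Make/N=(waveSize)/T/FREE/O defaultWave
	defaultWave = targetString
	printf "Without /T=size: %g ms\r", (StopMSTimer(-2) - start)/1e3

	// Create and fill a text wave with storage preallocated via /T=size.
	start = StopMSTimer(-2)
	Make/N=(waveSize)/T=(strlen(targetString))/FREE/O preallocWave
	preallocWave = targetString
	printf "With /T=size: %g ms\r", (StopMSTimer(-2) - start)/1e3
End
```

If the two timings are close, as reported above for Igor 7, there is little reason to pay the slower replacement cost that comes with /T=size.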
Here are the timings on Macintosh using Igor 7:
// Line 8 is: Make/N=(waveSize)/T/FREE/O originalTextWave
•test(200000)
Execution took 112.238 ms using method 0.
Execution took 77.6911 ms using method 1.
Execution took 344.949 ms using method 2.
Execution took 72.5701 ms using method 3.
Execution took 109.374 ms using method 4.
Execution took 143.778 ms using method 5.
// Line 8 is: Make/N=(waveSize)/T=(strlen(targetString))/FREE/O originalTextWave
•test(200000)
Execution took 197.69 ms using method 0.
Execution took 168.698 ms using method 1.
Execution took 443.477 ms using method 2.
Execution took 287.046 ms using method 3.
Execution took 205.962 ms using method 4.
Execution took 181.9 ms using method 5.
Yes, that's a good idea when creating a regular expression from user input.
February 25, 2015 at 07:44 am - Permalink