For Loop to remove errors from multiple files
thunderstruck71
I've got 30 files 'FILE_01, _02, _03 etc...' and in these files there are several thousand points, some of which need removing. I have the code below which cycles through the files, but I'm stuck on where to include a loop that goes through each wave looking for numbers <0 and removing them. I know this is a real beginner question but I'd appreciate any pointers!
Thanks.
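function remove_errors()
	variable i
	string name,bin	// 'name' must be declared before it is assigned below
	for (i=1;i<=30;i+=1)
		bin=num2str(i)
		if(strlen(bin)==1)
			bin="0"+bin	// pad single digits: 1 -> "01"
		endif
		name="root:cdp_"+bin	// full path of the wave, e.g. root:cdp_01
	endfor
end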
If so, and your "files" were actually waves then you could execute the following function:
string list
variable i
string nextwave = ""
do
nextwave = stringfromlist(i, list)
if (strlen(nextwave) == 0)
break
endif
wave w = $nextwave
w = SelectNumber(numtype(w) == 2, w,0)
i += 1
while(1)
end
say, if your waves of interest were named w1, w2, w3 you would execute:
from the command line. If you have more waves, you can use the WaveList function to get a string list of all the waves.
"NumType" in the above code identifies NaNs, and SelectNumber can be used here a bit like an if-statement.
I hope this points in the right direction!
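To spell out the NumType / SelectNumber idea as a complete function, here is a minimal sketch. The function name nans_to_zero_in_list is made up for illustration, and the for loop with a WaveExists check is just one way to walk the list:

```igor
Function nans_to_zero_in_list(list)
	String list		// semicolon-separated wave names, e.g. "w1;w2;w3;"
	Variable i
	for (i = 0; i < ItemsInList(list); i += 1)
		Wave/Z w = $StringFromList(i, list)
		if (!WaveExists(w))
			continue	// skip names that do not resolve to a wave
		endif
		// numtype returns 2 for NaN. SelectNumber returns its second
		// argument when the condition is 0 and its third when it is 1.
		w = SelectNumber(numtype(w) == 2, w, 0)
	endfor
End
```

Assuming waves named w1, w2 and w3 exist in the current data folder, this could be called as nans_to_zero_in_list("w1;w2;w3;"), or with a list built by WaveList.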
Cheers
C
January 24, 2012 at 04:28 am - Permalink
So now let's look at the first step of iterating through your files. Your solution may work, but it is hardcoded to this specific case, with the exact names and count of your waves (better to call them 'waves', because 'files' makes you think of stuff saved on the hard drive instead of data loaded in Igor). How about a way to process the waves on demand for this simple task? I would do this by using the 'Exec. Cmd.' button in the Data Browser. Like so:
Function remove_errors(my_wave)
	wave my_wave
	//do stuff with my_wave here
End
Now you can select all the waves you need to process and hit the 'Exec. Cmd.' button. Write 'remove_errors(%s)' in there and hit OK (%s is a placeholder where the actual wave name will be pasted). All the waves you have selected will be processed with your function. No need for loops and hardcoded wave names (leave that until you have really complicated tasks, and even then hardcoding stuff can always be avoided).
Now to the actual task of inserting NaNs. This is quickly done in one line. Just use 'my_wave = my_wave[p]<0 ? NaN : my_wave[p]' to replace all values smaller than zero with NaN. This expression is a shortcut for an if-else statement. You may want to read more about it in the help browser, but here is what it basically does: the assignment automatically goes through all points of the wave, so there is no need for a loop. The current point is expressed by the counter p. First comes a question: is the current point <0? The answer decides whether the value before the ':' (here NaN) or the value after it is picked. The full function will then look like this:
Function remove_errors(my_wave)
	wave my_wave
	my_wave = my_wave[p]<0 ? NaN : my_wave[p]	// negative points become NaN
End
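A quick way to convince yourself that the one-liner does what is described is to try it on a small throwaway wave. This is only a sketch: negatives_to_nan, demo_negatives_to_nan and demo_wave are made-up names, chosen so they do not clash with anything in your experiment.

```igor
Function negatives_to_nan(w)
	Wave w
	// The condition is evaluated for every point p of the destination wave
	w = w[p] < 0 ? NaN : w[p]
End

Function demo_negatives_to_nan()
	// demo_wave is a hypothetical scratch wave with two negative points
	Make/O demo_wave = {1, -2, 3, -0.5, 4}
	negatives_to_nan(demo_wave)
	Print demo_wave	// points 1 and 3 should now read NaN
End
```

Run demo_negatives_to_nan() from the command line and look at the history area.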
January 24, 2012 at 04:59 am - Permalink
I was just curious which 1-liner runs faster:
(assuming the task is NaN's to 0) runs about twice as fast compared to
January 24, 2012 at 05:36 am - Permalink
Hmm, interesting that it makes such a huge difference. I guess it has something to do with what is written in the help file:
Unlike the ? : conditional operator, SelectNumber always evaluates all of the numeric expression parameters val1, val2, ...
SelectNumber works in a macro, whereas the conditional operator does not.
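If you want to check the timing on your own machine, a rough sketch along these lines might do. The function name, wave names and point count are made up, and I am assuming the comparison is between the SelectNumber form and the ? : form of the NaN-to-0 assignment. startMSTimer and stopMSTimer report the elapsed time in microseconds.

```igor
Function compare_nan_replacement()
	// testA and testB are hypothetical scratch waves with identical content
	Make/O/D/N=1000000 testA
	testA = mod(p, 10) == 0 ? NaN : gnoise(1)	// every 10th point is NaN
	Duplicate/O testA, testB

	Variable timer = startMSTimer
	testA = SelectNumber(numtype(testA) == 2, testA, 0)
	Variable tSelect = stopMSTimer(timer)

	timer = startMSTimer
	testB = numtype(testB[p]) == 2 ? 0 : testB[p]
	Variable tCond = stopMSTimer(timer)

	printf "SelectNumber: %g us, conditional operator: %g us\r", tSelect, tCond
End
```

The numbers will depend on the wave length and the machine, so treat a single run as a rough indication only.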
January 24, 2012 at 05:53 am - Permalink
Thanks for the help. Actually what I was trying to do was remove negative numbers from several waves, so actually turn some values into NaNs. I've solved it now using two different functions:
function remove_wave_errors(wave1)
	wave wave1
	variable i,j=numpnts(wave1)
	for (i=0;i<j;i+=1)
		if(wave1[i]<0)
			wave1[i]=NaN	// replace each negative point with NaN
		endif
	endfor
end
the above function is executed within this function:
function remove_errors()
	variable i
	string name,bin
	for (i=1;i<=30;i+=1)
		bin=num2str(i)
		if(strlen(bin)==1)
			bin="0"+bin	// pad single digits: 1 -> "01"
		endif
		name="cdp_"+bin
		wave source_data=$name	// look up the wave by name
		remove_wave_errors(source_data)
	endfor
end
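As a side note, the zero-padding with num2str and strlen can be done in one step with sprintf. A small sketch, where cdp_name is a made-up helper name:

```igor
Function/S cdp_name(i)
	Variable i
	String name
	sprintf name, "cdp_%02d", i	// %02d pads single digits with a leading zero
	return name
End
```

Inside the loop you could then write 'wave source_data=$cdp_name(i)' instead of building the name by hand.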
January 24, 2012 at 06:32 am - Permalink
I didn't know that the Exec Cmd button could be used in this way. Your way seems a hell of a lot easier than the 2 functions I've got to achieve the same result. Thanks.
January 24, 2012 at 06:46 am - Permalink
This will work, but as chozo pointed out before, this will be much faster:
function remove_wave_errors(wave1)
	wave wave1
	wave1 = wave1[p] < 0 ? NaN : wave1[p]	// one wave assignment instead of a loop
end
If you're going to be doing much coding, it's best to get familiar with wave assignment statements sooner rather than later, because they are faster and will make coding a lot easier. For more information, execute the following command on Igor's command line:
January 24, 2012 at 07:16 am - Permalink
Thanks. One of the reasons I needed a function like the one I've got working is because I'll potentially be doing it over and over with lots of data, so although highlighting the files and using the Exec Cmd button is quick, long term being able to execute the function will knock some time off the task.
January 24, 2012 at 07:23 am - Permalink
Sure...it sounds like using a function that can fix the errors on all waves with a certain name pattern automatically is the way to go here. I was just pointing out that chozo's way of actually removing the errors was faster than your way. If your waves are not very long then you probably won't notice any difference in performance, but if you are operating on lots of long waves it's always nice to have things go as fast as possible.
If you ever get to writing assignment statements for multi-dimensional waves, using nested for loops gets to be a real drag and can be error prone. So knowing how to "correctly" write a wave assignment statement can be very helpful.
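To illustrate the multi-dimensional case, here is a minimal sketch for a 2D wave. The function name negatives_to_nan_2d is made up. In a wave assignment p is the row index and q is the column index of the point being assigned, so no nested loops are needed:

```igor
Function negatives_to_nan_2d(w2d)
	Wave w2d
	// Evaluated once per element; p = row, q = column of the destination
	w2d = w2d[p][q] < 0 ? NaN : w2d[p][q]
End
```

The loop version of the same thing would need one for loop over rows and another over columns, with the indices kept straight by hand.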
January 24, 2012 at 07:36 am - Permalink
Ok, maybe you want to reduce the hardcoded stuff in your other function a bit, like aclight was already hinting? Like so:
Function Remove_cdp_errors(NameMatch)
	string NameMatch
	string names = WaveList(NameMatch+"*",";","") // generates a list with all waves beginning with 'NameMatch'
	variable i
	for (i = 0; i < ItemsInList(names); i += 1) // go through the list
		wave source_data=$(StringFromList(i, names))
		remove_wave_errors(source_data)
	endfor
End
Call 'Remove_cdp_errors("cdp")' to get the job done. Now you are free to choose the wave name pattern. Getting the function to work for all waves in a folder would make it even shorter (by removing all the NameMatch code).
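A sketch of that folder-wide variant could look like the following. The function name remove_errors_in_folder is made up, and restricting the list to 1D, non-text waves with the "DIMS:1,TEXT:0" options is my assumption about what you would want. WaveList only looks at the current data folder.

```igor
Function remove_errors_in_folder()
	// All 1D numeric waves in the current data folder
	String names = WaveList("*", ";", "DIMS:1,TEXT:0")
	Variable i, n = ItemsInList(names)
	for (i = 0; i < n; i += 1)
		Wave w = $StringFromList(i, names)
		w = w[p] < 0 ? NaN : w[p]	// negative points become NaN
	endfor
End
```

Be careful with this one: it touches every matching wave in the folder, including x-waves or other data that may legitimately contain negative values.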
January 24, 2012 at 09:38 am - Permalink