Multi-Threading Question for Old Code with Lots of Macros, Lots of Waves
astrostu
Here's the basic idea:
0. I have a giant list of points along tens of thousands of crater rims. The list holds the lat/lon of the vertex points, and the craters are separated by a 0 in the longitude column and NaN in the latitude column (that's how other software outputs them).
1. I have an EasyRead macro which uses that information to separate the rims, storing the "current" crater rim to new lat/lon waves (rim_lon and rim_lat). It stores the rims, in order. For each new rim it stores, it calls
2. CraterQuantitative macro. This macro projects the rim points from decimal degrees into kilometers from the crater center. It then determines if there are enough points to fit a circle.
3. If there are, it calls the CircleFitNLLS macro (which is actually faster than the built-in implicit fit function Igor uses). This CircleFitNLLS fits the circle and populates rows in 6 different global waves with information.
4. Then, CraterQuantitative determines if there are enough points to fit an ellipse.
5. If there are, it calls the EllipseFitMatrixMethod macro, which fits an ellipse and populates rows in 7 different global waves with information.
6. At this point the final step for this crater is finished, so EasyRead takes over again, extracts the next crater rim (storing it to rim_lon and rim_lat), and repeats 2-5.
This seems like it would be good for threading because the EasyRead task is the only one that need be done in serial, and all the processing for that crater can be done in parallel while another crater is being worked on.
I've spent the last few hours trying to tackle this and have run up against problems. I've converted everything to Functions. I access the global waves by doing Wave [name] = root:[name] for the functions that need to access and populate the stuff that stores data (CraterQuantitative, CircleFitNLLS, and EllipseFitMatrixMethod) and to read the original rim waves (EasyRead). And I pass the waves that just have "this" crater's lat/lon between EasyRead and CraterQuantitative and the other stuff (rim_lon and rim_lat).
I should note: this works without threading. As in, converting everything to functions works (really, all that was left was EasyRead and CraterQuantitative).
To attempt to avoid issues with waves being overwritten that are in use, I create two strings (s_rim_lon and s_rim_lat) and affix the thread number to the rim_lat/lon name. I also look for the first thread that's free:
Variable nthreads = ThreadProcessorCount
Variable mt = ThreadGroupCreate(nthreads)
Variable tgs, ti
if(ThreadGroupWait(mt,50) != 0)
	ti = ThreadGroupWait(mt,-2)-1
	if(ti < 0)
		continue
	endif
endif
print ti
s_rim_lon = "rim_lon"+num2str(ti)
s_rim_lat = "rim_lat"+num2str(ti)
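The free-thread search above is a fragment: the continue only makes sense inside the loop that walks over the craters. A minimal sketch of the complete dispatch pattern follows, assuming Igor Pro 6.23 or later (needed for the -2 form of ThreadGroupWait). DemoWorker, DispatchDemo, and demoResults are hypothetical names used for illustration only and are not from the original code; DemoWorker merely stands in for CraterQuantitative.

```igor
// Hypothetical worker: fills one row of dest. Stands in for CraterQuantitative.
ThreadSafe Function DemoWorker(dest, row)
	Wave dest
	Variable row

	dest[row] = 2 * row
	return 0
End

Function DispatchDemo()
	Variable nItems = 100
	Make/O/D/N=(nItems) demoResults = NaN	// output wave, sized before any thread starts

	Variable nthreads = ThreadProcessorCount
	Variable mt = ThreadGroupCreate(nthreads)
	Variable i, ti, status

	for(i = 0; i < nItems; i += 1)
		// ThreadGroupWait(mt, -2) returns (index + 1) of a free thread, or 0 if all are busy
		ti = ThreadGroupWait(mt, -2) - 1
		if(ti < 0)
			i -= 1		// no free thread yet: undo the loop increment and retry this item
			continue
		endif
		ThreadStart mt, ti, DemoWorker(demoResults, i)
	endfor

	// Wait until every thread in the group has finished
	do
		status = ThreadGroupWait(mt, 100)
	while(status != 0)

	Variable dummy = ThreadGroupRelease(mt)
End
```

The output wave is created at its final size up front, so no thread ever needs to redimension it, and each call writes a single row of its own.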
The problem with threading at the moment seems to be the passing of rim_lon and rim_lat. In EasyRead, I had a Redimension call that redimensions rim_lon and rim_lat to store the current crater's lat/lon rim points. Then it calls the CraterQuantitative to do everything. I was getting errors that I couldn't Redimension a wave that was being used by a preemptive thread. So I did the next stupid thing and used Make/O to overwrite them and then copy over the data from the original vectors to rim_lon and rim_lat.
Make/O/N=(V_LevelX-i_counter_rims_index) $s_rim_lon, $s_rim_lat
Duplicate/O/R=[i_counter_rims_index,V_LevelX-1] rim_lon_raw $s_rim_lon
Duplicate/O/R=[i_counter_rims_index,V_LevelX-1] rim_lat_raw $s_rim_lat
And then I call CraterQuantitative with
ThreadStart mt, ti, CraterQuantitative($s_rim_lon, $s_rim_lat)
The error I'm getting is "While executing a wave assignment, the following error occurred: Attempt to operate on a null (missing) wave." (That was what caused me to do the s_rim_lon/lat to begin with, but that didn't fix the issue.)
But I don't see how that error could come up. I tried to mitigate it by explicitly declaring all possible values at the beginning of EasyRead, running Make rim_lon0, rim_lat0 with 0-7 (8-processor machine). Same error.
For the record, this works:
CraterQuantitative($s_rim_lon, $s_rim_lat)
This generates the error:
ThreadStart mt, ti, CraterQuantitative($s_rim_lon, $s_rim_lat)
I even limited it to only 1 thread so it should always be using rim_lon0 and rim_lat0, just as it does when I don't thread. Same error.
Ideas?
P.S. I'm really trying to be specific WITHOUT attaching sample files because the file has 2613 lines of code and is otherwise large, too. If I really need to, I'll attach a version with all the frills removed.
Wave rim_lon, rim_lat
Wave LATITUDE_CIRCLE_IMAGE = root:LATITUDE_CIRCLE_IMAGE
Wave LATITUDE_CIRCLE_SD_IMAGE = root:LATITUDE_CIRCLE_SD_IMAGE
Wave LONGITUDE_CIRCLE_IMAGE = root:LONGITUDE_CIRCLE_IMAGE
Wave LONGITUDE_CIRCLE_SD_IMAGE = root:LONGITUDE_CIRCLE_SD_IMAGE
Wave DIAM_CIRCLE_IMAGE = root:DIAM_CIRCLE_IMAGE
Wave DIAM_CIRCLE_SD_IMAGE = root:DIAM_CIRCLE_SD_IMAGE
Wave LATITUDE_ELLIPSE_IMAGE = root:LATITUDE_ELLIPSE_IMAGE
Wave LONGITUDE_ELLIPSE_IMAGE = root:LONGITUDE_ELLIPSE_IMAGE
Wave DIAM_ELLIPSE_MAJOR_IMAGE = root:DIAM_ELLIPSE_MAJOR_IMAGE
Wave DIAM_ELLIPSE_MINOR_IMAGE = root:DIAM_ELLIPSE_MINOR_IMAGE
Wave DIAM_ELLIPSE_ANGLE_IMAGE = root:DIAM_ELLIPSE_ANGLE_IMAGE
Wave DIAM_ELLIPSE_ECCEN_IMAGE = root:DIAM_ELLIPSE_ECCEN_IMAGE
Wave DIAM_ELLIPSE_ELLIP_IMAGE = root:DIAM_ELLIPSE_ELLIP_IMAGE
Wave PTS_USED_RIM_IMAGE = root:PTS_USED_RIM_IMAGE
I get NO error when threading until I try to do something to any of these waves in CraterQuantitative (like storing a value to them). So my guess is that the issue is actually that I do not understand how to access and use global waves from ThreadSafe Functions?
November 10, 2014 at 09:19 pm - Permalink
DisplayHelpTopic "Thread Data Environment"
November 10, 2014 at 10:04 pm - Permalink
If I really really really need to, I'll just pass these waves to the function. That's how I've always done it before, and it's led to many-a-400-character-long function calls.
But I'd like to have something more elegant, if possible, where once a thread for CraterQuantitative is spawned, it can still access (read/write) root folder waves.
November 11, 2014 at 11:03 am - Permalink
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
November 11, 2014 at 04:40 pm - Permalink
If you pass a wave to a thread using a worker function parameter it does not disappear from the main thread's data hierarchy. It does disappear if you use ThreadGroupPutDF.
But the original poster is not passing a wave to a thread. He is trying to reference a wave in the data hierarchy of the main thread from a thread worker function:
Wave rim_lon, rim_lat
...
But each thread has its own data hierarchy and cannot see the data hierarchy of any other thread, so a worker function cannot look up a wave that lives in the main thread's hierarchy.
The only way around that is to pass the wave to the worker function as a parameter, in which case you need to heed the restrictions on such waves: you must ensure that at most one thread writes to the wave, and you can't redimension it.
I'm not sure but I think you can pass a structure containing wave references or a wave containing wave references to the worker thread. But the same restrictions apply as if you passed the waves directly to the worker function.
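As an illustration of the parameter-passing approach, here is a minimal sketch of a worker that receives every wave it reads or writes as an argument. RimWorker, meanLon, meanLat, iStart, iEnd, and row are hypothetical names, and the averaging is only a placeholder for the real circle and ellipse fits. It assumes the caller creates the output waves at full size before starting any thread, and that each call is given a different row.

```igor
// Hypothetical worker: everything it touches arrives as a parameter,
// so it never looks up root: waves from inside the thread.
ThreadSafe Function RimWorker(lonRaw, latRaw, iStart, iEnd, meanLon, meanLat, row)
	Wave lonRaw, latRaw			// shared inputs, only read here
	Variable iStart, iEnd		// point range of this crater in the raw waves
	Wave meanLon, meanLat		// shared outputs, pre-sized by the caller
	Variable row				// the single row this call writes

	Variable i, sumLon = 0, sumLat = 0
	for(i = iStart; i <= iEnd; i += 1)
		sumLon += lonRaw[i]
		sumLat += latRaw[i]
	endfor

	Variable n = iEnd - iStart + 1
	meanLon[row] = sumLon / n
	meanLat[row] = sumLat / n
	return 0
End
```

From the main thread this would be started with something like ThreadStart mt, ti, RimWorker(rim_lon_raw, rim_lat_raw, iStart, iEnd, meanLon, meanLat, row). Passing the point range instead of a per-crater copy also sidesteps the Redimension and Make/O problem, since no wave changes size while a thread is using it.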
November 11, 2014 at 05:33 pm - Permalink
Sigh. That's what I feared. I re-wrote the code to do it that way. Now runs about 22x faster than before, so I can analyze 140k craters in 17 seconds instead of 8.5 minutes. Not a bad increase.
It hands each crater to the first free thread, which is usually the 1st or 2nd because most craters only have ~10 points to fit. On much larger craters, all 8 threads can be busy at once, so it sits there spinning its wheels until one frees up.
November 14, 2014 at 04:39 pm - Permalink