Non-parametric Wind Regression - speed up calculations
J-E Petit
I am working on the adaptation of a Python script I made to an Igor procedure for Non-parametric Wind Regression. First, what is NWR? it is a smoothing algorithm to better visualize the coupling of wind data (direction and speed) and pollutant concentrations. I enclosed the formulas associated with the methodology. It is basically, for each (theta,mu) couple a weighted average of concentrations where weighing coefficients are determined with two Kernel functions.
Thus, we can imagine having:
- an angle wave (theta), from 0° to 360° with a resolution of 0.5°, which makes 720 rows
- a speed wave (mu), from 0 to 20km/h with a resolution of 0.1, which makes 200 rows
The final concentration matrix will be (720,200)
So far, the procedure I wrote works pretty well, but it does the calculation cell by cell. Which takes quite a lot of time....
The ideal case would be to calculate the final matrix by only one mathemical operation (matrix multiplication), but it applies dealing with 3D matrices that are too big to handle for my computer (the concentration, ws, wd waves contains 3501 rows, leading to a 720*200*3501-element matrix). An other solution is to fragment the calculation to reduce the size of the 3D matrices.
But still, I am not very familiar with Matrix operations on Igor, and would need any kind of help!
I know my question is a pretty big thing to answer... apologies for that.
I enclose an Igor experiment in which this is:
- the NWR procedure I wrote
- some data (ws, wd, conc waves)
The main function is NWR(sigma,h). The others are here for data preparation.
Many many thanks in advance!
J-E
Please post your procedure file (missing from your experiment) if you want help optimizing the code.
A.G.
WaveMetrics, Inc.
March 11, 2015 at 10:22 am - Permalink
here it is
March 12, 2015 at 01:11 am - Permalink
1. You need to avoid using angles in degrees. You are wasting computation time on conversion back and forth. Keep everything in radians and modify labels at the very end for the benefit of people who do not understand pi/2.
2. I do not know which data waves are important to you so I did not modify much of the code. It would be better if you wrote your code with waves as function arguments so your inputs are clear.
3. Your goal should be to convert the double loop at the end of the procedure to straight matrix operations. You should be able to do that if you create intermediate 2D waves for diff_speed and the like.
4. Replace wave assignments with MatrixOP wherever possible.
Here is a first iteration of going through your code.
I hope this helps,
A.G.
WaveMetrics, Inc.
March 12, 2015 at 11:48 am - Permalink
As I'm starting programming on Igor, the procedure may be a little messy; but I promise, I'll keep improving!
1/ got it. I am on of these people that are not very confident with radians... :)
2/ the most important waves are the input data (ws, wd and conc), and the "angle" & "speed" waves. The other waves are intermediate for the calculation. the "mx" 2D wave is the final result.
3/ Yes, exactly, and this is the heart the problem. The first thing I could try is to remove the second loop, by doing the calculation angle-by-angle with all speed values at each time. It will lead to the handling of 2D waves which, I guess, is manageable. I'll definitly try that first. The second solution is to remove the 2 loops, but it leads to 3D-matrices that are too big... or am I missing something?
4/ I am not familiar with matrixOP and/or wave assignment... what do you mean by assign waves with MatrixOP?
Thanks again!
March 12, 2015 at 03:24 pm - Permalink
I did not see any reason for a 3D matrix but then I may have missed something. Can you tell me what is the ultimate goal here? Are you performing some kind of interpolation? Also, your kernel is zero outside a limited domain -- you may be able to save a lot of multiplications by applying it locally.
MatrixOP is a built-in operation that speeds up wave assignments. For example
March 12, 2015 at 03:48 pm - Permalink
yep, it is a sort of interpolation. The ultimate goal is to have, for any wind direction & speed, a concentration value. All this from discreet variables at the input. Does it make sense to you? I enclose an image that may be clearer than some words.
Would the Interpolate XOP be useful here?
March 12, 2015 at 04:35 pm - Permalink
If the image on the left displays "scatter" of points were data are sampled, I'd approach the problem differently.
March 12, 2015 at 05:01 pm - Permalink
March 13, 2015 at 12:06 am - Permalink
I assume you have 3 waves of data: radius, angle and some scalar data each containing N points.
1. Create a triplet wave:
2. Fill the third column with scalar data:
3. compute the x and y cols from radius and angle waves:
4. determine the maximum radius value:
5. create a regular grid of interpolated data:
6. display the result:
NewImage M_InterpolatedImage
Note that M_InterpolatedImage will contain NaNs outside the convex domain of the data.
HTH,
A.G.
WaveMetrics, Inc.
March 13, 2015 at 09:47 am - Permalink
ImageInterpolate/S={-rmax,dr,rmax,-rmax,dr,rmax} voronoi tripletWave
see attached image
maybe I did something wrong... but it seems like the "dr" variable is not defined. Is it related?
March 13, 2015 at 10:42 am - Permalink
March 13, 2015 at 10:44 am - Permalink
This sets the resolution of the output and depends on your data.
March 13, 2015 at 12:41 pm - Permalink
and then:
ImageInterpolate/S={-rmax,dr1,rmax,-rmax,dr2,rmax} voronoi tripletWave
but I get the same error, which is not very logical to me, because the ImageInterpolate operation doesn't need a /N flag....
March 15, 2015 at 12:26 am - Permalink
March 15, 2015 at 06:39 pm - Permalink
Tunrs out that calculations are particularly long. Maybe I used dr values that not adapted.
Anyway, I got back to the initial approach, and tried to get rid of one loop (out of the two). It seems to work pretty well, and it is way faster than calculations through each cell (now it is row-by-row). Maybe some lines of code can be optimized?
Next step is to plot the mx wave on a polar plot :) (already in progress on the forum)
March 17, 2015 at 09:49 am - Permalink
There are two components for that calculation: triangulation and interpolation. The value of (dr) will only affect the interpolation stage and its time-dependence is linear for each dimension. The triangulation is O(N^2) where N is the number of rows in your triplet wave.
Looking at the last function in your code, here are a few more suggestions you might want to try:
March 17, 2015 at 10:34 am - Permalink