Multi thread calculation with Hyper-Threading Technology sometimes returns erroneous answer
kanekaka
Have anyone experienced similar problem?
I wish 'ThreadProcessorCount' to return 'core number' instead of 'thread number'.
ThreadSafe Static Function TSPeakDetect(wSou, x1, x2, xWidth, wPkDt)
Wave wSou, wPkDt
Variable x1, x2, xWidth
Variable vDeltaX = deltax(wSou), i
for(i = 0; x1 < x2; i += 1)
if (x1 + xWidth > x2)
xWidth = x2 - x1 + vDeltaX
endif
WaveStats /M=1 /Q/Z/R = (x1, x1 + xWidth - vDeltaX) wSou
wPkDt[i * 2] = V_min
wPkDt[i * 2 + 1] = V_max
x1 += xWidth
endfor
return i
end
Function/S MTPeakDetect(wSou, xWidth)
Wave wSou
Variable xWidth
Variable vDeltaX = deltax(wSou), vNumpnt = numpnts(wSou)
Variable vSize = 2 * ceil(vDeltaX * vNumpnt / xWidth), x1 = leftx(wSou), x2 = pnt2x(wSou, vNumpnt - 1)
String sDest = UniqueName(NameOfWave(wSou) + "PkDt", 1, 0)
Duplicate wSou, $sDest
Redimension /N=(vSize) $sDest
SetScale/P x x1, xWidth / 2, WaveUnits(wSou, 0), $sDest
Wave wPkDt = $sDest
Variable vPC = ThreadProcessorCount, i
if (vPC >= 2)
if ((vPC >= 4) && !mod(vPC, 2))
vPC /= 2 // By Core i5(2 core) with HTT(4 thread), 4 thread calcuration answers wrong result.
endif
Variable vMT = ceil(vSize / 2 / vPC), vTGS = 0, vTGW = 100
String sDests = UniqueName("Tmp", 1, 0), sX1 = UniqueName("sX1_", 1, 0), sX2 = UniqueName("sX2_", 1, 0)
Make /D/N=(vPC) $sX1, $sX2
Wave wX1 = $sX1
Wave wX2 = $sX2
wX1[0] = x1
for(i = 1; i < vPC; i += 1)
Make /N=(vMT * 2) $StringFromList(i - 1, sDests)
sDest = UniqueName("Tmp", 1, 0)
sDests += ";" + sDest
wX2[i - 1] = wX1[i - 1] + xWidth * vMT
wX1[i] = wX2[i - 1]
endfor
Make /N=(vMT * 2) $StringFromList(i - 1, sDests)
wX2[i - 1] = x2
vMT = ThreadGroupCreate(vPC)
for(i = 0; i < vPC; i += 1)
ThreadStart vMT, i, TSPeakDetect(wSou, wX1[i], wX2[i], xWidth, $StringFromList(i, sDests))
endfor
do
vTGS = ThreadGroupWait(vMT, vTGW)
while (vTGS != 0)
x1 = 0
for(i = 0; i < vPC; i += 1)
x2 = ThreadReturnValue(vMT, i) * 2
Wave wTmp = $StringFromList(i, sDests)
wPkDt[x1, x1 + x2 - 1] = wTmp[p - x1]
x1 = x2
KillWaves /Z wTmp
endfor
KillWaves /Z $sX1, $sX2
vTGS = ThreadGroupRelease(vMT)
else
for(i = 0; x1 < x2; i += 1)
if (x1 + xWidth > x2)
xWidth = x2 - x1 + vDeltaX
endif
WaveStats /M=1 /Q/Z/R = (x1, x1 + xWidth - vDeltaX) wSou
wPkDt[i * 2] = V_min
wPkDt[i * 2 + 1] = V_max
x1 += xWidth
endfor
endif
return NameOfWave(wPkDt)
end
Wave wSou, wPkDt
Variable x1, x2, xWidth
Variable vDeltaX = deltax(wSou), i
for(i = 0; x1 < x2; i += 1)
if (x1 + xWidth > x2)
xWidth = x2 - x1 + vDeltaX
endif
WaveStats /M=1 /Q/Z/R = (x1, x1 + xWidth - vDeltaX) wSou
wPkDt[i * 2] = V_min
wPkDt[i * 2 + 1] = V_max
x1 += xWidth
endfor
return i
end
Function/S MTPeakDetect(wSou, xWidth)
Wave wSou
Variable xWidth
Variable vDeltaX = deltax(wSou), vNumpnt = numpnts(wSou)
Variable vSize = 2 * ceil(vDeltaX * vNumpnt / xWidth), x1 = leftx(wSou), x2 = pnt2x(wSou, vNumpnt - 1)
String sDest = UniqueName(NameOfWave(wSou) + "PkDt", 1, 0)
Duplicate wSou, $sDest
Redimension /N=(vSize) $sDest
SetScale/P x x1, xWidth / 2, WaveUnits(wSou, 0), $sDest
Wave wPkDt = $sDest
Variable vPC = ThreadProcessorCount, i
if (vPC >= 2)
if ((vPC >= 4) && !mod(vPC, 2))
vPC /= 2 // By Core i5(2 core) with HTT(4 thread), 4 thread calcuration answers wrong result.
endif
Variable vMT = ceil(vSize / 2 / vPC), vTGS = 0, vTGW = 100
String sDests = UniqueName("Tmp", 1, 0), sX1 = UniqueName("sX1_", 1, 0), sX2 = UniqueName("sX2_", 1, 0)
Make /D/N=(vPC) $sX1, $sX2
Wave wX1 = $sX1
Wave wX2 = $sX2
wX1[0] = x1
for(i = 1; i < vPC; i += 1)
Make /N=(vMT * 2) $StringFromList(i - 1, sDests)
sDest = UniqueName("Tmp", 1, 0)
sDests += ";" + sDest
wX2[i - 1] = wX1[i - 1] + xWidth * vMT
wX1[i] = wX2[i - 1]
endfor
Make /N=(vMT * 2) $StringFromList(i - 1, sDests)
wX2[i - 1] = x2
vMT = ThreadGroupCreate(vPC)
for(i = 0; i < vPC; i += 1)
ThreadStart vMT, i, TSPeakDetect(wSou, wX1[i], wX2[i], xWidth, $StringFromList(i, sDests))
endfor
do
vTGS = ThreadGroupWait(vMT, vTGW)
while (vTGS != 0)
x1 = 0
for(i = 0; i < vPC; i += 1)
x2 = ThreadReturnValue(vMT, i) * 2
Wave wTmp = $StringFromList(i, sDests)
wPkDt[x1, x1 + x2 - 1] = wTmp[p - x1]
x1 = x2
KillWaves /Z wTmp
endfor
KillWaves /Z $sX1, $sX2
vTGS = ThreadGroupRelease(vMT)
else
for(i = 0; x1 < x2; i += 1)
if (x1 + xWidth > x2)
xWidth = x2 - x1 + vDeltaX
endif
WaveStats /M=1 /Q/Z/R = (x1, x1 + xWidth - vDeltaX) wSou
wPkDt[i * 2] = V_min
wPkDt[i * 2 + 1] = V_max
x1 += xWidth
endfor
endif
return NameOfWave(wPkDt)
end
ThreadProcessorCount always returns the number of processors including hyperthreads. Even if that was an option, I don't think there is any guarantee that threads created with ThreadGroupCreate and executed with ThreadStart will actually run on separate cores/processors.
I opened your experiment in Igor 8 and executed MTPeakDetect(tsTest1, 2e-5) and Igor gave an index out of range error. If you are ignoring this, that might explain your problem. It's hard to track down the cause of these types of errors in a multiple thread situation, but I think the problem is here:
wPkDt[x1, x1 + x2 - 1] = wTmp[p - x1]
If I add a print statement just before that line, like this:
wPkDt[x1, x1 + x2 - 1] = wTmp[p - x1]
then I get this output:
x1= 0 x2= 128 numpnts(wTmp)= 128
x1= 128 x2= 128 numpnts(wTmp)= 128
x1= 128 x2= 128 numpnts(wTmp)= 128
x1= 128 x2= 128 numpnts(wTmp)= 128
x1= 128 x2= 128 numpnts(wTmp)= 128
x1= 128 x2= 128 numpnts(wTmp)= 128
x1= 128 x2= 130 numpnts(wTmp)= 128
x1= 130 x2= 128 numpnts(wTmp)= 128
Note that the next to the last line has a value for x2 that will cause the wave assignment statement to give the index out of range error.
I suspect that your problem ultimately comes down to floating point rounding error, since you're passing around x values and telling WaveStats to use an X range instead of a point range (/R=() vs /R=[]).
I don't know whether the issues I pointed out above are responsible for your calculation getting the wrong answer, but I have a feeling they are.
October 22, 2018 at 10:52 am - Permalink
In reply to ThreadProcessorCount always… by aclight
Thank you so much aclight.
Your suggestion helps me.
One mistake caused the trouble.
Modified procedure is bellow.
Wave wSou
Variable xWidth
Variable vDeltaX = deltax(wSou), vNumpnt = numpnts(wSou)
Variable vSize = 2 * ceil(vDeltaX * vNumpnt / xWidth), x1 = leftx(wSou), x2 = pnt2x(wSou, vNumpnt - 1)
String sDest = UniqueName(NameOfWave(wSou) + "PkDt", 1, 0)
Duplicate wSou, $sDest
Redimension /N=(vSize) $sDest
SetScale/P x x1, xWidth / 2, WaveUnits(wSou, 0), $sDest
Wave wPkDt = $sDest
Variable vPC = ThreadProcessorCount, i
if (vPC >= 2)
Variable vMT = ceil(vSize / 2 / vPC), vTGS = 0, vTGW = 100
String sDests = UniqueName("Tmp", 1, 0), sX1 = UniqueName("sX1_", 1, 0), sX2 = UniqueName("sX2_", 1, 0)
Make /D/N=(vPC) $sX1, $sX2
Wave wX1 = $sX1
Wave wX2 = $sX2
wX1[0] = x1
for(i = 1; i < vPC; i += 1)
Make /N=(vMT * 2) $StringFromList(i - 1, sDests)
sDest = UniqueName("Tmp", 1, 0)
sDests += ";" + sDest
wX2[i - 1] = wX1[i - 1] + xWidth * vMT
wX1[i] = wX2[i - 1]
endfor
Make /N=(vMT * 2) $StringFromList(i - 1, sDests)
wX2[i - 1] = x2
vMT = ThreadGroupCreate(vPC)
for(i = 0; i < vPC; i += 1)
ThreadStart vMT, i, TSPeakDetect(wSou, wX1[i], wX2[i], xWidth, $StringFromList(i, sDests))
endfor
do
vTGS = ThreadGroupWait(vMT, vTGW)
while (vTGS != 0)
x1 = 0
for(i = 0; i < vPC; i += 1)
x2 = ThreadReturnValue(vMT, i) * 2
Wave wTmp = $StringFromList(i, sDests)
wPkDt[x1, x1 + x2 - 1] = wTmp[p - x1]
x1 += x2 // Modified @2018-10-23
KillWaves /Z wTmp
endfor
KillWaves /Z $sX1, $sX2
vTGS = ThreadGroupRelease(vMT)
else
for(i = 0; x1 < x2; i += 1)
if (x1 + xWidth > x2)
xWidth = x2 - x1 + vDeltaX
endif
WaveStats /M=1 /Q/Z/R = (x1, x1 + xWidth - vDeltaX) wSou
wPkDt[i * 2] = V_min
wPkDt[i * 2 + 1] = V_max
x1 += xWidth
endfor
endif
return NameOfWave(wPkDt)
end
October 22, 2018 at 08:11 pm - Permalink