Parallel For-Loops for Matrix Operations

Dear IgorExchange,

I have some data that I need to load into Igor and am looking to speed up the process. Each image consists of 399 text files that need to be summed and then concatenated; the MWE below takes ca. 120 s on my laptop (i7, 16 GB RAM, SSD).

I've cut this MWE out of a larger script containing multiple functions, which is why there are two for-loops. I imagine that for a significant speed improvement, the first for-loop needs to be split into multiple processes running in parallel on different cores. This is an aspect of Igor programming I've not yet attempted. Any tips?

I've attached a zip file with the text files. Once unzipped, the MWE below should be pointed to the folder.

As a smaller point, can Igor be directed directly to the zip file rather than manually unzipping it?

Many thanks in advance!

Shannon

Function MWE_RIXS()
    //  MWE of RIXS map import
    variable i
    String ImpList = ""
   
    NewPath/O/Q Sympath
    PathInfo Sympath
    String pathname = S_path
   
    Variable TimerA = StartMSTimer
   
    String fileName
    String fileList = IndexedFile(Sympath, -1, ".txt")
   
    for(i=0; i<ItemsInList(fileList); i+=1)
        fileName = StringfromList(i,fileList)

        LoadWave/A/G/D/O/Q/M/P=Sympath fileName
        Variable ext0 = strsearch(FileName,"_",Inf,1)
        fileName = S_fileName[0,ext0-1]
       
        String XESimagName = "XESimag_" + fileName
        String XESspecName = "XESspec_" + fileName
       
        Wave ImpMatrix = $StringFromList(0, S_waveNames)
        Rename ImpMatrix, $XESimagName
        Duplicate ImpMatrix, $XESspecName
        Wave ImpCurve = $XESspecName

        MatrixTranspose ImpCurve
        MatrixOP/O ImpCurve=sumCols(ImpCurve)
        MatrixTranspose ImpCurve
       
        KillWaves/Z ImpMatrix
       
       
        Variable Energy = round(str2num(FileName) / 100) / 10
        String LastEnergy = StringFromList(ItemsInList(fileList)-1,fileList)
        Variable LastEnergy1 = round(str2num(LastEnergy) / 100) / 10
        Print "Importing RIXS Map: Current Energy is " + num2str(Energy) + " / " + num2str(LastEnergy1) + " eV"

        ImpList += NameOfWave(ImpCurve) + ";"
    endfor
   
    Make/O/D/N=(ItemsInList(ImpList)+1) $("E_eV")
    Wave eWave = $("E_eV")
    Variable Enum
   
    for(i=0; i<ItemsInList(ImpList); i+=1)
        String XEScurve = StringFromList(i,ImpList)
        Concatenate/NP {$XEScurve}, RIXS
       
        sscanf XEScurve, "XESspec_%i",Enum
        Enum /=100
        eWave[i] = round(Enum)
    endfor
    eWave /= 10
   
    Variable TimerB = StopMSTimer(TimerA)
    Print "Processing Time = " + num2str(round(TimerB/1e6)) + " seconds"   // StopMSTimer returns microseconds
   
End

 

Text Files to Load (4.58 MB)

Here is a modified version of your code that does the load in multiple threads and is somewhat faster. I've also made some other changes that theoretically improve performance but, in this situation, don't really matter since they weren't bottlenecks to begin with. Read the comments for information on why I changed things. Note that my version below does not keep the individual waves loaded from each file like your original code does.

Function MWE_RIXS()
    //  MWE of RIXS map import
    variable i
    String ImpList = ""
   
    NewPath/O/Q Sympath
    PathInfo Sympath
    String pathname = S_path
   
    Variable TimerA = StartMSTimer
   
    String fileName
    String fileList = IndexedFile(Sympath, -1, ".txt")
    // NOTE: The order of files in fileList is undefined. If it matters,
    // then you must sort them in order to ensure that the files
    // are ordered how your code expects them to be ordered.
    fileList = SortList(fileList, ";", 16)
   
    WAVE/T fileListWave = ListToTextWave(fileList, ";")
    Variable numFiles = numpnts(fileListWave)
    Make/O/FREE/WAVE/N=(numFiles) ImpWaveWave
   
    MultiThread ImpWaveWave = LoadDataFromFile(pathname, fileListWave[p])
   
    Make/O/D/N=(numFiles+1) $("E_eV")
    Wave eWave = $("E_eV")
    Variable Enum
   
    Concatenate {ImpWaveWave}, RIXS
    MultiThread eWave[0,numFiles-1] = ExtractEnumFromName(nameofwave(ImpWaveWave[p])) / 10
     
    Variable TimerB = StopMSTimer(TimerA)
    Print "Processing Time = " + num2str(round(TimerB/1e6)) + " seconds"   // StopMSTimer returns microseconds
   
End

ThreadSafe Function/WAVE LoadDataFromFile(String pathOnDisk, String fileName)
    LoadWave/A/G/D/O/Q/M pathOnDisk + fileName
    Variable ext0 = strsearch(FileName,"_",Inf,1)
    fileName = S_fileName[0,ext0-1]
   
    String XESimagName = "XESimag_" + fileName
    String XESspecName = "XESspec_" + fileName
   
    Wave ImpMatrix = $StringFromList(0, S_waveNames)
    // There was no need to duplicate the wave here, just change the
    // 2nd argument to Rename. With this change, I also removed
    // the KillWave command that came later.
    Rename ImpMatrix, $XESspecName
    Wave ImpCurve = $XESspecName

    // This can be done in one line, as below. But doing it as one
    // line leaves the output as a 1D wave, while the original
    // code left it as a 2D wave with 1 column.
    // Therefore I removed the /NP flag from the call to
    // concatenate in the calling function.
    //    MatrixTranspose ImpCurve
    //    MatrixOP/O ImpCurve=sumCols(ImpCurve)
    //    MatrixTranspose ImpCurve

    MatrixOP/O ImpCurve=sumCols(ImpCurve^t)^t
   
    return ImpCurve
End

ThreadSafe Function ExtractEnumFromName(String name)
    Variable Enum
    sscanf name, "XESspec_%i",Enum
    Enum /=100
    Enum = round(Enum)
    return Enum
End

On my Windows machine running Igor Pro 8.04 Beta 1 with a 16 core/32 thread processor, the time to load all of the data went from about 120 seconds with your original code to about 28 seconds with my code above. [Edited with correct information, see next comment]

I sampled Igor while running your code and looked at which function calls took the most time. I can do this because I have access to Igor's debugging symbols; a regular user could not. Much of the time is spent by the different threads waiting to acquire mutexes that protect the file handling code.

On Windows, the type of mutex we use is slow to acquire. Changing to a more lightweight mutex is something we're looking into for Igor 9. On Macintosh, we're already using relatively fast mutexes. You didn't say which platform you're using.

I believe that you could improve performance somewhat by specifying the numLines parameter of LoadWave's /L flag. It looks like that value is constant at least for the data set you attached, but I don't know if it's constant for all of your data sets. If it is, then I would hard code that value. Otherwise you could load a single file to determine the number of lines and then do the rest of the loading as I do above, in multiple threads.
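As a minimal sketch of that suggestion, assuming the line count really is the same for every file: the worker below takes the count as a parameter and passes it through LoadWave's /L flag, whose fields are {nameLine, firstLine, numLines, firstColumn, numColumns}. The function name LoadDataFixedLines and the numLines parameter are hypothetical and not part of the code above.

```igor
// Hypothetical variant of the worker: the caller supplies the known line count.
ThreadSafe Function/WAVE LoadDataFixedLines(String pathOnDisk, String fileName, Variable numLines)
    // /L={nameLine, firstLine, numLines, firstColumn, numColumns}
    // Only numLines is set here; 0 leaves the other fields on automatic.
    LoadWave/A/G/D/O/Q/M/L={0, 0, numLines, 0, 0} pathOnDisk + fileName
    if (V_flag == 0)
        return $""      // nothing was loaded; return a null wave reference
    endif
    Wave w = $StringFromList(0, S_waveNames)
    return w
End
```

It would be called from the wave assignment in the same way as the original worker, for example MultiThread ImpWaveWave = LoadDataFixedLines(pathname, fileListWave[p], numLines), with numLines either hard coded or taken from a first, single-threaded load.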

As the "Loading Very Large Files" section of the LoadWave documentation details, on Windows, if a file is over 500K, Igor will automatically determine the number of lines in the file. However, my profiling data shows that doing this takes about 20% of the total time to load one of the files, so pre-specifying this value will help.

If you use my code above, you might play around with adding the /NT flag to the following wave assignment statement:

MultiThread ImpWaveWave = LoadDataFromFile(pathname, fileListWave[p])

Due to the issues with mutexes, it may be that the optimal number of threads to use is less than the available number of threads but still greater than 1.
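One way to look for that optimum is to time the same assignment with several /NT values. The harness below is only a sketch: the function name CompareThreadCounts is hypothetical, and it assumes that LoadDataFromFile from the code above is compiled and that pathname and fileListWave are built as in MWE_RIXS.

```igor
// Hypothetical timing harness: repeats the load with 1, 2, 4, ... threads.
Function CompareThreadCounts(String pathname, WAVE/T fileListWave)
    Variable numFiles = numpnts(fileListWave)
    Variable maxThreads = ThreadProcessorCount
    Variable nt
    for (nt = 1; nt <= maxThreads; nt *= 2)
        // Free wave of wave references that receives the loaded data.
        Make/O/FREE/WAVE/N=(numFiles) result
        Variable timer = StartMSTimer
        MultiThread/NT=(nt) result = LoadDataFromFile(pathname, fileListWave[p])
        Variable seconds = StopMSTimer(timer) / 1e6   // microseconds to seconds
        Printf "%d thread(s): %.1f s\r", nt, seconds
    endfor
End
```

Because the operating system caches recently read files, the first pass may be slower than later ones regardless of thread count, so it is worth running the comparison more than once.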

Regarding your smaller point ("can Igor be directed directly to the zip file rather than manually unzipping it?"):

There's nothing built into Igor that can unzip a file. You could either call ExecuteScriptText to use a shell program to unzip, or you might try the ZIP XOP (https://github.com/IP-XOP/ZIP).
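A rough sketch of the ExecuteScriptText route, for Windows only. It assumes that PowerShell and its Expand-Archive cmdlet are available and that neither path contains a single quote; the function name UnzipWithPowerShell is hypothetical. Treat it as a starting point rather than tested code, and confirm that extraction has finished before loading the files.

```igor
// Hypothetical helper, Windows only: unzips via PowerShell's Expand-Archive.
Function UnzipWithPowerShell(String zipPath, String destFolder)
    // Convert Igor-style paths (C:folder:file.zip) to native Windows paths.
    String zipWin = ParseFilePath(5, zipPath, "\\", 0, 0)
    String destWin = ParseFilePath(5, destFolder, "\\", 0, 0)

    // Build the command line; the paths are single-quoted for PowerShell.
    String cmd
    sprintf cmd, "powershell.exe -NoProfile -Command \"Expand-Archive -Path '%s' -DestinationPath '%s' -Force\"", zipWin, destWin
    ExecuteScriptText cmd
End
```

On Macintosh, ExecuteScriptText takes AppleScript instead of a command line, so a different command string would be needed there.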

Correction: With my code in the comment above, it takes 28 seconds, not 85 seconds, to load all of the data. The 85 second test was with my sampling profiler enabled, which causes a surprising slowdown in this case.

Thanks a lot for that code!

I was unfortunately tied up with other tasks over the last few weeks but have now tested it on my laptop. It's a Win10 machine running Igor 8.04 (34722) with a 2 core/4 thread i7 processor. Retesting the original code took 150 s. Defining the number of lines with the LoadWave /L flag cuts this to 112 s. Your multithreaded code runs in 72 s, dropping to 53 s with the number of lines defined. This is a huge improvement.

The dimensions of the text file matrices are consistent since they represent pixels of a detector, thus defining the size in advance is wise.

Thanks for the other tips about moving the matrix operations into one line as well!

There are two other tweaks you can make to my most recent code:

1. Add the /ENCG={1,4} flag to the LoadWave command. If you do this, you must be using Igor Pro 8.04 or later; otherwise, Igor will crash. This tells Igor that your text file contains UTF-8 encoded data. Your data is ASCII, a subset of UTF-8, so telling Igor this lets it save a bit of time since it doesn't need to check.

2. Change your MatrixOp command to: 

MatrixOP/O ImpCurve=sumRows(ImpCurve)

 

Here is the modified version of the LoadDataFromFile worker function:

// Uses LoadWave to do the loading
ThreadSafe Function/WAVE LoadDataFromFile(String pathOnDisk, String fileName)
    LoadWave/A/G/D/O/Q/M/ENCG={1,4} pathOnDisk + fileName
    Variable ext0 = strsearch(FileName,"_",Inf,1)
    fileName = S_fileName[0,ext0-1]
   
    String XESimagName = "XESimag_" + fileName
    String XESspecName = "XESspec_" + fileName
   
    Wave ImpMatrix = $StringFromList(0, S_waveNames)
    // There was no need to duplicate the wave here, just change the
    // 2nd argument to Rename. With this change, I also removed
    // the KillWave command that came later.
    Rename ImpMatrix, $XESspecName
    Wave ImpCurve = $XESspecName
    MatrixOP/O ImpCurve=sumRows(ImpCurve)
   
    return ImpCurve
End
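The switch to sumRows in tweak 2 relies on sumRows(m) giving the same per-row sums as sumCols(m^t)^t. A quick way to convince yourself is a check on a small test matrix, sketched below; the function name CheckSumRowsEquivalence is hypothetical and the function is not part of the loader.

```igor
// Hypothetical self-check: compares the two MatrixOP formulations.
Function CheckSumRowsEquivalence()
    Make/FREE/D/N=(5, 3) m = p + 10*q            // small 5 x 3 test matrix
    MatrixOP/FREE viaTranspose = sumCols(m^t)^t  // formulation from the earlier code
    MatrixOP/FREE viaSumRows = sumRows(m)        // formulation from tweak 2
    // EqualWaves with selector 1 compares the wave data only;
    // it returns 1 when the two results match.
    Print EqualWaves(viaTranspose, viaSumRows, 1)
End
```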

I have made several changes to Igor 9 that dramatically improve the load time. On my machine, I'm down to about 4-5 seconds to load all of the files. The biggest change affects performance only when loading with multiple threads at once, and the more threads you have, the bigger the improvement. That matters a lot on my 32 thread machine and less on your 4 thread machine, but it should still help.

The other big change I made is that Igor now uses a much faster algorithm for inspecting the individual bytes in the text file (e.g. is this a number, letter, tab, etc.). That change improves performance by about 20%.

We'll hopefully start beta testing Igor 9 early next year, so keep your eyes out on the forums for beta testing information if you are interested.

Thanks for posting your original question and providing an interesting problem. You might not have intended this, but it demonstrated several bottlenecks in Igor that we've fixed to make text file loading much faster, and some of the fixes will improve performance in other ways as well.
