read in data (txt file) with special time format

Here is a solution to the issue.

I recommend saving this as an Igor Procedure file so that it will be automatically loaded when you launch Igor. Execute this for details:
DisplayHelpTopic "Special Folders"

If this does not work with your file then it is probably due to formatting subtleties. In that case, attach a zip archive of your file so I can see exactly what the file contains.

// This loads a file containing data like this:
//  dd.mm.yyyy<space>hh.mm.ss.ff<space> ...
// where hh.mm.ss.ff is a time of day with fractional seconds. The procedure changes the
// dots in the time to colons so that it works with LoadWave/J, writes a temporary file containing
// the massaged text, and loads the data from the temporary file.
// There is a DeleteFile call at the end of the procedure to delete the temporary file. If you want to
// inspect the temporary file, comment out the DeleteFile call.
 
// In addition to the issue of the formatting of the time, the file has another peculiarity as posted
// at http://www.igorexchange.com/node/7831. It looks like this:
// Datum Zeit Tmp_ak WGms WR DatumZeit
// 17.09.2016 09.50.00.02 15.8 1.4 248 17.09.2016 09.50.00.02
// The column names for the first date/time have a space between them but the column name for the last
// date/time has no space. To sidestep this issue, I am ignoring the column names altogether and
// using LoadWave /B to set the column names.
 
// There is another problem. The use of space to separate the date and time as well as to separate one column
// from the next creates problems with LoadWave. Therefore the procedure replaces the space used to separate
// one column from the next with a tab.
 
Menu "Load Waves"
    "Load Kerstin File...", LoadKerstinFile("", "")
End
 
static Function/S FixText(textIn)
    String textIn
 
    String textOut = ""
 
    Variable numBytes = strlen(textIn)
    Variable bytesLeft = numBytes
    Variable offset = 0
    for(offset=0; offset<numBytes; )
        String ch = textIn[offset]
        Variable isSpace = CmpStr(ch," ")==0
        if (isSpace && bytesLeft>=12)               // Space and sufficient remaining bytes to be hh.mm.ss.ff?
            String section = textIn[offset+1,offset+11]
            String regExp = "[[:digit:]]{2}\.[[:digit:]]{2}\.[[:digit:]]{2}\.[[:digit:]]{2}"    // hh.mm.ss.ff
            if (GrepString(section,regExp))
                section = ReplaceString(".", section, ":")
                textOut += ch + section
                Variable sectionLength = strlen(section)
                offset += 1 + sectionLength                 // 1 for ch (space)
                bytesLeft -= 1 + sectionLength
                continue
            endif
        endif
        if (isSpace)
            ch = "\t"                               // Replace space separating columns with tab
        endif
        textOut += ch
        offset += 1
        bytesLeft -= 1
    endfor
 
    return textOut
End
 
//  LoadKerstinFile(pathName, fileName)
//  A data file has unwanted line breaks for lines longer than 80 characters
//  This routine creates a temporary version of the file without the bad line breaks
//  and loads data from the temporary file.
Function LoadKerstinFile(pathName, fileName)
    String pathName     // Name of an Igor symbolic path or "".
    String fileName         // Name of file or full path to file.
 
    Variable refNum
 
    // First get a valid reference to a file.
    if ((strlen(pathName)==0) || (strlen(fileName)==0))
        // Display dialog looking for file.
        // Replace /T="????" with, for example, /T=".dat" if your files are .dat files.
        Open /D /R /P=$pathName /T="????" refNum as fileName
        fileName = S_fileName           // S_fileName is set by Open/D
        if (strlen(fileName) == 0)      // User cancelled?
            return -1
        endif
    endif
 
    // Open source file and read the raw text from it into a string variable
    Open/Z=1/R/P=$pathName refNum as fileName
    if (V_flag != 0)
        return -1                       // Error of some kind
    endif
    FStatus refNum                      // Sets V_logEOF
    Variable numBytesInFile = V_logEOF
    String text = PadString("", numBytesInFile, 0x20)
    FBinRead refNum, text               // Read entire file into variable.
    Close refNum
 
    // Fix the text
    text = FixText(text)                    // Remove bad line breaks
 
    // Write the fixed text to a temporary file
    String tempFileName = fileName + ".noindex" // Use of .noindex prevents Spotlight from indexing the file. Otherwise we get an error when we try to delete the file because Spotlight has it open.
    Open refNum as tempFileName
    FBinWrite refNum, text
    Close refNum
    
    String columnInfoStr = ""
    columnInfoStr += "N=DatumZeitA,F=8;"        // <date><space><time>
    columnInfoStr += "N=TmpAk,F=0;"         // <number>
    columnInfoStr += "N=WGms,F=0;"          // <number>
    columnInfoStr += "N=WR,F=0;"                // <number>
    columnInfoStr += "N=DatumZeitB,F=8;"        // <date><space><time>
    
    // Load the temporary file
    // The /L flag causes LoadWave to load the data starting from line 1 (zero-based), skipping the name line which is problematic.
    // The /B flag specifies the column names and formats.
    // The /R flag specifies the date format as dd.mm.yy.
    LoadWave/J/D/P=$pathName/E=1/L={0,1,0,0,0}/B=columnInfoStr/K=0/R={English,2,2,2,2,"DayOfMonth.Month.Year",40} tempFileName
    if (V_flag == 0)
        Printf "An error occurred while loading data from \"%s\"\r", S_fileName
    else
        Printf "Loaded data from \"%s\"\r", S_fileName
    endif
    
    // Make table bigger
    MoveWindow 10, 50, 800, 400
    
    // Set date format to show fractional seconds
    ModifyTable showFracSeconds[1]=1,digits[1]=2, width[1]=175
    ModifyTable showFracSeconds[5]=1,digits[5]=2, width[5]=175
 
    // Delete the temporary file    
    DeleteFile /P=$pathName tempFileName
 
    return 0
End

Log in or register to post comments

July 10, 2017 at 10:02 am - Permalink

zek

Thanks for your help!
The procedure works, :-).
Not very fast, probably of the big data file.
I attached one of my data file.

Best regards
Kerstin

Attachments WindTestFile.zip (7.46 MB)

Log in or register to post comments

July 14, 2017 at 02:32 am - Permalink

_sk

zek wrote:
Dear all,
I try to read in a file like this

Datum Zeit Tmp_ak WGms WR DatumZeit
17.09.2016 09.50.00.02 15.8 1.4 248 17.09.2016 09.50.00.02
17.09.2016 09.50.00.12 15.7 1.3 253.6 17.09.2016 09.50.00.12
...

Why do you have duplicate columns in the data file? (Datum, Zeit) in the beginning, (Datum, Zeit) at the end? Each line begins and ends with this tuple. For the file that you posted if you remove the last two columns, the reduction is substantial: 70173015 bytes > 42524260 bytes (~67MB > ~41MB).

Under a linux or cygwin environment you'd do something like:
cat wind_test_file.txt | awk '{print $1,$2,$3,$4,$5}' > wind_test_file_out

Another recommendation is to change the delimiter to a comma. Then, use the Data > Load Waves > Load delimited text... menu, to load the data keeping the time as a text. Parse it after loading.

Under linux or cygwin in vim you'd issue the following to substitute space for comma:
:1,$s/ /,/g

Another recommendation is to combine the date and time into one variable ala julian date, unix time, etc. This will provide further savings on the file size, but will increase processing upon file loading.

edit: As a matter of fact, you don't need the date. It's a waste of bytes. Tell the person generating these files to name the filenames with the corresponding date, or put it in the beginning of each file as a header or something, and dump data only for this particular day in this particular file. I would also use fractional time. Even though a delta t would be enough (which btw is not precisely 100ms, just a tiny bit less, ~99.99(93)ms)

best,
_sk

Log in or register to post comments

July 14, 2017 at 06:24 am - Permalink