ExecuteScriptText and German umlauts
joerg.kunze
Generally, ExecuteScriptText cmd works nicely on Win10. But as soon as the string cmd contains special characters, e.g. the German umlauts "ä", "ö" or "ü", the command is no longer executed, and no error is reported. Try
ExecuteScriptText "C:\\Test\\Überprüfung.bat"
where the batch file Überprüfung.bat contains e.g. a line like
echo "Hello World" >test.txt
I guess this problem could be related to UTF encoding?
Best regards,
Joerg.
I tested Igor 8.02 on Windows 10 and I don't see a problem. I changed your test .bat file to the following:
When I execute it using the ExecuteScriptText command you provided, test2.txt is created and contains the correct contents.
December 2, 2018 at 08:19 am - Permalink
I may be leading you on a wild goose chase, but...
This may be an issue of precomposed versus decomposed characters.
If your file name uses precomposed then your command must use precomposed. In precomposed, Ü is U+00DC ("Latin capital letter U with diaeresis").
If your file name uses decomposed then your command must use decomposed. In decomposed, Ü is U+0055 ("Latin capital letter U") followed by U+0308 ("Combining diaeresis").
Windows allows you to have two files with ostensibly the same name but which use different representations of accented characters.
Here are Igor commands that call ExecuteScriptText on two distinct files:
ExecuteScriptText "C:\\Test\\\u00DCberpr\u00FCfung.bat" // Precomposed
ExecuteScriptText "C:\\Test\\\u0055\u0308berpr\u0075\u0308fung.bat" // Decomposed (uses combining characters)
If precomposed, strlen("Überprüfung.bat") prints 17.
If decomposed, strlen("Überprüfung.bat") prints 19.
If you copy the file name to the clipboard from the Windows desktop and then execute the strlen command on it, that will tell you if the filename is precomposed or decomposed.
If I use the Windows 10 "on screen keyboard" and set my input method to DEU (German), and press Shift and click Ü, I get a precomposed Ü. There may be other ways of entering Ü that produced decomposed text.
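For a cross-check outside Igor, the two spellings and their byte counts can be reproduced with a short Python sketch. It is only an illustration: it uses Python's standard unicodedata module, the names utf8_len, precomposed and decomposed are made up for this example, and it assumes that strlen counts UTF-8 bytes, which is consistent with the 17 and 19 quoted above.

```python
import unicodedata

def utf8_len(s):
    """Return the number of bytes in the UTF-8 encoding of s."""
    return len(s.encode("utf-8"))

name = "\u00DCberpr\u00FCfung.bat"

# NFC yields the precomposed spelling, NFD the decomposed one
precomposed = unicodedata.normalize("NFC", name)
decomposed = unicodedata.normalize("NFD", name)

print(utf8_len(precomposed))  # 17
print(utf8_len(decomposed))   # 19

# The two spellings look alike but are different strings
print(precomposed == decomposed)  # False
```

Normalizing both the file name and the command string to the same form before comparing them is one way to rule this difference in or out.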
December 2, 2018 at 10:42 am - Permalink
In reply to I may be leading you on a… by hrodstein
Hi Howard,
thank you for this very insightful answer. I use the Win 10 DEU keyboard. strlen tells me the results are decomposed. So I played with decomposed letters, and my questions have multiplied rather than diminished.
If I type
I get U ̈ instead of Ü, and an ExecuteScriptText command does not work this way either. So I want to know how my "Ü" is constructed. I split the "Ü" with two char2num commands and get -61 and -100, which is 0xFFC3FF9C. Honestly, I do not understand how these codes are related to the decomposed 0x00550308.
I noticed that the problem occurs only with ExecuteScriptText, while CopyFile always works fine. To get ahead with my work, I programmed a workaround: I now copy all files with CopyFile to an umlaut-free path, give them umlaut-free names, let a LaTeX batch do its work on them, and copy the result back using CopyFile. ExecuteScriptText is working again after rebooting the computer, so there is no more pressure on this issue. But I am still curious to understand it, because it will surely come back to me.
December 2, 2018 at 11:12 pm - Permalink
In reply to Hi Howard, thank you for… by joerg.kunze
That's due to the Lucida Console font (or whatever font you're using). If you change the history window's font to something different, such as Courier New, you'll see the text printed correctly. Or, if you copy the string into Word and set the font to Lucida Console, you'll see it displayed incorrectly there also.
December 3, 2018 at 08:24 am - Permalink
Igor stores text as UTF-8, a byte-oriented Unicode text encoding format that uses between one and four bytes per character.
0055 and 0308 are code values expressed in hexadecimal as UTF-16, a Unicode text encoding format that encodes most characters using a single 16-bit word but which encodes some characters as a sequence of two 16-bit words.
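The values -61 and -100 reported earlier in the thread fit this picture. Assuming char2num returned each byte as a signed 8-bit value (an assumption inferred from the negative results), masking with 0xFF gives the bytes 0xC3 0x9C, and those two bytes are the UTF-8 encoding of the precomposed U+00DC. A small Python sketch of that arithmetic, with variable names chosen only for this example:

```python
# Values reported from char2num in the thread
signed_values = [-61, -100]

# Reinterpret each signed 8-bit value as an unsigned byte
raw = bytes(v & 0xFF for v in signed_values)
print(raw.hex())  # c39c

# Decode the two bytes as UTF-8 to recover the code point
ch = raw.decode("utf-8")
print(f"U+{ord(ch):04X}")  # U+00DC

# For comparison: UTF-8 bytes of the decomposed spelling U+0055 U+0308
decomposed_bytes = "\u0055\u0308".encode("utf-8")
print(decomposed_bytes.hex())  # 55cc88
```

A decomposed Ü would therefore start with the plain ASCII byte 0x55, which char2num would report as a positive 85.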
Here is a demo that shows how to view the UTF-16 codes corresponding to an Igor UTF-8 string:
Function MakeUTF16LEWave(utf8Str, wName)
	String utf8Str
	String wName // Name for output wave

	// Convert UTF-8 to UTF-16LE
	Variable utf8Code = 1
	Variable utf16Code = 101 // UTF-16, little-endian
	String utf16LEStr = ConvertTextEncoding(utf8Str, utf8Code, utf16Code, 1, 0)

	// Make an unsigned byte wave
	Variable numCodeValues = strlen(utf16LEStr) / 2 // Two bytes per UTF-16 code value
	Make/O/N=(numCodeValues*2)/B/U $wName
	WAVE w = $wName
	w = char2num(utf16LEStr[p]) & 0xFF

	// Convert to an unsigned 16-bit wave
	Redimension/E=1/N=(numCodeValues)/W/U w
End

Function DemoUTF16()
	String precomposedStr = "\u00DCberpr\u00FCfung.bat"
	MakeUTF16LEWave(precomposedStr, "PrecomposedUTF16LE")
	WAVE PrecomposedUTF16LE

	String decomposedStr = "\u0055\u0308berpr\u0075\u0308fung.bat"
	MakeUTF16LEWave(decomposedStr, "DecomposedUTF16LE")
	WAVE DecomposedUTF16LE

	// Bring the table to the front, or create it if it does not exist yet
	DoWindow/F DemoUTF16LETable
	if (V_Flag == 0)
		Edit/W=(190,45,582,442)/N=DemoUTF16LETable PrecomposedUTF16LE, DecomposedUTF16LE
		ModifyTable format(PrecomposedUTF16LE)=10,format(DecomposedUTF16LE)=10
	endif
End
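As an independent check on the UTF-16 code values for the two spellings, the same conversion can be sketched in Python. This is not part of the Igor demo; the helper name utf16_code_units is invented for this example and only the standard library is used.

```python
import struct

def utf16_code_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    data = s.encode("utf-16-le")  # little-endian, no byte order mark
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

pre = utf16_code_units("\u00DCberpr\u00FCfung.bat")
dec = utf16_code_units("\u0055\u0308berpr\u0075\u0308fung.bat")

print([f"{u:04X}" for u in pre[:2]])  # ['00DC', '0062']
print([f"{u:04X}" for u in dec[:3]])  # ['0055', '0308', '0062']
print(len(pre), len(dec))             # 15 17
```

The precomposed name has 15 code units and the decomposed name has 17, because each umlaut contributes one extra combining character.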
December 3, 2018 at 09:12 am - Permalink
Like aclight, I was also unable to reproduce the original problem.
Please zip and attach a "Überprüfung.bat" file that shows the problem. That should preserve the spelling of your file name and may allow me to reproduce the problem.
December 3, 2018 at 09:13 am - Permalink
Are you calling the code from a procedure file? What encoding does the procedure file have? The "i" button on the lower left tells you that.
December 4, 2018 at 06:28 am - Permalink
In reply to Are you calling the code… by thomas_braun
A procedure file's text encoding comes into play only when the file is read or written. All text is stored in memory as UTF-8. So the procedure file's text encoding should not be an issue.
If a procedure file has the wrong text encoding for the text stored in it, you will see incorrect characters when editing the procedure file in Igor. Unless you see that, the file's text encoding is not relevant.
December 4, 2018 at 09:14 am - Permalink
Thanks Howard for the insight and clarification.
December 4, 2018 at 09:41 am - Permalink
In reply to Like aclight, I was also… by hrodstein
Many thanks and sorry for the delay. Here comes the zip.
December 5, 2018 at 01:12 am - Permalink
Your file, as unzipped, is spelled using precomposed characters, like my version of your file.
I tried calling ExecuteScriptText on your file in Igor6, Igor7, and Igor8. In all versions, it runs without returning an error.
It does not produce the Test.txt output file unless I run Igor as administrator. If I run as administrator, it produces the file in all versions.
The file is produced in the "current folder" which is the folder containing the Igor64.exe file. This is a descendant of the "Program Files" folder which is protected by the operating system. That's why I need to run as administrator.
If I change the command in the file from
to
where "Folder" is a folder for which I have write access, then the Test.txt file is produced in the specified folder whether I am running as administrator or not.
Bottom line is that I don't know why I am getting a different result from you.
December 5, 2018 at 09:55 am - Permalink
Hi Howard,
thanks again. It seems to be a soft bug or a "Heisenbug", as my colleagues like to call it. Such bugs are sometimes hard to reproduce. Maybe the path to the batch file also plays a role; it contained some umlauts as well. And maybe a process got stuck. Again, rebooting helped.
As I currently have a running workaround, I would like to stop discussion here and rather put our energy into other questions.
Thanks again,
Jörg (...hopefully precomposed...)
December 7, 2018 at 07:12 pm - Permalink
I missed that part in your post from December 2nd.
December 8, 2018 at 10:33 am - Permalink
Oh, sorry. It helped after changing to umlaut-free path and file names. I posted it as soon as I noticed.
December 8, 2018 at 09:46 pm - Permalink