acantor Posted September 1, 2020

I have written a script to import five sentences from a text-only file.

ASCII File Begin Process: "C:\Test.txt" (Tab Delimited Text (.txt))
ASCII File End Process

<ASCII FILE BEGIN PROCESS Filename="C:\\Test.txt" Format="Tab" Start_Record="1" Process_All="TRUE" Records="1" Variable="%Text%" Start_Index="1" Parse_Blank_Lines="FALSE" Clear_Array="TRUE"/>
<ASCII FILE END PROCESS/>

I pressed the tab key once between each of the five sentences. My sample shows where each tab is located:

"Hello this is 1."<TAB>"And this is 2."<TAB>"Three = 3!"<TAB>"Four four four"<TAB>"The last of five!"

Then my script outputs the five results:

Variable Set Integer %x% to 1
Repeat Start (Repeat 5 times)
  Text Type (Simulate Keystrokes): %Text[%x%]%
  Variable Modify Integer %x%: Increment
End Repeat

<VARIABLE SET INTEGER Option="\x00" Destination="%x%" Value="1"/>
<REPEAT START Start="1" Step="1" Count="5" Save="FALSE"/>
<TEXT TYPE Action="0" Text="%Text[%x%]%"/>
<VARIABLE MODIFY INTEGER Option="\x07" Destination="%x%"/>
<END REPEAT/>

It works perfectly for %Text[2]% through %Text[5]%, but there is a wrinkle with %Text[1]%. Instead of outputting this:

Hello this is 1.

I get this:

"Hello this is 1."

In other words, the macro is printing the quote marks that delineate %Text[1]% in the file, prefaced with three symbols.

I created the text file in Notepad, so it shouldn't contain any weird or invisible characters. Any ideas about what is going on?

My kludge for getting rid of the extra characters is to do this. But what an awful workaround!

Variable Modify String: Delete part of text from %Text[1]% starting at 1 and 4 characters long
Variable Modify String: Replace """ in %Text[1]% with ""
Cory Posted September 1, 2020

I don't think your file is ASCII; I think it's UTF-8. Those first characters might be the BOM (byte order mark). Look at your file with a hex editor. I use UltraEdit, but it costs money, so try Notepad++. Don't forget Notepad was upgraded to support Unicode, and by default it now saves to UTF-8, not ASCII. In Notepad go to File > Save As, look in the lower right, and tell me which encoding you have selected. I'm guessing it's UTF-8 with BOM.
Cory Posted September 1, 2020

Duh. I should have done this first. I created a test file in Notepad and saved it as UTF-8 with BOM. See how it starts with 0xEF, 0xBB, 0xBF? Click here for an explanation of BOM.
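For anyone without a hex editor handy, the same check can be sketched in a few lines of Python. Python is not part of the original thread, and the function name `starts_with_utf8_bom` and the sample bytes are made up for illustration; the sample only imitates what a "UTF-8 with BOM" file would begin with.

```python
import codecs


def starts_with_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes begin with the UTF-8 BOM (0xEF 0xBB 0xBF)."""
    return data.startswith(codecs.BOM_UTF8)


# Hypothetical file contents: the BOM followed by the first two tab-separated fields.
sample = codecs.BOM_UTF8 + b'"Hello this is 1."\t"And this is 2."'

has_bom = starts_with_utf8_bom(sample)
first_three_hex = " ".join(f"{b:02x}" for b in sample[:3])

print(has_bom)          # True
print(first_three_hex)  # ef bb bf

# A program that assumes a single-byte code page (Windows-1252 is used here
# as an assumption) turns the three BOM bytes into three visible characters.
bom_as_cp1252 = sample[:3].decode("cp1252")
print(len(bom_as_cp1252))  # 3
```

To check a real file, read it in binary mode (`open(path, "rb").read(3)`) and pass the result to the function.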
acantor Posted September 1, 2020

Thank you, Cory! I had no idea that Notepad now supports Unicode. Saving the file and specifying ASCII encoding fixed the problem and solved the mystery.
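If re-saving by hand in Notepad is not practical, the same clean-up could be scripted. The following is only a sketch in Python, not something from the thread: the function name `resave_without_bom` and the temporary file names are hypothetical, and it assumes the text itself is pure ASCII.

```python
import os
import tempfile


def resave_without_bom(src_path: str, dst_path: str) -> None:
    """Copy a text file, dropping a leading UTF-8 BOM if one is present."""
    # The "utf-8-sig" codec decodes UTF-8 and skips a leading BOM.
    with open(src_path, "r", encoding="utf-8-sig", newline="") as src:
        text = src.read()
    # Encoding as ASCII raises UnicodeEncodeError if a non-ASCII character
    # is present, instead of silently writing bytes the macro cannot read.
    with open(dst_path, "w", encoding="ascii", newline="") as dst:
        dst.write(text)


# Demonstration on a throwaway file that imitates a "UTF-8 with BOM" save.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "with_bom.txt")
    dst = os.path.join(tmp, "plain.txt")
    with open(src, "wb") as f:
        f.write(b"\xef\xbb\xbf" + b'"Hello this is 1."\t"And this is 2."')
    resave_without_bom(src, dst)
    with open(dst, "rb") as f:
        cleaned = f.read()

print(cleaned.startswith(b"\xef\xbb\xbf"))  # False
```

Failing loudly on non-ASCII text is a deliberate choice here, since the point of the exercise is to hand the macro a file it can read byte for byte.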
Cory Posted September 1, 2020

You're welcome. MEP doesn't support Unicode, so years ago, when I had to process a UTF-16 file, what I did was delete the first three bytes and then take every other byte. Since the text was all ASCII, and Unicode uses essentially the same codes for those characters, the first byte of each two-byte pair could be ignored. I learned a lot about Unicode then.
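The byte-skipping trick above can be sketched in Python. This is an illustration, not the original macro: it assumes the file was big-endian UTF-16 (which is what "delete three bytes, then take every other byte" fits, since the three bytes are the two-byte BOM plus the zero high byte of the first character), and the function name `utf16be_ascii_to_bytes` is invented here.

```python
def utf16be_ascii_to_bytes(data: bytes) -> bytes:
    """Reduce BOM-prefixed big-endian UTF-16 holding only ASCII text to plain bytes."""
    # Skip 3 bytes: the BOM (0xFE 0xFF) and the first character's 0x00 high byte.
    # From there, every other byte is the ASCII code of one character.
    return data[3::2]


# Hypothetical input: a big-endian BOM followed by "Hello" in UTF-16.
utf16_sample = b"\xfe\xff" + "Hello".encode("utf-16-be")

print(utf16be_ascii_to_bytes(utf16_sample))  # b'Hello'

# For comparison, Python's "utf-16" codec reads the BOM and decodes directly.
print(utf16_sample.decode("utf-16"))  # Hello
```

For a little-endian file the ASCII byte comes first in each pair, so the slice would start right after the two-byte BOM (`data[2::2]`). Either way the shortcut only holds while every character is ASCII.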