Macro Express Forums

Curious behaviour: ASCII File Processing



I have written a script to import five sentences from a text-only file.

 

ASCII File Begin Process: "C:\Test.txt" (Tab Delimited Text (.txt))
ASCII File End Process

<ASCII FILE BEGIN PROCESS Filename="C:\\Test.txt" Format="Tab" Start_Record="1" Process_All="TRUE" Records="1" Variable="%Text%" Start_Index="1" Parse_Blank_Lines="FALSE" Clear_Array="TRUE"/>
<ASCII FILE END PROCESS/>

I pressed the tab key once between each of the five sentences. My sample shows where each tab is located:

 

"Hello this is 1."<TAB>"And this is 2."<TAB>"Three = 3!"<TAB>"Four four four"<TAB>"The last of five!"

 

Then my script outputs the five results:

 

Variable Set Integer %x% to 1
Repeat Start (Repeat 5 times)
  Text Type (Simulate Keystrokes): %Text[%x%]%
  Variable Modify Integer %x%: Increment
End Repeat
 
<VARIABLE SET INTEGER Option="\x00" Destination="%x%" Value="1"/>
<REPEAT START Start="1" Step="1" Count="5" Save="FALSE"/>
<TEXT TYPE Action="0" Text="%Text[%x%]%"/>
<VARIABLE MODIFY INTEGER Option="\x07" Destination="%x%"/>
<END REPEAT/>

It works perfectly for %Text[2]% through %Text[5]%, but there is a wrinkle with %Text[1]%. Instead of outputting this:

 

Hello this is 1.

 

I get this:

 

"Hello this is 1."

 

In other words, the macro is printing the quote marks that delimit %Text[1]% in the file, prefaced with three symbols:

 

ï»¿

 

I created the text file in Notepad, so it shouldn't contain any weird or invisible characters.

 

Any ideas about what is going on?

 

My kludge for getting rid of the extra characters is to do this. But what an awful workaround!

 

Variable Modify String: Delete part of text from %Text[1]% starting at 1 and 4 characters long
Variable Modify String: Replace """ in %Text[1]% with ""

 


I don't think your file is ASCII; I think it's UTF-8. Those first characters might be the BOM (byte order mark). Look at your file with a hex editor. I use UltraEdit, but it costs money, so try Notepad++.

Don't forget that Notepad was upgraded to support Unicode, and by default it now saves as UTF-8, not ASCII. In Notepad, go to File > Save As, look in the lower right, and tell me which encoding you have selected. I'm guessing it's UTF-8 with BOM.
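
To illustrate why a BOM would leave the quote marks on the first field only, here is a minimal sketch in Python (not Macro Express, and not a description of how MEP parses files internally). It assumes a quote-aware tab-delimited parser, for which Python's csv module stands in, and a shortened version of the sample line from the first post. The helper name parse_tab_fields is made up for this example. When the three BOM bytes are read as ordinary characters, the first field no longer starts with a quote mark, so the parser keeps the quotes as literal text.

```python
import csv
import io

# Shortened sample line from the first post, tab-separated.
line = '"Hello this is 1."\t"And this is 2."\t"Three = 3!"'

# Assumption: the file was saved as "UTF-8 with BOM", so it starts with EF BB BF.
raw = b"\xef\xbb\xbf" + line.encode("ascii")


def parse_tab_fields(data: bytes, encoding: str) -> list[str]:
    """Decode the bytes and split the first line into tab-delimited fields."""
    text = data.decode(encoding)
    return next(csv.reader(io.StringIO(text), delimiter="\t"))


# A non-Unicode reader sees the BOM as three ordinary characters.
with_bom = parse_tab_fields(raw, "cp1252")

# The "utf-8-sig" codec drops a leading BOM before parsing.
without_bom = parse_tab_fields(raw, "utf-8-sig")

print(with_bom[0])     # ï»¿"Hello this is 1."
print(with_bom[1])     # And this is 2.
print(without_bom[0])  # Hello this is 1.
```

If this is what is happening in MEP, re-saving the file from Notepad with the ANSI encoding (or as UTF-8 without BOM, where Notepad offers it) should make the delete-and-replace workaround unnecessary.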


 

[Image: 2020-09-01_13-09-12.jpg]

Duh. I should have done this first. I created a test file in Notepad and saved it as UTF-8 with BOM. See how it starts with 0xEF, 0xBB, 0xBF?

Click here for an explanation of BOM. 
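
If no hex editor is handy, the same check can be sketched in a few lines of Python. This is only an illustration: the function name first_bytes_hex is invented here, and the sample bytes stand in for the real file (to check an actual file, read it in binary mode and pass in the bytes). It also shows how the three BOM bytes look when a program treats them as Windows-1252 (ANSI) text.

```python
import codecs


def first_bytes_hex(data: bytes, count: int = 3) -> str:
    """Format the first `count` bytes the way a hex editor would show them."""
    return " ".join(f"0x{b:02X}" for b in data[:count])


# Stand-in for the start of a file saved as "UTF-8 with BOM".
sample = codecs.BOM_UTF8 + b'"Hello this is 1."'

print(first_bytes_hex(sample))            # 0xEF 0xBB 0xBF
print(sample.startswith(codecs.BOM_UTF8))  # True

# The same three bytes interpreted as Windows-1252 characters.
print(codecs.BOM_UTF8.decode("cp1252"))   # ï»¿
```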


You're welcome. 

MEP doesn't support Unicode. Years ago I had to process a UTF-16 file, and what I did was delete the first three bytes and then take every other byte. Since the text was all ASCII, and Unicode essentially uses the same codes for those characters, the first byte of each two-byte pair could be ignored. I learned a lot about Unicode then.
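
The byte-skipping trick can be sketched in Python as follows. The post does not say which byte order the file used, so this assumes big-endian UTF-16 with a BOM, the layout where the recipe works exactly as described: the BOM itself is two bytes (FE FF), and the third dropped byte is the zero high byte of the first character. The function name is invented for the example, and the approach only holds when every character is plain ASCII.

```python
import codecs


def utf16be_ascii_by_skipping(data: bytes) -> bytes:
    """Drop the first three bytes, then keep every other byte.

    Assumes big-endian UTF-16 with a 2-byte BOM and ASCII-only text,
    so the high byte of every 2-byte unit is zero and can be ignored.
    """
    return data[3::2]


# Build a small big-endian UTF-16 sample: FE FF 00 48 00 65 ...
raw = codecs.BOM_UTF16_BE + "Hello".encode("utf-16-be")

print(utf16be_ascii_by_skipping(raw))  # b'Hello'

# For comparison, a Unicode-aware decoder reads the BOM and gives the same text.
print(raw.decode("utf-16"))            # Hello
```

For a little-endian file (BOM FF FE) the zero byte comes second in each pair, so the slice would be data[2::2] instead.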
