Macro Express Forums

How to convert Unicode to ASCII within MEP in 6 lines



For some time now I've dealt with the shortcoming in MEP that it can't handle Unicode and relies completely on ASCII. Often I would have output from PowerShell or some other tool that was Unicode, and I had given up on it. Today I was dealing with log files from NTBackup and slapping my forehead again when I took a closer look and discovered a quick and dirty way to convert the Unicode to ASCII with a few lines of MEP macro code, provided it's essentially ASCII text!

 

Unicode was the solution to the problem of character sets too limited to display many foreign languages and such. ASCII is 1 byte per letter, whereas UTF-16 uses 2 bytes for most characters (4 for those outside the Basic Multilingual Plane) and UTF-32 always uses 4. But get this... When the character is essentially an ASCII character like "A", in little-endian UTF-16 (what Windows tools typically write) the first byte is the usual ASCII value and the second is null (0x00 or 00000000). Now in some text files they add a byte order mark at the beginning, which for little-endian UTF-16 looks like ÿþ or a couple of squares in a text box but is actually FF FE. Just be aware that in some instances it doesn't exist, and that for UTF-32 it is FF FE 00 00 (little-endian) or 00 00 FE FF (big-endian).
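For anyone who wants to see those bytes for themselves, here is a minimal sketch in Python (not MEP) that prints the layout described above. The variable names are just for illustration; the BOM constants come from Python's standard `codecs` module.

```python
import codecs

text = "A"

# The same character in little-endian UTF-16 and UTF-32.
utf16_le = text.encode("utf-16-le")
utf32_le = text.encode("utf-32-le")

print(utf16_le.hex(" "))  # 41 00        (ASCII value, then a null)
print(utf32_le.hex(" "))  # 41 00 00 00  (ASCII value, then three nulls)

# Byte order marks that may appear at the start of a file.
print(codecs.BOM_UTF16_LE.hex(" "))  # ff fe
print(codecs.BOM_UTF32_LE.hex(" "))  # ff fe 00 00
print(codecs.BOM_UTF32_BE.hex(" "))  # 00 00 fe ff

# Read as single-byte Windows text, the UTF-16 BOM is the "ÿþ" pair.
bom_as_text = codecs.BOM_UTF16_LE.decode("cp1252")
```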

 

Variable Modify String %File Contents%: Delete a substring starting at 1 and 2 characters long // Remove the 2-byte byte order mark (FF FE)
Variable Set Integer %Null Count% to the length of variable %File Contents% // Get the total number of bytes
Variable Modify Integer: %Null Count% = %Null Count% / 2 // Half of the bytes are nulls
Repeat Start (Repeat %Null Count% times) // One pass per null; the repeat counter goes in %Position%, starting at 2
 Variable Modify String %File Contents%: Delete a substring starting at %Position% and 1 characters long // Delete the null byte
End Repeat

 

In this example I have read the Unicode string into a text variable and deleted the first two bytes (FF FE), as in this case it's UTF-16. I then counted the number of bytes, divided by two, and repeated that many times, starting at position 2 and incrementing by 1, deleting one byte on each pass. Because every deletion shifts the remaining text left by one, stepping forward by 1 lands on the next null each time, so the net effect is that every other byte is removed. Now I have pretty looking ASCII text! Bear in mind these values might need to be adjusted slightly depending on your exact flavor of Unicode, and this will only work if the characters are essentially ASCII characters and not the extended Unicode characters. Happy day!
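As a cross-check outside MEP, the same idea can be sketched in Python: drop the BOM if present, then keep every other byte. This is only an illustration under the same assumption (little-endian UTF-16 holding essentially ASCII text); the function name and sample string are made up for the example, and the null-byte check is an extra safeguard that the macro itself does not have.

```python
import codecs


def utf16le_to_single_byte(data: bytes) -> bytes:
    """Strip a UTF-16 LE BOM (if any) and drop every second byte."""
    if data.startswith(codecs.BOM_UTF16_LE):
        data = data[2:]  # remove the FF FE header
    high_bytes = data[1::2]  # the bytes the macro deletes
    if any(high_bytes):
        # A non-null high byte means the shortcut would corrupt the text.
        raise ValueError("input has characters that need more than one byte")
    return data[0::2]  # keep the first byte of each 2-byte pair


# Hypothetical sample standing in for a line of a Unicode log file.
sample = codecs.BOM_UTF16_LE + "Backup completed".encode("utf-16-le")
result = utf16le_to_single_byte(sample)
print(result)
```

Slicing avoids the repeated deletions, which is why this version does not need the position counter at all.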


Nice work on figuring that out.

 

I just ran into this problem myself. I created a macro to read some INI files and provide a consolidated status report. (The INI files were created by my backup program, SyncBackSE, on a remote machine.) ME was unable to read those files because they were encoded in a Unicode format wider than 8 bits.

 

The solution I came up with was to download a Unicode converter utility. Since it can run in batch mode with command line parameters, my macro can convert the files on the fly by running the conversion app from within the macro.
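To illustrate what such a conversion step does, here is a rough Python sketch that rewrites a UTF-16 file as UTF-8. It is not how the utility works internally; the function name, file names, and INI content are hypothetical, and it assumes "8 bit Unicode" means UTF-8 and that a file without a BOM is little-endian.

```python
import codecs
import tempfile
from pathlib import Path


def convert_utf16_file_to_utf8(src: Path, dst: Path) -> None:
    """Read a UTF-16 text file and write the same text as UTF-8."""
    raw = src.read_bytes()
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        text = raw.decode("utf-16")  # the BOM selects the byte order
    else:
        text = raw.decode("utf-16-le")  # assumption: no BOM means little-endian
    # Write bytes directly so line endings are preserved exactly.
    dst.write_bytes(text.encode("utf-8"))


# Demo with a throwaway file standing in for one of the INI files.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "status_utf16.ini"
    dst = Path(tmp) / "status_utf8.ini"
    src.write_bytes(
        codecs.BOM_UTF16_LE + "[Status]\r\nResult=OK\r\n".encode("utf-16-le")
    )
    convert_utf16_file_to_utf8(src, dst)
    converted = dst.read_bytes()

print(converted)
```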

 

The program I'm using converts text files from just about any encoding to any other, has a flexible array of command line parameters, versions for Mac and Windows (I'm using the Windows version), source file encoding detection, a very nice GUI, and the conversion is very quick. However, it's not free (about $49, with a free trial available before you buy). But the good news is, I was able to get it working instantly. It turns out, btw, that ME was able to read my INI files (and successfully search for text strings within them) after I converted them to 8-bit Unicode.

 

Here's the link for anyone that needs it:

 

"Text Encode Converter" by GoFunNow


That's a good find. There are so many of these out there, but I never have time to test them all, so it's nice to hear from someone who has. I used to rely on additional apps like this, but I try to avoid them because many of my macro packages are distributed, and that just adds another thing to ship, and often they need to be installed. But if the files are big then my solution could be too slow. Thanks!
