Macro Express Forums

Text process - corrupted first line?


I vaguely recall asking about this obscure issue before, maybe a couple of years ago. But I can't find the thread and I don't remember if it was solved.

 

I'm using Text File Begin/End Process on this text file:

 

20130726-094620.JPG
20130726-135134.JPG
20130726-140554.JPG
20130726-140648.JPG
etc

But the first line extracted always has some spurious characters at the start, as you see:

 

20130726-094620.JPG
20130726-135134.JPG
20130726-140554.JPG
20130726-140648.JPG
etc

 

Here's what it looks like in a hex editor:

 

MEPro-TextPuzzle-1.jpg

 

Anyone have any ideas as to the likely cause please?

 

--
Terry, East Grinstead, UK

Link to comment
Share on other sites

I don't have time to verify, but I think it's a BOM (Byte Order Mark), a convention used in the UTF encodings. For plain English text, UTF-8 and ASCII are byte-for-byte the same, but a UTF file can start with a BOM to tell whatever application opens it which encoding is in use and, for UTF-16 and UTF-32, how the bytes are ordered. It's probably being added by whatever application is generating the file. I've had this problem before with various applications, and I simply set my program not to include the BOM. If that's not an option, just have MEP delete the BOM characters.
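
If the file can be cleaned up before the macro reads it, something like the following would do it. This is only a minimal sketch in Python (my choice for illustration, not anything built into Macro Express), and it assumes the mark is the three-byte UTF-8 BOM, EF BB BF. The function name `strip_utf8_bom` and the sample data are made up for the example.

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# Hypothetical sample: a file list saved with a UTF-8 BOM in front.
sample = codecs.BOM_UTF8 + b"20130726-094620.JPG\r\n20130726-135134.JPG\r\n"

cleaned = strip_utf8_bom(sample)
first_line = cleaned.splitlines()[0].decode("ascii")
print(first_line)
```

A program that reads the file as plain 8-bit text sees those three bytes as ordinary characters, which is why they turn up glued to the front of the first line only.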

Link to comment
Share on other sites

Thanks Cory, I'm sure that's it. I recall my text editor, TextPad, does some quirky stuff I've never really understood. I'll investigate in the morning. Suspecting something like that, I did earlier try opening the file in Notepad and re-saving it, but the macro still gave the same result.

 

Terry, East Grinstead, UK

Link to comment
Share on other sites

I use Ultra Edit and it has options to include BOMs or not.

 

At some point ISS needs to start handling Unicode properly.

 

TextPad does too. As you see, it's apparently currently set not to write them, so I'm still unsure how those characters arose.

 

BOM-TextPad-1.jpg

 

 

--

Terry, East Grinstead, UK

Link to comment
Share on other sites

From your other post it sounds like Directory Lister created the file list, not your text editor. It sounds like this is where the BOM is coming from. Your text editor is probably just saving in the same format and not showing the BOM markers that are there.

Link to comment
Share on other sites

No, it was definitely down to TextPad. Bizarre though, because I've just stepped methodically through some similar tests again ... and TextPad is now working as expected! IOW, with the BOM option checkmarked I get the odd characters, with it unmarked I don't.

It's as if the setting had somehow got itself reversed, like a compass that had been too close to a loudspeaker.

Some details FYI:

1. I saved the list from Directory Lister.

2. I examined the hex; no spurious characters, ruling out the possibility that DL was the problem.

3. I opened it in Notepad and saved it with a new name. The hex of that was fine.

4. I opened the original in TextPad. Write Unicode and UTF-8 BOM was disabled. I saved it with a new name. The hex of that was fine.

5. I opened the original in TextPad. I enabled Write Unicode and UTF-8 BOM and I saved with a new name. The hex of that showed the spurious characters.
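
The hex check in the steps above can also be done without a hex editor. This is a rough Python sketch, not part of Macro Express or TextPad; the helper names `detect_bom` and `first_bytes_hex` are invented for illustration, and the BOM byte values come from Python's standard `codecs` module.

```python
import codecs

# Longer marks are checked first: the UTF-32 LE BOM starts with the UTF-16 LE BOM.
KNOWN_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]

def detect_bom(head: bytes):
    """Return the name of the BOM that head starts with, or None if there is none."""
    for mark, name in KNOWN_BOMS:
        if head.startswith(mark):
            return name
    return None

def first_bytes_hex(head: bytes, count: int = 4) -> str:
    """Show the first few bytes the way a hex editor would."""
    return " ".join(f"{b:02X}" for b in head[:count])

# Hypothetical file contents: one saved with the BOM option on, one with it off.
with_bom = codecs.BOM_UTF8 + b"20130726-094620.JPG\r\n"
without_bom = b"20130726-094620.JPG\r\n"

for label, data in (("BOM enabled", with_bom), ("BOM disabled", without_bom)):
    print(label, "|", first_bytes_hex(data), "|", detect_bom(data))
```

To check a real file, read its first four bytes in binary mode, for example `open(path, "rb").read(4)`, and pass them to `detect_bom`.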

So, unless and until it happens again (and spotting it will be the challenge), my dilemma is resolved: I'll leave it disabled.

 

--
Terry, East Grinstead, UK

Link to comment
Share on other sites
