Macro Express Forums

Replace CR/LF skips many CR/LF



I'm not sure why, but only some of the CR/LFs are replaced.

 

// Set Environment variable "CRLF" to cr/lf characters

Variable Set %T2% to ASCII Char of 10

Variable Set %T3% to ASCII Char of 13

Variable Modify String: Append %T2% to %T3%

Variable Modify String: Save %T3% to Environment Variable

// Replace Environment variable "CRLF" in String T1 with space

Replace "%CRLF%" with " " in %T1%

 

I am trying to do this in an htm file. I'm not sure if that's the problem.

 

Please help. Thanks.


Short Explanation

Some files contain LF or CR characters alone. Also some may have LFCR instead of CRLF. Replacing CRLF will miss those characters. To determine what is going wrong, after running your macro to replace CRLFs, examine the results to see if there are any stray CRs or LFs. I predict you will find them. Some systems only use LFs, not CRLFs, to indicate the end of a line.
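To illustrate the point outside Macro Express, here is a minimal Python sketch (not Macro Express syntax) that replaces every line-break variant instead of only CRLF. The function name `replace_line_breaks` and the sample string are made up for illustration.

```python
import re

# Try the two-character pairs (CRLF, LFCR) first, then any lone CR or LF.
_LINE_BREAK = re.compile(r"\r\n|\n\r|\r|\n")

def replace_line_breaks(text: str, replacement: str = " ") -> str:
    """Replace every CR/LF variant in text with the replacement string."""
    return _LINE_BREAK.sub(replacement, text)

# Sample text mixing CRLF, a lone LF, a lone CR and a reversed LFCR.
sample = "one\r\ntwo\nthree\rfour\n\rfive"
print(replace_line_breaks(sample))  # one two three four five
```

A replace that only looks for CRLF would leave the lone LF, the lone CR and the LFCR in this sample untouched, which is the kind of leftover the explanation above predicts.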

 

Detailed Explanation

CR is an abbreviation for Carriage Return. Think of an old typewriter or an old printer. When this character is processed the typewriter carriage or the print head moves all the way to the left but the paper is not moved up or down. With this equipment you can print one line on top of another by ending a line with only a CR.

 

LF is an abbreviation for Line Feed. When this character is processed the paper is moved up one line without moving the print head to the left or to the right. You can continue printing on a line below while keeping the same left to right orientation by using a LF character. It is perfectly valid to have CR LF LF LF to move down 3 lines (note the 3 LFs).

 

With images on a computer screen and laser printers there is not a physical 'print head' or 'carriage' but the functionality is emulated. On some computer systems it was decided that it didn't make sense to print two lines on top of one another. So when a CR (without a LF) is received it moves the 'carriage' all the way to the left AND down a line. Similarly, on some systems, a LF (without a CR) 'moves' to the left AND down a line.

 

Some software may create files that include extra CRs or LFs like this: CRLFLF, CRCRLF or CRLFCR. Software may also reverse these characters using LFCR. Sometimes this is due to sloppy programming while other times it may be intentional. When viewing these files using the program they are written for, such as a text editor, word processor or browser, you generally cannot see any difference when the extra characters are there. When I need to know what is inside the file I use a hex editor to view the content of the file. A hex editor displays the hexadecimal values for each character in the file instead of interpreting their meaning. A Google search for 'hex editor' turns up a number of good references.
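If a hex editor is not handy, a short script can do a similar inspection. The following is only a sketch in Python (3.8 or later for the spaced hex output); the function name `count_line_endings` and the sample bytes are hypothetical, and it reads a sequence such as CR LF CR as one CRLF followed by one lone CR.

```python
from collections import Counter

def count_line_endings(data: bytes) -> Counter:
    """Count CRLF, LFCR, lone CR and lone LF sequences in raw bytes."""
    counts = Counter()
    i = 0
    while i < len(data):
        pair = data[i:i + 2]
        if pair == b"\r\n":
            counts["CRLF"] += 1
            i += 2
        elif pair == b"\n\r":
            counts["LFCR"] += 1
            i += 2
        elif data[i] == 0x0D:   # lone carriage return
            counts["CR"] += 1
            i += 1
        elif data[i] == 0x0A:   # lone line feed
            counts["LF"] += 1
            i += 1
        else:
            i += 1
    return counts

data = b"a\r\nb\nc\rd\n\re"
print(data.hex(" ").upper())            # hex view of every byte
print(dict(count_line_endings(data)))   # how many of each ending
```

To check a real file, read it in binary mode, for example `open(path, "rb").read()`, so that Python does not translate the line endings before you count them.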

 

Additional Information

Other interesting characters that may be in a file are:

HT - Horizontal Tab

BS - Backspace

 

Google 'ASCII chart' for more information about these control characters.

I downloaded a hex editor and it looks like I am getting a lot of "0D 0A" combinations where the CR/LFs are.

0D is the hexadecimal value for CR. 0D hex is 13 decimal.

0A is the hexadecimal value for LF. 0A hex is 10 decimal.

 

So you are seeing the original CRLFs in the file, which means you need to double-check your macro.
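The hex-to-decimal mapping is easy to confirm with a few lines of Python. This is just a sketch; the names `CR`, `LF` and `hex_pairs` are chosen here for illustration.

```python
CR = 0x0D  # carriage return
LF = 0x0A  # line feed

def hex_pairs(data: bytes) -> list:
    """Return each byte as a two-digit uppercase hex string, as a hex editor shows it."""
    return [f"{b:02X}" for b in data]

print(CR, LF)                      # 13 10
print(hex_pairs(bytes([CR, LF])))  # ['0D', '0A']
```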


The macro appears okay, but I am seeing something strange. When I save the CRLF stripped text to a text file, there are no more CRLFs or 0D0As. However, when I save the text to a column in a comma delimited CSV file, the column retains most (perhaps all) of the CRLF/0D0As that existed.

 

I double-checked this by saving the stripped text to a text file. I opened the text file in the hex editor and there were no 0D0As anywhere. I copied this text to the clipboard and pasted it into an MS Excel cell, and the CRLFs returned. I copied the cell back to a text file, reopened that file in the hex editor, and the 0D0As were back.

 

I still wonder if HTML has something to do with this, or perhaps Excel.


Excel does add extra characters when copying a cell to the clipboard. Most likely this is what you are seeing in your test.

 

Copy a cell to the clipboard and save to a text string variable. Then use the Variable Modify String - Trim command. Save the variable to a text file - Variable Modify String - Save to Text File. Open in the Hex Editor and the 0D0As should be gone.

 

Do the same thing without the Trim and you will see the 0D0As.
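The same comparison can be sketched in Python. This assumes the extra characters Excel adds are a trailing CR LF and that Trim removes spaces, tabs, CRs and LFs from both ends of the string; the exact set of characters Macro Express trims is an assumption here, and `trim_cell` is a made-up name.

```python
def trim_cell(text: str) -> str:
    """Strip spaces, tabs, CRs and LFs from both ends only (assumed Trim behavior)."""
    return text.strip(" \t\r\n")

copied = "cell value\r\n"          # what a copied cell might look like
print(repr(copied))                # 'cell value\r\n'
print(repr(trim_cell(copied)))     # 'cell value'

# Trimming does not touch line breaks in the middle of the text.
print(repr(trim_cell("line1\r\nline2\r\n")))  # 'line1\r\nline2'
```

Note the last line: a trim only cleans the ends of the string, so CRLFs embedded inside the text still need a replace.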


Thank you for the kind recommendation. In this case, unfortunately, I am setting up a database with 30,000 - 40,000 records. Since the database is Excel, the program receiving the field we are talking about has a bunch of carriage returns. I wonder what would happen if I stripped the HTML Tags out. I'll give that a try.

