Macro Express Forums

How to loop in a one line text file?


sorlov


Greetings!

 

I am new to ME and have a question. I want to go through an *.mrc file, copy all the ISBNs from it, and paste them into Excel. The *.mrc file holds bibliographic records and consists of one extremely long line with information on hundreds of books. No line breaks.

 

I wrote a macro that opens the file in EditPlus, searches for the regular expression a66[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9], copies the match, and pastes it into Excel. It works fine, but I don't know how to tell the program to stop. I'm not sure it would work with Text File Begin/End Process, since there are no line breaks. And I don't see what variable I could define to use in Repeat Until, since it's just one long stream of data.

 

Is there a way to tell the program just to stop looping when there is no more text left? If not, what would you suggest?

 

Thanks in advance,

Stanislav


Can you supply a sample file? Zipped, if possible. It would make it easier to determine a solution.

Please find attached. This file can be opened in any text editor, including Notepad, but I'm using EditPlus because I need to include the regular expression a66[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9] in the search. I'm looking for electronic ISBNs, and since the file has a couple of varieties of those (some are 13 and some are 10 digits long), I need to find those that begin with "a66" and have 10 digits after the "a". If you use another editor, the macro could be edited for use with EditPlus, right?

SPRHum1.zip


So we are on the same page:

 

If your regular expression is correct, the file you sent has 133 matches in it.

 

Do you just want a list of the "a66########" numbers in the file:

  • a6612345678
  • a6687654321
  • a6656473829
  • ...

or do you need other information from the file, too?

 

Also, is there always whitespace following the pattern of "a66" + 8 digits? Or could there be non-whitespace characters adjacent to them?
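
To illustrate why the adjacent character matters, here is a minimal sketch in Python (not Macro Express). The sample string is made up for illustration and is not taken from the attached file. It compares the plain pattern with a stricter variant that requires a space, or the end of the text, right after the 8 digits:

```python
import re

# Hypothetical sample text; the middle item has extra digits after the first 8.
sample = "x a6612345678 y a661234567890 z a6687654321"

# Plain pattern from the thread: "a66" followed by 8 digits.
loose = re.findall(r"a66[0-9]{8}", sample)

# Stricter variant: the match must be followed by a space or the end of the text.
strict = re.findall(r"a66[0-9]{8}(?= |$)", sample)

print(loose)   # the plain pattern also picks up the start of the longer number
print(strict)  # the stricter pattern skips it
```

If the numbers are always followed by a space, the two patterns return the same list; if longer digit runs can appear, only the stricter one excludes them.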


Here is a macro that will parse through a string without using an external regular expression, although for something like this, regular expressions are MUCH better and easier. The clipboard will contain your string of numbers at the end of the macro. You will need to adjust the location and name of the input file.

 

If I have the time, I will post a macro that generates a temporary external regular expression, uses it to get the values from a string or file, and then deletes itself when finished.
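
For comparison, the regular-expression route could look roughly like this outside Macro Express. This is only a sketch in Python, not the external-regex macro mentioned above. The function names `extract_ids` and `extract_ids_from_file` are hypothetical, and reading the file as latin-1 is an assumption made so that any byte in the record data decodes without error. The trailing-space requirement follows the rule used in the macro below ("a66", 8 digits, then a space):

```python
import re
from pathlib import Path

# "a66" + 8 digits, accepted only when a space follows (the space is not captured).
PATTERN = re.compile(r"a66[0-9]{8}(?= )")


def extract_ids(text: str) -> list[str]:
    """Return every 'a66' + 8 digit run that is followed by a space."""
    return PATTERN.findall(text)


def extract_ids_from_file(path: str) -> str:
    """Read the whole file as one string and return the matches, one per CR/LF line."""
    # Assumption: latin-1 maps each byte to one character, so nothing fails to decode.
    text = Path(path).read_text(encoding="latin-1")
    return "\r\n".join(extract_ids(text))


# Example call (adjust the path to wherever the file lives):
# print(extract_ids_from_file("SPRHum1.mrc"))
```

Because `findall` walks the whole string on its own, there is no loop to terminate; it simply returns when the text runs out.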

// Create CR/LF string and a string of digits
Variable Set %T10% to ASCII Char of 10
Variable Set %T13% to ASCII Char of 13
Variable Modify String: Append %T10% to %T13%
Variable Set String %T4% "0123456789"

// Read in the file to search and begin main loop
Variable Set String %T1% from File: "SPRHum1.mrc"
Repeat Until %N1% <> %N1%

  // Locate next occurrence of "a66". If not found then we are done.
  Variable Set Integer %N1% from Position of Text in Variable %T1%
  If Variable %N1% = 0
    Repeat Exit
  End If

  // Delete everything in the search string prior to "a66"
  Variable Modify Integer: Dec (%N1%)
  Variable Modify String: Delete Part of %T1%

  // "a66" must be followed by 8 digits and a space so if a space is in position 12, this could be a good find.
  // If not, then delete the first character ("a") so the next search won't find it again.
  Variable Set Integer %N1% from Position of Text in Variable %T1%
  If Variable %N1% = 12
    Variable Modify String: Copy Part of %T1% to %T2%
    Variable Modify String: Delete Part of %T1%
    Variable Modify String: Trim %T2%

    // Make sure that positions 4 through 11 are all digits.
    // If not then this is not a good find so don't append it to the save string
    Repeat Start (Repeat 8 times)
      Variable Modify String: Copy Part of %T2% to %T3%
      If Variable %T4% does not contain variable %T3%
        Variable Set String %T2% ""
        Repeat Exit
      End If
    Repeat End
    If Variable %T2% > ""
      Variable Modify String: Append "%T2%%T13%" to %T5%
    End If
  Else
    Variable Modify String: Delete Part of %T1%
  End If
Repeat End

// Copy the save string to the clipboard.
Variable Modify String: Save %T5% to Clipboard
Delay 250 Milliseconds
Macro Return



<REM2:Create CR/LF string and a string of digits><ASCIIC:10:1:10><ASCIIC:13:1:13><TMVAR2:08:13:10:000:000:><TVAR2:04:01:0123456789><REM2:><REM2:Read in the file to search and begin main loop><TVAR2:01:04:C:\Temp\SPRHum1.mrc><REP3:08:000002:000002:0001:1:01:N1><REM2:><REM2:Locate next occurrence of "a66". If not found then we are done.><IVAR2:01:13:1:a66><IFVAR2:2:01:1:0><EXITREP><ENDIF><REM2:><REM2:Delete everything in the search string prior to "a66"><NMVAR:09:01:0:0000001:0:0000000><TMVAR2:11:01:00:001:N01:><REM2:><REM2:"a66" must be followed by 8 digits and a space so if a space is in position 12, this could be a good find.><REM2:If not, then delete the first character ("a") so the next search won't find it again.><IVAR2:01:13:1: ><IFVAR2:2:01:1:12><TMVAR2:10:02:01:001:012:><TMVAR2:11:01:00:001:012:><TMVAR2:01:02:00:000:000:><REM2:><REM2:Make sure that positions 4 through 11 are all digits.><REM2:If not then this is not a good find so don't append it to the save string><REP3:01:000004:000001:00008:1:01:><TMVAR2:10:03:02:N01:001:><IFVAR2:4:04:8:T3T><TVAR2:02:01:><EXITREP><ENDIF><ENDREP><IFVAR2:1:02:4:T><TMVAR2:07:05:00:000:000:%T2%%T13%><ENDIF><ELSE><TMVAR2:11:01:00:001:001:><ENDIF><ENDREP><REM2:><REM2:Copy the save string to the clipboard.><TMVAR2:16:05:00:000:000:><MSD:250><MRETURN>
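
For readers who do not use Macro Express, the scanning logic of the macro above can be sketched in Python. This is a loose translation under my own assumptions, not the macro itself: the function name `scan_ids` is hypothetical, and it advances an index through the string instead of deleting the text already searched. The point to notice is how the loop ends: when the search for "a66" finds nothing, there is no more text worth looking at.

```python
def scan_ids(text: str) -> list[str]:
    """Collect every 'a66' + 8 digits that is immediately followed by a space."""
    digits = "0123456789"
    found = []
    pos = 0
    while True:
        # Locate the next occurrence of "a66"; -1 means none is left, so stop.
        pos = text.find("a66", pos)
        if pos == -1:
            break
        candidate = text[pos:pos + 11]        # "a66" plus the next 8 characters
        following = text[pos + 11:pos + 12]   # the character in position 12
        is_numeric = len(candidate) == 11 and all(ch in digits for ch in candidate[3:])
        if following == " " and is_numeric:
            found.append(candidate)
            pos += 12  # continue after the match and its trailing space
        else:
            pos += 1   # skip this "a" so the next search does not find it again
    return found


sample = "zz a6612345678 a66abc a6687654321 a661234567890 "  # made-up text
print("\r\n".join(scan_ids(sample)))
```

Joining the results with CR/LF gives one number per line, which is the shape the macro leaves on the clipboard for pasting into Excel.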


OMG!!! I edited the file location, clicked on "Test Run Macro" and, at first, thought it was still waiting when it had already finished. Your macro was so fast! And so efficient. As a librarian, I have to work with text a lot, so handling it is a must. This macro does it very well.

 

If you have time for a macro using external reg expressions, it would be great, but even w/o it, you saved me lots of time and headache :)

 

Thanks a lot!


I did forget to mention that it will do its job in under 1 second on a file that small (473 k). We got lucky in that the data you wanted to extract was easy to find and parse. Anything more complicated would be better served using regular expressions.

 

Glad it worked for you!

