Macro Express Forums

search for email address in text file



Hello all,

 

I am busy with questions today.

 

I am parsing a text file and want to grab any text that is an email address.

 

In general, it would go something like this:

 

If the variable contains @ and .com or .net or ... (other extensions), do something.

 

My question is, is there a better way to make sure it's an email address and nothing else?

 

Pat


I struggled with this once. The only test I could find was one where people used a script to send an email and see whether their server would refuse it as an invalid format. I thought that was inelegant, so I created my own script to do it. It performs several tests. I think I had it parse on the @ and the last period. Then I maintained a list of TLDs (top-level domains, like .com) and made sure the TLD was in that list. Then, I think, I made sure the length of the domain name was OK, then checked every character in it and in the alias to make sure they were all valid. It might have done some other tests too; I'm not sure.

You can check for valid characters by converting each one to its ASCII decimal value and making sure it's in range: 48-57 (digits), 65-90 (uppercase letters), 97-122 (lowercase letters), and whatever the values are for "! # $ % & ' * + - / = ? ^ _ ` { | } ~ .". You can read about all the email syntax rules here, but they're fairly simple. I see a couple I didn't test for, like two periods in a row.

Anyway, I have it all in one subroutine that runs pretty quickly, so wherever I use it I have one command and a yes-or-no answer. It might even have returned the rule it failed. I can't remember now without digging it out and looking at it.
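For readers who want to see the shape of that approach outside Macro Express, here is a rough Python sketch, not the actual subroutine described above. It splits on the @ and the last period, looks the TLD up in a list, and checks characters by their ASCII decimal values. The names `KNOWN_TLDS` and `check_address` are made up for illustration, the TLD list is a tiny sample, and the domain character set (letters, digits, hyphen, period) is an assumption.

```python
# Illustrative subset only; a real list of top-level domains is much longer.
KNOWN_TLDS = {"com", "net", "org", "edu", "uk", "pl"}

# Decimal codes 48-57 (digits), 65-90 (A-Z), 97-122 (a-z).
ALNUM_CODES = set(range(48, 58)) | set(range(65, 91)) | set(range(97, 123))
# Codes for the special characters quoted in the post, plus the period.
LOCAL_CODES = ALNUM_CODES | {ord(c) for c in "!#$%&'*+-/=?^_`{|}~."}
# Assumption: domain names use letters, digits, hyphen and period only.
DOMAIN_CODES = ALNUM_CODES | {ord("-"), ord(".")}


def check_address(candidate):
    """Return (True, "") or (False, description of the rule that failed)."""
    # Split on the last "@"; any earlier "@" stays in the alias and is
    # rejected by the character check below.
    alias, at, domain = candidate.rpartition("@")
    if not at or not alias:
        return False, "missing @ or alias"
    # Split the domain on its last period to isolate the TLD.
    name, dot, tld = domain.rpartition(".")
    if not dot or not name:
        return False, "missing period in domain"
    if tld.lower() not in KNOWN_TLDS:
        return False, "unknown TLD"
    if any(ord(c) not in LOCAL_CODES for c in alias):
        return False, "invalid character in alias"
    if any(ord(c) not in DOMAIN_CODES for c in domain):
        return False, "invalid character in domain"
    return True, ""
```

Calling `check_address("pat@example.com")` gives a yes answer, while a failing address also reports which rule rejected it, which is handy when tuning the rules.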



Guess it's your lucky day, Pat!

 

The attached macro processes a text file and displays a Text Box Display (TBD) every time an email address is found.

 

Tests included:

- local part length: 1 to 64 characters

- domain part length: 1 to 253 characters

- local part contains only characters allowed in it

- domain part contains only characters allowed in it

- domain part contains at least one "."

- first and last characters of the local part can't be "."

- local part doesn't contain ".."

- total email address length ≤ 254 characters
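The tests listed above can also be sketched in Python for anyone who wants to try them outside Macro Express. This is not a translation of the attached macro; the function names `is_email` and `extract_emails` are hypothetical, the allowed character sets are assumptions based on the characters mentioned earlier in the thread, and splitting the text on whitespace and trimming surrounding punctuation is a simplification.

```python
import re

# Assumed character sets: letters, digits and the specials quoted earlier
# in the thread for the local part; letters, digits, "-" and "." for the domain.
LOCAL_OK = re.compile(r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~.-]+")
DOMAIN_OK = re.compile(r"[A-Za-z0-9.-]+")


def is_email(candidate):
    """Apply the eight tests from the list, in roughly the same order."""
    if len(candidate) > 254:                      # whole address <= 254
        return False
    local, at, domain = candidate.rpartition("@")
    if not at:
        return False
    if not 1 <= len(local) <= 64:                 # local part length
        return False
    if not 1 <= len(domain) <= 253:               # domain part length
        return False
    if not LOCAL_OK.fullmatch(local):             # local part characters
        return False
    if not DOMAIN_OK.fullmatch(domain):           # domain part characters
        return False
    if "." not in domain:                         # at least one "."
        return False
    if local.startswith(".") or local.endswith("."):
        return False
    if ".." in local:                             # no consecutive periods
        return False
    return True


def extract_emails(text):
    """Return every whitespace-separated token that passes is_email."""
    found = []
    for token in text.split():
        # Simplification: drop brackets, quotes and sentence punctuation
        # that commonly surround an address in running text.
        token = token.lstrip("<([\"'").rstrip(">)]\"',;:.!?")
        if is_email(token):
            found.append(token)
    return found
```

To process a file, something like `extract_emails(open("test.txt").read())` would do; the filename here simply echoes the attachment name and the contents of that file are not known.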

 

There are more conditions than these that determine whether an email address is correct. If you feel like adding more validation tests, go ahead; refer to the website Cory provided. I think it's pretty clear in the script where additional validation tests should be inserted. If not, just ask.

 

Cory wrote: "I think I had it parse on the @ and the last period."

What about ".co.uk", ".com.pl" and others? Did you create a rule for every country?

 

I didn't put any validation tests on those. It's important, so it shouldn't be neglected. I'm leaving this part to you, Pat. Time for you to contribute to the script ;)
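One possible way to approach that missing piece, sketched in Python rather than Macro Express: split the domain into its dot-separated labels, check each label, and look up only the final label in a TLD list. Under that approach ".co.uk" and ".com.pl" need no special rule, because their final labels ("uk", "pl") are ordinary country-code TLDs. The name `domain_looks_valid` is hypothetical and `KNOWN_TLDS` is a tiny illustrative sample, not a real list.

```python
# Illustrative subset only; a real list of top-level domains is much longer.
KNOWN_TLDS = {"com", "net", "org", "uk", "pl", "de"}


def domain_looks_valid(domain):
    """Check each dot-separated label, then the final label against the list."""
    labels = domain.lower().split(".")
    if len(labels) < 2:
        return False                  # no period at all
    for label in labels:
        if not 1 <= len(label) <= 63:
            return False              # empty label (e.g. "..") or too long
        if label.startswith("-") or label.endswith("-"):
            return False              # hyphen only allowed inside a label
        if not all(c.isascii() and (c.isalnum() or c == "-") for c in label):
            return False              # letters, digits and hyphen only
    # Only the last label has to be a known TLD.
    return labels[-1] in KNOWN_TLDS
```

With this sample list, `domain_looks_valid("example.co.uk")` and `domain_looks_valid("example.com.pl")` both pass, while `"example..com"` and `"-bad.com"` are rejected.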

 

The attached test file is the one I was testing the macro with.

EMAIL EXTRACTOR.mex

test.txt
