Macro Express Forums

Text analyzing


Tha Vince


Hi all,

 

I am looking for a macro (for Macro Express) that can open PDF files and search them for a fixed set of words.

When a word is found, the file name should be copied into an Excel file together with the words that were found.

After that, the PDF file should be closed and the next file opened and analyzed in the same way.

I need to do this for several hundred files, so any help is more than welcome.

 

Thank you in advance!

 

Vince


Are you asking for someone to do it for you or are you asking for advice? It's not very likely that someone has the exact macro.

 

Here's a rough outline of what is required:

 

Put all pdf files in one folder

Macro:

Prompt for user to enter search text string (text variable)
Open Excel spreadsheet (or create new)
Repeat with Folder (per above)
  Open pdf file
  Fill in Search box (type Ctrl+F then search string, then Enter)
  Determine if string is not found (not found dialog appears and has to be cleared)
  If found (no dialog appears) close pdf and bring Excel to the top
  Use Excel's Go To command to select cells for pdf filename/search string and enter data
  Increment cell position for next set of data (eg start of next row)
Repeat End
Save Excel spreadsheet
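
For anyone who finds the control flow easier to follow as code, here is a minimal Python sketch of the loop in the outline above. It is not Macro Express syntax and it does not drive Adobe Reader. It assumes the text of each PDF can be obtained some other way: `extract_text` is a hypothetical function the caller has to supply, and the results go to a CSV file (which Excel can open) rather than to a live spreadsheet. `scan_folder` and its parameter names are made up for illustration.

```python
import csv
from pathlib import Path
from typing import Callable, Iterable


def scan_folder(folder: str, words: Iterable[str],
                extract_text: Callable[[Path], str],
                out_csv: str) -> int:
    """Search every PDF in `folder` for `words`; write one CSV row per matching file.

    `extract_text` is a hypothetical helper supplied by the caller.
    Returns the number of files with at least one match.
    """
    words = list(words)
    hits = 0
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["filename", "words_found"])
        for pdf in sorted(Path(folder).iterdir()):
            # Skip anything in the folder that is not a PDF.
            if pdf.suffix.lower() != ".pdf":
                continue
            text = extract_text(pdf).lower()
            # Case-insensitive substring match for each search word.
            found = [w for w in words if w.lower() in text]
            if found:
                writer.writerow([pdf.name, "; ".join(found)])
                hits += 1
    return hits
```

The IF part is the `if found:` line: a row is only written when at least one word matched, which corresponds to the "no dialog appears" branch of the outline.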

 

The only iffy part is the "not found" dialog, which has the same partial title as the pdf window (Adobe Reader). That would be best handled using Controls, which is more advanced. (Edit) It can also be done by checking whether a window with the exact title "Adobe Reader" has appeared.

 

It was not clear whether you have one search string for all the pdfs or a different one per pdf. If it is the same search string, you don't need to add it to the spreadsheet alongside every filename. Entering it once should suffice (say at the start of the file list).

 

You can have other files in the same folder as the pdfs but you will have to add some logic in the Repeat Folder loop to only process pdfs. If the pdfs are not in one folder it will be more complicated.

 

Processing many pdfs will take time; the bulk of that will be opening the pdfs.


Hi John,

 

Thank you for your reply. I am not looking for someone to do this for me (although that would be very convenient), but I am looking for someone who can help me with (a part of) the code for macro express.

 

The process that you described is more or less correct. However, I want to automate as much as possible.

 

I will probably be able to create the simple parts of the code (i.e. opening and closing of files, etc.). The part of the code where there is a choice (an IF statement) is much more of a challenge to me.

 

Any help with the code is still more than welcome!

 

Regs,

 

Vince

 

P.S. I am now also looking at converting PDF into xls, which should make it somewhat easier.


The outline I gave was for some minor manual input (selecting the folder with the pdfs). If you already know the folder, you can skip that step and put the folder name directly in the Repeat with Folder command. I mostly help with outlines and code snippets. If you don't do most of it yourself, you don't learn for next time.

 

If IFs are your problem, here's a more detailed outline of that part. To speed things up it would be best to monitor the progress bar at the bottom right of the Reader window. It does not seem to be a window in its own right, so you can only monitor its colour. The bar is shaded blue (on my PC) on a black background, and the close icon is part red. The black is a better target, but text behind it may also be black. Should that be the case for a particular pdf, it will simply waste some time. You will have to be quick getting the position coordinates; a large pdf would help. The macro below suits my 1024x768 monitor setup. I always run my apps maximized when using coordinates, for consistency.

 

It may be possible to monitor the progress bar with Controls but it is so fleeting I could not spend the time investigating.

 

Load a pdf, enter some search text, then run this. It presses Enter and displays text boxes showing the action to be taken depending on whether the string is found. Try it with garbage strings and with strings you know it will find. In your macro you would remove the text boxes and put in the necessary logic.

 

Text Box Display: Before Starting
Activate Window: "pdf"
Text Type: <ENTER>
Repeat Start (Repeat 40 times)
  Delay 0.5 Seconds
  Get Pixel: Screen Coords: 900,710 into %N10%
  If Variable %N10% <> 0
    Repeat Exit
  End If
Repeat End
Delay 0.5 Seconds
If Window Title "Adobe Reader" is on top
  Text Type: <ENTER>
  Text Box Display: String not found, close pdf, open next one
Else
  Text Box Display: String found, close pdf, enter Excel data, open next pdf
End If

 

<TBOX4:T:1:CenterCenter000278000200:000:Before StartingOpen PDF, put in search string, close this box><ACTIVATE2:pdf><TEXTTYPE:<ENTER>><REP3:01:000001:000001:00040:0:01:><DELAY:0.5><GETPX:10:S:000900:000710><IFVAR2:2:10:2:0><EXITREP><ENDIF><ENDREP><DELAY:0.5><IFOTH:03:1:Adobe Reader><TEXTTYPE:<ENTER>><TBOX4:T:1:CenterCenter000278000200:000:String not found, close pdf, open next oneString not found, close pdf, open next one><ELSE><TBOX4:T:1:CenterCenter000278000200:000:String found, close pdf, enter Excel data, open next pdfString found, close pdf, enter Excel data, open next pdf><ENDIF>

 

(Edit) I forgot the IF logic if you are processing a folder with non-pdfs. When the Repeat with Folder returns the full filename, say into T1, if T1 does not contain pdf then skip the file processing.
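
As an illustration of that check, here is a small Python sketch (again, not Macro Express syntax). `is_pdf` is a made-up helper name and the paths are invented examples. It tests the file extension rather than looking for "pdf" anywhere in the full filename, on the assumption that a folder or file whose name merely contains "pdf" should not count.

```python
from pathlib import Path


def is_pdf(filename: str) -> bool:
    """Return True when the file's extension is .pdf (case-insensitive)."""
    return Path(filename).suffix.lower() == ".pdf"


# Invented example paths: one PDF, one text file, and one spreadsheet
# that sits in a folder whose name happens to contain "pdf".
for name in [r"C:\docs\report.PDF", r"C:\docs\notes.txt", r"C:\pdf_backup\list.xls"]:
    print(name, "->", "process" if is_pdf(name) else "skip")
```

A plain "contains pdf" test would also match the third path, so checking the extension is the safer variant if your folder names are not under your control.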


I tried running the code on another PC and found that the Adobe Reader operation was different. Ctrl+F brought up a Find dialog and there was no progress bar as such. During the search a red symbol appeared on the dialog. That would be the only means of telling a search was still in progress. The macro needs to be tailored to the version of Reader being used. The other PC was also much faster (max of 5 secs for a document). Checking progress speeds things up, otherwise you have to allow time for the longest search to take place (plus a bit) and use that for every document. Whether to add the complication of checking is personal choice based on PC and time available to complete the end task.
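
To show the difference between checking progress and using a fixed wait, here is a rough Python sketch of the polling pattern that the Repeat loop in my macro follows: check every 0.5 seconds, give up after 40 tries (20 seconds at most). `wait_until` and `done` are hypothetical names; in the macro, `done` corresponds to "the pixel at 900,710 is no longer black", and on the other PC it would have to be some test for the red symbol on the Find dialog instead.

```python
import time
from typing import Callable


def wait_until(done: Callable[[], bool], attempts: int = 40,
               interval: float = 0.5) -> bool:
    """Poll `done` every `interval` seconds, up to `attempts` times.

    Returns True as soon as `done()` is true, False if it never was.
    """
    for _ in range(attempts):
        time.sleep(interval)
        if done():
            return True
    return False
```

With polling, a short search costs only as long as it actually takes (rounded up to the next check), whereas a fixed delay costs the worst-case time for every document.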


  • 1 month later...

I had a quick question:

 

What does this part of your code actually do? Are you getting a pixel count to adjust the macro to the screen?

 

Repeat Start (Repeat 40 times)
  Delay 0.5 Seconds
  Get Pixel: Screen Coords: 900,710 into %N10%
  If Variable %N10% <> 0
    Repeat Exit
  End If
Repeat End


  • 2 weeks later...

hi,

 

I've just got a U600 and put my SIM in. When I go to create a text, it automatically puts a contact name in. It is not someone near the top of the list, so it seems totally random. Does anyone know how I can get it to be blank so I can choose who to text?

 

thanks

 

martin
