MakaPakaTobyHannah Posted November 1, 2011

I need to devise a process to modify an EPS file. The EPS file is plain text, of course, and can be viewed in any text editor. Depending on the page size chosen, the program I use unfortunately inserts an illegal "statusdict" command into the EPS file. This causes problems when trying to distill the file to a PDF. Removing the entire "statusdict" line, which could be something like:

statusdict /setpage known {statusdict begin 792 1224 1 setpage end} if

always solves the problem. I can easily do this in Notepad, for instance, by locating the instruction and then deleting the entire line.

My strategy has been to assign the contents of the file to a variable, and then either split the string several times and rejoin the split segments, or use "Text File begin Process" to read through each line, appending every line except the one containing "statusdict" to a receiver variable. Once that is done, I save the receiver variable with Modify String: save to text file.

This does work. The problem is that such an EPS file can be quite long. The one I tested with is 31,191 lines long. As a result, the process is extremely slow: easily in excess of one minute. I cannot think of any other procedure by which to obtain the desired result; yet at this speed, I'd be faster opening the file in Notepad and editing it manually.

Is there a better approach? Thanks in advance!

Samrae Posted November 1, 2011

Quoting MakaPakaTobyHannah: "As a result, the process is extremely slow: easily in excess of one minute."

Are you using the latest version? The change log for v 4.2.1.1 says "17. Optimized the 'Split String' command."

paul Posted November 1, 2011

Quoting Samrae: "Are you using the latest version? The change log for v 4.2.1.1 says '17. Optimized the 'Split String' command.'"

Actually, the current version is v4.3.0.1!

MakaPakaTobyHannah Posted November 1, 2011

Quoting Samrae: "Are you using the latest version? The change log for v 4.2.1.1 says '17. Optimized the 'Split String' command.'"

Yes, I am running 4.3.0.1.

Samrae Posted November 1, 2011

Quoting paul: "Actually, the current version is v4.3.0.1!"

I didn't say the current version is v 4.2.1.1, only that the Split String command was optimized in that version.

Cory Posted November 9, 2011

This takes me back. In my former life I was a CADD draftsman/designer and had created a home-brew documentation system with this ‘new’ program called Acrobat. AutoCAD couldn’t create PDFs and the PDF print driver had not been invented yet, so one had to use Distiller. And since AutoCAD could export EPS, this worked well. Unfortunately, given the 3D nature of the models, arc rotation was often negative, which was legal in EPS, but Distiller choked on it. So I wrote a program, much like you have, to identify and edit the G-strokes to positive rotation. Worked cool. Anyway, I have a bit of experience in what you are dealing with.

I know exactly your problem with MEP here. Great functions, but for large bits of data it’s simply too slow. One suggestion I have is not to append to the variable while plowing thru with a text file process but instead to write each line out one at a time. It sounds counterintuitive but in some cases it’s faster. I believe one of MEP’s major performance problems comes from resizing variables. And given the way Windows treats active files and caches disk writes, it can often be faster.

The second option I would suggest is one I’m having to turn to more and more for exactly the reasons you’re experiencing, and that’s to use outside programming resources. Programmers deal with this sort of problem all the time, so a long time ago they created and continue to refine a weapon known as RegEx (Regular Expressions). It’s hugely powerful and difficult to understand at first, but in this simple example there’s a method to replace or remove based on a pattern. In this case imagine the instructions being “Find ‘statusdict<any number of characters><End of Line>’ and replace with nothing.” No matter how large the file, the results will be practically instantaneous. And you could put this in a VBScript you could run from MEP if this is all part of a larger thing, or you could use just a VBScript instead.

It is a little more advanced, but if you’re hacking EPS files you might find it easy, and there are tons of really great simple examples online. And if you need some help just contact me directly.

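As a rough illustration of the "find statusdict, any characters, end of line, replace with nothing" idea, here is a minimal sketch in Python rather than VBScript (a swapped-in language, not what Cory proposed). It assumes that any line containing the word statusdict should go, and the names `strip_statusdict` and `clean_file` are made up for this example. It works on raw bytes so that line endings and any binary sections of the EPS are left untouched.

```python
import re
from pathlib import Path

# Assumption: every line that mentions "statusdict" is unwanted.
# In MULTILINE mode ^ anchors at each line start; "." never matches "\n",
# so the pattern covers exactly one line plus its trailing newline (if any).
STATUSDICT_LINE = re.compile(rb"^.*statusdict.*\n?", re.MULTILINE)


def strip_statusdict(data: bytes) -> bytes:
    """Return data with every line containing 'statusdict' removed."""
    return STATUSDICT_LINE.sub(b"", data)


def clean_file(src: str, dst: str) -> None:
    """Read src in one go, drop the statusdict lines, write the result to dst."""
    Path(dst).write_bytes(strip_statusdict(Path(src).read_bytes()))
```

Calling `clean_file("dirty.eps", "clean.eps")` would process the whole file in a single pass, with no per-line variable appends.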
MakaPakaTobyHannah Posted November 15, 2011

Thank you Cory, as always, for your very helpful response. It's good to know I'm not the only one who has experienced this kind of issue. I will explore your suggested solutions. I've only just begun using external scripts as part of ME routines; for instance, input windows with both radio buttons and check boxes (and maybe roll-down menus etc.) are in-your-dreams-only features of ME, at least for now, but can be accomplished by inserting external scripts. Cheers to everyone who chimed in.

And to conclude, based on Cory's suggestion, this macro now works in a fraction of a second, where it could easily have taken up to one minute:

Variable Set String %origEPSFile% to "*.EPS" // this will display only EPS files in the next line
Variable Set String %origEPSFile%: Prompt for a filename // I select the EPS file to be processed here
External Script: AutoIT
//The AutoIT script looks like this:
//
//#include <file.au3>
//Dim $aRecords
//_FileReadToArray("%origEPSFile%",$aRecords)
//For $x = 1 to $aRecords[0]
// if stringinstr($aRecords[$x], "statusdict") then _FileWriteToLine("%origEPSFile%", $x, "", 1)
//Next
//
//End of AutoIT script
Text Box Display: "Statusdict" command has been removed. Good thing, that!

lemming Posted November 24, 2011

You don't even need Macex for this. You could use the well-known grep program from Unix. For Windows, you can get grep via the GNU utilities for Win32, which can be found at http://unxutils.sourceforge.net/

You only need the file egrep.exe, which is in \usr\local\wbin

For your given example:

statusdict /setpage known {statusdict begin 792 1224 1 setpage end} if

This one-liner will DISPLAY all lines which match the pattern in the file dirty.eps:

egrep "\{statusdict begin [0-9]{3} [0-9]{4} [0-9] setpage end\} if" dirty.eps

This one-liner will REMOVE all lines which match the pattern in the file dirty.eps and write the results to clean.eps:

egrep -v "\{statusdict begin [0-9]{3} [0-9]{4} [0-9] setpage end\} if" dirty.eps >clean.eps

-v means invert-match, or select non-matching lines
[0-9]{3} means "match any three-digit number"
[0-9]{4} means "match any four-digit number"
[0-9] means "match any one-digit number"

The curly brackets have a special meaning in regex, so you need to "escape" them if you're looking for the literal characters. Hence the use of the backslash char in the search pattern, i.e. \{ \}

I regularly have to grapple with extremely large text files (over 200 million lines, file size 2GB+ each). I processed one such file using a much more complex pattern, and egrep took only about 10 seconds. It may not seem fast, but it is still faster and easier than autoit/autohotkey. Anyway, processing such large files would not be feasible with Macro Express.

For your file, which is "only" about 30,000 lines and has relatively simple patterns, I would estimate egrep will take less than a second to perform the task you want. You can also call egrep from Macex.

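For anyone without egrep at hand, the invert-match behaviour of the second one-liner can be approximated in Python. This is only a sketch under assumptions: the function names `invert_match` and `clean_eps` are invented here, and the file is read as latin-1 purely so that every byte round-trips unchanged. It streams line by line and writes each kept line straight out, so memory use stays flat even for very large files.

```python
import re

# The same pattern as in the egrep example: literal braces are escaped,
# [0-9]{3} and [0-9]{4} match exactly three and four digits.
SETPAGE = re.compile(r"\{statusdict begin [0-9]{3} [0-9]{4} [0-9] setpage end\} if")


def invert_match(lines, pattern=SETPAGE):
    """Yield only the lines that do NOT match pattern, like egrep -v."""
    for line in lines:
        if not pattern.search(line):
            yield line


def clean_eps(src, dst):
    """Copy src to dst, skipping matching lines, one line at a time."""
    # newline="" disables newline translation so line endings are preserved.
    with open(src, encoding="latin-1", newline="") as fin, \
         open(dst, "w", encoding="latin-1", newline="") as fout:
        fout.writelines(invert_match(fin))
```

`clean_eps("dirty.eps", "clean.eps")` plays the role of `egrep -v ... dirty.eps >clean.eps`. Because the pattern is specific to the setpage construct, other lines that merely mention statusdict are kept.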