Macro Express Forums

Some speed tests with arrays



In the ME3 days I used to search for things in big accumulator variables, and when I switched to MEP I started using arrays instead because I found that resizing string variables caused some huge slowdowns. The other day I was revisiting a macro I wrote way back when, changing it to use arrays instead because it was really slow. While trying to ID the problem spots I ran a couple of simple speed tests and I thought some of you might find the results interesting.

 

The macro runs thru approx. 25k folders and validates them all with several tests to ensure they're in compliance with the naming conventions, and also looks for potential problems like invalid SSNs. Here are my simplified tests and the results.

  1. Simply run thru all the folders using "Repeat with Folder" on a local drive with no other commands. Of course this changes the one variable each time. 1.0 s total, 42 µs each.
  2. Run thru all the folders using "Repeat with Folder" on a local drive and load an array. 1.0 s total, 41 µs each.
  3. Run thru an array of 25k and perform an "If Equals" test. 0.9 s total, 34 µs each.
  4. Run thru an array of 25k and perform an "If Contains" test with a single character. 0.9 s total, 35 µs each.
  5. Run thru 25k folders using "Repeat with Folder" and accumulate into a single variable. 20.7 s total, 830 µs each.
  6. Repeat 25k times doing an "If Contains" on an accumulator variable of 25k folder names. 14.4 s total, 577 µs each.

Essentially my old way used a combination of 5 and 6. I knew it was faster to use arrays but I was surprised to see by how much. I was also surprised to see that an "If Contains" is about the same speed as an "If Equals".
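
For anyone who wants to try the same comparison outside of Macro Express, here is a rough Python sketch of the two styles: one small test per array element versus repeated searches of one big accumulated string. This is not MEP code, the folder names are made up (`make_folder_names` is a hypothetical stand-in), and the timings will not match the MEP numbers above. It only shows the shape of each approach.

```python
import time


def make_folder_names(count):
    # Hypothetical stand-in for the real folder names.
    return [f"folder_{i:05d}" for i in range(count)]


def count_matches_in_array(names, needle):
    # Array style: one small "contains" test per element.
    return sum(1 for name in names if needle in name)


def count_hits_in_accumulator(names, probes):
    # Accumulator style: grow one big delimited string, then search the
    # whole string once per probe.
    accumulator = "\n"
    for name in names:
        accumulator += name + "\n"
    return sum(1 for probe in probes if f"\n{probe}\n" in accumulator)


def timed(func, *args):
    # Return (result, elapsed seconds) for a single call.
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```

To compare, call something like `timed(count_matches_in_array, names, "folder_12345")` and `timed(count_hits_in_accumulator, names, names)` on the same list. The accumulator version has to scan a string that holds every name for each lookup, which is the pattern tests 5 and 6 measure.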


I also assumed "If Contains" would be slower. Long variables (like all text on a web page) may have significantly different times. I'm certain we often try to simplify macros unnecessarily - we think we are saving effort but the processor does a host of operations in milliseconds.

 

If "Repeat with Folder" did not load each filename into the variable (i.e. count only), would that make a big difference?


It can't be done. Repeat with Folder requires a variable be used for the result.

 

It is interesting how some things make a difference and others do not. And it helps us write better macros. Personally I like to keep them as simple and intuitive as possible, because more and more I find myself coming back to old macros where I was being clever and it's like unraveling the Gordian knot. So more and more I make it simple first, and then if time is an issue I identify what part is costing the time and only apply advanced cleverness there.


My question was hypothetical. I thought that you might have an eyeball estimate from running your tests. In the thread about finding changed files in lists, I found that adding a "counter plus if statement to exit the repeat" slowed down the list processing by a factor of 4, which was surprising. The slowdown ratio may depend on PC setup. Often we want to count the number of filenames or lines in a file, and in ME that also means transferring the info to a variable. The CRLF method for text files may also depend significantly on PC setup.


I don't know about that.

..............................................

Adding a counter and an If statement slowed it down by a factor of 4? If I were you I would take a closer look at that. In my experimentation, adding simple variable manipulations and conditional statements was lightning fast. E.g. I have routines that will do things like testing an SSN. Each one of these tests does several ops. E.g. get the length and if not 11 create an error message and move ahead, check positions 1, 2, 3, 5, 6, 8, 9, 10, & 11 that they are numbers, check positions 4 and 7 that they are hyphens, check high group numbers, check for valid area numbers, blah, blah, blah... And this is just one of about a dozen tests. I put debug messages in between each one and calculated the time and simply ignored them because they all happen almost instantaneously.
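
As an illustration of how cheap that kind of check is, here is a minimal Python sketch of the format part only. It assumes the usual 11-character NNN-NN-NNNN layout that the listed positions imply. The function name and error messages are made up for the example, and the group and area number checks are left out because their rules are not spelled out here.

```python
def check_ssn_format(ssn):
    """Return an error message, or None if the layout looks like NNN-NN-NNNN."""
    # Length first: anything other than 11 characters fails immediately.
    if len(ssn) != 11:
        return f"length is {len(ssn)}, expected 11"
    # Positions are 1-based to match the description above.
    for pos in (4, 7):
        if ssn[pos - 1] != "-":
            return f"position {pos} is not a hyphen"
    for pos in (1, 2, 3, 5, 6, 8, 9, 10, 11):
        if ssn[pos - 1] not in "0123456789":
            return f"position {pos} is not a digit"
    return None
```

Each call is a handful of comparisons on a short fixed-length string, so the cost per folder stays flat no matter how many folders have already been processed.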

 

Now one of those routines had an accumulating variable and that was an exception. At first it was fast, then it bogged down, so it took some more detective work to find. You might have something like this going on. But it's real simple to figure out. I have a simple dyno macro that prompts the user for the number of repeats, catches the start and stop times and calculates the total time as well as the 'each' time. Then you can toss bits of macro in between and run.
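
The same dyno idea can be sketched in a few lines of Python, in case it helps to see the arithmetic. This is not the macro itself; `time_repeats` and its parameters are names invented for the example, and the snippet under test is passed in as a function instead of being pasted between the timing commands.

```python
import time


def time_repeats(snippet, repeats):
    """Run snippet() the given number of times; return (total, each) in seconds."""
    start = time.perf_counter()   # catch the start time
    for _ in range(repeats):
        snippet()                 # the bit being measured
    stop = time.perf_counter()    # catch the stop time
    total = stop - start
    return total, total / repeats
```

For example, `total, each = time_repeats(lambda: "x" in "some folder name", 25_000)` gives the total and the per-iteration time for one small "contains" test.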

 

And if you find something hinky looking post it and I'll try it on my machine. Maybe you got something going on with your machine that's messing you up like a process that decides it needs 100% CPU in the middle of your macro run. But if we can test it on multiple machines we can weed out those possibilities.


You see, you had eyeballed it! I wondered if there was anything else going on but the tests I tried were within hours of each other with similar activities on the PC. All activities involving long file searches are slow. I wondered if enabling Indexing Service on the drive would change things but the fact is that searches with Directory Opus are hugely faster. I use Opus to do the "If File Contains.." search and it passes the results to ME which searches for and breaks out the information in the required format.


I wasn't eyeballing it. I inserted debug messages in between each section, included the current timestamp from a time variable, and logged them. I dumped the log file into a spreadsheet and calculated the time. So within the repeat I had a dozen or so markers. I did the math and found that the complex tests were in the milliseconds and ignored them. But I suspected the growing accumulator var, and that would not show up in a few hundred iterations, so I tested that separately and inferred that the slowdown increases with size.
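
The spreadsheet step boils down to subtracting consecutive timestamps and summing per marker. A small Python sketch of that calculation follows, assuming the log has already been parsed into (marker, timestamp in seconds) pairs in the order they were written. The log layout and the function name are assumptions for the example, not the actual log format.

```python
def section_durations(log):
    """Sum the elapsed time after each marker until the next marker.

    log is a list of (marker, timestamp_seconds) pairs in logged order.
    Each interval is credited to the marker that starts it, so the last
    entry only closes the previous interval.
    """
    totals = {}
    for (marker, started), (_, ended) in zip(log, log[1:]):
        totals[marker] = totals.get(marker, 0.0) + (ended - started)
    return totals
```

Markers repeat on every pass through the loop, so the totals show which section dominates over the whole run, and a section that gets slower as the run goes on stands out if the same calculation is done on the first and last few hundred entries separately.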

 

If you're switching back and forth to another app, that will make life slow. There is a cause and it's not simply that MEP is a dog. You haven't been explicit about what it is you are doing that seems to take so long, but I can tell you that I can populate a 25k-element array with 25k long folder names and look for a string match in every single one of those values in 2 seconds. How long does that take on your machine?

