Macro Express Forums


New features of any application can be good, bad, or ugly.  So being inquisitive, I was wondering what kind of feature it would be if MXPro6 could interact with websites in the same manner, and with the same power, as it does with local applications.  Personally I think it would be great.  What is your opinion?

 

 


Nice to see you again, Joe. 

I agree, however I think it would be difficult. For one, malware likes to access content in a form too, so there are many safeguards. The reason I learned to program again in .NET was because MEP didn't work well in a web browser control. Since I gave up on that I have written hundreds of scrapers and applications which automate interactions with web pages. If ISS were serious about this, I think the key would be to create an application that has a web browser control. A Macro Express Browser. That way MEP could have better access and control from the inside of the process. I've made a few of these and they can often work wonderfully. Having said that, there are already extensions and applications that do this on the market. 


Hi Joe, pleased to hear from you after such a long time.

 

I’m no programmer, but yes I would indeed like to see MXPro offer improved power and flexibility for web stuff, if that’s what you’re asking. A start would be to get a reliable command that would detect when a site is in a ready state to accept further commands. For any browser.

 

Still occasionally dipping into your excellent book - but how about an update for MXPro under Win 10? 🙂
 

Terry, UK

 


Cory -

 

Thanks!

 

Yes, the underlying requirements for web page interaction are different from those for local apps, so maybe a new app from Insight outside of MXPro6 would be good for this.  Indeed, they do have other apps for other things.  That being said, it would still be nice to expand the External Script command somewhat to handle more possibilities.

 

You live in the .NET world, which is much broader and more flexible than my web-based application world.  And like you, I have seen many apps whose claim to fame is web scraping, which is a term that I abhor.  But it is what it is, and it seems to be more acceptable than before.  Personally, I think the term automation is better.  Then again, web scraping is automation <sigh>.

 

Do you think the market is too flooded with applications designed for web automation?


Hi Terry!

 

Yes, I have been away for a long time and I thank you for dipping into the book.  As a matter of fact, I needed to do the same thing several times recently!

 

So, we know that MXPro6 has the "Wait for the web page" feature in the Website command (and other commands?).  But you think it needs to be used for browsers other than Internet Explorer, yes?  If so, then I agree.  I don't think Microsoft even supports IE anymore, but I could be wrong.

 

Do you think that web page automation would be a good feature for MXPro6 or, as Cory mentioned, a new Insight application?

6 hours ago, joe said:

So, we know that MXPro6 has the "Wait for the web page" feature in the Website command (and other commands?).  But you think it needs to be used for browsers other than Internet Explorer, yes?

If so, then I agree.  I don't think Microsoft even supports IE anymore, but I could be wrong.


Yes, I’ve always had to write my own ‘wait until ready’ code for Firefox and later Waterfox. That failed occasionally and I no longer bother, usually inserting long delays before I attempt manipulation. A command that will work for those and Chrome, Opera, Edge (Microsoft’s replacement for IE) would be handy. 
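For anyone who wants an alternative to fixed long delays, the usual 'wait until ready' pattern is a poll with a timeout. The sketch below is in Python, not MXPro, and it only shows the loop: `wait_until` and the `is_ready` check are hypothetical names, and how you actually detect that a given browser has finished loading is left as an assumption.

```python
import time
from typing import Callable


def wait_until(is_ready: Callable[[], bool],
               timeout: float = 30.0,
               interval: float = 0.25) -> bool:
    """Poll is_ready() until it returns True or `timeout` seconds elapse.

    is_ready is a hypothetical caller-supplied check, e.g. "has the window
    title changed?" or "is the expected pixel colour present?".
    """
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True           # page (or whatever we poll) is ready
        if time.monotonic() >= deadline:
            return False          # gave up; caller decides what to do next
        time.sleep(interval)      # short pause instead of one long delay
```

The caller gets `False` on a timeout rather than an exception, so a macro-style script can decide whether to retry or stop.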
 

Quote

Do you think that web page automation would be a good feature for MXPro6 or, as Cory mentioned, a new Insight application?


I would prefer to see extra web-work functionality within MXPro. But I’m not into web stuff much these days so probably not a typical prospect.

On 5/20/2020 at 7:53 AM, joe said:

Do you think the market is too flooded with applications designed for web automation?

I don't know. I looked into iMacros for Chrome for a spell, and I saw a huge learning curve that ultimately was still going to be a temperamental and limited 'black box'. Since I was learning .NET I figured my time was better spent learning to do it there and have ultimate control.

I have often thought about what it would take to make a program that automates web page interaction for people who don't want to learn to program, and I can't imagine how difficult that would be. It's technical, and really if you want to do effective automation, scraping, or whatever, you should learn to program or hire someone who can. 

Having said that I have wanted to make an extension for MEP that would do some of the simple things. After years of writing such programs I could cover 75% of the needs with a simple console program that would be easy to interact with in MEP. I also want to write one for basic WinForms and RegEx. I've posted here to see if there was any interest and I heard crickets. So I have given up on the idea. 


Understood Cory.

 

I looked at iMacros a while back and saw what you saw.  You probably looked deeper than I, but we have the same opinion.

 

BTW, when you say "console program", are you referring to something like cmd.exe running behind the scenes for MXPro interactivity?

 

Interaction with web-based apps is centered around events.  We know that an event is anything a user does with their mouse, keyboard, touchscreen, or voice.  But events are also driven by whatever the app needs to do for itself.  Most events are asynchronous these days, and they are less difficult to handle than they used to be.  That being said, event handling is what takes the most design time.  It always took 80% of the time in every web-based app I created.

 

A lot of web-based apps are designed to be nearly 100% dynamic. Elements are usually created or made accessible only at the time some event happens, e.g. a user clicks on a button.  So the document elements and their attributes can change in an instant.

 

Still, a user can only do one thing at a time, e.g. click here, click there, hover over this, hit a key, and so forth.


Console app = cmd.exe. Something that can run invisibly and can be configured to return (echo) results so it's easy for MEP to continue with the resultant data. 

 

I rarely render a web page in a WebBrowser control and allow scripts to run. No events. That's all too complicated, troublesome, and slow. The majority of the time I create my own HTTP request/response sessions with the server. If I want to get data from a web page I use an HttpWebRequest, usually as a GET but sometimes as a POST, and extract the information using RegEx.

And often these days there's a base page that loads with some scripts. Then, depending on user input, the script will make requests from a Db server. Often you can see this with auto-complete in a search dialog, for instance. They don't download the whole page, scripts, graphics, and such, they just get that little bit of data to fill out the form fields, data grid, or whatever. These are usually pretty JSON data packages. JSON is kind of what XML wanted to be, but different. Anyway, there's a beautiful JSON deserializer in .NET which you point to your own custom class and BAM! You have a beautiful populated data package object. No need for RegEx. This is also extremely fast. 
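To illustrate the "deserialize JSON straight into your own class" idea, here is a minimal sketch. It is in Python with the standard `json` module and a dataclass, not the .NET deserializer described above, and the `Suggestion` class, the `results` key, and the sample payload are all made up for the example.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Suggestion:
    """Hypothetical shape of one auto-complete entry."""
    id: int
    label: str


def parse_suggestions(payload: str) -> List[Suggestion]:
    """Turn a JSON response body into typed objects, no RegEx needed."""
    data = json.loads(payload)
    return [Suggestion(id=item["id"], label=item["label"])
            for item in data["results"]]


# Invented sample of what an auto-complete endpoint might return.
sample = ('{"results": [{"id": 1, "label": "Macro Express"}, '
          '{"id": 2, "label": "Macro Express Pro"}]}')
suggestions = parse_suggestions(sample)
print(suggestions[0].label)
```

In real use the payload would be the response body of the request you saw in the browser's Network tab.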

If I want to automate a user action I snoop the requests/responses in the browser debugger (F12 > Network in most browsers) and I just recreate what was sent. Let's say there's a form you need to fill out and submit. You could download the whole dang page, graphics, scripts, CSS, and all that overhead. Then render it in a DOM and try to figure out how to fill in text, click a radio button, or select from a drop-down. But that's a pain. What I do is manually do that once. Then when I hit the submit button, I look at the request sent. Usually it's a POST and the data package is a small string of data. Say "LastName=Jackson&FirstName=Cory&GenderMale=True". Usually URL encoded. Often JSON also. I skip all the rest of that jazz and use a Db table to drive the process to enter ten thousand names. No loading of pages, no clicking things. Just send the end result directly. And with something that size I can usually do about 10 a second, depending on server location and performance. And I can even send them asynchronously and go faster. 
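As a rough illustration of "just send the end result directly", the sketch below rebuilds that example form body and wraps it in a POST request using only Python's standard library. The URL is a placeholder, `build_form_post` is a hypothetical helper, and nothing is actually sent here; a real site may also need cookies or extra headers.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_form_post(url: str, fields: dict) -> Request:
    """Build (but do not send) a URL-encoded form POST."""
    body = urlencode(fields).encode("ascii")  # urlencode output is ASCII-safe
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Field names taken from the example string above; the URL is a placeholder.
fields = {"LastName": "Jackson", "FirstName": "Cory", "GenderMale": "True"}
request = build_form_post("https://example.invalid/submit", fields)
print(request.data.decode("ascii"))

# To actually send it you would call urllib.request.urlopen(request),
# typically inside a loop driven by rows from your own table.
```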

Ah. Here. I did a search on this web page for "Test". This is the POST data. "q=test&type=all&search_and_or=or&search_in=all". A better example. 

Many times I need to visit a base page first to pick up a cookie or a session token, especially for ASP sites. Sometimes I need to find and extract a token from each result and add it to the headers. But it's pretty easy most times.
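Pulling a token out of a base page can be sketched like this. Everything specific is an assumption for illustration: the field name `csrf_token`, the attribute order in the sample HTML, and the `X-CSRF-Token` header name will differ from site to site.

```python
import re
from typing import Optional

# Assumes a hidden input with name="csrf_token" appearing before value="...".
TOKEN_PATTERN = re.compile(
    r'<input[^>]*\bname="csrf_token"[^>]*\bvalue="([^"]*)"',
    re.IGNORECASE,
)


def extract_token(html: str) -> Optional[str]:
    """Return the token value from the page, or None if it is not there."""
    match = TOKEN_PATTERN.search(html)
    return match.group(1) if match else None


# Invented sample of a base page fragment.
sample_html = '<form><input type="hidden" name="csrf_token" value="abc123"></form>'
token = extract_token(sample_html)

# Hypothetical header to attach to the follow-up request.
headers = {"X-CSRF-Token": token} if token else {}
print(headers)
```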

And..... Often when gathering data I find that I can do more than one can with the normal request. A good example is a web page that returns a search result for a maximum of 100 results at a time. That's being imposed in the scripts on the page, so I just change that value to a million in my POST data and get all of them and avoid downloading thousands of pages. And search criteria validation is usually in the form. Have you ever seen a page that requires you to type at least 4 characters in your search? Well, often you can do as few as you want in the POST data and the server will give it to you. Even a blank string or a percent sign. A percent sometimes needs to be URL encoded, but the cool thing there is if it's SQL on their end that's a wildcard for all. I have learned many tricks like this. On the last collection of websites I worked on I could directly pump all of the records in their Db in one request and receive it in a single JSON package.

Anyway... These techniques took a while to learn but it's much simpler. My point is that I don't mess with GUI form interactions anymore. It's much quicker and simpler to mimic their HTTP requests. If you ever want something like this, let me know. I'd be happy to do a demo for you. 

BTW I thank ISS/MEP for getting me into this. I don't use it for web automation anymore, but it got me started and learning. I wrote some huge web automation scripts in MEP for banks and hospitals so it was OK at the time for me. 
