Macro Express Forums

Popular Content

Showing content with the highest reputation on 05/17/2021 in all areas

  1. I sometimes use the .NET WebBrowser control to make my own automated browsers. That way everything is an object model, including the rendered DOM, which I can interact with programmatically. Most web automation tools like iMacros require learning a language anyway, so I figured I would just learn how to do it in .NET and write a proper program.

     For most of my projects, though, I use the HttpWebRequest/HttpWebResponse objects instead. Rendering the document takes time, and the browser modifies the markup before rendering to 'fix' common problems. So I just grab the raw HTML and pull out what I need with RegEx. I can even traverse a series of pages that way, picking up tokens, cookies, or whatever else is needed to interact with the stateful session. It's much faster than navigating to and rendering each page. I also often find that tokens can be reused or steps skipped, which speeds up the process further.

     Often the navigation isn't necessary at all, and at the end there's a simple GET with URI parameters or an HTTP POST with parameters I can tweak. Say the web form search offers 10, 50, or 100 results per page. That's just a parameter in the POST data, so I change it to 10,000. I can often get the entire result set in a single request: seconds to get a boatload of data. I often avoid reCAPTCHAs this way too, as they often only protect the form pages.

     In many cases the last step is a JSON query from a script on the web page to an AWS server or something similar. The script takes the response and populates a table on the web page, for instance. I just copy that request, tweak the parameters, and get huge JSON data sets I can load directly into my MS SQL database using the JSON deserializer. No HTML, no DOM. Just the data.

     Pushing data through forms is similar. I analyze the form and build a stripped-down process that just uses POST requests to push a bunch of data to their server. No filling out textboxes or clicking buttons. Just send the data the way the web browser does.
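A rough sketch of the approach described above, written in Python rather than .NET to keep it short and self-contained. Everything specific here is an assumption made up for illustration: the field names `csrf_token`, `q`, and `per_page`, the `results` key in the JSON, and the `example.com` URL. A real site would use its own names, which you would find by watching the requests the browser sends.

```python
import json
import re
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical hidden-field name. This simple pattern also assumes the
# name attribute appears before the value attribute inside the tag.
TOKEN_RE = re.compile(r'<input[^>]*name="csrf_token"[^>]*value="([^"]*)"')


def extract_token(raw_html):
    """Pull a hidden form token out of raw HTML without building a DOM."""
    match = TOKEN_RE.search(raw_html)
    return match.group(1) if match else None


def build_search_request(url, token, page_size=10000):
    """Build (but do not send) the POST a browser would submit for the form.

    page_size stands in for the "results per page" parameter that the
    form normally limits to 10, 50, or 100.
    """
    body = urlencode({"csrf_token": token, "q": "", "per_page": page_size})
    return Request(
        url,
        data=body.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def parse_rows(json_text):
    """Turn a JSON response body into a list of row dicts for a database load."""
    return json.loads(json_text)["results"]


# Canned inputs so the sketch runs without touching the network.
sample_html = '<form><input type="hidden" name="csrf_token" value="abc123"></form>'
token = extract_token(sample_html)
request = build_search_request("https://example.com/search", token)
rows = parse_rows('{"results": [{"id": 1, "name": "first"}, {"id": 2, "name": "second"}]}')
```

To actually send the request you would pass it to `urllib.request.urlopen(request)` and read the response body. Whether a server honors an oversized page size depends entirely on the site.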
    1 point