Jump to content
Macro Express Forums
Cory

Cool file downloader

Recommended Posts

As you may know I don’t like using IE or any web browser for automated web tasks. Instead I would use some downloader program to get the job done. But that wasn’t very elegant as it required me to distribute another app and make firewall adjustments to all machines running it. So I went hunting again and found this very handy VBScript. You give it a URL and a local path and file name and it downloads it for you! It’s fast and if you choose to capture the console output the macro will not advance until it’s done. I’ve been using it to download torrents on RSS feeds and it has been working very reliably. But the best part is that the script is saved in the file. In this example I use the variables %URL to Download% and %Local File Name%, you can see them in the script.

' Set your settings
strFileURL = "%URL to Download%"
strHDLocation = "%Local File Name%"

' Fetch the file
Set objXMLHTTP = CreateObject("MSXML2.XMLHTTP")

objXMLHTTP.open "GET", strFileURL, false
objXMLHTTP.send()

If objXMLHTTP.Status = 200 Then
  Set objADOStream = CreateObject("ADODB.Stream")
  objADOStream.Open
  objADOStream.Type = 1 'adTypeBinary

  objADOStream.Write objXMLHTTP.ResponseBody
  objADOStream.Position = 0	'Set the stream position to the start

  Set objFSO = Createobject("Scripting.FileSystemObject")
	If objFSO.Fileexists(strHDLocation) Then objFSO.DeleteFile strHDLocation
  Set objFSO = Nothing

  objADOStream.SaveToFile strHDLocation
  objADOStream.Close
  Set objADOStream = Nothing
End if

Set objXMLHTTP = Nothing

As a practical example I‘ll describe the macro I wrote this weekend. I am a Vikings fan but do not get DirecTV but want to watch all my games. There are private torrent sites where people post torrents of games but they are a little strange. You see most torrent sites have RSS (XML structured web pages) feeds which list all the latest posts. Most torrent sites will have the torrent URL in the link section but these guys link to another page that you have to go to where you can download the torrent. Personally I think they are just trying to discourage any automation but what this means is we can’t use the traditional RSS downloaders for our torrent clients. The other problem here is that because it’s a private tracker there are a limited number of lechers (downloaders) and since we are scrutinized on our ratio if you are tail end Charlie all the time your ratio will suck and you risk getting the boot.

 

So I essentially created a macro to be an RSS downloader. What I do is have this script download the RSS feed to a file. I then parse that XML into a series of arrays with descriptions, links, sizes and so on. I even parse out the size from the description so I don’t accidently step on an MPEG2 version. I then check every description looking for Vikings games. It does this every 5 minutes. If I do find a game it follows the link to the next page where it downloads that file. In that file it finds the unique string of text which is the beginning to the torrent URL. I then cut out the URL from the rest of the HTML and use this VBScript again to download the torrent file to the drop folder (a folder monitored by the torrent software where files are automatically inducted). It then records the title of the torrent to be saved in a file that will prevent it from being downloaded again in the future. But the important thing is that this is all being done in macro memory and does not ever involve a web browser. Not only that but it’s faster because I’m not downloading all the graphics, advertising and especially not running any scripts in the web page like a browser will. No timing issues, not waiting for a web page to load... It all just works flawlessly and invisibly every time.

 

Anyway given the problems users have with web page timing I thought I would share this as a solution. Manipulating the variables is a little more advanced but it’s easy to learn.

Share this post


Link to post
Share on other sites

Thanks.

.........................................................

However others have done similar things using VBS to extend MEPs capabilities. Joe is the best for this. It's just so liberating not to be slave to the temperamental web browsers.

Share this post


Link to post
Share on other sites

' Set your settings
strFileURL = "%URL to Download%"
strHDLocation = "%Local File Name%"

' Fetch the file
Set objXMLHTTP = CreateObject("MSXML2.XMLHTTP")

objXMLHTTP.open "GET", strFileURL, false
objXMLHTTP.send()

If objXMLHTTP.Status = 200 Then
  Set objADOStream = CreateObject("ADODB.Stream")
  objADOStream.Open
  objADOStream.Type = 1 'adTypeBinary

  objADOStream.Write objXMLHTTP.ResponseBody
  objADOStream.Position = 0	'Set the stream position to the start

  Set objFSO = Createobject("Scripting.FileSystemObject")
	If objFSO.Fileexists(strHDLocation) Then objFSO.DeleteFile strHDLocation
  Set objFSO = Nothing

  objADOStream.SaveToFile strHDLocation
  objADOStream.Close
  Set objADOStream = Nothing
End if

Set objXMLHTTP = Nothing

 

Cory, I am aware that it is possible to use external scripts, but how do you actually use this from within a macro?

 

I know you put the script part in the 'External Script' command and assign a variable, say %T1%, but which command(s) do you use to execute the code? Help is not too helpful with this.

 

Regards,

 

Alan Monaghan

Kirkcaldy, Scotland

Share this post


Link to post
Share on other sites

External Script is the command.

..........................

If you set the variables in lines 4 and 5 to a valid URL and a valid file name it will work. Beware that the full path in line 4 needs to exists as it will not create a non-existent path. Also I choose the option to capture the console output. I don’t need the console output usually but it makes the macro wait for it to be completed. But of course failures and such can be handled here depending on what a script returns.

 

What MEP does is take whatever text you have within and write it to a file in the temp folder making the appropriate variable substitutions and executes.

Share this post


Link to post
Share on other sites

Cory,

I know this is an old thread, but I can use the above External Script to grab a hotel website code, but this doesn't seem to include the data elements as displayed when using Chrome.

I know I can fudge it and do the following:-

1) Go to Elements tab

2) Select Top most element

3) Edit as HTML

4) Select All

5) Copy etc.

but this kind of defeats the purpose of using VBscript.

Do you know what lines in the above code that I need to change so that I can download the actual 'Elements' script rather than just the surrounding HTML code?

I've looked at the code above and can't see how to do this.

Regards,

Alan

Share this post


Link to post
Share on other sites

Click F12 in your browser. I bet there's more going on than than a single web download. Most of these web pages are active. As you do things, they send more requests and collect more data from the server. Sometimes even just at startup. I do a lot of this data collection for hire and I follow the trail of requests and emulate it. My programs use raw HTTP requests to get responses. No browser is involved. Sometimes it can get sticky but I can make it work most times. In come cases it's easier. One site I scrape load a large amount of data in JSON. When the user clicks, filters, and does things, the script on the page simply changes the display of that data and doesn't make any more requests. This worked wonderful for me as I was able to import that directly into a data table and do what I needed to in a single request. 

It really depends on the site. Send me a PM with a link and whatever it is your're trying to do and i can take a peek at it for you. 

Share this post


Link to post
Share on other sites

This is simple. After the first GET there are 124 additional requests. About a third the way though it does a GET for https://www.premierinn.com/whitbread-services-unsecured/hotels/search/54.97814/-1.61622/40. Go ahead and click this link. This returns simple JSON data. I converted it to CSV and attached to this message. It would be a pain to parse in MEP but there are online converters. In my work I do all this in VB.NET where there are ready made JSON readers. I just load everything into a data table. 

convertcsv (1).csv

Share this post


Link to post
Share on other sites

Cory,

I've searched high and low through the URL I posted and can't find any reference to the link you posted. Where did you find that from?

How did you know to use the bold part, as I can't find any reference to it in the rest of the code?  https://www.premierinn.com/whitbread-services-unsecured/hotels/search/54.97814/-1.61622/40.

Incidentally, great work so far, I still need date/price info, but don't know where (other than trawling through all the code) where or what link to use.

I found several JSON->CSV/XLSX converters online for later as I tried loading as JSON data file in Excel 2016 but not yet experienced enough in handling the Data module (formally Power Pivot).

Alan

 

Share this post


Link to post
Share on other sites

Open your browser's developer tools. Most are F12. Open the URL you provided and then look at the requests. I ignored all the web parts, graphics, scripts, etc until I saw an XMLHttpRequest (XHR). See here. Since it's a GET and not a post, all of the parameters are included in the URI. Expand it to look at the HTTP Request and you can see everything about it. It will even let you preview the JSON

Share this post


Link to post
Share on other sites

I didn't give you the correct one. Further down is one wiht the rates. See here.  The request was a GET again and is 

https://www.premierinn.com/whitbread-services-unsecured/booking/availabilities/2018-03-08/2018-03-14/NEWTHY,NEWPTI,NEWNEW,NEWMTI,NEWTEA,NEWWAG,GATDER,NEWPLI,NEWSTO,NEWAVI,NEWMOO,NORROY,WASCOL,SOUTAY,SUNWES,WHIBAY,SUNPTI,SUNCIT,DURDUN,DURBAR,DURBRE,ASHWDG,BISTIN,NEWGRE,HAROLD,STOBAR,DARTOW,DARBRE,MIDTOW,STOPRE?get=%7B%22method%22:%22GET%22,%22cache%22:false%7D

You can work out the codes in there and rearrange however you like. Don't worry about the percent signs. Those are URI encoded. E.G. %22 is a hashtag. That too can be en/decoded in .NET. You should be able to modify that anyway you want to get exactly the hotels you want. 

Share this post


Link to post
Share on other sites

That's excellent Cory. I can work with this now.

I looked at that GET too, but didn't expand it enough to see the pricing information.

A lot of work still for me to do, but I think I can manage it from here. I've even got some sort of working data in Excel now.

Definitely bedtime now. 

Thanks again,

Alan

Share this post


Link to post
Share on other sites

You might be able to make it a web data query in Excel also. 

Share this post


Link to post
Share on other sites

That's what I'm working towards, possibly doing it all from within Excel. I'm going to use MEP to tie all bits together in the meantime until I suss more out.

You've given me a really good start though, so thanks again Cory.

Having to leave it until the weekend as I am currently job hunting and have an appointment tomorrow to prepare for. (Not an interview yet).

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×