Web scrapping of masked URL using VBA
Web scrapping of masked URL using VBA
I want to scrape some stock data from a website https://dps.psx.com.pk/ using VBA in Excel, but the problem is the URL of this website does not change.
When I click on market summary as highlighted in the below image
that will return the whole market summary, I just need to scrape data in Excel using VBA as highlighted in the following image:
This is basic web scraping. You need to inspect the source code elements and like @antfuentes87 says , follow the requests. It sounds more like you need to purchase a third party tool to help you. It's also called "scrape" and "scraping"
– dbmitch
Jul 1 at 16:44
1 Answer
1
All the necessary info to scrape that data you may find in captured by Fiddler request which is logged after you made an input of the quote symbol in a browser manual:
You need to reproduce that request via VBA and parse HTML response. There is the example showing how that might be done:
Option Explicit
Sub Test()
' Add references
' Microsoft HTML Object Library
' Microsoft XML, v6.0
Dim sResp As String
Dim rOutputCell As Range
Dim oElememnt
Dim cElements As IHTMLElementCollection
Dim oTableRow
Dim oTableCell
' Retrieve HTML from website
With New MSXML2.XMLHTTP60
' Send request
.Open "POST", "https://dps.psx.com.pk/webpages/SL_main_page.php", True
.SetRequestHeader "Content-Type", "application/x-www-form-urlencoded"
.Send "symbolCode=EFOODS"
Do Until .ReadyState = 4: DoEvents: Loop
sResp = .ResponseText
End With
' Parse response and output
With New HTMLDocument
' Load response HTML into DOM
.body.innerHTML = sResp
' Clear first worksheet for output
ThisWorkbook.Sheets(1).Cells.Delete
' Parse SL_cmpInfo table and output
Set rOutputCell = ThisWorkbook.Sheets(1).Cells(1, 1)
Set oElememnt = .getElementsByClassName("SL_cmpText")(0)
rOutputCell.Value = oElememnt.innerText
' Parse SL_mktStats1 tables and output
Set rOutputCell = Cells(3, 1)
Set cElements = .getElementsByClassName("SL_mktStats1")
For Each oElememnt In Array(cElements(1), cElements(2), cElements(3))
For Each oTableRow In oElememnt.getElementsByTagName("tr")
For Each oTableCell In oTableRow.getElementsByTagName("td")
rOutputCell.Value = oTableCell.innerText
Set rOutputCell = rOutputCell.Offset(0, 1)
Next
Set rOutputCell = rOutputCell.Offset(1, 0).EntireRow.Cells(1, 1)
Next
Next
' Parse SL_announce table and output
Set rOutputCell = rOutputCell.Offset(1, 0)
Set oElememnt = .getElementsByClassName("SL_announce")(0)
For Each oTableRow In oElememnt.getElementsByTagName("tr")
For Each oTableCell In oTableRow.getElementsByTagName("td")
rOutputCell.Value = oTableCell.innerText
Set rOutputCell = rOutputCell.Offset(0, 1)
Next
Set rOutputCell = rOutputCell.Offset(1, 0).EntireRow.Cells(1, 1)
Next
End With
MsgBox "Completed"
End Sub
Don't forget to add the necessary references:
The output for me is as follows:
As required:
How do you know if you need to include cookie information please?
– QHarr
Jul 2 at 6:00
@QHarr MSXML2.XMLHTTP supports cookies processing, and automatically adds the necessary cookies to requests. You can find that from fiddler capture.
– omegastripes
Jul 2 at 6:38
Many thanks for getting back to me. Can't upvote again, but if I could I would.
– QHarr
Jul 2 at 6:38
@QHarr Find more tips here WinINet vs. WinHTTP (Windows).
– omegastripes
Jul 2 at 6:40
@omegastripes Thank you so much . It works perfectly for me. Can you tell me what is the best way to learn VBA ?
– Faizan Nauman
Jul 2 at 9:49
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
Did you look at the requests being made when you search for "EFOODS"?
– antfuentes87
Jul 1 at 16:30