Need help with retrieving html data with VBA #145070
Replies: 1 comment 1 reply
-
Updated VBA Code to Retrieve All EventsSub RetrieveAllEvents()
Dim ie As Object
Dim strJson As String
Dim totalEvents As Integer
Dim lastHeight As Long
Dim currentHeight As Long
Dim scrollAttempts As Integer
Dim maxScrollAttempts As Integer
' Create a new Internet Explorer application
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
' Navigate to the webpage
ie.Navigate "https://www.meetup.com/cflfreethought/events/"
' Wait for the page to load completely
Do While ie.Busy Or ie.ReadyState <> 4
DoEvents
Loop
' Initialize variables
totalEvents = 0
lastHeight = 0
scrollAttempts = 0
maxScrollAttempts = 10 ' Set a limit to avoid infinite loops
' Scroll until no new content is loaded
Do While scrollAttempts < maxScrollAttempts
' Scroll down the page
ie.Document.parentWindow.scrollBy 0, 1000
' Wait for new content to load
Application.Wait Now + TimeValue("00:00:02") ' Adjust wait time if necessary
' Get the current height of the document
currentHeight = ie.Document.body.scrollHeight
' Check if the height has changed
If currentHeight = lastHeight Then
' No new content loaded, exit the loop
Exit Do
End If
' Update lastHeight for the next iteration
lastHeight = currentHeight
' Increment the scroll attempt counter
scrollAttempts = scrollAttempts + 1
Loop
' Now retrieve the HTML content
strJson = ie.Document.body.innerText
' Output the result to the Immediate Window (Ctrl + G to view)
Debug.Print strJson
' Optionally, parse strJson to extract the event data
' Your parsing logic goes here
' Clean up
ie.Quit
Set ie = Nothing
End Sub
|
Beta Was this translation helpful? Give feedback.
-
Body
This webpage lists events for an organization:
https://www.meetup.com/cflfreethought/events/
Currently, there are 33 events, but only 10 are displayed initially when the page is loaded.
The remaining 23 events are retrieved when downward scrolling occurs. This may be called dynamic loading of the data or webpage.
Data is needed for all 33 events.
This VB code returns the data for the first 10 events which can be parsed out using strJson.
Set ie = CreateObject("InternetExplorer.Application")
ie.Navigate html_webpage
strJson = ie.Document.body.innertext
The difficulty is trying to obtain data for events 11 to 33.
Some things that have been tried:
Me!EdgeBrowser1.ExecuteJavascript "window.scrollBy(0, 5000);"
Me!EdgeBrowser1.ExecuteJavascript "window.scrollTo(0, document.body.scrollHeight);"
Neither of these have worked. An html expert is needed to advise how to retrieve the missing data. Thanks!
Guidelines
Beta Was this translation helpful? Give feedback.
All reactions