Accessing last table at top level in HTML DOM

  • Thread starter: John Williams

John Williams

I have an HTML page consisting of several tables at the top level of the DOM,
interspersed with other HTML tags. For example:

HTML
BODY
-- SCRIPT
-- SCRIPT
-- #comment
-- SCRIPT
-- A
-- TABLE
-- TABLE
-- TABLE
-- NOSCRIPT
-- #comment
-- BR
-- TABLE
-- #comment
-- #comment

What's the best way of accessing the last table at the top level (i.e. the
last table in the immediate children of HTMLdocument.body)? I'm currently
using the technique shown by the following code snippet, but is there an
easier way?

(ByVal mDoc As mshtml.HTMLDocument)

Dim mElems As mshtml.IHTMLElementCollection
Dim mElem As mshtml.IHTMLElement
Dim mTable As mshtml.IHTMLTable
Dim i As Integer

mElems = mDoc.body.children
For i = mElems.length - 1 To 0 Step -1
    mElem = mElems.item(i)
    If TypeOf mElem Is mshtml.HTMLTableClass Then
        mTable = mElem 'the required table
        Exit For
    End If
Next

Any help or info is much appreciated.
 
Hello John,

As far as I'm aware, what you're doing is fine. You could narrow the list of items you have to walk by requesting all the tables with getElementsByTagName, but that also brings in every nested table. So instead of checking whether each element is a table, you check whether its parent is the body. Depending on the situation (i.e. when the document doesn't contain many tables), this may need fewer iterations, but there are really no guarantees.

Dim mBody As mshtml.IHTMLElement = mDoc.body
mElems = mDoc.getElementsByTagName("TABLE")
For i = mElems.length - 1 To 0 Step -1
    mElem = mElems.item(i)
    'a table whose parent is the body is a top-level table
    If mElem.parentElement.sourceIndex = mBody.sourceIndex Then
        mTable = mElem
        Exit For 'stop at the last top-level table
    End If
Next
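
For what it's worth, the same parent check can probably be done without sourceIndex at all, by looking at the parent's tagName. This is only a sketch, not tested against a live page: the module and function names (TopLevelTables, FindLastTopLevelTable) are made up for illustration, and it assumes late binding (Option Strict Off) because the MSHTML collections hand back Object.

```vbnet
Option Strict Off ' mshtml collections return Object; late binding keeps the sketch short

Module TopLevelTables
    ' Hypothetical helper: returns the last TABLE whose parent is BODY, or Nothing.
    Function FindLastTopLevelTable(ByVal mDoc As mshtml.HTMLDocument) As mshtml.IHTMLElement
        Dim mElems As mshtml.IHTMLElementCollection = mDoc.getElementsByTagName("TABLE")

        ' Walk backwards so the first match is the last top-level table.
        For i As Integer = mElems.length - 1 To 0 Step -1
            Dim mElem As mshtml.IHTMLElement = mElems.item(i)
            Dim mParent As mshtml.IHTMLElement = mElem.parentElement

            ' Compare case-insensitively rather than assume the casing of tagName.
            If mParent IsNot Nothing AndAlso _
               String.Equals(mParent.tagName, "BODY", StringComparison.OrdinalIgnoreCase) Then
                Return mElem
            End If
        Next

        Return Nothing ' no table sits directly under BODY
    End Function
End Module
```

The caller can then cast the result, e.g. mTable = CType(FindLastTopLevelTable(mDoc), mshtml.IHTMLTable), and should check for Nothing first.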
 
Geoff Appleby said:
[...] instead of checking if it's a table, you check the parent to see if it matches the body. [...]

Thanks Geoff. Yes there are a number of nested tables, I see what you mean.
I've never used the sourceIndex before to match up the element with its
parent, so thanks for showing that too.
 
Hello John,

No worries.
Something to be aware of: sourceIndex is a dynamic value, so it can't be relied on over time.
By that I mean sourceIndex is guaranteed to be unique, but you can't rely on it being the exact same value whenever the page is reloaded. If any dynamic changes are made to the page (adding or removing elements, for example), the sourceIndex values will go out of whack.
End result: use sourceIndex, but only while the DOM has not been modified. Hope that makes sense :)
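
To make the caveat concrete, here is a rough sketch of how a saved sourceIndex can go stale. It is an illustration only: DemoSourceIndexShift and the module name are hypothetical, mTable is assumed to be a table already found in mDoc, and the inserted paragraph is just a stand-in for any DOM change.

```vbnet
Option Strict Off ' late binding, as with the other mshtml snippets in this thread

Module SourceIndexCaveat
    ' Hypothetical demo: a saved sourceIndex should not outlive a DOM change.
    Sub DemoSourceIndexShift(ByVal mDoc As mshtml.HTMLDocument, _
                             ByVal mTable As mshtml.IHTMLElement)
        ' Remember the table's position in document order.
        Dim savedIndex As Integer = mTable.sourceIndex

        ' Modify the DOM: insert a new element at the very start of BODY.
        mDoc.body.insertAdjacentHTML("afterBegin", "<p>inserted</p>")

        ' mTable still refers to the same element, but its position in
        ' document order may have moved, so savedIndex can no longer be trusted.
        If mTable.sourceIndex <> savedIndex Then
            Console.WriteLine("sourceIndex changed from " & savedIndex & _
                              " to " & mTable.sourceIndex)
        End If
    End Sub
End Module
```

In other words, if you need to get back to an element later, hold on to the element reference itself rather than its sourceIndex.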
 
