Parse HTML <DIV> tags into a TreeView control

leedo2k

Hi,

I have been trying to do this for a couple of days without success.
I have a tree view written in JavaScript.
The nodes are separated by <DIV> tags.
Each <DIV> block carries an "id" value for its tree node, and the children of a node sit in a following <DIV> whose id is the parent's id plus "son".
This is based on the PHP Layers Menu 3.0.2 dynamic PHP menu.

What I ultimately want to do is scrape this and import it into a Windows Forms TreeView control.
My first challenge is to parse out the tree nodes in such a way that I can keep track of each node's parent.
What is a suggested method to achieve this?
The value I want to extract is the name of the menu item, which is the text just before each closing </a> tag.

Here is the source code:


HTML:
<div class="treemenudiv" id="jt1"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><a onmousedown="toggletreemenu1( '1' );"><img border="0" align="top" alt="+" class="imgs" src="pic/tree_collapse_corner.png" id="jt1node"></a><img border="0" align="top" alt="O" class="imgs" src="pic/tree_folder_open.png" id="jt1folder"><a class="phplmnormal" id="jt1item" dao_id="4" onclick="javascript: { selectTreeNode( this ); }">Campus</a></div> 
<div id="jt1son" style="display: block;" class="treemenudiv"> 
<div class="treemenudiv" id="jt2"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><a onmousedown="toggletreemenu1( '2' );"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_collapse.png" id="jt2node"></a><img border="0" align="top" alt="O" class="imgs" src="pic/tree_folder_open.png" id="jt2folder"><a class="phplmnormal" id="jt2item" dao_id="31" onclick="javascript: { selectTreeNode( this ); }">Office</a> 
</div> 
<div id="jt2son" style="display: block;" class="treemenudiv"> 
<div class="treemenudiv" id="jt3"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt="|" class="imgs" src="pic/tree_vertline.png"><img border="0" align="top" alt="T" class="imgs" src="pic/tree_split.png"><img border="0" align="top" alt="^" class="imgs" src="pic/tree_leaf.png"><a class="phplmnormal" id="jt3item" dao_id="34" onclick="javascript: { selectTreeNode( this ); }">Off1</a> 
</div> 
<div class="treemenudiv" id="jt4"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt="|" class="imgs" src="pic/tree_vertline.png"><img border="0" align="top" alt="T" class="imgs" src="pic/tree_split.png"><img border="0" align="top" alt="^" class="imgs" src="pic/tree_leaf.png"><a class="phplmnormal" id="jt4item" dao_id="35" onclick="javascript: { selectTreeNode( this ); }">Off2</a> 
</div> 
<div class="treemenudiv" id="jt5"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt="|" class="imgs" src="pic/tree_vertline.png"><img border="0" align="top" alt="T" class="imgs" src="pic/tree_split.png"><img border="0" align="top" alt="^" class="imgs" src="pic/tree_leaf.png"><a class="phplmnormal" id="jt5item" dao_id="36" onclick="javascript: { selectTreeNode( this ); }">lab</a> 
</div> 
<div class="treemenudiv" id="jt6"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt="|" class="imgs" src="pic/tree_vertline.png"><img border="0" align="top" alt="L" class="imgs" src="pic/tree_corner.png"><img border="0" align="top" alt="^" class="imgs" src="pic/tree_leaf.png"><a class="phplmnormal" id="jt6item" dao_id="37" onclick="javascript: { selectTreeNode( this ); }">reception</a> 
</div> 
</div> 
<div class="treemenudiv" id="jt7"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><a onmousedown="toggletreemenu1( '7' );"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_collapse.png" id="jt7node"></a><img border="0" align="top" alt="O" class="imgs" src="pic/tree_folder_open.png" id="jt7folder"><a class="phplmnormal" id="jt7item" dao_id="32" onclick="javascript: { selectTreeNode( this ); }">WA</a> 
</div> 
<div id="jt7son" style="display: block;" class="treemenudiv">            
<div class="treemenudiv" id="jt8"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt="|" class="imgs" src="pic/tree_vertline.png"><img border="0" align="top" alt="L" class="imgs" src="pic/tree_corner.png"><img border="0" align="top" alt="^" class="imgs" src="pic/tree_leaf.png"><a class="phplmnormal" id="jt8item" dao_id="38" onclick="javascript: { selectTreeNode( this ); }">Belveu</a> 
</div> 
</div> 
<div class="treemenudiv" id="jt9"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><a onmousedown="toggletreemenu1( '9' );"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_collapse.png" id="jt9node"></a><img border="0" align="top" alt="O" class="imgs" src="pic/tree_folder_open.png" id="jt9folder"><a class="phplmnormal" id="jt9item" dao_id="33" onclick="javascript: { selectTreeNode( this ); }">SA</a> 
</div> 
<div id="jt9son" style="display: block;" class="treemenudiv"> 
<div class="treemenudiv" id="jt10"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt="|" class="imgs" src="pic/tree_vertline.png"><img border="0" align="top" alt="L" class="imgs" src="pic/tree_corner.png"><img border="0" align="top" alt="^" class="imgs" src="pic/tree_leaf.png"><a class="phplmnormal" id="jt10item" dao_id="39" onclick="javascript: { selectTreeNode( this ); }">Control Room</a> 
</div> 
</div> 
<div class="treemenudiv" id="jt11"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt="T" class="imgs" src="pic/tree_split.png"><img border="0" align="top" alt="^" class="imgs" src="pic/tree_leaf.png"><a class="phplmnormal" id="jt11item" dao_id="41" onclick="javascript: { selectTreeNode( this ); }">Single</a> 
</div> 
<div class="treemenudiv" id="jt12"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><a onmousedown="toggletreemenu1( '12' );"><img border="0" align="top" alt="+" class="imgs" src="pic/tree_collapse_corner.png" id="jt12node"></a><img border="0" align="top" alt="O" class="imgs" src="pic/tree_folder_open.png" id="jt12folder"><a class="phplmnormal" id="jt12item" dao_id="42" onclick="javascript: { selectTreeNode( this ); }">Double</a> 
</div> 
<div id="jt12son" style="display: block;" class="treemenudiv"> 
<div class="treemenudiv" id="jt13"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt=" " class="imgs" src="pic/tree_space.png"><img border="0" align="top" alt="L" class="imgs" src="pic/tree_corner.png"><img border="0" align="top" alt="^" class="imgs" src="pic/tree_leaf.png"><a class="phplmnormal" id="jt13item" dao_id="43" onclick="javascript: { selectTreeNode( this ); }">Foo</a> 
</div> 
</div> 
</div>
 
I think this should be about it. tree.htm is the file where I put the HTML code you posted, and GetNodes is recursive:
VB.NET:
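Private Sub LoadTree()
    ' A WebBrowser control parses the saved HTML into a DOM we can walk.
    Dim web As New WebBrowser
    web.Navigate("about:blank") ' gives the control an empty document to write into
    web.Document.Write(My.Computer.FileSystem.ReadAllText("tree.htm"))
    GetNodes(web.Document, web.Document.Body, TreeView1.Nodes)
End Sub

' Walks the DIVs under root and adds one TreeNode per menu item to nodes.
Private Sub GetNodes(ByVal doc As HtmlDocument, ByVal root As HtmlElement, ByVal nodes As TreeNodeCollection)
    Dim node As TreeNode = Nothing ' most recently added item at this level
    For Each child As HtmlElement In root.Children
        If child.TagName = "DIV" Then
            Dim id As String = child.GetAttribute("id")
            If id.EndsWith("son") Then 'div contains the child items of the previous item
                If node IsNot Nothing Then GetNodes(doc, child, node.Nodes)
            Else 'div is a menu item; its caption is the text of the <a id="...item"> link
                Dim item As HtmlElement = doc.GetElementById(id & "item")
                node = Nothing
                If item IsNot Nothing Then node = nodes.Add(item.InnerText)
            End If
        End If
    Next
End Sub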
 
This is a really brilliant idea. I had been struggling with a nightmare of RegExp, string splitting and IndexOf, but this really does it, especially the recursive part. I do appreciate your input.

One more request, if it's not too much. I wish to persist the TreeView structure in a database, so two methods are needed: one to store the structure in the database and another to retrieve it from the database and draw the TreeView.

I already have a table with the following fields in an attached SQL Server 2005 database file:

GroupID (Autonumber PK)
GroupName (varchar(20))


The challenge here, as I see it, is figuring out how to maintain a consistent parent-child relationship at all levels, regardless of depth. What do you suggest?
 
I've moved the thread into the Data Access forum (the first part belongs there too). Let's see if someone picks up on the database side of this.
 
Thanks John. I have already started working out a solution.

I would appreciate it if someone in this forum could help further with the database issue.
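
For the parent-child question raised earlier in the thread, one common option is an adjacency list: every row stores the GroupID of its parent, so the same two columns describe any depth. The sketch below is untested and rests on assumptions that are not in the thread: the table is called Groups, it has an extra nullable int column ParentID (NULL for root nodes), and GroupName is never NULL. SaveNodes and LoadNodes are hypothetical names, and the connection is expected to be open already.

```vbnet
Imports System.Collections.Generic
Imports System.Data.SqlClient
Imports System.Windows.Forms

Module TreePersistence
    ' Assumed schema (hypothetical table and column names):
    '   Groups(GroupID int IDENTITY PK, GroupName varchar(20), ParentID int NULL)

    ' Save: insert each node, then recurse with the new GroupID as the children's ParentID.
    ' Pass DBNull.Value as parentId for the root level.
    Public Sub SaveNodes(ByVal conn As SqlConnection, ByVal nodes As TreeNodeCollection, ByVal parentId As Object)
        For Each node As TreeNode In nodes
            Dim newId As Integer
            Using cmd As New SqlCommand( _
                "INSERT INTO Groups (GroupName, ParentID) VALUES (@name, @parent); " & _
                "SELECT CAST(SCOPE_IDENTITY() AS int);", conn)
                cmd.Parameters.AddWithValue("@name", node.Text)
                cmd.Parameters.AddWithValue("@parent", parentId)
                newId = CInt(cmd.ExecuteScalar()) ' identity value of the row just inserted
            End Using
            SaveNodes(conn, node.Nodes, newId)
        Next
    End Sub

    ' Load: read all rows once, create a TreeNode per row, then attach each node to its parent.
    Public Sub LoadNodes(ByVal conn As SqlConnection, ByVal target As TreeNodeCollection)
        Dim byId As New Dictionary(Of Integer, TreeNode)
        Dim parentOf As New Dictionary(Of Integer, Integer)
        Dim ids As New List(Of Integer) ' keeps the GroupID order for sibling ordering
        Using cmd As New SqlCommand("SELECT GroupID, GroupName, ParentID FROM Groups ORDER BY GroupID", conn)
            Using reader As SqlDataReader = cmd.ExecuteReader()
                While reader.Read()
                    Dim id As Integer = reader.GetInt32(0)
                    byId(id) = New TreeNode(reader.GetString(1))
                    ids.Add(id)
                    If Not reader.IsDBNull(2) Then parentOf(id) = reader.GetInt32(2)
                End While
            End Using
        End Using
        For Each id As Integer In ids
            If parentOf.ContainsKey(id) AndAlso byId.ContainsKey(parentOf(id)) Then
                byId(parentOf(id)).Nodes.Add(byId(id)) ' child of a known parent
            Else
                target.Add(byId(id)) ' root node (ParentID is NULL or points nowhere)
            End If
        Next
    End Sub
End Module
```

With an open connection, saving would be SaveNodes(conn, TreeView1.Nodes, DBNull.Value), and loading would be TreeView1.Nodes.Clear() followed by LoadNodes(conn, TreeView1.Nodes). Because rows are inserted depth-first, ordering by GroupID should restore siblings in their original order. Note that GroupName is varchar(20), so longer menu names would need a wider column.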
 
