Finding element represented by treeview node?

Hi,
I am making an HTML editor, and in it I have what I call an "Elements Treeview", which is basically a TreeView view of your document: there is a node for the body tag, then child nodes for all tags inside it, then child nodes for all tags inside those, and so on. Now here comes the complicated bit. When a user double-clicks a node, I want the program to find the tag that node represents in the HTML source (which is displayed in a RichTextBox). Can anyone suggest how to do this?
 
How did you parse the document?
 
Basically, I load the HTML, which the user types into a RichTextBox, into WebBrowser1, and then use this code to show all the elements of WebBrowser1 on a TreeView:
VB.NET:
    ' Build the element tree once the browser has finished loading the document.
    ' Note: objDoc and objChild are late-bound (Object), so this needs Option Strict Off.
    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As Object, _
            ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        Dim objDoc As Object = WebBrowser1.Document.DomDocument
        If objDoc.hasChildNodes Then
            EnumerateHTMLChildren(objDoc)
        End If
        TreeView1.CollapseAll()
    End Sub

    ' Recursively add one TreeNode per DOM child of objParent.
    ' objNode is the TreeNode that represents objParent, or Nothing at the root.
    Private Sub EnumerateHTMLChildren(ByRef objParent As Object, Optional ByRef objNode As TreeNode = Nothing)
        Dim m_intChildCount As Integer
        Dim objChild As Object = Nothing
        Dim objNewNode As TreeNode = Nothing
        For Each objChild In objParent.ChildNodes
            m_intChildCount += 1
            ' Key is "TAG" plus the child's position among its siblings; text is the tag name.
            If Not objNode Is Nothing Then
                objNewNode = objNode.Nodes.Add("TAG" & m_intChildCount.ToString, objChild.nodeName)
            Else
                objNewNode = TreeView1.Nodes.Add("TAG" & m_intChildCount.ToString, objChild.nodeName)
            End If
            Application.DoEvents()
            If objChild.hasChildNodes Then
                EnumerateHTMLChildren(objChild, objNewNode)
            End If
        Next
    End Sub
What I want now is that when a user double-clicks any node in the TreeView, my program finds and selects the tag in the RichTextBox that the node represents. (Note that identifying the element by its name or ID will not be reliable enough; some webpages contain elements with no ID.)
 
A quick and nasty hack I can think of is to store a reference to the HTML node in the .Tag of the TreeNode. When the user double-clicks, get the Tag, cast it back to an HTML DOM node, get its outerHTML and search for that in the RichTextBox text (with String.IndexOf).

The thing I want to know is: when do you update your TreeView?
 
Oh, sorry about forgetting to mention that.
I was starting to try, but one thing I don't understand is:
"store reference to the html node in the .Tag of the treenode"
I don't understand what you mean by that. What do you mean by a "reference to the html node"?
Also (and I don't mean to insult your idea; it's smarter than my no idea!), I have often seen pages that have text like the following:
By using our service you will receive these benefits:
<b>FREE</b> ....... for 1 year
<b>FREE</b> ....... for ever.
The problem being that a lot of pages will, in some way or another, have more than 1 of the same tag with the same outerHTML. Again, it's a great idea, and I'm sure that it'll be perfect once that problem is solved.:D
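
To make the duplicate problem concrete, here is a minimal console sketch (the module and variable names are made up for illustration). A plain IndexOf search always lands on the first copy of the text, so a double-click on the second FREE node would select the wrong tag unless the search is told where to start:

```vbnet
Imports System

Module DuplicateTagSketch
    Sub Main()
        ' Two elements with identical outerHTML, as in the example above.
        Dim source As String = "<b>FREE</b> for 1 year <b>FREE</b> for ever."
        Dim tag As String = "<b>FREE</b>"

        ' A plain search can only ever report the first occurrence (offset 0 here).
        Dim first As Integer = source.IndexOf(tag, StringComparison.Ordinal)

        ' Reaching the second copy needs a start offset, i.e. knowledge of
        ' where in the source the wanted element lives.
        Dim second As Integer = source.IndexOf(tag, first + 1, StringComparison.Ordinal)

        Console.WriteLine("first match at " & first.ToString())
        Console.WriteLine("second match at " & second.ToString())
    End Sub
End Module
```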
 
GUI controls have, for a long time, had a .Tag property of type Object, i.e. it can store anything you want.

If you have a TreeView holding 1000 nodes, then each of those nodes' Tag properties can hold something. Note that this has nothing to do with HTML tags.


Try it:

Dim tn As TreeNode = myTreeView.Nodes.Add("node key", "node text")
tn.Tag = New MyObject()


In the node's click event handler (e.g. NodeMouseDoubleClick):

Dim x As MyObject = DirectCast(e.Node.Tag, MyObject)


-

You're right about the identical outerHTML problem; there isn't a nice solution because you're using components for something they weren't meant to do.
If you build the tree dynamically each time the user shows it, then I'd be more tempted to parse the document myself, recording the character position of each tag. Keeping this list up to date is a nightmare though, so you might not want to make it real-time.

You could, however, build your own document object model. It depends how far you want to go.
 
OK. I have set the .Tag of every node to the HTML element, so in the above code I use "objNewNode.Tag = objChild". Now, is there any way to use the WebBrowser control to locate that element in the source code (i.e. where the tag begins)? Then it would be a simple matter of setting the RichTextBox's SelectionStart to that number.
 
I don't think there is when you are using the WebBrowser to create the tree. If you review the innerHTML of the different elements you will see that it differs from the source code, because it is the parsed DOM representation of that node's child nodes. When the source text is parsed it is usually also modified to suit the object model (the document and its tags are completed, etc.). You could, as suggested, parse the content yourself and present your own node tree. Getting all the HTML tags is very easy, but turning that flat list into a nested tree takes some more analysis. Here's a start that only gets all tags (input is the string of all HTML):
VB.NET:
' Requires: Imports System.Text.RegularExpressions
Dim exp As String = "</?.+?/?>"
Dim tags As MatchCollection = Regex.Matches(input, exp, RegexOptions.Singleline)
For Each m As Match In tags
    MsgBox(m.Value)
Next
If you manage to do it, remember to keep each tag's index (m.Index).
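
For the "flat list into a tree" step, one possible approach is a stack of currently open tags. The following is only a rough sketch under some assumptions, not a real HTML parser: TagNode, BuildTree and the short VoidTags list are hypothetical names invented here, the pattern is a stricter variant of the one above that skips comments and the doctype, and attribute values containing ">" or badly nested markup will confuse it.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

' Hypothetical node type: one entry per start tag found in the source.
Class TagNode
    Public TagName As String
    Public Index As Integer     ' character offset of the start tag (m.Index)
    Public Length As Integer    ' length of the start tag text (m.Length)
    Public Children As New List(Of TagNode)
End Class

Module TagTreeSketch
    ' Elements assumed never to have an end tag (illustrative, not exhaustive).
    Private ReadOnly VoidTags As String() = {"br", "hr", "img", "input", "link", "meta"}

    Function BuildTree(ByVal input As String) As TagNode
        Dim root As New TagNode
        root.TagName = "#document"
        Dim openTags As New Stack(Of TagNode)
        openTags.Push(root)
        ' Group 1: "/" for an end tag; group 2: tag name; group 3: "/" if self-closed.
        Dim exp As String = "<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*?(/?)>"
        For Each m As Match In Regex.Matches(input, exp, RegexOptions.Singleline)
            Dim tagName As String = m.Groups(2).Value.ToLowerInvariant()
            If m.Groups(1).Value = "/" Then
                ' End tag: pop back to the matching open tag; ignore strays.
                If IsOpen(openTags, tagName) Then
                    While openTags.Peek().TagName <> tagName
                        openTags.Pop()
                    End While
                    openTags.Pop()
                End If
            Else
                ' Start tag: becomes a child of whatever is open right now.
                Dim node As New TagNode
                node.TagName = tagName
                node.Index = m.Index
                node.Length = m.Length
                openTags.Peek().Children.Add(node)
                Dim selfClosed As Boolean = (m.Groups(3).Value = "/") _
                    OrElse Array.IndexOf(VoidTags, tagName) >= 0
                If Not selfClosed Then openTags.Push(node)
            End If
        Next
        Return root
    End Function

    Private Function IsOpen(ByVal openTags As Stack(Of TagNode), ByVal tagName As String) As Boolean
        For Each t As TagNode In openTags
            If t.TagName = tagName Then Return True
        Next
        Return False
    End Function

    ' Print the tree with indentation and each tag's source offset.
    Sub Dump(ByVal node As TagNode, ByVal depth As Integer)
        Console.WriteLine(New String(" "c, depth * 2) & node.TagName & " @" & node.Index.ToString())
        For Each child As TagNode In node.Children
            Dump(child, depth + 1)
        Next
    End Sub

    Sub Main()
        Dim html As String = "<html><body><p><b>FREE</b> for 1 year<br><b>FREE</b> for ever.</p></body></html>"
        Dump(BuildTree(html), 0)
    End Sub
End Module
```

Because every node keeps its own Index, two tags with identical text stay distinguishable, which the outerHTML search cannot do.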
 
I have created a sample project that parses markup code into a kind of DOM tree and displays it in a TreeView control. When you click a tree node, the (start) tag is selected in the RichTextBox.
 

Attachments

  • vbnet20-hps.zip (14.7 KB)
"Now, is there any way to use the WebBrowser control to locate that element in the source code (i.e. where the tag begins)? Then it would be a simple matter of setting the RichTextBox's SelectionStart to that number."

No, because that's not the way the web browser works. Like JohnH's code (probably), it reads in the HTML and turns it into an internally represented tree of nodes (like your TreeView).
At this point it loses all concept of "character position" within the HTML stream, because there is no HTML stream any more.

Build your own parser; it's quite simple using regular expressions! :D
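
As a rough illustration of how the pieces could fit together (this is not the attached sample project, just a sketch with made-up names such as TagLocatorForm, sourceBox and tagTree), the form below lists every start tag in a flat TreeView, stores the regex Match in each node's .Tag, and selects the tag in the RichTextBox on double-click. It matches against the RichTextBox's own Text so that the offsets agree with what Select expects, and it assumes the tree is rebuilt whenever the text changes, since the stored offsets go stale after any edit.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports System.Windows.Forms
Imports Microsoft.VisualBasic

' Hypothetical form: a TreeView of start tags next to a RichTextBox holding the source.
Public Class TagLocatorForm
    Inherits Form

    Private ReadOnly sourceBox As New RichTextBox()
    Private ReadOnly tagTree As New TreeView()

    Public Sub New()
        tagTree.Dock = DockStyle.Left
        sourceBox.Dock = DockStyle.Fill
        ' Keep the selection visible while the TreeView has the focus.
        sourceBox.HideSelection = False
        Controls.Add(sourceBox)
        Controls.Add(tagTree)
        AddHandler tagTree.NodeMouseDoubleClick, AddressOf TagTree_NodeMouseDoubleClick

        sourceBox.Text = "<p><b>FREE</b> for 1 year</p>" & vbLf & "<p><b>FREE</b> for ever.</p>"
        RebuildTree()
    End Sub

    ' Re-scan the source and create one (flat) node per start tag.
    Private Sub RebuildTree()
        tagTree.Nodes.Clear()
        ' Match against sourceBox.Text itself so the offsets line up with Select().
        For Each m As Match In Regex.Matches(sourceBox.Text, "<[A-Za-z][^>]*>", RegexOptions.Singleline)
            Dim node As TreeNode = tagTree.Nodes.Add(m.Value)
            node.Tag = m    ' the Match carries Index and Length
        Next
    End Sub

    ' Select the start tag that the double-clicked node was created from.
    Private Sub TagTree_NodeMouseDoubleClick(ByVal sender As Object, ByVal e As TreeNodeMouseClickEventArgs)
        Dim m As Match = TryCast(e.Node.Tag, Match)
        If m Is Nothing Then Return
        sourceBox.Select(m.Index, m.Length)
        sourceBox.ScrollToCaret()
    End Sub

    <STAThread()> _
    Public Shared Sub Main()
        Application.Run(New TagLocatorForm())
    End Sub
End Class
```

Double-clicking the second "<b>" node selects the second "<b>" in the source, because the node remembers its own offset rather than searching for matching text.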
 