pull the html code of a website

cjohnson

I need to write a VB.NET program that can pull the HTML code of a website, make a change (e.g. username/password), resubmit it, and get the resulting page back. I can get the HTML, but can anyone tell me how to resubmit the edited code and get the resulting page?

Thanks,
Chris
 
Display the webpage in a WebBrowser control, then access and modify it through the Document property. The page will change on the fly.
 
Thank you very much for the quick reply. I am trying this now. I have the webpage in the WebBrowser, but when I try to do anything with the Document property, I get a "no instance of the object" error. Can you tell me how to edit the HTML here?

Thanks again,
Chris
 
VB.NET:
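Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    ' Document is Nothing until a page has loaded, so guard before using it.
    If WebBrowser1.Document Is Nothing Then Return
    ' Point every link on the page at the forum and update its visible text.
    For Each anchor As HtmlElement In WebBrowser1.Document.GetElementsByTagName("a")
        anchor.SetAttribute("href", "http://www.vbdotnetforums.com")
        anchor.InnerHtml = "www.vbdotnetforums.com"
    Next
End Sub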
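
The "no instance" error is most likely caused by reading Document before the page has finished loading. Here is a minimal sketch that waits for the DocumentCompleted event instead of a button click. It assumes a Windows Forms project (so System.Windows.Forms is already imported) and a WebBrowser control named WebBrowser1.

```vbnet
' Assumes a WebBrowser control named WebBrowser1 on the form.
Private Sub WebBrowser1_DocumentCompleted(ByVal sender As Object, _
        ByVal e As WebBrowserDocumentCompletedEventArgs) _
        Handles WebBrowser1.DocumentCompleted
    ' The event fires once the page has loaded; the check is kept as a safety net.
    Dim doc As HtmlDocument = WebBrowser1.Document
    If doc Is Nothing Then Return

    ' Same edit as in the button handler: rewrite every link on the page.
    For Each anchor As HtmlElement In doc.GetElementsByTagName("a")
        anchor.SetAttribute("href", "http://www.vbdotnetforums.com")
    Next
End Sub
```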
 
Thanks again. You have been a huge help. I have one other question. This is what I am trying to do. I have the following code in the webpage:

HTML:
<script type=text/javascript>
    function viewerIsLoggedIn() { return False; }
</script>

I would like to change the False to True. Here is the code I have:

VB.NET:
        If (WebBrowser.Document IsNot Nothing) Then
            For Each Anchor As HtmlElement In WebBrowser.Document.GetElementsByTagName("script")
            Next
        End If

Am I on the right track? Once I access the script elements, I don't know which properties to view or edit to get at the function.

Thanks very much,
Chris
 
There aren't any properties for the script content; it's just plain text. I would search the InnerText for the keyword "function viewerIsLoggedIn" to make sure it is the correct script tag, then continue the string search from that index to find the first "return False;" and replace it with "return True;".
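
A rough sketch of that string search follows. PatchLoginScript is a hypothetical helper name, not part of any library. It would sit inside the form class or a module and takes the script text as a plain string.

```vbnet
' Hypothetical helper: replaces the first "return False;" that follows the
' "function viewerIsLoggedIn" marker and leaves everything else untouched.
Public Function PatchLoginScript(ByVal script As String) As String
    Const marker As String = "function viewerIsLoggedIn"
    Const oldText As String = "return False;"
    Const newText As String = "return True;"

    If script Is Nothing Then Return Nothing

    ' Confirm this is the right script block.
    Dim start As Integer = script.IndexOf(marker, StringComparison.Ordinal)
    If start < 0 Then Return script

    ' Search only from the marker onwards.
    Dim pos As Integer = script.IndexOf(oldText, start, StringComparison.Ordinal)
    If pos < 0 Then Return script

    Return script.Substring(0, pos) & newText & script.Substring(pos + oldText.Length)
End Function
```

If either the marker or the "return False;" text is missing, the input is returned unchanged, so the helper is safe to call on every script element.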

Do beware if this is not your own page. Doing something like this with a public service would probably be classified as illegal "breaking and entering", by the same rules as in the physical world.
 
Thanks very much. The InnerText was blank, but the InnerHtml has what I was looking for. Here is what I have:

VB.NET:
        If (WebBrowser.Document IsNot Nothing) Then
            For Each Anchor As HtmlElement In WebBrowser.Document.GetElementsByTagName("script")
                If Anchor.InnerHtml IsNot Nothing Then
                    If Anchor.InnerHtml.Contains("viewerIsLoggedIn") Then
                        MsgBox(Anchor.InnerHtml)
                        Dim Str As String = Anchor.InnerHtml.Replace("False", "True")
                        Anchor.InnerHtml = Str
                    End If
                End If
            Next
        End If

But I am getting the error "Property is not supported on this type of HtmlElement." Any ideas?

Thanks again!
 
I get the same error in this case. The DOM appears to be more limited for modification than I first thought; at least I can find no way around it right now with the available methods of the element tree.

What you can do, if the page has no dependencies, is grab the DocumentText, change it, and put it back. This will load the modified page as a local document.
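
Something along these lines might work as a sketch of the DocumentText approach. It assumes a WebBrowser control named WebBrowser1 with the page already loaded, plus a hypothetical button named PatchButton to trigger the edit.

```vbnet
' PatchButton is a hypothetical button; WebBrowser1 is assumed to hold the page.
Private Sub PatchButton_Click(ByVal sender As Object, ByVal e As EventArgs) _
        Handles PatchButton.Click
    If WebBrowser1.Document Is Nothing Then Return

    Dim html As String = WebBrowser1.DocumentText
    Dim patched As String = html.Replace("return False;", "return True;")

    ' Only reload when something actually changed. Assigning DocumentText
    ' loads the edited markup as a new local document.
    If patched <> html Then
        WebBrowser1.DocumentText = patched
    End If
End Sub
```

Because the page is reloaded from a string rather than from its original address, relative links, images, and scripts may no longer resolve.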
 
I looked into this again because of another thread and figured it out: you have to set the "text" attribute on the script HtmlElement to change the script source. I think the reason is that WebBrowser uses the IHtmlScriptElement interface and not the HtmlScriptElement class.
VB.NET:
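For Each script As HtmlElement In Me.WebBrowser1.Document.GetElementsByTagName("script")
    ' InnerHtml can be Nothing for some script elements, so check before searching.
    If script.InnerHtml IsNot Nothing AndAlso script.InnerHtml.Contains("viewerIsLoggedIn") Then
        Dim repl As String = script.InnerHtml.Replace("False", "True")
        ' Setting InnerHtml on a script element throws; set the "text" attribute instead.
        script.SetAttribute("text", repl)
    End If
Next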
 