Problem with HttpWebRequest in .NET 2.0

kpao

I'm developing a website that captures the HTML source of other websites. The following is a fragment of my code:
VB.NET:
    Protected Sub btnUpdate_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles btnUpdate.Click
        Dim myReq As HttpWebRequest = CType(WebRequest.Create("http://www.xml.org/xml/news/archives/archive.02272006.shtml"), HttpWebRequest)
        Dim reader As StreamReader = New StreamReader(myReq.GetResponse.GetResponseStream())
        Dim output As String = Nothing
        While reader.ReadLine() <> Nothing
            output += reader.ReadLine()
        End While
        txtOutput.Text = output


    End Sub

The above code runs in ASP.NET 2.0. It captures part of the HTML source, but it cannot capture the HTML <body> of this website.

When I copy the same code to ASP.NET 1.1, it captures the whole HTML source.

Why can't HttpWebRequest retrieve the whole HTML source?

("http://www.xml.org/xml/news/archives/archive.02272006.shtml")
 
There are two critical problems with your code:
  • First, every call to StreamReader.ReadLine consumes a line. You call it once in the loop condition and again in the loop body, so you only keep every other line.
  • Second, you use the "<>" value operator, which treats Nothing as String.Empty (""), so the loop also stops as soon as it encounters an empty line. If you had used the "IsNot" reference operator, you could have distinguished between an actual Nothing (end of stream) and an empty line. There is also an EndOfStream property.
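To make the two problems above concrete, here is a minimal sketch that runs both loops over an in-memory string instead of a web response. The module name, the function names and the sample text are made up for illustration; StringReader stands in for the StreamReader so no network access is needed.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module ReadLinePitfalls
    ' Hypothetical sample standing in for a downloaded page; the third line is empty.
    Private ReadOnly Sample As String = "line1" & vbLf & "line2" & vbLf & "" & vbLf & "line4"

    ' Same loop shape as in the question: ReadLine is called twice per iteration.
    Function ReadLikeOriginal(ByVal text As String) As String
        Dim reader As New StringReader(text)
        Dim output As String = Nothing
        While reader.ReadLine() <> Nothing   ' consumes a line only to test it
            output &= reader.ReadLine()      ' consumes and keeps the next line
        End While
        Return output
    End Function

    ' One ReadLine per iteration, compared by reference with IsNot.
    Function ReadEveryLine(ByVal text As String) As String
        Dim reader As New StringReader(text)
        Dim sb As New StringBuilder()
        Dim line As String = reader.ReadLine()
        While line IsNot Nothing             ' Nothing means end of input
            sb.Append(line).Append(vbLf)
            line = reader.ReadLine()
        End While
        Return sb.ToString()
    End Function

    Sub Main()
        ' "line1" is tested and dropped, "line2" is kept,
        ' then the empty line ends the loop before "line4" is reached.
        Console.WriteLine(ReadLikeOriginal(Sample))   ' prints "line2"
        Console.Write(ReadEveryLine(Sample))          ' prints all four lines
    End Sub
End Module
```

With the sample text, the first loop returns only "line2", which is the same kind of truncation you see when the page contains a blank line before the <body>.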
Using the "+=" operator for string concatenation can be a pitfall. It's highly unlikely to cause trouble here, but it's bad programming practice for strings (if both operands were numbers they would be added, not combined as strings). The correct operator for strings is "&=". A StringBuilder is also better for building strings, especially from many lines as here.
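As a small illustration of the operator difference, the sketch below assumes Option Strict Off (with Option Strict On the "+=" line does not compile at all, which is arguably the safer setting). The module and function names are made up for the example.

```vbnet
Option Strict Off
Imports System

Module ConcatDemo
    ' "+=" with a numeric right-hand side: the string is converted to a number and added.
    Function PlusEquals() As String
        Dim s As String = "1"
        s += 2
        Return s
    End Function

    ' "&=" always concatenates, whatever the operand types are.
    Function AmpEquals() As String
        Dim s As String = "1"
        s &= 2
        Return s
    End Function

    Sub Main()
        Console.WriteLine(PlusEquals())   ' prints "3"
        Console.WriteLine(AmpEquals())    ' prints "12"
    End Sub
End Module
```

When both operands are declared As String, as in the question, "+=" does concatenate, so this only bites once a numeric value gets mixed in.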

You also append the lines without any line breaks. Since you output to a UI element (a textbox), it's reasonable to assume the user wants to see the source code properly formatted, so you could add the vbNewLine string constant between the lines. The code sample below uses the AppendLine method, which adds the line breaks.

Here is working code for the loop:
VB.NET:
Dim sb As New System.Text.StringBuilder
While Not reader.EndOfStream
    sb.AppendLine(reader.ReadLine)
End While
txtOutput.Text = sb.ToString
The StringBuilder.AppendLine method is new in .NET 2.0; if you have to target .NET 1.x, use:
sb.Append(reader.ReadLine & vbNewline)
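For completeness, here is one way the whole download could be put together. This is only a sketch: PageFetcher, ReadAllLines and FetchHtml are names invented for the example, not part of the framework. The loop tests the return value of ReadLine with "Not ... Is Nothing" instead of relying on EndOfStream or AppendLine, so the loop itself should not depend on newer members. The Using blocks, which close the response and the reader, do require VB 2005 / .NET 2.0.

```vbnet
Imports System
Imports System.IO
Imports System.Net
Imports System.Text

Module PageFetcher
    ' Reads every line from any TextReader and keeps empty lines.
    Function ReadAllLines(ByVal reader As TextReader) As String
        Dim sb As New StringBuilder()
        Dim line As String = reader.ReadLine()
        Do While Not line Is Nothing          ' Nothing means end of stream
            sb.Append(line).Append(vbNewLine)
            line = reader.ReadLine()
        Loop
        Return sb.ToString()
    End Function

    ' Hypothetical helper: downloads the page at url and returns its source.
    Function FetchHtml(ByVal url As String) As String
        Dim request As HttpWebRequest = CType(WebRequest.Create(url), HttpWebRequest)
        Using response As WebResponse = request.GetResponse()
            Using reader As New StreamReader(response.GetResponseStream())
                Return ReadAllLines(reader)
            End Using
        End Using
    End Function

    Sub Main()
        ' Offline check with an in-memory reader; no network access needed.
        Console.Write(ReadAllLines(New StringReader("<html>" & vbLf & "" & vbLf & "<body>")))
    End Sub
End Module
```

In the button handler from the question this would reduce to txtOutput.Text = FetchHtml("http://www.xml.org/xml/news/archives/archive.02272006.shtml").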
 