Downloading a web page and saving it as .txt

grags

Hi, the title says it all really. Can someone please tell me how to connect to a web page, download the HTML, and save it as a .txt file?

I need to download information from a table, but I wouldn't know where to begin doing it properly :confused:. So I'm going to try to filter through the HTML with a huge string-manipulation program and get the values that way.

If I've not made myself clear, please ask and I will try to be clearer. I'm using VB.NET 2003.


Note to moderators: If I've posted this in the wrong forum, please accept my apologies; could it be moved to the correct forum? Thank you :)
 
This should get you started:
VB.NET:
Dim web As New Net.WebClient
Dim bytes() As Byte = web.DownloadData("http://kkserv")
Dim text As String = System.Text.Encoding.Default.GetString(bytes)
System.IO has various classes you can use to save the text to a file.
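As a minimal sketch of the saving step, assuming a plain StreamWriter is acceptable: the module name and the SavePageAsText and SaveText helpers are made up for illustration, and the URL and file path are whatever you pass in.

```vbnet
Imports System.IO
Imports System.Net
Imports System.Text

Module SavePageSketch
    ' Hypothetical helper: downloads the page at url and saves its text to path.
    Sub SavePageAsText(ByVal url As String, ByVal path As String)
        Dim web As New WebClient
        Dim bytes() As Byte = web.DownloadData(url)
        ' Same decoding as the snippet above: the system default code page.
        Dim text As String = Encoding.Default.GetString(bytes)
        SaveText(text, path)
    End Sub

    ' Hypothetical helper: writes text to path, overwriting any existing file.
    Sub SaveText(ByVal text As String, ByVal path As String)
        Dim writer As New StreamWriter(path, False, Encoding.Default)
        Try
            writer.Write(text)
        Finally
            ' Close the file even if the write fails.
            writer.Close()
        End Try
    End Sub
End Module
```

You would call it as something like SavePageAsText(yourUrl, yourPath). Encoding.Default is only a guess at the page's encoding; if the saved text shows garbled characters, a different Encoding may be needed.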

To get info from HTML elements, it might be easier to use the ActiveX WebBrowser control to load the page, then use the MSHTML COM library to work with the document object model. You will find plenty of information about this if you search the internet, because the approach has been in use since at least VB6.
 
Thank you.

I'm kind of new to this .NET stuff. I'm used to doing everything myself, as I did in old BASIC on the Commodore 64 and in QBasic, which I used for a number of years, mainly just for fun. I have written a few short programs in VB.NET, but I find I'm doing things the long way round when there's a simple command that does it for me. :eek:

I'm not quite sure what you mean by this 'ActiveX WebBrowser control'?

Also, I greatly appreciate the help :)
 
OK, this is my code so far...


gfPath is a global function that returns the application path.
cvsEliteMembers(1000) is a class-level String array.

VB.NET:
        Dim web As New Net.WebClient
        Dim lviT, lviJ, lviD, lviY, lviX As Integer
        Dim strLevel, strName As String
        Dim lvbOK As Boolean

        'Download level 1 at this point for names only.
        strLevel = System.Text.Encoding.Default.GetString(web.DownloadData("http://www.the-elite.net/GE/stage1.htm"))

        FileOpen(1, gfPath() + "temp.txt", OpenMode.Output)
        'Get Names
        For lviT = 1 To Len(strLevel)
            If Len(strLevel) < 20 Then Exit For
            If Mid(strLevel, lviT, 16) = "<td><font color=" Then
                strLevel = Mid(strLevel, lviT + 16, Len(strLevel))
                For lviJ = 1 To Len(strLevel)
                    If Mid(strLevel, lviJ, 1) = ">" Then
                        strLevel = Mid(strLevel, lviJ + 1, Len(strLevel))
                        For lviD = 1 To Len(strLevel)
                            If Mid(strLevel, lviD, 1) = "<" Then
                                strName = Mid(strLevel, 1, lviD - 1)
                                lviT = 1
                                lvbOK = True
                                For lviY = 1 To 1000
                                    If cvsEliteMembers(lviY) = strName Then
                                        lvbOK = False
                                        Exit For
                                    End If
                                Next
                                If lvbOK Then
                                    lviX += 1
                                    cvsEliteMembers(lviX) = strName
                                    Write(1, strName)
                                End If
                                Exit For
                            End If
                        Next
                        Exit For
                    End If
                Next
            End If
        Next
        FileClose()

The program searches through the HTML for "<td><font color=" (I know the first name comes shortly after it).
It then searches for ">" (I know the next character in the string is the beginning of the first name).
It then searches for "<" (everything before it is the name, which goes into strName).
It keeps trimming the string, so when it loops back to lviT it starts again from the beginning of what is left.

As you can see, it's very complex and probably the wrong way to go about what I'm trying to do... If you can help me find an easier way to get the names from http://www.the-elite.net/GE/stage1.htm, I would be in your debt for the rest of my living days :D

EDIT: You might be thinking, "If the code works, why do you still need our help???" If you look back at my previous posts, you'll see I have already asked this question before. I actually wrote the program and it worked a treat, but after a couple of weeks the site changed its HTML slightly, which made my program useless :(
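
For comparison, the three-step search described in this post (find the marker, then ">", then "<") can be sketched with String.IndexOf instead of nested Mid loops. This is only a sketch under the same assumption as the code above, namely that every name follows a "<td><font color=" marker; the module and the ExtractNames function are hypothetical names, and it returns an ArrayList rather than filling cvsEliteMembers.

```vbnet
Imports System.Collections

Module NameScanSketch
    ' Hypothetical helper: returns the distinct names found after each
    ' "<td><font color=" marker, in the order they first appear.
    Function ExtractNames(ByVal html As String) As ArrayList
        Const Marker As String = "<td><font color="
        Dim names As New ArrayList
        Dim pos As Integer = html.IndexOf(Marker)
        Do While pos >= 0
            ' The name starts right after the ">" that closes the font tag.
            Dim nameStart As Integer = html.IndexOf(">", pos + Marker.Length)
            If nameStart < 0 Then Exit Do
            nameStart += 1
            ' The name ends at the next "<".
            Dim nameEnd As Integer = html.IndexOf("<", nameStart)
            If nameEnd < 0 Then Exit Do
            Dim found As String = html.Substring(nameStart, nameEnd - nameStart)
            ' Skip names that have already been collected.
            If Not names.Contains(found) Then names.Add(found)
            ' Continue the search after this name.
            pos = html.IndexOf(Marker, nameEnd)
        Loop
        Return names
    End Function
End Module
```

This depends on the page markup just as much as the original loops do, so it would still break if the site changes its HTML again; it only makes the search shorter and avoids re-trimming the string on every match.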
 