Parsing an HTML table?

dallas23

Does anyone know how to extract the information from this table and place it in a DataGridView in VB 2008-2010?

Dim regRegExStr As String = "<td>.*?</td>"
Dim Expressions As New List(Of String)

Expressions.Add(regRegExStr)

For Each s As String In Expressions
    Dim reg As New Regex(s)
    Dim m As Match = reg.Match(WebBrowser1.DocumentText)

    If m.Value.Trim <> "" Then
        Me.ListBox1.Items.Add( _
            Regex.Replace(m.Value, "<(.|\s|\r\n)+?>", String.Empty))
    End If
Next

End Sub



HTML:
<table id="history_page_table_browse_call" class="table table-bordered"><thead><tr><th>Data</th><th>Chiamante</th><th>Chiamante nascosto</th><th>E-Mail</th><th>SMS</th><th></th></tr></thead><tbody><tr><td>2013-06-21 19:08:59 </td><td>39070720811</td><td>39070720811</td></table>
 
Hi,

You can use the GetElementsByTagName method of an HTML document to get all the tags named "tr". This returns a collection of elements of type HtmlElement. You can then loop through this collection and call GetElementsByTagName again on each element to get a collection of its "td" tags. Finally, loop over that collection of "td" elements; the InnerText property of each element returns your data string.

By doing things this way you effectively parse the HTML table one record at a time and extract the data fields one field at a time. You can then do anything you want with the extracted data fields.
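
As a rough illustration of the two nested loops described above, here is a minimal sketch. The module name, the function name ReadTableRows and its parameters are made up for the example, and looking the table up by its id first (via GetElementById) is an extra assumption so that rows from other tables on the page are not picked up.

```vbnet
Imports System.Collections.Generic
Imports System.Windows.Forms

' Hypothetical helper module; the names here are illustrative only.
Public Module HtmlTableReader
    ' Returns one String array per table row, one entry per <td> cell.
    ' Rows without any <td> cell (for example a header row of <th> cells) are skipped.
    Public Function ReadTableRows(ByVal doc As HtmlDocument, ByVal tableId As String) As List(Of String())
        Dim rows As New List(Of String())
        If doc Is Nothing Then Return rows

        ' Assumption: the table can be located by its id attribute.
        Dim table As HtmlElement = doc.GetElementById(tableId)
        If table Is Nothing Then Return rows

        ' Outer loop: one record (<tr>) at a time.
        For Each row As HtmlElement In table.GetElementsByTagName("tr")
            Dim cells As HtmlElementCollection = row.GetElementsByTagName("td")
            If cells.Count = 0 Then Continue For

            ' Inner loop: one field (<td>) at a time.
            Dim fields(cells.Count - 1) As String
            For i As Integer = 0 To cells.Count - 1
                Dim cellText As String = cells(i).InnerText
                fields(i) = If(cellText Is Nothing, String.Empty, cellText.Trim())
            Next
            rows.Add(fields)
        Next
        Return rows
    End Function
End Module
```

With the HTML posted above, this could be called from the WebBrowser's DocumentCompleted event handler as ReadTableRows(WebBrowser1.Document, "history_page_table_browse_call"), so that the page has finished loading before it is read.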

Have a look at this for more information:

HtmlDocument.GetElementsByTagName Method (System.Windows.Forms)

Hope that helps.

Cheers,

Ian
 
OK, I tried it, starting from an example:

Dim Data(2) As String
Dim Index As Integer = 0
Dim Output As Boolean = False
TextBox1.Text = ""
For Each ELement As System.Windows.Forms.HtmlElement In WebBrowser1.Document.All
    If UCase(ELement.TagName.ToString).Contains("TD") And Output = True Then
        Data(Index) = ELement.InnerText
        Index = 1
    ElseIf UCase(ELement.TagName.ToString).Contains("TR") And Output = True Then
        TextBox1.Text = TextBox1.Text & Data(0) & "," &
            Data(1) & "," &
            Data(2) & vbNewLine
        Index = 0
    End If

    Output = True
Next

This is the table:
Cattura.PNG

The output is:
,,,,
2013-06-21 19:08:59 ,Rimuovi,
2013-06-20 02:24:12 ,Rimuovi,
2013-06-14 21:59:30 ,Rimuovi,
2013-06-14 20:01:26 ,Rimuovi,
2013-06-14 11:26:28 ,Rimuovi,
2013-06-12 19:51:14 ,Rimuovi,
2013-06-12 18:57:22 ,Rimuovi,
2013-06-11 09:43:20 ,Rimuovi,
2013-06-03 11:21:28 ,Rimuovi,

Please help me.
 

Hi,

I am not really sure what else to say. I have already described how to do this, and your screenshot confirms that the suggestion will work well if you follow the advice given. The only thing to add is that, in the second loop, you only want the first 3 elements in the collection of "td" tags.
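
To make that last point concrete, here is a small sketch that keeps only the first three "td" cells of each row and adds them to a DataGridView. The names TableToGrid and FillGrid are invented for the example, and it assumes the grid already has at least three columns defined (Rows.Add fails on a grid with no columns).

```vbnet
Imports System.Windows.Forms

' Hypothetical helper module; the names here are illustrative only.
Public Module TableToGrid
    ' Adds one grid row per HTML row, using only the first three <td> cells.
    ' Assumption: the grid already has at least three columns.
    Public Sub FillGrid(ByVal doc As HtmlDocument, ByVal grid As DataGridView)
        If doc Is Nothing Then Return

        For Each row As HtmlElement In doc.GetElementsByTagName("tr")
            Dim cells As HtmlElementCollection = row.GetElementsByTagName("td")

            ' Rows with fewer than three <td> cells (such as the <th> header row) are skipped.
            If cells.Count < 3 Then Continue For

            ' Any further cells (E-Mail, SMS, the "Rimuovi" link) are ignored.
            grid.Rows.Add(cells(0).InnerText, cells(1).InnerText, cells(2).InnerText)
        Next
    End Sub
End Module
```

Note that this walks every "tr" in the document, so if the page contains other tables their rows would be included as well.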

Hope that helps.

Cheers,

Ian
 