How to extract text from string

amitguitarplayer

Hi Guys,

I have some text stored in a String, str (see the attached htmlsource.txt). How can I extract the highlighted text (the Latitude and Longitude values) from the string? I have tried most of the string functions, but nothing seems to help!

Please help (the relevant text in the string is below).

HTML:
Latitude: 40.570742<br />Longitude: -74.337352<br />
 

Attachments

  • htmlsource.txt
Not sure if this is specifically what you wanted, but why not use Substring combined with IndexOf()? That should let you grab everything from the latitude through the longitude.

You might also try looking into the HtmlDocument class and its COM counterpart. I've used the HTML DOM (as explained at w3schools.com) to parse a good number of HTML documents. From the HTML you list, it appears the latitude and longitude are contained in the first <table>, which has only one row and two columns, so it should be pretty quick to dissect with HtmlDocument.

(Granted, for the HtmlDocument method to work you'll need a WebBrowser control, because for some IDIOTIC reason VB.NET won't let you inherit HtmlDocument, nor will it let you create one manually. Sometimes all you need is an HTML parser, and leave it to VB to prevent that from being easy.)

However, if that is out of the picture, there are quite a few String methods that make it easy to pick that string apart, and given that you have enough unique markers in the string, you should be able to find what you are looking for:

VB.NET:
' Extracts the latitude and longitude from text such as
' "Latitude: 40.570742<br />Longitude: -74.337352<br />".
' Returns True only if both values were found and parsed.
Public Function GetLatLong(ByVal st As String, ByRef lat As Double, ByRef lng As Double) As Boolean
    If Not String.IsNullOrEmpty(st) Then
        Dim lat_start As Integer = st.IndexOf("latitude:", StringComparison.OrdinalIgnoreCase)
        Dim lng_start As Integer = st.IndexOf("longitude:", StringComparison.OrdinalIgnoreCase)
        If lat_start > -1 AndAlso lng_start > -1 Then
            ' Each value ends at the first "<br />" after its label.
            Dim lat_end As Integer = st.IndexOf("<br />", lat_start, StringComparison.OrdinalIgnoreCase)
            Dim lng_end As Integer = st.IndexOf("<br />", lng_start, StringComparison.OrdinalIgnoreCase)
            If lat_end > -1 AndAlso lng_end > -1 Then
                ' 9 and 10 are the lengths of "latitude:" and "longitude:".
                Dim sLat As String = st.Substring(lat_start + 9, lat_end - (lat_start + 9)).Trim()
                Dim sLng As String = st.Substring(lng_start + 10, lng_end - (lng_start + 10)).Trim()
                ' Note: Double.TryParse uses the current culture's decimal separator.
                Return Double.TryParse(sLat, lat) AndAlso Double.TryParse(sLng, lng)
            End If
        End If
    End If
    Return False
End Function
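
The same IndexOf/Substring idea can be pulled into a small reusable helper so the label lengths are not hard-coded. This is only a sketch: ExtractBetween and the module name are made up for illustration (they are not from this thread or from the framework), and the input is assumed to look like the sample string posted above.

```vbnet
Imports System

Module ExtractBetweenDemo
    ' Hypothetical helper (not a framework method): returns the trimmed text
    ' between startMarker and the next endMarker, or Nothing if either is missing.
    Function ExtractBetween(source As String, startMarker As String, endMarker As String) As String
        If String.IsNullOrEmpty(source) Then Return Nothing

        Dim startIndex As Integer = source.IndexOf(startMarker, StringComparison.OrdinalIgnoreCase)
        If startIndex < 0 Then Return Nothing
        startIndex += startMarker.Length ' skip past the marker itself

        Dim endIndex As Integer = source.IndexOf(endMarker, startIndex, StringComparison.OrdinalIgnoreCase)
        If endIndex < 0 Then Return Nothing

        Return source.Substring(startIndex, endIndex - startIndex).Trim()
    End Function

    Sub Main()
        ' Sample input copied from the question.
        Dim html As String = "Latitude: 40.570742<br />Longitude: -74.337352<br />"
        Console.WriteLine(ExtractBetween(html, "Latitude:", "<br />"))  ' prints 40.570742
        Console.WriteLine(ExtractBetween(html, "Longitude:", "<br />")) ' prints -74.337352
    End Sub
End Module
```

Because the end marker is searched for starting just after the label, the helper should keep working if the two values appear in a different order or if other text sits between them.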
 
While harder to understand than the string parsing solution Jaden provided, regular expressions are great for this sort of thing.

This will achieve basically the same thing, pulling out the Lat/Long, but uses .NET's regex engine, which handles this kind of pattern matching well (especially if you're doing this across a lot of these strings, which I presume you are).

VB.NET:
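' Requires "Imports System.Text.RegularExpressions" at the top of the file.
Dim myRegex As New Regex("(Latitude: |Longitude: )(.*?)<")

' Declared outside the loop so the values are still available afterwards.
Dim latitude As String = Nothing
Dim longitude As String = Nothing

Dim myMatches As MatchCollection = myRegex.Matches(str)
For Each myMatch As Match In myMatches
    ' Each match looks like "Latitude: 40.570742<"; drop the trailing "<".
    Dim matchText As String = myMatch.Value.TrimEnd("<"c)
    If matchText.StartsWith("Latitude: ") Then
        ' + 1 skips the space after the colon.
        latitude = matchText.Substring(matchText.IndexOf(" "c) + 1)
    ElseIf matchText.StartsWith("Longitude: ") Then
        longitude = matchText.Substring(matchText.IndexOf(" "c) + 1)
    Else
        'Bad Match
    End If
Next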
 
With the Regex example Raven65 posted, you can also get the values from the Groups without any string manipulation:
VB.NET:
' latitude and longitude are assumed to be declared as String before the loop.
For Each m As Match In myMatches
    Select Case m.Groups(1).Value
        Case "Latitude: "
            latitude = m.Groups(2).Value
        Case "Longitude: "
            longitude = m.Groups(2).Value
    End Select
Next
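
If you want the numbers as Double rather than String, the group approach can be combined with Double.TryParse. The following is a rough sketch, not something posted in this thread: the pattern, the named groups (lat, lng), and TryGetLatLong are all assumptions for illustration, and it assumes the latitude appears before the longitude as in the sample. It parses with the invariant culture so the "." decimal separator is read the same way on every machine.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module RegexLatLongDemo
    ' Hypothetical pattern: each named group captures an optionally signed decimal number.
    Private ReadOnly LatLongRegex As New Regex(
        "Latitude:\s*(?<lat>-?\d+(?:\.\d+)?)\s*<.*?Longitude:\s*(?<lng>-?\d+(?:\.\d+)?)\s*<",
        RegexOptions.IgnoreCase Or RegexOptions.Singleline)

    ' Returns True only if both values were found and parsed.
    Function TryGetLatLong(source As String, ByRef lat As Double, ByRef lng As Double) As Boolean
        If String.IsNullOrEmpty(source) Then Return False

        Dim m As Match = LatLongRegex.Match(source)
        If Not m.Success Then Return False

        Return Double.TryParse(m.Groups("lat").Value, NumberStyles.Float, CultureInfo.InvariantCulture, lat) AndAlso
               Double.TryParse(m.Groups("lng").Value, NumberStyles.Float, CultureInfo.InvariantCulture, lng)
    End Function

    Sub Main()
        ' Sample input copied from the question.
        Dim html As String = "Latitude: 40.570742<br />Longitude: -74.337352<br />"
        Dim lat As Double
        Dim lng As Double
        If TryGetLatLong(html, lat, lng) Then
            Console.WriteLine(lat.ToString(CultureInfo.InvariantCulture))
            Console.WriteLine(lng.ToString(CultureInfo.InvariantCulture))
        Else
            Console.WriteLine("No coordinates found")
        End If
    End Sub
End Module
```

Keeping the Regex in a single shared field means the pattern is only built once, which helps if you run this over many strings.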
 