Question Get all RegEx matches

sollniss

I have the source code of a website. Now I want to get all links from it using a RegEx pattern like this:

\b([\d\w\.\/\+\-\?\:]*)((ht|f)tp(s|)\:\/\/|[\d\d\d|\d\d]\.[\d\d\d|\d\d]\.|www\.|\.tv|\.ac|\.com|\.edu|\.gov|\.int|\.mil|\.net|\.org|\.biz|\.info|\.name|\.pro|\.museum|\.co)([\d\w\.\/\%\+\-\=\&\?\:\\\"\'\,\|\~\;]*)\b

The found links should be returned in a List(of String).
 
Loop over the result of Regex.Matches, adding each Match.Value to the List.
VB.NET:
For Each m As Match In Regex.Matches(source, pattern)
    links.Add(m.Value) '"links" is the List
Next
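
For completeness, here is a self-contained sketch of that loop. The module name, the GetMatches helper, the sample HTML and the deliberately simple pattern are all assumptions made for illustration; the pattern is not the one from the question.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module LinkDemo
    ' Collects the text of every regex match in source into a List(Of String).
    Function GetMatches(ByVal source As String, ByVal pattern As String) As List(Of String)
        Dim links As New List(Of String)
        For Each m As Match In Regex.Matches(source, pattern)
            links.Add(m.Value)
        Next
        Return links
    End Function

    Sub Main()
        ' Hypothetical sample input, not taken from the thread.
        Dim source As String = "<a href=""http://example.com/a"">A</a> <a href=""https://example.org/b"">B</a>"
        ' Simple illustrative pattern: scheme, then everything up to whitespace, a quote or an angle bracket.
        Dim pattern As String = "https?://[^\s""'<>]+"
        For Each link As String In GetMatches(source, pattern)
            Console.WriteLine(link)
        Next
    End Sub
End Module
```

Regex.Matches returns every non-overlapping match in the input, so the list ends up with one entry per link found, duplicates included.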
 
VB.NET:
    Private Function GetLinks(ByVal strSource As String) As List(Of String)
        Dim exp As String = "^(http|https|ftp)\://(((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])|([a-zA-Z0-9_\-\.])+\.(de|ru|fr|ch|com|net|org|edu|int|mil|gov|arpa|biz|aero|name|coop|info|pro|museum|uk|me))((:[a-zA-Z0-9]*)?/?([a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~])*)$"
        Dim strLinks As New List(Of String)

        For Each m As Match In Regex.Matches(strSource, exp)
            If strAllLinks.Contains(m.Value) = False And _
                   m.Value.Contains(m.Value) = False Then
                strLinks.Add(m.Value)
            End If
        Next

        Return strLinks
    End Function

It's not working. :/
 
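m.Value.Contains(m.Value) = False ?? Well, if it's "not working" then it must be "not working", I guess. Joke aside, that logic is wrong: a string always contains itself, so the condition is never true and nothing is ever added to the list.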
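
One possible fix, as a sketch only. It assumes the undeclared strAllLinks was meant to be the list being built (strLinks) and that the goal is simply to skip duplicates. Note also that the posted pattern starts with ^ and ends with $, so without RegexOptions.Multiline it can only match when the whole source string is a single URL. The shorter, unanchored pattern below is an assumption for illustration, not a replacement validated against the original expression.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module LinkHelper
    ' Returns each distinct link found in strSource, in order of first appearance.
    Function GetLinks(ByVal strSource As String) As List(Of String)
        ' Simplified, unanchored pattern (an assumption, see the note above).
        Dim exp As String = "(https?|ftp)://[^\s""'<>]+"
        Dim strLinks As New List(Of String)

        For Each m As Match In Regex.Matches(strSource, exp)
            ' Skip links that have already been collected.
            If Not strLinks.Contains(m.Value) Then
                strLinks.Add(m.Value)
            End If
        Next

        Return strLinks
    End Function
End Module
```

If the original, stricter pattern is preferred, removing the leading ^ and trailing $ is the minimum change needed for it to find links inside a larger document.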
 