digitaldrew
I have a page I'm trying to go through and pull a bunch of URLs from. There are about 50 URLs that should match my regex, and the regex seems to work properly, but for some reason it does not get every URL inside the content. Out of the entire page it only pulls 2 (the first one from each section)...
A basic idea of the Content
HTML:
<div class="fl" id="vListTop">
<div class="video"><a class="hRotator" href="urlhere"><div class="thumb_container" data-previewvideo="thumb"><img class="thumb" alt="here" src="source"><img class="hSprite" id="8094883" src="spacerGIF" sprite="JPGurl"></div><b>12:40</b><u>title</u></a><div class="hRate"><div class="fr">93%</div><div class="views-value">187,642</div></div></div><div class="video"><a class="hRotator" href="urlhere"><div class="thumb_container" data-previewvideo="thumb"><img class="thumb" alt="here" src="source"><img class="hSprite" id="8025312" src="spacerGIF" sprite="JPGurl"></div><b>13:17</b><u>title</u><div class="hSpriteHD"></div></a><div class="hRate"><div class="fr">98%</div><div class="views-value">69,747</div></div></div><div class="video"><a class="hRotator" href="urlhere"><div class="thumb_container" data-previewvideo="thumb"><img class="thumb" alt="here" src="source"><img class="hSprite" id="8139449" src="spacerGIF" sprite="JPGurl"></div><b>37:45</b><u>title</u><div class="hSpriteHD"></div></a><div class="hRate"><div class="fr">97%</div><div class="views-value">26,502</div></div></div><div class="clear"></div><div class="video"><a class="hRotator" href="urlhere"><div class="thumb_container" data-previewvideo="thumb"><img class="thumb" alt="here" src="source"><img class="hSprite" id="8096739" src="spacerGIF" sprite="JPGurl"></div><b>12:35</b><u>title</u><div class="hSpriteHD"></div></a><div class="hRate"><div class="fr">92%</div><div class="views-value">197,571</div></div></div><div class="clear"></div></div>
<div style="float: right;">
<div class="avdo " id="usefulInfoBlock">
<div class="frame"><div style="margin: 0px auto; width: 300px; height: 250px; overflow: hidden;"><iframe width="300" height="250" src="advertisement" frameborder="0" scrolling="no" sandbox="allow-same-origin allow-popups allow-forms allow-scripts"></iframe></div></div>
</div></div>
<div class="clear"></div>
<div class="video new-date"><div class="vDate">August 24, 2017</div><a class="hRotator" href="urlhere"><div class="thumb_container" data-previewvideo="thumb"><img class="thumb" alt="here" src="source"><img class="hSprite" id="8162046" src="spacerGIF" sprite="JPGurl"></div><b>01:20</b><u>title</u><div class="hSpriteHD"></div></a><div class="hRate"><div class="fr">100%</div><div class="views-value">1,236</div></div></div><div class="video"><a class="hRotator" href="urlhere"><div class="thumb_container" data-previewvideo="thumb"><img class="thumb" alt="here" src="source"><img class="hSprite" id="8152826" src="spacerGIF" sprite="JPGurl"></div><b>06:13</b><u>title</u><div class="hSpriteHD"></div></a><div class="hRate"><div class="fr">83%</div><div class="views-value">2,168</div></div></div><div class="video"><a class="hRotator" href="urlhere"><div class="thumb_container" data-previewvideo="thumb"><img class="thumb" alt="here" src="source"><img class="hSprite" id="8162395" src="spacerGIF" sprite="JPGurl"></div><b>08:21</b><u>title</u><div class="hSpriteHD"></div></a><div class="hRate"><div class="fr">100%</div><div class="views-value">4,368</div></div></div><div class="clear"></div><div class="pager"><table class="ac"><tbody><tr><td><div><a class="first" href="homepage" overicon="iconPagerPrevHover"><div class="icon iconPagerPrev"></div></div></td></tr></tbody></table></div></div>
</div>
</td></tr></tbody></table>
My Code
VB.NET:
Dim strReg As String
strReg = "<a\s+class\s*=\s*""hRotator""\s+href\s*=\s*""?([^"" >]+)""?>(.+)</a>"
Dim reg As New Regex(strReg, RegexOptions.IgnoreCase)
Dim m As Match = reg.Match(htmlContent)
While m.Success
    MsgBox(m.Groups(1).Value)
    m = m.NextMatch()
End While
and
VB.NET:
Dim input As String = htmlContent
For Each m As Match In Regex.Matches(input, "<a\s+class\s*=\s*""hRotator""\s+href\s*=\s*""?([^"" >]+)""?>(.+)</a>", RegexOptions.IgnoreCase And RegexOptions.IgnorePatternWhitespace And RegexOptions.Singleline)
    MsgBox(m.Groups(1).Value)
Next
Any help would be appreciated!
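
Judging only from the sample markup above, one likely explanation is the greedy `(.+)</a>` at the end of the pattern. By default `.` does not match a newline, so each match starts at the first `hRotator` link on a line and runs to the last `</a>` on that same line, which would give exactly one match per line of markup (two here). Separately, combining `RegexOptions` values with `And` yields `RegexOptions.None`, since the flags share no bits; `Or` is the operator that combines them. The sketch below is one possible way around the first problem, not a definitive fix: it matches only the opening tag, so nothing can run on to a later `</a>`. The names `HRotatorLinks`, `ExtractLinks`, and the `sample` string are made up for illustration, and the pattern still assumes `class` comes directly before `href`, as it does in the sample.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module HRotatorLinks
    ' Hypothetical helper: returns the href of every <a class="hRotator" ...> tag.
    Function ExtractLinks(html As String) As List(Of String)
        ' Match the opening tag only; the trailing (.+)</a> from the original
        ' pattern is dropped because only the href value is needed.
        Dim pattern As String = "<a\s+class\s*=\s*""hRotator""\s+href\s*=\s*""?([^"" >]+)""?\s*>"
        Dim links As New List(Of String)
        For Each m As Match In Regex.Matches(html, pattern, RegexOptions.IgnoreCase)
            links.Add(m.Groups(1).Value)
        Next
        Return links
    End Function

    Sub Main()
        ' Made-up sample: two links on one line, as in the page markup.
        Dim sample As String =
            "<div class=""video""><a class=""hRotator"" href=""url1""><u>title</u></a></div>" &
            "<div class=""video""><a class=""hRotator"" href=""url2""><u>title</u></a></div>"
        For Each link As String In ExtractLinks(sample)
            Console.WriteLine(link)
        Next
    End Sub
End Module
```

If the link text is needed as well, keeping the original pattern but making the quantifier lazy, `(.+?)</a>`, should stop each match at the nearest closing tag. For anything beyond simple markup like this, an HTML parser is usually more robust than a regex.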