Question Extracting Integers from an unpredictable string

Mrcoder

New member
Joined
Sep 13, 2013
Messages
2
Programming Experience
Beginner
Sorry, new here.

I want to extract integers from a string which I can't predict, and I don't know the best way of going about it. For instance, I know you can split, but how do I know where to split if it's a random string?
Each number will always be followed by "ab", and before each number there will be the ">" of a closing tag, as it is a hypertext string.

example random string: "furhhrudjdirhdudhrudehdugysetsgduhicfiffhnthuddhrb the buck ehrhebheb rhedhd ignored he urufnrufendheh rubrhhdndhfh >4.4 ab dedbehdjsnsbwhsh >10.1 ab didn't >1.2 ab"

in this example I'd want to extract {4.4, 10.1, 1.2}
(the numbers will never be more than 50)

I think the best method is to somehow use the positions of "ab" and ">" (or their ASCII codes) to isolate the numbers in between. However, there are multiple numbers, so these would have to be added to a list.

Sorry if I haven't been clear enough, but I hope you get the drift!
 

IanRyder

Well-known member
Joined
Sep 9, 2012
Messages
1,130
Location
Healing, NE Lincs, UK
Programming Experience
10+
Hi,

I would suggest that RegEx is probably the way to go here. Here is a quick example which would match the numbers within the random string that you posted:-

'Note: this needs Imports System.Text.RegularExpressions at the top of the code file.
Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
  Dim myExpression As New Regex("(?<=>)(\d+|\.){1,}")
  Dim strUnpredictableString As String = "furhhrudjdirhdudhrudehdugysetsgduhicfiffhnthuddhr b the buck ehrhebheb rhedhd ignored he urufnrufendheh rubrhhdndhfh >4.4 ab dedbehdjsnsbwhsh >10.1 ab didn't >1.2 ab"
 
  'Here you can iterate the Matches collection to identify each match:-
  For Each currentMatch As Match In myExpression.Matches(strUnpredictableString)
    MsgBox(currentMatch.Value.ToString)
  Next
 
  'Or you can add all the Match Values to an arbitrary List of String with:-
  Dim myMatchList As List(Of String) = myExpression.Matches(strUnpredictableString).Cast(Of Match).Select(Function(x) x.Value.ToString).ToList
  For Each strValue As String In myMatchList
    MsgBox(strValue)
  Next
End Sub


There may be a more efficient RegEx expression that can be used, but the one I have shown seems to do the job just fine.

Hope that helps.

Cheers,

Ian

BTW, Welcome to the Forum.
 

Mrcoder

Thanks, it has helped. I can't find anywhere online what the pattern you used means. Could you be kind enough to explain what each bit inside New Regex("") does? I know \d+ means something, etc.

Searching online also says I can put boundaries on where it searches within a string. Say I know the integers I need are between two words which will always be in the string, e.g. Hello and bye. How do you put that into a New Regex that searches for integers only between hello and bye, then puts them into an array? Really all I need is the 1st and 2nd integer between hello and bye.

thanks for help!
 

IanRyder

Hi,

To explain the syntax I used for the RegEx expression, let's split (?<=>)(\d+|\.){1,} into:-

1) (?<=>)
2) (\d+|\.)
3) {1,}

So:-

1) This expression is called a Positive Lookbehind and takes the form (?<=YourString). It requires that YourString appears immediately before the match, but does not include YourString in the matched text. Here YourString is the ">" character.

2) This expression says to select either as many consecutive digits as there are, "\d+", or a period character, "\.". The OR is defined with the Pipe character.

3) This expression is a quantifier, and it is Greedy. It means match the preceding expression at least once, and as many times thereafter as possible, stopping when no more digit or period characters are encountered.
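To see what the lookbehind in part 1 contributes, here is a minimal sketch that runs the same pattern with and without it. The sample string and the module name are made up for illustration; they are not from the original post.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module PatternPartsDemo
    Sub Main()
        ' Hypothetical sample: two values follow ">", one (7.5) does not.
        Dim sample As String = "abc >4.4 ab def >10.1 ab 7.5"

        ' Without the lookbehind, every run of digits and periods is matched, including 7.5.
        For Each m As Match In Regex.Matches(sample, "(\d+|\.){1,}")
            Console.WriteLine("no lookbehind:   " & m.Value)
        Next

        ' With (?<=>), a match must start immediately after ">",
        ' and the ">" itself is not part of the matched value.
        For Each m As Match In Regex.Matches(sample, "(?<=>)(\d+|\.){1,}")
            Console.WriteLine("with lookbehind: " & m.Value)
        Next
    End Sub
End Module
```

Note that parts 2 and 3 together accept any mix of digits and periods, so a stray "..." directly after a ">" would also match. That is harmless for the sample string but worth knowing.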

Have a look here for a good tutorial on the Basic and Advanced RegEx Expressions.

Regular Expressions Reference - Basic Syntax
Regular Expression Reference - Advanced Syntax

Now that you have all that, have a go yourself at coming up with an expression which accommodates your new question. To help with that, here is a good site which allows you to add a String that you want to interrogate and then test your RegEx expression, as you build it, to see whether it works or not:-

Rubular: a Ruby regular expression editor and tester
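
For anyone who wants to check their attempt, here is one possible sketch for the "between hello and bye" question. It is only one way to do it, done in two steps rather than a single expression. The input string, the lowercase marker words and the names used are assumptions for illustration.

```vbnet
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module BetweenMarkersDemo
    Sub Main()
        ' Hypothetical input: only the numbers between "hello" and "bye" are wanted.
        Dim source As String = "3.3 hello abc >4.4 ab x >10.1 ab y >1.2 ab bye >9.9 ab"

        ' Step 1: lazily capture everything between the two marker words.
        Dim section As Match = Regex.Match(source, "hello(.*?)bye", RegexOptions.Singleline)

        If section.Success Then
            ' Step 2: find digits with an optional decimal part inside that section
            ' and keep only the first two matches.
            Dim firstTwo As String() = Regex.Matches(section.Groups(1).Value, "\d+(\.\d+)?") _
                .Cast(Of Match)() _
                .Take(2) _
                .Select(Function(m) m.Value) _
                .ToArray()

            Console.WriteLine(String.Join(", ", firstTwo))
        End If
    End Sub
End Module
```

If the marker words can appear in mixed case (Hello, BYE), adding RegexOptions.IgnoreCase to the first call should cover that.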

Hope that helps.

Cheers,

Ian
 

ident

Member
Joined
Apr 30, 2012
Messages
12
Location
Cambridge
Programming Experience
3-5
Just a small note, Ian: there's no need to convert Value to a string. It's already a string.
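
A minimal sketch of the list-building line without the extra ToString call. The shortened sample string and the module name are assumptions for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions

Module ValueIsStringDemo
    Sub Main()
        Dim rx As New Regex("(?<=>)(\d+|\.){1,}")
        Dim sample As String = "abc >4.4 ab def >10.1 ab ghi >1.2 ab"

        ' Match.Value is declared As String, so it can go straight into a List(Of String).
        Dim values As List(Of String) = rx.Matches(sample).Cast(Of Match)().Select(Function(m) m.Value).ToList()

        Console.WriteLine(String.Join(", ", values))
    End Sub
End Module
```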

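OP, if the values in the string are fixed, then whether they can be greater than 50 makes no difference. Such values won't exist. We can be as lazy as "\d+\.?\d+"

\d+ : Match a single digit 0 to 9 between one and unlimited times, as many times as possible, giving back as needed (greedy)
\.? : Match the character . between zero and one times, as many times as possible (greedy)
\d+ : Match a single digit 0 to 9 between one and unlimited times, as many times as possible, giving back as needed (greedy)

Because both \d+ parts need at least one digit, a value must contain at least two digits to match, so a lone "5" would be skipped.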

VB.NET:
Imports System.Text.RegularExpressions

Public Class Form1
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        Dim rx As New Regex("\d+\.?\d+")
        Dim value As String =  "furhhrudjdirhdudhrudehdugysetsgduhicfiffhnthuddhr b the buck ehrhebheb  rhedhd ignored he urufnrufendheh rubrhhdndhfh >44.4 ab  dedbehdjsnsbwhsh >10.1 ab didn't >1.2 ab 55"

        Array.ForEach(rx.Matches(value).Cast(Of Match).ToArray, AddressOf Debug.WriteLine)
    End Sub
End Class

A slightly stricter pattern, but still not complete:

VB.NET:
Dim rx As New Regex("[1-9]\d*(\.\d{1,2})?")
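
Since the end goal is numbers rather than strings, here is a rough sketch that parses each match into a Double and keeps only values up to 50, the limit mentioned in the first post. The shortened sample string and the names are assumptions for illustration, not part of the posts above.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization
Imports System.Text.RegularExpressions

Module ParseMatchesDemo
    Sub Main()
        Dim rx As New Regex("[1-9]\d*(\.\d{1,2})?")
        Dim sample As String = "rubrhh >44.4 ab dedbe >10.1 ab didn't >1.2 ab 55"
        Dim numbers As New List(Of Double)

        For Each m As Match In rx.Matches(sample)
            Dim parsed As Double
            ' InvariantCulture treats "." as the decimal separator regardless of regional settings.
            If Double.TryParse(m.Value, NumberStyles.Float, CultureInfo.InvariantCulture, parsed) AndAlso parsed <= 50 Then
                numbers.Add(parsed)
            End If
        Next

        ' The trailing 55 is matched by the pattern but dropped by the <= 50 check.
        For Each n As Double In numbers
            Console.WriteLine(n.ToString(CultureInfo.InvariantCulture))
        Next
    End Sub
End Module
```

Because this pattern insists on a leading digit from 1 to 9, a value such as 0.5 would not be picked up whole, which is one reason it is "not complete".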
 