How I parsed a really long email to collect ~20 values

jwcoleman87

VB.NET:
    Public Function GetDispatchInformation(data As String) As Dispatch
        Dim NewDispatch As New Dispatch
        Dim sb As New StringBuilder
        Dim wordDelimiters() As Char = New Char() {vbCrLf}
        Dim wordDelimiters1() As Char = New Char() {":"c}
        Dim counter As Integer = 1
        For Each word As String In data.Split(wordDelimiters, StringSplitOptions.None)
            If word.Length < 3 Then
                'nothing
            Else
                If word.Contains("You have been assigned dispatch number") Then
                    'parse dispatch
                ElseIf word.Contains("Service Administrator:") Then
                    NewDispatch.ServiceAdministrator = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Purchased From:") Then
                    NewDispatch.Client = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Model:") Then
                    NewDispatch.Model = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Name:") Then
                    NewDispatch.Name = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Address") Then
                    'parse address
                ElseIf word.Contains("Home Phone:") Then
                    NewDispatch.ConsumerHomePhone = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Consumer Email:") Then
                    NewDispatch.ConsumerEmail = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Problem Description:") Then
                    NewDispatch.ProblemDescription = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Payment Type:") Then
                    NewDispatch.PaymentType = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("CRM Number:") Then
                    NewDispatch.CRMNumber = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Authorization Number:") Then
                    NewDispatch.AuthorizationNumber = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Authorization Amount:") Then
                    NewDispatch.AuthorizationAmount = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Date of Purchase:") Then
                    NewDispatch.DateOfPurchase = DateTime.Parse(word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf))
                ElseIf word.Contains("Brand:") Then
                    NewDispatch.Brand = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Service Site:") Then
                    NewDispatch.ServiceSite = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Mailing Label Method:") Then
                    NewDispatch.MailingLabelMethod = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Entitlement1:") Then
                    NewDispatch.Entitlement1 = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Entitlement2:") Then
                    NewDispatch.Entitlement2 = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Entitlement3:") Then
                    NewDispatch.Entitlement3 = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Entitlement4:") Then
                    NewDispatch.Entitlement4 = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Entitlement5:") Then
                    NewDispatch.Entitlement5 = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Entitlement6:") Then
                    NewDispatch.Entitlement6 = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                ElseIf word.Contains("Entitlement7:") Then
                    NewDispatch.Entitlement7 = word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim.Trim(vbCrLf)
                End If
                counter = counter + 1
            End If


        Next
        Return NewDispatch
    End Function

The emails look like this:
VB.NET:
You have been assigned dispatch number # which has been automatically accepted.
To review and view entitlements this dispatch, log on at #, open the Dispatch Inbox from
the Quick Links shown on the Main Menu, select 'Accepted' from the dropdown
and click on the Search button.  This will bring up the list of accepted Dispatches.
 
Service Administrator: NEW WarLoc
Purchased From: WAL-MART.COM
 
Model: E E1-571-6607
Serial Number: NXM09AA03032918BBD3400
 
 
Name: #
Address: #
Home Phone: #
Consumer Email: #
Problem Description: SCREEN IS CRACKED
Special Instructions: Cracked Screen
Payment Type: Service Contract
CRM Number: #
Authorization Number: #
Authorization Amount: 504.00
Date of Purchase: 08-NOV-2013
Brand: ACER001
Service Site: Depot
Mailing Label Method: MailContract Covers:
Entitlement1: EXCLUSION - FAILURES NOT COVERED UNDER PLAN - COMMERCIAL USE OF PRODUCT  PRODUCTS USED IN A BUSINESS ENVIRONMENT OR PRODUCTS SPECIFICALLY DESIGNED FOR NON-RESIDENTIAL USE.
Entitlement2: COVERED FAILURE - DAMAGE CAUSED BY POWER SURGE COVERED FROM DATE OF PURCHASE  PRIMARY COVERAGE BEGINS DATE OF PURCHASE
Entitlement3: COVERED FAILURE - ACCIDENTAL DAMAGE FROM SPILLS AND DROPS IS COVERED FROM DOP  LAPTOPS ONLY COVER ACCIDENTAL DAMAGE, SUCH AS SPILLS, DROPS FROM DOP
Entitlement4: DOP ENDORSEMENT  ADH-LAPTOPS
Entitlement5: PARTS UNDER  NEW
Entitlement6: LABOR UNDER  NEW
Entitlement7: EXCEPTIONAL PARTS UNDER  NA
Entitlement8: EXCEPTIONAL LABOR UNDER  NA
Out for Repair Number: NULL
 
Follow this link to view the dispatch: #

Is there a better way to do this?

Open for discussion/comments/advice.

Thanks,

Jonathan Coleman
 
Hi,

Here are some comments for you:

1) Turn Option Strict On and always keep it on. This will help you to avoid and correct type conversion errors as you code. This may seem trivial right now, but as you write more code it will save you from unexpected errors at run time.
2) When parsing large data strings, Regex can often be quite useful, but there is nothing wrong with the technique you are using here.
3) Get rid of the StringBuilder variable since you never use it.
4) Get rid of the counter variable. Other than incrementing it you never actually use it. For future reference, a more concise way to increment a variable like that is “counter += 1”.
5) I would personally say that your variable names need some refinement. Variable names should describe, as best as possible, the contents of the variable. Your use of “For Each word As String” is misleading, since you are not splitting the data into words; you are splitting it into lines of text, each of which contains multiple words.
6) I think you have also misunderstood what the Length property is telling you. “If word.Length < 3 Then” suggests that you only want to do something if there are 3 words or more in the string, but the Length property gives you the number of characters in the string.
7) Continuing on from Point 6, if that If statement is still needed, it would be better stated as “If word.Length >= 3 Then”. You would then not need the empty branch with the 'nothing comment.
8) Finally, get rid of the last Trim call, “Trim(vbCrLf)”, on every line. It is not needed: the data has already been split on the line break, and the parameterless Trim before it removes any leftover CR or LF characters along with other whitespace.

Hope that helps.

Cheers,

Ian
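
For anyone curious about the Regex route mentioned in point 2, here is a minimal sketch rather than a drop-in replacement. It assumes every field sits on its own line in "Label: value" form, as in the sample email; the module name LabelledLineParser and the function ParseLabelledLines are made up for this illustration.

```vbnet
Option Strict On
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

' Hypothetical helper, not part of the original code.
Public Module LabelledLineParser

    ' One "Label: value" pair per line; the label itself may not contain a colon.
    Private ReadOnly LinePattern As New Regex(
        "^\s*(?<label>[^:\r\n]+):[ \t]*(?<value>.*?)\s*$",
        RegexOptions.Multiline)

    ' Returns every labelled line found in the text, keyed by label (case-insensitive).
    Public Function ParseLabelledLines(text As String) As Dictionary(Of String, String)
        Dim result As New Dictionary(Of String, String)(StringComparer.OrdinalIgnoreCase)
        For Each m As Match In LinePattern.Matches(text)
            ' A repeated label overwrites the earlier value.
            result(m.Groups("label").Value.Trim()) = m.Groups("value").Value
        Next
        Return result
    End Function

    Public Sub Main()
        Dim sample = "Service Administrator: NEW WarLoc" & vbCrLf &
                     "Purchased From: WAL-MART.COM" & vbCrLf &
                     vbCrLf &
                     "Authorization Amount: 504.00"
        Dim values = ParseLabelledLines(sample)
        Console.WriteLine(values("Purchased From"))
    End Sub

End Module
```

Lines without a colon are simply skipped, and a value that itself contains a colon is kept whole because only the first colon on the line ends the label.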
 
Some code is repeated, which means you can refactor it. For example you can take the code that parses the value and put it in either a lambda or a regular function:
Dim getValue = Function(word As String) word.Split(wordDelimiters1, StringSplitOptions.None)(1).Trim()

Then you can replace all of the repeated code with, for example:
NewDispatch.ServiceAdministrator = getValue(word)

I agree with IanRyder, especially about the "word" variable name. I would probably name that variable "line", since that is what that part of the string actually represents.

GetDispatchInformation is a function I would make Shared and put in the Dispatch class as a factory method. The only purpose of this method is to produce a Dispatch instance by parsing a string, so I would also name it Parse, Create or the like.
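
Put together, the two suggestions above might look roughly like this. It is only a sketch under a few assumptions: the Dispatch class is cut down to three of the original properties, the helper is written as a regular private function (GetValue) instead of a lambda, and the value is taken as everything after the first colon, which differs slightly from Split(...)(1) when a value contains a colon itself.

```vbnet
Option Strict On
Imports System
Imports Microsoft.VisualBasic

' Reduced stand-in for the real Dispatch class (only three properties shown).
Public Class Dispatch
    Public Property ServiceAdministrator As String
    Public Property Client As String
    Public Property Model As String

    ' Factory method: builds a Dispatch from the raw email body.
    Public Shared Function Parse(data As String) As Dispatch
        Dim result As New Dispatch
        For Each line As String In data.Split({vbCrLf, vbLf}, StringSplitOptions.RemoveEmptyEntries)
            If line.StartsWith("Service Administrator:", StringComparison.Ordinal) Then
                result.ServiceAdministrator = GetValue(line)
            ElseIf line.StartsWith("Purchased From:", StringComparison.Ordinal) Then
                result.Client = GetValue(line)
            ElseIf line.StartsWith("Model:", StringComparison.Ordinal) Then
                result.Model = GetValue(line)
            End If
        Next
        Return result
    End Function

    ' Returns the trimmed text after the first colon on the line.
    Private Shared Function GetValue(line As String) As String
        Return line.Substring(line.IndexOf(":"c) + 1).Trim()
    End Function
End Class

Public Module ParseDemo
    Public Sub Main()
        Dim d = Dispatch.Parse("Service Administrator: NEW WarLoc" & vbCrLf &
                               "Model: E E1-571-6607")
        Console.WriteLine(d.Model)
    End Sub
End Module
```

The caller no longer needs an instance of anything to get a Dispatch; it just writes Dispatch.Parse(body).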
 
Hi Ian and John, thank you for your suggestions. I will be making some changes today. I will post some updates when it is all cleaned up.

Thanks,

Jonathan Coleman
 
Firstly, Option Strict On has created 115 errors in my program. This may take me a little time to resolve.

For science!
 
Hi,

Sorry about that but the sooner you deal with those 115 Option Strict issues then the sooner you get rid of 115 potential errors just waiting to happen and bite you in the bum.

Good Luck and don’t forget to always keep Option Strict On from now on.

Cheers,

Ian
 
This is how I would do it with reflection; the code is a bit cleaner and more scalable:

VB.NET:
Public Class Dispatch

    Public Property ServiceAdministrator As String
    Public Property PurchasedFrom As String
    Public Property Model As String
    Public Property Name As String
    Public Property Address As String
    Public Property SerialNumber As String
    Public Property HomePhone As String
    Public Property ConsumerEmail As String
    Public Property ProblemDescription As String
    Public Property SpecialInstructions As String
    Public Property PaymentType As String
    Public Property CRMNumber As String
    Public Property AuthorizationNumber As String
    Public Property AuthorizationAmount As String
    Public Property DateofPurchase As Date
    Public Property Brand As String
    Public Property ServiceSite As String
    Public Property MailingLabelMethod As String
    Public Property Entitlement1 As String
    Public Property Entitlement2 As String
    Public Property Entitlement3 As String
    Public Property Entitlement4 As String
    Public Property Entitlement5 As String
    Public Property Entitlement6 As String
    Public Property Entitlement7 As String
    Public Property Entitlement8 As String
    Public Property OutforRepairNumber As String


    Public Shared Function FromText(ByVal text As String) As Dispatch
        Dim newDispatch As New Dispatch
        Dim thisType = newDispatch.GetType()

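        ' Split on CR/LF (or bare LF) so no stray line-break characters remain
        Dim lines = text.Split({vbCrLf, vbLf}, StringSplitOptions.RemoveEmptyEntries)

        For Each line In lines
            Dim colonPos = line.IndexOf(":"c)
            If colonPos > 1 Then
                ' Property name = label before the first colon, spaces removed
                Dim propName = line.Substring(0, colonPos).Replace(" ", "").Trim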
                Dim prop = thisType.GetProperty(propName)
                If prop IsNot Nothing Then
                    If prop.PropertyType = GetType(Date) Then
                        prop.SetValue(newDispatch,
                                      DateTime.ParseExact(line.Substring(colonPos + 1).Trim,
                                                          "dd-MMM-yyyy",
                                                          Globalization.CultureInfo.InvariantCulture),
                                      Nothing)
                    Else
                        prop.SetValue(newDispatch,
                                      line.Substring(colonPos + 1).Trim,
                                      Nothing)
                    End If
                End If
            End If
        Next

        Return newDispatch
    End Function

End Class


And you can call it like this:

VB.NET:
        Dim email = File.ReadAllText("c:\sourceemail.txt")

        Dim d = Dispatch.FromText(email)

And the result:
[attached image: 55d8a1b661f17.png]


Note that I had to rename some of your properties for this to work, and as it stands the custom "deserializer" only supports strings and dates. The advantage is that if a field is added to the email, all you need to do is add a matching property to the Dispatch class; there is no need to touch the deserializer code.
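
If other property types are needed later, one possible extension of the same idea is to let Convert.ChangeType handle everything that is not a date. This is only a sketch: SampleDispatch, ReflectionParser, ConvertValue and Populate are names made up for this example, numbers are assumed to use the invariant culture (as in "504.00"), and Nullable(Of T) properties are not handled because Convert.ChangeType does not convert to them.

```vbnet
Option Strict On
Imports System
Imports System.Globalization
Imports System.Reflection
Imports Microsoft.VisualBasic

' Hypothetical, reduced target class for the demo.
Public Class SampleDispatch
    Public Property Brand As String
    Public Property AuthorizationAmount As Decimal
    Public Property DateofPurchase As Date
End Class

Public Module ReflectionParser

    ' Converts the raw text to the declared type of the property.
    Private Function ConvertValue(raw As String, targetType As Type) As Object
        If targetType Is GetType(Date) Then
            Return DateTime.ParseExact(raw, "dd-MMM-yyyy", CultureInfo.InvariantCulture)
        End If
        ' Covers String, Integer, Decimal, Boolean and other IConvertible targets.
        Return Convert.ChangeType(raw, targetType, CultureInfo.InvariantCulture)
    End Function

    ' Creates a T and fills every writable property whose name matches a line label.
    Public Function Populate(Of T As {Class, New})(text As String) As T
        Dim target As New T
        For Each line As String In text.Split({vbCrLf, vbLf}, StringSplitOptions.RemoveEmptyEntries)
            Dim colonPos = line.IndexOf(":"c)
            If colonPos < 1 Then Continue For
            Dim propName = line.Substring(0, colonPos).Replace(" ", "")
            Dim prop As PropertyInfo = GetType(T).GetProperty(propName)
            If prop Is Nothing OrElse Not prop.CanWrite Then Continue For
            Dim raw = line.Substring(colonPos + 1).Trim()
            prop.SetValue(target, ConvertValue(raw, prop.PropertyType), Nothing)
        Next
        Return target
    End Function

    Public Sub Main()
        Dim d = Populate(Of SampleDispatch)("Brand: ACER001" & vbCrLf &
                                            "Authorization Amount: 504.00" & vbCrLf &
                                            "Date of Purchase: 08-NOV-2013")
        Console.WriteLine(d.AuthorizationAmount)
    End Sub

End Module
```

A value that cannot be converted will throw (FormatException, for instance), so in production code the SetValue call would probably want a Try/Catch that logs the offending line.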
 
Herman wow, just wow!

This is definitely a cleaner and more scalable solution!

Question: what if my dispatch object was a LINQ object, could reflection still be used to create this object? See this:

[attached image: TTDispatch Table.png]

Once instantiated, all of the properties I need are in this table and have already been turned into objects by LINQ. The question is, how do I add a method like the one you described, using reflection, to instantiate this database object?

Could I do something like this:

Dim d = TTDispatch.FromText(Email.Body)

Can I add methods to LINQ tables like this?

Am I on the right track:

VB.NET:
Partial Class TTDispatch


    Public Shared Function FromText(ByVal text As String) As TTDispatch
        Dim newDispatch As New TTDispatch
        Dim thisType = newDispatch.GetType()


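        ' Split on CR/LF (or bare LF) so no stray line-break characters remain
        Dim lines = text.Split({vbCrLf, vbLf}, StringSplitOptions.RemoveEmptyEntries)


        For Each line In lines
            Dim colonPos = line.IndexOf(":"c)
            If colonPos > 1 Then
                ' Property name = label before the first colon, spaces removed
                Dim propName = line.Substring(0, colonPos).Replace(" ", "").Trim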
                Dim prop = thisType.GetProperty(propName)
                If prop IsNot Nothing Then
                    If prop.PropertyType = GetType(Date) Then
                        prop.SetValue(newDispatch,
                                      DateTime.ParseExact(line.Substring(colonPos + 1).Trim,
                                                          "dd-MMM-yyyy",
                                                          Globalization.CultureInfo.InvariantCulture),
                                      Nothing)
                    Else
                        prop.SetValue(newDispatch,
                                      line.Substring(colonPos + 1).Trim,
                                      Nothing)
                    End If
                End If
            End If
        Next


        Return newDispatch
    End Function
End Class
 
You can add an extension method instead, in a separate module. Create a module called TTDispatchExtensions, and put this in:

Imports System.Runtime.CompilerServices

Public Module TTDispatchExtensions

    <Extension()> _
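    Public Function FromText(ByVal extInstance As TTDispatch, ByVal text As String) As TTDispatch
        Dim thisType = extInstance.GetType()
        ' Split on CR/LF (or bare LF) so no stray line-break characters remain
        Dim lines = text.Split({vbCrLf, vbLf}, StringSplitOptions.RemoveEmptyEntries)
 
        For Each line In lines
            Dim colonPos = line.IndexOf(":"c)
            If colonPos > 1 Then
                ' Property name = label before the first colon, spaces removed
                Dim propName = line.Substring(0, colonPos).Replace(" ", "").Trim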
                Dim prop = thisType.GetProperty(propName)
                If prop IsNot Nothing Then
                    If prop.PropertyType = GetType(Date) Then
                        prop.SetValue(extInstance,
                                      DateTime.ParseExact(line.Substring(colonPos + 1).Trim,
                                                          "dd-MMM-yyyy",
                                                          Globalization.CultureInfo.InvariantCulture),
                                      Nothing)
                    Else
                        prop.SetValue(extInstance,
                                      line.Substring(colonPos + 1).Trim,
                                      Nothing)
                    End If
                End If
            End If
        Next
 
        Return extInstance
    End Function

End Module


Compile this into your app and every TTDispatch instance will now have a .FromText method.
 
It works!!!

So I created the extension module as you described:

VB.NET:
Imports System.Runtime.CompilerServices


Public Module TTDispatchExtensions


    <Extension()> _
    Public Function FromText(ByVal extInstance As TTDispatch, ByVal text As String) As TTDispatch
        Dim thisType = extInstance.GetType()
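        ' Split on CR/LF (or bare LF) so no stray line-break characters remain
        Dim lines = text.Split({vbCrLf, vbLf}, StringSplitOptions.RemoveEmptyEntries)


        For Each line In lines
            Dim colonPos = line.IndexOf(":"c)
            If colonPos > 1 Then
                ' Property name = label before the first colon, spaces removed
                Dim propName = line.Substring(0, colonPos).Replace(" ", "").Trim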
                Dim prop = thisType.GetProperty(propName)
                If prop IsNot Nothing Then
                    If prop.PropertyType = GetType(Date) Then
                        prop.SetValue(extInstance,
                                      DateTime.ParseExact(line.Substring(colonPos + 1).Trim,
                                                          "dd-MMM-yyyy",
                                                          Globalization.CultureInfo.InvariantCulture),
                                      Nothing)
                    Else
                        prop.SetValue(extInstance,
                                      line.Substring(colonPos + 1).Trim,
                                      Nothing)
                    End If
                End If
            End If
        Next


        Return extInstance
    End Function


End Module

Then I implemented it like this:

VB.NET:
    'Function integrates dispatch information with production entities
    Protected Function IntegrateDispatchNotification(Mail As MailItem, Type As String) As Boolean
        ' Compare string values with "="; "Is" only tests reference identity
        If Type = "Dispatch Received" Then
            Dim NewDispatch As New Teletrack.TTDispatch
            Dim d = Teletrack.TTDispatchExtensions.FromText(NewDispatch, Mail.Body)


            If EscalationControl.InsertNewDispatch(d) Then
                Return True
            Else
                Return False
            End If
        End If
        Return False
    End Function

And the results!

[attached image: It works!.png]

Thank you very much for your excellent responses!!!

This is much better, now I can have one object which defines a dispatch and have much less code to deal with if changes need to be made.

Awesome!!
 
We have now recorded over 80 entries into the database since the time I posted "It Works" above. Thanks very much. I will keep you all posted on any awesomeness which will end up coming out of this.
 
Today when I came in there were over 300 dispatches recorded in the database. This seems to be an extremely effective solution. In case anyone was wondering, this is in the add-in dev forum because I use LINQ to DASL to read the Outlook data file, and other Outlook handlers to capture emails, to get this information into a database.
 
I know this is kinda old but I just noticed something:

    'Function integrates dispatch information with production entities
    Protected Function IntegrateDispatchNotification(Mail As MailItem, Type As String) As Boolean
        ' Compare string values with "="; "Is" only tests reference identity
        If Type = "Dispatch Received" Then
            Dim NewDispatch As New Teletrack.TTDispatch
            Dim d = Teletrack.TTDispatchExtensions.FromText(NewDispatch, Mail.Body)


            If EscalationControl.InsertNewDispatch(d) Then
                Return True
            Else
                Return False
            End If
        End If
        Return False
    End Function


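With an extension method you don't actually have to pass the first parameter, and you don't call the module either; the whole point is that the module adds a new method to instances of the type, so the call is simply NewDispatch.FromText(Mail.Body). Also kind of a goof on me: an extension method can only be called on an instance, never on the type itself. If you want a proper factory method, make FromText a Shared function in a Partial Class TTDispatch instead (as in your earlier attempt), and then you can call it like this: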

Dim d = Teletrack.TTDispatch.FromText(Mail.Body)
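
To make the difference between the two call styles concrete, here is a small self-contained sketch. The TTDispatch class below is only a stand-in for the designer-generated LINQ to SQL class, cut down to a single Brand property, and FillFromText is a made-up name for the extension method so that it does not clash with the Shared FromText.

```vbnet
Option Strict On
Imports System
Imports System.Runtime.CompilerServices
Imports Microsoft.VisualBasic

' Stand-in for the generated entity class (assumption: the real one has many more columns).
Partial Public Class TTDispatch
    Public Property Brand As String
End Class

' Your own file: the second half of the partial class adds the factory method.
Partial Public Class TTDispatch
    ' Shared, so it is called on the type: TTDispatch.FromText(...)
    Public Shared Function FromText(text As String) As TTDispatch
        Dim result As New TTDispatch
        result.FillFromText(text)
        Return result
    End Function
End Class

Public Module TTDispatchExtensions
    ' Extension method, so it is called on an instance: someDispatch.FillFromText(...)
    <Extension()>
    Public Function FillFromText(target As TTDispatch, text As String) As TTDispatch
        For Each line As String In text.Split({vbCrLf, vbLf}, StringSplitOptions.RemoveEmptyEntries)
            If line.StartsWith("Brand:", StringComparison.Ordinal) Then
                target.Brand = line.Substring("Brand:".Length).Trim()
            End If
        Next
        Return target
    End Function
End Module

Public Module FactoryDemo
    Public Sub Main()
        ' Instance call through the extension method; the module is never named.
        Dim existing As New TTDispatch
        existing.FillFromText("Brand: ACER001")

        ' Type-level call through the Shared factory method.
        Dim created = TTDispatch.FromText("Brand: ACER001")

        Console.WriteLine(existing.Brand = created.Brand)
    End Sub
End Module
```

Both routes end up with the same object state; the factory simply hides the "create an empty instance first" step from the caller.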
 