Reading a text file, and writing back to a text file

rituhooda

[Attached image: TEstFileFormat.png (original file)]


This is the code I started with:
Using sr As New StreamReader("C:\Imports\Output\Output.txt"),
      sw As New StreamWriter("C:\Imports\Output\FormatOutput.txt")
    Dim account As String = ""
    Dim FundID as String = ""
    Dim Transection as String = ""
    Dim RecordDate as Date
    Dim ReinvestDate as Date
    Dim TestID as String
    For Each line As String In sr.ReadToEnd.Split(Environment.NewLine)
        If line.Trim.StartsWith("Given") Then
            sw.WriteLine(String.Format("{0}|{1}|{2}", elements(0), elements(1), elements(2) ))

        End If
    Next
End Using

My questions are:
In my first set of records, the record date and reinvest date are on the next line, not on the same line as the record. How do I get all that information together?
I am trying to write all of it to the output file as one record.
Any help to get started is appreciated.
 
Hi,

It's difficult to give you a straight answer here, since your code does not really match the example output file you supplied, which has the dates on a new line.

Have you omitted sections of the code? You reference objects that are never defined, e.g. elements(0).

The obvious issue seems to be this line of code:

sw.WriteLine(String.Format("{0}|{1}|{2}", elements(0), elements(1), elements(2) ))

For each matching line of the input file it writes only the same three string elements to the output file. Even so, the code still does not add up: how are you getting any dates in your output file at all?

If you can post a sample of the input file along with the full code that is being used, we may be able to help you better.

Cheers,

Ian
 
Attached input and output file

Thanks for the reply, Ian. I am attaching separate files as input and output. I am actually trying to figure out the best way to handle this file: just format it and rewrite it. I tried to start with this code, but I am not sure how it will pick up the dates on the new line.
Imports System.IO
Using sr As New StreamReader("\\navigator\Imports\Output\Output.txt"),
      sw As New StreamWriter("\\navigator\Imports\Output\FormatOutput.txt")
    For Each line As String In sr.ReadToEnd.Split(Environment.NewLine)

        If line.Trim.StartsWith("Given") Then
            Dim elements() As String = line.Split({" "}, StringSplitOptions.RemoveEmptyEntries)
            sw.WriteLine(String.Format("{0}|{1}|{2}|{3}|{4}", elements(0), elements(1), elements(2), elements(3), elements(4)))
        End If

    Next
End Using

Thanks Again...
 

Attachments: InputFile.txt, OutPutFile.txt
Hi Rituhooda,

You owe me a nice cold beer one day.

Once you had uploaded the input file to compare with the output file, I realised what you were trying to achieve: a utility that converts a file from one format to another.

The input file is not a friendly one to work with, since it has multiple header records, each with multiple detail records, that need to be merged. In addition, your dates were on a separate line in the output because they were on a separate line in the input file, but not every time.

It would have taken an age to teach you how to write custom file conversion routines, so I have written this one for you. You can see how I have done it and reuse these techniques in the future.

The most important thing to remember is that if the input file ever deviates from its current structure, this conversion routine will fail unless the code is maintained to accommodate the change.

You can change the format of the output as you need to.

Copy and paste all the code below into a new form (it needs a button named Button1), change the input and output file paths to your own, and then run it.

Post back if you have any questions on the code.

Good Luck and cheers,

Ian

VB.NET:
Imports System.IO
 
Public Class Form1
  Private FormattedAccountDetails As New List(Of AccountRecord)
  Private ConversionFailure As Boolean
 
  Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
    Dim NewAccountRecognised As Boolean
    Dim NewDetailRecordsStarted As New Boolean
    Dim MyHeaderRecord As New AccountRecord
 
    'add your own files back to these read and write streams
    Dim srReader As New StreamReader("d:\temp\InputFile.txt")
 
    For Each strBuffer As String In srReader.ReadToEnd.Split(Environment.NewLine)
      Dim strBufferElements() As String = strBuffer.Trim.Split({"  "}, StringSplitOptions.RemoveEmptyEntries)
      If strBufferElements.Count > 0 Then
        If strBufferElements(0).Trim.StartsWith("Given") Then
          'New account is recognised so the next read of the file is the actual details
          NewAccountRecognised = True
        ElseIf NewAccountRecognised Then
          'This is the new account details section
          MyHeaderRecord = CreateHeaderAccount(strBufferElements)
          NewAccountRecognised = False
          NewDetailRecordsStarted = False
        ElseIf strBufferElements.Count = 2 Then
          'this has to be the header dates on a new line
          MyHeaderRecord = AddDatesToHeaderAccount(MyHeaderRecord, strBufferElements)
        ElseIf strBufferElements(0).Trim.StartsWith("Test") Then
          'this is the start of the detail records for each given header account record
          NewDetailRecordsStarted = True
        ElseIf strBufferElements.Count = 1 Then
          'ignore this element since this is the start of the file
        ElseIf NewDetailRecordsStarted Then
          'this is a detail record for each given header account record
          Dim DetailRec As New AccountRecord
          DetailRec = MyHeaderRecord.Clone
          DetailRec = AddAccountDetails(DetailRec, strBufferElements)
          FormattedAccountDetails.Add(DetailRec)
        Else
          MsgBox("Error In File Conversion - Check Input File Structure!")
          ConversionFailure = True
          Exit For
        End If
      End If
    Next
    srReader.Close() 'release the input file once it has been read
 
    If Not ConversionFailure Then
      Dim swWriter As New StreamWriter("d:\temp\OutPutFile.txt", False)
      Dim HR As AccountRecord = CreateHeaderRecordForFile()
 
      swWriter.WriteLine(String.Format("{0}|{1}|{2}|{3}|{4}|{5}|{6}|{7}|{8}", HR.AccountNo, HR.FundID, HR.TransationType, HR.RecordDate, HR.ReinvestDate, HR.TestID, HR.Balance, HR.BalanceRatio, HR.Units))
      For Each AR As AccountRecord In FormattedAccountDetails
        'to show an example of the output add a textbox with multiline true and uncomment the code below
        'TextBox1.Text += String.Format("{0}|{1}|{2}|{3}|{4}|{5}|{6}|{7}|{8}", AR.AccountNo, AR.FundID, AR.TransationType, AR.RecordDate, AR.ReinvestDate, AR.TestID, AR.Balance, AR.BalanceRatio, AR.Units) & vbCrLf
        swWriter.WriteLine(String.Format("{0}|{1}|{2}|{3}|{4}|{5}|{6}|{7}|{8}", AR.AccountNo, AR.FundID, AR.TransationType, AR.RecordDate, AR.ReinvestDate, AR.TestID, AR.Balance, AR.BalanceRatio, AR.Units))
      Next
      swWriter.Close()
      MsgBox("File Conversion Completed Successfully!")
    End If
  End Sub
 
  Private Function CreateHeaderAccount(ByVal strAccountDetails() As String) As AccountRecord
    Dim NewAccountRecord As New AccountRecord
    Select Case strAccountDetails.Count
      Case 3
        With NewAccountRecord
          .AccountNo = strAccountDetails(0).Trim
          .FundID = strAccountDetails(1).Trim
          .TransationType = strAccountDetails(2).Trim
        End With
      Case 5
        With NewAccountRecord
          .AccountNo = strAccountDetails(0).Trim
          .FundID = strAccountDetails(1).Trim
          .TransationType = strAccountDetails(2).Trim
          .RecordDate = strAccountDetails(3).Trim
          .ReinvestDate = strAccountDetails(4).Trim
        End With
      Case Else
        MsgBox("Error In File Conversion - Check Input File Structure!")
        ConversionFailure = True
    End Select
    Return NewAccountRecord
  End Function
 
  Private Function AddDatesToHeaderAccount(ByVal AccountHeader As AccountRecord, ByVal strAccountDetails() As String) As AccountRecord
    Select Case strAccountDetails.Count
      Case 2
        With AccountHeader
          .RecordDate = strAccountDetails(0).Trim
          .ReinvestDate = strAccountDetails(1).Trim
        End With
      Case Else
        MsgBox("Error In File Conversion - Check Input File Structure!")
        ConversionFailure = True
    End Select
    Return AccountHeader
  End Function
 
  Private Function AddAccountDetails(ByVal MyDetailRecord As AccountRecord, strAccountDetails() As String) As AccountRecord
    Select Case strAccountDetails.Count
      Case 4
        With MyDetailRecord
          .TestID = strAccountDetails(0).Trim
          .Balance = strAccountDetails(1).Trim
          .BalanceRatio = strAccountDetails(2).Trim
          .Units = strAccountDetails(3).Trim
        End With
      Case Else
        MsgBox("Error In File Conversion - Check Input File Structure!")
        ConversionFailure = True
    End Select
    Return MyDetailRecord
  End Function
 
  Private Function CreateHeaderRecordForFile() As AccountRecord
    Dim HF As New AccountRecord
 
    With HF
      .AccountNo = "Given Account #"
      .FundID = "Fund ID"
      .TransationType = "Transaction"
      .RecordDate = "Record Date"
      .ReinvestDate = "Reinvest Date"
      .TestID = "TestID"
      .Balance = "Balance"
      .BalanceRatio = "Balance Ratio"
      .Units = "Units"
    End With
    Return HF
  End Function
 
  Private Class AccountRecord
    Public Property AccountNo As String
    Public Property FundID As String
    Public Property TransationType As String
    Public Property RecordDate As String
    Public Property ReinvestDate As String
    Public Property TestID As String
    Public Property Balance As String
    Public Property BalanceRatio As String
    Public Property Units As String
 
    Public Function Clone() As AccountRecord
      Return DirectCast(Me.MemberwiseClone(), AccountRecord)
    End Function
  End Class
End Class
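
A note on the Clone call in the code above: each detail row starts as a copy of the current header record, so the account fields are carried onto every row without the header object itself being changed. Below is a minimal sketch of that idea. The Record class and its two fields are hypothetical stand-ins for AccountRecord, used only to keep the example short.

```vbnet
Imports System

Module CloneDemo
    ' Hypothetical two-field record standing in for AccountRecord.
    Class Record
        Public Property AccountNo As String
        Public Property TestID As String

        ' Shallow copy: enough here because every field is a String.
        Public Function Clone() As Record
            Return DirectCast(Me.MemberwiseClone(), Record)
        End Function
    End Class

    ' Builds a detail record from a header without modifying the header.
    Function MakeDetail(header As Record, testId As String) As Record
        Dim detail As Record = header.Clone()
        detail.TestID = testId
        Return detail
    End Function

    Sub Main()
        Dim header As New Record With {.AccountNo = "A100"}
        Dim detail As Record = MakeDetail(header, "T1")
        Console.WriteLine(detail.AccountNo & "|" & detail.TestID)
        Console.WriteLine(header.TestID Is Nothing) ' the header keeps no TestID
    End Sub
End Module
```

Assigning the header directly (DetailRec = MyHeaderRecord) would instead make every detail row point at the same object, so each new TestID would overwrite the previous one.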
 
Why is the header "Given_Account" (one space) in one place and "Given__Account" (two spaces) in another? It seems odd that a computer system would generate different text for the same header.
 
John, thanks for pointing that out. It's not the computer, it's human error. There is a lot more involved in the original file. I was able to handle the other things in SSIS, but the dates-on-a-new-line piece was a puzzle for me. I haven't tried it yet; I might be able to try over the weekend. Thanks. :)
 
Hi JohnH, Rituhooda,

I must admit I had already come to that conclusion yesterday, but it was good to get confirmation anyway.

Good luck over the weekend Rituhooda and let us know how it goes.

Cheers,

Ian
 
Ignoring that human error, I would also rely on each field being separated by two or more spaces, but use Regex.Split to get the fields. Regex can handle patterns with variable-length runs of spaces, so there is no need for String.Split plus Trim. I would also write a tab-delimited file, which is easy to open in Excel or another spreadsheet application. This is my suggestion, based on the corrected InputFile.txt:
        'needs Imports System.Text.RegularExpressions at the top of the file for Regex
        Dim input As New IO.StreamReader("InputFile.txt")
        Dim output As New IO.StreamWriter("OutputFile.txt")
        Dim headers = {"Given Account #", "Fund ID", "Transaction", "Record Date", "Reinvest Date", "TestId", "Balance", "Balance Ratio", "Units"}
        output.WriteLine(String.Join(vbTab, headers))
        Dim account() As String = Nothing 'keep account info here for multiple testid lines
        Dim readAccount As Boolean = False
        Do
            Dim line = input.ReadLine
            If line.Length = 0 OrElse line.StartsWith("Column") Then
                Continue Do
            ElseIf line.StartsWith("Given Account") Then
                readAccount = True
            ElseIf line.StartsWith("TestId") Then
                readAccount = False
            ElseIf readAccount Then
                If input.Peek = 32 Then 'space=record over two lines
                    line &= input.ReadLine
                End If
                account = Regex.Split(line, " {2,}")
            Else
                Dim testidFields = Regex.Split(line, " {2,}")
                Dim combined As New List(Of String)
                combined.AddRange(account)
                combined.AddRange(testidFields)
                combined.RemoveAt(combined.Count - 1) 'remove last empty field                
                output.WriteLine(String.Join(vbTab, combined.ToArray))
            End If
        Loop Until input.EndOfStream
        output.Close()
        input.Close()
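
The Peek check in the code above is what joins a record that is spread over two lines. As a rough sketch of just that step, here is the same idea run against an in-memory StringReader. The sample text and the ReadLogicalLines helper are made up for illustration and are not the attached InputFile.txt. Unlike the loop above, this version uses While instead of If, so it would also join more than one continuation line, and it tests ReadLine for Nothing so an empty input does not throw.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module ContinuationDemo
    ' Reads lines from any TextReader and appends a following line to the
    ' current one whenever that following line starts with a space.
    Function ReadLogicalLines(reader As TextReader) As List(Of String)
        Dim result As New List(Of String)
        Dim line As String = reader.ReadLine()
        While line IsNot Nothing
            ' Peek returns the next character code without consuming it (-1 at the end).
            While reader.Peek() = 32
                line &= reader.ReadLine()
            End While
            result.Add(line)
            line = reader.ReadLine()
        End While
        Return result
    End Function

    Sub Main()
        ' Hypothetical data: the first record has its dates on a second, indented line.
        Dim text As String = "A100  F1  Buy" & Environment.NewLine &
                             "    01/02/2013  01/03/2013" & Environment.NewLine &
                             "B200  F2  Sell  01/04/2013  01/05/2013"
        Using reader As New StringReader(text)
            For Each logicalLine As String In ReadLogicalLines(reader)
                Console.WriteLine(logicalLine)
            Next
        End Using
    End Sub
End Module
```

This only works while continuation lines are the only lines that begin with a space, which is an assumption about the file layout.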
 
Hi JohnH,

I am still a bit wet behind the ears when it comes to RegEx so I was interested as to how you implemented your solution using RegEx.

Can you please help me better understand your use of:
Regex.Split(line, " {2,}", RegexOptions.Multiline)

I get that " {2,}" means match the preceding expression, in this case a space, at least twice, but I do not understand what the comma does when the upper limit (the "but not more than" part) is omitted. Also, what does the Multiline option add to the mix?

Cheers,

Ian
 
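The " {2,}" pattern means a space repeated two or more times (greedy). The comma with nothing after it means there is no upper limit; without the comma, " {2}" would match exactly two spaces. You can read about the syntax, for example, here: Regular Expressions Reference - Basic Syntax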
RegexOptions.Multiline wasn't needed here, so I removed it from the posted code.
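
To see the difference the comma makes, here is a small sketch that splits the same line with " {2,}" and with " {2}". The sample line and the function names are invented for the example.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module QuantifierDemo
    ' Splits on runs of two or more spaces (no upper limit).
    Function SplitFields(line As String) As String()
        Return Regex.Split(line, " {2,}")
    End Function

    ' Splits on exactly two spaces at a time, for comparison.
    Function SplitExactlyTwo(line As String) As String()
        Return Regex.Split(line, " {2}")
    End Function

    Sub Main()
        ' Hypothetical line: one space inside "Fund A", wider gaps between fields.
        Dim sample As String = "Fund A   12345      Buy"
        Console.WriteLine(SplitFields(sample).Length)     ' 3 fields
        Console.WriteLine(SplitExactlyTwo(sample).Length) ' more than 3: longer gaps are cut into pieces
        For Each field As String In SplitFields(sample)
            Console.WriteLine("[" & field & "]")
        Next
    End Sub
End Module
```

With " {2,}" each gap is consumed as a single separator however wide it is, while a single space inside a field is left alone, which is why no Trim is needed afterwards.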
 