Resolved Merge two big files

Luiscarneirorm

New member
Joined
Mar 7, 2022
Messages
2
Programming Experience
5-10
Hi.
I need to merge two text files. I don't want to simply append one to the other. Instead, I want to copy lines from the first file until I reach a line containing a given word, then do the same with the second file, then go back to the first, and repeat the cycle until both files are exhausted.

I have the following code, which works (but takes a long time) with files of around 50,000 lines. The files I need to merge have around 2 million lines.


VB.NET:
Private Sub Juntar_Click(sender As Object, e As EventArgs) Handles Juntar.Click
    Gravar.Enabled = False
    System.IO.File.Delete("c:\temp\tempfile.txt")

    Do Until (Prog_1_Button.Enabled = True And Prog_2_Button.Enabled = True)
        While Not (Prog_1_Button.Enabled)
            lines = System.IO.File.ReadAllLines(file1).ToList
            arrayLines = lines.ToArray
            Dim i As Integer = lines.IndexOf(Array.Find(arrayLines, Function(x) (x.Contains("teste"))))
            saida = lines.GetRange(0, i + 1)
            lines.RemoveRange(0, i + 1)

            System.IO.File.WriteAllLines(file1, lines)
            If i >= 0 Then
                Prog_Bar.Value = Prog_Bar.Value + i
                Exit While
            Else
                saida = lines
                Prog_1_Button.Enabled = True
            End If
        End While
        System.IO.File.AppendAllLines("c:\temp\tempfile.txt", saida)
        saida.Clear()
        While Not (Prog_2_Button.Enabled)
            lines = System.IO.File.ReadAllLines(file2).ToList
            arrayLines = lines.ToArray
            Dim i As Integer = lines.IndexOf(Array.Find(arrayLines, Function(x) (x.Contains("teste"))))
            saida = lines.GetRange(0, i + 1)
            lines.RemoveRange(0, i + 1)

            System.IO.File.WriteAllLines(file2, lines)
            If i >= 0 Then
                Prog_Bar.Value = Prog_Bar.Value + i
                Exit While
            Else
                saida = lines
                Prog_2_Button.Enabled = True
            End If
        End While

        System.IO.File.AppendAllLines("c:\temp\tempfile.txt", saida)
        saida.Clear()
    Loop
    Gravar.Enabled = True

End Sub
 

jmcilhinney

VB.NET Forum Moderator
Staff member
Joined
Aug 17, 2004
Messages
14,744
Location
Sydney, Australia
Programming Experience
10+
I haven't read your code in detail but, at a glance, it looks very inefficient. For one thing, you should not be using the state of UI controls to determine the state of reading a file. Just read each file line by line and write out the data line by line.
VB.NET:
Imports System.IO

Module Program
    Sub Main()
        ' Open both sources and the destination once; Using closes all three on exit.
        Using sourceReader1 As New StreamReader("source file 1 path here"),
              sourceReader2 As New StreamReader("source file 2 path here"),
              destinationWriter As New StreamWriter("destination file path here")
            ' Alternate between the two sources until both are exhausted.
            Do Until sourceReader1.EndOfStream AndAlso sourceReader2.EndOfStream
                If Not sourceReader1.EndOfStream Then
                    TransferData(sourceReader1, destinationWriter)
                End If

                If Not sourceReader2.EndOfStream Then
                    TransferData(sourceReader2, destinationWriter)
                End If
            Loop
        End Using
    End Sub

    ' Copies lines from the source to the destination, up to and including
    ' the target line, or to the end of the source if it never appears.
    Private Sub TransferData(sourceReader As StreamReader, destinationWriter As StreamWriter)
        Const TARGET_LINE As String = "target line here"

        Dim line As String

        Do
            line = sourceReader.ReadLine()

            destinationWriter.WriteLine(line)
        Loop Until line = TARGET_LINE OrElse sourceReader.EndOfStream
    End Sub
End Module
It's still going to take some time if there is a lot of data but it's more straightforward than what you're doing. You may need to adjust it a bit for your specific needs.
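One adjustment the original question seems to call for: its code treats any line that contains the word as the switch point (`x.Contains("teste")`), whereas `TransferData` stops only when the whole line equals `TARGET_LINE`. Below is a minimal sketch of that variant, not a definitive implementation. The names `MergeSketch`, `MergeAlternating`, `CopyThroughMarker` and the `marker` parameter are hypothetical. It accepts `TextReader`/`TextWriter` so it can be tried on in-memory strings as well as files, and it detects the end of a source by `ReadLine()` returning `Nothing` rather than by `EndOfStream`.

```vbnet
Imports System
Imports System.IO

Public Module MergeSketch
    ' Copies lines from reader to writer, up to and including the first line
    ' that contains marker. Returns True if a marker line was written, and
    ' False once the reader has run out of lines.
    Private Function CopyThroughMarker(reader As TextReader,
                                       writer As TextWriter,
                                       marker As String) As Boolean
        Dim line As String = reader.ReadLine()
        While line IsNot Nothing
            writer.WriteLine(line)
            If line.Contains(marker) Then Return True
            line = reader.ReadLine()
        End While
        Return False
    End Function

    ' Alternates between the two sources, switching after each marker line,
    ' until both sources are exhausted.
    Public Sub MergeAlternating(reader1 As TextReader,
                                reader2 As TextReader,
                                writer As TextWriter,
                                marker As String)
        Dim more1 As Boolean = True
        Dim more2 As Boolean = True
        Do While more1 OrElse more2
            If more1 Then more1 = CopyThroughMarker(reader1, writer, marker)
            If more2 Then more2 = CopyThroughMarker(reader2, writer, marker)
        Loop
    End Sub
End Module
```

Called with two `StreamReader` instances and a `StreamWriter` inside a `Using` block, and `"teste"` as the marker, this holds only one line per file in memory at a time, so file size should matter mainly for run time rather than memory use.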
 

Luiscarneirorm

Thanks for the answer.
I may not have explained my question clearly, but I have already received help on another forum. I'm leaving the solution here in case it is useful to someone else.

Solution:
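Imports System
Imports System.IO

Module Program
    Sub Main(args As String())

        ' Load both inputs fully into memory.
        ' VB string literals do not use backslash escapes, so single backslashes are correct.
        Dim lines1() As String = File.ReadAllLines("d:\temp\input1.txt")
        Dim lines2() As String = File.ReadAllLines("d:\temp\input2.txt")

        ' Using closes the writer even if an exception is thrown.
        Using newfile As New StreamWriter("d:\temp\output.txt", False)
            ' Index of the next line of the second file that has not been written yet.
            Dim i2 As Integer = 0
            For i1 = 0 To lines1.Length - 1
                newfile.WriteLine(lines1(i1))
                If lines1(i1).Contains("teste") Then
                    ' Marker reached in file 1: copy file 2 up to and including its next marker.
                    For j = i2 To lines2.Length - 1
                        newfile.WriteLine(lines2(j))
                        i2 = j + 1
                        If lines2(j).Contains("teste") Then
                            Exit For
                        End If
                    Next
                End If
            Next
            ' File 1 is exhausted: write whatever remains of file 2.
            For j = i2 To lines2.Length - 1
                newfile.WriteLine(lines2(j))
            Next
        End Using
    End Sub
End Module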
 