Question Memory Issue

ihunter
New member, joined Jun 8, 2011, 2 messages, programming experience: 1-3
My application is throwing a System.OutOfMemoryException. I've tried various ways to free up memory, without success. There is too much code to post in full, so I'd like to ask whether any of the concepts shown here is causing the problem.

The app reads 4 CSV files, about 16 million lines of data altogether, and stores them in arrays of objects. First the main object is created, with multiple arrays defined in it. Then every line read from the CSV files is stored in one of the 4 arrayObjectType arrays. Later these are used to calculate catalyst scores, which are stored in arrays within the main object. The objects and the process are shown below.

VB.NET:
Public Class Object1_Class

    Public arrayObjectType1A() As ObjectBlock
    Public arrayObjectType1B() As ObjectBlock
    Public arrayObjectType2A() As ObjectBlock
    Public arrayObjectType2B() As ObjectBlock
    Public arrayCatalyst1() As Double
    Public arrayCatalyst2() As Double

    Public Sub New()

    End Sub
End Class

VB.NET:
Public Class ObjectBlock

    Public dtDate As Date
    Public dbStart, dbMax, dbMinimum, dbEnd, dbAccumulation As Double

    Public Sub New(ByVal dtDate As Date, ByVal dbStart As Double, ByVal dbEnd As Double, ByVal dbMinimum As Double, ByVal dbMax As Double, ByVal dbAccumulation As Double)

        Me.dtDate = dtDate
        Me.dbStart = dbStart
        Me.dbEnd = dbEnd
        Me.dbMinimum = dbMinimum
        Me.dbMax = dbMax
        Me.dbAccumulation = dbAccumulation

    End Sub

End Class

VB.NET:
    Private Sub btnBeta1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnBeta.Click
        DataProcessing()
    End Sub

    Public Sub DataProcessing()
        Dim MainObject As Object1_Class = New Object1_Class()
        Dim timeStart As Date = Now()
        Dim neutralValue As Integer = 10000
        Dim span As TimeSpan

        ReadData("Type1", "DataFile1-A", MainObject.arrayObjectType1A, dtpFrom.Value, dtpTo.Value)
        ReadData("Type1", "DataFile1-B", MainObject.arrayObjectType1B, dtpFrom.Value, dtpTo.Value)
        ReadData("Type1", "DataFile2-A", MainObject.arrayObjectType2A, dtpFrom.Value, dtpTo.Value)
        ReadData("Type1", "DataFile2-B", MainObject.arrayObjectType2B, dtpFrom.Value, dtpTo.Value)

        For i = 1 To 20
            CatalystProcessing(MainObject.arrayObjectType1B, i)
            Dim fileName As String = fileLocation & "\Temp\Catalyst.csv"
            Dim maxArrayLength As Long = MainObject.arrayObjectType1B.Length - 1

            Try
                Dim fileLineArray() As String = System.IO.File.ReadAllLines(fileName)
                ReDim MainObject.arrayCatalyst1(fileLineArray.Length - 1)

                For j = 0 To fileLineArray.Length - 1
                    MainObject.arrayCatalyst1(j) = CDbl(fileLineArray(j))
                Next
            Catch ex As Exception
                Console.WriteLine("I Loop")
                MsgBox(ex.Message.ToString())
            End Try

            For h = 1 To 20

                If i < h And h - i >= 5 Then
                    CatalystProcessing(MainObject.arrayObjectType1B, h)

                    Try
                        Dim fileLineArray() As String = System.IO.File.ReadAllLines(fileName)
                        ReDim MainObject.arrayCatalyst2(fileLineArray.Length - 1)

                        For j = 0 To fileLineArray.Length - 1
                            MainObject.arrayCatalyst2(j) = CDbl(fileLineArray(j))
                        Next
                    Catch ex As Exception
                        Console.WriteLine("H Loop")
                        MsgBox(ex.Message.ToString())
                    End Try

                    span = Now.Subtract(timeStart)
                    Console.WriteLine("Finish at" & span.Hours & " Hour(s) " & span.Minutes & " Minute(s) " & span.Seconds & " Second(s) " & span.Milliseconds & " Millisecond(s) used")

                    Optimizer(maxArrayLength, neutralValue, i, h, timeStart, MainObject)
                    span = Now.Subtract(timeStart)
                    Console.WriteLine("Optimization Complete: " & span.Hours & " Hour(s) " & span.Minutes & " Minute(s) " & span.Seconds & " Second(s) " & span.Milliseconds & " Millisecond(s) used")

                    Erase MainObject.arrayCatalyst2
                End If

            Next

            Erase MainObject.arrayCatalyst1
        Next

    End Sub

    Public Sub ReadData(ByVal strType As String, ByVal strSelectedFile As String, ByRef objectBlockArray() As ObjectBlock, ByVal dtFrom As Date, ByVal dtTo As Date)
        Dim fileName As String = fileLocation & strType & "\Compilation\" & strSelectedFile & ".csv"
        Dim timeStart As Date = Now()

        Try
            Dim fileLineArray() As String = System.IO.File.ReadAllLines(fileName)
            Dim strArray() As String
            ReDim Preserve objectBlockArray(fileLineArray.Length - 1)
            Dim span As TimeSpan = Now.Subtract(timeStart)
            'Console.WriteLine("Read Only " & span.Hours & " Hour(s) " & span.Minutes & " Minute(s) " & span.Seconds & " Second(s) " & span.Milliseconds & " Millisecond(s) used")

            For i = 0 To fileLineArray.Length - 1
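                Dim strline As String = fileLineArray(i)
                strArray = strline.Split(","c)
                ' Convert each field explicitly; arguments follow the constructor's parameter order
                objectBlockArray(i) = New ObjectBlock(CDate(strArray(0)), CDbl(strArray(1)), CDbl(strArray(2)), CDbl(strArray(3)), CDbl(strArray(4)), CDbl(strArray(5)))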
            Next

            'span = Now.Subtract(timeStart)
            'Console.WriteLine("Completion " & span.Hours & " Hour(s) " & span.Minutes & " Minute(s) " & span.Seconds & " Second(s) " & span.Milliseconds & " Millisecond(s) used")

        Catch ex As Exception
            MsgBox(ex.Message.ToString())
        End Try

    End Sub

    Public Sub CatalystProcessing(ByRef arrayObjectBlock() As ObjectBlock, ByVal catalystPeriod As Integer)

        Dim Catalyst_Array(arrayObjectBlock.Length - 1) As Double
        Dim Multiplier As Double = 2 / (catalystPeriod + 1)

        For h = 0 To arrayObjectBlock.Length - 1

            If h < (catalystPeriod - 1) Then
                Catalyst_Array(h) = 0
            ElseIf h = catalystPeriod - 1 Then
                Dim totalEnd As Double = 0

                For i = h - (catalystPeriod - 1) To h
                    totalEnd += arrayObjectBlock(i).dbEnd
                Next

                Catalyst_Array(h) = totalEnd / catalystPeriod
            Else
                Dim currentClose As Double = arrayObjectBlock(h).dbEnd
                Catalyst_Array(h) = (currentClose - Catalyst_Array(h - 1)) * Multiplier + Catalyst_Array(h - 1)
            End If
        Next

        Catalyst_Array(catalystPeriod - 1) = 0

        'temporary store them into file to release resource for other usage
        Try
            Dim textFile As String = fileLocation & "\Temp\Catalyst.csv"
            ' Using closes the writer even if a write throws
            Using outFile As IO.StreamWriter = My.Computer.FileSystem.OpenTextFileWriter(textFile, False)
                For h = 0 To Catalyst_Array.Length - 1
                    outFile.WriteLine(Catalyst_Array(h))
                Next
            End Using
        Catch ex As Exception
            MsgBox(ex.Message.ToString())
        End Try

    End Sub

These are the steps I've taken to reduce memory consumption.

1. Pass big chunks of data by reference, and use pass by value only for single values.
2. Separate the processing into different subs, so that when a sub exits its memory is freed for other processes.
3. Always set arrays = Nothing at the end of usage.
4. I tried to free the threads' memory consumption, but I'm not sure I'm doing it right. The code is shown right below.

VB.NET:
        If ThreadUtilization >= 1 Then
            theProcess1 = New DataProcess(MainObject, maxArrayLength, neutralValue, i, h, timeStart, 0)
            t1 = New Thread(AddressOf theProcess1.Test)
        End If

VB.NET:
        While Not ThreadFound

            If Not ThreadFound And ThreadUtilization >= 1 Then
                If Not t1.IsAlive Then
                    theProcess1 = Nothing
                    t1 = Nothing
                    System.GC.Collect()
                    theProcess1 = New EMACrosses(MainObject, maxArrayLength, neutralValue, i, h, timeStart, EvenUp * 5)
                    t1 = New Thread(AddressOf theProcess1.Test)
                    t1.Start()
                    ThreadFound = True
                End If
            End If
            System.Threading.Thread.Sleep(50)
        End While

Are there any practices that would allow the application to run using minimal memory?
You'll use much less memory by changing ReadAllLines to a StreamReader and its ReadLine method. This applies to reading the DataFiles. For this, also change the ObjectBlock arrays to List(Of ObjectBlock); you can set an estimated initial capacity to lessen the burden of the internal array resizing.
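
A minimal sketch of that change, assuming the six-column layout used by the posted ReadData routine. `ReadBlocks` and `estimatedRows` are hypothetical names, and the ObjectBlock below is a trimmed copy of the posted class so that the snippet stands alone.

```vbnet
Imports System.Collections.Generic
Imports System.IO

' Trimmed copy of the ObjectBlock class from the question
Public Class ObjectBlock
    Public dtDate As Date
    Public dbStart, dbMax, dbMinimum, dbEnd, dbAccumulation As Double

    Public Sub New(ByVal dtDate As Date, ByVal dbStart As Double, ByVal dbEnd As Double, ByVal dbMinimum As Double, ByVal dbMax As Double, ByVal dbAccumulation As Double)
        Me.dtDate = dtDate
        Me.dbStart = dbStart
        Me.dbEnd = dbEnd
        Me.dbMinimum = dbMinimum
        Me.dbMax = dbMax
        Me.dbAccumulation = dbAccumulation
    End Sub
End Class

Public Module StreamingReadSketch

    ' Reads the CSV one line at a time, so only the current line is held as a String.
    ' estimatedRows is a caller-supplied guess (hypothetical parameter), not a hard limit.
    Public Function ReadBlocks(ByVal fileName As String, ByVal estimatedRows As Integer) As List(Of ObjectBlock)
        Dim blocks As New List(Of ObjectBlock)(estimatedRows)

        Using reader As New StreamReader(fileName)
            Dim line As String = reader.ReadLine()

            While line IsNot Nothing
                Dim parts() As String = line.Split(","c)
                blocks.Add(New ObjectBlock(CDate(parts(0)), CDbl(parts(1)), CDbl(parts(2)), CDbl(parts(3)), CDbl(parts(4)), CDbl(parts(5))))
                line = reader.ReadLine()
            End While
        End Using

        Return blocks
    End Function

End Module
```

If the estimate is too low the list still grows on its own; if the rest of the program needs a plain array, `blocks.ToArray()` gives one at the cost of a single copy.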

You'll also get much better performance and lower memory usage by changing the part where you write an array to a temp file and read it back into a new array, so that you copy one array to another directly. All these file operations involve a whole lot of string operations that are costly and unnecessary. If it were necessary to write/read a file on each iteration, you should use BinaryWriter/BinaryReader with Write(Double)/ReadDouble to avoid String conversions. Though I think none of the file and string operations are needed, and once you get rid of them you'll have a much larger working space. Also, is it really necessary to create the temp Catalyst_Array, process it, then copy it to arrayCatalyst2? Can't you use arrayCatalyst2 directly?
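
For the case where a temp file really is needed, here is a rough sketch of the binary approach. `SaveDoubles` and `LoadDoubles` are hypothetical helper names, and storing the element count ahead of the values is just one possible layout.

```vbnet
Imports System.IO

Public Module BinaryCacheSketch

    ' Writes the element count, then each Double in its raw 8-byte form (no String conversion).
    Public Sub SaveDoubles(ByVal fileName As String, ByVal values() As Double)
        Using writer As New BinaryWriter(File.Open(fileName, FileMode.Create))
            writer.Write(values.Length)

            For Each value As Double In values
                writer.Write(value)
            Next
        End Using
    End Sub

    ' Reads the count back first, so the array can be sized once before filling it.
    Public Function LoadDoubles(ByVal fileName As String) As Double()
        Using reader As New BinaryReader(File.Open(fileName, FileMode.Open))
            Dim count As Integer = reader.ReadInt32()
            Dim values(count - 1) As Double

            For i As Integer = 0 To count - 1
                values(i) = reader.ReadDouble()
            Next

            Return values
        End Using
    End Function

End Module
```

Each value takes exactly 8 bytes on disk and comes back bit-for-bit identical, which is not guaranteed when a Double goes through its text form.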
 
Perhaps you could give us a sample of the data and describe the operations you're performing on it (e.g. are you calculating the average X of some columns and then writing a new CSV with the average appended?). I'm fairly sure your program could be made a lot more minimal.
 
Yes, thanks for the reminder about StreamReader. I had almost forgotten that I can get the total line count using .Length rather than using ReadAllLines to find out how many lines the array will need. As for changing the arrays into lists, what would be the purpose of this? I've tried to reduce the overhead of the arrays by specifying the required size. A list is dynamic, but if I'm specifying the size then it would be the same, wouldn't it? Otherwise the list doubles its capacity and consumes even more memory. I've also read a list vs. array debate where someone tested that retrieving values from an array is much faster than from a list, which is critical for my app as it does a lot of calculations using these stored objects.

For the temp Catalyst_Array you mention, do you mean the temp file? I've now changed the sub to a function, so that it returns the catalyst array directly to arrayCatalyst1 and arrayCatalyst2 instead of retrieving it again from the file.

Again, thanks for the reply.
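
A minimal sketch of what such a function might look like, mirroring the arithmetic of the posted CatalystProcessing sub. The name `ComputeCatalyst` and the `endValues` parameter (one dbEnd value per row) are assumptions for illustration, and the sketch assumes catalystPeriod is at least 1.

```vbnet
Public Module CatalystSketch

    ' Returns the catalyst series directly instead of writing it to a temp file.
    ' endValues is a hypothetical array holding the dbEnd value of each row.
    Public Function ComputeCatalyst(ByVal endValues() As Double, ByVal catalystPeriod As Integer) As Double()
        Dim result(endValues.Length - 1) As Double
        Dim multiplier As Double = 2 / (catalystPeriod + 1)

        For h As Integer = 0 To endValues.Length - 1
            If h < catalystPeriod - 1 Then
                ' Not enough rows yet for a full period
                result(h) = 0
            ElseIf h = catalystPeriod - 1 Then
                ' Seed value: simple average of the first catalystPeriod rows
                Dim total As Double = 0

                For i As Integer = 0 To h
                    total += endValues(i)
                Next

                result(h) = total / catalystPeriod
            Else
                ' Smooth the current value against the previous result
                result(h) = (endValues(h) - result(h - 1)) * multiplier + result(h - 1)
            End If
        Next

        ' The posted sub zeroes the seed entry after the loop; kept here for parity.
        ' The bounds check is an addition so short inputs do not throw.
        If catalystPeriod - 1 < result.Length Then
            result(catalystPeriod - 1) = 0
        End If

        Return result
    End Function

End Module
```

With this shape the caller can assign the result straight to the target field, for example `MainObject.arrayCatalyst1 = ComputeCatalyst(endValues, i)`, so no intermediate file or String array is involved.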
 