Faster Binary Reading of Single Data Types

mad_schatz

Hi Gents,

I'm new to VB.NET and I'm trying to learn it.
I have a binary file containing 300,000+ single-precision floating point numbers, which I'm reading with the following code:

VB.NET:
        Dim ReadStream As FileStream
        Dim llFileLen As Long
        Dim liArrayLen As Integer
        Dim lcFileName As String
        Dim laCoords() As Single

        liArrayLen = 0
        lcFileName = Application.StartupPath & "\World_new.dat"
        Me.TextBox1.Text = Now.ToString()

        ReadStream = New FileStream(lcFileName, FileMode.Open)
        Dim readBinary As New BinaryReader(ReadStream)
        llFileLen = ReadStream.Length

        'Read one Single at a time until the end of the file,
        'growing the array by one element for every value read.
        Do While ReadStream.Position < llFileLen
            ReDim Preserve laCoords(liArrayLen)
            laCoords(liArrayLen) = readBinary.ReadSingle()
            liArrayLen = liArrayLen + 1
        Loop
        Me.TextBox2.Text = Now.ToString()

        readBinary.Close()
        ReadStream.Close()

Unfortunately this reading process takes a long time: about 4 minutes.

My question is: is there any faster way to read Single values from a binary file?

Thanks to all
 
I wrote 300,000 single values to a binary file for testing. Reading the contents into a List(Of Single) took about 5 seconds on my system (low end Core 2 Duo).

VB.NET:
		Dim singleCollection As New List(Of Single)

		'PeekChar returns -1 once the end of the stream is reached.
		Using reader As New BinaryReader(File.OpenRead("C:\Temp\BinaryFile.dat"))
			Do Until reader.PeekChar() = -1
				singleCollection.Add(reader.ReadSingle())
			Loop
		End Using
 
Unfortunately this reading process takes a long time: about 4 minutes.

Eek!

ReDim Preserve laCoords(liArrayLen) would be your problem.

Every time you read one number, you re-dimension the array to make it one element larger. The problem is that to re-dim an array VB must create a new array and copy all the existing elements into it. Given that you re-dim it 300,000-plus times, and the average array size is 150,000 elements, that means around 45 billion copy operations of 4 bytes of data each (which may be stored in an 8-byte memory location). You're talking about shifting somewhere between 180 and 360 gigabytes of memory around, just to read a file of little more than a megabyte!
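As a rough back-of-the-envelope check (assuming exactly 300,000 values of 4 bytes each):

VB.NET:
Dim itemCount As Long = 300000
'Growing one element at a time copies 1 + 2 + ... + (itemCount - 1) elements in total,
'i.e. roughly 300,000 re-dims at an average of 150,000 elements each.
Dim totalCopies As Long = itemCount * (itemCount - 1) \ 2
Dim gigabytesMoved As Double = totalCopies * 4.0 / 1000000000.0   'about 180 GB at 4 bytes per element
Console.WriteLine(totalCopies.ToString("N0") & " element copies, ~" & gigabytesMoved.ToString("N0") & " GB moved")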

Stop using an array with ReDim Preserve. Use a data storage container that suits both your method of reading and what you plan to do with the data afterwards. At the very least, work out from the file size how many Singles it is likely to contain and pre-dimension an array of the appropriate size.
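Something along these lines should do it (a rough sketch only; BinaryHelpers and ReadSingles are just illustrative names, and it assumes the file contains nothing but 4-byte Single values):

VB.NET:
Imports System.IO

'BinaryHelpers and ReadSingles are illustrative names only.
Module BinaryHelpers
    'Work out the element count from the file length (4 bytes per Single),
    'dimension the array once, then fill it with a plain loop.
    Public Function ReadSingles(ByVal fileName As String) As Single()
        Using reader As New BinaryReader(File.OpenRead(fileName))
            Dim count As Integer = CInt(reader.BaseStream.Length \ 4)
            Dim values(count - 1) As Single
            For i As Integer = 0 To count - 1
                values(i) = reader.ReadSingle()
            Next
            Return values
        End Using
    End Function
End Module

In the original code that would replace the whole Do While loop with a single call: laCoords = BinaryHelpers.ReadSingles(lcFileName).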
 
Hi All,

Thank you for your replies, Mattp and cjard.

Cjard, you're right about the array. As soon as I stopped re-dimming it, the problem was solved. I never thought it could move that much data around.

The numbers are floating point coordinates that I use to plot a simple world map on a drawing surface.

It's nothing serious. I'm just playing around with VB.NET in order to learn.

Thank you very much for your help, gents.
 
It's the same story with strings: if you're doing a LOT of string concatenation you can end up moving gigabytes around. Use a StringBuilder to avoid similar performance hits.
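A minimal sketch of the idea (the loop and the csv variable are made up purely for illustration):

VB.NET:
'Each Append writes into the same internal buffer instead of
'allocating a brand new string the way & concatenation does.
Dim sb As New System.Text.StringBuilder()
For i As Integer = 1 To 100000
    sb.Append(i).Append(","c)
Next
Dim csv As String = sb.ToString()   'one final string allocation at the end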
 
As cjard says, use a data structure that's designed for what you're doing. In your case that would be a List(Of Single). A List is like an array in many ways but it will allow you to add and remove items dynamically, growing and shrinking as required.

Internally, the List uses an array and it will work similarly to what you're already doing in that, as the array gets filled, the system will create a new, larger array and copy the existing data to it. The difference is that the List will not increase the size of the array by just 1 each time. Whenever it needs to grow it will double the size of the array.

By default, initially there is no array. When you add the first item an array with a Length of 4 is created. Each time you try to add an item beyond the Length of the array the size is doubled, so it goes 0, 4, 8, 16, 32, 64, 128, 256, etc. You should be able to see that that will lead to far fewer reallocations than if you grow an array by one element each time. This provides a good trade-off between keeping the array size as small as possible and minimising the amount of reallocation.
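You can watch this happen by printing the Capacity property as items are added (a throwaway snippet just to show the growth pattern):

VB.NET:
Dim numbers As New List(Of Single)
Console.WriteLine("Capacity = " & numbers.Capacity.ToString())   '0 - no internal array yet
For i As Integer = 1 To 9
    numbers.Add(i)
    Console.WriteLine("Count = " & numbers.Count.ToString() & ", Capacity = " & numbers.Capacity.ToString())
Next
'Capacity jumps from 0 to 4, then 8, then 16 as items 1, 5 and 9 are added.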

Furthermore, if you know that your List is going to end up being large then you can specify an initial size for the internal array. If you know for a fact that your file will contain at least 300,000 entries then you should specify 300,000 as the initial capacity:
VB.NET:
Dim numbers As New List(Of Single)(300000)
That way your List will never have to reallocate until the 300,001st item is added. The capacity will then grow to 600,000 so another reallocation will not be needed for quite some time.

If you don't know with any sort of accuracy how many items there will be then just make a guesstimate that balances the desire to increase the capacity as few times as possible against the desire not to make the internal array much bigger than necessary. Once you're done adding items you can force one final reallocation to shrink the internal array to the exact size needed by calling the List's TrimExcess method.
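Putting the two ideas together for this particular file (a sketch only; the path is the test file from earlier, and the capacity estimate assumes 4 bytes per Single):

VB.NET:
Dim dataFile As String = "C:\Temp\BinaryFile.dat"
'Estimate the capacity up front from the file size: 4 bytes per Single.
Dim estimated As Integer = CInt(New System.IO.FileInfo(dataFile).Length \ 4)
Dim numbers As New List(Of Single)(estimated)

Using reader As New System.IO.BinaryReader(System.IO.File.OpenRead(dataFile))
    Do While reader.BaseStream.Position < reader.BaseStream.Length
        numbers.Add(reader.ReadSingle())
    Loop
End Using

numbers.TrimExcess()   'shrinks the internal array if the estimate was too generous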
 
I really appreciate your explanations. Thank you once more.
As I'm very new to VB.NET, I don't yet know how to do things in a better way.
I'll try to use List and learn its power.

Thanks
 
There are other collections too: Stack, Queue, Dictionary... Read up on them all and see the differences. Each situation calls for a different container.
 