10000 Data Items

papa_k · Member · Joined Jun 1, 2009 · Messages: 22 · Programming Experience: 1-3
Hi all,

I am not new to programming, but it is more of a hobby than anything for me. At the moment I am trying to work out the best way to get my application (written in VB.NET 2008) to load and display about 10,000 rows of data.

Now, at the moment the volume of data is approximately 5,000 rows, but I am sure it will increase.
The data is currently stored in a text file.
This data is interrogated and then loaded into the application; the problem is that a loop takes far too long. I am loading into a DataSet, which can be assigned to a data grid in the application. However, when I do that and load the DataSet, the application takes about 4 minutes and up to 1.5 GB of memory.

When it's finished it then clears down the 1.5 GB but still runs at about 120 MB, which to me seems stupid.

What is the most efficient way to get that large amount of data into the application and scrollable/viewable by the user?

I am happy to receive any suggestions on how best to store the data (the text file isn't the best way, I believe), how to retrieve it from the file quickest, and how to display the data on the screen.
 
Hello.

For data this size a database (MySQL, Firebird) would be a good fit, since databases are made for storing large amounts of data.
That said, I believe you won't be able to store it any more simply than in the text file. The question is how you read this file, and how many columns your DataGridView has. It also depends on the machine; my PC (Intel Dual Core @ 2 GHz) can show 3,000 rows with 5 columns from a database within a few seconds.

VB.NET:
        ' requires Imports System.IO and Imports System.Text at the top of the file

        ' temporary variable that will hold one line;
        ' declared once, outside the loop, so it is not re-created on every iteration
        Dim line As String = String.Empty

        ' open a StreamReader for the file
        Using rdr As New StreamReader("C:\yourPath.txt", Encoding.Default)
            While Not rdr.EndOfStream
                line = rdr.ReadLine()

                ' example of how to fill the DataGridView:
                ' Rows.Add(Object()) - each array entry maps to the column at that index
                ' (the fixed-width Substring offsets are just an example layout)
                Me.DataGridView.Rows.Add(New String() {line.Substring(0, 10), line.Substring(10, 10), line.Substring(20, 10)})
            End While
        End Using

I can't think of a way which could possibly be faster than this, at least if you stay with the text file.

Bobby
 
If you're loading it into a DataSet, then why not store the data in a database or an XML file instead of plain text? You'll be able to retrieve the data without a loop.
 

Using XML interests me - I've not used it before though; is it easy to pick up?
 
Have a look at the DataSet.ReadXml(String) and DataSet.WriteXml(String) methods. I think you'll be pleasantly surprised.
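For example, persisting and reloading a whole DataSet is a one-liner each way (a minimal sketch - the table, column and file names here are only placeholders):

```vbnet
' build a DataSet with one table (names are just examples)
Dim ds As New DataSet("Library")
Dim songs As DataTable = ds.Tables.Add("Songs")
songs.Columns.Add("Title", GetType(String))
songs.Columns.Add("Artist", GetType(String))
songs.Rows.Add("Some Title", "Some Artist")

' persist the whole DataSet, schema included, in one call - no loop needed
ds.WriteXml("C:\songs.xml", XmlWriteMode.WriteSchema)

' ...and load it back the same way
Dim ds2 As New DataSet()
ds2.ReadXml("C:\songs.xml")
```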
 

Sounds good.

I think saving the DataSet out should be quite easy using the built-in functionality in .NET/VB.

I might still have the problem of initially harvesting the data.

Basically I have 5,000 or so songs on my machine. My code goes and gets the file path of each song, then puts them into a DataSet. Loading the DataSet takes a while (4 mins or so), which isn't too bad. However, the issue is when I render the DataSet while it is being loaded at the same time; this is when I get the massive 1.5 GB of memory use.
Completely not acceptable.

I will have a look this evening, but I have a concern that getting my DataSet of 5,000 entities presented on the screen could continue to cause an issue.

Have people presented this sort of amount of data using VB.NET before? If so, what techniques did you use?

My machine is an HP - AMD Turion X2 Ultra Dual-Core Mobile Processor ZM-82, 2.2 GHz / 3 GB / 250 GB / Vista Home Premium (copied off the web)
 
It'll take a while to load the data initially, yes, but once that's done simply continue to use XML from that point on.

To speed up the initial loading a little, use a BackgroundWorker for the actual loading (the loop that takes forever to complete). That will run on a secondary background thread, which will run on whichever core of your system is least busy at the time of execution; basically it'll run as fast as possible because it's separate from your UI thread.
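A rough sketch of that pattern (it assumes a BackgroundWorker named "worker" dropped on the form in the designer; LoadAllRows is a hypothetical stand-in for your slow loop):

```vbnet
Private Sub Form1_Load(ByVal sender As Object, ByVal e As EventArgs) Handles MyBase.Load
    worker.RunWorkerAsync()   ' kick the slow loop off the UI thread
End Sub

Private Sub worker_DoWork(ByVal sender As Object, ByVal e As DoWorkEventArgs) Handles worker.DoWork
    ' runs on a background thread - do NOT touch any controls in here
    e.Result = LoadAllRows()   ' hypothetical method containing the long loop
End Sub

Private Sub worker_RunWorkerCompleted(ByVal sender As Object, ByVal e As RunWorkerCompletedEventArgs) Handles worker.RunWorkerCompleted
    ' back on the UI thread - now it's safe to hand the finished data to the grid
    Me.DataGridView1.DataSource = CType(e.Result, DataTable)
End Sub
```

The key rule is that DoWork must never touch controls; all UI updates go in RunWorkerCompleted (or ProgressChanged), which the component raises back on the UI thread for you.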
 
I've done this sort of thing a few times.

I would-

1) Set up a database using the facility in VB2008.
2) Write code to import the data from the text file into the database (it may be slow, but it only has to be done once).
3) Use the database from then on.
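To give a flavour of step 2, the import loop might look like this (a sketch only - it assumes a local SQL Server Express .mdf with a Songs table already created; the connection string, paths and column names are placeholders to adjust to your setup):

```vbnet
' requires Imports System.Data.SqlClient and Imports System.IO
Using conn As New SqlConnection("Data Source=.\SQLEXPRESS;AttachDbFilename=|DataDirectory|\Songs.mdf;Integrated Security=True")
    conn.Open()
    ' one transaction around the whole import makes thousands of inserts much faster
    Using tx As SqlTransaction = conn.BeginTransaction()
        Using cmd As New SqlCommand("INSERT INTO Songs (Path) VALUES (@path)", conn, tx)
            cmd.Parameters.Add("@path", SqlDbType.NVarChar)
            For Each file As String In Directory.GetFiles("C:\Music", "*.mp3", SearchOption.AllDirectories)
                cmd.Parameters("@path").Value = file
                cmd.ExecuteNonQuery()
            Next
        End Using
        tx.Commit()
    End Using
End Using
```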

If this seems appealing I can give more detail.

Steve
 
hey guys, I'm really concerned about speed issues with data access in VB.NET. VS2008 runs a little slowly in debug mode and at start-up on my computer, so I don't know if it's the DataSet/DataAdapter that's slowing down the project I'm making. Is there a faster way of retrieving, adding, editing and deleting data than a DataSet? An OleDbDataReader maybe, but I'm not familiar with this. I also fill a ListView to show the items from the database.

thanks.
 
and display the data on the screen.

Tell me, truly... what do you really think a user is going to want to do with 10,000 items on his screen? If you have 10,000 MP3s in your Winamp, do you honestly just scroll up and down them and look at them?
 
hey guys, im really concerened about speed issues on data access on vb.net. vs2008 runs a little slow at debugging mode and at start up on my computer
I take it you didn't notice the hard disk thrashing away, compiling your app, loading debug symbols and setting up your environment for development?

Visual Studio 200x is orders of magnitude slower at starting an Edit-and-Continue debug session than VB6 is/was. Something you'll have to get used to, I'm afraid.

is there a faster way of retrieving, adding, editing and deleting of data rather than a dataset?
No.

i also fill a listview to show the items from the database.
Ensure you're not downloading too much data and then copying it all into a ListView. If you're a legacy VB6 programmer you'll have picked up a few bad habits, of which this is one. Read up on MVC and how data is stored in a Model (DataSet), and how a View/Controller (ListView) then accesses the model. We don't store data in ListViews themselves any more, because they are VCs, not Ms.
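In other words, instead of looping rows into the control, hand the control a reference to the model (a sketch - the file path and control name are examples):

```vbnet
' fill the model once...
Dim ds As New DataSet()
ds.ReadXml("C:\songs.xml")   ' or adapter.Fill(ds) from a database

' ...then point the view at it; no per-row copying loop
Me.DataGridView1.DataSource = ds.Tables(0)
```

Note that a ListView has no DataSource property, which is one reason a data-bound DataGridView is the better fit here.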
 
Got this sorted in the end.

My code goes through the folder structures, gets the MP3 files and pushes them onto a stack; the stack dumps them out to a list, which I then trawl through, get the MP3 details from, and store in a DataSet.

At the end of the processing the data is dumped into a table, and the table is then associated with the datagrid on screen.

It actually does it all quite quickly, apart from the MP3 analysis, but I can always try to get that done in a background task.

Thanks for your assistance all, I will be back I am sure . . .
 
So you put data into a stack, which is a data container
Then take it out and put it into a list, which is a data container
Then take it out and put it into a data table, which is a data container

I know memory is cheap these days, but doesn't it strike you that there is an optimisation to be made here?
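For instance, the stack and the list could be skipped entirely by writing each file straight into the DataTable as it is found (a sketch - ReadId3 is a hypothetical stand-in for whatever tag-reading code you use):

```vbnet
' requires Imports System.IO
Dim table As New DataTable("Songs")
table.Columns.Add("Path", GetType(String))

' one pass: no intermediate Stack or List, each path goes straight into the table
For Each file As String In Directory.GetFiles("C:\Music", "*.mp3", SearchOption.AllDirectories)
    table.Rows.Add(file)   ' ReadId3(file) could fill further columns here (hypothetical helper)
Next

Me.DataGridView1.DataSource = table
```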
 

I couldn't agree more.

Currently the push of data onto the stack is very quick. It can get data onto the stack extremely quickly, but I need to interrogate that data and pull attributes off it - song title, artist, etc.

I pop each file location off, read the ID3 data and then place it into a datagrid.
Once I have gone through the stack I then write the datagrid out to an XML file to store the data.

Do you have suggestions on how to get around this? Open to suggestions - the process is working fine at the moment though. The bit that takes the time is the interrogation of the data (pulling the artist etc). I need to see if I can push that to a background task if a quicker way can't be found - that is a "nice to have" and not essential at the moment though.

Papa
 