I agree with you that a DataTable is not the right place to load a million rows - I was using it to show that the performance of ADO.NET has greatly improved when it comes to large DataSets. If 1,000,000 rows can be loaded into a DataTable in an acceptable amount of time, then surely 10,000 isn't too much for it to handle.
I'm not sure that load time is a great indicator of performance. Effectively, all you're testing is the speed of the network connection; rows are loaded into the table merely by being copied into client memory. I'd imagine that loading a hash table with a million entries would take a similarly minimal amount of time. Of course, loading and manipulating are different things.
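To put a rough number on that intuition, here is a hypothetical micro-benchmark (a Python sketch, not ADO.NET): filling an in-memory hash table with a million entries is a sub-second operation on modern hardware, because there's no parsing, no network I/O, just memory writes.

```python
import time

# Hypothetical micro-benchmark: load 1,000,000 key/value pairs
# into a plain hash table (a Python dict) and time it.
start = time.perf_counter()
table = {i: ("row", i) for i in range(1_000_000)}
elapsed = time.perf_counter() - start

print(f"Loaded {len(table):,} entries in {elapsed:.3f}s")
```

On a typical machine this finishes in a fraction of a second, which is why raw load time says more about memory bandwidth than about how well the structure handles queries or edits afterwards.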
I'm being selective in loading only 10k rows, yes. I could have designed the tool to load 10-100 rows with every click of the mouse, or just the 10k I need in one shot. I leave the other 20 million rows in the database, where they rightly belong.
What I can't understand is this: typically, we use DataSets to download some data - maybe up to 1,000 rows if a customer has really made a lot of orders this month, but usually closer to, er, 2 or 3 rows. We edit them, and we send them back.
What can your program possibly do that means that
a) the user needs to see 10,000 rows of info
b) the user can effectively manipulate 10,000 rows of info (that's what update statements are for)
c) the user needs to do something significant with 10,000 rows of info
?
That's what I'm having difficulty understanding. I can think of NO situation where 10,000 lines of info is an acceptably small amount of information that a user can effectively look at, navigate, use, add up, consider, and not get lost in.
It actually handles it very well, with the singular exception of the Expression column. All of my data is loaded in < 5 seconds, and reaction time is instantaneous. With the expressions, my load time jumps to 2 minutes.
Expressions probably aren't compiled into .NET code; they will likely be interpreted every time: the string parsed into numbers, the operators parsed, then something like if "+" then a + b, else if "-" then a - b, and so on.
You're not looking at something quick here; you're looking at a simple, scripting-level intelligence add-on to a DataTable that is intended to maybe add up 10 items, or give Cost * Qty = Price. DataSets are not relational databases. Expressions are no substitute for analytical queries (rank() over (partition by group_number order by rec_date desc)) and other advanced SQL.
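To illustrate the kind of per-row interpretation I mean, here's a sketch in Python (this is illustrative only - it is not how DataColumn.Expression is actually implemented): the expression string gets re-parsed and the operator dispatched by string comparison for every single row, which is exactly the cost that multiplies across 10,000 rows.

```python
# Sketch of a string-interpreted expression evaluator, the kind of
# per-row work an expression column plausibly does. Illustrative only.
def evaluate(expr: str, row: dict) -> float:
    left, op, right = expr.split()   # re-parse the string on every call
    a = row.get(left)
    b = row.get(right)
    a = float(a) if a is not None else float(left)   # column or literal
    b = float(b) if b is not None else float(right)
    if op == "+":                    # dispatch on the operator text
        return a + b
    elif op == "-":
        return a - b
    elif op == "*":
        return a * b
    elif op == "/":
        return a / b
    raise ValueError(f"unsupported operator: {op}")

rows = [{"Cost": 2.5, "Qty": 4}, {"Cost": 10.0, "Qty": 3}]
prices = [evaluate("Cost * Qty", r) for r in rows]   # parsed once per row
# prices -> [10.0, 30.0]
```

A compiled approach would parse "Cost * Qty" once and reuse the result; an interpreter like this pays the parsing and dispatch cost on every row, which is consistent with a load time jumping from seconds to minutes once expressions are involved.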
I do understand how quickly data can be spit out of a database - I also understand connection overhead.
.NET pools connections; there is virtually no overhead.
Opening a cursor to read 10,000 rows ties the database up for longer than reading 10 rows does. If 100 people all load 10,000 rows, don't you think the database will be busier than if they each asked for 10 rows?
If the server is limited to X connections, then keeping those connections open for longer (10,000 rows) lengthens your reaction time as queries queue up, waiting for resources to be released.
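The queueing effect is easy to sketch with back-of-envelope arithmetic (all the numbers below are made up for illustration): if a pool allows X concurrent connections and each query holds one for t seconds, identical queries are effectively served in batches of X, so the last of N queries waits roughly ceil(N/X) * t.

```python
import math

# Back-of-envelope queueing sketch with made-up numbers: a pool of
# `pool_size` connections, each query holding one for `hold_seconds`.
def wait_for_last_query(pool_size: int, hold_seconds: float, queries: int) -> float:
    """Time until the last of `queries` identical queries finishes,
    assuming they are served in batches of `pool_size`."""
    batches = math.ceil(queries / pool_size)
    return batches * hold_seconds

# 100 users through a pool capped at 10 connections:
# a 10,000-row read holding a connection for (say) 2s each, versus
# a 10-row read holding it for 0.02s each.
big = wait_for_last_query(pool_size=10, hold_seconds=2.0, queries=100)
small = wait_for_last_query(pool_size=10, hold_seconds=0.02, queries=100)
print(f"big reads: last user waits ~{big}s; small reads: ~{small}s")
```

The exact timings are invented, but the shape of the result isn't: hold times scale the queue linearly, so hundred-fold bigger reads mean hundred-fold longer waits for everyone stuck behind them.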
Do you know of any source that describes the size constraints of DataSets?
The memory available on the client machine. That doesn't mean it is sensible to load 2 GB of data into a DataSet.