DataSet?

Stonkie

Well-known member
Joined
Sep 12, 2007
Messages
279
Programming Experience
1-3
I have to build a pretty small Windows application with a list of items in a DataGridView (hidden "Id" column and displayed "Name" column). I can easily select those two columns and fill a DataTable with it to display on the DataGridView, but the actual table contain 20 000 records (in my stress tests, nominal number of records around 5 000 - 6 000) with 20 - 30 columns. It is impossible to select all the data from all the columns in memory as this takes too much memory for the current system constraints.

Now I also have a "Details" Form which must show informations about all those 20 - 30 columns along with some many to many relationships to other tables which must show a list of related entries (the Color table contains the colors with Id and Name and a secondary table contains the MainItemId and ColorId foreign keys). This detailled form only shows a single record at a time, but can be opened multiple times with different records and both allow the user to modify the information (if both have the same file open, the concurrency problems can be detected on the server and the user will just have to know someone has already modified the record, that someone being himself).

In short, this is a complete database with lots of tables, relationships and records. The database is on Sql Server 2005 and the system must support optimistic concurrency. I use VS2005 and .NET 2.0 and the application must be production quality (sending patches is by CD, expensive and people just never install them anyway). And as usual, it was late before it began...

--

My question is how should I work through this?

Should I use a simple query and DataTable to show in the DataGridView, then use a strongly typed dataset to contain the item information and its associated data from other tables. This approach forces me to use counters of which records are in use so I can remove them from the DataSet when they are not used which in turn may lead to memory leaks, especially if I use data binding on DataViews of the DataTables (because delegates can keep a complete object alive).

Or should I use a Strongly Typed DataSet just for the easy access to the data and pass a custom data holder for my data so I may keep the DataSet effectively empty (using only GetData instead of Fills to the DataSet), but I am forced to code an "IsModified" property and some sort of optimistic concurrency check (possibly modification counters in the database). I would then have to manually call a few update/insert/delete methods from my TableAdapters, but I think I can go through this without using any SQL (except from within the DataSet Designer obviously).

A quick and dirty solution would be to keep only modification counters on the main table and delete and reinsert any information from the secondary tables (for the current main record, not all the table...) to avoid keeping counters on those tables too...

At the moment, I think the second option is more viable, but maybe someone has a better idea or some argument I am unaware of... I am quite knowledgeable, but I need something to feed the though process...
 
If you enable Use Optimistic COncurrency when you make the table adapter, then the UPDATE statement suddnely becomes a massive bunch of unreadable crap that basically includes every value to be updated and a check whether the database contains that value or not (or if that value is now null

I present an example:

UPDATE table SET col = @new_col WHERE pk = @pk AND (col = @old_col OR col is null)

The bold bit is the optimistic bit, that no data will have changed: @new_col will be populated with the DataRow New value. @old_col will be pop with DataRow's Old value (datarow stores both the original value d/l and the new value the user typed)
With a relevant select query.. two users d/l a record
Both users edit the record
One user hits save
1 row updated
Then the other user hits save
The other user's query will affect 0 rows
That user's app will experience a ConcurrencyException (or some kind of exception)
Upon hitting the exception, you the developer must decide what to do. You can d/l the new record too and say to the user "hey, someone else changed this data before you did.. do you want to keep their edits shown here, or overwrite them, or if you want to merge them, edit them below?"

Comprends?
 
Je comprends :)

I've built an app on that principle with an additionnal SQL Query that did not include the conditions to force update after telling the user... Concurrency is not my problem here. I guess I went so far in explaining my situation that the point got lost...

What actually happens is that my app takes too much memory with the DataSet filled so I need to either fill it with only the few records I need and keep counters for each record's use (to remove them from the DataSet/memory when not used). Maybe using a new DataSet instance for each record would work (they'd be garbage collected when not used anymore, but that feels wrong and I'll have to manage the secondary tables in multiple copies!)...

My current best solution is to use the TableAdapters to manage the data and use a custom object to hold each record's information. I'd then uses a partial class to add methods in my TableAdapters to input/output the object to/from the database.

The big data holder would have the name, description, etc. informations from the main table, and also the colors, shapes, categories, etc. informations from the secondary tables. Sometimes you would like to be able to draw stuff on message boards! :)

A concrete example is I got my main table, the colors table and an intermediate "secondary" table between the two for the many to many relationship. I would include both the record from the main table and a list of color ids stored in the secondary table in the holder object so I can clear it all from memory when it is not used anymore. The colors would be in a DataTable so I may create DataViews on it for my GUI as I do not have to update them very often.

I just wondered if there was some widely used way to virtualize the DataTables like you can do for the DataGridView...

Thanks for the reply!
 
Last edited:
I think youre thinking about this way too much. The idea of having a dataset is NOT so that you can downlaod your whole database into it. You download a bit, edit it, and upload it again.

The data stays in the db, you take it out and are as quick as possile about editing and returning it. DataTables have DataRows that maintain original and new values, and have a state, updated, added, deleted -> the state is used to know what DML sql to call when syncing the DB< which the TableAdatper does vis the Update() call

follow this tutorial and you'll see what i'm talking about:
http://msdn2.microsoft.com/en-us/library/ms171884(VS.80).aspx


It's all a lot easier than youre thinking at the moment..
 
The problem with this occurs in the myForm.load event (I think that's where the myTableAdapter.Fill(myDataset.myTable) call is inserted by the designer). This basically throws an OutOfMemoryException (it works on my dev machine but it takes like 200 - 300 mb of RAM)... Just for that single table and I got others to bring into memory!

I need this to fill the table with only the few records I need (and their related records from the secondary tables)...

VB.NET:
myTableAdapter.ClearBeforeFill = false; // or something like that
myTableAdapter.FillById(myDataset.myTable, currentId)

mySecondaryTableAdapter.ClearBeforeFill = false; // or something like that
mySecondaryTableAdapter.FillByMainRecordId(myDataSet.mySecondaryTable, currentId);

This code would work for inserting the data as I need it, but since many details forms can depend on each record, I need to make sure it is not already in the DataTable and I must keep counters to remove them from the DataTable when no more details form use them or some other way to tell which are not in use anymore (I cannot have multiple copies of a same DataRow in a DataTable like one per details form either)... Otherwise this will just delay the time it takes to load too many records and crash.

I would like to find some way for the DataTable (or a DataView?) to dynamically fetch the data from the database as I transparently access the cache that is the dataset so the cache doesn't actually contain anything but my modified data and the records currently bound in the application.

I'm pretty sure that is not possible (too expensive is just the same as impossible), but there must be some best practice way to display database data on a form when the data cannot be held totally in memory.
 
read the dw2 link in my sig, first the section on Creating A Simple Data App which teaches all the basic bits such as d/l data from a db, edit it, show it in related fashion and save it

Then read Creating a Form to Search Data
which shows you parameterised queries so you arent downloading the whole database into your client app ;)

Then read Displaying/Saving Related Data
etc
 
Ok, this is pretty straigh forward, so I made the parametrized query that fills the DataTable with a single item and its related data (actually I made a partial class of my TableAdapter and created a method that fills the datatable with the record and uses the other TableAdapters to fill all the related data as well in a single method call... I don't know if having the TableAdapters depend on each other is a breach of concepts :confused:, but this is just to make sure it is actually possible and I don't reach a blockade later in the actual project.).

By the way, it required a bit of a hack to force multiple TableAdapters use the same transaction as I don't want any partial updates...

Now, from my main form, the user can choose an item from the list and show a form that displays its details or another that displays its prices. Both forms load the data into the dataset using the AddItem(int id) method I created in the Load event and use another method called RemoveItem(int id) which removes the item and its related data from all the right DataTables in the FormClosed event.

My problem was that the user has to be able to open a details form on an item and open the prices form for that same item, both of those using the same informations from the dataset at once.

I chose to add a counter for each item so that I count the number of calls to AddItem(int id) to determine how many times an item is being used and decrement that value with RemoveItem(int id) so that AddItem adds the item only if the count is 0 and RemoveItem removes it only if the count is 1. I'll try to code this counter system and see if things work as excepted.

I was initially thinking this would be much more complex than it seems to turn out. Thanks! :D
 
By the way, it required a bit of a hack to force multiple TableAdapters use the same transaction as I don't want any partial updates...
I wrote a class to do that; you should have asked ;)


I chose to add a counter for each item so that I count the number of calls to AddItem(int id) to determine how many times an item is being used and decrement that value with RemoveItem(int id) so that AddItem adds the item only if the count is 0 and RemoveItem removes it only if the count is 1. I'll try to code this counter system and see if things work as excepted.

You've lost me, sorry! So long as you know where youre headed! ;)
 
I went in your posting history and reading 4347 posts :eek: seems a bit too time consuming to get that transaction enabled TableAdapter class... Would you mind to send it to me so I can improve/replace what I did when the time comes for the production application?

I simply added this method to the partial class of each automatically generated TableAdapter :

VB.NET:
Public Sub SetTransaction(ByVal transaction As SqlTransaction) 
    For Each command As SqlCommand In Me.CommandCollection 
        command.Transaction = transaction 
    Next 
    
    If Me.Adapter.DeleteCommand IsNot Nothing Then 
        Me.Adapter.DeleteCommand.Transaction = transaction 
    End If 
    
    If Me.Adapter.InsertCommand IsNot Nothing Then 
        Me.Adapter.InsertCommand.Transaction = transaction 
    End If 
    
    If Me.Adapter.SelectCommand IsNot Nothing Then 
        Me.Adapter.SelectCommand.Transaction = transaction 
    End If 
    
    If Me.Adapter.UpdateCommand IsNot Nothing Then 
        Me.Adapter.UpdateCommand.Transaction = transaction 
    End If 
End Sub
 
that's pretty much what it does.. it jsut uses reflection to begin and apply a transaction to each one of the member commands (of which there are more than just the basic 4, if you added more) plus some methods to wrap it all up. So far it's worked well, but I have some things I need to do to further aimprove things such a row-level updates. currently if a transaction fails, the db is not updated but every row in the datatable that has been saved has its rowstate set to unchanged at the row level.. its an oversight on microsoft's part i think, but you end up with a table that has unchanged rows that arent in the database.. just trying to work round that one, probably by setting the option to continue-on-error in the table adapter, and then wuerying haschanges myself when the update() returns. I dont have it here but when I get to work i'll make a post for the benefit of all here.. I'll also do more of a proper investigation into the behaviour (I started reading the update code using reflector, but man is it ever convoluted..)
 
You're right, I didn't think about the RowStates... It probably loses the true original values when you choose the "Refresh table data" option from the DataSet Designer to retrieve the Id keys of new rows so the new implementation has to bypass the "AcceptChanges" calls from the Update method and wait until the end of the transaction to call it on all necessary rows/tables (just guessing here).

Rewriting the whole Update method : I'll leave that work up to you :p !

Thanks, you really know your stuff! :D
 
Here's the transaction manager I spoke of. When I use it, i just use one of the constructors, sticking all my tableadapters in an Object() array. It would ahve been so much easier if MS had made TableAdapters descend from a common ancestor
 

Attachments

  • TableAdapterTransactionFactory.zip
    1.4 KB · Views: 25
Great, thanks!

I took a quick look and I wondered if you shouldn't also set the SelectCommand's transaction to get the Id value of newly created rows and stuff within the transaction scope.

Also, similarly to the RowStates, what would happen to the DataTable's actual data if the update method reloads auto-generated fields to the DataTable before a RollBack?

--
If I write myConnection.BeginTransaction(), is there any link between the connection and the created transaction? Would it be possible to extend the DbConnection class to achieve the result without any reflection? It could even be using another connection internally... (altough none of this is helping with the RowStates problem).

The SqlConnection and OleDbConnection classes are sealed though and I think the TableAdapters you create using the Designer use the specific connections depending on the data source since they don't implement an existing interface (I don't have any database system to try this at home yet)...
 
Great, thanks!

I took a quick look and I wondered if you shouldn't also set the SelectCommand's transaction to get the Id value of newly created rows and stuff within the transaction scope.
The SelectCommand is simply part of the collection of all commands that are not IUD, reflectively assigned a transaction in the loop you see in the code. If you look at the internal design of a TA having 10 different queries (different fillbyXXX for example) you will see that .CommandCollection is actually 11 elements long, and usually 0 is the original SelectCommand the other 10 being the other FillBys. Dont worry.. I got it covered :D


Also, similarly to the RowStates, what would happen to the DataTable's actual data if the update method reloads auto-generated fields to the DataTable before a RollBack?
That's the big problem with the MS impl. TableAdapter charges through the entire table, seemingly updates everything in the database, accepts changes on a per-row basis, hits a bump, throws an exception, we roll the transaction back, and are left with an un-updated database, and a datatable where half the rows have no changes even though the db doesnt contain equivalent data.
I cant recall if I ever noticed using TransactionScope would rectify this, but TransactionScope screwed my apps because the oracle MTS dll has a bug that makes it impossible to save save in a TransactionScope. I'm still working on a solution, which will i think in essence be a replacement of microsoft's row scanning and saving code.. There are a lot of things to consider though.. if I'm mid transaction and a database error occurs, did the database implicitly roll it back? i'd need to test it. Right now, lame as it may sound, I ask my users to edit and save only one row at a time until i can get time to return to the issue.

If I write myConnection.BeginTransaction(), is there any link between the connection and the created transaction?
Yes, if the db engine cannot support distributed transactions then transactions happen on a per-connection basis

Would it be possible to extend the DbConnection class to achieve the result without any reflection?
I dont think inheritance hierarchies work like that. You cannot modify OracleConnection to descend from your class instead of DbConenction. You cannot pass your class to an OracleCommand that wants an OracleCOnnection. If you can subclass OracleConnection you can, but then again, why bother when you can make the conenction yourself, give it to the tableadapter and it will use it.. (In the designer there is the option to make the connection modifer something else, e.g. public)The big nuisance is that tableadapters dont derive from anything. If they derived from TableAdapterBase that had all the main features like the .XXXCommand, .Connection etc then it would be a lot easier to work with without reflection. Reflection is still hundreds of times faster than MTS promotion though, which is what you get with TransactionScope.. Using TransactionScope used to visibly lock up my app. This doesnt.


It could even be using another connection internally... (altough none of this is helping with the RowStates problem).
No indeed.. that's either a bungling in think-through at MS' end, or a bungling in impl at our end. MS example code quite often calls GetChanges, and updates the duplicated rows that are returned. This is all very well and good but if your DB save routine updates the primary key on some kind of autonumber, you save a row of PK -1 but this becomes AC12345 on the return, I dont know how or if LoadDataRow can Upsert it to the original row. It would be nice if there was some internal ID of a row that didnt change and could be used to link a row in your table with a row in the GetChanges array (becaue they dont point to the same row, they are clones) so you could roll back everything (discard the getchanges array), have the user fix the row in error and attempt another save

The SqlConnection and OleDbConnection classes are sealed though and I think the TableAdapters you create using the Designer use the specific connections depending on the data source since they don't implement an existing interface (I don't have any database system to try this at home yet)...

You can read the TA code, i'ts not complex. Look in YOURDATASETNAME.Designer.vb

To try something at home, use an access DB.. the ODBC control panel can create one for you and linking it is very easy and fast
 

Latest posts

Back
Top