concurrency questions

Okame

Member
Joined
Oct 7, 2010
Messages
6
Programming Experience
5-10
We have a small team of developers who currently program in a system similar to Access. We are looking to move over to something a bit more of an industry standard and are looking at developing in Visual Basic.NET and Microsoft SQL Server. I have some questions I hope someone can answer for me please. I apologise if they are slightly out of the scope of this forum.

1) Our current system is a heirarchical database that allows multiple developers to log in to the same application at the same time. It takes care of locking so two developers can't alter the same form (for example) at the same time. Is there a system similar to this that you can use with VB.NET? I have read about the Team System and Team Foundation Server but I'm still not clear if this will do what I want.

2) Again on our current system, the database server automatically updates records on other screens when a change is made by a user. I appreciate that SQL Server is not really designed to work this way and was thinking of creating my own 'server' to notify clients when a record has been changed. Does anyone have any experience of this? Is it feasible or am I looking at a lot of work when there is an acceptable, though not exact, alternative?

3) More specifically to do with VB.NET/ADO.NET: I have a reasonable amount of experience programming in VB6 and at that time I never used data-bound controls because in VB6 it just didn't work very well. Also, not necessarily related to bound controls, I don't like the idea of having huge datasets in memory all the time. I used to read the data from the data source, populate a listview for example and then close the dataset/connection. I have heard that datasets and bound controls are a lot better in .NET, would I be wrong to still use the old method or are there any valid reasons for still doing so (bearing in mind question 2, above)?

Any assistance would be gratefully received.
 
Errr...... I originally created this thread in the VB.NET general forum with a subject of 'Advice on Moving to VB.NET'. Not sure how it ended up in the SQL forum with a subject of 'concurrency questions', especially as the post is nothing to do with concurrency or SQL Server (at least, not in the implied sense)...

Could a moderator please move this back in to a more relevant forum?
 
It's a little unclear as to what part of your question relates to developers working on the app and what part relates to user working with the app. You seem to have plonked both in together. If so, they are different issues and belong in different threads.
 
Perhaps I should rephrase the questions:

1) Is it possible for multiple developers to work on the same VB.NET project at the same time? If so, how? Does it involve Team Foundation Server?

2) I was thinking of writing my own 'server' in VB.NET so that, for example, a record updated on one user's screen could be updated on all other users' screens who are viewing that record (without them having to click anything). Is there a better way of doing this?

3) I have heard that datasets and bound controls are a lot better in VB.NET than VB6, would it be a bad idea to still fill controls manually (taking in to account question 2, above)?


Maybe I should've made three different posts but even if I did, I'd struggle to think of a better forum to post them in than 'VB.Net General'. Perhaps question 2 would've been better in 'Database General' but I wanted people reading this to have question 2 in mind before answering question 3.
 
These are three essentially unrelated topics so, for future reference, please do keep each thread to a single topic and each topic to a single thread.

1. What you're looking for is called "source code control" or just "source control" and there are many options. Team Foundation Server is a good, very functional product but it is also very expensive. You would want to going to make use of the functionality it provides to justify the cost. There are also various free options available, including Subversion, which would be better if cost is an issue and/or you just want simple source control.

2. This is not really something that databases generally do. They are there to have data pulled from them, rather than their pushing data out. The .NET Framework does include the SqlDependency class, which can be used with SQL Server 2005 or later to do what you want, but it is only intended to be used with a small number of clients. You could perhaps create your own server and route all data changes through it, so it could notify all clients. You might also be able to use triggers in your database to invoke some external process using xp_cmdshell.

3. Data-binding in VB.NET is very different to data-binding in VB6, mainly because of the difference in architecture between ADO.NET and ADO. ADO worked on a connected model, so bound controls had a live connection to the database. ADO.NET works on a disconnected model, so you open a connection, retrieve the data, close the connection, bind the data, edit the data, open the connection, save the changes and then close the connection again. If you're working with one record at a time then it probably doesn't really matter whether you bind or not. If you're working with multiple records then populating a DataTable and binding it should be your first choice. You should only venture beyond data-binding when you need to do something that it can't. There are also many situations where you might combine bound and unbound controls, where you do as much as you can with data-binding and then write custom code for the bits it can't handle. Data-binding is for more than just data from databases too. In WinForms, you can bind control properties to application settings, saving the changes automatically to the config file. In WPF, data-binding goes much further than it does in WinForms and you pretty much have to use it.
 
Thanks for your response. Very helpful answers.

The disconnected model worries me, perhaps because I don't understand how to use datasets optimally.

Imagine you have a table with 1000 records. You need to give the user the ability to browse through those 1000 records. If you create a dataset with all 1000 records and bind the control(s) to it, how long do you allow the user to browse that dataset before you start to worry that the data has changed on the server?

I thought about reloading the data in to the dataset every time the user changes to another record but this could generate an awful lot of traffic. My other thought was writing the server I mentioned. I guess the SqlDependency class could do this as well but I wonder if it would support 50-100 simultaneous users.

Am I missing something about all this? After all my searches on the internet, I've come to the conclusion that no one seems to be that worried about the fact the user may always be looking at out-of-date data. I don't understand this because I cannot stand the thought of my users looking at a record which is not showing the latest data.
 
The fact that you cannot stand your users looking at out of date data isn't really relevant. What matters is whether absolutely up to date data is critical to the application. In most applications, each individual record will not change very often. In many applications, most of the data won't change very often. The amount of effort involved in keeping the data the user sees up to date is far in excess of the gain achieved by occasionally updating a record being viewed. The important thing is that the current state of the database is taken into account when the user saves changes. That's where optimistic concurrency comes in.

The idea of optimistic concurrency is that you assume that, more often than not, when a user retrieves data, edits it and then saves it, no other user will have changed that data in the mean time. As such, when you retrieve data, you leave it unlocked and accessible to all. This contrasts with pessimistic concurrency, which locks all data retrieved until it is explicitly released. The advantage of pessimistic concurrency is that it guarantees that any data you retrieve can be edited and saved without the possibility of someone else changing it in the mean time. The problem with that is that the more data you retrieve and lock, the more likely you are to inconvenience other users. For instance, if you have 1000 records in your database and 1000 users, using optimistic concurrency there is no problem with all 1000 users retrieving all 1000 records, editing 1 record and then saving it all at the same time, as long as each one edits a different record. In the case of pessimistic concurrency, only one user at a time could retrieve the data, so I wouldn't want to 1000th in that queue. This example is a little contrived but hopefully it illustrates the point: in most apps for most of the time, more than one person won't be editing the same data at the same time.

Now do you see why someone renamed your thread? This particular issue is 100% concurrency. Of course, it's not as simple as I've described thus far. You also have to take into account how long a user will view the data. In the case of that 1000 records I mentioned, if each user is going to be looking at that same data all day then you would obviously want to refresh it at some point. Otherwise they will all be looking at out of date data before long. How you address that depends on the specific circumstances of the application. For instance, if everyone will most likely be looking at the same data for long stretches then you might want to add an auto-refresh feature. Perhaps every five or ten minutes you could refresh the data from the database. If users are unlikely to be viewing the same data for a long time then you might provide a manual refresh feature, just for those who do need it.

There's also the issue of how you refresh the data. Do you just discard everything and retrieve everything again? That's inefficient for large amounts of data. Another option is to add a LastModifiedTime column to the table(s) and then only retrieve those records that have been modified since the last time you retrieved data. This takes a little more effort but leads to far less data travelling between server and client. It also means that you cannot ever delete anything, so if you normally would delete then you'd need to add another column to contain a flag that indicates a record is no longer active instead.

There's also the issue of what you do when a user tries to save data that has changed since they retrieved it. Any time multiple users can edit the same data, this is a consideration. Many people adopt a "last in wins" approach, which means that, any time you save data, your data overwrites what's in the database. For some people this is the best option, while others just use it because they haven't thought about the consequences or the alternatives. In this case, if two users edit the same record but change different fields, the second one to save will overwrite the changes made by the first and those changes will be lost.

In the case of optimistic concurrency, when you save data, the data in the database is compared to the original state of the data you're saving. This support is built into ADO.NET by virtue of the fact that a DataRow stores two versions of the data: original and current. The original data is compared to the database and, if they are the same, the current data is saved. If they are not the same then a concurrency exception is thrown and the save fails. It's up to you to handle that exception in code, determine which row(s) cause the issue and then decide what to do. There are various options and you might ask the user to decide which they want to pursue. You might discard the users changes and refresh the data. You might get the current version of the records that are in error and then ask the user to merge their changes with those in the database. You might take a different approach again.

There is also the option of not showing the user chunks of data at a time. You might just retrieve one record at a time and display it in TextBoxes, etc. In that case there may be a slight delay navigating from one record the next, but your guaranteed of always displaying the most up to date data, at least when the user navigates to a new record.

Hopefully this spiel demonstrates that people have thought about this problem before. Many people before me have given it a lot of thought. There are various options available to you and the best one for you depends on the specifics of your app.
 
Well, as the thread name was changed I thought I would hijack the thread and bring it on-topic. Now there's something you don't see every day :p

Seriously though, I couldn't ask for a more complete reply than that and you have my sincere thanks for taking the time and effort to help me when I have no right to expect it.

I agree with you that most of the time the data will not change and I'm worrying needlessly. However, we do have one client using our current system whose main requirement is to be able to view a list of records which is updated as soon as users in another part of the system alter one of those records. It isn't a huge list of records but all the records will be changed regularly during the day by other users. When a record is changed, only that record is updated on the screen, without reloading the whole 'dataset', and the list remains in exactly the same position, though possibly moved up or down a line if a record was added or deleted.

The problem with me is the system I use at the moment is a combined database server and notification server. The ability to have a record update in front of your eyes automatically when another user updates that record is completely inherent and requires no extra coding. Don't get me wrong, it's far from perfect which is why we're trying to move away from it. When you're used to something like this though, it's hard to grasp how a completely disconnected model is anything but trouble.

Regarding the disconnected model using a VB.NET front end and a SQL server back end. You talked about opening the database connection, doing the work and then closing it again. Does it make sense to do this on a client that is going to be querying the database all the time, possibly even asynchronously in another thread as well (or is this impossible/inadvisable)? Would it not be preferable to open the database connection when the application starts and leave it that way until the user closes the application?
 
Last edited:
Regarding the disconnected model using a VB.NET front end and a SQL server back end. You talked about opening the database connection, doing the work and then closing it again. Does it make sense to do this on a client that is going to be querying the database all the time, possibly even asynchronously in another thread as well (or is this impossible/inadvisable)? Would it not be preferable to open the database connection when the application starts and leave it that way until the user closes the application?
It really depends on what you mean by "all the time". If you really do mean all the time then you certainly would only open a connection once. As I said, you open the connection and then close it again when you've done your work. If your work continues until the the app closes then that's how long you keep the connection open. I'm not saying that you can only execute one query before closing the connection. You can certainly execute multiple queries if you have multiple queries to execute.

If you actually mean a lot of the time then you would still open and close connections as needed. ADO.NET has been specifically designed to work this way, which is a difficult concept to grasp for many people. An ADO.NET connection object is actually very lightweight. The actual database connection is at a lower level and is managed by the system.
 
Once again, thanks for your help.

One final question if you or anyone can answer. Can you recommend a good book or website that demonstrates best practices for things like object/variable naming conventions?
 
Back
Top