Question Looking for advice organizing new database

keb1965

Well-known member
Joined
Feb 9, 2009
Messages
103
Programming Experience
10+
I have been put in charge of redeveloping a prety large database application. The existing application has several hundred stored procs and at least that many tables. The problem I see with it is that the data appears to be duplicated in many places. I'd like to simplify the database and put in place relationships that make sense, not only now, but to anyone looking at the data in the future.

For now, I would like to just work on cleaning up the most egregious problems and get to the others at some other point.

So should I break out duplicated data and place it into related tables with a 1 to ~ relationship? As far as that is concerned, what is the general rule when dealing with records where there may be duplicated information. Does it make sense to create another table when there aren't alot of fields that are duplicated?

An example might be a table where there are records with several duplicated fields and two distinct fields. Creating a table for the distinct fields and creating a ~ to 1 relationship with the originating table would allow the removal of all records where the several fields were duplicated.

For example, instead of having an employee table with DepartmentName and DepartmentManager (a nightmare to administer by the way) you might have an employee table with DepartmentID and let the department table keep all data related to the department, because you will likely have multiple employees for each department, although this wouldn't allow a user to be part of multiple departments unless either the record was duplicated with a different departmentID or another table is used to map EmployeeID and DepartmentID to the respective records in the other tables.

Any advice from the gurus would be greatly appreciated.
 
You've pretty much nailed it. There should be a Department table with a DepartmentID column and then you either add a DepartmentID column to the Employee table where employees are only members of one department or else you have an EmployeeDepartment table with an EmployeeID and a DepartmentID where employees can belong to multiple departments. You would also create the appropriate relations and foreign key constraints. This allows you to treat a department as a separate entity and add, edit and delete it as such, as well as ensuring that you don't end up with multiple spellings for what should be the same department. The Department would then probably have a ManagerID that was a foreign key from the Employee table.
 
Great! I thougth I was on the right track.

One more related question.

When setting up the relationships, what is the significance of the UPDATE/DELETE rules? I've tried to find something comprehensive but it still doesn't quite make sense. For example, you can set it to "No Action" "Set to NULL", "Set to Default" or "Cascade"

For example, tblEmployees has FK departmentID ~ to 1 with tblDepartments.ID. If tblDepartments.ID is changed and the UPDATE action is set as "Cascade" does that mean the records in tblEmployees are automatically updated where empl.departmentID = dept.ID? What if the action is "Set to Default"?

I am looking to make this as foolproof as possible .. because it would appear the biggest fool of them all is the one doing the redevelopment ;-)

but, I've heard it said more than once, the only stupid questions are the ones not asked.
 
If you have a parent record and a child record with a relation between them, you need to decide what happens when the PK of the parent record changes or the parent record is deleted. Generally speaking, you shouldn't be changing the PK of a record anyway so, if that's the case, you can leave the rule as No Action for UPDATE. If you might be editing PKs then the most likely option would be Cascade, which means that the new value would be copied into the child record too.

When deleting a parent record, you first have to decide whether you're always going to want to delete related child records too. If not then you'll want to choose Set to NULL or, if that column has a default value set, possibly Set to Default. If you do always want to delete child records too then you need to decide whether deleting a parent record should automatically delete the child records or deleting a parent with children throws an error, thus meaning that the children must be deleted explicitly first. If you want children deleted automatically then select Cascade, otherwise select No Action.

Note also that some options will throw an error if you have a foreign key constraint on the relation, which is normally the case more often than not, but not otherwise. A FK means that, if there's a value in the child then that value must appear in a parent too. That means that, if there is a FK, deleting or updating a parent and not deleting or updating the corresponding children will throw an error while it won't if there's no FK.
 
I sincerely doubt we would ever change the ID of any record. If we do anything, it would be to update all the fields, just not the ID, that being said, there exists a real possibility that a record would/could be deleted. In those cases we would only want to allow that to happen when there are no child records, thus no updating would ever be needed.
 
Back
Top