How helpful can DataRelations actually be?

cjard

Well-known member
Joined
Apr 25, 2006
Messages
7,081
Programming Experience
10+
Using the well hammered example of Person:Address, I will describe my database structure:

2 DataTables, one for address, one for person details
ADDRESS_REF is the PK of the Address table (in combination with a work/home/etc specifier) and a same named column exists as the FK in Person

One person can have many addresses (home, work, etc)

The datarelation that exists is set up such that Person is the Parent table, and Address is the child table. The form is set up correctly such that the binding nav for addresses is bound to the relation, allowing browsing of multiple addresses for multiple people


When we insert these into a database, the addresses is inserted first, with a -1 as the address_ref. Seeing this, the stored procedure that performs the update calculates a proper ID for the address, inserts the record and transmits the proper ref back to the client via the same parameter.

At this point I manually inspect to see if it was -1 before and not -1 now.. I take the new value and iterate the customers table, changing all -1 for the new value. If i had to insert 2 people, one of them would be -1 and one would be -2 thus ensuring they dont overlap and keeping the client side datarelation happy.

Now, what I ask to know is if the relation can keep this in sync for me. Judging by the notion that relations care cascade only, i think this will be parent, down to child and hence the data is updated in the wrong order for this to occur.

What can I do to use an automatic keep-in-sync offered by the relation?

Additionally, if I perform the entire operation inside a Transactions.TransactionScope, and inserting the first lot of data succeeds but the second lot fails, will the transaction scope be able to correctly roll back both my client side dataset data, and the database data?
 
Holy carp! .... how can that possibly work? The Person record should be your parent.... the Address is the child with an FKey back to the person? With the way you have it, how can a person have more than one address?

-tg
 
That is a tough one.
First I'd look into different possibilities for the id field.
Sounds like you are using autoincrement to provide a unique identifier.
It may be more beneficial to generate your own unique key from the data input.
Think about concatination of several fields or partial fields (first 3 digits of the address, last 3 letters of the street, and last 3 digits of the zip - or something like that) .
That key can be generated at design time during data entry, applied to both datasets, verified and posted with the ability for potential rollback.

You are going to have to manipulate the dataset structure to accomplish what I think you are going for...


Good luck.
 
That is a tough one.
First I'd look into different possibilities for the id field.
Sounds like you are using autoincrement to provide a unique identifier.
It may be more beneficial to generate your own unique key from the data input.
Think about concatination of several fields or partial fields (first 3 digits of the address, last 3 letters of the street, and last 3 digits of the zip - or something like that) .
That key can be generated at design time during data entry, applied to both datasets, verified and posted with the ability for potential rollback.

You are going to have to manipulate the dataset structure to accomplish what I think you are going for...


Good luck.


mmhh... i need the autonumber, because it actually goes on to be a bigger part of the machine, and things like credit card numbers will be calculated from it.

Additionally, the entries in the Addresses table, for example, will be incremental. One person will always have just one record, one address ref, but if they move houyse, a new addres is created to provide audit trail. THis is why the addresses table has a compound key where one field is guaranteed unique. I cant take any other address data because in this system it is conceivable that an address will exist twice if husband and wife sign up as different people, at different times, or joe moves into the house that jane is vacating and they are both my customers.

So, i'm curious to know if I must continue to struggle with this notion:

Getchanges to the address datatable
note the id
update the database
if the id changed, manually update the person table to maintain the relationship
get the changes to the person table
update the database
merge the changes to my getchanges() datatables, into the original datatables.


all in, this isnt quite what I had in mind for the new data access components. I'm seeking a way that I can maintain a client side set of related data, and just save the whole darn thing to a database, in a transaction, using stored procedures to do the writing (required) easily.. save person first, and have address ID update automatically reflected.
 
You never moved house? You work from home?

You took the question out of context.... or rather I may have asked it out of context....

It sounds like the AddressID is in the Person table.... so if a person has multiple addresses.... how does that work?

To me it would make sense to save the person first and get the ID. Then insert the Address(es) with the PErsonID as the FKey back to the person table.

OR have I got it all wrong?

-tg
 
mmmh, no, you have it right..

The relationship within the Person table of PersonID:AddressID is 1:1 and I know this is in some ways a normalisation error - any column that is related 1:1 with another is technically redundant but there is a data complexity here

The system contains banks, people, businesses etc and each of these entities has an address that is always of the same format. Each of these entities has a unique ID for its own particular reasona nd it doesnt follow a spec that is universal. Because we are reusing the addresses table for many entities' addresses the address table has its own simple system of PK and each table that wishes to attach to it uses a field called AddressID that is probably 1:1 related to that table's PK - introduction of a normalisation error for the sake of readable clarity

The types of addresses in the addresses table are distincted by a column indicating the type, (a trigram related to the entity the address refers to) but this is largely another useless syntactical convenience. the simple PK system for Addresses is an incrememnting number that is inserted into whichever table. It ensures that AddressID 1 will never appear in both person and Bank, so you can do either of these for person addresses only:

SELECT * FROM addresses WHERE type = 'PE'
SELECT * FROM addresses JOIN person USING(address_ref)


because the join has a primary key on the slave table, and is doing a full scan of the driving table, it should be faster to do the join than scan the entire addresses table and pick out the 'PE' records, and saves the creation of an extra index on the Type

Now, I have considered your last suggestion, that I insert person first, calculate their address ID when I insert them and pass that back, thereby alllowing the relation to autoupdate the addreses it knows of. I forget why it wasnt implemented this way, but there was a reason. I shall look into it and report back to you, but it will be in approx 1 week, because I'm on holiday. Thanks for the feedback and second opinion; if I can find a way to make it possible the way you suggest, it'll be a) a good solution, b) a solution recommended by another developer (which, given that i've been way out non a limb with this particualr project because i'm the only .net2 developer my company has got - its a good thing to know not all your ideas are crazy off the wall things :) )
The good thing is I wrote the stored procedures in such a way that I should be able to use them in this order without a problem, and they use merge statements, which if you never encountered them are "insert if its not there, update if it is" - no PK violations possible. kinda. :) so i'm hoping i can implement as you suggest.
 
Now, I have considered your last suggestion, that I insert person first, calculate their address ID when I insert them and pass that back, thereby alllowing the relation to autoupdate the addreses it knows of. I forget why it wasnt implemented this way, but there was a reason.

It was possibly because, originally, a relation existed that constrained the data entry in such a way that the insertion of an address had to happen first - it isnt possible to insert a Person record with an address id of 1234, if that address doesnt exist already.. hence the address must be inserted first and then a related record can be made in person. typically in Relational syatems i've worked with, it's always been this way.. if you conceive related tables as a star hub type diagram, you must begin by inserting at the leaf nodes and come up the linked branches to the centre of the hub. I may have removed relational constraints from the database itself as part of the design maintenance (though there is strong agument for them remaining they were causing too many problems with user submitted bulk data.

If you know of another way round such a problem - i.e. to have cascading relations in place on an RDBMS, and being able to insert related records in any order, then I'd be interested to hear it! :)
 
*WHACK!* *WHACK!* *WHACK!*

THat's the sound of me beating my head into the desk.... I'm not convinced you understood what I was saying.

The parent object shouldn't care about the child. The child should care about the parent.

What I was suggesting is that the Person table does NOT have an AddressID in it.... But the Address table should have the PersonID in it. Like this:

VB.NET:
PersonID int NOT NULL autoincrement,
.
.
.



AddressID int NOT NULL autoincrement,
PersonID int NOT NULL,
.
.
.

THen you can have one person with many addresses.

Now, if you want to use the same Address table for multiple parent types.... change the PersonID field to something generic, then add one more field that identified the ID type....

VB.NET:
AddressID int NOT NULL autoincrement,
ParentKeyID int NOT NULL,
ParentTypeID int NOT NULL,

Then the PArentKEyID can be a personID, bank ID, etc. You can then tell what the parent is by using the ParentTypEID (you can link this to a lookup table if necessary.)

We use a similar set up in our system, where for the ParentTypeID, 1 is the account, 2 is the billing address (where invoices are sent) and 3 is the Customer (there could be one customer with 1+ accounts (possibly for different locations)).

So if we need to get all Account Addresses, SELECT * FROM tblAddress WHERE PArentTypeID = 1.

Now, this does mean there is NO FKey between the parent/child tables since the ParentKeyID could go to any one of several tables. It means that we have to take care in our delete stored procs to remove the necessery child records.

Inserting - we insert the parent record first (Say Account) and get the ID (AccountID)... we can then insert into the Address table the AccountID, with a ParentType of 3.


Does that help explain?

-tg
 
What I was suggesting is that the Person table does NOT have an AddressID in it.... But the Address table should have the PersonID in it.

I thought I'd mentioned earlier that this structuring is not possible because addresses are typed, and do not always belong to a person. I cannot put PersonID into the address table, because a Bank, Warehouse, Merchant, Client, Customer etc also have addresses, and they all use the same addresses table. There are about 7 or so entities in this system, and each of them can have multiple addresses. There isnt one specific naming convention I can give to the PK of the addresses table that will allow it to apply to all the entities that might possess an address. You could say its possible to create one addresses table, and 7 views, called BankAddresses, PersonAddresses, CustomerAddresses, WareHouseAddresses etc, and hence name the PK column differently for each one.. But it's just a name.

In one way I totally agree with what youre saying, and technically, since a PersonID and a BankID, WarehouseID and all the other addressable entities IDs are all 8 characters 00000000 ... 99999999 then it is possible to put the values into the address PK..

i.e. i really could put the PersonID PK of 00000001 into addresses and that would then make that address belong to that person. When I want to store the address of BankID 00000001 then that causes a problem. The problem is already addressed as a second column in addresses forms a compound primary key. AddressType is PE for a personID and BA for a bankID, so the two 00000001 values for the relevant PK of person and bank can coexist in the addresses table. This does, however, break the naming convention in place.
(Further to this, a date column as a third element of the addresses compound PK allows a person to change address on a daily basis but that's extra to this discussion right now)


Now, i realise this breaks a normalisation feature: namely, within a particular table, be it Bank or Person or Warehouse etc there is always a 1:1 relationship between the bankID and addressID column. Its a sacrifice the designer chose to make to strive for maintining the simplified naming convention.. if a column called XXX_REF is encountered in a table and it is not the PK, then it is an FK of another table called XXXES that has in identically named PK column or column that is part of a compound PK.

It's basically to prevent a dataset that is a meaningless block of numbers in the Addresses table PK columns..




Now, if you want to use the same Address table for multiple parent types.... change the PersonID field to something generic, then add one more field that identified the ID type....
One of the problems was in finding something so suitably generic that it still meant something. The TypeID notion was implemented too, largely in secret by me, because I thought more along the lines of your thought here, and I was preparing for future update should a radical overhaul of the database be required.

Additionally for reasons you name below:

So if we need to get all Account Addresses, SELECT * FROM tblAddress WHERE PArentTypeID = 1.




Does that help explain?

it does indeed, and it would appear that that is how the designer set out to have this system. It no longer works like that, and I guess you could say that something as simple as a naming convention has forced this particular hand.. I do feel that the system is easily navigable and a nice place to be from the point of the developer.. it's just rather horrible from the relational management aspect you so rightly point out

Also, your comments have spurred me to investigate client side relation maintenance, and i've found some things that a) i wished I'd known 3 months ago and b) reduce my workload a lot
 
AAAGH, fargin bastages....

If I had inherited such a schema, I would have scoped out the tallest part of the building by now (as it turnes out, our structure at 45feet tall isn't tall enough to obtain a terminal imact at ground level :) ).

Ah, I see the delema.... yeah, I can see how that all sucks. No chance of getting it right either huh? (mgmt: if it ain't broke, don't fix it, eh?)

-tg
 
mmmh.. well in a way, it's not that *broken* - i can easily alter the relational cascades (i dont think any exist, actually) on the oracle server because as noted, in any particular entity table, there is a 1:1 relationship between the PersonID and the AddressID. As such you could consider them equal candidates for the primary key of Person.
As a result, it is conceptually possible to say that the current:
Addresses.AddressID <--> Persons.AddressID is logically equivalent to the usual notion you have pointed out:
Addresses.AddressID <--> Persons.PersonID

Thanks for being the ideas board though! I've gained a lot of thopught-provocation simply by talking to you about it :)
 
Back
Top