What's the most efficient way of using my tables etc?

Cheetah

Hi there,

I have a database-driven program and was wondering what the most efficient way of using my tables is.

Basically, I have one table which holds users and needs to be available over a whole network of computers, and then four other tables which don't.

The only table I will be adding information to is the users table. The other tables I only need to read from.

There is quite a lot of data in one table which makes up the bulk of the information, and I am not quite sure how I should access it to make the program efficient.

Should I have the users in a separate database on the network, and the other bits in a different database which is local?

Also, should I be using datasets, readers, etc.?

I would appreciate any input.

Thanks.

EDIT:

To give you an idea of size, the main table will be about 100MB. (Expecting about 100,000 records of around 1,000 bytes each.)
 
How much of this data will be needed by the client at any one time?

For maintenance purposes I would centralise it all, but if each client is going to be regularly accessing 70% of 100MB, in droves, then putting it locally would remove the network congestion problem.
 
Use datasets when data will be downloaded to the client, edited and sent back to the DB. Use readers when data will be accessed forward-only and read-only and not stored in the client app, e.g. downloaded by the client and sent somewhere else, like formatted into a CSV file or a log report.

There are FAQs that answer this question in a little more depth.
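As a rough sketch of the two patterns (the connection string, table and column names below are only placeholders, and SqlClient is assumed purely for illustration - any ADO.NET provider works the same way):

VB.NET:
    Imports System.Data
    Imports System.Data.SqlClient

    Module DataAccessSketch
        ' "connStr" and the Users table/columns are illustrative only.
        Sub Demo(ByVal connStr As String)
            Using conn As New SqlConnection(connStr)
                conn.Open()

                ' DataSet + DataAdapter: rows are held in the client app and can be
                ' edited, then pushed back with adapter.Update(ds, "Users").
                Dim adapter As New SqlDataAdapter("SELECT user_id, user_name FROM Users", conn)
                Dim ds As New DataSet()
                adapter.Fill(ds, "Users")

                ' DataReader: forward-only, read-only - suited to streaming rows
                ' straight out to something like a CSV file without keeping them.
                Dim cmd As New SqlCommand("SELECT user_id, user_name FROM Users", conn)
                Using reader As SqlDataReader = cmd.ExecuteReader()
                    While reader.Read()
                        Console.WriteLine("{0},{1}", reader(0), reader(1))
                    End While
                End Using
            End Using
        End Sub
    End Module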
 
How much of this data will be needed by the client at any one time?

For maintenance purposes I would centralise it all, but if each client is going to be regularly accessing 70% of 100MB, in droves, then putting it locally would remove the network congestion problem.


Well it will be accced by roughtly 35 clients at a time at around the same time and they will be querying about 80% of the 100mb data at a time



Use datasets when data will be downloaded to the client, edited and sent back to the DB. Use readers when data will be accessed forward-only and read-only and not stored in the client app, e.g. downloaded by the client and sent somewhere else, like formatted into a CSV file or a log report.

There are FAQs that answer this question in a little more depth.

Well, the clients will not be changing the data at all; all it is being fetched for is to render a TreeView diagram. So I would use a data adapter then.


====


I am trying to split up my database a bit so it doesn't have to query as much data.

Also, I heard that querying by a primary key number (the ID of something, which is a foreign key elsewhere because it's linked) is quicker than querying other things.

Is that best to do?

Thanks.
 
Well it will be accced by roughtly 35 clients at a time at around the same time and they will be querying about 80% of the 100mb data at a time

Do you mean that the clients will be issuing a command such as:

SELECT * FROM table WHERE ranking < 80000

i.e. they will regularly be downloading 80,000 rows?


Or do you mean that they will only be downloading 100 out of maybe 80,000 rows? - i.e. the 80% is all likely targets, but the selection list is much smaller.



Well, the clients will not be changing the data at all; all it is being fetched for is to render a TreeView diagram. So I would use a data adapter then.

It doesn't matter which you use, but I mentioned:

use readers when data will be ... not stored in the client

If you're downloading data from a DB into a client [application] and showing it to the user on screen, then you are storing the data in the client [app].
Sorry if I didn't make that point clear. Use readers only if your app is acting as a conduit between the database and some destination. If the final destination of the data is the client app itself, for purposes of showing it on screen etc., then you use a dataset.

However, you mention a TreeView, which may add a small complication. I wasn't aware your data was hierarchical, and you don't say anything about its organisation or why you will be using a TreeView. I imagine that you will be expanding a node and filling the nodes underneath with database data. In this regard I would actually use a reader because, as far as I am aware, a TreeView cannot have a data binding for its nodes. It would be possible to write your own node collection, but you're likely just going to want to build it up as you go.


I am trying to split up my database a bit so it doesn't have to query as much data.

Don't split the tables - the size of tables is of minor concern to a database that is properly indexed. Form your queries more carefully, and remember that no user appreciates having a control with thousands of items in it.

Also, I heard that querying by a primary key number (the ID of something, which is a foreign key elsewhere because it's linked) is quicker than querying other things.

A primary key is a (combination of) column(s) that uniquely identifies a row. It is always indexed so the uniqueness constraint can be enforced quickly, and retrieving rows based on a WHERE clause that targets the primary key is typically one of the fastest retrieval operations a database can do.
That doesn't mean it can be used in all situations, however. I can't really explain much more without knowing more about the context.
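As a rough sketch of what a primary-key lookup looks like in ADO.NET (the table, column and variable names are invented for the example, and conn is assumed to be an open connection): search the slim table for the ID you want, then fetch the bulky row by its primary key.

VB.NET:
    ' Illustrative only: "item_id" is the primary key of the bulky detail table,
    ' and selectedId is the ID found by the earlier search.
    Dim cmd As New SqlCommand( _
        "SELECT * FROM item_details WHERE item_id = @id", conn)
    cmd.Parameters.AddWithValue("@id", selectedId)

    Using reader As SqlDataReader = cmd.ExecuteReader()
        If reader.Read() Then
            ' Only the one row identified by the primary key comes back,
            ' so the lookup can use the primary key index rather than a scan.
        End If
    End Using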

Is that best to do?

Thanks.
 
Do you mean that the clients will be issuing a command such as:

SELECT * FROM table WHERE ranking < 80000

i.e. they will regularly be downloading 80,000 rows?


Or do you mean that they will only be downloading 100 out of maybe 80,000 rows? - i.e. the 80% is all likely targets, but the selection list is much smaller.

It depends: they might be downloading only a little of the data, half of it, or the whole lot.


However, you mention a TreeView, which may add a small complication. I wasn't aware your data was hierarchical, and you don't say anything about its organisation or why you will be using a TreeView. I imagine that you will be expanding a node and filling the nodes underneath with database data. In this regard I would actually use a reader because, as far as I am aware, a TreeView cannot have a data binding for its nodes. It would be possible to write your own node collection, but you're likely just going to want to build it up as you go.

Well, basically I query one table, write a node, then based on that query I query another table to get the child nodes, and I am currently doing this via a data reader (see below). So I am rendering the whole TreeView at once.

VB.NET:
            ' Outer reader: one row per parent (sub-category) node.
            While dbrsub.Read()

                ' "Sub" is a reserved word in VB.NET, so a different variable name is needed.
                Dim subName As String = dbrsub(0).ToString()

                tv.Nodes.Add(subName, subName)

                ' Note: dbrdsub must be a freshly opened reader for each parent row
                ' (how it is created is not shown here) - a closed reader cannot be read again.
                While dbrdsub.Read()

                    tv.Nodes.Item(subName).Nodes.Add(dbrdsub(0).ToString(), dbrdsub(1).ToString())

                End While

                dbrdsub.Close()

            End While

            dbrsub.Close()


Don't split the tables - the size of tables is of minor concern to a database that is properly indexed. Form your queries more carefully, and remember that no user appreciates having a control with thousands of items in it.

That's why I am putting the data into sub-categories (tree nodes), so they don't see all the data at once. I was thinking that, because there is a lot of data in a record which isn't relevant to the search, I could have an index number which links to another table with the same index number; once I have searched and found the row I want, I simply get that row from the other table, which holds the bulk of the information.

Would this be a bad idea?

A primary key is a (combination of) column(s) that uniquely identifies a row. It is always indexed so the uniqueness constraint can be enforced quickly, and retrieving rows based on a WHERE clause that targets the primary key is typically one of the fastest retrieval operations a database can do.
That doesn't mean it can be used in all situations, however. I can't really explain much more without knowing more about the context.

See the example above of how I am doing it.


==========


Thank you very much for your response, you have been very helpful.

EDIT:

Also, why not just index the whole database, or should you only index the fields that you search by?
 
It depends: they might be downloading only a little of the data, half of it, or the whole lot.

So the 80% suggests that normal use means a lot of data being downloaded? I would definitely have it local, unless the data is to be updated by each user - you may then find having each client merge and distribute updates among 34 other clients to be more of a headache than keeping it central.



Well, basically I query one table, write a node, then based on that query I query another table to get the child nodes, and I am currently doing this via a data reader (see below). So I am rendering the whole TreeView at once.

I'd recommend that you only render sections of the TreeView as they are expanded, and keep the node grouping relatively small - Windows Explorer doesn't read the folder structure of your entire hard disk as soon as you open My Computer; it reads it as the user selects.

Form your queries so that the minimum of data is downloaded to populate the tree (node text and node ID should be all that is needed until a leaf is reached).
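A rough sketch of that expand-on-demand pattern (the event handler and names below are illustrative, not taken from your code): give each unexpanded node a single dummy child so the + sign shows, then swap in the real children when the node is expanded.

VB.NET:
    ' Illustrative only - assumes a WinForms form with a TreeView named tv and a
    ' LoadChildren routine that queries the database for one parent's children.
    Private Sub tv_BeforeExpand(ByVal sender As Object, _
            ByVal e As TreeViewCancelEventArgs) Handles tv.BeforeExpand
        ' The dummy child marks a node whose real children have not been loaded yet.
        If e.Node.Nodes.Count = 1 AndAlso e.Node.Nodes(0).Text = "dummy" Then
            e.Node.Nodes.Clear()
            LoadChildren(e.Node)   ' fills e.Node.Nodes from the database
        End If
    End Sub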

That's why I am putting the data into sub-categories (tree nodes), so they don't see all the data at once. I was thinking that, because there is a lot of data in a record which isn't relevant to the search, I could have an index number which links to another table with the same index number; once I have searched and found the row I want, I simply get that row from the other table, which holds the bulk of the information.

It sounds like you're on the right track. I don't know how many levels there are in your hierarchy, but typically you would have a table with 3 fields:

node_id, parent_node_id, data_record_id

node_id for new nodes is a sequential counter. Any node that has a parent has that parent's node_id recorded in parent_node_id, and data_record_id is normally null (unless parent nodes have data) until you reach a leaf node.

Upon expansion of a node, you know the node ID that is being expanded, so you run:

SELECT * FROM node_hierarchy WHERE parent_node_id = <the node id>


It's possible to do away with the data_record_id and key the data records straight to the node - it just depends on whether more than one node will have the same data record attached (i.e. two routes to the same place).
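Continuing the sketch, a LoadChildren routine like the one assumed in the BeforeExpand example above might look roughly like this (it assumes node_hierarchy also carries a node_text column for the display text, and that conn is an open connection):

VB.NET:
    ' Illustrative only - fills one parent's children from the node_hierarchy table.
    Private Sub LoadChildren(ByVal parent As TreeNode)
        Dim cmd As New SqlCommand( _
            "SELECT node_id, node_text FROM node_hierarchy WHERE parent_node_id = @parent", conn)
        cmd.Parameters.AddWithValue("@parent", parent.Name)

        Using reader As SqlDataReader = cmd.ExecuteReader()
            While reader.Read()
                ' Key each child node by its node_id so it can be expanded later.
                Dim child As TreeNode = parent.Nodes.Add(reader(0).ToString(), reader(1).ToString())
                child.Nodes.Add("dummy", "dummy")   ' placeholder so the + sign appears
            End While
        End Using
    End Sub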


Would this be a bad idea?

Not at all. The only thing is that downloading should be like eating - a little and often, rather than everything at the start. It stops your program getting fat in memory, full of useless bits that will never be used.


Also, why not just index the whole database, or should you only index the fields that you search by?

There comes a point where indexing is either pointless or impossible: indexes must be maintained, and they add a level of indirection to accessing data. Sometimes the costs outweigh the gain and the index is useless. There are pages of articles written on database design and indexing and I can't replicate them here. Suffice to say, it's a large and complex topic that is best asked about in context :) To answer your question, there are times when it's a waste to even index a field you search by, but generally, if you're repeatedly looking for a small number of results out of a huge number of rows, then the columns you're searching on become candidates for indexing. Be aware that indexes don't help on all kinds of searches, or you might have to pull some tricks with the index.
For example, if you're searching for strings that always end with certain characters, then a classic index based on the start of the string won't help. Instead you might choose to create an index on the result of reversing the string, and then use a reversal in your search. Not all DBMSs will support this feature, of course. If yours does not, then you might consider storing the data in reverse for fast access and turning it round when displaying.
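A rough sketch of that fallback (the files table, the file_name_rev column and the ".log" suffix are all invented for the example; file_name_rev would hold a reversed copy of file_name and carry the index, and conn is an open connection):

VB.NET:
    ' Illustrative only - to find names ending in ".log", reverse the suffix and
    ' do a leading-match search on the reversed column, which the index can satisfy.
    Dim cmd As New SqlCommand( _
        "SELECT file_name FROM files WHERE file_name_rev LIKE @revSuffix", conn)
    cmd.Parameters.AddWithValue("@revSuffix", StrReverse(".log") & "%")

    Using reader As SqlDataReader = cmd.ExecuteReader()
        While reader.Read()
            Console.WriteLine(reader(0))
        End While
    End Using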
 