List Items and Frequency of Association

epiphyte · Aug 11, 2009

Hi Everybody,

I'm working on an application to compare the associations of items in several lists. Probably the most well known example is how on facebook they offer a friend suggestion based on quantity of friends in common.

A while back I developed a similar application to display keywords that were most frequently associated with the selected keyword. However, looking over the many to many table that I'd created it sort of rubs me wrong that the same association has two records...so the table has twice as many records as is necessary. Meaning, there is a record that associates Item A with Item B and a record that associates Item B with Item A. However, the alternative requires making more requests on the server...because when you're adding a new association you have to check for both possibilities...if your first query doesn't yield a result.

Generally, I prefer to sacrifice space rather than speed. For example, in the many to many table I also have a column that contains a count of how many times that association has been made...instead of trying to calculate that on the fly.

Anybody ever work on anything similar and come up with a more efficient solution?

Thanks in advance!

rcombs4 · Aug 12, 2009

To me, it sounds like you are working on a feature like in text messaging. You type in a word and hit space, the phone then adds some other word that you frequently use after the first word. For this type of scenario you definitely want the DB to have two records of the 'same association'. This is because the phrase 'What Time' is different from 'Time What'. But maybe in your case the order of the two keywords doesn't matter.

But I would feel it would be much easier to calculate the number of times the association has been made. The calculation is really easy and not very demanding.

I'm not sure if this is the most efficient query but it should do the trick

VB.NET:

SELECT I.ItemB, (SELECT Count(*) FROM Items WHERE ItemB = I.ItemB)
FROM Items I
WHERE ItemA = @ItemA

Also the advantage of calculating this on the fly is that when you need to add an association you simply just insert a new record in the DB. The other way you would need to check to see if the association exist currently then either update or insert.

cjard · Aug 13, 2009

Why can this not just be solved with a single table:

keyword1, keyword2, count

SELECT count FROM table WHERE
keyword1 = 'dog' and keyword2 = 'rabbit' OR
keyword1 = 'rabbit' AND keyword2 = 'dog'

There is only one matching row, because you don't store both pairs

If you want all the keywords associated with dog:

SELECT CASE WHEN keyword1 = 'dog' THEN keyword2 ELSE keyword1 END as keyword
FROM table
WHERE keyword1 = 'dog' OR keyword2 = 'dog'

or

use a union all

cjard · Aug 13, 2009

note rcombs is right in that calculating things can be faster than looking them up. if you had a table of only keyword1 and keyword2 that was bi-indexed

index 1 -> (keyword1, keyword2)
index2 -> (keyword2, keyword1)

then the db could answer a query like:

SELECT count(*) FROM table WHERE
keyword1 = 'dog' and keyword2 = 'rabbit' OR
keyword1 = 'rabbit' AND keyword2 = 'dog'

or

SELECT count FROM table WHERE
keyword1 = 'dog' OR keyword2 = 'dog'

Straight out of the index.. It could actually answer all association lists out of the index too; it wouldnt be hitting the table at all. (weird uh? the table needs to be there to provide the data for the index, but the index would answer the queries not the table)

When you want to make associations, just write them into the table, the more the merrier

List Items and Frequency of Association

epiphyte

New member

rcombs4

Well-known member

cjard

Well-known member

cjard

Well-known member

Similar threads

Share this page

Latest posts