"how to model a database with many m:n relations on a table" Code Answer

Your design violates fourth normal form. You're trying to store multiple independent "facts" in one table, and that leads to anomalies.

The proposed person_attributes table looks something like this: personid, jobid, houseid, restaurantid.

So if I associate with one job, one house, but two restaurants, do I store the following?

personid jobid houseid restaurantid
    1234    42      87         5678
    1234    42      87         9876

And if I add a third restaurant, do I copy the other columns?

personid jobid houseid restaurantid
    1234   123      87         5678
    1234   123      87         9876
    1234    42      87        13579 

Done! Oh, wait, what happened there? I changed jobs at the same time as adding the new restaurant. Now I'm incorrectly associated with two jobs, and there's no way to distinguish that from correctly being associated with two jobs.

Also, even if it is correct to be associated with two jobs, shouldn't the data look like this?

personid jobid houseid restaurantid
    1234   123      87         5678
    1234   123      87         9876
    1234   123      87        13579 
    1234    42      87         5678
    1234    42      87         9876
    1234    42      87        13579 

It starts looking like a Cartesian product of all distinct values of jobid, houseid, and restaurantid. In fact, it is one, because this table is trying to store multiple independent facts.
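To make the row growth concrete, here is a minimal sketch that rebuilds the six rows above with itertools.product. The variable names are illustrative only; the ids are the ones used in this answer.

```python
from itertools import product

person_id = 1234
job_ids = [123, 42]
house_ids = [87]
restaurant_ids = [5678, 9876, 13579]

# One combined table needs a row for every combination of the independent facts.
rows = [
    (person_id, job_id, house_id, restaurant_id)
    for job_id, house_id, restaurant_id in product(job_ids, house_ids, restaurant_ids)
]

for row in rows:
    print(row)

# The row count is the product of the list lengths: 2 jobs * 1 house * 3 restaurants.
combined_row_count = len(rows)
```

Each extra job, house, or restaurant multiplies the row count rather than adding one row to it.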

Correct relational design requires a separate intersection table for each many-to-many relationship. Sorry, you have not found a shortcut.

(Many articles about normalization say the higher normal forms past 3NF are esoteric, and that one never has to worry about 4NF or 5NF. Let this example disprove that claim.)
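A minimal sketch of that layout, using Python's built-in sqlite3 module so it can be run as-is. The table names person_job and person_house are assumptions for illustration, and foreign keys to the person, job, house, and restaurant tables are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One intersection table per many-to-many relationship (names are illustrative).
conn.executescript("""
CREATE TABLE person_job (
    personid INTEGER NOT NULL,
    jobid    INTEGER NOT NULL,
    PRIMARY KEY (personid, jobid)
);
CREATE TABLE person_house (
    personid INTEGER NOT NULL,
    houseid  INTEGER NOT NULL,
    PRIMARY KEY (personid, houseid)
);
CREATE TABLE person_restaurant (
    personid     INTEGER NOT NULL,
    restaurantid INTEGER NOT NULL,
    PRIMARY KEY (personid, restaurantid)
);
""")

conn.execute("INSERT INTO person_job VALUES (1234, 42)")
conn.execute("INSERT INTO person_house VALUES (1234, 87)")
conn.executemany(
    "INSERT INTO person_restaurant VALUES (?, ?)",
    [(1234, 5678), (1234, 9876)],
)

# Changing jobs touches only person_job ...
conn.execute("UPDATE person_job SET jobid = 123 WHERE personid = 1234 AND jobid = 42")
# ... and adding a third restaurant touches only person_restaurant.
conn.execute("INSERT INTO person_restaurant VALUES (1234, 13579)")

jobs = [r[0] for r in conn.execute(
    "SELECT jobid FROM person_job WHERE personid = 1234")]
restaurants = [r[0] for r in conn.execute(
    "SELECT restaurantid FROM person_restaurant WHERE personid = 1234 ORDER BY restaurantid")]
```

Here the job change and the new restaurant are independent statements, so the stray second job from the earlier example cannot appear.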


Re your comment about using NULL: then you have a problem enforcing uniqueness, because a primary key constraint requires that all of its columns be NOT NULL.

personid jobid houseid restaurantid
    1234   123      87         5678
    1234  null    null         9876
    1234  null    null        13579 
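As a rough illustration of the uniqueness problem, the sketch below falls back to a UNIQUE constraint over the nullable columns. It assumes SQLite, which treats NULLs as distinct from one another in UNIQUE constraints; other engines may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The attribute columns must be nullable, so this sketch uses UNIQUE
# in place of a primary key over all four columns.
conn.execute("""
CREATE TABLE person_attributes (
    personid     INTEGER NOT NULL,
    jobid        INTEGER,
    houseid      INTEGER,
    restaurantid INTEGER,
    UNIQUE (personid, jobid, houseid, restaurantid)
)
""")

row = (1234, None, None, 9876)
conn.execute("INSERT INTO person_attributes VALUES (?, ?, ?, ?)", row)
# SQLite accepts the same row again, because the NULLs count as distinct values.
conn.execute("INSERT INTO person_attributes VALUES (?, ?, ?, ?)", row)

duplicate_count = conn.execute(
    "SELECT COUNT(*) FROM person_attributes "
    "WHERE personid = 1234 AND restaurantid = 9876"
).fetchone()[0]
print(duplicate_count)
```

So the same person-restaurant association can be stored twice without the database objecting.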

Also, if I add a second house or a second job to the above table, which row do I put it in? You could end up with this:

personid jobid houseid restaurantid
    1234   123      87         5678
    1234  null    null         9876
    1234    42    null        13579 

Now if I disassociate restaurantid 9876, I could update it to NULL. But that leaves a row of all NULLs apart from personid, which I really should just delete.

personid jobid houseid restaurantid
    1234   123      87         5678
    1234  null    null         null
    1234    42    null        13579 

Whereas if I had disassociated restaurant 13579, I could update it to NULL and leave the row in place.

personid jobid houseid restaurantid
    1234   123      87         5678
    1234  null    null         9876
    1234    42    null         null 

But shouldn't I consolidate rows, moving the jobid to another row, provided there's a vacancy in that column?

personid jobid houseid restaurantid
    1234   123      87         5678
    1234    42    null         9876

The trouble is that adding or removing associations keeps getting more complex, requiring multiple SQL statements per change. You're going to have to write a lot of tedious application code to handle this complexity.

However, all the various changes are easy if you define one table per many-to-many relationship. You do take on the complexity of that many more tables, but in exchange you simplify your application code.

Adding an association to a restaurant is simply an INSERT into the person_restaurant table. Removing that association is simply a DELETE. It doesn't matter how many associations there are to jobs or houses. And you can define a primary key constraint in each of these intersection tables to enforce uniqueness.
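A short sketch of those operations, again using Python's sqlite3 module. The helper names add_restaurant and remove_restaurant are hypothetical, made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE person_restaurant (
    personid     INTEGER NOT NULL,
    restaurantid INTEGER NOT NULL,
    PRIMARY KEY (personid, restaurantid)
)
""")

def add_restaurant(conn, personid, restaurantid):
    """Associate a person with a restaurant: a single INSERT."""
    conn.execute(
        "INSERT INTO person_restaurant (personid, restaurantid) VALUES (?, ?)",
        (personid, restaurantid),
    )

def remove_restaurant(conn, personid, restaurantid):
    """Remove the association: a single DELETE."""
    conn.execute(
        "DELETE FROM person_restaurant WHERE personid = ? AND restaurantid = ?",
        (personid, restaurantid),
    )

add_restaurant(conn, 1234, 5678)
add_restaurant(conn, 1234, 9876)

# The primary key rejects a repeated association.
try:
    add_restaurant(conn, 1234, 9876)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

remove_restaurant(conn, 1234, 9876)

remaining = [r[0] for r in conn.execute(
    "SELECT restaurantid FROM person_restaurant WHERE personid = 1234")]
```

Neither helper needs to know anything about jobs or houses, and the database itself refuses the duplicate.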

By B L on July 27 2022
