Imagine you have a table POSTS with 50 million posts. This table has:
ID, POST, CATEGORY_ID
1, "Hello world", 2
And then you have a CATEGORIES table with:
On the website, you list these posts in ORDER DESC, showing the post and the category name using a join.
I'm thinking on adding (when someone create a new post) the CATEGORY_NAME column to the POSTS table in order to avoid having to make that JOIN everytime I get a visit.
In order of performance, is a good idea? Imagine you have 10.000.000 visitors daily (hopefully :))
On the other hand, I would like to know what DB engine you consider better for it, mysql? mariadb? mongodb?
Imagine a more real scenario: I need 3 joins to show POST + CATEGORY + USER_NAME.
POSTS: 50 million rows
CATEGORIES: 100 rows
USERS: 10 million rows
Option A) Join between the 3 tables to show:
- By John
Option B) Adding 2 columns (CAT_NAME, USER_NAME) varchar(25) to POSTS table in order to avoid the join.
Keep things normalized.
The cost of the
JOIN is less than the cost caused by having a plumper table. Currently (small amount of data) you cannot see much performance difference. When the tables are too big to cache in RAM, the performance benefit of normalization will shine.