Advertisement
Guest User

Untitled

a guest
Jun 21st, 2015
277
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
text 2.52 KB | None | 0 0
  1. # Normalization in data modeling
  2.  
  3. Say you have a service where users can add products they like to a wish list, representing this with JSON, you might expect something like this:
  4.  
  5. ```json
  6. [
  7. {
  8. "user_id": 3,
  9. "email": "foo@example.com",
  10. "wishlist": [
  11. {
  12. "product_id": 323,
  13. "product_price": 3.29
  14. },
  15. {
  16. "product_id": 4942,
  17. "product_price": 15.99
  18. }
  19. ]
  20. },
  21. {
  22. "user_id": 7,
  23. "email": "bar@example.com",
  24. "wishlist": [
  25. {
  26. "product_id": 87,
  27. "product_price": 8.98
  28. },
  29. {
  30. "product_id": 4942,
  31. "product_price": 15.99
  32. }
  33. ]
  34. }
  35. ]
  36. ```
  37.  
  38. However, notice that there are two copies of the product with id 4942, this is common with the EAV way of modeling the data, it is also referred to as "denormalized" data. To get a better idea of what that means, let's normalize the above example using relations:
  39.  
  40. ### the users relation
  41. | user_id | email |
  42. |---------|-------|
  43. | 3 | foo@example.com |
  44. | 7 | bar@example.com |
  45.  
  46. ## the products relation
  47.  
  48. | product_id | product_price |
  49. |------------|---------------|
  50. | 323 | 3.29 |
  51. | 4942 | 15.99 |
  52. | 87 | 8.98 |
  53.  
  54. And to tie the two together, the
  55.  
  56. ## the wishlist relation
  57. | user_id | product_id |
  58. |---------|------------|
  59. | 3 | 323 |
  60. | 3 | 4942 |
  61. | 7 | 87 |
  62. | 7 | 4942 |
  63.  
  64.  
  65. So, this way you avoid unncessary duplication, you store the users in their own table, the products in their own table, and then use a third table to store the links between them. This will allow for much more efficient storage of data, and also help with consistency. For example, if product 4942 goes on sale, the denormalized form only has to update the price for the single record in the products table. By contrast, in the denormalized (JSON object above) form, you'd have to iterate over all the objects, dig into each of the wishlists and update the price in all of them.
  66.  
  67. ## Querying
  68. To query out the full records (JOINed together), you could run a query like:
  69.  
  70. ```SQL
  71. SELECT u.user_id, u.email, p.product_id, p.product_price
  72. FROM users u
  73. JOIN wishlist w
  74. JOIN products p
  75. ON u.user_id = w.user_id AND
  76. w.product_id = p.product_id
  77. ```
  78.  
  79.  
  80. | user_id | email | product_id | product_price |
  81. |---------|-------|------------|---------------|
  82. | 3 |foo@example.com | 323 | 3.29 |
  83. | 3 |foo@example.com | 4942 | 15.99|
  84. | 7 |bar@example.com | 87 | 8.98 |
  85. | 7 |bar@example.com | 4942 | 15.99|
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement