background:
I want to add tagging to our products. A product can have multiple tags, and each tag can be used by multiple products, i.e. a many-to-many relationship.
specific:
We have 10 million products (Product), and the administrator has defined 20 tags (Tag). We want to label each of these 10 million products with one or more tags, so we created a Product table and a Tag table.
question:
How should we express this many-to-many relationship? How can it be optimized, and how will it perform?
our first idea:
Intermediate table: a ProductTag table connects Product and Tag to form the many-to-many relationship. But every time we view a product, we have to look through the ProductTag table and then fetch the tag descriptions from the Tag table. Isn't that inefficient, given that tags are viewed far more often than they are set?
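To make the first idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column names (id, name, description, product_id, tag_id), the sample rows, and the composite primary key on ProductTag are assumptions for illustration only, not our real schema.

```python
import sqlite3

# In-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Tag (id INTEGER PRIMARY KEY, description TEXT);
-- Intermediate table: one row per (product, tag) pair.
CREATE TABLE ProductTag (
    product_id INTEGER NOT NULL REFERENCES Product(id),
    tag_id INTEGER NOT NULL REFERENCES Tag(id),
    PRIMARY KEY (product_id, tag_id)
);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO Product VALUES (?, ?)", [(1, "kettle"), (2, "lamp")])
conn.executemany("INSERT INTO Tag VALUES (?, ?)", [(1, "new"), (2, "sale")])
conn.executemany("INSERT INTO ProductTag VALUES (?, ?)", [(1, 1), (1, 2), (2, 2)])


def tags_for_product(product_id):
    """Return the tag descriptions of one product via ProductTag and Tag."""
    rows = conn.execute(
        "SELECT t.description "
        "FROM ProductTag AS pt JOIN Tag AS t ON t.id = pt.tag_id "
        "WHERE pt.product_id = ? ORDER BY t.id",
        (product_id,),
    )
    return [row[0] for row in rows]
```

Viewing a product's tags is then a single call such as tags_for_product(1), which is the lookup we are worried about.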
so the second idea:
Cache field: define an array-type field tags in Product that stores the ids of the Tag objects. When viewing a product, we can get its Tag objects directly from tags and then fetch each tag's description. But the problem is that if one day the administrator deletes a tag from the Tag table, we have to traverse all the Products to delete that tag's id from tags. Oh my God!
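A rough sketch of the second idea, using plain Python dictionaries in place of the real tables. The sample data and the function names tag_descriptions and delete_tag are hypothetical; the point is only to show where the full traversal happens.

```python
# Stand-ins for the Tag and Product tables (hypothetical data).
tags = {1: "new", 2: "sale"}
products = {
    1: {"name": "kettle", "tags": [1, 2]},  # "tags" is the cached array of Tag ids
    2: {"name": "lamp", "tags": [2]},
}


def tag_descriptions(product_id):
    """Viewing is cheap: read the cached ids, then look up each description."""
    return [tags[tag_id] for tag_id in products[product_id]["tags"]]


def delete_tag(tag_id):
    """Deleting is expensive: every product must be visited to drop the id."""
    del tags[tag_id]
    touched = 0
    for product in products.values():  # with real data, 10 million iterations
        if tag_id in product["tags"]:
            product["tags"].remove(tag_id)
            touched += 1
    return touched
```

The loop in delete_tag is the part that worries us.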
so the third idea:
We still use the second idea, but when the administrator deletes a tag we do nothing. Instead, when viewing a product's tags, we check whether each id in tags is still valid (by querying the Tag table). So here comes the problem again: isn't that inefficient, given that products are viewed far more often than tags are deleted?
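The third idea could look roughly like this, again with dictionaries standing in for the tables. The names delete_tag and valid_tag_descriptions and the sample data are hypothetical.

```python
# Stand-ins for the Tag and Product tables (hypothetical data).
tags = {1: "new", 2: "sale"}
products = {1: {"name": "kettle", "tags": [1, 2]}}


def delete_tag(tag_id):
    """Deleting is cheap: only the Tag entry goes away, no product is touched."""
    tags.pop(tag_id, None)


def valid_tag_descriptions(product_id):
    """Every view pays for the validity check against the Tag table."""
    return [
        tags[tag_id]
        for tag_id in products[product_id]["tags"]
        if tag_id in tags  # skip ids whose tag has been deleted
    ]
```

After delete_tag(2), the stale id 2 stays in the product's tags array and is filtered out on every view.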
Is there a better way, or a way to optimize one of these solutions?
Looking for an answer! Thank you ~