I store article information in MongoDB and want to detect duplicate articles by comparing their titles. For example, given titles like the following:
,
:,
,
,!!
,!!
,!!
Articles that should be considered duplicates can be found by creating a text
index on the title field and searching it, as follows:
> db.post.createIndex({title: "text"})
> db.post.find({$text: {$search: ":,"}}, {score: {$meta: "textScore"}}).sort({score: {$meta: "textScore"}})
{ "_id" : ObjectId("5b2f809152993004aaabdacc"), "title" : ":,", "score" : 2 }
{ "_id" : ObjectId("5b2f809252993004aaabdacd"), "title" : ", ", "score" : 1.3333333333333333 }
{ "_id" : ObjectId("5b2f809152993004aaabdacb"), "title" : ", ", "score" : 1.25 }
> db.post.find({$text: {$search: "!!"}}, {score: {$meta: "textScore"}}).sort({score: {$meta: "textScore"}})
{ "_id" : ObjectId("5b2f836652993004aaabdad6"), "title" : ",!!", "score" : 1.3333333333333333 }
{ "_id" : ObjectId("5b2f836652993004aaabdad7"), "title" : ",!!", "score" : 1.3333333333333333 }
{ "_id" : ObjectId("5b2f836652993004aaabdad8"), "title" : ",!!", "score" : 1.3333333333333333 }
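Both queries above share the same shape: a `$text` filter, a `textScore` projection, and a sort on that score. As a minimal sketch, that pattern could be wrapped in a small helper. The function name `buildTitleSearch` and its return shape are hypothetical, chosen only for illustration; they are not part of MongoDB.

```javascript
// Hypothetical helper: builds the three arguments used by the queries above.
// Only $text, $search, and $meta: "textScore" are MongoDB operators;
// the function name and the returned object layout are assumptions.
function buildTitleSearch(title) {
  const scoreMeta = { $meta: "textScore" };
  return {
    filter: { $text: { $search: title } }, // match against the text index on title
    projection: { score: scoreMeta },      // expose the relevance score
    sort: { score: scoreMeta },            // most relevant documents first
  };
}

// In the mongo shell, with the text index on `title` already created:
//   const q = buildTitleSearch("some title");
//   db.post.find(q.filter, q.projection).sort(q.sort)
const q = buildTitleSearch("some title");
console.log(JSON.stringify(q.filter));
```

This only packages the existing query; it does not change which documents the text index matches.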
In some cases, however, articles with similar titles are not found. For example:
,
-sharp
> db.post.find({$text: {$search: ""}}, {score: {$meta: "textScore"}}).sort({score: {$meta: "textScore"}})
{ "_id" : ObjectId("5b2f80cd52993004aaabdad3"), "title" : "", "score" : 1.1 }
{ "_id" : ObjectId("5b2f80ce52993004aaabdad5"), "title" : ",", "score" : 0.75 }
65
67
68
-sharp 1
> db.post.find({$text: {$search: "65"}}, {score: {$meta: "textScore"}}).sort({score: {$meta: "textScore"}})
{ "_id" : ObjectId("5b2f80ad52993004aaabdace"), "title" : "65", "score" : 1.1 }
Why does the text search fail to find some titles that are obviously very similar?