def process_item(self, item, spider):
    print("")
    print(item["file_url"], item["name"])
    key_word = {"file_url": "asdasdasd", "name": "asdasdadas"}
    res = self.db.find(key_word)
    print("")
    print(res)
    if res:
        print("")
        raise DropItem("Duplicate item found: %s" % item)
    else:
        print("*******************************************************************************")
        self.db.insert({"file_url": item["file_url"], "name": item["name"]})
        return item
db.XiaoMiQuan.find()
{"_id": ObjectId("5bbf14dbc96b5b3f5627d11d"), "file_url": "https://baogaocos.seedsufe.com/2018/07/19/doc_1532004923556.pdf", "name": "AMCHAM-China's \"Belt and Road Initiative\": impact on American Enterprises (English)-2018.6-8 pages.pdf"}
This is what is currently in my database collection. But when I query from Python for data that does not exist in it, something is still returned:
f&e=1874736000&token=kIxbL07-8jAj8w1n4s9zv64FuZZNEATmlU_Vm6zD:Aa_-t7C8cCDjBWe3EZYatp1qQis= Akiba Uncle Akiba-how do you manage a community? page v2-42.pdf
<pymongo.cursor.Cursor object at 0x7fbb73f301d0>
2018-10-12 15:53:40 [scrapy.core.scraper] WARNING: Dropped: Duplicate item found: {"file_url": "https://files.zsxq.com/lnQuwPAAWDexZKnV1XbBjDRDNA71?attname=%E7%A7%8B%E5%8F%B6%E5%A4%A7%E5%8F%94-%E7%BB%99%E4%BD%A0%E4%B8%80%E4%B8%AA%E7%A4%BE%E7%BE%A4%E4%BD%A0%E6%80%8E%E4%B9%88%E7%AE%A1v2-42%E9%A1%B5.pdf&e=1874736000&token=kIxbL07-8jAj8w1n4s9zv64FuZZNEATmlU_Vm6zD:Aa_-t7C8cCDjBWe3EZYatp1qQis=",
"name": "-v2-42.pdf"}
{"file_url": "https://files.zsxq.com/lnQuwPAAWDexZKnV1XbBjDRDNA71?attname=%E7%A7%8
The pymongo.cursor.Cursor object shown above is what the query returned, even though the key_word I searched for is not in the collection.
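
For what it's worth, a likely explanation is that `find()` always hands back a Cursor object, and a Cursor is truthy whether or not anything matched, so `if res:` always takes the "duplicate" branch. Below is a minimal sketch of that behaviour and of checking with `find_one()` instead, which returns `None` when there is no match. To keep it runnable without a MongoDB server, `FakeCursor` and `FakeCollection` are hypothetical in-memory stand-ins (not part of PyMongo) that only mimic this one aspect; `is_duplicate` and the example URL are also made up for illustration.

```python
class FakeCursor:
    """Hypothetical stand-in for pymongo.cursor.Cursor.

    Like the real Cursor, it is iterable but defines neither __bool__
    nor __len__, so bool(cursor) is True even with zero results.
    """

    def __init__(self, docs):
        self._docs = docs

    def __iter__(self):
        return iter(self._docs)


class FakeCollection:
    """Hypothetical in-memory stand-in for a PyMongo collection."""

    def __init__(self):
        self._docs = []

    def find(self, query):
        # Exact-match filter on every key in the query.
        matches = [d for d in self._docs
                   if all(d.get(k) == v for k, v in query.items())]
        return FakeCursor(matches)

    def find_one(self, query):
        # First matching document, or None when nothing matches.
        return next(iter(self.find(query)), None)

    def insert_one(self, doc):
        self._docs.append(dict(doc))


def is_duplicate(collection, item):
    """Return True only if a document with the same url and name exists."""
    query = {"file_url": item["file_url"], "name": item["name"]}
    return collection.find_one(query) is not None


coll = FakeCollection()
item = {"file_url": "https://example.com/a.pdf", "name": "a.pdf"}

cursor_is_truthy = bool(coll.find(item))  # truthy although nothing matches
first_check = is_duplicate(coll, item)    # nothing stored yet
coll.insert_one(item)
second_check = is_duplicate(coll, item)   # same item is now stored
```

In the pipeline this would mean building the query from the item's own `file_url` and `name`, and raising `DropItem` only when `self.db.find_one(query)` is not `None`. Note also that `insert()` was deprecated in PyMongo 3 and removed in PyMongo 4, where `insert_one()` is the replacement.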