When crawling images from a web page with Scrapy, I defined a custom class in the pipelines file that inherits from ImagesPipeline.
However, the custom pipeline is never executed when the program runs, and no items are passed to it.
The custom pipeline is as follows:
import os

import scrapy
from scrapy.pipelines.images import ImagesPipeline
from scrapy.utils.project import get_project_settings


class Douyu3Pipeline(ImagesPipeline):
    # def process_item(self, item, spider):
    #     return item

    # Image directory, read from the project settings
    IMAGES_STORE = get_project_settings().get("IMAGE_STORE")

    def get_media_requests(self, item, info):
        print("1---------------------------1")
        image_url = item["imagelink"]
        yield scrapy.Request(image_url)

    def item_completed(self, results, item, info):
        # Keep the paths of the images that downloaded successfully
        image_path = [x["path"] for ok, x in results if ok]
        # Rename the downloaded file after the item's nickname
        os.rename(self.IMAGES_STORE + "/" + image_path[0],
                  self.IMAGES_STORE + "/" + item["nickname"] + ".jpg")
        item["imagePath"] = self.IMAGES_STORE + "/" + item["nickname"]
        return item
The settings file is as follows:
BOT_NAME = "douyu3"
SPIDER_MODULES = ["douyu3.spiders"]
NEWSPIDER_MODULE = "douyu3.spiders"
ROBOTSTXT_OBEY = True
# Override the default request headers:
DEFAULT_REQUEST_HEADERS = {
    "User-Agent": "DYZB/1 CFNetwork/808.2.16 Darwin/16.3.0",
}
ITEM_PIPELINES = {
    "douyu3.pipelines.Douyu3Pipeline": 300,
}
IMAGE_STORE = "/Users/enritami/desktop/SearchEngineer/douyu3/Images"
The log output is as follows:
Enabled item pipelines: []
The custom pipeline is not added to the list of enabled pipelines.
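A likely cause, assuming stock Scrapy behaviour: ImagesPipeline reads its storage directory from the IMAGES_STORE setting and disables itself (by raising NotConfigured) when that setting is empty, which would leave the list of enabled pipelines empty. The settings above define IMAGE_STORE, without the S. A minimal sketch of the adjusted settings, reusing the path from the question:

```python
# settings.py (sketch): only the setting name changes; the path is the one from the question.
ITEM_PIPELINES = {
    "douyu3.pipelines.Douyu3Pipeline": 300,
}

# ImagesPipeline looks up IMAGES_STORE (plural); a setting named IMAGE_STORE is not read by it.
IMAGES_STORE = "/Users/enritami/desktop/SearchEngineer/douyu3/Images"
```

With this rename, the lookup in the pipeline class would also need to become get_project_settings().get("IMAGES_STORE"), otherwise the class attribute is None and the os.rename call fails. ImagesPipeline also depends on the Pillow package, so it is worth checking that it is installed.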