Thanks for your time.
I wrote a crawler. At one step, a set of image URLs is passed into this file, which then starts downloading the pictures:
```python
from urllib import request
import ssl

# allow downloads from sites with unverified HTTPS certificates
ssl._create_default_https_context = ssl._create_unverified_context


class Downloader(object):

    def download(self, urlSet):
        print("----- start download -----")
        headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0"}
        if isinstance(urlSet, set):
            count = 0
            for imgUrl in urlSet:
                coat = request.Request(url=imgUrl, headers=headers)
                res = request.urlopen(coat)
                content = res.read()
                f = open("%d.jpg" % count, "wb")
                f.write(content)
                f.close()
                count += 1
        else:
            coat = request.Request(url=urlSet, headers=headers)
            res = request.urlopen(coat)
            content = res.read()
            f = open("001.jpg", "wb")
            f.write(content)
            f.close()
        print("----- download over -----")
```
After downloading a few thousand pictures, the program stops writing image files but keeps crawling pages, and no exception is thrown. Setting a breakpoint there has no effect.
I think there may be something wrong with how I write the files.
Could you recommend a reliable way to write the files, or a third-party package? Thank you.
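One possible cause is that the `urlopen` responses above are never closed, so after thousands of downloads the process can exhaust its file descriptors and fail silently. Below is a minimal sketch, not a definitive fix, that closes every response and file handle with context managers, adds a timeout, and skips failed URLs instead of dying; the function and helper names (`download_images`, `save_bytes`) and the timeout value are my own assumptions, not from the original code:

```python
from urllib import request

# headers copied from the question; adjust as needed
HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0"}


def save_bytes(path, data):
    """Write bytes to a file; the context manager guarantees the
    handle is flushed and closed even if the write raises."""
    with open(path, "wb") as f:
        f.write(data)


def download_images(urls, timeout=10):
    """Download each URL to a numbered .jpg, closing every response."""
    for count, img_url in enumerate(urls):
        req = request.Request(url=img_url, headers=HEADERS)
        try:
            # the context manager closes the socket/response object,
            # which prevents file-descriptor exhaustion over long runs
            with request.urlopen(req, timeout=timeout) as res:
                save_bytes("%d.jpg" % count, res.read())
        except OSError as exc:
            # log and continue instead of stalling the whole crawl
            print("skipped %s: %s" % (img_url, exc))
```

If you prefer a third-party package, `requests` is a commonly used alternative, but the standard library used this way should already behave reliably.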