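Code that changes its behavior depending on the response to the request: the download is retried only when the server answers with a 5xx error.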
code:
import urllib.request
from urllib.error import URLError, HTTPError, ContentTooShortError


def download(url, num_retries=2):
    print('Downloading:', url)
    try:
        html = urllib.request.urlopen(url).read()
    except (URLError, HTTPError, ContentTooShortError) as e:
        print('Download error:', e.reason)
        html = None
        if num_retries > 0:
            # Retry only on 5xx server errors; 4xx client errors are not retried.
            if hasattr(e, 'code') and 500 <= e.code < 600:
                return download(url, num_retries - 1)
    return html


# httpstat.us/500 always answers with HTTP 500, so every retry is exercised.
downloaded_html = download('http://httpstat.us/500')
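The same retry rule can also be written as a loop instead of recursion. The version below is only a rough sketch, not part of the original post: the name `download_with_backoff`, the `delay` parameter, and the `opener` parameter are all made up for illustration. `opener` defaults to `urllib.request.urlopen` and is there so the function can be tried without a network connection, and the growing wait between attempts is an added assumption.

```python
import time
import urllib.request
from urllib.error import URLError, HTTPError


def download_with_backoff(url, num_retries=2, delay=1.0,
                          opener=urllib.request.urlopen):
    """Hypothetical loop-based variant of download().

    Retries only on 5xx errors and waits longer before each retry.
    `delay` and `opener` are illustrative assumptions.
    """
    for attempt in range(num_retries + 1):
        try:
            return opener(url).read()
        except URLError as e:
            # HTTPError and ContentTooShortError are subclasses of URLError.
            print('Download error:', e.reason)
            code = getattr(e, 'code', None)
            is_server_error = code is not None and 500 <= code < 600
            if not is_server_error or attempt == num_retries:
                return None
            # Wait delay, 2*delay, 4*delay, ... before the next attempt.
            time.sleep(delay * 2 ** attempt)
    return None
```

Like the recursive version, `num_retries=2` means at most three attempts in total. Passing a fake `opener` that raises `HTTPError` makes it possible to check the retry logic offline.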