When scraping data, network problems often cause the program to fail. Previously, we only logged the error and reprocessed the failed requests afterwards.
Original approach:

```python
def crawl_page(url):
    pass

def log_error(url):
    pass

url = ""
try:
    crawl_page(url)
except Exception:
    log_error(url)
```
Improved approach:

```python
attempts = 0
success = False
while attempts < 3 and not success:
    try:
        crawl_page(url)
        success = True
    except Exception:
        attempts += 1
```
Recently I discovered a better solution: retrying.

retrying is a Python package for retrying code that may fail to run. It provides a decorator, retry; a decorated function is executed again if it fails, and by default it keeps retrying as long as an exception is raised.
```python
import random
from retrying import retry

@retry
def do_something_unreliable():
    if random.randint(0, 10) > 1:
        raise IOError("Broken sauce, everything is hosed!!!111one")
    else:
        return "Awesome sauce!"

print(do_something_unreliable())
```
When we run do_something_unreliable, it keeps retrying until random.randint(0, 10) returns 0 or 1, at which point execution ends with the return value; until then, it retries indefinitely.
retry also accepts a number of parameters; the optional arguments can be seen in the initialization method of the Retrying class in the source code:
* stop_max_attempt_number: sets the maximum number of attempts; retrying stops after this many attempts.
* stop_max_delay: sets the maximum total delay in milliseconds. For example, if set to 10000, then from the moment the decorated function starts executing until it either succeeds or finally fails, once this period exceeds 10 seconds the function is no longer retried.
* wait_fixed: sets a fixed wait time, in milliseconds, between two retries.
* wait_random_min and wait_random_max: wait a random time, bounded by these two values (in milliseconds), between two retries.
* wait_exponential_multiplier and wait_exponential_max: wait an exponentially growing time between two retries. The wait is 2^previous_attempt_number * wait_exponential_multiplier, where previous_attempt_number is the number of retries so far. Once the computed value exceeds wait_exponential_max, the wait between subsequent retries stays at wait_exponential_max. This design follows the exponential backoff algorithm, which helps reduce congestion.
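To make the backoff schedule concrete, here is a small sketch of the formula above (not the library's actual code; the 1000 ms multiplier and 10000 ms cap are values assumed here for illustration):

```python
def exponential_wait(previous_attempt_number,
                     wait_exponential_multiplier=1000,
                     wait_exponential_max=10000):
    """Wait time in milliseconds before the next retry, per the formula above."""
    wait = (2 ** previous_attempt_number) * wait_exponential_multiplier
    return min(wait, wait_exponential_max)

# Waits grow 2000, 4000, 8000 ms, then stay capped at 10000 ms.
waits = [exponential_wait(n) for n in range(1, 6)]
print(waits)  # [2000, 4000, 8000, 10000, 10000]
```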
* retry_on_exception: we can specify which exceptions should trigger a retry by passing a function object:

```python
def retry_if_io_error(exception):
    return isinstance(exception, IOError)

@retry(retry_on_exception=retry_if_io_error)
def read_a_file():
    with open("file", "r") as f:
        return f.read()
```

When read_a_file raises an exception, that exception is passed as the parameter exception to retry_if_io_error. If the exception is an IOError, the call is retried; if not, execution stops and the exception is raised.
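This dispatch can be sketched with a minimal stand-in decorator (an illustration of the retry-on-exception logic only, not the retrying library's implementation; simple_retry and max_attempts are names made up here):

```python
import functools

def simple_retry(retry_on_exception, max_attempts=3):
    """Retry a function when the predicate says the raised exception is retryable."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    # Re-raise when the exception is not retryable or attempts ran out.
                    if not retry_on_exception(exc) or attempt == max_attempts:
                        raise
        return wrapper
    return decorator

calls = []

@simple_retry(lambda e: isinstance(e, IOError), max_attempts=3)
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise IOError("transient failure")
    return "ok"

print(flaky())  # succeeds on the third attempt
```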
We can also retry based on the function's return value by passing a function object to retry_on_result:

```python
def retry_if_result_none(result):
    return result is None

@retry(retry_on_result=retry_if_result_none)
def get_result():
    return None
```

After get_result returns successfully, its return value is passed as the parameter result to retry_if_result_none. If the return value is None, the call is retried; otherwise execution ends and the value is returned.
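The result-based retry can likewise be sketched with a stand-in decorator (an illustration of the semantics, not the library's code; retry_until and max_attempts are hypothetical names):

```python
import functools

def retry_until(retry_on_result, max_attempts=5):
    """Re-call the function while the predicate says the result warrants a retry."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            attempts = 1
            while retry_on_result(result) and attempts < max_attempts:
                result = func(*args, **kwargs)
                attempts += 1
            return result
        return wrapper
    return decorator

# Simulate a source that yields None twice before producing data.
results = iter([None, None, "data"])

@retry_until(lambda r: r is None, max_attempts=5)
def get_value():
    return next(results)

value = get_value()
print(value)  # "data": the two None results were retried past
```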