When scraping data, the program often fails because of network problems. Previously, I only logged the error and processed the failed items afterwards.

Original approach:

    def crawl_page(url):
        pass

    def log_error(url):
        pass

    url = ""
    try:
        crawl_page(url)
    except Exception:
        # Record the failed URL so it can be handled later.
        log_error(url)

Improved approach:

    attempts = 0
    success = False
    # Try up to three times, stopping at the first success.
    while attempts < 3 and not success:
        try:
            crawl_page(url)
            success = True
        except Exception:
            attempts += 1
            if attempts == 3:
                break

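
The loop above can be factored into a small reusable helper. The following is only a minimal sketch: the name `run_with_retries` and its parameters are hypothetical and not part of any library, and `flaky` merely stands in for a real `crawl_page`.

```python
def run_with_retries(func, arg, max_attempts=3):
    """Call func(arg) up to max_attempts times.

    Returns (True, result) on the first success, or (False, last_error)
    once every attempt has failed. Hypothetical helper for illustration.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            return True, func(arg)
        except Exception as exc:  # a real crawler should catch narrower errors
            last_error = exc
    return False, last_error


# Stand-in for crawl_page: fails twice, then succeeds.
calls = {"count": 0}

def flaky(url):
    calls["count"] += 1
    if calls["count"] < 3:
        raise IOError("simulated network error")
    return "page content for " + url

ok, value = run_with_retries(flaky, "http://example.com")
print(ok, value)
```

Returning the last error instead of raising it lets the caller decide whether to log the failure, as the original approach did with `log_error`.
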
Recently I discovered a new solution: retrying.

retrying is a Python package for retrying operations. It can be used to automatically rerun program segments that may fail. retrying provides a decorator function, retry: a decorated function is executed again if it fails. By default, it keeps retrying for as long as an exception is raised.

    import random
    from retrying import retry

    @retry
    def do_something_unreliable():
        # Fails most of the time to simulate an unreliable operation.
        if random.randint(0, 10) > 1:
            raise IOError("Broken sauce, everything is hosed!!!111one")
        else:
            return "Awesome sauce!"

    print(do_something_unreliable())

If we run the do_something_unreliable function, execution ends only when random.randint returns 0 or 1; otherwise the call is repeated again and again.

retry also accepts parameters. The optional parameters can be seen in the initialization function of the Retrying class in the source code:

* stop_max_attempt_number: the maximum number of attempts; retrying stops once this number is reached.
* stop_max_delay: the maximum total time, in milliseconds. For example, if it is set to 10000, the elapsed time is measured from the moment the decorated function first starts executing until an attempt either succeeds or fails; once this period exceeds 10 seconds, the function is not executed again.
* wait_fixed: a fixed wait time, in milliseconds, between two retries.
* wait_random_min and wait_random_max: a randomly chosen wait time between two retries, within these two bounds.
* wait_exponential_multiplier and wait_exponential_max: an exponentially growing wait time between two retries. The generated value is 2^previous_attempt_number * wait_exponential_multiplier, where previous_attempt_number is the number of attempts already made. If the generated value exceeds wait_exponential_max, the wait between subsequent retries is wait_exponential_max. This design follows the exponential backoff algorithm and can reduce congestion.
* We can specify which exceptions should trigger a retry by passing a function object as retry_on_exception:

    def retry_if_io_error(exception):
        return isinstance(exception, IOError)

    @retry(retry_on_exception=retry_if_io_error)
    def read_a_file():
        with open("file", "r") as f:
            return f.read()

  If read_a_file raises an exception during execution, that exception is passed as the parameter exception to the retry_if_io_error function. If the exception is an IOError, the call is retried; otherwise execution stops and the exception is raised.
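
To make the exponential wait rule from the list above concrete, the sketch below recomputes the waits by hand rather than calling the library. `exponential_wait` is a hypothetical helper that simply applies the formula as stated; the multiplier 1000 and the cap 10000 are chosen only for illustration, and all values are taken to be in milliseconds.

```python
def exponential_wait(previous_attempt_number, multiplier, max_wait):
    """Wait in ms: 2^previous_attempt_number * multiplier, capped at max_wait."""
    wait = (2 ** previous_attempt_number) * multiplier
    return min(wait, max_wait)


# Waits after the 1st, 2nd, ... 5th failed attempt.
waits = [exponential_wait(n, 1000, 10000) for n in range(1, 6)]
print(waits)  # [2000, 4000, 8000, 10000, 10000]
```

With the decorator itself, the same settings would be written as `@retry(wait_exponential_multiplier=1000, wait_exponential_max=10000)`.
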

We can also specify which results should trigger a retry by passing a function object as retry_on_result:

    def retry_if_result_none(result):
        return result is None

    @retry(retry_on_result=retry_if_result_none)
    def get_result():
        return None

After get_result executes successfully, its return value is passed as the parameter result to the retry_if_result_none function. If the return value is None, the call is retried; otherwise it ends and the return value is returned. Note that get_result as written always returns None, so this particular example retries indefinitely.
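
To show how the two predicates could fit together, here is a minimal standard-library sketch of a retry decorator. It is not the retrying implementation: `simple_retry`, `max_attempts`, and `GiveUp` are hypothetical names, and waiting between attempts is left out for brevity.

```python
import functools


class GiveUp(Exception):
    """Raised when every attempt produced a rejected result (hypothetical)."""


def simple_retry(max_attempts=3, retry_on_exception=None, retry_on_result=None):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    result = func(*args, **kwargs)
                except Exception as exc:
                    # Re-raise when the predicate rejects the exception
                    # or when no attempts are left.
                    retryable = retry_on_exception is None or retry_on_exception(exc)
                    if not retryable or attempt == max_attempts:
                        raise
                    continue
                # Return the result unless the predicate asks for a retry.
                if retry_on_result is None or not retry_on_result(result):
                    return result
            raise GiveUp("no acceptable result after %d attempts" % max_attempts)
        return wrapper
    return decorator


def retry_if_result_none(result):
    return result is None


# Simulated results: two empty responses, then real data.
results = iter([None, None, "data"])

@simple_retry(max_attempts=5, retry_on_result=retry_if_result_none)
def get_result():
    return next(results)

print(get_result())  # data
```

Unlike the default behaviour described earlier, this sketch always bounds the number of attempts, so a function that keeps returning None ends with `GiveUp` instead of looping forever.
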
