Cache penetration

  Problem description

        The data for the requested key does not exist in the data source, so every request for this key misses the cache and is passed through to the data source (the database), which may overwhelm it. For example, a lookup of user information with a non-existent user id finds nothing in either the cache or the database; if an attacker exploits this, the database may be brought down.

Symptoms of cache penetration:

* Pressure on the application server increases
* The Redis hit rate drops
* The database is queried constantly, putting it under so much pressure that it crashes

Note that throughout this process Redis itself keeps running smoothly; it is the database (e.g. MySQL) that crashes.

Cause of cache penetration: hackers or other abnormal users issue large numbers of requests for non-existent keys (abnormal URLs), so Redis can never answer them and every request falls through to the database.

Solution

* Cache null values: if a query returns no data (whether or not the data actually exists), still cache the empty result (null), but give it a very short expiration time, five minutes at most.
* Set up an accessible list (whitelist): use the bitmaps type to define an accessible list, with each valid id as a bitmap offset. On every access, compare the id against the bitmap; if the id is not in the bitmap, intercept the request and deny access.
* Use a Bloom filter: the Bloom filter was proposed by Burton Howard Bloom in 1970. It is essentially a very long binary vector (a bitmap) together with a series of random mapping functions (hash functions). A Bloom filter can be used to test whether an element belongs to a set. Its advantage is that its space efficiency and query time far exceed those of general-purpose algorithms; its disadvantages are a certain false-positive rate and the difficulty of deleting elements.
* Monitor in real time: when the Redis hit rate starts to drop rapidly, inspect the accessing clients and the data being requested, and work with operations staff to set up blacklist restrictions.
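The Bloom filter idea above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation (Redis modules such as RedisBloom provide one); the bit-array size and hash count are arbitrary assumptions.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: a bit array plus k salted hash functions."""

    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = bytearray(size // 8)

    def _positions(self, item):
        # Derive k bit positions from salted SHA-256 digests.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means definitely absent; True may be a false positive.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

Against penetration, all existing ids would be added to the filter at startup; a request whose id fails `might_contain` is rejected before it ever reaches Redis or the database.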
Cache breakdown

    Problem description

        The data for the requested key exists, but it has expired in Redis. If a large number of concurrent requests arrive at that moment, they all find the cache expired, load the data from the back-end database, and reset it in the cache, and this burst of concurrent requests may instantly overwhelm the back-end database.

Symptoms of cache breakdown:

* Database access pressure spikes instantaneously and the database crashes
* There is no large number of expired keys in Redis
* Redis is running normally

Cause of cache breakdown: a particular Redis key expires at the very moment it is being accessed massively (a hot key).

Solution

        Such a key may be accessed with extremely high concurrency at certain points in time; it is a very "hot" piece of data.

* Preload popular data: before Redis's peak-access periods, store popular data in Redis in advance and increase the expiration time of these hot keys.
* Adjust in real time: monitor on the spot which data is popular and adjust the keys' expiration times in real time.
* Use a mutex lock:
  * When the cache misses (the value is judged empty), do not load from the DB immediately.
  * First use a cache operation that returns a success flag (such as Redis's SETNX) to set a mutex key.
  * If the SETNX succeeds, perform the DB load, reset the cache, and finally delete the mutex key.
  * If it fails, another thread is already loading from the DB; the current thread sleeps for a while and then retries the whole get-from-cache method.
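The mutex steps above might look like the following sketch. To keep it self-contained, a plain dict stands in for Redis and `setnx` mimics SETNX semantics; `load_db` is a hypothetical back-end database call.

```python
import time

cache = {}  # stands in for Redis: key -> value

def setnx(key, value):
    """Mimic Redis SETNX: set only if absent; return True on success."""
    if key in cache:
        return False
    cache[key] = value
    return True

def load_db(key):
    # Hypothetical back-end database read.
    return f"db-value-for-{key}"

def get_with_mutex(key, retry_delay=0.05):
    value = cache.get(key)
    if value is not None:
        return value                      # cache hit
    mutex = f"mutex:{key}"
    if setnx(mutex, 1):                   # we won the lock: rebuild the cache
        try:
            value = load_db(key)
            cache[key] = value            # reset the cache
        finally:
            cache.pop(mutex, None)        # release the mutex key
        return value
    time.sleep(retry_delay)               # another thread is loading; wait
    return get_with_mutex(key, retry_delay)
```

With a real Redis client one would also give the mutex key a short TTL (e.g. `SET mutex:key 1 NX EX 10`) so a crashed loader cannot hold the lock forever.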
Cache avalanche

     Problem description

        The data for the requested keys exists, but it has expired in Redis. If a large number of concurrent requests arrive at that moment, they all load data from the back-end database and reset it in the cache, and this burst of concurrent requests may instantly overwhelm the back-end database.

        The difference between a cache avalanche and a cache breakdown is that an avalanche involves a large number of cached keys expiring at once, whereas a breakdown concerns a single hot key.

Solution

* Build a multi-level cache architecture: nginx cache + Redis cache + other caches (Ehcache, etc.).
* Use locks or queues: locking or queueing ensures that large numbers of threads do not read from and write to the database at the same time, so that a flood of concurrent requests does not fall on the underlying storage system when the cache fails. This method is not suitable for high concurrency.
* Set an expiration flag to refresh the cache: record whether the cached data is about to expire (flag it ahead of time); if so, trigger another thread in the background to refresh the actual key's cache.
* Spread out cache expiration times: for example, add a random value to the base expiration time, such as a random 1-5 minutes, so that expiration times repeat far less often and collective expiration is hard to trigger.
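Spreading expiration times can be sketched as a small helper that adds a random 1-5 minutes of jitter to a base TTL; the 30-minute base here is an arbitrary assumption for illustration.

```python
import random

def jittered_ttl(base_seconds=30 * 60):
    """Base TTL plus a random 1-5 minute offset to avoid collective expiry."""
    return base_seconds + random.randint(60, 300)

# With a real Redis client one would then write, for example:
#   r.set("hot:key", value, ex=jittered_ttl())
```

Because each key gets a slightly different TTL, keys written in the same batch no longer expire in the same instant.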
