What is operations and maintenance (O&M) actually for? Even O&M engineers themselves may not be entirely clear about it, and a search on Baidu rarely turns up a real answer. After talking to many veteran O&M staff, I finally summed up what an O&M engineer's work involves:


Generally speaking, "O&M engineer" here means an O&M engineer at an Internet company. O&M usually belongs to the technical department and, together with research and development, testing, and system management, is one of the four main departments that provide technical support for Internet products. The details differ between domestic and foreign companies, and between large and small ones, but the main work is as follows:

1. Keep business systems running stably over the long term

After all, if the business system fails, users will complain, so the core job of an O&M engineer is to keep the business system running stably.

First, we need to know what the business runs on. Typically the web server is nginx or Apache, data is stored in a MySQL database, and PHP parses and executes the application code, so an O&M engineer must know how to deploy LNMP and LAMP environments.
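Once such a stack is deployed, a quick way to confirm that each component is reachable is to try a TCP connection to its port. The following is a minimal sketch, not a deployment tool: the names `LNMP_PORTS`, `port_open`, and `check_stack` are made up for illustration, and the port numbers are common defaults that a real server may not use (PHP-FPM, for instance, often listens on a Unix socket rather than TCP 9000).

```python
import socket

# Assumed default ports, for illustration only; real deployments often differ.
LNMP_PORTS = {"nginx": 80, "mysql": 3306, "php-fpm": 9000}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_stack(host: str, ports: dict[str, int] = LNMP_PORTS) -> dict[str, bool]:
    """Map each service name to whether its port accepts TCP connections."""
    return {name: port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, ok in check_stack("127.0.0.1").items():
        print(f"{name}: {'up' if ok else 'DOWN'}")
```

An open port only shows that something is listening; it does not prove that the service behind it is healthy, so a check like this complements real monitoring rather than replacing it.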


2. Keep data secure and reliable

Data security is what company leaders care about most, so O&M engineers must also make sure that data is secure and reliable. If something goes wrong, the leaders will call the O&M team in for a talk.

Sometimes you need to change the contents of a database by hand, so you must master MySQL's basic insert, delete, query, and update operations.

Sometimes the server hardware hosting the database breaks down, so you need MySQL master-slave replication as a fallback for emergencies.

Sometimes you need to restore a database, so you need to learn MySQL incremental backup and recovery in order to restore to a specified point in time.

Sometimes scheduled backups are not enough, and you need rsync + inotify to back up in real time.

Sometimes, to make a server more secure, you need iptables to restrict access to the company's IP addresses or to the IP of a jump server (bastion host).
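The iptables allow-list idea can be sketched as a small script that generates the rules rather than applying them. This is an illustration under assumptions: the function name `ssh_allowlist_rules` is hypothetical, the protected port is assumed to be SSH on 22, and the addresses in the usage line come from the documentation-only ranges, not from any real network.

```python
import ipaddress


def ssh_allowlist_rules(allowed_sources: list[str], port: int = 22) -> list[str]:
    """Build iptables commands that allow only the given sources to reach a TCP port."""
    rules = []
    for source in allowed_sources:
        # ip_network validates the input and normalizes a bare IP to a /32 network.
        network = ipaddress.ip_network(source, strict=False)
        rules.append(
            f"iptables -A INPUT -p tcp -s {network} --dport {port} -j ACCEPT"
        )
    # Everything else that targets the port is dropped. Order matters:
    # iptables evaluates rules top to bottom, so ACCEPT rules must come first.
    rules.append(f"iptables -A INPUT -p tcp --dport {port} -j DROP")
    return rules


if __name__ == "__main__":
    # Placeholder addresses: an office network and a jump server.
    for rule in ssh_allowlist_rules(["198.51.100.0/24", "203.0.113.10"]):
        print(rule)
```

Printing the commands instead of executing them leaves room for review: a wrong DROP rule on the SSH port can lock you out of the server, so it is worth checking the output (and having console access) before applying anything.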


3. Build a monitoring and alerting system

O&M engineers often use Zabbix or Nagios for monitoring and alerting. Without monitoring you are blind, so the monitoring and alerting system should be built first; after that comes the work of resolving system failures.


Generally speaking, common failures include application failures, database failures, network cable faults, and so on. Some are software failures and some are hardware failures, and an experienced O&M engineer can locate the cause of a fault right away.
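At its core, threshold-based alerting of the kind these tools provide compares collected metrics against warning and critical levels. The sketch below shows that idea in plain Python; it does not use or imitate the Zabbix or Nagios configuration formats, and the `Threshold` class, the `evaluate` function, and the metric names are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str       # name of the metric, e.g. "cpu_percent" (hypothetical)
    warning: float    # value at or above which a WARNING is raised
    critical: float   # value at or above which a CRITICAL is raised


def evaluate(samples: dict[str, float], thresholds: list[Threshold]) -> list[str]:
    """Compare the latest metric samples against thresholds and return alert lines."""
    alerts = []
    for t in thresholds:
        value = samples.get(t.metric)
        if value is None:
            # Missing data is itself worth alerting on: the collector may be down.
            alerts.append(f"UNKNOWN {t.metric}: no data")
        elif value >= t.critical:
            alerts.append(f"CRITICAL {t.metric}={value}")
        elif value >= t.warning:
            alerts.append(f"WARNING {t.metric}={value}")
    return alerts


if __name__ == "__main__":
    rules = [Threshold("cpu_percent", 80, 90), Threshold("disk_percent", 80, 90)]
    for line in evaluate({"cpu_percent": 95.0, "disk_percent": 82.0}, rules):
        print(line)
```

Treating missing data as an alert reflects the point made above: a metric that silently stops reporting leaves you just as blind as having no monitoring at all.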

4. Handle technical and business problems

There are two core kinds of problems: technical problems and business problems. Technical problems mainly call for network packet capture and analysis (for example with tcpdump), an understanding of proxy mechanisms, and so on.


Business problems are more complex than technical ones. Business-level data analysis, for example, requires not only collecting statistics on various business indicators but also analyzing and dissecting the data to find out where the business problem lies.

5. Test and launch new versions

This is also routine work for O&M engineers, who are responsible for testing and launching new versions. Before the developers release a version, O&M engineers need to run performance and functional tests. When the version goes live, it is best to do so in the evening, when business volume is low, to avoid putting too much pressure on the live system.
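Choosing a low-traffic time for a launch can be based on data rather than habit. The sketch below assumes you already have request counts per hour of the day (how they are collected is outside its scope) and finds the quietest window; the function name `quietest_window` and the sample numbers are made up for illustration.

```python
def quietest_window(hourly_requests: list[int], window_hours: int = 1) -> int:
    """Return the start hour (0-23) of the window with the fewest total requests.

    The window wraps around midnight, so a 2-hour window may start at 23.
    """
    if len(hourly_requests) != 24:
        raise ValueError("expected one request count per hour of the day (24 values)")
    if not 1 <= window_hours <= 24:
        raise ValueError("window_hours must be between 1 and 24")

    def load(start: int) -> int:
        # Total requests in the window that begins at `start`.
        return sum(hourly_requests[(start + i) % 24] for i in range(window_hours))

    # On ties, min() keeps the earliest start hour.
    return min(range(24), key=load)


if __name__ == "__main__":
    # Invented sample: steady daytime traffic, quiet late evening.
    traffic = [50] * 24
    traffic[22], traffic[23], traffic[0] = 5, 4, 30
    print("Quietest 2-hour window starts at hour", quietest_window(traffic, 2))
```

Using a window longer than one hour is often more realistic, since a launch plus verification and a possible rollback rarely fits into a single quiet hour.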



O&M and development are two quite different career directions. Still, for someone working in O&M who has a development foundation, switching to development is not out of the question.

O&M engineers are responsible for the operation and maintenance of a specific product line. They also need development skills and a deep involvement in the business, so that they understand its pain points and problems better than anyone. They should develop and optimize platforms, tools, and methods for the product's business needs, and they should get to know a variety of excellent system architectures and be able to compare them. Ultimately, how firm a grip an O&M engineer has on the business determines the role that engineer plays in its development.