- Ensure the stability of the company's exchange business, work with the R&D team to respond quickly to incidents, and establish mechanisms that improve incident-handling efficiency.
- Help build operations tools and platforms, identify system risks (including databases and middleware), and drive operations automation.
- Drive system optimization through continuous, comprehensive analysis of operational data (historical incidents, production issues, resource utilization, etc.).
- Triage and handle alerts, ensuring each one is properly resolved.
- Define operations standards to raise the overall level of operations work.
- Build budget management, cost measurement, cost monitoring, and cost optimization systems, provide solutions for cost governance, and promote their implementation.
Requirements:
- Bachelor's degree or above in computer science or a related field, with 3+ years of experience in SRE, operations, or cloud-native roles.
- Solid grounding in computer software fundamentals, and proficiency in day-to-day operation and troubleshooting of Linux systems.
- Proficient in the principles and operation of core distributed-system components, such as MySQL (master-slave replication, read-write separation), Redis (clustering, persistence), and Kafka (message delivery reliability).
- Familiar with one or more scripting languages, such as Python/Shell/Go.
- Possess systematic problem-solving skills, good communication skills, and a sense of ownership.
- Experience with related computing/distributed/big data systems is preferred (Nginx/Kubernetes/Docker, etc.).
 
Nice-To-Have:
- Experience operating, maintaining, and optimizing blockchain nodes.
- Experience operating and maintaining exchange or DeFi projects.
*Only shortlisted candidates will be contacted.