Pharos Network is looking for a Site Reliability Engineer (SRE) to ensure the stability, reliability, and performance of our blockchain network. The SRE will play a critical role in system availability, monitoring, and infrastructure process optimization for a high-performance blockchain environment.
Build and maintain robust monitoring and alerting systems (Prometheus, Grafana), ensuring high system availability and performance.
Develop and implement automation for infrastructure management, CI/CD pipelines, and incident response.
Monitor, troubleshoot, and maintain production environments to ensure maximum uptime.
Collaborate with development teams to improve system design, infrastructure, and capacity management.
Conduct regular stress tests and optimize performance to ensure the security and scalability of the network.
5+ years of experience in Site Reliability Engineering, with a focus on highly available, scalable systems.
Proficiency in Linux-based systems, networking fundamentals, and distributed systems.
Programming experience with Python, Go, or shell scripting.
Experience with CI/CD tools (Jenkins, GitLab), version control (Git), and monitoring frameworks (Prometheus, Grafana).
Excellent troubleshooting and problem-solving skills, with experience in preventing and resolving infrastructure bottlenecks.
Experience operating blockchain nodes and services is a plus.
Collaborate on cutting-edge blockchain technologies, solving real-world scalability and reliability challenges.
Fully remote, flexible working environment, with competitive compensation and benefits.
Opportunities for career development in an innovative, fast-growing blockchain ecosystem.