Manager of Site Reliability for Enterprise Solutions at Arcadia Code

Arcadia Code is seeking an experienced Site Reliability Manager to lead the reliability and performance of our enterprise-scale solutions. This role is pivotal to our mission of delivering seamless, high-quality software and fostering a culture of collaboration across teams.

About Us

Located in the heart of Phoenix, Arcadia Code is a leading provider of innovative technology solutions. Our mission is to empower businesses by leveraging cutting-edge technology to enhance efficiency and productivity. We pride ourselves on our inclusive culture, where collaboration and innovation thrive.

Role Overview

As the Site Reliability Manager, you will be responsible for:

  • Leading the Site Reliability Engineering (SRE) team to maintain high service uptime and performance.
  • Facilitating cross-team collaboration to ensure seamless integration of operations, development, and testing.
  • Implementing best practices for incident management, problem resolution, and system monitoring.
  • Driving continuous improvement initiatives focusing on automation and scalability.
  • Acting as a liaison between technical and non-technical stakeholders to ensure alignment of objectives.

Key Responsibilities

1. Operational Excellence

Develop and oversee operational metrics to ensure high service availability. You will be expected to:

  • Monitor service health and performance through advanced monitoring tools.
  • Define service level indicators (SLIs), service level objectives (SLOs), and service level agreements (SLAs).
  • Implement effective incident response strategies to minimize downtime.
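
For readers less familiar with how SLIs and SLOs relate, the following is a minimal illustrative sketch, not a description of Arcadia Code's actual tooling. It assumes a request-based availability SLI and a 30-day window; the function names and the example figures (a 99.9% target, one million requests) are hypothetical.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: fraction of requests served successfully."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return good_requests / total_requests


def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Downtime (in minutes) that an availability SLO allows over the window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(sli: float, slo_target: float) -> float:
    """Fraction of the error budget still unspent; negative means overspent."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed_failure = 1.0 - slo_target
    actual_failure = 1.0 - sli
    return 1.0 - actual_failure / allowed_failure


if __name__ == "__main__":
    slo = 0.999  # hypothetical 99.9% availability objective
    sli = availability_sli(good_requests=999_500, total_requests=1_000_000)
    print(f"SLI: {sli:.4%}")
    print(f"Allowed downtime per 30 days: {error_budget_minutes(slo):.1f} min")
    print(f"Error budget remaining: {budget_remaining(sli, slo):.0%}")
```

In this sketch the SLI is what is measured, the SLO is the internal target it is compared against, and an SLA would be the external commitment, typically set looser than the SLO.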

2. Team Leadership

Lead, mentor, and develop a high-performing SRE team. Your responsibilities will include:

  • Recruiting top talent to build a diverse team.
  • Empowering team members through coaching and professional development opportunities.
  • Fostering an environment of continuous learning and innovation.

3. Cross-team Collaboration

Enhance cooperation between teams by:

  • Facilitating regular meetings and workshops to cultivate open communication.
  • Promoting shared ownership of system reliability across development and operations.
  • Encouraging feedback loops to improve processes and workflows.

Qualifications and Skills

The ideal candidate will possess the following qualifications:

  • Bachelor’s degree in Computer Science, Engineering, or a related field.
  • 5+ years of experience in a Site Reliability Engineering or similar role within an enterprise environment.
  • Proven expertise in cloud technologies (AWS, Azure, Google Cloud).
  • Strong understanding of containerization and orchestration tools (Docker, Kubernetes).
  • Excellent problem-solving skills and ability to work under pressure.

Why Arcadia Code?

At Arcadia Code, we believe in rewarding hard work and innovation. We offer:

  • Competitive salary and performance bonuses.
  • Comprehensive health benefits and wellness programs.
  • Flexible work arrangements, including remote work options.
  • Opportunities for career advancement within a rapidly growing organization.

Join Us!

If you’re a passionate leader who thrives on collaboration and innovation, we encourage you to apply for the Site Reliability Manager position at Arcadia Code. Help us drive the future of enterprise solutions while shaping a culture of excellence and reliability.

Apply today and become a crucial part of our journey!
