Site Reliability Engineer, eduCLIMBER - remote, USA

Illuminate Education - Remote - Full time

Position Overview

As a Site Reliability Engineer on an engineering team, you will design and develop automation for the provisioning, configuration, operation, monitoring, and maintenance of systems within the Amazon Web Services (AWS) cloud infrastructure. The toolset you and your team will use covers CI/CD and automated pipelines, security tools, network topology, policy management, monitoring and telemetry, container orchestration/scheduling, and machine image creation and configuration.

For this role, we are looking for engineers with operations experience, including network provisioning, systems administration, machine/container configuration and security, and identity management. You will work on a team with other engineers building new applications and modernizing legacy systems. You will also help build a generative culture of continuous learning, continuous delivery, and continuous improvement.

Key Responsibilities

  • Train, advise, and assist fellow team members on DevOps best practices.
  • Manage, diagnose, and resolve system incidents and internal technical escalations within the cloud infrastructure.
  • Identify and implement appropriate use of AWS operational best practices.
  • Engage with Software Engineers, Product Managers, and Quality Assurance Engineers to diagnose problems and help architect long-term solutions.
  • Collaborate with Product Managers and Engineers to identify, prioritize, and develop enhancements to the platform.
  • Perform daily system monitoring to verify the integrity and availability of the services we provide.
  • Perform ongoing performance tuning, upgrades, and resource optimization.
  • Assist team members in developing, testing and documenting disaster recovery plans.
  • Build out production-like dev, staging, and test environments.
  • Audit and update dependencies, and help modernize development environments.

Required Experience & Qualifications

  • BS degree in an IT discipline.
  • Proficiency with AWS technologies including VPC/Networking, EC2, RDS, S3, CloudFormation, and CloudFront.
  • Systems administration experience including securing, troubleshooting, administering, and monitoring Linux systems.
  • Experience developing, implementing, and continually improving system and network monitoring and alerting capabilities and procedures.
  • Experience with CI/CD tools and automation.
  • Solid knowledge of networking, storage, load balancing, and virtualization.

Desired Experience & Qualifications

  • Experience with Terraform, Ansible, Packer and RunDeck.
  • Experience with container orchestration using Kubernetes, Nomad, Fargate, or similar.
  • Understanding of microservice architecture and communication patterns.
  • Proficiency with Nginx and Apache servers or similar.
  • AWS or Microsoft Cloud Certification.
  • Previous experience as a NOC Engineer monitoring and troubleshooting servers.
  • Familiarity with GCP technologies is a plus.

Experience Level

Mid Level

Illuminate Education

Illuminate Education partners with K-12 educators to equip them with data to serve the whole child and reach new levels of student performance.