Senior Site Reliability Engineer
Site Reliability Engineer (Remote) Needed - Leading Education SaaS Platform!
This Jobot Job is hosted by: Stanton Sikorski
Are you a fit? Easy Apply now by clicking the "Apply Now" button and sending us your resume.
Salary: $120,000 - $190,000 per year

A bit about us:
Founded over twenty years ago, we specialize in building a SaaS education platform and are a leading curriculum solutions provider for K-12 students. Our comprehensive, dynamic, and progressive learning technology helps students develop as learners and thinkers. Our platform delivers research-proven, high-quality core and supplemental solutions in math, world languages, ELA and literacy, computer science, and biotech, as well as best-in-class K-12 professional learning services.
Here, we strive to create an environment where people want to work: one where the larger team comes first, where trying new things (and sometimes failing) is encouraged, and where we pursue our mission relentlessly. We are a major disruptive force in the digital curriculum market, combining world-class research, differentiated technology, and best-in-class content with a world-class, mission-oriented team. Are you passionate about shaping the future of learning?

Why join us?
- Competitive base salary and overall compensation package
- 401(k) with generous company match
- Full benefits: Medical, Dental, Vision, Life, Disability
- Generous PTO, vacation, sick, and holiday schedule

Responsibilities:
- Enhance and improve our SaaS solutions within AWS.
- Define, operate, and refine processes for continuous integration and deployment of application software.
- Develop CI/CD pipelines and IaaS for cloud-native applications.
- Configure services.
- Manage and interpret application data and logs to assist customer support teams with escalations to development.
- Design and implement mechanisms for proactive monitoring, alerting, trend analysis, and self-healing.
- Identify opportunities to improve DevOps processes and collaborate with the team for solutions.
- Help define, measure, and report on SLIs and SLOs, drive the organization to meet its SLOs, and support the company's ability to provide its customers with SLAs.
- Participate in post-incident reviews to better expose system or process gaps.
- Document procedures and site infrastructure.

Skills and technologies:
- Site Reliability, SRE
- AWS, SaaS
- IaC (Infrastructure as Code), CloudFormation (currently used), Chef/OpsWorks, Elastic Beanstalk
- CI/CD, Pipelines, Jenkins, Bamboo, Git, Jira
- Container Orchestration, Fargate, ECS, Kubernetes
- Python, Bash
- System logs/metrics, Splunk
- Application Performance Management Tools, New Relic, Datadog