Site Reliability Engineer
Site Reliability Engineer - Growing Company!
This Jobot Job is hosted by: Mary Lee
Are you a fit? Easy Apply now by clicking the "Apply Now" button and sending us your resume.
Salary: $100,000 - $170,000 per year
A bit about us:
We are the leader in people-first search advertising, and we are looking for an SRE to implement observability for legacy infrastructure within Kubernetes (EKS Fargate) in AWS. You will have the ability to create our infrastructure workflow and coding standards, as well as our observability standards, across all of our products.
Why join us?
- Huge Room for Growth
- Great Pay and Benefits
- Work/life Balance
- 100% Remote
Responsibilities:
- Support on-prem infrastructure
- Work closely with software engineers to implement observability within on-prem and cloud-based environments
- Scale our infrastructure using an infrastructure-as-code mindset
- Join the on-call rotation to support infrastructure
- Solve complex infrastructure challenges related to low-latency large-scale distributed systems
- Create and maintain documentation for runbooks, implementations and infrastructure
- Create automation tools for the infrastructure team as well as software engineers
- Work with the infrastructure team to migrate on-prem infrastructure to a cloud solution
Requirements:
- BS in Engineering/Computer Science or relevant work experience in the field
- 4+ years of experience as a Systems or DevOps Engineer
- 2+ years of experience with container orchestration
- Strong knowledge of Unix-based systems
- Strong knowledge of Terraform
- Scripting experience with Bash, Python, or a similar language
- Experience with modern observability tools and the implementation thereof
- Excellent documentation, communication and troubleshooting skills
- A zeal for coding excellence
- Ability to initiate and complete projects with minimal guidance
- Ability to collaborate with others who may not share your technical opinions
- Willingness to learn and support old architectures
Nice to have:
- Experience with Kubernetes
- Experience with EKS Fargate
- Experience with Datadog
- Experience migrating on-prem infrastructure to the cloud
- Experience with Puppet/Ansible
- Experience with routers/switches
- Experience with monitoring/log collection tools such as Nagios, Prometheus, Grafana, Graylog, Logstash, Kibana, and Filebeat