Senior Site Reliability Engineer
Immediate need for a talented Senior Site Reliability Engineer. This is a 12+ month contract opportunity with long-term potential, located in Atlanta, GA (Hybrid). Please review the job description below and contact me ASAP if you are interested.
Job ID: 23-18782
Pay Range: $75 - $85/hour. Employee benefits include, but are not limited to, health insurance (medical, dental, vision), 401(k) plan, and paid sick leave (depending on work location).
Key Responsibilities:
- Execute the Incident, Change Management, and Problem Management processes.
- Build and support a reliable application suite for the environment to meet the development and maintenance requirements of systems/platforms.
- Provide consultation and direct technical support in life cycle planning, problem management, integration, and systems programming.
- Ensure platform performance and availability meet enterprise objectives through monitoring, timely service restoration, and tuning.
- Continually improve and implement automation of application tasks.
- Provide technical support for systems/platforms according to application SLAs.
- Design and develop resiliency in the application code, troubleshoot incidents, engage with squads to address failure patterns, and participate in incident management.
- Strong troubleshooting ability required.
- Lead calls or contribute in a logical fashion.
- Focus on resolving issues before they become incidents.
- Identify and articulate severity of impacts using provided monitoring tools and escalate as needed.
- Understand the architecture and design of applications and identify or narrow the focus for an incident based on symptoms.
- Perform root cause analysis to quickly recover from service interruptions, and to prevent recurring problems.
- Monitor, manage, and tune platforms to ensure expected availability and performance levels are achieved.
- Identify gaps in monitoring or documentation and reach out to the appropriate teams to fill those gaps.
- Implement changes to platforms with minimal impact to the business by following enterprise standards and procedures.
- Design and document enterprise standards and procedures.
- Bachelor's degree in Computer Science, Information Technology, or a related field is preferred.
- Experience with and exposure to VMware VDI implementations is a huge plus.
- Experience with Dynatrace APM and synthetic monitoring.
- Experience with airline applications and infrastructure technology is a plus.
- Associate's degree or industry certification in an applicable IT field, in addition to four years of applicable experience in the design/administration/support of one or more platforms; or Bachelor's degree in an IT field, in addition to two years of applicable experience in the design/administration/support of one or more platforms; or five years of equivalent in-depth experience in the above related areas.
- 5 or more years of experience as a Systems Engineer or Site Reliability Engineer.
- 2 or more years of experience with ops automation using a scripting language or automation tool such as Python or Ansible.
- Site Reliability Engineering: Knowledge of the theories and methodologies of reliability engineering; ability to design, develop and support various tools, services and applications to maintain a reliable site environment.
- Performance Measurement and Tuning: Knowledge of system performance, testing and programming; ability to monitor, measure, and optimize system performance and network communication.
- CI/CD Pipeline: Knowledge of concepts, values and tools applied in building Continuous Integration (CI) and Continuous Delivery/Continuous Deployment (CD) pipelines; ability to design, build, implement and maintain CI/CD pipelines to automate the software delivery process.
- Software Release Management: Knowledge of strategies, practices and tools for managing versions and distribution of software products and enhancements; ability to evaluate and improve release management practices and tools.
- Application Maintenance: Knowledge of production applications; ability to monitor application functions and resolve issues to maintain optimal conditions for system applications.
- Software Engineering: Knowledge of software engineering; ability to deliver new or enhanced software products.
- Agile Development: Knowledge of agile methodologies and the agile development lifecycle; ability to utilize formal agile methodologies, disciplines, practices and techniques for the delivery of new and enhanced applications.
- Embraces diverse people, thinking and styles.