Job Title: Site Reliability Engineer
Location: Austin, TX / Atlanta, GA / Dallas, TX
Duration: 12+ Months
Our client is seeking a Site Reliability Engineer to help improve the reliability and automation of its hybrid cloud infrastructure. Success in this role calls for a strong sense of ownership, self-drive, creativity, and innovation, along with technical skills, especially experience in cloud, automation, monitoring, and configuration management technologies.
The SRE will configure, tune, and troubleshoot multi-tiered systems to achieve optimal application performance, stability, and availability. The SRE will work closely with the systems engineers, network engineers, database administrators, monitoring administrators, and information security teams. For this position, strict application security and high availability requirements must be balanced to achieve optimal solutions.
- Create functional capabilities for monitoring, alerting, and logging across the full stack, including business processes, infrastructure, applications, and interfaces. Generate system metrics and reports to detect performance bottlenecks and support capacity planning.
- Assist in the development and management of the Infrastructure as Code (IaC) processes. Define and implement procedures to track configuration changes.
- Aid in the creation of automated troubleshooting capabilities across multiple cloud providers.
- Maintain cross-platform tooling across multiple cloud providers.
- Administer system services in data centers and cloud. Validate business requirements and determine architecture design based on security policies.
- Perform business analysis of infrastructure services, including but not limited to operational performance, operational problems, and the design and prototyping of current and emerging technologies.
- Production Support Background
- Practical knowledge of a scripting language
- Strong monitoring background with log analytics, Synthetic Transaction Monitoring, and Application Performance Monitoring
- Good understanding of DevOps tools with CI/CD
- Own root cause analysis for complex infrastructure and application issues. Identify and drive strategies to prevent a recurrence.
- Bachelor's / 4-year degree.
- AWS or Azure Certifications (MCSA: Cloud Platform, MCSE: Cloud Platform and Infrastructure, AWS Certified Solutions Architect, AWS Certified Developer).
Job Category: Information Technology