SRE 
Visa: USC and GC only
Location: -
Candidates must work hybrid (4 days on-site) and can sit in any one of the five locations below.
Wilmington, DE
Manassas, VA
Dallas, TX
Scottsdale, AZ
Urbandale, IA
Duration: 6 Months CTH
Interview: 2 Steps
Need an updated LinkedIn profile with a picture. Two managerial references with email, phone number, and LinkedIn are required, and they should not be older than 2021.
Description
This customer is currently establishing an SRE practice/platform. The goal is to build solid observability for the platform and evaluate the tool stack they will look to implement. They have a code team in place now and are looking to augment that team with a staff engineer. They need someone with industry knowledge in SRE: working with vendors, hands-on technical experience, providing technical guidance to more junior team members, and helping lead migrations from one tool to another. As of now they do not need an SE or SA, as they have been working on strategy regarding this. It's dashboarding and setting up the tools that they need help with.
Job Title
Staff Engineer, SRE
Top Skills Details
1. 7+ years of experience within SRE. The hiring manager is more focused on SRE practice (being able to bring production knowledge to meet SLOs) and less focused on the DevOps piece listed in the job description (more of a nice to have).
2. Solid scripting knowledge and experience in any of the following: Python, Go, Bash, JavaScript, and Shell (does not need to be proficient in all of these, just very knowledgeable in at least one of them).
3. Main Tech Stack: Dynatrace, Datadog, ELK. Ansible background is a nice to have, as they are beginning to utilize it.
4. SRE certifications are a requirement, as well as a bachelor's degree (see education requirement in job description).
Key Functions/Duties of Position:
Define and track reliability and observability OKRs. This includes defining and tracking Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
Implement robust monitoring and alerting systems to proactively monitor health, identify potential issues, analyze system performance, and facilitate quick response to incidents.
Implement AIOps functionality to enable auto-response, self-healing, and anomaly trend analysis.
Drive the development and implementation of automation solutions to remove "toil", streamline processes, reduce manual interventions, and enhance the overall efficiency of the product engineering and SRE teams.
Identify and address performance bottlenecks in applications and infrastructure to improve efficiency and user experience.
Work closely with incident management to quickly address and resolve system outages or performance issues to minimize downtime and impact on users.
Collaborate actively with development and operations teams to implement observability and resiliency requirements in order to ensure smooth deployment and operation of software systems.
Lead the coordination with product, development, infrastructure, and architecture teams to conduct capacity planning, ensuring that systems can handle current and future demand; anticipate growth and scalability requirements.
Improve reliability by identifying and addressing gaps in our architecture, services, and tooling.
Modernize the disaster recovery program for both on-premises and cloud-based Berkley solutions.
Provide technical leadership and mentorship to other engineers, fostering a culture of learning and continuous improvement.
Education Requirement
Bachelor's degree in Computer Science, Information Technology, or a related field (or a combination of education and equivalent experience).
Qualifications:
7+ years of IT experience working with infrastructure support and development.
7+ years of experience in Site Reliability Engineering and DevOps.
Proficient in scripting languages like Python, Go, Bash, and/or JavaScript, and experience with shell scripting.
Strong expertise in observability, monitoring, alerting, and logging tools (Dynatrace, Datadog, ELK Stack).
Practical expertise in creating and implementing logging and monitoring architectures through hands-on experience.
Expertise in designing and implementing on-premises, cloud, and hybrid resiliency solutions (HA, AA, AP), disaster recovery, and business continuity planning.
Deep understanding of cloud computing principles, including IaaS, PaaS, and SaaS models.
Experience with Kubernetes and other auto-scaling tools and technologies, including proficiency with tools such as Helm and Prometheus for deployment and monitoring.
Proficient in leveraging GitOps with containerization technologies and CI/CD pipelines.
Experience developing and implementing automated system reliability and performance solutions, including infrastructure automation and configuration management tools (GitHub Actions, Terraform, Ansible, Chef, Puppet).
Solid understanding of security best practices in on-premises, cloud, and hybrid environments, along with network technologies.
Understanding of industry standard security frameworks and ability to interpret them for Berkley environments.
Ability to drive critical issues and system design discussions and moderate between multiple technology teams.
Demonstrated leadership experience, including mentoring junior engineers and leading technical projects.
Excellent problem-solving skills and the ability to troubleshoot complex issues in a distributed hybrid environment.
Strong communication skills to collaborate effectively with cross-functional teams and convey technical concepts to non-technical stakeholders.
Behavioral Core Competencies:
Strategic
Influential
Organizational Navigation
Balanced Approach
Commandership Skills
Composure