Position: Mid-Senior level

Job type: Full-time

Job content

Job Description:

The DataRobot AI Cloud Platform provides mission-critical services for our customers. We are expanding our Service Reliability Engineering team as we build the next generation of DataRobot. We are looking for individuals who are excited to work on high-impact cloud platforms, thrive on troubleshooting and solving complex problems, and enjoy enhancing reliability through automation and tooling.

Service Reliability Engineers (SRE) troubleshoot, debug, evaluate and resolve alarms, perform systems management, perform software deployments and migrations, and automate routine operational tasks with the end result being a more stable, reliable, and available environment for DataRobot customers. SREs play a key role in designing and modifying DataRobot tools and practices to provide observability and seamless scaling while proactively preventing failures.

DataRobot is looking for an experienced and energetic Lead of Site Reliability Operations to help lead our world-class SRO teams. Our SaaS platform powers the best product available in the market to enable the AI-driven enterprise. We’re seeking a candidate who cares deeply about the customer experience and will take personal ownership of the quality of our SaaS experience. You’ll be equally comfortable talking with engineers, executives, and customers.

As a member of the DataRobot Service Reliability team, you will be surrounded by individuals representing some of the brightest and most innovative minds in the industry. You will be a part of an organization that prides itself on providing training, empowerment, and career progression. Bring your passion for collaboration and automating everything and help enable the next generation of AI.

Responsibilities:

  • Lead as the single point of accountability to a critical client base, internal stakeholders, and the business for our SaaS environments
  • Drive programs to improve efficiency, increase availability, and improve overall quality of service delivery
  • Consistently advocate for our customers and influence the product roadmap on their behalf
  • Exhibit extreme ownership of incident response and communication
  • Grow and manage world-class SRE teams
  • Set operational objectives around availability and reliability and ensure they are met
  • Define and manage SLAs with customers and internal stakeholders
  • Develop strategy for service continuity and disaster recovery
  • Develop and enforce a SaaS platform governance framework to ensure compliance with SLAs, security, confidentiality, privacy, and regulatory requirements
  • Manage a budget for SaaS technical operations
  • Be strategic, forward-thinking, and client-focused
  • Maintain a global enterprise view of systems, people, and processes

Requirements:

  • 10+ years of experience leading globally distributed teams responsible for 24x7x365 mission-critical infrastructure and automation
  • Ability to communicate clearly to multiple audiences, including engineers, executive leadership, and customers.
  • Experience with regulatory compliance procedures, including but not limited to SOC, GDPR, and HIPAA. This extends to incident response with sensitive information.

All U.S. DataRobot employees must be fully vaccinated against COVID-19. If there is a medical, religious, or other legally protected reason that prevents you from receiving an available COVID-19 vaccination, and you are selected as a candidate for consideration, we have a process in place to evaluate requests for accommodation.

DataRobot is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. DataRobot is committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. Please see the United States Department of Labor’s EEO poster and EEO poster supplement for additional information.

All applicant data submitted is handled in accordance with our Applicant Privacy Policy.

Deadline: 09-06-2024
