Site Reliability Engineer - Apple Services Engineering

Apple • Dublin, D, IE

Summary

Posted: 19 Dec 2024

Role Number: 200584223

People at Apple don’t just build products - they craft the kind of experiences that have revolutionized entire industries. The diverse collection of our people and their ideas inspires innovation in everything we do. Imagine what you could do here! Join Apple and help us leave the world better than we found it. The Apple Services Engineering (ASE) System Infrastructure team builds and provides systems and infrastructure that fuel Apple’s services (such as iTunes, iCloud, Siri, and Maps). We are the foundation on which Apple’s software developers build the products that our customers love. We are looking for passionate and talented Site Reliability Engineers to continue our focus on providing our customers the highest quality Apple Services experience. Our services have to scale globally, stay highly available, and “just work.” If you love designing, engineering, and running systems and infrastructure that will help millions of customers, then this is the place for you!

Description

Apple Services Engineering (ASE) Infrastructure is the foundation upon which Apple services run. Working collaboratively with key stakeholders in storage, compute, traffic, and observability, the Systems Infrastructure team focuses on system provisioning, configuration management, deployment, name resolution, packaging, and access/authorization. Operating at our scale, across multiple geographically dispersed data centers and serving hundreds of millions of users, presents unique challenges. As an SRE in ASE Infra, you'll need to solve these problems using data, teamwork, and your own expertise. You'll own the full infrastructure stack and have both broad and deep responsibility for its health. We run Linux and use a mix of open source, vendor licensed, and internally developed tools to manage the service. You'll learn these tools and have opportunities to improve them. We think critically and strive to balance the best solution with the need to resolve each unique engineering challenge. Good ideas are heard and results are rewarded.

Minimum Qualifications

In-depth understanding of the Linux operating system, standard software, services, and networking protocols.
Strong sense of ownership and integrity demonstrated through clear communication and collaboration.
The ability to design, author, and deploy code in languages like Python and Go.
Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or equivalent.
5+ years of software development or production operations experience in a large-scale environment.

Preferred Qualifications

Extensive hands-on experience automating configuration of a diverse Linux fleet with tools like Puppet, including management of core OS services, system partitioning, networking, package management, and system observability.
Extensive experience managing and scaling resilient distributed systems on bare metal / private clouds.
Capability and drive to troubleshoot complex system issues and identify their root causes.
Strong SRE mindset and experience with managed deployment methodologies, error budgets, scale testing, and disaster recovery.
RHEL9 certification a plus.
Familiarity with microservices architecture and container orchestration with Kubernetes a plus.
