Senior Software Engineer - SRE
Company: General Motors
Location: Austin
Posted on: June 1, 2025
Job Description:
Key Responsibilities:
- Automation and Reliability Improvements: Develop tools and
software to automate operational processes, improve system
reliability, and reduce manual intervention.
- Observability and Monitoring: Lead, implement, and improve
monitoring and observability frameworks, enabling proactive
detection and resolution of incidents.
- Incident Response: Participate in an on-call rotation to
diagnose, troubleshoot, and mitigate production incidents, ensuring
minimal downtime and swift resolution.
- Collaboration with Development Teams: Work alongside developers
to ensure the quality, scalability, and reliability of our
services. Practice shared ownership of services in production,
fostering a "You build it, you run it" culture.
- Service Level Management: Manage Service Level Indicators
(SLIs), Service Level Objectives (SLOs), and Service Level
Agreements (SLAs) to set clear reliability expectations.
- Engineering for Reliability: Apply a strong understanding of
common application reliability patterns, backed by hands-on
experience implementing them.
- Failure Analysis and Post-Incident Reviews: Conduct deep-dive
analyses of incidents and collaborate on post-incident reviews to
derive learnings and prevent recurrence. Champion a culture of
continuous improvement.
- Cost Efficiency: Evaluate system performance and advocate for
optimizations that reduce infrastructure costs while maintaining
service reliability.
Skills and Qualifications:
- Programming Skills: Proficiency in at least one programming
language (e.g., Python, Go, Java) and familiarity with multiple
language ecosystems.
- Systems Knowledge: Solid understanding of operating systems,
networking, distributed systems, databases, and storage
architectures.
- Strong Understanding of System Fundamentals: Deep understanding
of how code runs on underlying hardware, including operating
systems, algorithms, and data structures. Ability to optimize or
troubleshoot code by understanding its execution and the impact on
system resources.
- Incident Management: Experience handling production incidents,
including root cause analysis, mitigation, and working through
complex system failures.
- Communication and Collaboration: Strong communication skills,
with an ability to explain technical concepts to both engineering
and business stakeholders. Commitment to collaborative
problem-solving and shared ownership of services.
- Automation Focus: Proven experience in automating manual
processes, building deployment pipelines, or managing configuration
systems.