Systems Development Engineer, Region Services
Melbourne, Victoria, AUS
Full-time
Systems, Quality, & Security Engineering
Job Description
Applicants must be Australian citizens and hold or be eligible to obtain an Australian Government Security Clearance with the ability to successfully complete an Organisational Suitability Assessment. For more information about security clearances, please visit https://www.agsva.gov.au/.
The AWS Region Services team combines AWS global cloud leadership with Australian security expertise to deliver highly secure, scalable environments for sensitive workloads. We’re creating innovative ways to use cloud computing, artificial intelligence, and machine learning while maintaining the highest standards of security and operational excellence.
The Engineering organisation within Region Services is structured across core capability pillars: Compute & Machine Learning, Security Identity & Compliance, Storage & Databases, and a growing capability domain. Collectively these pillars encompass a team with varied technical skillsets, including Engineers, Technical Program Managers, and Subject Matter Experts, organised into focused sub-teams.
This is an opportunity to make a lasting impact on Australia’s digital future. You’ll work with AWS services, implement innovative solutions, and help customers succeed in their most important missions. We’re committed to helping our builders grow through continuous learning, mentoring, and collaboration with industry experts. Are you ready to build the future of secure cloud computing in Australia?
Key job responsibilities
- Define or refine system requirements, and participate in the development and delivery of operability-related features such as system health monitoring, diagnostics, repair, and other self-healing automation
- Develop or extend application and system management tools and processes that reduce manual effort and increase overall efficiency
- Adapt and improve operations management systems and processes to accommodate rapid and increasing growth in systems and traffic
- Participate in the design and execution of production acceptance tests and new hardware evaluations
- Monitor the health of the fleet, automating system health, maintenance tasks, and reporting systems as needed
- Participate in on-call rotations to resolve incidents that occur out of hours
A day in the life
Your morning starts with a review of the self-healing automation you deployed this week. The metrics tell the story: fourteen automated remediations overnight, zero pages to the on-call engineer. You refine a detection threshold that's been slightly too sensitive, commit the change, and watch the next cycle run cleanly.
By mid-morning, you're in a requirements session with the team. A new service is approaching production readiness, and you're defining the operability features it needs: what health signals to emit, what diagnostic paths to expose, what self-repair mechanisms to implement. You push for completeness — because the time to think about operability is now, not after the first incident.
After lunch, you're deep in code. An existing management tool needs to handle three times its current throughput as the fleet scales into its next growth phase. You redesign the data ingestion layer, implement parallel processing, and write load tests that validate performance at projected scale. The refactored system handles the load with headroom to spare.
Later, you join a hardware evaluation session. New server variants have arrived, and you're helping design acceptance tests that will stress them under realistic production conditions. Your test harness is automated and repeatable — results flow directly into a report that the team can act on immediately.
Before wrapping up, you investigate a pattern you noticed in fleet health data. A subset of hosts is showing elevated memory pressure during a specific maintenance window. You trace the cause, design an automated fix that adjusts the maintenance scheduling, and deploy it. Tomorrow morning, the pattern will be gone — and no one will have needed to intervene.
About the team
Region Services provides the highest caliber Operational Solutions and Cleared Support for services within our Regions. We provide 'hands on keyboard' support to our service teams by deploying changes into these isolated regions, monitoring the results, and reporting any issues that are observed.
Diverse Experiences
Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
Why AWS
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.
Work/Life Balance
We value work-life harmony. Success at work should never require sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.
Inclusive Team Culture
AWS values curiosity and connection. Our employee-led and company-sponsored affinity groups promote inclusion and empower our people to take pride in what makes us unique. Our inclusion events foster stronger, more collaborative teams. Our continual innovation is fueled by the bold ideas, fresh perspectives, and passionate voices our teams bring to everything we do.
Mentorship and Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.