Cloud Systems Reliability Engineer

Clearance Level
Systems Engineering
Washington, District of Columbia
Remote, Based in District of Columbia

REQ#: RQ92888

Travel Required: Less than 10%
Public Trust: BI Full 6C (T4)
Requisition Type: Regular

We are GDIT, one of the largest IT and mission services providers to the government. We offer our customers the power of choice through a vast cloud ecosystem. 

GDIT is your place. You make it your own by bringing your passion for accelerating the cloud. By owning your opportunity at GDIT, you are helping to ensure our mission is never interrupted. Our work depends on a Systems Reliability Engineer joining our team to support federal customer activities in Washington, DC.

At GDIT, people are our differentiator. As a Systems Reliability Engineer, you will be trusted to manage, support, and maintain a reliable environment for the site, ensuring the stability and security of the systems and platforms that run in it.

 In this role, you will:

  • Develop or contribute to solutions to a variety of problems of moderate scope and complexity.
  • Oversee the development of more robust systems by building a resilient infrastructure.
  • Build in redundancy, implement monitoring tools, and automate wherever possible, reducing toil by scripting routine tasks and automating self-repair.
  • Develop and manage Service Level Objectives (SLOs) for production systems.
  • Manage incident response teams (i.e., triaging, prioritizing, and troubleshooting), detail root cause analyses (RCAs) and corrective actions, and lead postmortems.
  • Develop operational playbooks/runbooks and disaster recovery tests.
  • Review system design and architecture documentation and prepare materials addressing security controls.
  • Work across product development teams to understand program goals.

 What you’ll need:

  • Experience establishing or working with help-desk-style ticket tracking and similar support processes
  • Extensive experience with monitoring tools (e.g., Sysdig, New Relic, Twistlock, Splunk) and the expertise to automate alerting functions
  • 3+ years of experience with Linux and Windows OS
  • 2+ years of experience with DevOps/DevSecOps through CI/CD pipeline tools such as Jenkins
  • 2+ years of experience with Jenkins and GitHub / Git / Gitflow
  • 1+ year of experience with Docker and containers
  • 2+ years of experience with Agile development
  • 2+ years of experience with infrastructure as code: Terraform, CloudFormation, Ansible, Packer
  • 2+ years of experience with AWS services: EC2, RDS, Secrets Manager, IAM, etc.
  • 2+ years of experience with troubleshooting applications and test automation (e.g., Selenium)
  • 1+ year of experience analyzing observability metrics, logs, and traces
  • BA or BS degree
  • AWS Certified DevOps Engineer - Professional


  • 401K with company match
  • Diverse, highly collaborative teams
  • Challenging work that makes a real impact on the world around you
  • Internal mobility team dedicated to helping you own your career

We are GDIT. The people supporting some of the most complex government, defense, and intelligence projects across the country. We deliver. Bringing the expertise needed to understand and advance critical missions. We transform. Shifting the ways clients invest in, integrate, and innovate technology solutions. We ensure today is safe and tomorrow is smarter. We are there. On the ground, beside our clients, in the lab, and everywhere in between. Offering the technology transformations, strategy, and mission services needed to get the job done.

GDIT is an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, veteran status, or any other protected class.