Important Information
- Location: Colombia
- Work Mode: Hybrid
Responsibilities and Duties
- Lead major incident coordination across cloud (Azure) and on-premises environments, ensuring rapid triage, clear escalation paths, and timely service restoration
- Own service performance and reliability metrics (MTTR, SLA compliance, incident recurrence, stability trends), driving visibility and accountability across teams
- Facilitate structured post-incident reviews, ensuring root cause analysis, remediation tracking, and continuous improvement of operational processes
- Provide clear operational reporting and governance, delivering transparent service health insights to both technical teams and business stakeholders
Qualifications and Skills
- 5+ years of experience in Service Operations, Incident Management, or IT Service Delivery roles
- Proven experience leading major incident response efforts across multiple technical teams
- Strong understanding of ITIL practices, including Incident, Problem, and Change Management
- Experience working in hybrid environments combining Microsoft Azure cloud services and on-premises infrastructure
- Strong analytical skills with experience defining and tracking KPIs such as MTTR, SLA attainment, incident trends, and recurrence rates
- Experience conducting post-incident reviews and driving root cause analysis (RCA) with actionable remediation plans
- Ability to manage ticket lifecycles, prioritization frameworks, and service governance processes
- Strong stakeholder communication skills, with the ability to provide structured updates during high-pressure incidents
- Experience producing operational reports and dashboards for both technical and executive audiences
- Highly organized, proactive, and confident in taking ownership of cross-functional service outcomes in complex environments
