The Senior Data Center Operations professional is responsible for the availability, reliability, and operational excellence of mission-critical data center infrastructure. The role serves as an on-site technical lead, owning complex operational activities, incident response, and advanced troubleshooting, while mentoring junior technicians and supporting continuous improvement initiatives.
The role requires deep hands-on expertise, sound judgment in high-pressure situations, and the ability to operate independently within a 24/7 critical environment.
Technical Authority & Ownership
Senior-level ownership of data center operations and infrastructure stability
Authority to lead incident response and complex troubleshooting activities
Recognition as a subject-matter expert within the operations team
Direct influence on operational standards, procedures, and improvements
Professional Standing & Growth
Positioning as a senior technical reference within the data center
Opportunity to mentor technicians and shape operational best practices
Your responsibilities
Senior Operations & Infrastructure Management
Oversee installation, configuration, testing, and maintenance of critical data center hardware and systems
Ensure operational readiness and compliance with availability, security, and safety standards
Act as the escalation point for complex or high-impact operational issues
Monitoring, Incident Leadership & Troubleshooting
Lead response to infrastructure incidents, outages, and performance degradation
Perform advanced troubleshooting across hardware, OS, networking, and storage layers
Coordinate with engineering, network, facilities, and vendor teams during incidents
Drive root cause analysis (RCA) and corrective actions
Preventive Maintenance & Reliability
Own and improve preventive and predictive maintenance programs
Validate maintenance procedures and execution quality
Identify risks, single points of failure, and reliability gaps
Project Execution & Change Management
Lead or support complex operational projects such as:
Data center expansions
Hardware refresh programs
Infrastructure upgrades
Execute changes in line with change management and risk controls
Documentation, Standards & Mentorship
Own and maintain operational documentation and standard operating procedures (SOPs) at a senior level
Contribute to audits, compliance reviews, and operational assessments
Mentor and support junior and mid-level technicians
Promote a strong culture of safety, discipline, and continuous improvement
Your key competencies
Education
Degree in Computer Science, Information Technology, or Engineering, or equivalent practical experience
Experience
At least 8 years (typically 8 to 12 or more) of experience in data center operations or mission-critical IT environments
Proven experience leading operational activities in 24/7 critical facilities
Demonstrated ownership of incident management and reliability initiatives
Technical Expertise
Deep hands-on expertise with:
Server, storage, and rack infrastructure
Networking fundamentals and connectivity troubleshooting
Linux
Strong understanding of monitoring, data center infrastructure management (DCIM), ticketing, and change management tools
Leadership & Personal Attributes
Strong decision-making under pressure
High sense of ownership and accountability
Ability to mentor and guide less experienced technicians
Excellent written and verbal communication skills in English
Proactive, detail-oriented, and reliability-focused mindset