AI Infrastructure Network Manager, Data Centre
Headquarters
HKCMIX
Responsibilities
Network Monitoring & Maintenance: Assist in daily network inspections, performance monitoring, and health checks for AI infrastructure.
Support troubleshooting of IP networks (IGP/BGP), EVPN, and NVIDIA InfiniBand issues.
Follow escalation procedures for critical incidents and collaborate with senior engineers.
Network Operations Support: Work with network management systems (NMS) for logging, alerts, and basic configurations.
Assist in implementing network changes and optimizations under supervision.
Vendor & Documentation Coordination: Liaise with vendors (Cisco, Huawei, NVIDIA) for hardware/software support as needed.
Maintain network documentation, including topology diagrams and incident reports.
Team Collaboration: Work closely with senior network supervisors and cross-functional teams.
Provide guidance to junior engineers on standard procedures.
Requirements
8-12 years of experience in network operations, preferably in data center, cloud, or AI environments.
Hands-on experience with IP networking (OSPF/BGP) and EVPN, plus basic knowledge of InfiniBand.
Familiarity with network monitoring tools (e.g., SolarWinds, Nagios, Prometheus).
Certifications (Preferred but Not Mandatory): Cisco (CCNA/CCNP), Huawei (HCIP), or NVIDIA networking certifications.
Soft Skills: Strong analytical and problem-solving skills.
Ability to work in a team and follow operational protocols.
Language Skills: Fluent in English and Mandarin; Cantonese proficiency is a strong plus.