Assistant Network Manager
Headquarters
HKCMIX
Responsibilities
Key Responsibilities:
Network Operations & Maintenance:
Monitor, inspect, and maintain AI/ML infrastructure networks, including IP, BGP, EVPN, and NVIDIA InfiniBand fabrics.
Perform proactive system checks, fault diagnosis, and performance optimization.
Troubleshoot complex network issues and ensure rapid resolution to minimize downtime.
Escalate critical incidents following defined protocols and coordinate with cross-functional teams.
Network Management & Automation:
Administer network management systems (NMS) for monitoring, logging, and alerting.
Implement automation scripts/tools to improve network efficiency and reliability.
Vendor & Certification Alignment:
Work closely with vendors (Cisco, Huawei, NVIDIA, etc.) on hardware/software support and optimization.
Maintain compliance with best practices and industry standards.
Team Leadership & Documentation:
Mentor junior engineers and supervise daily operations.
Maintain detailed documentation of network configurations, incidents, and resolutions.
Requirements
Qualifications & Experience:
Education:
Bachelor’s degree or higher in Computer Engineering, Telecommunications, or equivalent qualifications.
Experience:
15+ years of working experience in network operations, preferably in AI/HPC/data center environments.
Expertise in IP networking (IGP/BGP), EVPN, and NVIDIA InfiniBand (IB).
Hands-on experience with network management and monitoring tools.
Certifications (Preferred):
Cisco (CCNP/CCIE), Huawei (HCIE), or NVIDIA InfiniBand certifications.
Soft Skills:
Strong problem-solving, communication, and leadership abilities.
Ability to work under pressure in a fast-paced AI infrastructure environment.
Language Skills:
Fluent in English and Mandarin; Cantonese proficiency is a strong plus.
Preferred Skills:
Familiarity with AI/ML cluster networking and high-performance computing (HPC).
Knowledge of Linux networking, SDN, or cloud networking (AWS/Azure/GCP) is a plus.