Manage, monitor, and maintain critical infrastructure components such as servers, storage systems, and networks.
Use infrastructure monitoring tools to proactively detect issues and ensure continuous service availability.
Respond to and resolve infrastructure-related incidents in a timely manner, minimizing downtime.
Conduct root cause analysis of recurring issues and implement solutions to prevent future incidents.
Participate in on-call rotations and work flexible hours to ensure 24/7 support.
Perform capacity planning to ensure infrastructure scalability and support business growth.
Optimize system and network performance by identifying potential bottlenecks and making necessary adjustments.
Implement and manage regular backups of key systems and data.
Support disaster recovery processes and ensure infrastructure resilience to maintain business continuity.
Apply security best practices to protect infrastructure from internal and external threats.
Ensure compliance with internal policies, industry standards, and regulatory requirements.
Develop automation tools to streamline operational processes, improve efficiency, and reduce manual effort.
Continuously seek opportunities to enhance and optimize infrastructure operations through automation.
Maintain accurate documentation of infrastructure configurations, procedures, and incident resolutions.
Provide regular performance and incident reports to management and relevant stakeholders.
Collaborate with cross-functional teams on infrastructure-related projects and new initiatives.
Provide technical support to other teams and mentor junior staff where applicable.
Job Qualifications
Bachelor’s degree in Information Technology, Computer Science, or a related field. Fresh graduates are encouraged to apply.
Prior internship or project experience in IT infrastructure management is a plus.
GCP Professional-level, AWS Certified Solutions Architect, or Azure Administrator certification is preferred. Candidates with proven experience who are able to obtain certification later will also be considered.
Candidates with strong learning ability and a willingness to take the certification exams after hiring are welcome.
Knowledge of networking, server administration, and cloud platforms (Google Cloud, AWS, Azure).
Familiarity with virtualization tools (e.g., VMware, Hyper-V) and automation tools (e.g., Ansible, Terraform).
Understanding of monitoring and logging systems (e.g., Nagios, Zabbix).
Ability to introduce oneself in English and communicate effectively with team members and international stakeholders.
Strong problem-solving skills with the ability to work under pressure.
Team player with excellent communication skills.
Flexibility to work shifts, take on-call duty, and work at remote office locations (Operation Center or Data Center sites).
Training and development opportunities will be provided, including support for certification exams.
Good command of English (minimum TOEIC score of 750).