Responsibilities:
-Design, implement, and manage AliCloud CI/CD pipelines to facilitate automated build, test, and deployment processes for applications and services.
-Be familiar with code governance, release management, artifact management, and the integration toolchain in the SDLC.
-Continuously improve and optimize the performance and scalability of the cloud-based infrastructure and services.
-Troubleshoot and resolve issues related to the AliCloud platform, application deployments, and performance; participate in on-call support as needed.
-Provide analysis and support to the development team to improve and automate the build, release, and deployment processes.
-Participate in and report on cost optimization exercises for cloud resources.
-Provide not only development capabilities but also operational expertise across infrastructure, IT service management, disaster recovery, patching, and monitoring in our self-owned systems.
-Our stakeholders are spread across the APAC region, so fluent spoken and written English is preferred for a DevOps Engineer.
Requirements:
-Bachelor’s degree in Computer Science, Engineering, or Software Engineering.
-Strong experience with Linux-based infrastructure, Linux/Unix administration, and Kubernetes.
-Strong experience with AliCloud (e.g. ACK, ARMS, KMS, Cloud-native API Gateway).
-Solid understanding of the DevSecOps process and CI/CD pipeline management platforms such as GitHub, Jenkins, Artifactory, and Snyk.
-Strong scripting skills in languages such as Shell, Python, Groovy, and Bash.
-Experience in monitoring (e.g. New Relic, ARMS).
-More than five years of experience in a DevOps Engineer or similar role; experience in software development or infrastructure development is a plus.
-Experience working with Azure is preferred.
-Nice to have: experience with infrastructure as code (Terraform, Ansible).
-Nice to have: AliCloud certification (e.g. ACA, ACP, ACE).
Updated on 2025-12-21
Principal responsibilities:
-Design, implement, and maintain highly available, scalable infrastructure solutions, leveraging automation to streamline operations.
-Monitor system performance, proactively identify potential issues, and drive incident response and root cause analysis.
-Collaborate with cross-functional teams (development, product, security) to integrate reliability best practices across the entire software lifecycle.
-Develop and manage automation scripts, CI/CD pipelines, and infrastructure-as-code (IaC) frameworks to enhance efficiency and reduce manual intervention.
-Optimize cloud resources, cost management, and disaster recovery strategies to ensure business continuity.
Qualifications:
-Experience: Minimum 5 years in IT operations or Site Reliability Engineering, with a focus on infrastructure management and system optimization.
-Technical Skills: Proficiency in operations tools such as Ansible, Puppet, Chef, Terraform, Prometheus, Grafana, and the ELK Stack.
-Strong scripting skills in Python, Shell, or similar languages.
-Cloud Competency: Solid experience with major cloud platforms (AWS, Azure, GCP), including services like EC2, Lambda, and Kubernetes, and with containerization.
-Problem-Solving: Proven ability to troubleshoot complex issues across distributed systems, networks, and applications.
-Communication: Excellent written and verbal communication skills, with the ability to collaborate effectively in a fast-paced, dynamic environment.
Preferred Qualifications:
-3+ years of dedicated experience in cloud service operations, with expertise in cloud-native architectures and microservices.
-Certifications such as AWS Certified Solutions Architect, Google Cloud Professional Cloud Architect, or equivalent.
-Experience with service mesh technologies (e.g., Istio) and observability tools (e.g., Jaeger).
-Familiarity with DevOps culture and practices, including agile methodologies and continuous improvement frameworks.
-Bonus: Proven experience developing IT operations and maintenance tools in Python, demonstrating the ability to automate complex workflows and solve real-world problems.
Updated on 2025-12-16