
Summary
A global engineering team builds advanced computing systems that power large-scale AI and high-performance workloads. The team develops both hardware and low-level software to ensure these systems operate with maximum reliability, efficiency, and stability across diverse environments.
We are looking for a highly motivated System Validation Engineer to join the system reliability and burn-in validation team. In this role, you will design and execute system-level test suites, automate stress and reliability workflows, analyze performance data, and collaborate with hardware, firmware, and software engineering teams to resolve issues before products reach customers.
This position plays a key role in ensuring the quality of next-generation compute platforms.
Responsibilities
- Collaborate with cross-functional engineering teams to understand system features, requirements, and validation goals.
- Design, develop, and implement automated test scripts for burn-in, stress, and reliability testing using Python, Bash, or Go.
- Execute system-level tests across server platforms, monitoring health, stability, and performance metrics.
- Analyze test results, identify root causes of failures, and work with engineering teams to drive fixes.
- Conduct fault-injection, power-cycle, boot-sequence, and workload-deployment testing.
- Build and maintain scalable test environments and automation tools used across multiple configurations.
- Create clear documentation including test plans, test cases, procedures, and reports.
- Continuously improve test processes, workflows, and tooling to enhance reliability and validation coverage.
Requirements
- 5+ years of experience in system testing, software QA, or hardware-software integration.
- Hands-on scripting skills in Python, Bash, or Go for test automation.
- Solid understanding of Linux operating systems and server hardware architecture.
- Experience executing system-level tests, including stress, reliability, or performance testing.
- Ability to analyze system logs, debug hardware/software issues, and drive root-cause analysis.
- Familiarity with testing methodologies, defect tracking tools, and structured validation processes.
- Experience working with virtualized, containerized, or cloud-based test environments.
- Good understanding of performance metrics (CPU, memory, storage, network) and benchmarking concepts.
Post Time: 2025/12/01


