Summary
We are seeking motivated and detail-oriented System Validation Engineers to join our Burn-in Testing team, which is essential to delivering reliable and stable high-performance inference server platforms. In this role, you will design, implement, and execute comprehensive burn-in and stress test frameworks, analyze results to identify potential failure patterns, and work closely with internal hardware and software teams as well as external manufacturing partners. You will be directly involved in diagnosing issues, driving improvements, and ensuring that our server hardware and software meet stringent quality and reliability standards prior to customer deployment.
Responsibilities
- Collaborate with cross-functional engineering teams to understand system features, requirements, and validation goals.
- Design, develop, and implement automated test scripts for burn-in, stress, and reliability testing using Python, Bash, or Go.
- Execute system-level tests across server platforms, monitoring health, stability, and performance metrics.
- Analyze test results, identify root causes of failures, and work with engineering teams to drive fixes.
- Conduct fault-injection, power-cycle, boot-sequence, and workload-deployment testing.
- Build and maintain scalable test environments and automation tools used across multiple configurations.
- Create clear documentation, including test plans, test cases, procedures, and reports.
- Continuously improve test processes, workflows, and tooling to enhance reliability and validation coverage.
Requirements
- 5+ years of experience in system testing, software QA, or hardware-software integration.
- Hands-on scripting skills in Python, Bash, or Go for test automation.
- Solid understanding of Linux operating systems and server hardware architecture.
- Experience executing system-level tests, including stress, reliability, or performance testing.
- Ability to analyze system logs, debug hardware/software issues, and drive root-cause analysis.
- Familiarity with testing methodologies, defect tracking tools, and structured validation processes.
- Experience working with test environments built on virtualization, containers, or cloud systems.
- Good understanding of performance metrics (CPU, memory, storage, network) and benchmarking concepts.
Post Time: 2026/01/06



