Intel® Data Center Diagnostic Tool for Intel® Xeon® Processors
Introduction
The Intel® Data Center Diagnostic Tool is a diagnostic software tool that can be run on your data center platforms to:
- Verify the functionality of all cores within an Intel® Xeon® Processor.
- Serve as part of a regular system maintenance program.
High reliability and availability in the data center require the right tools and a commitment to maintenance. Intel believes it is an industry best practice to use maintenance tools such as these for both initial deployment and periodic testing to help ensure the best system experience.
Note: Modern computing infrastructure brings ever-increasing demand for processing power combined with business expectations for service quality and high availability (and guarantees on service-level agreements [SLAs] in general). These expectations emphasize the need for powerful software tools that can help predict, identify, and minimize unexpected system faults that might compromise service quality or uptime. Read a paper from IDC that covers the need for diagnostic tools including the Intel® Data Center Diagnostic Tool.
System requirements
The Intel Data Center Diagnostic Tool is a Linux* application that can be installed and run on many current Linux distributions. There is no Windows* version of this tool.
For best coverage, run the application in the root system of a server. It is possible to run it inside a container or virtual machine, but be aware that some functionality may be disabled.
Supported processors:
- 3rd Generation Intel® Xeon® Scalable Processors (formerly Ice Lake and Cooper Lake)
- 2nd Generation Intel® Xeon® Scalable Processors (formerly Cascade Lake)
- 1st Generation Intel® Xeon® Scalable Processors (formerly Skylake)
- Intel® Xeon® Processor E5 v4 Family (formerly Broadwell)
- Intel® Xeon® Processor E7 v4 Family (formerly Broadwell)
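Before installing, it can help to see which processor the system reports and whether it is running inside a virtual machine or container. The following is a minimal sketch and not part of the tool: the function names cpu_model_name and virt_kind are hypothetical, and it assumes the usual Linux /proc/cpuinfo layout and, optionally, the systemd-detect-virt utility.

```shell
#!/bin/sh
# Print the first "model name" entry from a cpuinfo-style file.
# The file defaults to /proc/cpuinfo; the parameter exists so the
# function can be tried against a sample file.
cpu_model_name() {
    file="${1:-/proc/cpuinfo}"
    sed -n 's/^model name[[:space:]]*:[[:space:]]*//p' "$file" | head -n 1
}

# Print the virtualization type reported by systemd-detect-virt
# ("none" on bare metal), or "unknown" when the utility is missing.
virt_kind() {
    if command -v systemd-detect-virt >/dev/null 2>&1; then
        systemd-detect-virt 2>/dev/null || true
    else
        echo "unknown"
    fi
}
```

Compare the string printed by cpu_model_name against the supported processors listed above. A virt_kind result other than "none" suggests that some functionality may be disabled.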
Installation
Debian*/Ubuntu*
To install the Intel® Data Center Diagnostic Tool software packages on Debian*-based distributions, add the Intel software package repository and install the appropriate packages.
Before you copy and paste the commands into your console, you may want to run sudo ls and enter your password so that the sudo password prompt does not consume the pasted commands.
Set up the key to verify the package signatures:
curl https://repositories.intel.com/dcdt/dcdiag.pub | sudo apt-key add -
Set up the repository:
sudo apt-add-repository 'deb [arch=amd64] https://repositories.intel.com/dcdt/debian stable main'
Install the package:
sudo apt-get update
Fedora*/CentOS*/RHEL*
To install the Intel Data Center Diagnostic Tool software packages on a Fedora-based distribution, add the Intel software package repository and install the package.
The first time you install, YUM or DNF will prompt you to accept the signing key. Verify that the fingerprint is as follows, and then accept it:
Before you copy and paste the commands into your console, you may want to run sudo ls and enter your password so that the sudo password prompt does not consume the pasted commands.
Install the repository file:
sudo yum install https://repositories.intel.com/dcdt/dcdiag-repo.rpm
Install the package:
sudo yum install dcdiag
OpenSUSE*/SUSE Linux Enterprise*
Install the repository file:
sudo zypper ar https://repositories.intel.com/dcdt/dcdiag.repo
Install the package:
sudo zypper install dcdiag
You will be warned that repomd.xml is not signed. Respond yes to continue. You will be given another chance to verify the package signature. Verify that the fingerprint is as follows, and then accept it:
Repository: dcdiag
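To decide which of the three installation paths above applies on a given machine, a script can probe for the package manager commands. This is a minimal sketch under stated assumptions: detect_pkg_manager is a hypothetical helper name, and the check relies only on which commands are present in PATH, not on the distribution name.

```shell
#!/bin/sh
# Print which installation path above applies: apt, yum, zypper, or unknown.
# dnf is grouped with yum because the Fedora*/CentOS*/RHEL* steps cover both.
detect_pkg_manager() {
    if command -v apt-get >/dev/null 2>&1; then
        echo "apt"
    elif command -v dnf >/dev/null 2>&1 || command -v yum >/dev/null 2>&1; then
        echo "yum"
    elif command -v zypper >/dev/null 2>&1; then
        echo "zypper"
    else
        echo "unknown"
    fi
}
```

After installation, `command -v dcdiag` is a simple way to confirm that the binary is on the PATH.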
How to test the Intel® Xeon® Processor
Once installed, the Intel Data Center Diagnostic Tool is automatically enabled for background execution. You can confirm that the service is running with the following command:
# systemctl status dcdiag
● dcdiag.service - Intel® Data Center Diagnostic Tool
   Loaded: loaded (/usr/lib/systemd/system/dcdiag.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2021-02-19 11:24:17 MST; 4 days ago
     Docs: file:///usr/share/doc/dcdiag/README.rst
 Main PID: 8777 (dcdiag)
   CGroup: /system.slice/dcdiag.service
           └─8777 /usr/bin/dcdiag --service
Note: If you want to disable the Intel Data Center Diagnostic Tool's background execution, run:
# systemctl disable --now dcdiag
For more information on using the systemctl(1) command, refer to its Linux* manual page.
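For scripted checks, the full status output is harder to parse than the one-word answers from the systemctl is-enabled and is-active subcommands. The sketch below assumes the unit is named dcdiag, as in the output above. The helper name service_state and its output format are hypothetical.

```shell
#!/bin/sh
# Print the enablement and activity state of a systemd unit on one line.
# The unit name defaults to dcdiag, matching the service shown above.
service_state() {
    unit="${1:-dcdiag}"
    if ! command -v systemctl >/dev/null 2>&1; then
        echo "systemctl-not-found"
        return 0
    fi
    # Both subcommands exit non-zero for disabled or inactive units,
    # so the exit status is ignored and only the printed word is kept.
    enabled=$(systemctl is-enabled "$unit" 2>/dev/null || true)
    active=$(systemctl is-active "$unit" 2>/dev/null || true)
    echo "enabled=${enabled:-unknown} active=${active:-unknown}"
}
```

On a system where the service is running as shown above, the expected states are enabled and active. Any other value is a prompt to look at the full systemctl status output.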
If the Intel Data Center Diagnostic Tool detects any errors while running in the background, it logs them to the system log. To check whether the background scan detected any errors, run the tool with the --query argument:
# dcdiag --query
Intel® Data Center Diagnostic Tool Version 506
Test completed successfully. No issues detected.
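A periodic job could test the --query output for the success message shown above and fall back to the system log otherwise. This is only a sketch: query_is_clean is a hypothetical helper that matches the message text, because the exit status of dcdiag --query is not documented here. The journalctl call in the comment assumes a systemd journal and that the tool logs under the dcdiag unit.

```shell
#!/bin/sh
# Return 0 when the text on stdin contains the success message shown above.
# -F matches the string literally, so the dots are not treated as wildcards.
query_is_clean() {
    grep -qF "Test completed successfully. No issues detected."
}

# Example use as root (assumes journald and the dcdiag unit name):
#   dcdiag --query | query_is_clean || journalctl -u dcdiag
```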
You can also run the tool manually in the foreground from a Linux command prompt:
# dcdiag
The manual test runs for about 45 minutes and causes high CPU utilization.
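Because the manual run takes a long time, it may be convenient to keep a copy of its output. The following is a minimal sketch that assumes the tool writes its messages to standard output or standard error. The helper name run_logged and the log path in the comment are hypothetical.

```shell
#!/bin/sh
# Run a command, showing its output on the terminal and saving a copy.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
# Note: the return status is that of tee, not of the command itself.
run_logged() {
    log="$1"
    shift
    "$@" 2>&1 | tee "$log"
}

# Example use as root:
#   run_logged /var/log/dcdiag-manual.log dcdiag
```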
When the diagnostic completes, the tool returns one of the following messages:
- Test completed successfully. No issues detected.
- Test completed successfully. One or more machine check errors occurred. Please check the system logs.
- This processor is not supported by this version of the tool.
Check the system's processor model and version. This message appears if the Intel Data Center Diagnostic Tool does not detect a production version of the supported processors. Engineering samples are not supported by this tool.
Find help in identifying the processor.
- Test completed. Results are inconclusive due to an outdated version of microcode.
The latest version of the microcode addresses known issues. Please update. Microcode updates are usually delivered by your Linux distribution vendor alongside security fixes and other firmware updates for various components. If your system does not have these updates enabled, we recommend that you enable them. The microcode is automatically loaded by the Linux kernel on every boot and can be reloaded at runtime with the following command as root:
echo 1 > /sys/devices/system/cpu/microcode/reload
- Test completed. Results are inconclusive due to the system exceeding temperature limits.
The system may not be providing enough cooling for the CPU to operate within its required temperature limits. We recommend that you check that the system's cooling is operating correctly. Possible causes include faulty fans, incorrect airflow, or another environmental issue.
- Test completed. Results are inconclusive, one or more machine check errors occurred.
Check system logs.
- Test failed. Contact your system manufacturer or processor vendor for support.
If the test fails, check whether your server node's processors are still under warranty:
- If you have a Boxed Intel® Xeon® Processor still under 3-year warranty, contact Intel Customer Support for assistance.
- If you have a tray processor, contact your system or processor vendor or place of purchase to check if the processor is still under warranty.
Note: Tray processors are sold directly to system manufacturers or Intel authorized distributors. Intel does not provide direct warranty to end users for tray processors unless they came preinstalled in Intel® Data Center Blocks (Intel® DCB) server systems. Except for Intel DCB systems, the warranty for a tray processor comes from the vendor or place of purchase of the processor, or of the system if the processor came preinstalled. Intel recommends purchasing from Intel Authorized Distributors, Intel Approved Suppliers, and resellers of Intel® products.
- Be aware that Intel does not have an out-of-warranty replacement program.
- Test failed.
Test completed, and an error was detected on the physical processor containing /sys/devices/system/cpu/cpuXX.
Contact your system manufacturer or processor vendor for support.
- Test failed.
Test is unable to determine which physical processor caused the failure.
Contact your system manufacturer or processor vendor for support.
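The messages above map onto a small set of follow-up actions, which a monitoring script could encode as a lookup. This is a minimal sketch that matches on phrases taken from the messages listed above. The function name classify_dcdiag_message and the category labels are hypothetical, and it matches message text only because the tool's exit codes are not described here.

```shell
#!/bin/sh
# Map a result message (passed as the first argument) to a suggested action.
# Patterns are checked in order, so the success message without machine
# check errors is recognized before the more general cases.
classify_dcdiag_message() {
    case "$1" in
        *"No issues detected"*)            echo "pass" ;;
        *"not supported"*)                 echo "unsupported" ;;
        *"outdated version of microcode"*) echo "update-microcode" ;;
        *"exceeding temperature limits"*)  echo "check-cooling" ;;
        *"machine check errors"*)          echo "check-logs" ;;
        *"Test failed"*)                   echo "contact-vendor" ;;
        *)                                 echo "unrecognized" ;;
    esac
}
```

For example, passing the captured output of a manual run as the argument yields a single label that a script can act on, such as opening a ticket when the label is contact-vendor.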
Version history
Date | Version | Description |
July 7, 2021 | 540 | Initial version |
Related topics
Intel® Xeon® Support Central Website
Warranty Guide for Intel® Processors