Smartmontools (S.M.A.R.T. Monitoring Tools) is a set of utility programs (smartctl and smartd) to control and monitor computer storage systems using the Self-Monitoring, Analysis and Reporting Technology (S.M.A.R.T.) system built into most modern (P)ATA, Serial ATA, SCSI/SAS and NVMe drives.
Smartmontools displays early warning signs of hard drive problems detected by S.M.A.R.T., often giving notice of impending failure while it is still possible to back data up.
In this post, we will show you how to check SSD and HDD health on Linux.
Install Smartctl
smartctl is part of the smartmontools package, which is available in the default repositories of all major Linux distributions.
On Debian and Ubuntu, install it using the following command:
# sudo apt-get install smartmontools -y
On RHEL, CentOS, and Fedora, install it using the following command:
# sudo dnf install smartmontools
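If you script the installation across several machines, the two commands above can be selected from the distribution ID. The following is a minimal sketch; the function name install_cmd_for is hypothetical, and it assumes the ID values found in /etc/os-release. It only prints the command, it does not run it.

```shell
# install_cmd_for is a hypothetical helper (not part of smartmontools).
# It prints the install command for a distribution ID, i.e. the ID= field
# from /etc/os-release, and fails for IDs it does not know.
install_cmd_for() {
    case "$1" in
        debian|ubuntu)      echo "sudo apt-get install -y smartmontools" ;;
        rhel|centos|fedora) echo "sudo dnf install -y smartmontools" ;;
        *)                  echo "unsupported distribution: $1" >&2; return 1 ;;
    esac
}

# Usage on a real system (commented out so the sketch has no side effects):
# . /etc/os-release && install_cmd_for "$ID"
```

Older releases that ship yum instead of dnf are not covered by this sketch.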
After installing smartmontools, start the smartd monitoring daemon using the following command:
# sudo systemctl start smartd
You can check the status of smartd with the following command:
# systemctl status smartd
You should get the following output:
# systemctl status smartd
● smartmontools.service - Self Monitoring and Reporting Technology (SMART) Daem>
Loaded: loaded (/lib/systemd/system/smartmontools.service; enabled; vendor>
Active: active (running) since Thu 2021-08-12 08:06:42 CEST; 22s ago
Docs: man:smartd(8)
man:smartd.conf(5)
Main PID: 71321 (smartd)
Status: "Next check of 1 device will start at 08:36:42"
Tasks: 1 (limit: 9278)
Memory: 1.7M
CGroup: /system.slice/smartmontools.service
└─71321 /usr/sbin/smartd -n
Aug 12 08:06:42 Gandalf smartd[71321]: Device: /dev/sda [SAT], opened
Aug 12 08:06:42 Gandalf smartd[71321]: Device: /dev/sda [SAT], PNY CS900 120GB >
Aug 12 08:06:42 Gandalf smartd[71321]: Device: /dev/sda [SAT], not found in sma>
Aug 12 08:06:42 Gandalf smartd[71321]: Device: /dev/sda [SAT], can't monitor Cu>
Aug 12 08:06:42 Gandalf smartd[71321]: Device: /dev/sda [SAT], can't monitor Of>
Aug 12 08:06:42 Gandalf smartd[71321]: Device: /dev/sda [SAT], is SMART capable>
Aug 12 08:06:42 Gandalf smartd[71321]: Monitoring 1 ATA/SATA, 0 SCSI/SAS and 0 >
Aug 12 08:06:42 Gandalf smartd[71321]: Device: /dev/sda [SAT], previous self-te>
Aug 12 08:06:42 Gandalf smartd[71321]: Device: /dev/sda [SAT], state written to>
Aug 12 08:06:42 Gandalf systemd[1]: Started Self Monitoring and Reporting Techn>
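Note that the output above lists the unit as smartmontools.service even though the command asked for smartd. If you want the daemon to start at every boot as well, a sketch like the following can pick whichever unit name is present. The helper pick_smart_unit is hypothetical, and it assumes the usual column layout of systemctl list-unit-files.

```shell
# pick_smart_unit is a hypothetical helper, not part of smartmontools or systemd.
# It reads `systemctl list-unit-files` output on stdin and prints the first
# unit named smartd.service or smartmontools.service, if any.
pick_smart_unit() {
    awk '$1 == "smartd.service" || $1 == "smartmontools.service" { print $1; exit }'
}

# Usage on a real system (commented out so the sketch has no side effects):
# unit=$(systemctl list-unit-files --type=service | pick_smart_unit)
# sudo systemctl enable --now "$unit"
```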
Test Health of SSD/HDD
To check the overall health of the drive, run:
# sudo smartctl -d ata -H /dev/sda
Where:
-d – Specifies the type of device.
ata – The device type is ATA; use scsi for SCSI devices.
-H – Asks the device to report its SMART health status.
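If you are not sure which device name and device type to pass, smartctl --scan lists the devices it detects together with a suggested -d value. The sketch below reduces that listing to "device type" pairs; scan_to_pairs is a hypothetical helper, and it assumes each scan line has the form "/dev/sda -d sat # comment".

```shell
# scan_to_pairs is a hypothetical helper (not part of smartmontools).
# It reads `smartctl --scan` output on stdin and prints one
# "device type" pair per detected drive.
scan_to_pairs() {
    awk '$2 == "-d" { print $1, $3 }'
}

# Usage on a real system (commented out so the sketch has no side effects):
# sudo smartctl --scan | scan_to_pairs
```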
# sudo smartctl -d ata -H /dev/sda
[sudo] password for rasho:
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-80-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
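The result PASSED means the drive's own self-assessment found no sign of imminent failure; it is not a guarantee that the drive is free of problems. If the device reports a failing health status, this means either that the device has already failed or that it could fail very soon.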
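For use in a script or cron job, the result line can be extracted from the output. This is a minimal sketch with two hypothetical helpers, health_status and is_healthy; it assumes the ATA-style result line shown above and will print nothing for devices that word their health line differently.

```shell
# health_status and is_healthy are hypothetical helpers (not part of smartmontools).
# health_status reads `smartctl -H` output on stdin and prints the text after
# "SMART overall-health self-assessment test result: ", e.g. PASSED.
health_status() {
    sed -n 's/^SMART overall-health self-assessment test result: //p'
}

# is_healthy succeeds only when the extracted result is exactly PASSED.
is_healthy() {
    [ "$(health_status)" = "PASSED" ]
}

# Usage on a real system (commented out so the sketch has no side effects):
# sudo smartctl -d ata -H /dev/sda | is_healthy || echo "check /dev/sda"
```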
If the drive reports a failing status, use the -a option to get more information:
# sudo smartctl -a /dev/sda
Example output:
# sudo smartctl -a /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-80-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model:     PNY CS900 120GB SSD
Serial Number:    PNY07190003520101427
LU WWN Device Id: 5 f8db4c 071901427
Firmware Version: CS900612
User Capacity:    120,034,123,776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-4 (minor revision not indicated)
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Aug 12 08:26:38 2021 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled.
Self-test execution status: ( 32) The self-test routine was interrupted by the host with a hard or soft reset.
Total time to complete Offline data collection: (65535) seconds.
Offline data collection capabilities: (0x79) SMART execute Offline immediate. No Auto Offline data collection support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported. General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 30) minutes.
Conveyance self-test routine recommended polling time: ( 6) minutes.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       5116
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       3573
168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       0
170 Unknown_Attribute       0x0003   093   093   000    Pre-fail  Always       -       65
173 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       10420411
192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       -       225
194 Temperature_Celsius     0x0023   067   067   000    Pre-fail  Always       -       33 (Min/Max 33/33)
218 Unknown_Attribute       0x000b   100   100   050    Pre-fail  Always       -       0
231 Temperature_Celsius     0x0013   100   100   000    Pre-fail  Always       -       94
241 Total_LBAs_Written      0x0012   100   100   000    Old_age   Always       -       12374
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num  Test_Description    Status                    Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Interrupted (host reset)  00%        1822             -
SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
You can monitor the following attributes. Attribute IDs and names are vendor-specific, so a drive may not report all of them (the example drive above reports none of the three):
[ID 5] Reallocated Sectors Count – Number of sectors reallocated due to read errors.
[ID 187] Reported Uncorrect – Number of errors that could not be corrected while reading from or writing to a sector.
[ID 230] Media Wearout Indicator – Current state of drive operation based upon the Life Curve.
For the wearout indicator, a normalized value of 100 is the best and 0 is the worst.
Check SMART Attribute Details for more information.
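To track a single attribute over time, you can pull its raw value out of the attribute table. The sketch below uses a hypothetical helper, attr_raw, and assumes the ten-column table layout shown in the example output (ID in the first column, raw value starting in the tenth). It prints nothing when the drive does not report the requested ID.

```shell
# attr_raw is a hypothetical helper (not part of smartmontools).
# It reads `smartctl -A` (or -a) output on stdin and prints the first token of
# the RAW_VALUE column for the attribute ID given as $1.
attr_raw() {
    # Only rows whose third column looks like a flag (0x....) are attribute rows.
    awk -v id="$1" '$1 == id && $3 ~ /^0x/ { print $10; exit }'
}

# Usage on a real system (commented out so the sketch has no side effects):
# sudo smartctl -A /dev/sda | attr_raw 5
```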
To start an extended (long) self-test, use the following command:
# sudo smartctl -t long /dev/sda
To run a short self-test instead, run:
# sudo smartctl -t short /dev/sda
To view the drive's self-test results, use the following command:
# sudo smartctl -l selftest /dev/sda
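In the self-test log, entry "# 1" is the most recent test. A script can check whether that entry finished cleanly; the following sketch uses a hypothetical helper, last_selftest_ok, and assumes the ATA log layout shown earlier, where a successful run carries the status "Completed without error".

```shell
# last_selftest_ok is a hypothetical helper (not part of smartmontools).
# It reads `smartctl -l selftest` output on stdin and succeeds only if the
# most recent entry ("# 1") has the status "Completed without error".
last_selftest_ok() {
    awk '
        /^# *1 / { found = 1; ok = ($0 ~ /Completed without error/) }
        END      { exit !(found && ok) }'
}

# Usage on a real system (commented out so the sketch has no side effects):
# sudo smartctl -l selftest /dev/sda | last_selftest_ok && echo "last test OK"
```

With the example log shown earlier, where the last test was interrupted by a host reset, the check fails.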
To see the estimated time each self-test takes, run the following command:
# sudo smartctl -c /dev/sda
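The estimates appear in the "recommended polling time" entries of that output. A sketch for extracting them follows; selftest_minutes is a hypothetical helper, and it assumes the ATA wording seen in the example output ("Short self-test routine", then "recommended polling time: ( 2) minutes.", either on one line or split across two).

```shell
# selftest_minutes is a hypothetical helper (not part of smartmontools).
# It reads `smartctl -c` output on stdin and prints "<test> <minutes>" for
# each self-test type that has a recommended polling time.
selftest_minutes() {
    awk '
        /^(Short|Extended|Conveyance) self-test routine/ { name = $1 }
        /recommended polling time:/ {
            # The value sits between parentheses, e.g. "(   2) minutes."
            if (match($0, /\([ ]*[0-9]+\)/)) {
                n = substr($0, RSTART + 1, RLENGTH - 2)
                gsub(/ /, "", n)
                print name, n
            }
        }'
}

# Usage on a real system (commented out so the sketch has no side effects):
# sudo smartctl -c /dev/sda | selftest_minutes
```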
You can print error logs of the disk by using the command:
# sudo smartctl -l error /dev/sda
To get help information, run the following command:
# sudo smartctl --help
Conclusion
In this guide, you learned how to install and use smartmontools to check the health of your SSD and HDD drives. I hope it helps. For more information, read the smartctl man page.
See also: GDU fast console disk usage analyzer