Tool Usage
- Use the smartctl tool to collect SMART data from both healthy and faulty drives every day, label each drive as faulty or healthy, and organize the data into a training file in the specified format, for example, fault_train.csv. For details, see Data Collection and Input File Format. In fault_train.csv, the fault column indicates whether a drive is faulty.
- Import BoostKit_KSML functions.
from BoostKit_KSML.hdd_fault_detect import fault_train
from BoostKit_KSML.hdd_fault_detect import fault_update
from BoostKit_KSML.hdd_fault_detect import fault_predict
- Call the fault_train interface with the training data file prepared in step 1 to generate the trained binary model, model.pkl.
fault_train(r"fault_train.csv", r"model.pkl")
- (Optional) For incremental training, use the smartctl tool to collect SMART data from both healthy and faulty drives every day, label each drive as faulty or healthy, and organize the data into an incremental training file in the specified format, for example, fault_update.csv. For details, see Data Collection and Input File Format. The fault column in this file indicates whether a drive is faulty.
- (Optional) Call the fault_update interface with fault_update.csv to incrementally train the existing model, model.pkl, and generate the updated model, new_model.pkl.
fault_update(r"fault_update.csv", r"model.pkl", r"new_model.pkl")
- Use the smartctl tool to collect SMART data from the drives to be predicted and organize the data into a file in the specified format, for example, fault_predict.csv. For details, see Data Collection and Input File Format. fault_predict.csv is the dataset to be predicted, so it is unlabeled.
- Call the fault_predict interface with the file to be predicted (fault_predict.csv) and the existing model (model.pkl). If you performed the optional steps 4 and 5, use new_model.pkl instead of model.pkl.
fault_predict(r"fault_predict.csv", r"model.pkl")
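The prediction step can be wrapped in a small pre-flight check. The following is a minimal sketch, not part of BoostKit_KSML: the helpers has_fault_column, choose_model, and run_prediction are hypothetical names, and the sketch assumes that the CSV files have a header row in which the label column is literally named fault. Treating a labeled prediction file as an error is a choice made by this sketch, not documented library behavior. Only the fault_predict import and its two positional arguments come from the steps above.

```python
import csv
from pathlib import Path


def has_fault_column(csv_path):
    """Return True if the CSV header row contains a column named 'fault'."""
    with open(csv_path, newline="") as handle:
        header = next(csv.reader(handle), [])
    return "fault" in [name.strip() for name in header]


def choose_model(base="model.pkl", updated="new_model.pkl"):
    """Prefer the incrementally trained model when that file exists."""
    return updated if Path(updated).exists() else base


def run_prediction(predict_csv="fault_predict.csv"):
    # Imported here so the helpers above can be used without the package installed.
    from BoostKit_KSML.hdd_fault_detect import fault_predict

    # Assumption of this sketch: a prediction file must not carry labels.
    if has_fault_column(predict_csv):
        raise ValueError(f"{predict_csv} has a 'fault' column but should be unlabeled")

    # The return value of fault_predict is not described in this section,
    # so it is passed through unchanged.
    return fault_predict(predict_csv, choose_model())
```

Calling run_prediction() with no arguments uses fault_predict.csv and picks new_model.pkl if the optional incremental training steps produced it, otherwise model.pkl.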
Each interface writes its logs through sub-loggers to the file specified by the log_file parameter. For details about this parameter, see Parameter Package. If your program configures or modifies RootLogger directly through the logging module, the logs of all sub-loggers are propagated to RootLogger and output by its handlers.
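The propagation described above is standard behavior of Python's logging module and can be reproduced in isolation. The following sketch assumes that the library's sub-loggers are ordinary logging.Logger objects. The logger name ksml_demo.hdd_fault_detect is hypothetical, and in-memory streams stand in for the log_file target and for the root logger's output.

```python
import io
import logging

# In-memory streams stand in for the log_file target and the root logger's output.
child_stream = io.StringIO()
root_stream = io.StringIO()

# Hypothetical sub-logger name; the name used by the library is not given here.
child = logging.getLogger("ksml_demo.hdd_fault_detect")
child.setLevel(logging.INFO)
child.addHandler(logging.StreamHandler(child_stream))

# Configure the root logger directly, as an application might do.
root = logging.getLogger()
root_handler = logging.StreamHandler(root_stream)
root.addHandler(root_handler)

# Reaches the sub-logger's handler and, through propagation, the root handler.
child.info("training started")

# Stop forwarding records to ancestor loggers.
child.propagate = False

# Reaches only the sub-logger's handler.
child.info("prediction started")

# Leave the root logger as it was found.
root.removeHandler(root_handler)
```

If duplicate output through the root logger is unwanted, disabling propagate on the relevant sub-logger is one option, provided that the actual logger names can be determined.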