Training Models
This section describes how to train ModelZoo models, including how to obtain the models and datasets and how to run the training steps.
Obtaining Models and Datasets
- Download the ModelZoo source code.
git clone https://gitee.com/openeuler/sra_benchmark.git -b v1.0.0
The following figure shows the directory structure.

- Download the datasets.
- Change the file name extension of each dataset file to .csv.
mv test.txt test.csv
mv train.txt train.csv
- Extract the datasets, and copy the Taobao and criteo-kaggle datasets to the /path/to/dataset directory.
tar -zxvf taobao.tar.gz
cp -r train.csv eval.csv taobao /path/to/dataset
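The renaming and copying steps above can also be scripted. The sketch below is a hypothetical helper using only the standard library; the source and destination paths are placeholders, not paths defined by ModelZoo.

```python
# Hypothetical sketch of the dataset-preparation steps: rename each
# .txt dataset file to .csv, then copy it into the dataset directory.
# src_dir and dataset_dir are placeholder paths.
import shutil
from pathlib import Path


def prepare_datasets(src_dir: str, dataset_dir: str) -> list:
    src = Path(src_dir)
    dst = Path(dataset_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for txt in sorted(src.glob("*.txt")):
        # Change the extension from .txt to .csv in place.
        csv = txt.with_suffix(".csv")
        txt.rename(csv)
        # Copy the renamed file into the dataset directory.
        shutil.copy(csv, dst / csv.name)
        copied.append(csv.name)
    return copied
```

For the Taobao archive itself you would still extract it first (for example with `tarfile.open(...).extractall(...)`) before copying.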
Evaluating Performance in the Training Phase
- Go to the directory where ModelZoo is stored.
cd /path/to/sra_benchmark/modelzoo
- Train and save models.
python train_throughput_test.py --test_method single --meta_path /path/to/sra_benchmark --criteo_data_location /path/to/dataset --taobao_data_location /path/to/dataset/taobao
Table 1 describes the command parameters.
After a model, such as Wide_and_Deep, is trained, it is saved in the following directory structure: the variables directory holds the model weights, and saved_model.pb defines the structure of the trained model.
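A quick way to check that a model was saved with that layout is to look for the two components named above. This is a minimal sketch assuming the standard TensorFlow SavedModel convention (a saved_model.pb file plus a variables/ subdirectory); the model path is a placeholder.

```python
# Sketch: check that a directory follows the saved-model layout
# described above (saved_model.pb + variables/). Path is a placeholder.
from pathlib import Path


def is_saved_model_dir(model_dir: str) -> bool:
    root = Path(model_dir)
    # saved_model.pb defines the model structure; variables/ holds the weights.
    return (root / "saved_model.pb").is_file() and (root / "variables").is_dir()
```

For example, `is_saved_model_dir("/path/to/output/Wide_and_Deep")` should return True after a successful training run.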

The Area Under the Curve (AUC) value is displayed on the terminal.
Table 1 Command parameters for model training
- --test_method: Specifies the resources used during training.
  - single: A single NUMA node is used (default).
  - entire: All NUMA nodes of the server are used.
- --meta_path: Path to sra_benchmark.
- --criteo_data_location: Path to the criteo-kaggle datasets.
- --taobao_data_location: Path to the Taobao datasets.
Evaluating AUC on the Test Dataset
ModelZoo integrates multiple common search and recommendation models and lets users assess training quality using metrics such as AUC. The Area Under the Curve (AUC) is the core metric for evaluating training: it measures a binary classification model's ability to distinguish positive samples from negative ones. Training is considered effective only when the model's AUC on the test dataset reaches the appropriate threshold (see Table 2), which is a prerequisite for any meaningful subsequent evaluation of model performance.
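The AUC definition above has a simple probabilistic reading: it is the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative one. The sketch below computes it directly from that definition; the labels and scores are made-up illustration data, not ModelZoo output.

```python
# Sketch: rank-based AUC for binary labels, computed from its
# probabilistic definition (ties between scores count half).
# The example labels/scores below are hypothetical.


def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    pairs = 0.0
    hits = 0.0
    for p in positives:
        for n in negatives:
            pairs += 1
            if p > n:
                hits += 1        # positive ranked above negative
            elif p == n:
                hits += 0.5      # tie: half credit
    return hits / pairs


print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.2]))  # 0.75
```

An AUC of 0.5 corresponds to random ranking, and 1.0 to perfect separation, which is why a minimum threshold on the test-set AUC is used as the training sanity check.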
