Sensitive Information Scan

Command Function

Scans files for sensitive information, such as mobile numbers, public IP addresses, and ID numbers. You can also define custom sensitive words.

Syntax

devkit doctor sen-scan {-i INPUT_PATH | --input INPUT_PATH} [-o OUTPUT_PATH | --output OUTPUT_PATH] [-S | --show] [-t [PATH] | --template [PATH]] [-sn {1|2|3|a|b|c}* | --sen-num {1|2|3|a|b|c}*] [-sf PATH | --sen-file PATH]

Parameter Description

Table 1 Parameter description

Parameter        Option         Description
-h/--help        -              Displays help information.
-S/--show        -              Displays the default list of sensitive words:
                                  • 1: Public IP address
                                  • 2: Mobile number
                                  • 3: ID number
                                  • a: Hard-coded key/Password (high false positive rate)
                                  • b: Common password text (high false positive rate)
                                  • c: Privacy text (high false positive rate)
-t/--template    -              Generates a sensitive word template in the specified location. If no location is specified, the sen_word.json file is generated in the doctor directory by default. Use -sf to specify a sensitive word template for a scan.
-sn/--sen-num    1/2/3/a/b/c    IDs of the sensitive words to scan for. Run the command with -S to list the available IDs. Use commas (,) to separate multiple IDs. If no ID is specified, all sensitive words are scanned for by default.
-sf/--sen-file   -              Path to the file of user-defined sensitive words. The file must have the same format as the template file generated by -t.
-i/--input       -              Path to the folder or file to be scanned. Only text files can be scanned. Use spaces to separate multiple text files.
-o/--output      -              Path for storing scan reports. If this parameter is not specified, the sen_scan_{time}_[zh|en]_{num}.xlsx file is generated in the doctor/report/sen_scan directory by default.

You can press Ctrl+C to stop a scan that is in progress. The data detected before the scan was stopped is still output. A single report supports a maximum of 10,000 data records.
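
The parameters in Table 1 can also be assembled into a command line from a script. The following Python sketch is illustrative only: build_sen_scan_command is a hypothetical helper, not part of DevKit. It uses only the options listed in Table 1 and builds the argument list without running the tool.

```python
import shlex

# IDs shown by `devkit doctor sen-scan -S` in this document.
VALID_IDS = {"1", "2", "3", "a", "b", "c"}


def build_sen_scan_command(inputs, sen_ids=None, sen_file=None, output=None):
    """Return the argument list for a `devkit doctor sen-scan` call.

    Hypothetical helper: it only assembles arguments and never runs the tool.
    """
    if not inputs:
        raise ValueError("at least one input path is required (-i)")
    # Multiple input paths are passed as separate, space-separated arguments.
    cmd = ["devkit", "doctor", "sen-scan", "-i", *inputs]
    if sen_ids:
        unknown = [i for i in sen_ids if i not in VALID_IDS]
        if unknown:
            raise ValueError(f"unknown sensitive word IDs: {unknown}")
        # Multiple IDs are joined with commas, for example 1,2,3.
        cmd += ["-sn", ",".join(sen_ids)]
    if sen_file:
        cmd += ["-sf", sen_file]
    if output:
        cmd += ["-o", output]
    return cmd


if __name__ == "__main__":
    cmd = build_sen_scan_command(
        ["/home/software/RuoYi-master/"],
        sen_ids=["1", "2", "3"],
        sen_file="/home/temp/sen_word.json",
    )
    # shlex.join renders the list as a copy-and-paste shell command.
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run on a host where DevKit is installed. Omitting sen_ids leaves out -sn, so all sensitive words are scanned for by default.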

Example

  • Viewing the sensitive word list
    devkit doctor sen-scan -S
    

    Command output:

    id      note
    ————————————————————————————————————————————————————————————
    1       Public IP address
    2       Mobile number
    3       ID number
    a       Hard-coded key/Password (high false positive rate)
    b       Common password text (high false positive rate)
    c       Privacy sensitive words (high false positive rate)
    
  • Generating a sensitive word template
    1. Generate a sensitive word template.

      The following uses /home/temp as the template directory. Replace it with the actual directory. If the template directory is not specified, the sen_word.json file is generated in the doctor directory by default.

      devkit doctor sen-scan -t /home/temp
      

      The following information is displayed. If a file with the same name already exists in the path, a suffix is automatically appended to the new file name, for example, sen_word_1.json.

      [INFO]Generating the template file in /home/temp/sen_word.json succeeded.
      
    2. Edit the template file.
      vi /home/temp/sen_word.json
      
    3. Press i to enter insert mode and configure the sensitive word template.
      [
          {
              "word": "",
              "word_type": "regex",
              "word_note": ""
          },
          {
              "word": "",
              "word_type": "text",
              "word_note": ""
          }
      ]
      
      • word: sensitive word to scan for. Its content must match the type set in word_type.
      • word_type: type of the sensitive word. regex indicates a regular expression and text indicates plain text.
      • word_note: description of the sensitive word. This parameter is optional.
    4. Press Esc to exit insert mode. Type :wq! and press Enter to save the file and exit.
  • Running a sensitive information scan
    The following example scans /home/software/RuoYi-master/ using the specified sensitive word IDs (-sn) and a user-defined sensitive word template (-sf). Replace the directory with the actual one.
    devkit doctor sen-scan -i /home/software/RuoYi-master/ -sn 1,2,3 -sf /home/temp/sen_word.json
    

    The following information is displayed and a report is generated:

    [INFO]Start scan /home/software/RuoYi-master.
    [INFO]The scan is complete, starting to generate the report.
    Excel report is created successfully. Files are located in /usr/local/devkit/doctor/report/sen_scan/20240814101140
    

    A scan report is generated in both Chinese and English. The report contains three tab pages: Overview, Sensitive Words, and Details. It shows the scan path, the start and end times, whether the scan was stopped, sensitive word statistics, and details.
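
The template fields described in the "Generating a sensitive word template" example (word, word_type, and word_note) can be checked before the file is passed to -sf. The following is a minimal Python sketch under stated assumptions: validate_entries, write_template, and the example entries are hypothetical and are not part of DevKit. It also assumes that a pattern accepted by Python's re module is a reasonable sanity check, although this document does not state which regular expression syntax the tool uses.

```python
import json
import re

# word_type values shown in the generated template.
ALLOWED_TYPES = {"regex", "text"}

# Hypothetical entries for illustration; not defaults shipped with the tool.
EXAMPLE_ENTRIES = [
    {"word": r"INTERNAL-\d{6}", "word_type": "regex", "word_note": "internal ticket number"},
    {"word": "confidential", "word_type": "text", "word_note": ""},
]


def validate_entries(entries):
    """Return a list of problems found in the entries (empty if none)."""
    problems = []
    for index, entry in enumerate(entries):
        word = entry.get("word", "")
        word_type = entry.get("word_type")
        if not word:
            problems.append(f"entry {index}: 'word' is empty")
        if word_type not in ALLOWED_TYPES:
            problems.append(f"entry {index}: unknown word_type {word_type!r}")
        elif word_type == "regex" and word:
            try:
                # Assumption: Python's regex syntax is used only as a sanity check.
                re.compile(word)
            except re.error as exc:
                problems.append(f"entry {index}: invalid regex ({exc})")
    return problems


def write_template(entries, path):
    """Write entries in the same JSON layout as the generated template."""
    problems = validate_entries(entries)
    if problems:
        raise ValueError("; ".join(problems))
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(entries, handle, indent=4, ensure_ascii=False)


if __name__ == "__main__":
    # Prints an empty list when every entry passes the checks.
    print(validate_entries(EXAMPLE_ENTRIES))
```

For example, calling write_template(EXAMPLE_ENTRIES, "/home/temp/sen_word.json") would produce a file that can then be passed to the scan with -sf. word_note is left unchecked because it is optional.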