Checking and Repairing Index Consistency
Scenarios
Use the GSI tool to check whether index data is consistent with user data and, if it is not, to repair the index data.
Function
Do not perform Put or Delete operations while this function is running. Otherwise, the check or repair result may be incorrect.
The index data repair function is designed for exceptions that occur during Put or Delete operations. If an index table has been damaged manually, the repair may fail. In that case, you are advised to delete the index and re-create it. For example, the GSI tool treats an index column in the Verified state as normal, so if the covering column of that index column is modified manually, this function cannot undo the modification.
How to Use
Before using this function, edit the Hadoop configuration file mapred-site.xml and add the HBase lib directory to the mapreduce.application.classpath property. (By default, the OmniHBaseGSI JAR package is stored in /usr/local/hbase/lib. If the JAR package is stored in another directory, add that directory to the classpath as well.) Then restart Hadoop and HBase.
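Before restarting the services, it can help to confirm that the classpath change described above is actually present in mapred-site.xml. The following is a minimal sketch, not part of the GSI tool: the helper name classpath_has_dir is hypothetical, and the check is a plain text match on the property block rather than real XML parsing.

```shell
# Hypothetical helper (not shipped with OmniHBaseGSI).
# Succeeds if the mapreduce.application.classpath property in the given
# mapred-site.xml mentions the given directory; fails otherwise.
#   $1: path to mapred-site.xml
#   $2: directory to look for, for example /usr/local/hbase/lib
classpath_has_dir() {
  awk -v dir="$2" '
    # Start of the property block we care about.
    /<name>mapreduce.application.classpath<\/name>/ { in_prop = 1 }
    # Inside that block, look for the directory as a plain substring.
    in_prop && index($0, dir) { found = 1 }
    # End of the property block.
    in_prop && /<\/property>/ { in_prop = 0 }
    END { exit found ? 0 : 1 }
  ' "$1"
}
```

For example, `classpath_has_dir "$HADOOP_HOME/etc/hadoop/mapred-site.xml" /usr/local/hbase/lib && echo "classpath OK"` prints a confirmation only when the directory is listed. The configuration file location used here is an assumption and depends on your Hadoop installation.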
Run the following command to check data consistency. Because the -r option is specified, inconsistent index data is repaired after the check. The consistency check result is saved to the default:INDEX_CONSISTENCY_RECORD_TABLE and default:INDEX_CONSISTENCY_RECORD_METADATA tables.
hbase com.huawei.boostkit.hindex.mapreduce.consistency.GlobalHIndexConsistencyTool -dt table1 -it idx3 -src BOTH -r
Parameter description:
- -dt,--data-table: Name of the data table to be checked.
- -it,--index-table: Name of the index to be checked.
- Optional: -src,--source: Check mode, which is BOTH by default and can be any of the following values:
  - INDEX_TABLE_SOURCE: The index table is used as the source table.
  - DATA_TABLE_SOURCE: The data table is used as the source table.
  - BOTH: Both the index table and data table are used as the source tables.
- Optional: -r,--repair: Repairs the index data. If this parameter is specified, the index data is repaired after the check.
- Optional: -sc,--scan-caching: Scan caching size of a MapReduce task for consistency check or repair.
- Optional: -h,--help: Displays the available options of the GSI tool.
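The options above can be combined into a check-only run (without -r) followed by a repair run. The sketch below only prints command lines and HBase shell statements so that they can be reviewed before anything is executed. The helper names gsi_check_cmd and gsi_result_scans are hypothetical, the table and index names are the ones from the example above, and scanning the result tables with the HBase shell is an assumption about how you might inspect them, not a documented feature of the GSI tool.

```shell
TOOL_CLASS=com.huawei.boostkit.hindex.mapreduce.consistency.GlobalHIndexConsistencyTool

# Hypothetical helper: prints the GSI tool command line without running it.
#   $1: data table, $2: index name,
#   $3: check mode (default BOTH), $4: "yes" to append -r (default no)
gsi_check_cmd() {
  data_table=$1
  index_name=$2
  source_mode=${3:-BOTH}
  repair=${4:-no}
  cmd="hbase $TOOL_CLASS -dt $data_table -it $index_name -src $source_mode"
  if [ "$repair" = yes ]; then
    cmd="$cmd -r"
  fi
  printf '%s\n' "$cmd"
}

# Hypothetical helper: prints HBase shell scan statements for the two
# result tables, limited to the first $1 rows (default 10).
gsi_result_scans() {
  limit=${1:-10}
  for t in INDEX_CONSISTENCY_RECORD_TABLE INDEX_CONSISTENCY_RECORD_METADATA; do
    printf "scan 'default:%s', {LIMIT => %s}\n" "$t" "$limit"
  done
}
```

For example, `gsi_check_cmd table1 idx3 DATA_TABLE_SOURCE` prints a check-only command that uses the data table as the source, and `gsi_check_cmd table1 idx3 BOTH yes` prints the repair variant. The output of `gsi_result_scans 5` can be piped to `hbase shell -n` to look at the first few recorded results.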