Creating an Index

An index table can be created in one of two modes. In without-data mode, the index table is created empty. In with-data mode, the index table is created and filled with index data. Before creating an index, create the data table, for example by running create 'tablename','cf_0','cf_1' in the HBase shell.

Without Data

Run the following command to create an index table. The newly created index table is empty; no index data is filled in.

hbase com.huawei.boostkit.hindex.mapreduce.GlobalTableIndexer -Dtablename.to.index=tablename -Dindexspecs.to.add='idx_0=>cf_0:[q_0->STRING],[q_1];cf_1:[q_0->STRING],[q_1]#idx_1=>cf_0:[q_1->STRING],[q_2];cf_1:[q_1->STRING],[q_2]' -Dindexspecs.covered.to.add='idx_0=>cf_0:[q_0];cf_1:[q_0]#idx_1=>cf_0:[q_0];cf_1:[q_0]'

**Table 1** Parameters
Parameter	Description
tablename.to.index	Namespace and table name of the data table for which the index is created.
indexspecs.to.add	Mapping between the index name and the columns of the corresponding data table. No data is filled in the index table.
indexspecs.covered.to.add (optional)	Covering column.

**Table 2** Parameter value description
Parameter Value	Description
idx_0, idx_1	Index name.
cf_0, cf_1	Column family name.
q_0, q_1, q_2	Column name (column qualifier).
STRING	Data type. It can be STRING, INTEGER, FLOAT, LONG, DOUBLE, SHORT, BYTE, or CHAR.
'#'	Separates indexes.
';'	Separates column families.
','	Separates column qualifiers.

Each column name and its data type must be enclosed in '[]'.
A column name and its data type are separated by '->'.
If no data type is specified for a column, the default data type (STRING) is used.
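The separator rules above can be sketched as a small bash script that assembles the value of indexspecs.to.add piece by piece. The join_by helper and the variable names are hypothetical and are not part of the tool; the script only builds and prints the string, it does not run the indexer.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the indexer): join arguments with a separator.
join_by() {
  local sep="$1"; shift
  local out="$1"; shift
  local part
  for part in "$@"; do
    out+="${sep}${part}"
  done
  printf '%s' "$out"
}

# Column qualifiers within one column family are joined with ','.
cf0="cf_0:$(join_by ',' '[q_0->STRING]' '[q_1]')"
cf1="cf_1:$(join_by ',' '[q_0->STRING]' '[q_1]')"

# Column families within one index are joined with ';'.
idx0="idx_0=>$(join_by ';' "$cf0" "$cf1")"
idx1="idx_1=>$(join_by ';' 'cf_0:[q_1->STRING],[q_2]' 'cf_1:[q_1->STRING],[q_2]')"

# Indexes are joined with '#'.
INDEX_SPECS="$(join_by '#' "$idx0" "$idx1")"
echo "$INDEX_SPECS"
```

The printed string is the same specification used in the command above. When passing it on the command line, keep it in single quotes so that the shell does not interpret '#', ';', or '>'.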

Suggestions on index column selection: An index-based query must include the first index column in its query condition. Therefore, choose as the first index column a column that is used as a query condition and has a high selection rate.
If a primary/secondary master switchover occurs during index creation, an error may be reported even though the creation may have completed successfully on the secondary master. Query the metadata to verify the result.
Indexes created on columns with high randomness improve query performance more significantly.

With Data

Run the following command to create an index table and fill it with index data. The result is a complete index table that already contains index data.

hbase com.huawei.boostkit.hindex.mapreduce.GlobalTableIndexer -Dtablename.to.index=tablename -Dindexspecs.to.addandbuild='idx_0=>cf_0:[q_0->STRING],[q_1];cf_1:[q_0->STRING],[q_1]#idx_1=>cf_0:[q_1->STRING],[q_2];cf_1:[q_1->STRING],[q_2]' -Dindexspecs.covered.to.add='idx_0=>cf_0:[q_0];cf_1:[q_0]#idx_1=>cf_0:[q_0];cf_1:[q_0]'

**Table 3** Parameters
Parameter	Description
tablename.to.index	Namespace and table name of the data table for which the index is created.
indexspecs.to.addandbuild	Mapping between the index name and the columns of the corresponding data table. The index table data is filled.
indexspecs.covered.to.add (optional)	Covering column.
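The two creation modes differ only in the property that carries the index specification: indexspecs.to.add for an empty index table and indexspecs.to.addandbuild for a filled one. A minimal bash sketch of a wrapper that selects the property follows. The index_property function and the mode names with-data and without-data are hypothetical; the script prints the command for review instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of the indexer): map a mode name to the property name.
index_property() {
  case "$1" in
    with-data)    printf '%s' 'indexspecs.to.addandbuild' ;;
    without-data) printf '%s' 'indexspecs.to.add' ;;
    *)            echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Index specification taken from the example command (idx_0 only).
SPEC='idx_0=>cf_0:[q_0->STRING],[q_1];cf_1:[q_0->STRING],[q_1]'
PROP="$(index_property with-data)"

# Build the command line as a string and print it so that it can be checked before use.
CMD="hbase com.huawei.boostkit.hindex.mapreduce.GlobalTableIndexer -Dtablename.to.index=tablename -D${PROP}='${SPEC}'"
echo "$CMD"
```

Printing the command first is a deliberate choice here: the specification contains characters that the shell treats specially, so checking the final quoting before execution avoids creating an index from a truncated specification.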

**Table 4** Parameter value description
Parameter Value	Description
idx_0, idx_1	Index name.
cf_0, cf_1	Column family name.
q_0, q_1, q_2	Column name (column qualifier).
STRING	Data type. It can be STRING, INTEGER, FLOAT, LONG, DOUBLE, SHORT, BYTE, or CHAR.
'#'	Separates indexes.
';'	Separates column families.
','	Separates column qualifiers.

Each column name and its data type must be enclosed in '[]'.
A column name and its data type are separated by '->'.
If no data type is specified for a column, the default data type (STRING) is used.
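Because an untyped column defaults to STRING, a specification with omitted types should be equivalent to one with every type written out. The following bash sketch makes the defaults explicit, which can help when reviewing a long specification. The explicit_types function is hypothetical and only rewrites the string; it does not call the indexer.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the indexer): write out the default STRING type.
explicit_types() {
  # Match '[name]' that contains no '->' and rewrite it as '[name->STRING]'.
  # Columns that already carry a type contain '>' and are left unchanged.
  printf '%s' "$1" | sed -E 's/\[([^]>]+)\]/[\1->STRING]/g'
}

SPEC='idx_0=>cf_0:[q_0->STRING],[q_1];cf_1:[q_0->STRING],[q_1]'
EXPLICIT="$(explicit_types "$SPEC")"
echo "$EXPLICIT"
```

In this sketch only the untyped [q_1] entries change; the entries that already specify STRING pass through as they are.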

Suggestions on index column selection: An index-based query must include the first index column in its query condition. Therefore, choose as the first index column a column that is used as a query condition and has a high selection rate.
If a primary/secondary master switchover occurs during index creation, an error may be reported even though the creation may have completed successfully on the secondary master. Query the metadata to verify the result.
Indexes created on columns with high randomness improve query performance more significantly.
