OmniData Configuration File

Table 1 describes the configuration items in the /home/omm/omnidata-install/omnidata/etc/jvm.config file.

Table 1 Configuration items in jvm.config

| Category | Configuration Item | Default Value | Description |
| --- | --- | --- | --- |
| Memory limit | -Xmx | 1/4 of the physical memory | Maximum heap size of the Java VM. |
| Memory limit | -Xms | 1/64 of the physical memory | Initial heap size of the Java VM. |
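
To make the defaults in Table 1 concrete, the following Python sketch computes the heap sizes that apply when neither flag is set in jvm.config. The function name default_heap_sizes is hypothetical, and the sketch assumes that the 1/4 and 1/64 fractions apply directly to the total physical memory (integer division); the JVM may round or cap these values differently.

```python
from typing import Tuple

GIB = 1024 ** 3  # bytes in one gibibyte


def default_heap_sizes(physical_bytes: int) -> Tuple[int, int]:
    """Return (max_heap, initial_heap) in bytes, following the Table 1 defaults.

    Hypothetical helper for illustration only; it is not part of OmniData.
    """
    if physical_bytes <= 0:
        raise ValueError("physical_bytes must be positive")
    max_heap = physical_bytes // 4       # -Xmx default: 1/4 of physical memory
    initial_heap = physical_bytes // 64  # -Xms default: 1/64 of physical memory
    return max_heap, initial_heap


if __name__ == "__main__":
    xmx, xms = default_heap_sizes(64 * GIB)
    print(f"-Xmx is about {xmx // GIB} GiB, -Xms is about {xms // GIB} GiB")
```

If these defaults do not suit the host, explicit values (for example, -Xmx16G, using standard JVM size syntax) can be written to jvm.config instead.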

Table 2 describes the configuration items in the /home/omm/omnidata-install/omnidata/etc/config.properties file.

Table 2 Configuration items in config.properties

| Category | Configuration Item | Default Value | Description |
| --- | --- | --- | --- |
| CPU limit | number.of.cpu.core | - | Number of CPU cores that OmniData can use. The startup script uses cgroups to limit OmniData process resources based on this value. To set it: (1) Add number.of.cpu.core=xxx to the /home/omm/omnidata-install/omnidata/etc/config.properties file to set the maximum number of cores the process can use. (2) Restart the OmniData service. |
| Maximum number of tasks | max.task.queue.size | - | Maximum number of tasks that OmniData accepts. Set the value in proportion to the number of CPU cores, choosing the multiple based on actual CPU performance. The default value is max(number of available processors x 4, 4). |
| Maximum timeout duration of a task | task.timeout.period | 120000 | Timeout duration of OmniData task processing, in milliseconds. |
| Size of the expression cache | compile.expression-cache-size | 8192 | Number of expressions that can be cached. |
| Compression | compression.enabled | false | Indicates whether data is compressed. |
| Storage time zone | storage.timezone | - | Default server time zone. |
| Plugin | external-functions-plugin.dir | /home/omm/omnidata-install/omnidata/plugin | Plugin directory. Each plugin is stored as a folder in this directory. |
| UDF plugin for Hive | function-namespace.dir | /home/omm/omnidata-install/omnidata/etc/function-namespace | Directory for storing the configuration file of the Hive UDF plugin. The directory must be under etc/function-namespace in the installation directory. |
| Accessing Ceph/HDFS | hdfs.config.resources | /home/omm/omnidata-install/omnidata/etc/hdfs-site.xml,/home/omm/omnidata-install/omnidata/etc/core-site.xml | Paths of core-site.xml and hdfs-site.xml, separated by a comma (,). In scenario 1 of Configuring OmniData, the two files are stored in the specified directory. |
| HDFS security mode | hdfs.authentication.type | NONE | HDFS authentication mode. The value can be NONE or KERBEROS. |
| Secure HDFS configuration | hdfs.krb5.conf.path | - | Path of the krb5.conf file. When connecting to a secure HDFS cluster, configure krb5.conf, keytab, and principal. |
| Secure HDFS configuration | hdfs.krb5.keytab.path | - | Path of the keytab file. |
| Secure HDFS configuration | hdfs.krb5.principal | - | User principal. |
| Secure HDFS configuration | fs.hdfs.impl.disable.cache | false | Indicates whether to disable caching of HDFS FileSystem instances. |
| Spark registration service | omnidata.zookeeper.heartbeat.enabled | true | Indicates whether OmniData registers with ZooKeeper and sends status information. |
| ZooKeeper configuration | zookeeper.quorum.server | User-specified | IP address of the ZooKeeper server. |
| ZooKeeper configuration | zookeeper.namespace | sdi | OmniData node name registered with ZooKeeper. |
| ZooKeeper configuration | zookeeper.status.node | status | Directory that OmniData registers with ZooKeeper for storing pushdown information. |
| ZooKeeper configuration | zookeeper.connection.timeoutMs | 15000 | ZooKeeper connection timeout, in milliseconds. |
| ZooKeeper configuration | zookeeper.session.timeoutMs | 60000 | ZooKeeper session timeout, in milliseconds. |
| ZooKeeper configuration | zookeeper.retry.intervalMs | 1000 | Interval between ZooKeeper reconnection attempts after a failure, in milliseconds. |
| ZooKeeper configuration | omnidata.pushdown.threshold | 0.8f | Resource threshold of OmniData pushdown nodes. |
| ZooKeeper configuration | omnidata.status.update.interval | 3 | Interval for updating OmniData pushdown node resource information, in seconds. |
| Secure ZooKeeper configuration | zookeeper.krb5.enabled | false | Indicates whether the ZooKeeper krb5 security configuration is enabled. |
| Secure ZooKeeper configuration | zookeeper.java.security.auth.login.config | - | Path of the ZooKeeper secure login configuration file. |
| Secure ZooKeeper configuration | zookeeper.krb5.conf | - | Path of the krb5.conf file of ZooKeeper. When connecting to a secure ZooKeeper, configure krb5.conf, keytab, and principal. |
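
The default for max.task.queue.size is given as a formula, so a small worked example may help. The Python sketch below evaluates max(number of available processors x 4, 4); the function name default_max_task_queue_size is hypothetical, and using os.cpu_count() as the processor count is an assumption (OmniData itself runs on the JVM and may count processors differently, for example under a cgroup limit).

```python
import os


def default_max_task_queue_size(available_processors: int) -> int:
    """Evaluate the documented default: max(available processors x 4, 4).

    Hypothetical helper for illustration only; it is not part of OmniData.
    """
    return max(available_processors * 4, 4)


if __name__ == "__main__":
    # Assumption: the local CPU count stands in for "available processors".
    cores = os.cpu_count() or 1
    print(f"{cores} processors -> queue size {default_max_task_queue_size(cores)}")
```

With 8 available processors the default works out to 32; the floor of 4 only matters when no processor count is detected.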

Configuring the Spark Registration Service and Secure ZooKeeper

Spark uses ZooKeeper to collect and manage OmniData node information, such as OmniData node names and the number of tasks on each node. When OmniData connects to the Spark engine, configure the Spark registration service and secure ZooKeeper items listed in Table 2.

The following steps show a typical configuration for OmniData to connect to the Spark engine.

  1. Open the config.properties configuration file.
    vi /home/omm/omnidata-install/omnidata/etc/config.properties
    
  2. Set the following parameters, save the settings, and exit:
    zookeeper.quorum.server=xxx.xxx.xxx.xxx:2181
    hdfs.config.resources=/home/omm/omnidata-install/omnidata/etc/hdfs-site.xml,/home/omm/omnidata-install/omnidata/etc/core-site.xml
    hdfs.authentication.type=KERBEROS
    external-functions-plugin.dir=/home/omm/omnidata-install/omnidata/plugin
    hdfs.krb5.conf.path=/home/omm/omnidata-install/omnidata/etc/krb5.conf
    hdfs.krb5.keytab.path=/home/omm/omnidata-install/omnidata/etc/hdfs.keytab
    hdfs.krb5.principal=hdfs/server1@EXAMPLE.COM
    omnidata.zookeeper.heartbeat.enabled=true
    zookeeper.krb5.enabled=true
    zookeeper.java.security.auth.login.config=/home/omm/omnidata-install/omnidata/etc/client_jaas.conf
    zookeeper.krb5.conf=/home/omm/omnidata-install/omnidata/etc/krb5.conf
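
Before restarting the service, it can be useful to check that the secure settings above are complete. The following Python sketch is one possible way to do that; it is not an OmniData tool. The helper names parse_properties and missing_secure_settings are hypothetical, the parser handles only simple key=value lines (no line continuations or ":" separators), and the required-key rules are an assumption derived from the descriptions in Table 2.

```python
# Keys that Table 2 says to configure for each secure mode (assumed mapping).
HDFS_KERBEROS_KEYS = (
    "hdfs.krb5.conf.path",
    "hdfs.krb5.keytab.path",
    "hdfs.krb5.principal",
)
ZK_KERBEROS_KEYS = (
    "zookeeper.java.security.auth.login.config",
    "zookeeper.krb5.conf",
)


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_secure_settings(props):
    """Return the keys that are required by the enabled modes but not set."""
    missing = []
    # Defaults below follow the Default Value column of Table 2.
    if props.get("hdfs.authentication.type", "NONE").upper() == "KERBEROS":
        missing += [k for k in HDFS_KERBEROS_KEYS if not props.get(k)]
    if props.get("zookeeper.krb5.enabled", "false").lower() == "true":
        missing += [k for k in ZK_KERBEROS_KEYS if not props.get(k)]
    if props.get("omnidata.zookeeper.heartbeat.enabled", "true").lower() == "true":
        if not props.get("zookeeper.quorum.server"):
            missing.append("zookeeper.quorum.server")
    return missing


SAMPLE = """
zookeeper.quorum.server=xxx.xxx.xxx.xxx:2181
hdfs.authentication.type=KERBEROS
hdfs.krb5.conf.path=/home/omm/omnidata-install/omnidata/etc/krb5.conf
hdfs.krb5.keytab.path=/home/omm/omnidata-install/omnidata/etc/hdfs.keytab
hdfs.krb5.principal=hdfs/server1@EXAMPLE.COM
omnidata.zookeeper.heartbeat.enabled=true
zookeeper.krb5.enabled=true
zookeeper.java.security.auth.login.config=/home/omm/omnidata-install/omnidata/etc/client_jaas.conf
zookeeper.krb5.conf=/home/omm/omnidata-install/omnidata/etc/krb5.conf
"""

if __name__ == "__main__":
    problems = missing_secure_settings(parse_properties(SAMPLE))
    print("missing keys:", problems if problems else "none")
```

The sketch only checks that the keys are present; it does not verify that the referenced files exist or that the principal is valid.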