OmniData Configuration File

Table 1 describes the configuration items in the /home/omm/omnidata-install/omnidata/etc/jvm.config file.

Table 1 Configuration items in jvm.config

| Category | Configuration Item | Default Value | Description |
| --- | --- | --- | --- |
| Memory limit | -Xmx | 1/4 of the physical memory | Maximum heap size of the Java virtual machine (JVM). |
| Memory limit | -Xms | 1/64 of the physical memory | Initial heap size of the JVM. |
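To see what the Table 1 defaults amount to on a particular host, the two fractions can be evaluated against the physical memory size. The following is a minimal sketch, assuming a Linux host that reports MemTotal (in KiB) in /proc/meminfo. The helper name default_heap_mib is hypothetical and is not part of OmniData; the values are printed with the m (MiB) size suffix that the JVM accepts for -Xmx and -Xms.

```shell
# Hypothetical helper (not part of OmniData): prints the Table 1 default
# heap sizes, in MiB, for a given amount of physical memory in KiB.
default_heap_mib() {
    mem_kib=$1
    echo "-Xmx$(( mem_kib / 4 / 1024 ))m"    # 1/4 of the physical memory
    echo "-Xms$(( mem_kib / 64 / 1024 ))m"   # 1/64 of the physical memory
}

# On Linux, MemTotal in /proc/meminfo is reported in KiB.
if [ -r /proc/meminfo ]; then
    default_heap_mib "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
fi
```

For a host with 64 GiB of physical memory, this yields -Xmx16384m and -Xms1024m. Treat the result as an estimate of the defaults, not as a tuning recommendation.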

Table 2 describes the configuration items in the /home/omm/omnidata-install/omnidata/etc/config.properties file.

Table 2 Configuration items in config.properties

| Category | Configuration Item | Default Value | Description |
| --- | --- | --- | --- |
| CPU limit | number.of.cpu.core | - | Number of CPU cores that OmniData can use. The startup script uses cgroup to limit the resources of the OmniData process based on this value. To set it: (1) Add number.of.cpu.core=xxx to the /home/omm/omnidata-install/omnidata/etc/config.properties file to set the maximum number of cores that the process can use. (2) Restart OmniData. |
| Maximum number of tasks | max.task.queue.size | 3000 | Maximum number of tasks that OmniData can receive. Set the value in proportion to the number of CPU cores, choosing the multiplier based on actual CPU performance. The recommended value is the number of available CPU cores multiplied by 4. |
| Maximum timeout duration of a task | task.timeout.period | 120000 | Timeout duration for OmniData task processing, in milliseconds. |
| Expression cache | compile.expression-cache-size | 8192 | Size of the expression cache. |
| Compression | compression.enabled | false | Indicates whether data is compressed. |
| Storage time zone | storage.timezone | - | Default server time zone. |
| Plugin | external-functions-plugin.dir | /home/omm/omnidata-install/omnidata/plugin | Plugin directory. Each plugin is stored as a folder in this directory. |
| Hive UDF plugin | function-namespace.dir | /home/omm/omnidata-install/omnidata/etc/function-namespace | Directory for storing the configuration file of the Hive UDF plugin. The directory must be under etc/function-namespace in the installation directory. |
| Accessing Ceph/HDFS | hdfs.config.resources | /home/omm/omnidata-install/omnidata/etc/hdfs-site.xml,/home/omm/omnidata-install/omnidata/etc/core-site.xml | Paths of core-site.xml and hdfs-site.xml, separated by a comma (,). In scenario 1 of Configuring the Feature, the two files are stored in the specified directory. |
| HDFS secure mode | hdfs.authentication.type | NONE | HDFS authentication mode. The value can be NONE or KERBEROS. |
| Configuring secure HDFS | hdfs.krb5.conf.path | - | Path of the krb5.conf file. When connecting to a secure HDFS cluster, configure krb5.conf, keytab, and principal. |
| | hdfs.krb5.keytab.path | - | Path of the keytab file. |
| | hdfs.krb5.principal | - | User principal. |
| | fs.hdfs.impl.disable.cache | false | Indicates whether to disable caching of HDFS FileSystem instances. |
| Spark registration service | omnidata.zookeeper.heartbeat.enabled | true | Indicates whether OmniData registers with ZooKeeper and sends status information. |
| ZooKeeper configuration | zookeeper.quorum.server | User-specified | IP address of the ZooKeeper server, with the port (for example, xxx.xxx.xxx.xxx:2181). |
| | zookeeper.namespace | sdi | OmniData node name registered with ZooKeeper. |
| | zookeeper.status.node | status | Directory that OmniData registers with ZooKeeper for storing pushdown information. |
| | zookeeper.connection.timeoutMs | 15000 | Timeout duration of a ZooKeeper connection, in milliseconds. |
| | zookeeper.session.timeoutMs | 60000 | Timeout duration of a ZooKeeper session, in milliseconds. |
| | zookeeper.retry.intervalMs | 1000 | Interval for reconnecting to ZooKeeper after a failure, in milliseconds. |
| | omnidata.pushdown.threshold | 0.8f | Threshold of OmniData pushdown node resources. |
| | omnidata.status.update.interval | 3 | Interval for updating OmniData pushdown node resources, in seconds. |
| Secure ZooKeeper configuration | zookeeper.krb5.enabled | false | Indicates whether the ZooKeeper krb5 security configuration is enabled. |
| | zookeeper.java.security.auth.login.config | - | Path of the ZooKeeper secure login configuration file. |
| | zookeeper.krb5.conf | - | Path of the krb5.conf file of ZooKeeper. When connecting to a secure ZooKeeper, configure krb5.conf, keytab, and principal. |
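The recommendation in Table 2 for max.task.queue.size (the number of available CPU cores multiplied by 4) can be derived directly from the value chosen for number.of.cpu.core. The following is a minimal sketch, assuming the core limit is passed in explicitly. The function name suggest_cpu_settings is hypothetical and not part of OmniData, and its output is only a suggestion to review before adding it to config.properties.

```shell
# Hypothetical helper (not part of OmniData): prints suggested values for
# number.of.cpu.core and max.task.queue.size for a given core count.
suggest_cpu_settings() {
    cores=$1
    echo "number.of.cpu.core=${cores}"
    # Recommended queue size from Table 2: available cores multiplied by 4.
    echo "max.task.queue.size=$(( cores * 4 ))"
}

# Example: limit OmniData to 16 cores.
# Prints number.of.cpu.core=16 and max.task.queue.size=64.
suggest_cpu_settings 16
```

The multiplier of 4 is the documented starting point; adjust it based on actual CPU performance, and restart OmniData after changing either item.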

Configuring the Spark Registration Service and Secure ZooKeeper

Spark uses ZooKeeper to collect and manage information about OmniData nodes so that the engine can detect the nodes and the number of tasks running on each node. To connect OmniData to Spark, configure the Spark registration service items and the secure ZooKeeper items listed in Table 2.

The following steps show a typical configuration for OmniData to connect to Spark.

  1. Open the config.properties configuration file.
    vi /home/omm/omnidata-install/omnidata/etc/config.properties
    
  2. Press i to enter insert mode and configure the following items:
    zookeeper.quorum.server=xxx.xxx.xxx.xxx:2181
    hdfs.config.resources=/home/omm/omnidata-install/omnidata/etc/hdfs-site.xml,/home/omm/omnidata-install/omnidata/etc/core-site.xml
    hdfs.authentication.type=KERBEROS
    external-functions-plugin.dir=/home/omm/omnidata-install/omnidata/plugin
    hdfs.krb5.conf.path=/home/omm/omnidata-install/omnidata/etc/krb5.conf
    hdfs.krb5.keytab.path=/home/omm/omnidata-install/omnidata/etc/hdfs.keytab
    hdfs.krb5.principal=hdfs/server1@EXAMPLE.COM
    omnidata.zookeeper.heartbeat.enabled=true
    zookeeper.krb5.enabled=true
    zookeeper.java.security.auth.login.config=/home/omm/omnidata-install/omnidata/etc/client_jaas.conf
    zookeeper.krb5.conf=/home/omm/omnidata-install/omnidata/etc/krb5.conf
    
  3. Press Esc, type :wq!, and press Enter to save the file and exit.
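
After saving the file, it can help to confirm that the Kerberos-related items were all set, because Table 2 states that krb5.conf, keytab, and principal must be configured for a secure HDFS cluster, and the ZooKeeper security items apply when zookeeper.krb5.enabled is true. The following is a minimal sketch of such a check, assuming simple key=value lines without leading whitespace. The functions get_prop and check_secure_props are hypothetical helpers, not OmniData tools.

```shell
# Hypothetical helper: prints the value of key $2 in properties file $1.
# If the key appears more than once, the last assignment wins.
get_prop() {
    awk -F= -v k="$2" '$1 == k { sub(/^[^=]*=/, ""); v = $0 } END { print v }' "$1"
}

# Hypothetical helper: reports Kerberos-related items that are still empty.
# Returns 0 if nothing is missing, 1 otherwise.
check_secure_props() {
    conf=$1
    missing=0
    if [ "$(get_prop "$conf" hdfs.authentication.type)" = "KERBEROS" ]; then
        for key in hdfs.krb5.conf.path hdfs.krb5.keytab.path hdfs.krb5.principal; do
            if [ -z "$(get_prop "$conf" "$key")" ]; then
                echo "missing: $key"
                missing=1
            fi
        done
    fi
    if [ "$(get_prop "$conf" zookeeper.krb5.enabled)" = "true" ]; then
        for key in zookeeper.java.security.auth.login.config zookeeper.krb5.conf; do
            if [ -z "$(get_prop "$conf" "$key")" ]; then
                echo "missing: $key"
                missing=1
            fi
        done
    fi
    return "$missing"
}

# Run the check against the file edited in the steps above, if it exists.
props=/home/omm/omnidata-install/omnidata/etc/config.properties
if [ -f "$props" ]; then
    check_secure_props "$props"
fi
```

The check only verifies that the items are present. It does not validate that the referenced files exist or that the principal can authenticate.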