OmniData Configuration File
Table 1 describes the configuration items in the /home/omm/omnidata-install/omnidata/etc/jvm.config file.
| Category | Configuration Item | Default Value | Description |
|---|---|---|---|
| Memory limit | -Xmx | 1/4 of the physical memory | Maximum heap size of the Java virtual machine (JVM). |
| Memory limit | -Xms | 1/64 of the physical memory | Initial heap size of the JVM. |
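For illustration, on a server with 512 GB of physical memory the defaults above would correspond to a jvm.config like the following (the concrete values are an assumption for that memory size, not shipped defaults):

```
-Xmx128g
-Xms8g
```

Each line of jvm.config holds a single JVM command-line option.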
Table 2 describes the configuration items in the /home/omm/omnidata-install/omnidata/etc/config.properties file.
| Category | Configuration Item | Default Value | Description |
|---|---|---|---|
| CPU limit | number.of.cpu.core | - | Number of CPU cores that OmniData can use. The startup script uses cgroups to limit the resources of the OmniData process based on this value. |
| Maximum number of tasks | max.task.queue.size | 3000 | Maximum number of tasks that OmniData accepts. Scale it in proportion to the number of cores, choosing the multiplier based on actual CPU performance; the recommended value is the number of available CPU cores multiplied by 4. |
| Maximum task timeout | task.timeout.period | 120000 | Timeout for OmniData task processing, in milliseconds. The default is 120,000 (2 minutes). |
| Expression cache | compile.expression-cache-size | 8192 | Size of the compiled expression cache. |
| Compression | compression.enabled | false | Whether data is compressed. |
| Storage time zone | storage.timezone | - | Default server time zone. |
| Plugin | external-functions-plugin.dir | /home/omm/omnidata-install/omnidata/plugin | Plugin directory; each plugin is a subfolder of this directory. |
| Hive UDF plugin | function-namespace.dir | /home/omm/omnidata-install/omnidata/etc/function-namespace | Directory for the configuration file of the Hive UDF plugin. It must be etc/function-namespace under the installation directory. |
| Accessing Ceph/HDFS | hdfs.config.resources | /home/omm/omnidata-install/omnidata/etc/hdfs-site.xml, /home/omm/omnidata-install/omnidata/etc/core-site.xml | Comma-separated paths of hdfs-site.xml and core-site.xml. In scenario 1 of Configuring the Feature, the two files are stored in the specified directory. |
| HDFS security mode | hdfs.authentication.type | NONE | HDFS authentication mode. The value can be NONE or KERBEROS. |
| Configuring secure HDFS | hdfs.krb5.conf.path | - | Path of the krb5.conf file. When connecting to a secure HDFS cluster, configure krb5.conf, the keytab, and the principal. |
| | hdfs.krb5.keytab.path | - | Path of the keytab file. |
| | hdfs.krb5.principal | - | User principal. |
| | fs.hdfs.impl.disable.cache | false | Whether to disable the HDFS FileSystem cache. |
| Spark registration service | omnidata.zookeeper.heartbeat.enabled | true | Whether OmniData registers with ZooKeeper and reports status information. |
| ZooKeeper configuration | zookeeper.quorum.server | Entered by the user | IP address and port of the ZooKeeper server. |
| | zookeeper.namespace | sdi | Node name that OmniData registers with ZooKeeper. |
| | zookeeper.status.node | status | Node that OmniData registers with ZooKeeper for storing pushdown information. |
| | zookeeper.connection.timeoutMs | 15000 | ZooKeeper connection timeout, in milliseconds. |
| | zookeeper.session.timeoutMs | 60000 | ZooKeeper session timeout, in milliseconds. |
| | zookeeper.retry.intervalMs | 1000 | Interval between ZooKeeper reconnection attempts after a failure, in milliseconds. |
| | omnidata.pushdown.threshold | 0.8f | Resource usage threshold for OmniData pushdown nodes. |
| | omnidata.status.update.interval | 3 | Interval at which OmniData pushdown node resource status is updated, in seconds. |
| Secure ZooKeeper configuration | zookeeper.krb5.enabled | false | Whether the ZooKeeper krb5 security configuration is enabled. |
| | zookeeper.java.security.auth.login.config | - | Path of the ZooKeeper secure login (JAAS) configuration. |
| | zookeeper.krb5.conf | - | Path of the krb5.conf file for ZooKeeper. When connecting to a secure ZooKeeper, configure krb5.conf, the keytab, and the principal. |
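The sizing guidance for max.task.queue.size (available CPU cores multiplied by 4, with the multiplier adjusted for actual CPU performance) can be sketched as a small helper. This is an illustrative sketch, not part of OmniData; the function name is hypothetical:

```python
import os

def recommended_task_queue_size(cpu_cores=None, multiplier=4):
    """Return a recommended max.task.queue.size value.

    The table above recommends the number of available CPU cores
    multiplied by 4; tune the multiplier for the actual CPU performance.
    """
    if cpu_cores is None:
        # Fall back to the cores visible to this process.
        cpu_cores = os.cpu_count() or 1
    return cpu_cores * multiplier

# Example: a 32-core node gets a recommended queue size of 128.
print(recommended_task_queue_size(cpu_cores=32))
```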
Configuring the Spark Registration Service and Secure ZooKeeper
Spark uses ZooKeeper to collect and manage information about OmniData nodes so that the engine can detect the nodes and the number of tasks on each node. When OmniData connects to Spark, configure the Spark registration service and secure ZooKeeper items listed in the preceding table.
The following steps show a typical configuration for OmniData to connect to Spark.
- Open the config.properties configuration file:

  ```
  vi /home/omm/omnidata-install/omnidata/etc/config.properties
  ```

- Press i to enter insert mode and add the following configuration:

  ```
  zookeeper.quorum.server=xxx.xxx.xxx.xxx:2181
  hdfs.config.resources=/home/omm/omnidata-install/omnidata/etc/hdfs-site.xml,/home/omm/omnidata-install/omnidata/etc/core-site.xml
  hdfs.authentication.type=KERBEROS
  external-functions-plugin.dir=/home/omm/omnidata-install/omnidata/plugin
  hdfs.krb5.conf.path=/home/omm/omnidata-install/omnidata/etc/krb5.conf
  hdfs.krb5.keytab.path=/home/omm/omnidata-install/omnidata/etc/hdfs.keytab
  hdfs.krb5.principal=hdfs/server1@EXAMPLE.COM
  omnidata.zookeeper.heartbeat.enabled=true
  zookeeper.krb5.enabled=true
  zookeeper.java.security.auth.login.config=/home/omm/omnidata-install/omnidata/etc/client_jaas.conf
  zookeeper.krb5.conf=/home/omm/omnidata-install/omnidata/etc/krb5.conf
  ```

- Press Esc, type :wq!, and press Enter to save the file and exit.
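After editing, a short script can sanity-check that the Kerberos-related keys are all present when secure mode is enabled. This is a hedged sketch, not an OmniData tool; it only parses the simple key=value format of config.properties, and the helper names are hypothetical:

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

# Keys required for secure HDFS (first three) and secure ZooKeeper (last two).
REQUIRED_SECURE_KEYS = [
    "hdfs.krb5.conf.path",
    "hdfs.krb5.keytab.path",
    "hdfs.krb5.principal",
    "zookeeper.java.security.auth.login.config",
    "zookeeper.krb5.conf",
]

def missing_secure_keys(props):
    """Return secure-mode keys that are absent or empty."""
    missing = []
    if props.get("hdfs.authentication.type") == "KERBEROS":
        missing += [k for k in REQUIRED_SECURE_KEYS[:3] if not props.get(k)]
    if props.get("zookeeper.krb5.enabled") == "true":
        missing += [k for k in REQUIRED_SECURE_KEYS[3:] if not props.get(k)]
    return missing

# An intentionally incomplete sample: the keytab, principal, JAAS config,
# and ZooKeeper krb5.conf entries are missing.
sample = """
hdfs.authentication.type=KERBEROS
hdfs.krb5.conf.path=/home/omm/omnidata-install/omnidata/etc/krb5.conf
zookeeper.krb5.enabled=true
"""
print(missing_secure_keys(parse_properties(sample)))
```

Running the sketch against the full configuration shown in the steps above should report no missing keys.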