OmniData Configuration File
Table 1 describes the configuration items in the /home/omm/omnidata-install/omnidata/etc/jvm.config file.
| Category | Configuration Item | Default Value | Description |
|---|---|---|---|
| Memory limit | -Xmx | 1/4 of the physical memory | Sets the maximum heap size of the Java VM. |
| Memory limit | -Xms | 1/64 of the physical memory | Sets the initial heap size of the Java VM. |
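For reference, the defaults above correspond to a jvm.config fragment like the following. The absolute values are illustrative (they assume 64 GB of physical memory); size -Xmx and -Xms to your own machine.

```
-Xmx16G
-Xms1G
```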
Table 2 describes the configuration items in the /home/omm/omnidata-install/omnidata/etc/config.properties file.
| Category | Configuration Item | Default Value | Description |
|---|---|---|---|
| CPU limit | number.of.cpu.core | - | Number of CPU cores that OmniData can use. The startup script uses cgroups to limit OmniData process resources based on this value. |
| Maximum number of tasks | max.task.queue.size | - | Maximum number of tasks that OmniData accepts. Set it in direct proportion to the number of cores, choosing the multiple based on actual CPU performance. The default value is max(number of available processors x 4, 4). |
| Maximum task timeout | task.timeout.period | 120000 | OmniData task processing timeout, in milliseconds. The default value is 120,000. |
| Expression cache size | compile.expression-cache-size | 8192 | Number of compiled expressions to cache. |
| Compression | compression.enabled | false | Whether data is compressed. |
| Storage time zone | storage.timezone | - | Default server time zone. |
| Plugin | external-functions-plugin.dir | /home/omm/omnidata-install/omnidata/plugin | Directory containing plugins; each plugin is a folder in this directory. |
| Hive UDF plugin | function-namespace.dir | /home/omm/omnidata-install/omnidata/etc/function-namespace | Directory for the configuration file of the Hive UDF plugin. The directory must be etc/function-namespace under the installation directory. |
| Ceph/HDFS access | hdfs.config.resources | /home/omm/omnidata-install/omnidata/etc/hdfs-site.xml, /home/omm/omnidata-install/omnidata/etc/core-site.xml | Comma-separated paths of core-site.xml and hdfs-site.xml. In scenario 1 of Configuring OmniData, the two files are stored in the specified directory. |
| HDFS security mode | hdfs.authentication.type | NONE | HDFS authentication mode. The value can be NONE or KERBEROS. |
| Secure HDFS | hdfs.krb5.conf.path | - | Path of the krb5.conf file. When connecting to a secure HDFS cluster, configure krb5.conf, the keytab, and the principal. |
| | hdfs.krb5.keytab.path | - | Path of the keytab file. |
| | hdfs.krb5.principal | - | User principal. |
| | fs.hdfs.impl.disable.cache | false | Whether to disable the HDFS FileSystem cache. |
| Spark registration service | omnidata.zookeeper.heartbeat.enabled | true | Whether OmniData registers with ZooKeeper and reports status information. |
| ZooKeeper configuration | zookeeper.quorum.server | User-specified | IP address of the ZooKeeper server. |
| | zookeeper.namespace | sdi | Node name that OmniData registers with ZooKeeper. |
| | zookeeper.status.node | status | ZooKeeper directory under which OmniData stores pushdown information. |
| | zookeeper.connection.timeoutMs | 15000 | ZooKeeper connection timeout, in milliseconds. |
| | zookeeper.session.timeoutMs | 60000 | ZooKeeper session timeout, in milliseconds. |
| | zookeeper.retry.intervalMs | 1000 | Reconnection interval after a failed ZooKeeper connection, in milliseconds. |
| | omnidata.pushdown.threshold | 0.8f | Resource threshold for OmniData pushdown nodes. |
| | omnidata.status.update.interval | 3 | Update interval for OmniData pushdown node resource status, in seconds. |
| Secure ZooKeeper configuration | zookeeper.krb5.enabled | false | Whether the ZooKeeper krb5 security configuration is enabled. |
| | zookeeper.java.security.auth.login.config | - | Path of the ZooKeeper secure login (JAAS) configuration file. |
| | zookeeper.krb5.conf | - | Path of the krb5.conf file for ZooKeeper. When connecting to a secure ZooKeeper, configure krb5.conf, the keytab, and the principal. |
Configuring the Spark Registration Service and Secure ZooKeeper
Spark uses ZooKeeper to collect and manage OmniData node information, such as OmniData node names and their task counts. When OmniData connects to the Spark engine, configure the Spark registration service and secure ZooKeeper items listed in the preceding table.
The following steps show a typical configuration for OmniData to connect to the Spark engine.
- Open the config.properties configuration file:

  ```shell
  vi /home/omm/omnidata-install/omnidata/etc/config.properties
  ```

- Set the following parameters, save the settings, and exit:

  ```properties
  zookeeper.quorum.server=xxx.xxx.xxx.xxx:2181
  hdfs.config.resources=/home/omm/omnidata-install/omnidata/etc/hdfs-site.xml,/home/omm/omnidata-install/omnidata/etc/core-site.xml
  hdfs.authentication.type=KERBEROS
  external-functions-plugin.dir=/home/omm/omnidata-install/omnidata/plugin
  hdfs.krb5.conf.path=/home/omm/omnidata-install/omnidata/etc/krb5.conf
  hdfs.krb5.keytab.path=/home/omm/omnidata-install/omnidata/etc/hdfs.keytab
  hdfs.krb5.principal=hdfs/server1@EXAMPLE.COM
  omnidata.zookeeper.heartbeat.enabled=true
  zookeeper.krb5.enabled=true
  zookeeper.java.security.auth.login.config=/home/omm/omnidata-install/omnidata/etc/client_jaas.conf
  zookeeper.krb5.conf=/home/omm/omnidata-install/omnidata/etc/krb5.conf
  ```
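The settings above are plain key=value properties, so a quick sanity check before restarting OmniData is easy to script. The following is a hedged sketch, not an OmniData tool: it only verifies that the Kerberos-related keys from the preceding table are present whenever KERBEROS authentication or secure ZooKeeper is enabled.

```python
# Illustrative sanity check for config.properties (not part of OmniData).

def load_properties(text):
    """Parse key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props

def check_security(props):
    """Return the security-related keys that are required but unset."""
    missing = []
    if props.get("hdfs.authentication.type", "NONE").upper() == "KERBEROS":
        for key in ("hdfs.krb5.conf.path", "hdfs.krb5.keytab.path",
                    "hdfs.krb5.principal"):
            if not props.get(key):
                missing.append(key)
    if props.get("zookeeper.krb5.enabled", "false").lower() == "true":
        for key in ("zookeeper.java.security.auth.login.config",
                    "zookeeper.krb5.conf"):
            if not props.get(key):
                missing.append(key)
    return missing

sample = """
hdfs.authentication.type=KERBEROS
hdfs.krb5.conf.path=/home/omm/omnidata-install/omnidata/etc/krb5.conf
zookeeper.krb5.enabled=true
"""
print(check_security(load_properties(sample)))
# -> the keytab, principal, JAAS config, and ZooKeeper krb5.conf keys are missing
```

In practice you would read the real file with `open(...).read()` instead of the inline sample; an empty result means the Kerberos keys are at least present, not that their paths are valid.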