Constraints
OmniStream has restrictions on supported data types, operators, and state backends. Plan your tasks accordingly and avoid unsupported scenarios.
SQL
- OmniStream Flink Native supports the Nexmark benchmarking suite, including Nexmark data types and built-in functions.
- Supported data types: BIGINT, TIMESTAMP(3), and VARCHAR.
- Supported arithmetic operators: +, -, *, and /.
- Supported built-in functions: LOWER, SPLIT_INDEX, DATE_FORMAT, MOD, and COUNT_CHAR.
- Supported GroupAggregate functions: SUM(BIGINT), COUNT(BIGINT), AVG(BIGINT), MIN(BIGINT), MIN(VARCHAR), MAX(BIGINT), and MAX(VARCHAR).
- The Join operator supports only inner joins, and the join key must be of the BIGINT type.
- The Deduplicate and Rank operators support only ROW_NUMBER, and every field in the queried table must be of a supported data type. Partition By must be of the BIGINT type, and for Rank it is limited to a single BIGINT field. For TOPN, Order By supports only one BIGINT field, sorted in DESC order. For TOP1, Order By supports a maximum of two fields, each of the BIGINT or TIMESTAMP(3) type, sorted in DESC or ASC order.
- The Group By column of the Aggregate operator allows only BIGINT.
- The LocalWindowAGG and GlobalWindowAGG operators support only the COUNT and MAX aggregate functions. The GroupWindowAGG operator supports only COUNT.
- The LocalWindowAGG and GlobalWindowAGG operators support only the TUMBLE and HOP windows.
- The GroupWindowAGG operator supports only the SESSION window.
- The external table data source of the Lookup Join operator supports only CSV files.
- Only the memory and RocksDB state backends are supported.
- With the memory state backend, Flink keeps state in memory, so memory usage grows as the data volume increases. OmniStream uses a columnar, vectorized architecture to improve performance. Its state storage behaves the same as in native Flink, but because it processes data faster, it also consumes memory faster. As a result, the Nexmark benchmark test cases support a maximum of 50 million data records.
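The SQL constraints above can be illustrated with a few queries that stay within the listed subset. This is only a sketch: the table names bid and auction, their columns (loosely modeled on the Nexmark schema), and the assumption that both tables are already registered are illustrative and are not taken from OmniStream itself. Whether a given query is actually accelerated depends on the plan that Flink generates for it.

```sql
-- Hypothetical tables, assumed to be registered already:
--   bid(auction BIGINT, bidder BIGINT, price BIGINT, url VARCHAR, date_time TIMESTAMP(3))
--   auction(id BIGINT, seller BIGINT, item_name VARCHAR)

-- Projection that uses only the listed operators and built-in functions.
SELECT
  auction,
  price * 2                            AS doubled_price,
  MOD(bidder, 10)                      AS bidder_bucket,
  LOWER(url)                           AS url_lower,
  SPLIT_INDEX(url, '/', 3)             AS url_part,
  DATE_FORMAT(date_time, 'yyyy-MM-dd') AS bid_day
FROM bid;

-- Group aggregate: the Group By column is BIGINT and every
-- aggregate function takes a BIGINT argument.
SELECT
  auction,
  COUNT(price) AS bid_count,
  SUM(price)   AS total_price,
  MAX(price)   AS max_price
FROM bid
GROUP BY auction;

-- Inner join on a BIGINT join key.
SELECT a.id, a.item_name, b.price
FROM auction AS a
INNER JOIN bid AS b
  ON a.id = b.auction;

-- TOP1 via ROW_NUMBER: one BIGINT Partition By field and
-- one BIGINT Order By field.
SELECT auction, bidder, price, date_time
FROM (
  SELECT
    auction, bidder, price, date_time,
    ROW_NUMBER() OVER (PARTITION BY auction ORDER BY price DESC) AS rn
  FROM bid
)
WHERE rn = 1;
```

Queries that step outside the list, for example a join on a VARCHAR key or a Group By on a TIMESTAMP(3) column, fall under the unsupported scenarios described in this section.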
DataStream
For details, see Supported DataStream Operators and UDFs.
- The Source and Sink operators support only Kafka.
- These operators are supported: Map, FlatMap, GroupReduce, Filter, Source, and Sink.
- The Filter operator must be implemented as a RichFilterFunction.
- Only the memory state backend is supported, and checkpoints are not supported.