Ingest data from NATS JetStream
You can ingest data from NATS JetStream into RisingWave by using the NATS source connector.
NATS is an open-source messaging system for cloud-native applications. It provides a lightweight publish-subscribe architecture for high-performance messaging.
NATS JetStream is a streaming data platform built on top of NATS. It enables real-time and historical access to streams of data via durable subscriptions and consumer groups.
PUBLIC PREVIEW
This feature is currently in public preview, meaning it is nearing the final product but may not yet be fully stable. If you encounter any issues or have feedback, please reach out to us via our Slack channel. Your input is valuable in helping us improve this feature. For more details, see our Public Preview Feature List.
Prerequisites
Before ingesting data from NATS JetStream into RisingWave, please ensure the following:
- The NATS JetStream server is running and accessible from your RisingWave cluster.
- If authentication is required for the NATS JetStream server, make sure you have the client username and password. The client user must have the subscribe permission for the subject.
- Create the NATS subject from which you want to ingest data.
- Ensure that your RisingWave cluster is running.
Ingest data into RisingWave
When creating a source, you can choose to persist the data from the source in RisingWave by using CREATE TABLE instead of CREATE SOURCE and specifying the connection settings and data format.
Syntax
schema_definition:
RisingWave performs primary key constraint checks on tables with connector settings but not on regular sources. If you need the checks to be performed, please create a table with connector settings.
For a table with primary key constraints, if a new data record with an existing key comes in, the new record will overwrite the existing record.
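As an illustration of the primary key behavior described above, here is a minimal sketch of a table with connector settings. The table name, column names, server address, subject, stream, and durable name are all hypothetical placeholders, and the connector = 'nats' setting and the FORMAT PLAIN ENCODE JSON clause are assumed to follow the same conventions as other RisingWave source connectors.

```sql
-- Hypothetical names throughout (orders, order_id, nats-server:4222, ORDERS, rworders).
CREATE TABLE IF NOT EXISTS orders (
  order_id BIGINT PRIMARY KEY,  -- a later record with the same order_id overwrites the earlier one
  status VARCHAR,
  amount DOUBLE PRECISION
)
WITH (
  connector = 'nats',                 -- assumed connector identifier
  server_url = 'nats-server:4222',    -- address:port of the JetStream server
  subject = 'orders',
  stream = 'ORDERS',
  connect_mode = 'plain',             -- no authentication
  consumer.durable_name = 'rworders'
) FORMAT PLAIN ENCODE JSON;
```

If the same statement were written with CREATE SOURCE, the primary key checks described above would not be performed.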
According to the NATS documentation, stream names must adhere to subject naming rules as well as being friendly to the file system. Here are the recommended guidelines for stream names:
- Use alphanumeric values.
- Avoid spaces, tabs, periods (.), greater than (>) or asterisks (*).
- Do not include path separators (forward slash or backward slash).
- Keep the name length limited to 32 characters as the JetStream storage directories include the account, stream name, and consumer name.
- Avoid using reserved file names like NUL or LPT1.
- Be cautious of case sensitivity in file systems. To prevent collisions, ensure that stream or account names do not clash due to case differences. For example, Foo and foo would collide on Windows or macOS systems.
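The guidelines above can be checked before a stream is created. The query below is a rough sketch in PostgreSQL-style SQL. The candidate names are made up, only the two reserved names mentioned above are checked, and it is an assumption that your SQL engine supports the ~ regular-expression operator.

```sql
-- Hypothetical candidate names; replace with the stream names you plan to use.
SELECT
  name,
  name ~ '^[A-Za-z0-9]+$' AS alphanumeric_only,       -- rejects spaces, tabs, '.', '>', '*', '/' and '\'
  length(name) <= 32 AS within_length,                -- recommended 32-character limit
  upper(name) NOT IN ('NUL', 'LPT1') AS not_reserved  -- only the two examples named above
FROM (VALUES ('risingwave'), ('orders.eu'), ('LPT1')) AS t(name);
```

Here the second name fails the alphanumeric check because of the period, and the third is flagged as a reserved file name. Case collisions such as Foo and foo still need to be checked against the names already in use.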
Parameters
Field | Notes |
---|---|
server_url | Required. URLs of the NATS JetStream server, in the format of address:port. If multiple addresses are specified, use commas to separate them. |
subject | Required. NATS subject that you want to ingest data from. To specify more than one subject, separate them with commas. |
stream | Required. NATS stream that you want to ingest data from. |
connect_mode | Required. Authentication mode for the connection. Allowed values: plain (no authentication); user_and_password (user name and password authentication, for which username and password must be specified); credential (JSON Web Token (JWT) and NKeys authentication, for which jwt and nkey must be specified). |
jwt and nkey | Conditional. JWT and NKEY for authentication. Required when connect_mode is credential. For details, see JWT and NKeys. |
username and password | Conditional. The client user name and password. Required when connect_mode is user_and_password. |
scan.startup.mode | Optional. The offset mode that RisingWave will use to consume data. The supported modes include earliest and timestamp (which requires scan.startup.timestamp.millis). If not specified, earliest will be used. |
scan.startup.timestamp.millis | Conditional. Required when scan.startup.mode is timestamp. RisingWave will start to consume data from the specified UNIX timestamp. |
data_encode | Supported encodes: JSON, PROTOBUF, BYTES. |
consumer.deliver_subject | Optional. Subject to deliver messages to. |
consumer.durable_name | Required. Durable name for the consumer. |
consumer.name | Optional. Name of the consumer. |
consumer.description | Optional. Description of the consumer. |
consumer.deliver_policy | Optional. Policy on how messages are delivered. |
consumer.ack_policy | Optional. Acknowledgment policy for message processing (e.g., None, All, Explicit). |
consumer.ack_wait.sec | Optional. Time to wait for acknowledgment before considering a message as undelivered. |
consumer.max_deliver | Optional. Maximum number of times a message will be delivered. |
consumer.filter_subject | Optional. Filter for subjects that the consumer will process. |
consumer.filter_subjects | Optional. List of subjects that the consumer will filter on. |
consumer.replay_policy | Optional. Policy for replaying messages (e.g., Instant, Original). |
consumer.rate_limit | Optional. Rate limit for message delivery in bits per second. |
consumer.sample_frequency | Optional. Frequency for sampling messages, ranging from 0 to 100. |
consumer.max_waiting | Optional. Maximum number of pull requests that can be waiting on the consumer. |
consumer.max_ack_pending | Optional. Maximum number of delivered messages that can be outstanding without acknowledgment. |
consumer.headers_only | Optional. If true, only message headers will be delivered. |
consumer.max_batch | Optional. Maximum number of messages to process in a single batch. |
consumer.max_bytes | Optional. Maximum number of bytes to receive in a single batch. |
consumer.max_expires.sec | Optional. Maximum time in seconds that a single pull request waits for messages before expiring. |
consumer.inactive_threshold.sec | Optional. Time in seconds before a consumer is considered inactive. |
consumer.num.replicas | Optional. Number of replicas for the consumer. |
consumer.memory_storage | Optional. If true, the consumer state will be kept in memory rather than inheriting the storage type of the stream. |
consumer.backoff.sec | Optional. Backoff time in seconds for retrying message delivery. |
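To show how the authentication and startup parameters combine, here is a hedged sketch of a source that connects with a user name and password and starts consuming from a fixed timestamp. The source name, columns, server addresses, subjects, stream, credentials, and durable name are hypothetical, and the connector = 'nats' setting and the FORMAT PLAIN ENCODE JSON clause are assumptions carried over from other RisingWave connectors.

```sql
-- Hypothetical names and credentials; substitute your own values.
CREATE SOURCE IF NOT EXISTS sensor_readings (
  sensor_id VARCHAR,
  reading DOUBLE PRECISION
)
WITH (
  connector = 'nats',                                  -- assumed connector identifier
  server_url = 'nats-1:4222,nats-2:4222',              -- multiple addresses separated by commas
  subject = 'sensors.temperature,sensors.humidity',    -- multiple subjects separated by commas
  stream = 'SENSORS',
  connect_mode = 'user_and_password',                  -- requires username and password
  username = 'rw_reader',
  password = '<your password>',
  scan.startup.mode = 'timestamp',                     -- requires scan.startup.timestamp.millis
  scan.startup.timestamp.millis = '1700000000000',     -- UNIX time in milliseconds (14 November 2023 UTC)
  consumer.durable_name = 'rwsensors'
) FORMAT PLAIN ENCODE JSON;
```

With connect_mode set to credential, the username and password lines would be replaced by jwt and nkey.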
Examples
The following SQL query creates a table that ingests data from a NATS JetStream source.
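A minimal sketch of such a statement, using only parameters from the table above, might look like this. All names and values are illustrative placeholders, the connector = 'nats' setting and the FORMAT PLAIN ENCODE JSON clause are assumptions, and passing the consumer settings as quoted strings is also an assumption.

```sql
-- Hypothetical table, subject, stream, and consumer values.
CREATE TABLE IF NOT EXISTS live_stream_metrics (
  client_ip VARCHAR,
  video_bitrate BIGINT,
  report_time TIMESTAMP
)
WITH (
  connector = 'nats',                          -- assumed connector identifier
  server_url = 'nats-server:4222',
  subject = 'livestream',
  stream = 'risingwave',
  connect_mode = 'plain',
  consumer.durable_name = 'demo1',             -- required
  consumer.description = 'RisingWave ingest',  -- optional settings below
  consumer.ack_wait.sec = '10',                -- seconds to wait for an acknowledgment
  consumer.max_deliver = '10'                  -- maximum delivery attempts per message
) FORMAT PLAIN ENCODE JSON;
```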
The parameters supported by the async_nats crate are all supported in the RisingWave NATS source connector.