This guide describes how to sink data from RisingWave to Cassandra or ScyllaDB using the Cassandra sink connector.

PUBLIC PREVIEW

This feature is currently in public preview, meaning it is nearing the final product but may not yet be fully stable. If you encounter any issues or have feedback, please reach out to us via our Slack channel. Your input is valuable in helping us improve this feature. For more details, see our Public Preview Feature List.

Prerequisites

  • Ensure your Cassandra or ScyllaDB cluster is accessible from RisingWave.
  • If you are running RisingWave locally from binaries and intend to use the native CDC source connectors or the JDBC sink connector, make sure that JDK 11 or later is installed in your environment.

Syntax

To sink data to Cassandra or ScyllaDB, create a Cassandra sink in RisingWave using the syntax below:

CREATE SINK [ IF NOT EXISTS ] sink_name
[FROM sink_from | AS select_query]
WITH (
    connector = 'cassandra',
    type = '<type>',
    cassandra.url = '<node1>,<node2>,<node3>',
    cassandra.keyspace = '<keyspace>',
    cassandra.table = '<cassandra_table>',
    cassandra.datacenter = '<data_center>'
);

Once the sink is created, data changes will be streamed to the specified table.
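
As an illustration of the syntax above, here is a minimal sketch of an append-only sink. All names and addresses are hypothetical: orders_mv stands for an existing materialized view, demo_keyspace and orders_by_id for a keyspace and table in your cluster, and the two node addresses assume the default CQL native port 9042. The host:port form of cassandra.url is also an assumption; adjust it to match your deployment.

```sql
-- Hypothetical names: orders_mv, demo_keyspace, orders_by_id, and the node addresses.
CREATE SINK IF NOT EXISTS orders_cassandra_sink
FROM orders_mv
WITH (
    connector = 'cassandra',
    type = 'append-only',
    -- Needed only if orders_mv is not append-only (see force_append_only under Parameters).
    force_append_only = 'true',
    -- Comma-separated list of nodes; 9042 is the default CQL native port.
    cassandra.url = '10.0.0.1:9042,10.0.0.2:9042',
    cassandra.keyspace = 'demo_keyspace',
    cassandra.table = 'orders_by_id',
    cassandra.datacenter = 'datacenter1'
);
```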

Parameters

Parameter name | Description
sink_name | Name of the sink to be created.
sink_from | A clause that specifies the direct source from which data will be output. sink_from can be a materialized view or a table. Either this clause or a select_query must be specified.
AS select_query | A SELECT query that specifies the data to be output to the sink. Either this query or a sink_from clause must be specified. See SELECT for the syntax and examples of the SELECT command.
type | Required. Specifies whether the sink is upsert or append-only. If creating an upsert sink, you must specify a primary key.
primary_key | A comma-separated string of column names that specifies the primary key of the Cassandra sink. Optional for append-only sinks; required when type is upsert.
force_append_only | If true, forces the sink to be append-only, even if the data being sunk is not append-only.
cassandra.url | Required. The URL or IP address of the Cassandra or ScyllaDB cluster or node you want to connect to. Separate multiple nodes with commas.
cassandra.keyspace | Required. The name of the keyspace in Cassandra or ScyllaDB where you want to store the data. A keyspace is a logical container for organizing data in Cassandra.
cassandra.table | Required. The name of the table in the specified keyspace where you want to insert or update the data.
cassandra.datacenter | Required. The name of the datacenter in the Cassandra or ScyllaDB cluster. The name is configured on the Cassandra or ScyllaDB side. If not specified, the default value is datacenter1.
cassandra.max_batch_rows | Optional. The number of rows sent per batch. The value must be between 1 and 65535. The default value is 512.
cassandra.request_timeout_ms | Optional. The timeout for each batch, in milliseconds. The default value is 2000. Try reducing the batch size before changing the timeout.
cassandra.username | Optional. The username for Cassandra login. Ensure that the user has the necessary permissions.
cassandra.password | Optional. The password for Cassandra login. Ensure that the user has the necessary permissions.
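
The optional parameters can be combined with the required ones in the same WITH clause. The following is a sketch of an upsert sink with authentication and batch tuning, not a recommended configuration. The table user_profiles, its columns, the keyspace, the credentials, and the node address are all hypothetical, and the batch size of 256 is an arbitrary value chosen only to show a setting below the default of 512.

```sql
-- Hypothetical names and credentials throughout; replace them with your own.
CREATE SINK IF NOT EXISTS user_profiles_cassandra_sink AS
SELECT user_id, display_name, email
FROM user_profiles
WITH (
    connector = 'cassandra',
    type = 'upsert',
    -- An upsert sink must declare its primary key.
    primary_key = 'user_id',
    cassandra.url = '10.0.0.1:9042',
    cassandra.keyspace = 'demo_keyspace',
    cassandra.table = 'user_profiles',
    cassandra.datacenter = 'datacenter1',
    -- Smaller batches than the default 512; request timeout left at its default of 2000 ms.
    cassandra.max_batch_rows = '256',
    cassandra.request_timeout_ms = '2000',
    cassandra.username = 'rw_writer',
    cassandra.password = 'change-me'
);
```

If batches time out, the parameter descriptions above suggest lowering cassandra.max_batch_rows first and changing cassandra.request_timeout_ms only afterwards.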

Data type mapping - RisingWave and Cassandra

RisingWave data type | Cassandra data type
boolean | boolean
smallint | smallint
integer | int
bigint | bigint
numeric | decimal
real | float
double precision | double
character varying (varchar) | text
bytea | blob
date | date
time without time zone | time
timestamp without time zone | Unsupported. Convert timestamp to timestamptz in RisingWave before sinking.
timestamp with time zone | timestamp
interval | duration
struct | Unsupported
array | Unsupported
jsonb | Unsupported
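
Because timestamp without time zone is not supported, one way to handle such a column is to convert it in the sink's SELECT query. The sketch below assumes a hypothetical table events with a timestamp column created_at whose values should be interpreted as UTC; AT TIME ZONE turns them into timestamptz values, which map to the Cassandra timestamp type. All names and the node address are illustrative.

```sql
-- Hypothetical names: events, event_id, created_at, demo_keyspace, events_by_id.
CREATE SINK IF NOT EXISTS events_cassandra_sink AS
SELECT
    event_id,
    -- Interpret the timestamp as UTC, producing a timestamptz value.
    created_at AT TIME ZONE 'UTC' AS created_at
FROM events
WITH (
    connector = 'cassandra',
    type = 'append-only',
    -- Needed only if the query result is not append-only.
    force_append_only = 'true',
    cassandra.url = '10.0.0.1:9042',
    cassandra.keyspace = 'demo_keyspace',
    cassandra.table = 'events_by_id',
    cassandra.datacenter = 'datacenter1'
);
```

If the stored values represent local time in another zone, replace 'UTC' with that zone's name so the resulting instants are correct.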