RisingWave supports a variety of data ingestion methods.

Data can be written to tables with DML statements (INSERT, UPDATE, DELETE). With the INSERT ... SELECT statement, users can transform ad-hoc ingestion data into a streaming flow to the table, affecting the downstream streaming pipeline of the table. This can also be used for bulk imports.

A CREATE MATERIALIZED VIEW or CREATE SINK statement will generate streaming ingestion jobs.
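As an illustration of the latter, defining a materialized view over a table starts a streaming job that is incrementally maintained as the table changes. The table and column names here are hypothetical:

```sql
-- Hypothetical table and columns: a materialized view over a table
-- becomes a streaming ingestion job that is incrementally maintained.
CREATE MATERIALIZED VIEW page_view_counts AS
SELECT page_id, COUNT(*) AS view_count
FROM website_visits
GROUP BY page_id;
```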
When users query a table directly, e.g. SELECT * FROM table_on_kafka, the query engine accesses the data from RisingWave's internal storage, eliminating unnecessary network overhead and avoiding read pressure on upstream systems. Additionally, users can create indexes on the table to accelerate queries.

RisingWave also provides the table function postgres_query to directly query PostgreSQL databases. This function connects to a specified PostgreSQL instance, executes the provided SQL query, and returns the results as a table in RisingWave. To use it, specify the connection details (such as hostname, port, username, password, and database name) and the desired SQL query. This makes it easier to integrate PostgreSQL data directly into RisingWave workflows without needing additional data transfer steps. For more information, see Ingest data from Postgres tables.
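A sketch of calling postgres_query follows. The positional argument order (hostname, port, username, password, database name, query) is an assumption here; verify it against the page linked above. All connection values are placeholders:

```sql
-- Run a query on an external PostgreSQL instance and read the result
-- as a table in RisingWave. Connection values are placeholders.
SELECT *
FROM postgres_query(
  'localhost',   -- hostname
  '5432',        -- port
  'postgres',    -- username
  'postgres',    -- password
  'mydb',        -- database name
  'SELECT * FROM users WHERE age > 25'  -- SQL to execute remotely
);
```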
The statement below creates a table website_visits and inserts 5 rows of data.
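A minimal sketch of such a statement; the column names, types, and row values are assumptions:

```sql
-- Hypothetical schema for the website_visits table.
CREATE TABLE website_visits (
  visit_time TIMESTAMP,
  user_id VARCHAR,
  page_id VARCHAR,
  action VARCHAR
);

-- Insert 5 rows of sample data.
INSERT INTO website_visits (visit_time, user_id, page_id, action) VALUES
  ('2023-06-13 10:00:00', 'user1', 'page1', 'view'),
  ('2023-06-13 10:01:00', 'user2', 'page2', 'view'),
  ('2023-06-13 10:02:00', 'user3', 'page3', 'click'),
  ('2023-06-13 10:03:00', 'user4', 'page1', 'view'),
  ('2023-06-13 10:04:00', 'user5', 'page2', 'click');
```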
INSERT ... SELECT can be used to implement bulk data import into the table, and to convert the imported data into a stream of changes that is synchronized downstream to the table.
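For instance, rows from another relation with a matching schema (here a hypothetical staging table, staging_visits) can be bulk-loaded into website_visits, and downstream streaming jobs on the table observe the inserted rows as changes:

```sql
-- Bulk import: every row selected from staging_visits is appended to
-- website_visits and propagated to the table's downstream streaming jobs.
INSERT INTO website_visits
SELECT * FROM staging_visits;
```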
When ingesting Parquet files through a file source, Parquet data types are mapped to RisingWave types as follows:

| Parquet data type | RisingWave file source data type |
|---|---|
| boolean | boolean |
| int16 | smallint |
| int32 | int |
| int64 | bigint |
| float | real |
| double | double precision |
| string | varchar |
| date | date |
| decimal | decimal |
| int8 | smallint |
| uint8 | smallint |
| uint16 | int |
| uint32 | bigint |
| uint64 | decimal |
| float16 | double precision |
| timestamp(_, Some(_)) | timestamptz |
| timestamp(_, None) | timestamp |
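To show where this mapping applies, here is a sketch of a table backed by a Parquet file source on S3. The bucket, region, and column list are placeholders, and the connector parameter names should be verified against the S3 source documentation:

```sql
-- Hypothetical S3-backed file source reading Parquet files; int32 and
-- string columns in the files map to int and varchar per the table above.
CREATE TABLE visits_from_parquet (
  user_id int,
  page_id varchar
)
WITH (
  connector = 's3',
  s3.region_name = 'us-east-1',       -- placeholder region
  s3.bucket_name = 'example-bucket',  -- placeholder bucket
  match_pattern = '*.parquet'
) FORMAT PLAIN ENCODE PARQUET;
```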