Indexes in a database are designed to increase the speed at which the database management system (DBMS) can locate and retrieve the desired data from a table or a materialized view.
customers
and orders
.
c_phone
column in the customers
table.
o_custkey
column in the orders
table.
INCLUDE
clause. This differs from the standard PostgreSQL. This is because RisingWave’s design as a cloud-native streaming database includes several key differences from PostgreSQL, including the use of an object store for more cost-effective storage, and the desire to make index creation as simple as possible for users who are not experienced with database systems.
By including all columns, RisingWave ensures that an index will cover all of the columns touched by a query and eliminates the need for a primary table lookup, which can be slower in a cloud environment due to network communication. However, RisingWave still provides the option to include only specific columns using the INCLUDE
clause for users who wish to do so.
For example:
If your queries only access certain columns, you can create an index that includes only those columns. The RisingWave optimizer will automatically select the appropriate index for your query.
distributed_column
by default if you omit the DISTRIBUTED BY
clause. RisingWave distributes the data across multiple nodes and uses the distributed_column
to determine how to distribute the data based on the index. If your queries intend to use indexes but only provide the prefix of the index_column
, it could be a problem for RisingWave to determine which node to access the index data from. To address this issue, you can specify the distributed_column
yourself, ensuring that these columns are the prefixes of the index_column
.
For example: