LucidDB Features

Here are some of the main features which set LucidDB apart:

Category	Feature	Benefits
Storage	Column-store tables	Very high data compression rates for columns with many repeated values; reduced I/O for queries which access only a subset of columns; greater cache effectiveness
	Intelligent indexing	Automatically adapts to either bitmap or B-tree representation depending on data distribution (even using both in the same index for different portions of the same table), yielding optimal data compression, reduced I/O, and fast evaluation of Boolean expressions, without the need for a DBA to choose an index type
	Page-level multi-versioning	Supports read/write concurrency with snapshot consistency, allowing readers to access a table while data is being bulk loaded or updated; versioning at the page level is much more efficient than transactional multi-versioning schemes such as row-level versioning or log-based page reconstruction
	Warehouse labels	Allows report execution to sync to a particular global database state, such as the last successful ETL, so that queries never see intermediate inconsistent states; enables trickle-feed ETL, and may be used to eliminate the downtime imposed by an ETL window
	Hot+incremental backup	Allows the system to be backed up consistently while queries and ETL are running, eliminating downtime; incremental and compression options minimize archival storage and bandwidth
Optimization	Star join optimization	Avoids reading fact table rows which are not needed by the query
	Cost-based join ordering and index selection	No hints required
Execution	Hash join/aggregation	Can scale to number-crunch even the largest datasets in limited RAM via skew-resistant disk-based partitioning
	Intelligent prefetch	High performance and greater cache and disk effectiveness because LucidDB can almost always predict exactly which disk blocks are needed to satisfy a query
	INSERT/UPSERT as bulk load	Tables can be loaded directly from external sources via SQL; no separate bulk loader utility is required (for performance, loads are never logged at the row level, yet are fully recoverable via page-level undo); the SQL:2003 MERGE statement provides standard upsert capability
Connectivity	SQL/MED architecture	Allows LucidDB to connect to heterogeneous external data sources via foreign data wrappers and access their content as foreign tables
	JDBC foreign data wrapper	Allows foreign tables in any JDBC data source to be queried via LucidDB, with filters pushed down to the source where possible
	Flat file foreign data wrapper	Allows flat files (e.g. BCP or CSV format) to be queried as foreign tables via LucidDB
	Pluggability	Allows new foreign data wrappers (e.g. for accessing data from a web service) to be developed in Java and hot-plugged into a running LucidDB instance
	Pentaho Data Integration step	Allows data to be pushed into LucidDB from the Kettle ETL tool
Extensibility	SQL/JRT architecture	Allows new functions and transformations to be developed in Java and hot-plugged into a running LucidDB instance; LucidDB also comes with a companion library of common ETL functions (applib); plugin jars are self-installing via deployment descriptors
	User-defined functions	Allows the set of built-in functions to be extended with custom user logic
	User-defined transformations	Allows new table functions (such as custom logic for data mining operators or CONNECT BY queries) to be added to the system
Standards	SQL:2003	Smooths migration of applications to and from other DBMS products
	JDBC, HTTP	Allows connectivity from popular front-ends such as the Mondrian OLAP engine
	Unicode	Supports storage of and access to international character data
	J2EE	Java architecture enables deployment of LucidDB into a J2EE application server (just like hsqldb or Derby); usage of Java as the primary extensibility mechanism makes it a snap to integrate with the many enterprise APIs available
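
To make the "fast evaluation of Boolean expressions" benefit of bitmap indexing more concrete, here is a minimal conceptual sketch in Python. It is not LucidDB's implementation: the table, the column names (region, status), and the helper functions build_bitmap_index and matching_rows are all hypothetical and exist only for illustration. The sketch assumes one bitmap per distinct column value, stored as a Python integer in which bit i is set when row i holds that value.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    index = defaultdict(int)
    for row_id, value in enumerate(values):
        index[value] |= 1 << row_id
    return dict(index)


def matching_rows(bitmap):
    """Return the row ids whose bits are set in the bitmap, in ascending order."""
    rows = []
    row_id = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row_id)
        bitmap >>= 1
        row_id += 1
    return rows


# Hypothetical toy table with two low-cardinality columns (illustration only).
region = ["east", "west", "east", "north", "west", "east"]
status = ["open", "open", "closed", "open", "closed", "open"]

region_idx = build_bitmap_index(region)
status_idx = build_bitmap_index(status)

# WHERE region = 'east' AND status = 'open' becomes a single bitwise AND.
east_and_open = region_idx["east"] & status_idx["open"]

# WHERE region = 'west' OR status = 'closed' becomes a single bitwise OR.
west_or_closed = region_idx["west"] | status_idx["closed"]

print(matching_rows(east_and_open))   # [0, 5]
print(matching_rows(west_or_closed))  # [1, 2, 4]
```

In this sketch each predicate combination costs one bitwise operation over the bitmaps, and the table rows themselves are only touched once the final set of matching row ids is known. Columns with many repeated values give few, dense bitmaps, which is the case where this representation pays off; for columns with mostly distinct values a B-tree style lookup is the more natural fit, which is the trade-off the intelligent indexing row describes.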