What is a Columnar Database? MariaDB ColumnStore Explained

Опубликовано: 07 Апрель 2024
на канале: Database Dive
142
2

In this MariaDB tutorial we're discussing one of the core components of MariaDB Server - MariaDB ColumnStore.

Many developers have heard of MySQL Server. According to DB-Engines, MySQL Server is the second most widely used database management system in the world as of 2024.
MySQL does have a couple of brothers them being MariaDB and Percona Server.

Any Percona tutorial will tell you that Percona runs a popular open source database conference called Percona LIVE and that's because both MariaDB and Percona Server are known for their unique offerings in the database space - compared to MySQL, MariaDB offers more data types and storage engines users can choose from, while Percona provides database consulting services and builds some of their own products such as Percona Monitoring and Management, Percona Toolkit, Percona XtraDB which is an enhanced version of the InnoDB storage engine, and Percona XtraBackup which acts as a complete and open source online Percona XtraBackup MySQL solution.

MariaDB is also known for its ColumnStore offering.
ColumnStore extends MariaDB with database columnar storage.
ColumnStore is a massively parallel processing shared-nothing architecture that was built by porting InfiniDB 4.6.7 to MariaDB and released under the GPL license.

If we take a step back and look at it, columnar storage refers to a database that stores data using a column-oriented model.
Column-based data storage makes it easier to read columns of data and that's why it's usually used for analytical applications, but at the same time, columnar databases like ColumnStore raises the cost of doing simple INSERT operations and SQL queries because they do not operate as row-oriented databases do.

With that out of the way, if we'd look at the way ColumnStore is designed, we would see that MariaDB ColumnStore is designed for intense reading operations with occasional huge data updates, so we can say that it's a good fit for Online Analytical Processing, or OLAP use cases.

This MariaDB tutorial will tell you that ColumnStore being MariaDB-based means that any SQL statement can involve both tables built on ColumnStore and on other storage engines, such as CONNECT, SPIDER, or the S3 storage engine. Indeed, the default storage engine in MySQL Server and MariaDB Server - InnoDB - can also be used.
ColumnStore is a specialized storage engine to facilitate scalable and high-performance analytics operations with parallel query processing.

Follow this MySQL tutorial and know that to create a table running the MariaDB ColumnStore engine, use the CREATE TABLE statement with the ENGINE specified as ColumnStore as shown in this example.
After you have created your table, it's time to look at the use cases.

Use ColumnStore for anything that requires fast retrieval of columns of data. That includes pretty much anything that falls in the data warehousing or big data categories. As stated before, MariaDB ColumnStore is perfect for analytical applications. And yes, COUNT(*) queries will be blazing fast too unlike in the default storage engine offered by MariaDB.

To load huge volumes of data into ColumnStore or into any other storage engine within MariaDB, you can use the LOAD DATA INFILE SQL statement. Unline INSERTs, LOAD DATA INFILE doesn't have that much overhead and is meant to deal with raw data, which means that your queries will complete much faster.
In a usual setup, loading 1 million rows into your table should take around 20 seconds, which isn't too bad. That's around 33 minutes for 1 million rows.

Also keep in mind that the ColumnStore storage engine has limitations too.
To begin with, if you use ColumnStore, it's likely that simple INSERT operations will be massively slow.
By default, ColumnStore can only store 8 bytes of data inside of a column due to its design and if the value is bigger than that, it is saved in a separate data dictionary and referenced to the corresponding column and that's the primary reason why checking how much space our data occupies on the disk usually returns next to nothing if our table is running ColumnStore.

When it comes to columns, ColumnStore only reads the columns that are necessary to complete a query. That means that to save disk I/O, you should only select the columns that are absolutely necessary for your query to complete.

Aside from that, MariaDB ColumnStore can be used just as you use everything else.

Subscribe to this SQL tutorial channel if you want to learn more about MariaDB ColumnStore and related storage engines in MariaDB Server, MySQL Server, Percona Server and other database management systems, and until next time.

Music:
Moments by GalaxyTones:   / galaxytones  
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
Free Download / Stream: https://bit.ly/48KTrsc
Music promoted by Audio Library https://bit.ly/3OduxcL

#database #mariadb #mysql #software #developer