Vertica


The cluster-based, column-oriented Vertica Analytics Platform is designed to manage large, fast-growing volumes of data and to provide very fast query performance in data warehouses and other query-intensive applications. The product is claimed to drastically improve query performance over traditional relational database systems and to provide high availability and petabyte scalability on commodity enterprise servers.

Its design features include:

  • Column-oriented storage organization, which increases performance of sequential record access at the expense of common transactional operations such as single record retrieval, updates, and deletes.[6]
  • Standard SQL interface with many analytics capabilities built-in, such as time series gap filling/interpolation, event-based windowing and sessionization, pattern matching, event series joins, statistical computation (e.g., regression analysis), and geospatial analysis.
  • Out-of-place updates and hybrid storage organization, which increase the performance of queries, insertions, and loads, but at the expense of updates and deletes.
  • Compression, which reduces storage costs and I/O bandwidth. High compression is possible because columns of homogeneous datatype are stored together and because updates to the main store are batched.[7]
  • Shared nothing architecture, which reduces system contention for shared resources and allows gradual degradation of performance in the face of hardware failure.
  • Automated data replication, server recovery, query optimization, and storage optimization, which make the system easier to use and maintain.
  • Support for the standard programming interfaces ODBC, JDBC, ADO.NET, and OLE DB.
  • High-performance, parallel data transfer to statistical tools such as Distributed R, and the ability to store machine learning models and use them for in-database scoring.[8][9]
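
The first and fourth design features above (column-oriented storage and compression of homogeneous columns) can be illustrated with a minimal Python sketch. It is not Vertica's actual storage format: the sample table, the column names, and the helper functions `to_columns`, `rle_encode`, and `rle_decode` are all hypothetical, and run-length encoding is used here only as one simple example of a scheme that benefits from storing values of one column together.

```python
from itertools import groupby

# Hypothetical sample table in row-oriented form: one tuple per record.
rows = [
    ("2011-01-01", "US", 10),
    ("2011-01-01", "US", 20),
    ("2011-01-01", "DE", 5),
    ("2011-01-02", "DE", 7),
    ("2011-01-02", "DE", 8),
]


def to_columns(records, names):
    """Pivot a list of records into one list of values per column."""
    return {name: [rec[i] for rec in records] for i, name in enumerate(names)}


def rle_encode(values):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


columns = to_columns(rows, ("day", "country", "amount"))

# Repeated values sit next to each other within a column, so runs are long.
encoded_day = rle_encode(columns["day"])
encoded_country = rle_encode(columns["country"])

# An aggregate reads only the single column it needs, not whole records.
total_amount = sum(columns["amount"])

print(encoded_day)
print(encoded_country)
print(total_amount)
```

The sketch also hints at the trade-off named in the list: summing `amount` touches one list, whereas fetching or updating a single record means touching every column.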

Vertica's specialized approach aims to significantly increase query performance in data warehouses while lowering the total cost of ownership by reducing the hardware footprint. One research paper describes a use case in which Vertica improved performance by a factor of hundreds in a specific application, an improvement attributed to the vertical (column-oriented) DBMS approach.[10]

As of late 2011, the Vertica Analytics Platform Community Edition[11] is available for free with certain limitations, such as a maximum of one terabyte of raw data, a cluster of at most three nodes (servers), and limited support.

Links