Timescale is a VC-backed company headquartered in New York, with distributed team members across the globe. Its product, TimescaleDB, is an extension to PostgreSQL, it leverages many of the features of PostgreSQL, and migration to TimescaleDB is trivial. TimescaleDB was officially launched in April 2017 followed by general availability of version 1.0 in September 2018.
The database is open source and is available via both an Apache license and commercially from Timescale, either in a Community or Enterprise Edition. A cloud-based (Google, AWS and Microsoft Azure) managed service offering is also available.
Company Info
Headquarters: 335 Madison Avenue, 5th Floor Suite E, New York, NY 10017, USA
TimescaleDB is built on top of PostgreSQL as a database intended to specifically support the requirements of time-series data. This is illustrated in Figure 1. While it can be used to collect metrics and then present these in a dashboard for monitoring environments, and it can be deployed as a storage engine working with third-party tools, the company’s main focus is on time-series analytics in industrial, financial and IoT environments. Because the product leverages not just PostgreSQL but also its extensions, it can combine time-series capabilities with geo-spatial data using the PostGIS extension.
As a PostgreSQL compatible solution, you can use any third-party tools that work with that database, with TimescaleDB also. Notable examples include Prometheus, Kafka, Tableau and Grafana.
Customer Quotes
“Initially my colleagues were sceptical when I suggested storing metrics for our 120 Petabyte data centre in a relational database, but after replacing the prior NoSQL database with TimescaleDB we couldn’t be more happy with the performance. In addition, because TimescaleDB is an extension of PostgreSQL, we’re now starting to expand the scope of the metrics storage to power executive dashboards and advanced analytical functions that our prior NoSQL solution couldn’t support.” Comcast
“At Bloomberg, we have millions of data feeds and trillions of data points dating back over 100 years. My team and I have been extremely pleased with TimescaleDB’s capability to accommodate our workload while simplifying geo-financial analytics and data visualization. If you are looking to support large scale time-series datasets, then TimescaleDB is a good fit.” Bloomberg
We do not intend to discuss the features of PostgreSQL (storage mechanisms, high availability, security, ACID compliance and so forth), since these are well-known. What Timescale has done is to implement what the company calls a “hypertable” as an abstraction layer on top of the relational underpinnings of PostgreSQL. This abstraction provides a single continuous table across all space and time intervals, which is sharded into what are known as “chunks”, with the size of each chunk either set by the database or you can manually tune this to improve performance based on specific use cases. Indexes for chunks are stored in memory and there are also various query optimisation functions built into the product. All of this is designed to minimise the time required to run queries, which are typically based on a combination of the required time intervals and the type of data within the chunk (tick data, sensor data, location data and so on). Note that both the database optimiser, and the product’s ingestion capabilities, understand chunks.
One notable limitation is that, currently, TimescaleDB is limited to running on a single node. However, the company has announced (August 2019) a distributed version of TimescaleDB, which is currently in private beta and is scheduled for availability in 2020. While it is too early to comment on this in detail, initial testing by the company suggests that a nine-node cluster will support an ingestion rate in excess of twelve million metrics per second. Another significant capability that is in development is an improved compression capability compared to the native PostgreSQL ZFS protocol. This is important because of the volume of data that often has to be ingested and stored in time-series environments.
Some notable features provided by Timescale include built-in support for various analytic functions, time buckets, gap fitting and continuous aggregates. In the case of the last of these, this has recently (version 1.4) been extended to support multiple continuous aggregates against each hypertable where previously only one was supported. Also noteworthy is support for ordered appends, which optimise a variety of queries but especially those ordered by time.
Why you should care rather depends where you are coming from. If you are an existing PostgreSQL user who wants improved ingestion and query performance for time-based analytics, then TimescaleDB represents an obvious starting point and Figure 2 (supplied by Timescale) represents a good reason why you might add TimescaleDB to your existing environment. A similar argument would apply if you are a relational database user in general and want to explore time-series with lower latency and better performance.
More generally, there is the case where you want to be able to exploit time-series data but don’t have any existing database that you want to replace or extend. In this event TimescaleDB has a number of differentiators and advantages, though which of these will be relevant will be dependent on the potential competition. First off, while Timescale focuses on analytics it is a transactional database, and many IoT use cases involve both operational and analytic capabilities. Secondly, not only does TimescaleDB use SQL, it is a relational database and it will often be the case that you want to combine relational and time-series data. Thirdly, the company can offer geo-spatial capabilities, through PostGIS, that are sadly lacking in some competitive offerings. And finally, TimescaleDB has a relatively small footprint – you can implement it on a Raspberry Pi – so it can be implemented in suitable edge devices.
All of that said, even though TimescaleDB is a relatively new offering, it builds on PostgreSQL and therefore inherits the reliability and stability from its decades of existence. The question is whether this is a preferable approach compared to a purpose-built time-series database. The imminent release of a distributed version of TimescaleDB, along with improved native compression, suggests that the company is determined to prove that this should not be an issue.
The Bottom Line
Timescale has established itself as a major player in the time-series market, despite its relative youth. This is primarily because of its relational heritage but, whatever the reason, it is certainly a major contender within this marketplace.
We use third-party cookies, including Google Analytics, to ensure that we give you the best possible experience on our website.I AcceptNo, thanksRead our Privacy Policy