Cloud Historian – a distributed system for storing technological streaming data

Automation of technological processes and manufacturing requires processing large amounts of data about equipment status and operating parameters. The number of measured and calculated parameters can reach millions per second.

The data sources are typically diverse control and measurement equipment, as well as subsystems and controllers, using a variety of protocols and connection standards. The amount of accumulated data can reach petabytes.

Cloud Historian is designed specifically to solve these problems.

It is a distributed, fault-tolerant system that allows building server clusters of tens or hundreds of servers in failover and stable configurations.

Receiving, storing and processing such large volumes and varieties of data requires specialized databases oriented toward so-called time series data.

Building on Apache Tomcat, Apache Cassandra and PostgreSQL, we designed a toolkit (CloudHistorian Platform) for creating high-performance, fault-tolerant, distributed and scalable systems for receiving, processing and storing various types of measurements from industrial facilities and systems.
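
CloudHistorian's actual storage schema is not described here, but a typical Cassandra layout for time-series data gives an idea of the approach: one partition per tag and day bucket, rows clustered by timestamp. The sketch below (DataStax Java driver; keyspace, table and column names are hypothetical) is an illustration only, not the product's schema.

```java
import com.datastax.oss.driver.api.core.CqlSession;
import java.net.InetSocketAddress;

public class TimeSeriesSchemaSketch {
    public static void main(String[] args) {
        // Connect to a local Cassandra node (address and datacenter are placeholders).
        try (CqlSession session = CqlSession.builder()
                .addContactPoint(new InetSocketAddress("127.0.0.1", 9042))
                .withLocalDatacenter("datacenter1")
                .build()) {

            session.execute(
                "CREATE KEYSPACE IF NOT EXISTS historian "
              + "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}");

            // One partition per tag per day keeps partitions bounded while still
            // allowing fast range scans over a tag's history (a typical
            // time-series layout, not the actual CloudHistorian schema).
            session.execute(
                "CREATE TABLE IF NOT EXISTS historian.measurements ("
              + "  tag_id  text,"
              + "  day     date,"
              + "  ts      timestamp,"
              + "  value   double,"
              + "  quality int,"
              + "  PRIMARY KEY ((tag_id, day), ts)"
              + ") WITH CLUSTERING ORDER BY (ts DESC)");
        }
    }
}
```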

CloudHistorian offers performance and reliability comparable to similar European and American systems, while also being cross-platform (it runs on Windows and UNIX/Linux), based on open source code, and free of licensed third-party software.

Historically, CloudHistorian has been designed and applied for tasks in the field of electricity and energy transmission, but nothing prevents its use in any other technical systems and complexes where acquisition, accumulation and processing of time series is needed.

CloudHistorian can also be applied as an OEM solution, integrated into other systems and complexes.

The name CloudHistorian comes from one of the product's basic features – the ability to create distributed data storages in which data is placed on several (geographically dispersed) system nodes (clouds). Data received from measuring and other information systems is stored in the data storage of the CloudHistorian node on which the receiving interface has been started.

Technical features of implementation

1. Open Java-based architecture
2. No proprietary products inside the solution
3. Support for a wide range of operating systems (including Linux and other open-source OSs)
4. Web technologies for data visualization
5. Big Data technology for storing large distributed data arrays
6. Combined real-time data storage with historical data storage (see the sketch after this list)
7. High performance (write frequency of 50–200 records per second per tag), fast search over historical data
8. Failover clustering (from 2 to 65535 servers)
9. Scalability (data distributed across any number of servers in a cluster)
10. Low demands on computing resources – can run on low-power industrial controllers (up to 20 000 records per second on one core of an Intel Core i5 at 2.33 GHz)
11. Suitability for SaaS (Software as a Service) architecture
12. Solutions for IIoT (Industrial Internet of Things) and fog computing
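
Feature 6 above (combining a real-time data store with the historical archive) follows a common historian pattern: recent samples are served from memory while older data is read from the persistent archive. The sketch below illustrates that pattern with hypothetical interfaces and is not CloudHistorian's internal API.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical measurement record and storage tiers, for illustration only.
record Sample(String tag, Instant ts, double value) {}

interface HistoricalStore { List<Sample> read(String tag, Instant from, Instant to); }
interface RealtimeBuffer  { List<Sample> readRecent(String tag, Instant from, Instant to); }

class CombinedReader {
    private final HistoricalStore archive;
    private final RealtimeBuffer recent;
    private final Instant archiveHorizon; // data older than this has already been flushed to the archive

    CombinedReader(HistoricalStore archive, RealtimeBuffer recent, Instant archiveHorizon) {
        this.archive = archive;
        this.recent = recent;
        this.archiveHorizon = archiveHorizon;
    }

    /** Serves one query transparently from both tiers: archive first, then the in-memory tail. */
    List<Sample> read(String tag, Instant from, Instant to) {
        List<Sample> result = new ArrayList<>();
        if (from.isBefore(archiveHorizon)) {
            result.addAll(archive.read(tag, from, to.isBefore(archiveHorizon) ? to : archiveHorizon));
        }
        if (to.isAfter(archiveHorizon)) {
            result.addAll(recent.readRecent(tag, from.isAfter(archiveHorizon) ? from : archiveHorizon, to));
        }
        return result;
    }
}
```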

The following components are implemented in the system:

Adapters for data sources
– MQTT (see the sketch after this list)
– IEC 60870-5-104 client
– OPC client
– Modbus RTU master
– C37.118-2011/2008 client (for data acquisition from WAMS recorders)
– Receiving data in C37.111-1991 (COMTRADE) format
– Receiving data in CSV format (files)
– RTdbcon
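
To illustrate what a data-source adapter does, here is a minimal MQTT sketch using the Eclipse Paho client: it subscribes to a topic and turns each message into a tag/timestamp/value record. The broker address, topic layout and payload format are assumptions, not the actual CloudHistorian adapter.

```java
import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttConnectOptions;
import org.eclipse.paho.client.mqttv3.MqttException;
import java.nio.charset.StandardCharsets;
import java.time.Instant;

public class MqttAdapterSketch {
    public static void main(String[] args) throws MqttException {
        // Broker URL and client id are placeholders.
        MqttClient client = new MqttClient("tcp://broker.local:1883", "cloudhistorian-adapter");
        MqttConnectOptions opts = new MqttConnectOptions();
        opts.setCleanSession(true);
        client.connect(opts);

        // Assumed payload: a plain numeric value; the topic name is used as the tag id.
        client.subscribe("plant/+/measurements/#", (topic, msg) -> {
            double value = Double.parseDouble(new String(msg.getPayload(), StandardCharsets.UTF_8));
            Instant ts = Instant.now();
            // In a real adapter the record would be handed to the storage layer here.
            System.out.printf("%s %s %.3f%n", ts, topic, value);
        });
    }
}
```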

Protocols for distributed interaction between system nodes (CH-to-CH) and with other automation systems
– C37.118-2011 server
– UDP-based protocol (unicast, multicast) for low-latency exchange of operational data
– Web services for transmitting archives on request
– REST API / JSON (see the sketch after this list)
– Thrift API
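
The REST endpoints themselves are not documented in this overview, so the request below uses a hypothetical path and parameters purely to show the access pattern: query a tag's archive over a time range and receive JSON.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RestReadSketch {
    public static void main(String[] args) throws Exception {
        HttpClient http = HttpClient.newHttpClient();

        // Hypothetical endpoint and parameters: read one tag's archive for a time range as JSON.
        HttpRequest request = HttpRequest.newBuilder(URI.create(
                "http://historian.local:8080/api/v1/tags/turbine1.power/values"
              + "?from=2024-01-01T00:00:00Z&to=2024-01-01T01:00:00Z"))
            .header("Accept", "application/json")
            .GET()
            .build();

        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```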

Data processing functions

– Receiving data and writing it to the storage
– Calculating and storing ROLL-UPs (data slices aggregated over a specified time period); see the sketch after this list
– Deleting outdated data
– Calculating derived parameters
– Alarm calculation (signal situations)
– Self-performance monitoring and production of diagnostic information
– API for data access (read and write) from third-party software
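
A ROLL-UP is a slice of the raw data aggregated over a fixed time period. A minimal sketch of such an aggregation (per-interval min/avg/max over a list of samples, with hypothetical types) is shown below.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class RollupSketch {
    // Hypothetical raw sample: timestamp plus value for a single tag.
    record Sample(Instant ts, double value) {}

    /** Aggregates raw samples into fixed intervals, keeping min/avg/max per interval. */
    static Map<Instant, DoubleSummaryStatistics> rollUp(List<Sample> samples, Duration interval) {
        long step = interval.toMillis();
        return samples.stream().collect(Collectors.groupingBy(
            s -> Instant.ofEpochMilli((s.ts().toEpochMilli() / step) * step), // interval start
            Collectors.summarizingDouble(Sample::value)));
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2024-01-01T00:00:00Z");
        List<Sample> raw = List.of(
            new Sample(t0.plusSeconds(1), 10.0),
            new Sample(t0.plusSeconds(30), 14.0),
            new Sample(t0.plusSeconds(70), 12.0));

        rollUp(raw, Duration.ofMinutes(1)).forEach((start, stats) ->
            System.out.printf("%s min=%.1f avg=%.1f max=%.1f%n",
                start, stats.getMin(), stats.getAverage(), stats.getMax()));
    }
}
```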

Performance (average)

Test machine (laptop): Microsoft Windows, Intel Core i5 (4 cores, 2.67 GHz), 16 GB RAM, WDC SATA HDD 500 GB, 7200 RPM, 16 MB cache

Data recording: 200 000 records per second (200 tags at 1000 records per second each) or 80 000 records per second (80 000 tags at 1 record per second each)
Reading data series: 250 000 – 500 000 measurements per second
Reading slices (time-aligned): 200 measurements per 20 milliseconds
Reading last values: 50 000 measurements per second