Sidra Data Platform

April 29, 2021

Sidra Data Platform | V.2021 R1: Inquisitive Idared

This first Release of 2021 comes with a massive upgrade to the self-service capabilities of Sidra, thanks to new support for certain common operations in Sidra’s Web UI. It took a little longer to release than initially expected, but we are sure you will love all the new capabilities!

The key feature released as part of this version is support for configuring data intake connectors from the Web UI, along with the release of the first two connectors. This has been possible thanks to an important architectural overhaul, which wraps the logic of each connector in independent modules called plugins.

Connectors for Sidra represent a new way of configuring data sources, through an easy-to-use interface embedded in Sidra Web. They also give both Plain Concepts and its partners a way to decouple data source connections from the different Sidra releases, allowing a continuous stream of new connectors to be released without requiring users to perform a Sidra re-deployment.

It is important to note that the architectural work done to enable this feature has also set the foundations of another important track of improvements in self-service configuration and operations in Sidra: creation and management of Client Applications from the Web, with this exciting new feature being planned for the next release.

On top of that, this Release comes with important features around the operability of the platform, both from the web, and from the perspective of other operational tools.

What’s new in Sidra 2021.R1

Data connectors UI support

Data connectors UI support is a major feature, and it is most visible in the Sidra Web UI. Sidra connectors have been conceived with the purpose of adding more self-service capabilities to Sidra. This has been, and will continue to be, one of the key objectives of Sidra’s evolution as a product.

Connectors greatly simplify the configuration of new data source intake in a Sidra Data Storage Unit. Before data connectors, configuring a new data source required several calls to the API (e.g., for configuring metadata, triggers, etc.), or alternatively setting up the metadata directly on Sidra’s Core database with SQL scripts. These methods will still be available for advanced set-up scenarios, but with the new connectors the user only needs to fill in a series of fields in a step-by-step wizard (Data > Connectors).

When configuring and executing a Sidra connector from Sidra web, several underlying steps are involved. On one hand, the necessary metadata and data governance structures are created in Sidra. On the other hand, the actual data integration infrastructure (e.g., pipelines) is created, configured and deployed.

A typical connector is configured in less than five minutes. Once the user provides a few details and confirms, the orchestration runs in full and the data intake pipelines for the new data source are up and running.

Starting from this release, Sidra’s roadmap will incrementally add support for different data sources through the connectors interface.

Release 2021.R1 comes with out-of-the-box support for two connectors: Azure SQL Database and SQL Server. These are described below.

Sidra Web users with Admin privileges can see the gallery of available connectors under the Data menu in Sidra Web.

Next, the user is taken through a set of steps (wizard mode), to add the needed configuration for that connector.

Before confirming the creation of the underlying infrastructure, there is the possibility to validate the connection or to export the configuration as a JSON file.

Azure SQL connector

The Azure SQL Database connector for Sidra enables seamless integration with the most widely used SQL Server database-as-a-service on Azure: an intelligent, scalable, relational database service built for the cloud. Azure SQL Database uses the latest SQL Server capabilities, and you can learn more about it in the Microsoft documentation.

Sidra’s connector for Azure SQL Database extracts data from any table or view in the source database and loads it into the specified Data Storage Unit at regular intervals. It relies on the Sidra Metadata Model for mapping source data structures to Sidra as the destination, and uses Azure Data Factory as the underlying data integration mechanism within Sidra.

When configuring and executing this connector, several underlying steps are involved to achieve the following:

The necessary metadata and data governance structures are created and populated in Sidra.
The actual data integration infrastructure (ADF Pipeline) is created, configured and deployed.

After starting the connector creation process, users will receive a message that the process has started and will continue in the background. Users will be able to navigate through Sidra Web as usual while this process happens.

Once the whole deployment process is finished, users will receive a notification in the Sidra Web Notifications widget. If the process completes successfully, the new data structures (new Entities) will appear in the Data Catalog automatically, and the data intake process will incorporate this new data source.

The Azure SQL Database connector for Sidra supports different modes of data synchronization, which also depend on the mechanisms configured on the source system or Sidra metadata:

Full load data synchronization: Generally performed for first-time loads. This is also the default mode if Change Tracking is not enabled in the source system and no alternative incremental load mechanism is defined. By default, the first load will be a complete load.
Incremental load data synchronization: This data synchronization mechanism captures updates for any new or modified data from the source database. Only data corresponding to the updates since the last synchronization is retrieved.
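
The difference between the two modes can be sketched in a few lines of Python. This is only an illustration, not how the connector is implemented: it uses an in-memory SQLite table as a stand-in for the source database, and the `customers` table, its `modified_at` column and the `full_load` / `incremental_load` helpers are all hypothetical names chosen for the example.

```python
import sqlite3


def full_load(conn):
    """Return every row of the source table (typical for a first-time load)."""
    return conn.execute(
        "SELECT id, name, modified_at FROM customers ORDER BY id"
    ).fetchall()


def incremental_load(conn, last_sync):
    """Return only the rows created or modified after the last synchronization."""
    return conn.execute(
        "SELECT id, name, modified_at FROM customers "
        "WHERE modified_at > ? ORDER BY id",
        (last_sync,),
    ).fetchall()


# Hypothetical source table standing in for a table in the source database.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, modified_at TEXT)"
)
source.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ada", "2021-04-01"), (2, "Grace", "2021-04-02")],
)

# First synchronization: complete load, then remember how far we got.
first_batch = full_load(source)
last_sync = max(row[2] for row in first_batch)

# The source changes between synchronizations: one update and one insert.
source.execute(
    "UPDATE customers SET name = 'Ada L.', modified_at = '2021-04-10' WHERE id = 1"
)
source.execute("INSERT INTO customers VALUES (3, 'Edsger', '2021-04-11')")

# Second synchronization: only the updated and the new row are retrieved.
second_batch = incremental_load(source, last_sync)
print(second_batch)
```

The incremental query depends on the source exposing some marker of what changed since the previous run; here that marker is an assumed modification timestamp.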

For incremental load to work, there must be a defined mechanism to capture updates in the source system. For incremental load data synchronization, two possible types of mechanisms are supported:

Incremental Load with built-in SQL Server Change Tracking (CT): This is achieved by directly activating Change Tracking in the source database.
Incremental Load non-Change Tracking related (non-CT): This is achieved by specific configurations in the Sidra Metadata tables.
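
The rules above can be summarized as a small decision function. The following Python sketch is an assumption-laden illustration rather than Sidra code: the `SyncMode` enum, the `choose_sync_mode` function and its three flags are hypothetical, and giving Change Tracking precedence when both mechanisms are configured is an assumption made only for this example.

```python
from enum import Enum


class SyncMode(Enum):
    FULL = "full load"
    INCREMENTAL_CT = "incremental load (Change Tracking)"
    INCREMENTAL_NON_CT = "incremental load (non-CT, metadata-driven)"


def choose_sync_mode(is_first_load, change_tracking_enabled, metadata_incremental_configured):
    """Pick a synchronization mode from the conditions described in the text."""
    # By default, the first load is always a complete load.
    if is_first_load:
        return SyncMode.FULL
    # Assumption for this sketch: Change Tracking wins if both are available.
    if change_tracking_enabled:
        return SyncMode.INCREMENTAL_CT
    if metadata_incremental_configured:
        return SyncMode.INCREMENTAL_NON_CT
    # No mechanism to capture updates: fall back to a full load.
    return SyncMode.FULL


print(choose_sync_mode(False, True, False).value)
```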

You can see more details about how this connector works and a setup guide in the documentation.

SQL Server connector

The SQL Server connector for Sidra enables seamless integration with Microsoft’s powerful enterprise relational database.

Sidra’s connector for SQL Server extracts data from any table or view in the source database and loads it into the specified Data Storage Unit at regular intervals. It relies on the Sidra Metadata Model for mapping source data structures to Sidra as the destination, and uses Azure Data Factory as the underlying data integration mechanism within Sidra.

Similarly to the Azure SQL connector, when configuring and executing this connector, the metadata and data integration infrastructures are created and deployed. The user will receive a notification once the whole deployment process is finished.

The following list includes all SQL Server versions supported by this connector:

SQL Server 2008 R2 (version 10.50.xx)
SQL Server 2012 (version 11.xx)
SQL Server 2014 (version 12.xx)
SQL Server 2016 (version 13.xx)
SQL Server 2017 (version 14.xx)
SQL Server 2019 (version 15.xx)

Note: All editions (Developer, Standard and Enterprise) are supported, but some features of the connector will only be available if the source SQL Server edition supports the feature, such as the Enterprise edition requirement for Change Tracking on tables.
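
As a quick way to check a source server against the list above, the following Python sketch maps a product version string (in the `major.minor.build.revision` form that SQL Server reports, for example through `SERVERPROPERTY('ProductVersion')`) to a supported release. The `supported_release` function is a hypothetical helper written for this example and is not part of Sidra. Note that SQL Server 2008 R2 builds report a minor version of 50, while plain SQL Server 2008 (10.0.x) is not on the list.

```python
# Major versions from the supported list; 2008 R2 is handled separately
# because it shares major version 10 with the unsupported SQL Server 2008.
RELEASES_BY_MAJOR = {
    11: "SQL Server 2012",
    12: "SQL Server 2014",
    13: "SQL Server 2016",
    14: "SQL Server 2017",
    15: "SQL Server 2019",
}


def supported_release(product_version):
    """Return the release name if the version is on the list, else None."""
    parts = product_version.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    if major == 10:
        return "SQL Server 2008 R2" if minor == 50 else None
    return RELEASES_BY_MAJOR.get(major)


print(supported_release("15.0.2000.5"))
```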

The SQL Server connector supports the same modes of data synchronization as the Azure SQL Database connector.

You can see more details about how this connector works and a setup guide in the documentation.

Provider Import/Export

As part of a continued effort to ease the installation and configuration of Sidra, the Sidra CLI tool can now automatically export data about a Provider (including its Entities and Attributes) from one environment (e.g., dev) and import this data into another environment in the same or a different installation.

The procedure just needs to be run once per destination environment to replicate the configured metadata. From the implementation perspective, the data is copied transparently to an intermediate staging database, and data migrations are applied up or down to set the same migration at the destination.
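
To illustrate what applying migrations "up or down" means, here is a minimal Python sketch. It is not the Sidra CLI implementation: the `plan_migrations` function and the migration names are hypothetical, and the sketch only assumes that migrations form an ordered list and that the staging copy must end at the same migration as the destination.

```python
def plan_migrations(ordered, current, target):
    """List the (direction, migration) steps needed to move from current to target."""
    cur = ordered.index(current)
    tgt = ordered.index(target)
    if tgt > cur:
        # Destination is ahead: apply the missing migrations in order.
        return [("up", name) for name in ordered[cur + 1 : tgt + 1]]
    # Destination is behind (or equal): revert the extra migrations, newest first.
    return [("down", name) for name in reversed(ordered[tgt + 1 : cur + 1])]


# Hypothetical migration history, oldest first.
history = ["m1", "m2", "m3", "m4"]

print(plan_migrations(history, "m2", "m4"))
print(plan_migrations(history, "m4", "m2"))
```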

Data Catalog Attributes view

This feature completes the view of the Sidra metadata from Sidra Web by including a new Attributes detail view for each Entity. After selecting a specific Entity, the user can choose to delve into the whole list of Attributes defined in the metadata system for that Entity (See All Attributes option). This includes the Attributes originating from the source system, as well as the System Attributes, which are the technical Attributes created by Sidra to convey information on how to process a specific Entity.

For each Attribute, the Attributes view shows all relevant fields organized in sub-sections: General, Metadata and Security.

Improved telemetry

This feature allows the support team at Sidra to have access to operational data from Sidra Core installations.

As of Release 2020.R3, each Sidra installation had access to a set of operational Power BI reports providing key insights on data intake figures: volume stored, pipeline executions, etc. With this release, customers opting in to send operational telemetry data will benefit from enhanced monitoring and operability from the Sidra team. The insights gathered from this telemetry will make it possible to react early to possible anomalies and operational errors, as well as to better customize the product to the customer’s needs.

The telemetry job will be transparent to the customer, and gathers data about key intake metrics, errors, warnings, notifications and daily service metrics.

Web UI improvements

This release continues building on the functionalities and improvements of Sidra Web, across different sections:

New Dashboard layout: Sidra’s Dashboard page has been redesigned, with improved utilization of screen real estate for Data Intake graphs and Cluster status. An improved widget has been added for geo-visualization of the different Data Storage Units.

Release notes redesign: The Release notes page in Sidra Web has been redesigned, with the objective of displaying a linear timeline of all the Releases of Sidra.

Improved Data Catalog display for Entity types: The Data Catalog now includes improved support for displaying the format of an Entity. The Entity format (e.g., parquet, csv) is now shown with an icon on the Entity detail page. The list of Entities also includes a column with the Entity format.

Design review and issue fixes in different sections of Sidra Web: A large number of relatively minor aesthetic changes and improvements have also been made over the months since the last release of 2020. We have fixed some layout issues in the Data Catalog and improved the performance of the Logs page. We have also improved the display of tags in the card view of Entities and Providers.

Complete info

Access the complete list of resolved issues and relevant changes in Sidra 2021.R1 here.

We would love to hear from you! You can make a product suggestion, report an issue, ask questions, find answers, and propose new features at info@sidra.dev.

Author

Plain Concepts