Better UX, New data uploader and Auto ML

Added the possibility to filter, compare and indicate missing values in different bricks

Bricks now work even when the data has missing values: we automatically diagnose missing values and handle them according to the function the brick performs.

Improved search through the list of bricks

Bricks are now easier to find because we've added keywords to the search. Enter "cleansing" to find the Missing Values Treatment brick, "scaling" to find the Normalization brick, or "PCA" to find the Dimensionality Reduction brick.

Improved notifications tab

We've moved all notifications to a separate tab in the right sidebar, so they are easier to navigate. Notifications are sorted by severity, with errors always at the top of the list.

Add custom code bricks to your pipeline, set up the arguments, and run pipeline as usual

Improved scene editor

Finally, you can select, copy, and paste multiple bricks on the scene. Select bricks by holding the Shift key on your keyboard or by dragging a frame around them with your mouse.

Data Segmentation Brick

Added out-of-the-box data segmentation, which produces cluster analysis results directly from raw data, with no need to build a data preparation pipeline first. The Data Segmentation brick reproduces the data cleansing → feature engineering → modeling pipeline and returns the data segmentation model together with the data processing scenario, which can be implemented as a Datrics pipeline. The data processing scenario includes data cleansing, encoding, missing values treatment, and feature selection. The prepared features are then used to fit a K-Means clustering model with the optimal number of clusters.

The brick supports simple and advanced modes. Simple mode is fully automated: the optimal number of clusters is detected and feature engineering is performed without the user's involvement. In advanced mode, the user can configure the model's hyperparameters manually.

SSL termination support in databases

You can now use SSL termination when connecting to your databases. Just attach the certificates when you create a new data source connection.

New operations in math formula

New operations have been added to the math formula: you can now construct complex conditions using the AND, OR, and NOT logical operators, and string processing has become more flexible.
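
To illustrate what a compound condition built from AND, OR, and NOT looks like, here is a minimal sketch in plain Python. It is not Datrics formula syntax; the column names (age, country, blocked) and the sample rows are hypothetical and exist only for this example.

```python
# Hypothetical rows; in Datrics these would be dataset columns.
rows = [
    {"age": 25, "country": "US", "blocked": False},
    {"age": 17, "country": "US", "blocked": False},
    {"age": 40, "country": "DE", "blocked": True},
    {"age": 33, "country": "DE", "blocked": False},
]


def is_eligible(row):
    """age >= 18 AND (country is US OR DE) AND NOT blocked."""
    return (
        row["age"] >= 18
        and (row["country"] == "US" or row["country"] == "DE")
        and not row["blocked"]
    )


# One boolean flag per row, as a formula would produce a new column.
flags = [is_eligible(row) for row in rows]
print(flags)
```

The parentheses around the OR clause matter: without them, AND would bind more tightly than OR and the condition would mean something different.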

Binary classification models improvements

Added thresholds to binary classification models. The cutoff point can now be changed from its default value of 0.5. The threshold determines whether an item is assigned to the positive class based on its predicted probability, and it is taken into account when generating all applicable model performance metrics and visualizations.
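
The effect of moving the cutoff can be sketched in a few lines of plain Python. This is an illustration only, not the Datrics implementation: the function names, the made-up scores and labels, and the choice to treat a probability equal to the threshold as positive are all assumptions.

```python
def classify(probabilities, threshold=0.5):
    """Label each predicted probability as positive (1) or negative (0).

    Assumption: a probability equal to the threshold counts as positive.
    """
    return [1 if p >= threshold else 0 for p in probabilities]


def precision_recall(y_true, y_pred):
    """Compute precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up predicted probabilities and true labels, for illustration only.
scores = [0.10, 0.35, 0.45, 0.55, 0.80, 0.95]
labels = [0, 0, 1, 0, 1, 1]

default_pred = classify(scores)        # default cutoff of 0.5
lowered_pred = classify(scores, 0.4)   # lower cutoff: more positives

print(precision_recall(labels, default_pred))
print(precision_recall(labels, lowered_pred))
```

In this toy data, lowering the cutoff from 0.5 to 0.4 recovers the positive item scored 0.45, which is why the metrics need to be recomputed whenever the threshold changes.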

Added Model Scores Distribution dashboard to the Model Performance tab for binary classification models.

Auto ML Bricks: Time Series

Added the Time Series Forecasting brick, which supports stratification and can be used for time series forecasting without complex settings. The brick trains and applies a forecasting model based on the analysis of historical time-series data, with built-in preprocessing capabilities.

The brick runs an analysis pipeline that consists of three stages: time series feature extraction, time series preprocessing, and model fitting and applying. First, we detect time-series features such as trend, seasonality, and data logging frequency, including features that might be considered additional regressors. Next, we preprocess the time-series data: outliers and missing values treatment, denoising, and discretization. Finally, we fit the stratified forecasting model and make the forecast.

The brick has two modes of usage, simple and advanced. In simple mode, data preprocessing and model hyperparameter settings are performed automatically based on the dependencies extracted from the time series; the user only has to define the target and date-time variables. In advanced mode, the user can configure the brick with all the advantages of simple mode but without its limitations: a flexible combination of manual and automatic configuration makes it possible to bring expert knowledge into the time series processing pipeline. The Time Series Forecasting brick is equipped with a Forecasting Dashboard, which provides a detailed description of the time-series processing stages and the forecasting results.
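
As a rough idea of what detecting the data logging frequency and locating missing values can involve, here is a minimal standard-library Python sketch. It is not how the brick is implemented: the function names are hypothetical, and taking the most common gap between consecutive timestamps as the frequency is an assumption made for this example.

```python
from collections import Counter
from datetime import datetime, timedelta


def detect_frequency(timestamps):
    """Return the most common gap between consecutive timestamps."""
    ordered = sorted(timestamps)
    deltas = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return Counter(deltas).most_common(1)[0][0]


def missing_timestamps(timestamps, step):
    """List the points of the regular grid that are absent from the series."""
    present = set(timestamps)
    current, end = min(timestamps), max(timestamps)
    gaps = []
    while current <= end:
        if current not in present:
            gaps.append(current)
        current += step
    return gaps


# Made-up daily series with one day (4 January) missing.
start = datetime(2021, 1, 1)
stamps = [start + timedelta(days=d) for d in (0, 1, 2, 4, 5, 6)]

step = detect_frequency(stamps)
gaps = missing_timestamps(stamps, step)
print(step, gaps)
```

Once the gaps are known, a preprocessing step could fill them (for example by interpolation) before the model is fitted.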

Updated Model Performance Dashboard for Regression and Clusterization

We have improved model interpretability for binary classification by extending the Model Performance dashboard with the Model Score Distribution plot. The new plot depicts the distribution of the output scores per target class, including the probability density function and range- and quantile-based discretization plots, which show the share of each class's items that fall into a specific score range.
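
The range-based view can be approximated with a small plain-Python sketch: split the [0, 1] score range into equal-width bins and compute, per class, the share of items in each bin. The function name, the number of bins, and the sample data are assumptions for illustration; the dashboard's actual binning may differ.

```python
from collections import Counter


def score_shares(scores, labels, n_bins=5):
    """Per class, the share of items in each equal-width score range of [0, 1]."""
    totals = Counter(labels)
    counts = {}
    for score, label in zip(scores, labels):
        # A score of exactly 1.0 is placed in the last bin.
        bin_index = min(int(score * n_bins), n_bins - 1)
        counts.setdefault(label, [0] * n_bins)[bin_index] += 1
    return {
        label: [count / totals[label] for count in bins]
        for label, bins in counts.items()
    }


# Made-up model scores and true classes, for illustration only.
scores = [0.05, 0.15, 0.30, 0.62, 0.70, 0.85, 0.90, 1.00]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

shares = score_shares(scores, labels, n_bins=5)
for label, bins in sorted(shares.items()):
    print(label, bins)
```

A well-separated model concentrates the negative class in the low ranges and the positive class in the high ranges; overlap in the middle ranges shows where the classes are hard to tell apart.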

Pivot Spreadsheet Brick

The Datrics data processing section has been extended with the Pivot Spreadsheet brick, which reorganizes and summarizes the input data as a table of grouped values that aggregates the items of the input dataset by the values of categorical columns.
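
Conceptually, a pivot groups rows by two categorical columns and aggregates a value column within each group. The plain-Python sketch below shows that idea; the function, its parameter names, and the sales data are hypothetical and do not reflect the brick's actual settings.

```python
from collections import defaultdict


def pivot(rows, index, columns, values, agg=sum):
    """Group rows by (index, columns) and aggregate the values in each cell."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row[index], row[columns])].append(row[values])

    table = defaultdict(dict)
    for (row_key, col_key), items in cells.items():
        table[row_key][col_key] = agg(items)
    return {row_key: dict(cols) for row_key, cols in table.items()}


# Made-up input dataset, for illustration only.
sales = [
    {"region": "North", "product": "A", "amount": 10},
    {"region": "North", "product": "B", "amount": 5},
    {"region": "North", "product": "A", "amount": 7},
    {"region": "South", "product": "A", "amount": 3},
]

summary = pivot(sales, index="region", columns="product", values="amount")
print(summary)
```

Passing a different aggregation, such as max or len, changes what each cell of the table reports without changing the grouping.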

New Visual CSV Uploader and Editor

Now you can upload CSV, XLS, and XLSX files with a new intuitive, visual user interface. In the new uploader interface users can:
