Unlock the Power of Custom Features, Cloud Storage Export, JSON Transformation, and More
Datrics has a new update that lets data analysts do more cool stuff with no or low code! You can use non-data inputs/outputs in custom bricks, load Python libraries from Git, export data to AWS S3 and GCS, and transform JSON-like data into a dataframe in one step.
Let's dive deeper.
Low-code data science platform
Datrics gives data engineers, analysts, and scientists new capabilities to extend the no-code platform with custom features. Our objective is to equip data scientists with tools that let them apply their expertise to intricate data analytics and machine learning tasks without constraints imposed by the platform. Let's go through the new features and how they may help your team.
Reusable custom bricks with custom types
We are adding the capability to use non-data (custom) inputs/outputs in custom bricks. This seemingly small feature makes it possible to create custom bricks for specific features and tasks and reuse those bricks across multiple steps. Let's look at an example.
The goal is to add an SVM classifier to Datrics and perform predictions with the model. With the standard approach to custom bricks, a data scientist would pack all the functionality into one brick: importing the model, training, and predicting. This limits the possibilities to reuse the brick in different pipelines and use cases.
It is more efficient to create two bricks: one to train the model and another to perform predictions. This way each brick solves one particular problem and can be used separately in a variety of pipelines. In this example, it becomes easier to train the model on one data set and test the quality of predictions on a different one.
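The split can be sketched with two plain Python functions standing in for the two bricks. The function names and signatures below are illustrative assumptions, not the actual Datrics custom brick API; the point is that the trained model travels between bricks as a non-data output/input.

```python
from sklearn.svm import SVC


def train_brick(df, target_column):
    """Training brick (hypothetical): fits an SVM on the input dataframe
    and emits the fitted model as a custom (non-data) output."""
    X = df.drop(columns=[target_column])
    y = df[target_column]
    model = SVC()
    model.fit(X, y)
    return model  # non-data output consumed by the prediction brick


def predict_brick(df, model):
    """Prediction brick (hypothetical): takes the model as a custom input
    and returns the dataframe with an appended prediction column."""
    out = df.copy()
    out["prediction"] = model.predict(df)
    return out
```

Because the model is a first-class output, the prediction brick can be fed a holdout dataframe from a completely different branch of the pipeline.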
Moreover, creating separate custom bricks for each task expands the pool of features data analysts and data scientists can use in their ETL pipelines or data experiments.
Load Python libraries from Git
In one of the previous Datrics product updates, we introduced the capability to load external Python libraries from PyPI. It lifted the constraint of a no-code platform limited to pre-installed Python libraries, giving users the ability to load supplementary resources.
We are going one step further and adding the capability to load libraries from Git. It works similarly to loading resources from PyPI:
Go to the Libraries tab in the Custom brick editor.
Select git as the library source, set the name, and provide the URL.
Press Add. We will attempt to access and install the provided resource.
Once installed, the library may be used in the custom brick to create cool features in Datrics.
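Inside the brick's code, a library installed from Git behaves like any other installed package. One small defensive pattern (a sketch, not part of the Datrics API) is to check for the distribution before importing it, so a brick fails with a clear message if the install step was skipped:

```python
import importlib.metadata


def library_available(name: str) -> bool:
    """Return True if a distribution with this name is installed.
    Works for packages installed from Git, PyPI, or anywhere else."""
    try:
        importlib.metadata.version(name)
        return True
    except importlib.metadata.PackageNotFoundError:
        return False
```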
Export data to AWS S3 and Google Cloud Storage
A new export connector for AWS S3 and Google Cloud Storage is coming to Datrics. The connector supports export to the CSV and Parquet file formats. We also provide the option to export the dataset into multiple Parquet files, using the data in the partition columns as the key.
Here is how to set up the export brick:
Create a data source for your object storage in the Datasets tab.
In the pipeline scene, add the Export Data brick and select the data source.
Define the file format: CSV or Parquet.
Define the target file path:
For CSV, set the full path to the file.
For Parquet, there are two options: a static or a dynamic path. A static path is the full path to the file. A dynamic path consists of a prefix (the static part of the path) and partition columns (the key defining how to split the dataset into files). As a result, folders with files will be created: prefix/partition=value/uuid.parquet
Parse JSON brick
Parse JSON is a new brick that transforms JSON into a dataframe. Three JSON types are supported: default, split, and index.
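The split and index types appear to mirror pandas' JSON orientations (that mapping is an assumption on our part); a quick illustration of what each shape looks like:

```python
import io

import pandas as pd

# "split" orientation: separate lists for columns, index, and data
split_json = '{"columns": ["a", "b"], "index": [0, 1], "data": [[1, 2], [3, 4]]}'
df_split = pd.read_json(io.StringIO(split_json), orient="split")

# "index" orientation: {index -> {column -> value}}
index_json = '{"0": {"a": 1, "b": 2}, "1": {"a": 3, "b": 4}}'
df_index = pd.read_json(io.StringIO(index_json), orient="index")
```

Both snippets produce the same two-row, two-column dataframe; only the JSON layout differs.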
Confidence intervals in the Prophet model brick
There is an important update to the Prophet time series forecasting model in the Datrics no-code platform. We are adding confidence intervals for the prediction to the output dataframe of the brick. The confidence interval will also be displayed in the model performance dashboard. This gives more options for working with the predictions.
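In Prophet itself, the forecast frame carries `yhat` together with the `yhat_lower` and `yhat_upper` interval bounds. A sketch of consuming those columns downstream of the brick (the frame below is mocked for illustration, not a real forecast):

```python
import pandas as pd

# Mocked forecast output with the interval columns Prophet produces.
forecast = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=3, freq="D"),
    "yhat": [10.0, 11.0, 12.0],
    "yhat_lower": [8.0, 9.5, 10.0],
    "yhat_upper": [12.0, 12.5, 14.0],
})

# Width of the confidence interval per forecast step: a simple way
# to flag periods where the prediction is least certain.
forecast["ci_width"] = forecast["yhat_upper"] - forecast["yhat_lower"]
```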