Better design, deployment updates, and new XGBoost brick
Added the ability to update a deployment without changing the link
You can now update a deployment without changing its API URL. Users can create an immutable deployment API URL, and this URL can then be connected to any deployed pipeline.
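
The idea behind a fixed API URL can be illustrated with a small indirection table. This is only a conceptual sketch: the `DeploymentRouter` class, the pipeline identifiers, and the placeholder URL are hypothetical and do not describe the actual Datrics implementation.

```python
from typing import Dict


class DeploymentRouter:
    """Hypothetical registry: a fixed API URL points at whichever pipeline is attached."""

    def __init__(self) -> None:
        self._targets: Dict[str, str] = {}

    def attach(self, api_url: str, pipeline_id: str) -> None:
        # Re-attaching an existing URL replaces the target; the URL itself never changes.
        self._targets[api_url] = pipeline_id

    def resolve(self, api_url: str) -> str:
        return self._targets[api_url]


router = DeploymentRouter()
url = "https://example.invalid/api/churn-score"  # placeholder, not a real endpoint
router.attach(url, "pipeline-v1")
router.attach(url, "pipeline-v2")  # update the deployment behind the same URL
current = router.resolve(url)
print(current)
```

Clients keep calling the same URL, while the pipeline behind it can be swapped at any time.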
Added the interpolation method for MVT
We have extended the set of methods available for numerical data imputation by adding linear interpolation to fill missing values, which also extends our time series processing capabilities. This approach assumes a linear relationship between data points and uses the neighboring non-missing values to compute a value for each missing data point.
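
As a rough illustration of linear interpolation, the sketch below fills interior gaps on a straight line between the nearest known neighbors. It assumes evenly spaced observations and leaves leading or trailing gaps untouched; the function name `interpolate_linear` is made up for this example and is not the brick's API.

```python
from typing import List, Optional


def interpolate_linear(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior None gaps on a straight line between the nearest known values."""
    result = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    # Walk over each pair of consecutive known points and fill the gap between them.
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = values[left] + step * (i - left)
    return result


series = [10.0, None, None, 16.0, None, 20.0]
filled = interpolate_linear(series)
print(filled)  # [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
```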
Updated design of the header
The header was improved for better usability, so now you can easily switch between projects, models, and pipelines.
Added an XGBoost brick
We have added a new class of models: XGBoost, an ensemble learning algorithm based on gradient boosting of decision trees that uses a second-order approximation of the objective (loss) function. XGBoost often trains faster than many other machine learning models (especially other ensemble methods) and works well on categorical data and limited datasets.
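
To make the "second-order approximation" more concrete, here is a minimal sketch of how the first and second derivatives of the log loss determine the weight of a single tree leaf, using the standard formula w* = -G / (H + lambda). It is a simplified illustration with a hypothetical helper `optimal_leaf_weight`, not the code that runs inside the brick or the XGBoost library.

```python
import math
from typing import List, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def logistic_grad_hess(y: float, raw_score: float) -> Tuple[float, float]:
    """First and second derivative of the log loss with respect to the raw score."""
    p = sigmoid(raw_score)
    return p - y, p * (1.0 - p)


def optimal_leaf_weight(labels: List[float], raw_scores: List[float],
                        reg_lambda: float = 1.0) -> float:
    """Leaf weight minimising the second-order approximation: w* = -G / (H + lambda)."""
    pairs = [logistic_grad_hess(y, s) for y, s in zip(labels, raw_scores)]
    grad_sum = sum(g for g, _ in pairs)  # G: sum of gradients in the leaf
    hess_sum = sum(h for _, h in pairs)  # H: sum of hessians in the leaf
    return -grad_sum / (hess_sum + reg_lambda)


labels = [1.0, 1.0, 0.0]
raw_scores = [0.0, 0.0, 0.0]  # current model output before this tree is added
weight = optimal_leaf_weight(labels, raw_scores)
print(weight)
```

Because two of the three samples in the leaf are positive, the resulting weight is positive and pushes the raw score for these samples upward.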
Updated the Parse List and Parse JSON bricks to new versions
The names of the columns with unparsed data were changed to be more intuitive. If a column contains unparsed user-specified tags or data in an incorrect JSON / list format, the column names now point directly to that data. Also, these columns are created only when such data is present, i.e. there will be no cases where a whole column contains only NaNs. These changes were released as a new brick version, while the old one was deprecated. To get the recent changes, you should replace the existing bricks.
ClickHouse is an open-source, high-performance columnar OLAP database management system for real-time analytics using SQL. ClickHouse is a widely used analytics database, and you can now use it with no-code analytics in Datrics.
In this version, we rebuilt pipeline deployments from scratch, introducing a new fault-tolerant and reliable architecture. Enterprise users can now run pipelines in production in a separate computation cluster, which gives better performance and stability.
Freezing in Parse List and Parse JSON
The functionality for working with JSONs and lists now has a freezing option. This lets you train the pipeline to produce a fixed number and set of feature names, even when new data differs from the previous dataset. Working with deployed pipelines that contain these bricks has become easier.
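
The freezing behavior can be sketched roughly as follows: the parser remembers the keys seen during training and always emits exactly those columns afterwards. The `FrozenJsonParser` class is a hypothetical stand-in used only for illustration, and it assumes every row is a JSON object; it is not the brick's real interface.

```python
import json
from typing import Dict, List, Optional


class FrozenJsonParser:
    """Hypothetical helper: learns column names once and keeps the output schema fixed."""

    def __init__(self) -> None:
        self.columns: Optional[List[str]] = None

    def fit(self, rows: List[str]) -> "FrozenJsonParser":
        keys = set()
        for raw in rows:
            keys.update(json.loads(raw).keys())
        self.columns = sorted(keys)  # frozen set of feature names
        return self

    def transform(self, rows: List[str]) -> List[Dict[str, object]]:
        if self.columns is None:
            raise RuntimeError("call fit() before transform()")
        parsed = [json.loads(raw) for raw in rows]
        # Unknown keys are dropped and missing keys become None, so the schema never changes.
        return [{col: rec.get(col) for col in self.columns} for rec in parsed]


parser = FrozenJsonParser().fit(['{"age": 31, "city": "Kyiv"}', '{"age": 45}'])
new_rows = parser.transform(['{"city": "Lviv", "plan": "pro"}'])
print(new_rows)  # [{'age': None, 'city': 'Lviv'}]
```

The new key `plan` is ignored and the missing `age` is filled with `None`, so downstream bricks in a deployed pipeline always receive the same columns.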