Forecast Architecture

This article describes the architecture of the forecasting platform, as shown in the diagram below.

The Umbrella Cost forecasting mechanism starts from all the data of the accounts onboarded to the platform.

The platform then automatically preprocesses the data: it cleans it, removes anomalies, extracts features, and prepares it for the Forecast model. To maximize accuracy, the platform trains a large number of machine learning models of various types and hyperparameter settings, such as XGB, linear models, and Prophet. It then evaluates the performance of the models and ensembles them to produce the most accurate forecast for each cost metric.
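The evaluate-and-ensemble step can be illustrated with a minimal sketch. It is not the platform's implementation: the three toy forecasters below are stand-ins for the real model families (XGB, linear models, Prophet), and the holdout size, the MAPE metric, and the inverse-error weighting are all assumptions chosen for illustration.

```python
from statistics import mean


def naive_last(history, horizon):
    """Toy model: repeat the last observed value."""
    return [history[-1]] * horizon


def moving_average(history, horizon, window=3):
    """Toy model: repeat the mean of the last `window` values."""
    return [mean(history[-window:])] * horizon


def linear_trend(history, horizon):
    """Toy model: least-squares line fitted to the history, then extrapolated."""
    n = len(history)
    x_mean = (n - 1) / 2
    y_mean = mean(history)
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(history))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n + h) for h in range(horizon)]


# Hypothetical model registry; the real platform trains far more candidates.
MODELS = {
    "naive_last": naive_last,
    "moving_average": moving_average,
    "linear_trend": linear_trend,
}


def mape(actual, predicted):
    """Mean absolute percentage error (assumes no zero actuals)."""
    return mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))


def ensemble_forecast(series, horizon, holdout=3, eps=1e-9):
    """Score each model on a holdout window, then blend by inverse error."""
    train, valid = series[:-holdout], series[-holdout:]
    errors = {name: mape(valid, fn(train, holdout)) for name, fn in MODELS.items()}

    # Lower holdout error -> larger weight; eps avoids division by zero.
    inverse = {name: 1.0 / (err + eps) for name, err in errors.items()}
    total = sum(inverse.values())
    weights = {name: value / total for name, value in inverse.items()}

    # Refit on the full series and combine the per-model forecasts.
    forecasts = {name: fn(series, horizon) for name, fn in MODELS.items()}
    combined = [
        sum(weights[name] * forecasts[name][h] for name in MODELS)
        for h in range(horizon)
    ]
    return combined, weights, errors
```

Calling `ensemble_forecast(daily_costs, horizon=7)` on one cost metric's history returns the blended forecast together with the per-model weights and holdout errors, so the contribution of each model stays inspectable.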

The platform generates forecasts continuously and provides several ways to consume them. In Umbrella Cost, the forecasts are retrieved every day through the Forecast API, and their accuracy is reviewed automatically. Models are retrained periodically to adapt them to the latest data.
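As an illustration of what an automatic accuracy review could look like, the sketch below compares previously issued forecasts with the actual costs that later arrived and flags the model for retraining when the error is too high. The function name, the 10% threshold, and the use of MAPE are assumptions; retrieving forecasts through the Forecast API is out of scope here, so both inputs are plain dictionaries keyed by date.

```python
from datetime import date


def review_accuracy(forecasts, actuals, max_mape=0.10):
    """Compare forecasts with actuals on the dates both contain.

    `forecasts` and `actuals` map a date to a cost value. `max_mape` is a
    hypothetical threshold above which retraining is suggested.
    """
    # Skip dates without an actual value, and zero actuals (MAPE is undefined).
    shared = sorted(d for d in forecasts if d in actuals and actuals[d] != 0)
    if not shared:
        return {"days": 0, "mape": None, "needs_retrain": False}

    errors = [abs(actuals[d] - forecasts[d]) / abs(actuals[d]) for d in shared]
    score = sum(errors) / len(errors)
    return {"days": len(shared), "mape": score, "needs_retrain": score > max_mape}
```

A daily job could run this check after the new actuals land and use `needs_retrain` to trigger an early retraining in addition to the periodic schedule.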

The Forecast architecture can run at large scale, mainly by using an orchestration engine to schedule and run the pipelines, and by distributing the jobs to run in parallel on AWS Batch.
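The fan-out pattern behind this can be sketched locally. The example below uses a thread pool from the Python standard library as a stand-in for AWS Batch and does not call any AWS API; the job shape (account, metric, series) and the `train_job` worker are hypothetical. The point it shows is that each job is independent, so one failure does not block the others.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def train_job(job):
    """Hypothetical unit of work: one (account, metric) pair.

    A real job would preprocess, train, and forecast; here it only
    averages the series so the example stays self-contained.
    """
    account, metric, series = job
    return (account, metric), sum(series) / len(series)


def run_in_parallel(jobs, worker=train_job, max_workers=4):
    """Run independent jobs concurrently and collect results and failures."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, job): job for job in jobs}
        for future in as_completed(futures):
            job = futures[future]
            try:
                key, value = future.result()
                results[key] = value
            except Exception as exc:
                # Record the failure and keep processing the remaining jobs.
                failures[(job[0], job[1])] = repr(exc)
    return results, failures
```

In a production setup, the orchestration engine would submit each job to the batch service instead of a local pool, but the contract is the same: independent inputs, isolated failures, and results keyed by account and metric.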