Data Science

Econometric Forecasting

Using regression and classification techniques to forecast macroeconomic indicators

ForecastingMachine LearningClassificationRegressionPythonScikit-learnPandasMatplotlibEconometrics

Problem Statement

An Indian entrepreneur wanted to develop machine learning models for the forecasting of certain macroeconomic indicators. The models were required to forecast the indicators 3 months, 6 months and 12 months into the future. Further, separate models were required for positive value prediction and negative value prediction.

Challenges

  • The definitions of the indicators and the definitions of fields that were standardized indices were masked for data privacy reasons, and thus it was possible to analyze the results only from a mathematical standpoint
  • A total of 4 indicators each with separate models for positive and negative prediction further split by time horizon of prediction resulted in large number of models which were difficult to maintain

Solution

Summary

I developed supervised classification models to predict whether an indicator will be positive or negative in the future. For each scenario, I developed regression models to estimate the most likely value. Using a combination of both results, a final prediction was made for each indicator and time horizon.

Approach

  • Initial experiments resulted in high errors with multivariate linear regression models. So, a better approach was needed.
  • I designed a hybrid classification-regression approach where it is first determined whether the indicator is likely to be positive or negative and then the most likely value is estimated using regression
  • For classification, logistic regression and random forest were evaluated on ROC-AUC, Precision, Recall and F1-Score. Random Forest was selected.
  • For regression, linear regression and random forest regressor were evaluated on MAPE and MAE. Random Forest Regressor was selected.
  • Models along with scaling functions were stored as pickle files and a python tool was developed to facilitate predictions

Deliverables

  • Best models after hyper-parameter tuning
  • Jupyter notebook reports on the modeling exercise
  • Tool to facilitate predictions
Sample Classification Results

Sample Classification Results

Sample Regression Results

Sample Regression Results

Final Prediction

Final Prediction

Have a similar challenge?

Let's discuss how we can develop solutions for your specific use case.

More Case Studies