Dengue — Machine learning prediction models using time-series weather data.

INTRODUCTION

What is Dengue?

GETTING THE DATA

The data was provided by the DrivenData competition as CSV files, drawing on two sources:

  • The National Oceanic and Atmospheric Administration (NOAA), part of the U.S. Department of Commerce.

Our goal

Our goal was to build a machine learning model that accurately predicts the number of weekly Dengue cases at two locations: San Juan, Puerto Rico, and Iquitos, Peru.

Visualise the weekly reported cases in San Juan

San Juan Dengue Fever Total cases per week
  • The time series shows no apparent trend: there is no long-run upward or downward direction in the series.

PREPROCESSING

The data in this project is a time series. Time-series data requires particular care in pre-processing for machine learning, and missing values have to be handled before fitting any predictive model. Here is a view of the variables in our data set and their missing values.
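As a rough illustration of careful missing-value handling in a time series, here is a minimal sketch using pandas. The column names and values are made up for the example, and forward fill is only one possible strategy, not necessarily the one used in this project.

```python
import numpy as np
import pandas as pd

# Hypothetical weekly weather readings with gaps; the column names and
# values are illustrative, not taken from the competition data.
weeks = pd.date_range("2000-01-02", periods=6, freq="W")
df = pd.DataFrame(
    {
        "avg_temp_c": [26.1, np.nan, 26.9, 27.4, np.nan, 27.0],
        "precip_mm": [12.0, 30.5, np.nan, np.nan, 8.2, 15.1],
    },
    index=weeks,
)

# Count missing values per variable before deciding how to handle them.
missing_before = df.isna().sum()

# Forward fill carries the last observed value forward in time, so no
# future information leaks into earlier rows of the series.
filled = df.ffill()

missing_after = filled.isna().sum()
print(missing_before)
print(missing_after)
```

Forward fill respects the ordering of the weeks, which is why it is often preferred over filling with a global mean when the rows are consecutive observations.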

Distribution of Data before and after Standardization
Correlation between our target variable and the others
  • Average and minimum temperature are also strongly correlated with the target variable.
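The two steps captioned above, standardization and correlation with the target, can be sketched as follows. The data here is synthetic and the relationship between temperature and cases is built in by hand purely for demonstration; the column names are assumptions and the numbers say nothing about the real data set.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption for this sketch only).
rng = np.random.default_rng(0)
n = 200
min_temp = rng.normal(22.0, 1.5, n)
avg_temp = min_temp + rng.normal(4.0, 0.5, n)
humidity = rng.normal(75.0, 5.0, n)
total_cases = np.clip(3.0 * (min_temp - 18.0) + rng.normal(0.0, 2.0, n), 0, None)

features = pd.DataFrame(
    {"min_temp_c": min_temp, "avg_temp_c": avg_temp, "humidity_pct": humidity}
)

# Standardization: rescale each feature to zero mean and unit variance.
scaler = StandardScaler()
scaled = pd.DataFrame(scaler.fit_transform(features), columns=features.columns)

# Pearson correlation between each feature and the target variable.
data = scaled.assign(total_cases=total_cases)
corr = data.corr()["total_cases"].drop("total_cases")
print(corr.sort_values(ascending=False))
```

Pearson correlation is unchanged by standardization, so the ranking of features is the same before and after scaling; scaling matters mainly for models that are sensitive to feature magnitude.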

BUILDING RANDOM FOREST MODEL

After all the data preparation work, we used the Scikit-Learn library to construct our Random Forest model.
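For readers who want to see what such a model looks like in code, here is a minimal sketch with Scikit-Learn's `RandomForestRegressor`. The features and target are synthetic, the hyperparameters are arbitrary, and the chronological 80/20 split and the mean absolute error metric are assumptions of this example rather than details confirmed by the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

# Synthetic weekly data: 4 made-up weather features and a case count
# that depends on the first one (for illustration only).
rng = np.random.default_rng(42)
n_weeks = 300
X = rng.normal(size=(n_weeks, 4))
y = np.clip(10 + 5 * X[:, 0] + rng.normal(0, 1, n_weeks), 0, None)

# Chronological split: train on the earliest 80% of weeks and evaluate
# on the most recent 20%, so the model never sees the future.
split = int(n_weeks * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
mae = mean_absolute_error(y_test, predictions)
print(f"Mean absolute error on held-out weeks: {mae:.2f}")
```

Splitting by time rather than shuffling is a common choice for time-series data because a random split would let the model train on weeks that come after the ones it is tested on.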

Predictions vs Actual total cases

BUILDING RNN LSTM MODEL

We used the Keras and Tensorflow libraries to construct our model.
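Recurrent layers in Keras take 3-D input of shape (samples, timesteps, features), so a weekly table usually has to be cut into overlapping windows first. The sketch below shows one way to do that with NumPy alone; the helper name `make_windows` and the window length of 3 are hypothetical choices for this example, not details from the project.

```python
import numpy as np

def make_windows(features, targets, timesteps):
    """Turn a 2-D weekly feature table into overlapping 3-D windows.

    Each sample holds the `timesteps` weeks that precede a target week,
    and the label is the target value of that week.
    """
    features = np.asarray(features, dtype=float)
    targets = np.asarray(targets, dtype=float)
    X, y = [], []
    for end in range(timesteps, len(features)):
        X.append(features[end - timesteps:end])  # the preceding weeks
        y.append(targets[end])                   # the week to predict
    return np.array(X), np.array(y)

# Toy data: 10 weeks, 2 features per week.
features = np.arange(20, dtype=float).reshape(10, 2)
targets = np.arange(10, dtype=float)

X, y = make_windows(features, targets, timesteps=3)
print(X.shape, y.shape)  # (7, 3, 2) (7,)
```

An array shaped like `X` can then be passed to a Keras model whose first layer is an LSTM expecting 3 timesteps and 2 features per step.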

MAKING PREDICTIONS

Here’s a chart comparing the predicted and actual number of cases:

Predictions vs Actual cases with RNN model

CONCLUSION

With these graphs, we have completed an end-to-end machine learning prediction project! To improve our model, we could try different hyperparameters (settings), test other algorithms, or, best of all, gather more data!

Hub HETIC - Innovation pole

Introduce students to innovation through impact projects allowing them to learn by doing : Data Science, Machine Learning, Deep Learning, Compute