Dengue — Machine learning prediction models using time-series weather data.


What is Dengue?


The data was provided by the DrivenData competition as CSV files, drawn from two sources:

  • National Oceanic and Atmospheric Administration (NOAA) in the U.S. Department of Commerce.

Our goal

Our goal was to create a machine learning model that accurately predicts the number of weekly dengue cases at two locations: San Juan in Puerto Rico, and Iquitos in Peru.

Visualise the weekly reported cases in San Juan

San Juan Dengue Fever Total cases per week
  • The time series does not appear to have a trend: there is no long-run upward or downward direction in the series.
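
One way to back up a visual "no trend" reading is to smooth out the yearly cycle and fit a straight line through the series. The sketch below does both on synthetic weekly counts; the data, the 52-week window, and all variable names are assumptions made for illustration, not the competition data or the exact check used in this project.

```python
import numpy as np
import pandas as pd

# Synthetic weekly counts standing in for the San Juan series (NOT the real data):
# a yearly seasonal cycle plus Poisson noise, with no built-in trend.
rng = np.random.default_rng(0)
n_weeks = 208
weeks = pd.date_range("2000-01-02", periods=n_weeks, freq="W")
seasonal = 30 + 20 * np.sin(2 * np.pi * np.arange(n_weeks) / 52)
total_cases = pd.Series(rng.poisson(seasonal), index=weeks, name="total_cases")

# A 52-week rolling mean averages over one full seasonal cycle,
# so what remains is the slow-moving level of the series.
rolling_mean = total_cases.rolling(window=52).mean()

# Slope of a least-squares line through the series, in cases per week.
slope, intercept = np.polyfit(np.arange(n_weeks), total_cases.to_numpy(), deg=1)
print(f"fitted slope: {slope:.3f} cases per week")
```

A slope close to zero and a roughly flat rolling mean are consistent with a series that has seasonality but no long-run trend.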


The data in this project is a time series. Time-series data needs specific pre-processing before it can be used for machine learning, and missing values have to be handled carefully before fitting any predictive model. Here is a view of the variables in our data set and their missing values.
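
As a rough illustration of this step, the sketch below counts missing values per variable and fills them by carrying the last observed week forward. The column names and values are made up, and forward fill is only one reasonable choice for time-ordered data; the article does not state which imputation method was actually used.

```python
import numpy as np
import pandas as pd

# Tiny synthetic frame; the column names are illustrative, not the real schema.
df = pd.DataFrame(
    {
        "avg_temp_c": [26.1, np.nan, 26.8, 27.0, np.nan, 27.4],
        "precip_mm": [12.0, 0.0, np.nan, np.nan, 30.5, 4.2],
    },
    index=pd.date_range("2000-01-02", periods=6, freq="W"),
)

# Number of missing values per variable.
missing_counts = df.isna().sum()
print(missing_counts)

# Forward fill copies the last observed week into each gap, so no future
# value leaks into the past. Rows must be in chronological order first.
filled = df.sort_index().ffill()
```

Forward fill leaves a gap untouched if the very first rows are missing, so it is worth checking `filled.isna().sum()` afterwards.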

Distribution of Data before and after Standardization
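
Standardization rescales each variable to zero mean and unit variance. The sketch below shows one way to do it with scikit-learn's `StandardScaler` on synthetic features. Fitting the scaler on the earlier (training) weeks only is an assumption about good practice here, not something the article confirms.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for two weather variables (e.g. temperature, humidity).
rng = np.random.default_rng(0)
X = rng.normal(loc=[27.0, 80.0], scale=[1.5, 10.0], size=(100, 2))

# Keep chronological order: the first 80 weeks train, the last 20 test.
X_train, X_test = X[:80], X[80:]

# Fit on the training weeks only, then apply the same shift and scale to
# the test weeks so no test-set statistics leak into training.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)

print(X_train_std.mean(axis=0), X_train_std.std(axis=0))
```
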
Correlation between our target variable and the others
  • Average and minimum temperature are also strongly correlated with the target variable.
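
A correlation ranking like this can be computed directly with pandas. The sketch below uses synthetic data that is deliberately constructed so that temperature drives the target, so the numbers it prints say nothing about real dengue data. The column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic data, built so that minimum temperature drives the target.
rng = np.random.default_rng(0)
n = 200
min_temp = rng.normal(22, 1.5, n)
df = pd.DataFrame(
    {
        "min_temp_c": min_temp,
        "avg_temp_c": min_temp + rng.normal(3, 0.5, n),
        "precip_mm": rng.gamma(2.0, 10.0, n),
        "total_cases": 5 * min_temp + rng.normal(0, 5, n),
    }
)

# Pearson correlation of every variable with the target, strongest first.
corr_with_target = (
    df.corr()["total_cases"].drop("total_cases").sort_values(ascending=False)
)
print(corr_with_target)
```

Pearson correlation only captures linear relationships, so a low value does not rule out a variable that matters with a lag or in a non-linear way.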


After all the data preparation work, we used the scikit-learn library to build our Random Forest model.
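
A minimal sketch of this step, assuming a `RandomForestRegressor`, a chronological 80/20 split, and mean absolute error as the metric. The data is synthetic and the hyperparameters are placeholders; the article does not give the settings that were actually used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

# Synthetic stand-in: 300 weeks, 4 weather features, non-negative case counts.
rng = np.random.default_rng(0)
n_weeks = 300
X = rng.normal(size=(n_weeks, 4))
y = np.maximum(0, 20 + 8 * X[:, 0] + rng.normal(0, 2, n_weeks))

# Chronological split: train on the first 80% of weeks, test on the rest.
# Shuffling would let the model see the future of the series.
split = int(n_weeks * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Compare against a naive baseline that always predicts the training mean.
mae = mean_absolute_error(y_test, y_pred)
baseline_mae = mean_absolute_error(y_test, np.full_like(y_test, y_train.mean()))
print(f"model MAE: {mae:.2f}, baseline MAE: {baseline_mae:.2f}")
```

Comparing against a simple baseline makes it easier to judge whether the model has learned anything beyond the average case count.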

Predictions vs Actual total cases

Building RNN LSTM Model

We used the Keras and TensorFlow libraries to build our model.
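
Keras LSTM layers expect 3D input of shape (samples, timesteps, features), so a weekly table has to be cut into overlapping windows first. The sketch below shows that reshaping with NumPy only. The helper `make_windows`, the window length of 4 weeks, and the choice to predict the week right after each window are all assumptions made for illustration.

```python
import numpy as np

def make_windows(features, target, n_steps):
    """Slice a (n_weeks, n_features) array into overlapping windows.

    Returns X with shape (n_samples, n_steps, n_features) and y with shape
    (n_samples,), where y[i] is the target for the week right after window i.
    """
    X, y = [], []
    for start in range(len(features) - n_steps):
        end = start + n_steps
        X.append(features[start:end])  # weeks start .. end-1
        y.append(target[end])          # the following week
    return np.array(X), np.array(y)

# Toy input: 10 weeks, 2 features per week.
features = np.arange(20, dtype=float).reshape(10, 2)
target = np.arange(10, dtype=float)

X, y = make_windows(features, target, n_steps=4)
print(X.shape, y.shape)  # (6, 4, 2) (6,)
```

The resulting `X` can be passed to a recurrent model whose input shape is `(n_steps, n_features)`.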


Here is a chart comparing the predicted and actual number of cases:

Predictions vs Actual cases with RNN model


With these graphs, we have completed an end-to-end machine learning prediction project! To improve our model, we could try different hyperparameters (settings), test other algorithms, or, best of all, gather more data!
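
Trying different hyperparameters can be automated with a grid search. The sketch below assumes a Random Forest, a small illustrative grid, and scikit-learn's `TimeSeriesSplit` so that each validation fold comes after its training fold in time. The data is synthetic and the grid values are placeholders, not recommended settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Synthetic stand-in for chronologically ordered weekly data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 20 + 8 * X[:, 0] + rng.normal(0, 2, 200)

# Small illustrative grid: 2 x 2 = 4 candidate settings.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=TimeSeriesSplit(n_splits=4),      # validation folds always follow training folds
    scoring="neg_mean_absolute_error",   # higher (closer to 0) is better
)
search.fit(X, y)
print(search.best_params_, -search.best_score_)
```

Ordinary shuffled k-fold cross-validation would mix past and future weeks, which tends to give overly optimistic scores on time-series data.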



Hub HETIC - Innovation pole