Is SLEEP really for the Weak?

How WELL do you really SLEEP?

Sleep quality and quantity are the strongest lifestyle predictors of depression: people who had better sleep quality, and those who slept within the range of eight to 12 hours per night, reported fewer symptoms of depression. Sleep quality has “significantly outranked” other health behaviors linked to mental health and well-being. Thus, the goal here is to build a web application that helps people determine the quality of their sleep. The full code of is available in

To get good quality sleep, one of the top factors is sticking to one sleep schedule: keep the same bedtime and wake-up time, even on weekends. This keeps your circadian rhythm consistent and makes it easier to fall asleep. In short, keep the length of your sleep consistent as well.

Data Collection

I collected the data from Kaggle. The dataset has the following attributes:

#  Column            Non-Null Count  Dtype
0  Start             887 non-null    object
1  End               887 non-null    object
2  Sleep quality     887 non-null    object
3  Time in bed       887 non-null    object
4  Wake up           246 non-null    object
5  Sleep Notes       652 non-null    object
6  Heart rate        162 non-null    float64
7  Activity (steps)

Data Preparation and Exploration

Here is a snapshot of the data header.

Attributes Correlation

This heatmap shows the correlations between the dataset attributes, that is, how strongly each pair of attributes moves together. Apart from its trivial correlation with itself, “Sleep quality” is most strongly correlated with “Time in bed”. The sleep start time is correlated more strongly with the end time than with the other attributes.
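As a rough sketch of how such a correlation matrix can be computed with pandas, assuming the columns have already been converted to numbers. The values below are made up for illustration and are not taken from the dataset:

```python
import pandas as pd

# Toy numeric data (hypothetical values, not rows from the real dataset).
toy = pd.DataFrame({
    "Sleep quality": [100, 62, 81, 55, 90],
    "Time in bed": [30720, 21900, 27900, 19800, 29400],
    "Heart rate": [59, 72, 65, 70, 60],
})

# Pairwise Pearson correlation between all numeric columns.
corr = toy.corr()

# Rank the other attributes by the strength of their correlation
# with "Sleep quality", ignoring its correlation with itself.
ranked = (
    corr["Sleep quality"]
    .drop("Sleep quality")
    .abs()
    .sort_values(ascending=False)
)
print(corr.round(2))
print(ranked.round(2))
```

A matrix like `corr` is what a heatmap visualizes; a plotting function such as seaborn's `heatmap` can draw it directly.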

I cleaned up this dataset and feature-engineered some columns to make it easier to work with. This is the result:
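To give an idea of what this kind of feature engineering can look like, here is a minimal sketch. It assumes that “Time in bed” is stored as an "H:MM" string and “Sleep quality” as a percentage string such as "85%"; the helper names and the toy rows are hypothetical, and the actual cleaning steps may have differed:

```python
import pandas as pd

def time_in_bed_to_seconds(value: str) -> int:
    """Convert an 'H:MM' string such as '8:32' to a number of seconds."""
    hours, minutes = value.split(":")
    return int(hours) * 3600 + int(minutes) * 60

def quality_to_number(value: str) -> float:
    """Convert a percentage string such as '85%' to the number 85.0."""
    return float(value.strip().rstrip("%"))

# Toy rows in the assumed raw format (not taken from the real dataset).
raw = pd.DataFrame({
    "Time in bed": ["8:32", "6:05", "7:45"],
    "Sleep quality": ["100%", "62%", "81%"],
})

# Build numeric columns that a model can consume.
clean = pd.DataFrame({
    "Time in Bed in Seconds": raw["Time in bed"].map(time_in_bed_to_seconds),
    "TheTarget": raw["Sleep quality"].map(quality_to_number),
})
print(clean)
```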

Data Modeling

Let’s create the machine learning model. We are trying to predict sleep quality. We will use the ‘TheTarget’ column as the target, and all the other columns as features for the model.

– Data Splitting

We will divide the data into a training set and test set. 80% of the data will be for training and 20% for testing.
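An 80/20 split like this can be sketched with scikit-learn's `train_test_split`. The data here is randomly generated as a stand-in, and the `random_state` value is an arbitrary choice for reproducibility:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 100 rows, 3 features, and a target between 0 and 100.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = rng.uniform(0, 100, size=100)

# Hold out 20% of the rows for testing; the rest is used for training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)  # (80, 3) (20, 3)
```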

– Machine Learning Model

Here, we will try the machine learning algorithms below, then select the best one based on its mean absolute error (MAE) compared with the baseline.

  • Linear Regression
  • Ridge
  • Random Forest Regressor
  • Gradient Boosting
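A comparison of these four algorithms could look roughly like the sketch below. It assumes the scikit-learn implementations (including `GradientBoostingRegressor` for gradient boosting, which may not be the library actually used), default-ish hyperparameters, and synthetic data in place of the sleep dataset:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: the target depends on the first feature plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(4, 10, size=(200, 2))
y = 10 * X[:, 0] + rng.normal(0, 5, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Linear Regression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Random Forest Regressor": RandomForestRegressor(n_estimators=100, random_state=42),
    "Gradient Boosting": GradientBoostingRegressor(random_state=42),
}

# Fit each model on the training set and score it on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = mean_absolute_error(y_test, model.predict(X_test))

# Lower MAE is better, so print the models from best to worst.
for name, mae in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: MAE = {mae:.2f}")
```

Which model comes out on top depends on the data; on this synthetic example the ranking says nothing about the sleep dataset.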

This is the Baseline MAE:
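For reference, a baseline MAE is commonly computed by always predicting the mean of the training targets. The sketch below assumes that definition and uses made-up numbers rather than the dataset's values:

```python
import numpy as np

# Made-up sleep quality scores for illustration only.
y_train = np.array([60.0, 70.0, 80.0, 90.0])
y_test = np.array([65.0, 85.0])

# The baseline "model" predicts the training mean for every test row.
baseline_prediction = y_train.mean()  # 75.0

# Mean absolute error of that constant prediction on the test set.
baseline_mae = np.abs(y_test - baseline_prediction).mean()
print(baseline_mae)  # 10.0
```

A trained model is only useful if its MAE is clearly lower than this number.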

The best model turned out to be the Random Forest Regressor.

Looking at all the features, it is clear that quality of sleep is highly affected by the total amount of sleep you get. I depicted this using a permutation importance plot.
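Permutation importance measures how much a model's error grows when one feature's values are shuffled. Here is a minimal sketch using scikit-learn's `permutation_importance`, with synthetic data in which the target is deliberately driven by the first feature; the feature values, the scoring choice, and `n_repeats` are assumptions for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
feature_names = ["Time in Bed in Seconds", "Heart rate"]

# Synthetic data: quality is built from time in bed; heart rate is pure noise.
X = np.column_stack([
    rng.uniform(4 * 3600, 10 * 3600, size=300),
    rng.uniform(50, 80, size=300),
])
y = X[:, 0] / 360 + rng.normal(0, 3, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Shuffle each feature several times and record the average drop in score.
result = permutation_importance(
    model, X_test, y_test,
    scoring="neg_mean_absolute_error",
    n_repeats=10,
    random_state=42,
)

# Print features from most to least important.
for i in result.importances_mean.argsort()[::-1]:
    print(f"{feature_names[i]}: {result.importances_mean[i]:.2f}")
```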

This is also emphasized when we create a partial dependence plot (PDP) using “Time in Bed in Seconds” as the feature.
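The numbers behind a partial dependence plot can be computed with scikit-learn's `partial_dependence`. This is only a sketch on synthetic data, where the target is constructed to rise with time in bed; the grid size is an arbitrary choice:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import partial_dependence

rng = np.random.default_rng(0)

# Column 0 plays the role of "Time in Bed in Seconds"; column 1 is noise.
X = np.column_stack([
    rng.uniform(4 * 3600, 10 * 3600, size=300),
    rng.uniform(50, 80, size=300),
])
y = X[:, 0] / 360 + rng.normal(0, 3, size=300)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X, y)

# Average model prediction as feature 0 is swept over a 20-point grid,
# with the other feature left at its observed values.
pd_result = partial_dependence(
    model, X, features=[0], kind="average", grid_resolution=20
)
curve = pd_result["average"][0]
print(curve.round(1))
```

Plotting `curve` against the grid of feature values gives the PDP; an upward-sloping curve means predicted quality rises with time in bed.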

In Conclusion

Sleep quality is highly affected by sleep quantity, in addition to the other features and factors we saw in this study. So please get some sleep, because sleep is NOT for the weak!