@kevinislas2

…t was being assigned to ok dataset but never used again)

The ok table wasn't used anywhere after being filtered by year, month and day of death.


Hi Dr. Hadley!
As part of a machine learning course, I was asked to reproduce the case study in your paper on tidy data.
I was using Python's pandas library to reproduce it and noticed that the death frequencies per cause of death differed from those in your paper by a small margin. After digging in, I noticed that the ok table, which is assigned the deaths table filtered for year==2008, mod!=0, dod!=0, isn't used anywhere else.
The function that follows uses the deaths table instead, meaning that it counts causes of death across all years, months and days of death.
After assigning that filter to the deaths table in the R code, I got the same results that I was getting in Python when reproducing the case study (although I'm not sure whether the mistake is on my end).
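
A minimal pandas sketch of the difference described above. It assumes a toy deaths table with made-up rows; the column name cod for cause of death is hypothetical, and year, mod and dod follow the filter quoted above. It is an illustration only, not the code from the case study.

```python
import pandas as pd

# Toy stand-in for the deaths table; all rows are made up for illustration.
# "cod" (cause of death) is an assumed column name.
deaths = pd.DataFrame({
    "year": [2008, 2008, 2008, 2007, 2008],
    "mod":  [1, 0, 3, 5, 7],      # month of death, 0 = unknown
    "dod":  [12, 4, 0, 9, 21],    # day of death, 0 = unknown
    "cod":  ["A", "A", "B", "A", "B"],
})

# The filter: keep 2008 records with a known month and day of death.
ok = deaths[(deaths["year"] == 2008) & (deaths["mod"] != 0) & (deaths["dod"] != 0)]

# Counting on the unfiltered table includes rows the filter would have dropped...
freq_all = deaths["cod"].value_counts()

# ...while counting on the filtered table uses only the rows that passed.
freq_ok = ok["cod"].value_counts()

print(freq_all)
print(freq_ok)
```

In this toy data only two of the five rows pass the filter, so the per-cause counts differ depending on which table is used, which is the kind of small mismatch described above.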

Best regards!
