<p><strong>Bonus / Extra practice</strong>: Let’s say you change your survey so participants can rank their response 1-10 (inclusive). Create a randomly sampled vector of 30 survey responses. (hint use <code>seq()</code> and <code>sample()</code> and set the replace argument to <code>TRUE</code>). Store the output as <code>my_responses_2</code>. Examine the data by typing the name in the Console using a function.</p>
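One possible answer to the practice prompt above, as a minimal sketch. The seed value is an assumption added only so the draw is repeatable; the exercise itself does not ask for one.

```r
# Assumption: a fixed seed so the same 30 values are drawn on every run.
set.seed(123)

# seq() builds the allowed answers 1 through 10 (inclusive);
# sample() draws 30 of them, and replace = TRUE lets values repeat.
my_responses_2 <- sample(seq(from = 1, to = 10), size = 30, replace = TRUE)

# Examine the result with functions rather than printing the whole vector.
head(my_responses_2)
length(my_responses_2)
```

Without `replace = TRUE`, `sample()` would stop with an error here, because 30 draws cannot be taken from only 10 distinct values.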
modules/Data_Cleaning/Data_Cleaning.html
8 additions & 8 deletions
@@ -204,7 +204,7 @@ <h1 data-config-title><!-- populated from slide_config.json --></h1>
<li>The <code>lubridate</code> package is helpful for dates and times<br/>📃<a href='https://jhudatascience.org/intro_to_r/modules/cheatsheets/Day-4.pdf' title=''>Cheatsheet</a></li>
<li><code>any</code> will be <code>TRUE</code> if ANY are true
@@ -538,7 +538,7 @@ <h1 data-config-title><!-- populated from slide_config.json --></h1>
<p><strong><code>filter()</code> removes missing values by default.</strong> Because R can’t tell for sure if an <code>NA</code> value meets the condition. To keep them need to add <code>is.na()</code> conditional.</p>
-</article></slide><slide class=""><hgroup><h2>filter() and missing data</h2></hgroup><article class="codesmall" id="filter-and-missing-data-1">
+</article></slide><slide class=""><hgroup><h2>filter() and missing data</h2></hgroup><article id="filter-and-missing-data-1" class="codesmall">
<pre class = 'prettyprint lang-r'>df</pre>
@@ -612,7 +612,7 @@ <h1 data-config-title><!-- populated from slide_config.json --></h1>
1 2 6
2 1 2</pre>
-</article></slide><slide class=""><hgroup><h2>Drop <strong>columns</strong> with any missing values</h2></hgroup><article class="codesmall" id="drop-columns-with-any-missing-values">
+</article></slide><slide class=""><hgroup><h2>Drop <strong>columns</strong> with any missing values</h2></hgroup><article id="drop-columns-with-any-missing-values" class="codesmall">
<p>Use the <code>miss_var_which()</code> function from <code>naniar</code></p>
@@ -633,7 +633,7 @@ <h1 data-config-title><!-- populated from slide_config.json --></h1>
<pre >[1] "Dog" "Cat"</pre>
-</article></slide><slide class=""><hgroup><h2>Drop <strong>columns</strong> with any missing values</h2></hgroup><article class="codesmall" id="drop-columns-with-any-missing-values-1">
+</article></slide><slide class=""><hgroup><h2>Drop <strong>columns</strong> with any missing values</h2></hgroup><article id="drop-columns-with-any-missing-values-1" class="codesmall">
<pre class = 'prettyprint lang-r'>df %>% select(!miss_var_which(df))</pre>
@@ -647,7 +647,7 @@ <h1 data-config-title><!-- populated from slide_config.json --></h1>
5 5
6 6</pre>
-</article></slide><slide class=""><hgroup><h2>Removing columns with threshold of percent missing row values</h2></hgroup><article class="codesmall" id="removing-columns-with-threshold-of-percent-missing-row-values">
+</article></slide><slide class=""><hgroup><h2>Removing columns with threshold of percent missing row values</h2></hgroup><article id="removing-columns-with-threshold-of-percent-missing-row-values" class="codesmall">
<pre class = 'prettyprint lang-r'>is.na(df) %>% head(n = 3)</pre>
@@ -712,7 +712,7 @@ <h1 data-config-title><!-- populated from slide_config.json --></h1>
<p>You might want to keep the <code>NA</code> values so that you know the original sample size.</p>
-</article></slide><slide class=""><hgroup><h2>Word of caution</h2></hgroup><article class="codesmall" id="word-of-caution">
+</article></slide><slide class=""><hgroup><h2>Word of caution</h2></hgroup><article id="word-of-caution" class="codesmall">
<p>Calculating percentages will give you a different result depending on your choice to include NA values.</p>
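The caution in the last slide can be illustrated with a small sketch. The vector `answered_yes` and its values are hypothetical and not taken from the slides; `NA` stands for a participant who did not respond.

```r
# Hypothetical survey answers (an assumption for illustration only).
answered_yes <- c(TRUE, FALSE, TRUE, NA, TRUE)

# Percent "yes" out of everyone sampled: the NA row stays in the denominator.
pct_all <- 100 * sum(answered_yes, na.rm = TRUE) / length(answered_yes)

# Percent "yes" out of those who responded: the NA row is dropped first.
pct_responded <- 100 * mean(answered_yes, na.rm = TRUE)

pct_all        # 60
pct_responded  # 75
```

The same three "yes" answers give 60% or 75% depending on whether the missing response counts toward the sample size, which is why keeping the `NA` values preserves the information needed to report either figure.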