Commit 0ad9c37

adding new option for determine_outliers. Close #28
1 parent 7590a72 commit 0ad9c37

131 files changed: 1234 additions, 790 deletions

DESCRIPTION

Lines changed: 4 additions & 3 deletions
@@ -1,5 +1,5 @@
 Package: visualizationQualityControl
-Version: 0.4.10
+Version: 0.4.11
 Title: Development of visualization methods for quality control
 Description: Provides utilities useful quality control of
     high-throughput -omics datasets.
@@ -11,12 +11,13 @@ Date: 2021-12-28
 Depends: R (>= 3.1.1)
 biocViews:
 Imports: ComplexHeatmap (>= 1.2.1), stats, dendsort, colorspace, dplyr,
-    ggplot2, broom, knitrProgressBar, magrittr
+    ggplot2, broom, knitrProgressBar, magrittr, purrr
 License: MIT + file LICENSE
 VignetteBuilder: knitr
 Suggests: testthat, knitr, rmarkdown, circlize, viridis, ICIKendallTau,
     ggforce
-RoxygenNote: 7.1.2
+RoxygenNote: 7.2.3
+Encoding: UTF-8
 LinkingTo: Rcpp
 URL:
     https://moseleybioinformaticslab.github.io/visualizationQualityControl

NEWS.md

Lines changed: 4 additions & 0 deletions
@@ -1,3 +1,7 @@
+# vsualizationQualityControl 0.4.11
+
+* Added a new argument `only_low` to `determine_outliers` to only look at the low end of the score distribution for outliers, as sometimes `boxplot.stats` will pick up outliers at the high end as well.
+
 # visualizationQualityControl 0.4.10
 
 * Updated the quality_control vignette to use ICIKendallTau instead of other correlation measures.

R/correlations.R

Lines changed: 15 additions & 1 deletion
@@ -812,6 +812,7 @@ outlier_fraction <- function(data, sample_classes = NULL, n_trim = 3,
 #' @param outlier_fraction outlier fractions
 #' @param cor_weight how much weight for the correlation score?
 #' @param frac_weight how much weight for the outlier fraction?
+#' @param only_low should only things at the low end of score be removed?
 #'
 #' @details For outlier sample detection, one should
 #' first generate median correlations using
@@ -826,7 +827,7 @@ outlier_fraction <- function(data, sample_classes = NULL, n_trim = 3,
 #' @export
 #' @return data.frame
 determine_outliers = function(median_correlations = NULL, outlier_fraction = NULL,
-                              cor_weight = 1, frac_weight = 1){
+                              cor_weight = 1, frac_weight = 1, only_low = TRUE){
 
   if (!is.null(median_correlations) && !is.null(outlier_fraction)) {
     full_data = dplyr::left_join(median_correlations, outlier_fraction, by = "sample_id", suffix = c(".cor", ".frac"))
@@ -866,6 +867,19 @@ determine_outliers = function(median_correlations = NULL, outlier_fraction = NUL
   full_data$outlier = FALSE
   full_data$outlier[full_data$sample_id %in% all_out] = TRUE
 
+  if (only_low) {
+    split_data = split(full_data, full_data$sample_class)
+    full_data = purrr::map(split_data, \(in_data){
+      mean_score = mean(in_data$score)
+      wrong_side = in_data |>
+        dplyr::filter(score < mean_score, outlier) |>
+        dplyr::pull(sample_id)
+      in_data$outlier[in_data$sample_id %in% wrong_side] = FALSE
+      in_data
+    }) |>
+      dplyr::bind_rows()
+  }
+
   full_data
 
 }
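
The block added under `if (only_low)` splits the data by `sample_class`, computes the mean `score` within each class, and clears the `outlier` flag on any flagged sample whose score falls below that mean. A minimal base-R sketch of the same per-class step follows, run on invented toy data. The `toy` data frame and the `clear_below_mean` helper are hypothetical names used only for illustration and are not part of the package. The sketch uses `lapply` and `rbind` in place of `purrr::map` and `dplyr::bind_rows`, which also avoids the `|>` and `\(x)` syntax that requires R 4.1 or later.

```r
# Toy data: every value here is invented for illustration only.
toy = data.frame(
  sample_id = c("s1", "s2", "s3", "s4", "s5", "s6"),
  sample_class = c("a", "a", "a", "b", "b", "b"),
  score = c(-5, 0, 5, -4, 1, 3),
  outlier = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE)
)

# Hypothetical helper mirroring the added block: within a single class, clear
# the outlier flag on flagged samples whose score is below the class mean.
clear_below_mean = function(in_data) {
  mean_score = mean(in_data$score)
  in_data$outlier[in_data$score < mean_score & in_data$outlier] = FALSE
  in_data
}

# Split by class, apply the helper to each piece, and bind the pieces back.
result = do.call(rbind, lapply(split(toy, toy$sample_class), clear_below_mean))
rownames(result) = NULL

print(result[, c("sample_id", "sample_class", "score", "outlier")])
```

In this toy input, `s1` and `s4` are flagged but sit below their class means, so their flags are cleared, while `s3` stays flagged because its score is above the mean of class `a`.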

docs/404.html

Lines changed: 6 additions & 6 deletions

docs/LICENSE-text.html

Lines changed: 4 additions & 4 deletions
