In Rd files, a Details section typically sits between the Arguments and Value sections (cf. ?edgeR::estimateGLMRobustDisp). Adding one here would help explain what the options mean. For example,
selectFeatures: A vector indicates the gene selection method, set as "limma" by default. This should be one or more of "limma", "DV", "DD", "chisq", "BI" and "Cepo".
DV and DD could each refer to a variety of approaches, such as Bartlett's test, Levene's test, the Kolmogorov-Smirnov test, Kullback-Leibler divergence, etc. Which one is used for each? Also, what does BI stand for? I assume it is bimodality index, but the journal article uses BD. Similarly,
exprsMat_train: A matrix of log-transformed expression matrix of reference dataset.
Does this expect the bioinformatics convention (features in rows, samples in columns) or the statistics convention (samples in rows, features in columns)? For example, mixOmics and ClassifyR follow the statistics convention, whereas limma and edgeR follow the bioinformatics convention. The data set used in the Examples section lets the user infer the bioinformatics convention, but it could be made explicit.
> exprsMat_xin_subset[1:3, 1:3]
3 x 3 sparse Matrix of class "dgCMatrix"
        SRR3541305 SRR3541306 SRR3541307
CIZ1      1.479874   .          .
HCFC1R1   .          .          4.180656
MAGI2     5.864330   3.665412   3.634933
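To illustrate why stating the orientation matters, here is a minimal sketch that rebuilds the 3 x 3 subset printed above and shows how a user might check which convention a matrix follows before passing it in. The helper `features_in_rows()` and the `known_genes` vector are hypothetical names made up for this illustration and are not part of the package; the sketch only assumes the Matrix package is installed.

```r
library(Matrix)

# Values copied from the printed subset above; "." entries are zeros.
exprs <- sparseMatrix(
  i = c(1, 2, 3, 3, 3),
  j = c(1, 3, 1, 2, 3),
  x = c(1.479874, 4.180656, 5.864330, 3.665412, 3.634933),
  dims = c(3, 3),
  dimnames = list(
    c("CIZ1", "HCFC1R1", "MAGI2"),
    c("SRR3541305", "SRR3541306", "SRR3541307")
  )
)

# Hypothetical helper: TRUE when any known gene symbol appears in the
# row names, i.e. the matrix follows the bioinformatics convention.
features_in_rows <- function(mat, known_features) {
  any(rownames(mat) %in% known_features)
}

known_genes <- c("CIZ1", "MAGI2")  # assumed list of expected gene symbols

features_in_rows(exprs, known_genes)                # TRUE

# A matrix in the statistics convention (samples in rows) fails the check
# and would need t() before being used as features x samples.
samples_by_features <- t(exprs)
features_in_rows(samples_by_features, known_genes)  # FALSE
```

With a square matrix like this one, dimensions alone cannot reveal the orientation, which is one more reason for the argument description to say explicitly that genes are expected in rows.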