A versatile approach to integrate heterogeneous datasets
If you find our package useful, please cite:
Laura M Zingaretti, Gilles Renand, Diego P Morgavi, Yuliaxis Ramayo-Caldas, Link-HD: a versatile framework to explore and integrate heterogeneous microbial communities, Bioinformatics, https://doi.org/10.1093/bioinformatics/btz862
LinkHD is a general-purpose R package for integrating heterogeneous datasets, with a focus on microbial communities. LinkHD combines multivariate techniques for data integration with clustering and variable selection. The method also makes it possible to study the relationships between observations and features and to perform taxon enrichment analysis.
Clone the repository, or install the package from GitHub:
devtools::install_github(repo="lauzingaretti/LinkHD")
Alternatively, install it from Bioconductor:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
#The following initializes usage of Bioc devel
BiocManager::install(version='devel')
BiocManager::install("LinkHD")
library(LinkHD)
LinkHD's capabilities were demonstrated by analyzing public datasets from the TARA Oceans expedition (https://oceans.taraexpeditions.org/en/m/about-tara/les-expeditions/tara-oceans/) and datasets from rumen metataxonomic communities (including bacteria, archaea and protozoa data). The data can be loaded with the following commands:
data(Ruminotypes)
#or
data(Taraoceans)
More examples and explanation of methods are available at: https://lauzingaretti.github.io/LinkHD-examples/
We have added a function to impute missing values in raw data. Note that this function must be used before any data transformation!
The recommended workflow is as follows:
#dir where data are stored
setwd('dir')
Data<-ReadData()
Out<-Na_inspect(Data)
# If any of your tables contain NA values, run impute_missing() on each affected data.frame; otherwise, continue with the standard analysis.
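To make the inspection step concrete, here is a minimal base-R sketch of the idea behind it: counting missing values in each table to decide which ones need imputation. It does not reproduce Na_inspect() or impute_missing(); the `tables` list, its contents, and the names `na_counts` and `needs_imputation` are hypothetical and used only for illustration.

```r
# Hypothetical toy input: a named list of data.frames, one per table.
tables <- list(
  bacteria = data.frame(otu1 = c(5, NA, 3), otu2 = c(1, 2, 4)),
  archaea  = data.frame(otu1 = c(7, 8, 9), otu2 = c(0, 1, 2))
)

# Count the missing values in each table (base R only).
na_counts <- vapply(tables, function(df) sum(is.na(df)), integer(1))

# Names of the tables that contain at least one NA and would therefore
# need imputation before any transformation is applied.
needs_imputation <- names(na_counts)[na_counts > 0]

print(na_counts)
print(needs_imputation)
```

Tables listed in `needs_imputation` are the ones to pass through the imputation step; the rest can go straight to the standard analysis.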