Once ingredients have been procured, they are shelved in the pantry ready for use.
Bioinformatics datasets are provided by a wide range of groups, in a wide range of formats, and change rapidly. This is an opinionated guide to collecting and curating datasets. It sets up a local database of datasets that can be used for integrated analysis.
1a. Create the JSON file ~/.pantry_login, which will be passed to dplyr::src_postgres to log in to the database. For example:
{
  "staging_directory" : "$HOME/pantry_sets",
  "login" : {
    "dbname" : "<database name>",
    "host" : "<host>",
    "user" : "<user>",
    "password" : "<password>",
    "port" : <port>
  }
}
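Before connecting, it can help to confirm that the login file parses and contains the fields shown above. The following is a minimal sketch, not part of BioChemPantry: the helper name check_pantry_login is hypothetical, it assumes the jsonlite package is installed, and the values written to the temporary file are made up for illustration.

```r
library(jsonlite)

# Hypothetical helper (not provided by BioChemPantry): read a pantry login
# file and check that the fields from the example above are present.
check_pantry_login <- function(path = "~/.pantry_login") {
  config <- jsonlite::fromJSON(path.expand(path))
  required <- c("dbname", "host", "user", "password", "port")
  missing_fields <- setdiff(required, names(config$login))
  if (length(missing_fields) > 0) {
    stop("missing login fields: ", paste(missing_fields, collapse = ", "))
  }
  if (is.null(config$staging_directory)) {
    stop("missing staging_directory")
  }
  config
}

# Demonstrate on a temporary file with made-up values.
example_path <- tempfile(fileext = ".json")
writeLines('{
  "staging_directory": "/tmp/pantry_sets",
  "login": {
    "dbname": "pantry",
    "host": "localhost",
    "user": "alice",
    "password": "secret",
    "port": 5432
  }
}', example_path)

config <- check_pantry_login(example_path)
```

Calling check_pantry_login() with no arguments would check ~/.pantry_login itself. The sketch only checks that the fields exist; it does not attempt a database connection.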
Install the package from GitHub
install.packages("devtools")
devtools::install_github("maomlab/BioChemPantry")
Load datasets by following the vignettes in vignettes/sets/
Use datasets
library(plyr)
library(dplyr)
library(BioChemPantry)
pantry <- get_pantry(schema="<dataset>")
tbl <- pantry %>% schema_tbl("<tbl>")
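Assuming schema_tbl returns a dplyr table (an assumption based on the dplyr::src_postgres login above, not something confirmed here), the usual dplyr verbs should apply to it. The sketch below uses a local data frame as a stand-in so that it runs without a database; the column names and values are made up.

```r
library(dplyr)

# Stand-in for a pantry table; columns and values are invented for illustration.
tbl <- data.frame(
  gene = c("A", "B", "C"),
  score = c(0.9, 0.2, 0.7)
)

# Keep the high-scoring rows and sort them from highest to lowest score.
high <- tbl %>%
  filter(score > 0.5) %>%
  arrange(desc(score))
```

With a database-backed table, appending collect() to the pipeline would pull the result into local memory.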