BioGRID is a database of curated protein-protein interactions, covering physical and genetic interactions from high- and low-throughput studies.
read_biogrid_tab2(fname, taxon)
Argument | Description |
---|---|
fname | path to the .tab2 file |
taxon | keep only interactions where both partners belong to this NCBI taxon ID |
Value: a tibble::tibble with columns biogrid_interaction_id, gene_id_1, gene_id_2, biogrid_id_1, biogrid_id_2, feature_name_1, feature_name_2, gene_symbol_1, gene_symbol_2, synonyms_1, synonyms_2, experimental_system, experimental_system_type, author, pubmed_id, taxon_1, taxon_2, throughput, score, modification, phenotypes, qualifications, tags, source_database
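The effect of the `taxon` argument can be pictured with a small base R sketch. This is an illustration on made-up rows, not the package's implementation: `toy` and `keep_taxon` are hypothetical names, and storing the taxon columns as character strings is an assumption.

```r
# Toy stand-in for a parsed table; the values are made up for illustration.
toy <- data.frame(
  biogrid_interaction_id = c(1L, 2L, 3L),
  taxon_1 = c("237561", "237561", "559292"),
  taxon_2 = c("237561", "559292", "559292"),
  experimental_system_type = c("physical", "genetic", "physical"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows where BOTH partners match the taxon.
keep_taxon <- function(data, taxon) {
  data[data$taxon_1 == taxon & data$taxon_2 == taxon, , drop = FALSE]
}

within_species <- keep_taxon(toy, "237561")
print(within_species$biogrid_interaction_id)
```

Rows where only one partner matches the taxon (interaction 2 in the toy data) are dropped, which is the behavior described for `taxon` above.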
Usage: from the command line, download the by-organism BioGRID dataset for the desired release and keep only the file for the desired organism:
    BIOGRID_RELEASE=3.4.161
    ORGANISM_NAME=Candida_albicans_SC5314
    # work in the directory the R example reads from
    mkdir -p raw_data/biogrid
    pushd raw_data/biogrid
    wget https://downloads.thebiogrid.org/Download/BioGRID/Release-Archive/BIOGRID-$BIOGRID_RELEASE/BIOGRID-ORGANISM-$BIOGRID_RELEASE.tab2.zip
    unzip BIOGRID-ORGANISM-$BIOGRID_RELEASE.tab2.zip
    # remove every file (including the zip archive) that is not for the chosen organism
    ls | grep -v -e "$ORGANISM_NAME" | xargs rm
    popd
From R, parse the BioGRID data:
    biogrid_release <- "3.4.161"
    organism_name <- "Candida_albicans_SC5314"
    taxon <- "237561"
    biogrid_data <- CalCEN::read_biogrid_tab2(
      fname = paste0(
        "raw_data/biogrid/BIOGRID-ORGANISM-",
        organism_name, "-", biogrid_release, ".tab2.txt"),
      taxon = taxon)
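When switching organisms or releases, it may help to build the file name in one place and check that the file exists before parsing. The following is a minimal sketch; `biogrid_tab2_path` is a hypothetical helper, not part of CalCEN, and the default directory simply mirrors the path used in the example above.

```r
# Hypothetical helper: assemble the path to a by-organism .tab2 file.
biogrid_tab2_path <- function(organism_name, biogrid_release,
                              dir = "raw_data/biogrid") {
  file.path(
    dir,
    paste0("BIOGRID-ORGANISM-", organism_name, "-", biogrid_release, ".tab2.txt"))
}

fname <- biogrid_tab2_path("Candida_albicans_SC5314", "3.4.161")

# Report a missing download early instead of failing inside the parser.
if (!file.exists(fname)) {
  message("BioGRID file not found: ", fname)
}
```

The resulting `fname` can then be passed to `CalCEN::read_biogrid_tab2()` together with the matching `taxon`.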