Bio4j is a bioinformatics graph data platform that integrates most of the data available in UniProt KB (Swiss-Prot + TrEMBL), Gene Ontology (GO), UniRef (50, 90, 100), NCBI Taxonomy, and the ExPASy Enzyme DB.

Bio4j provides a new and powerful framework for querying and managing protein-related information. A graph-based data model makes it possible to store and query data in a way that semantically represents its own structure. In contrast, traditional relational models and databases must flatten the data they represent into tables, creating artificial ids to connect the different tuples. In some cases this leads to domain models that have almost nothing to do with the actual structure of the data.
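
The contrast with relational tables can be sketched with a tiny in-memory graph. This is only an illustration of the idea, not the Bio4j API: the `Graph` class, the node labels, and the edge type names (`ANNOTATED_WITH`, `BELONGS_TO`) are assumptions made for this example, and the sample records are illustrative.

```python
from collections import defaultdict


class Graph:
    """Minimal property graph (hypothetical, not the Bio4j API)."""

    def __init__(self):
        self.nodes = {}                 # node id -> properties
        self.edges = defaultdict(list)  # (source id, edge type) -> target ids

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, **props}

    def add_edge(self, source, edge_type, target):
        # The edge itself carries the meaning of the relationship,
        # so no join table or artificial id is needed.
        self.edges[(source, edge_type)].append(target)

    def out(self, source, edge_type):
        # Follow typed edges from a node to its neighbours.
        return [self.nodes[t] for t in self.edges.get((source, edge_type), [])]


g = Graph()
g.add_node("P69905", "Protein", name="Hemoglobin subunit alpha")
g.add_node("GO:0019825", "GoTerm", name="oxygen binding")
g.add_node("9606", "Taxon", name="Homo sapiens")
g.add_edge("P69905", "ANNOTATED_WITH", "GO:0019825")
g.add_edge("P69905", "BELONGS_TO", "9606")

# "Which GO terms annotate this protein?" is a single edge traversal.
go_names = [n["name"] for n in g.out("P69905", "ANNOTATED_WITH")]
print(go_names)
```

In a relational layout the same question would typically go through a protein table, a GO term table, and a linking table keyed by surrogate ids. Here the query follows the relationship directly, which is the property the paragraph above describes.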

For more information, see bio4j/bio4j on GitHub.

Community and contact


Bio4j is an open source platform released under the AGPLv3 license.

Citing Bio4j

Bio4j: a high-performance cloud-enabled graph-based data platform.
Pablo Pareja-Tobes, Raquel Tobes, Marina Manrique, Eduardo Pareja, Eduardo Pareja-Tobes
bioRxiv, doi: 10.1101/016758