Berkeley Phylogenomics Group receives an NSF grant to develop a graph DB for Big Data challenges in genomics building on Bio4j

The Sjölander Lab at the University of California, Berkeley, has recently been awarded a US$250,000 EAGER grant from the National Science Foundation to build a graph database for Big Data challenges in genomics. Naturally, they’re building on Bio4j.

The project, “EAGER: Towards a self-organizing map and hyper-dimensional information network for the human genome”, aims to create a graph database of genome and proteome data for the human genome and related species. The database will allow biologists and computational biologists to mine the information in gene family trees, biological networks and other graph data that cannot be represented effectively in relational databases. To that end, the lab will build on Bio4j, the pioneering graph-based bioinformatics platform.

“We are excited to see how Bio4j is used by top research groups to build cutting-edge bioinformatics solutions,” said Eduardo Pareja, Era7 Bioinformatics CEO. “To reach an even broader user base, we are pleased to announce that we now provide versions for both Neo4j and Titan graph databases, for which we have developed another layer of abstraction for the domain model using Blueprints.”

“EAGER stands for Early-concept Grants for Exploratory Research,” explained Professor Kimmen Sjölander, head of the Berkeley Phylogenomics Group. “NSF awards these grants to support exploratory work in its early stages on untested, but potentially transformative, research ideas or approaches. My lab’s focus is on machine learning methods for Big Data challenges in biology, particularly for graphical data such as gene trees, networks, pathways and protein structures. The limitations of relational database technologies for graph data, particularly BIG graph data, restrict scientists’ ability to get any real information from that data. When we decided to switch to a graph database, we did a lot of research into the options. When we found out about Bio4j, we knew we’d found our solution. The Bio4j team has made our development tasks so much easier, and we look forward to a long and fruitful collaboration in this open-source project.”

You can find more information here: