Datavis GSoC project
DataVis project
InterMine provides large amounts of data, and we’d like to add more data visualisation tools to our new user interface. Possible examples include:
- Expression graphs, such as:
- Visualisations similar to the ones found in our python client
- BioJS MSA viewer
  - This would show on a gene list page and pass FASTA sequences for each gene to the MSA viewer. FASTA for lists is available via the web services.
- A 3D protein visualiser
  - Most InterMines don’t have PDB IDs associated with their proteins. A PDB ID can be fetched by passing a protein’s primary accession to the PDB search REST API (selecting the UniProtKB Accession Number(s) endpoint). Once the PDB ID is determined, the PDB files can be downloaded from the PDB download service and then passed to the visualiser.
- Volcano plots
- Manhattan plots
- Box plots
- Violin plots
- Combining Swarm plots and Box plots
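For the MSA viewer idea above, the FASTA returned for a list has to be split into one record per gene before it can be handed to a viewer. The following is a minimal sketch of that step, assuming the FASTA text has already been fetched from the web services. The function name `parseFasta` and the record shape (`id`, `description`, `sequence`) are illustrative choices, not part of InterMine or BioJS, and a viewer may well ship its own parser.

```javascript
// Split a multi-record FASTA string into one object per sequence.
// The record shape { id, description, sequence } is an illustrative
// assumption, not a format required by any particular viewer.
function parseFasta(text) {
  const records = [];
  let current = null;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue; // ignore blank lines
    if (line.startsWith(">")) {
      // Header line: the first whitespace-delimited token is the id,
      // the rest (if any) is a free-text description.
      const header = line.slice(1).trim();
      const firstSpace = header.search(/\s/);
      current = {
        id: firstSpace === -1 ? header : header.slice(0, firstSpace),
        description: firstSpace === -1 ? "" : header.slice(firstSpace + 1).trim(),
        sequence: "",
      };
      records.push(current);
    } else if (current !== null) {
      // Sequence lines may be wrapped, so concatenate them.
      current.sequence += line;
    }
  }
  return records;
}

// Example with made-up sequences (placeholder data, not real genes).
const exampleFasta = [">geneA example record", "ATGC", "GGTA", ">geneB", "TTGACA"].join("\n");
console.log(parseFasta(exampleFasta));
```

Keeping the parsing separate from the rendering means the same records could feed other sequence-based visualisations as well.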
biojs.net is a good source for more visualisation libraries.
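For the 3D protein visualiser idea above, the steps (primary accession, then PDB ID, then PDB file) could be sketched roughly as follows. This is only an outline under stated assumptions: `lookupPdbIds` and `fetchText` are hypothetical helpers that you would write yourself against the PDB search API and your HTTP client of choice, and the download URL pattern is an assumption that should be checked against the PDB download service documentation.

```javascript
// Assumption: entry files are served at this URL pattern. Verify against
// the PDB download service documentation before relying on it.
const PDB_DOWNLOAD_BASE = "https://files.rcsb.org/download";

// Build the download URL for a classic four-character PDB ID.
function pdbFileUrl(pdbId) {
  if (!/^[0-9][A-Za-z0-9]{3}$/.test(pdbId)) {
    throw new Error("Not a four-character PDB ID: " + pdbId);
  }
  return PDB_DOWNLOAD_BASE + "/" + pdbId.toUpperCase() + ".pdb";
}

// Resolve a protein's primary accession to PDB file contents.
// lookupPdbIds(accession) -> Promise<string[]> and
// fetchText(url) -> Promise<string> are hypothetical, injected helpers;
// neither is an existing InterMine or PDB function.
async function loadStructureForProtein(primaryAccession, lookupPdbIds, fetchText) {
  const pdbIds = await lookupPdbIds(primaryAccession);
  if (pdbIds.length === 0) {
    return null; // no known structure for this protein
  }
  const pdbId = pdbIds[0]; // naive choice: take the first match
  const pdbText = await fetchText(pdbFileUrl(pdbId));
  return { pdbId, pdbText }; // pdbText is what the visualiser would receive
}
```

Injecting the two helpers keeps the sketch independent of any specific search endpoint and makes it easy to try out with stubbed data. A real tool would also need to decide which structure to show when a protein maps to several PDB entries.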
Background
The current InterMine interface is powered by JSPs (example: http://www.flymine.org) and will be discontinued in the next few years. We’re building a new interface, code-named BlueGenes (Live demo). We want to make it extra easy for people to port their existing JavaScript tools into BlueGenes, so we’ve created a set of specifications that allow JavaScript applications to interact with BlueGenes. This is known as the BlueGenes Tool API. You can read more about it in our Tool API release blog announcement.
Getting started
- Take some time to learn more about InterMine and InterMine queries. Take a look through Getting started with InterMine JS-Based applications.
- Okay, now that we have the basics down, let’s take a look at an existing visualisation. Visit a report page in FlyMine and scroll down to the interaction network viewer. This is one example of a data visualisation that uses InterMine data and has also been ported to BlueGenes. To see the same visualisation in BlueGenes, search for “FBgn0004053” and go to the first result page.
- Now let’s have a quick look at the code for the interaction network viewer:
  - Source code: https://github.com/intermine/cytoscape-intermine - take a look at index.html to see how the visualiser is initialised
  - To include it in BlueGenes, we’ve added a thin wrapper around the cytoscape-intermine repo - check out https://github.com/intermine/bluegenes-tool-cytoscape - you’ll see it doesn’t have much code at all!
- Take the BlueGenes Tool API Tutorial. The Yeoman generator in the tutorial was used to convert the network viewer into a BlueGenes-compatible tool. Go back to https://github.com/intermine/bluegenes-tool-cytoscape and have a look through the files now that you’ve completed the tutorial. It should look more familiar!
- Time to think about the project proposal: if you’ve made it this far, hopefully you have an understanding of the technical requirements for making a BlueGenes data visualisation. Pick a few of the possible visualisations that interest you and start to prepare a project plan. How long do you think each task might take, and what would your preferred approach be? You can discuss this with mentors and/or start writing your proposal. Always share your proposal drafts with your mentors so they can offer feedback, and take a look through our advice for applications to find tips and a proposal template.
Need help? Questions?
Try asking in the GSoC chat. The mentors for this project are:
- Yo Yehudi
- Adrián Rodríguez-Bazaga
- Aman Dwivedi