The 2019 project ideas list is still available here: 2019 project ideas list

Ideas list:

Rewrite im-tables as a React webapp in ES6

Description: im-tables is a browser-based data table displayer for InterMine data written in CoffeeScript using the Backbone framework. Your task will be to create an identical webapp using modern tooling, primarily the React framework with ES6 syntax. Most of the functionality will have to be rewritten from scratch, although HTML and CSS can be salvaged from the old project. If time allows, you can also work on implementing new features for your replacement of im-tables. This would include adding displayers to the column summary features.

Required skills:

  • HTML (JSX) and CSS (LESS/SASS)
  • JavaScript (ES6/ES2015)
  • JavaScript build tools (webpack), linting (eslint) and test libraries (Jest)
  • JavaScript frontend frameworks (React, etc.)
  • JavaScript state management (React Context, Redux, etc.)

Possible mentors:

  • Aman Dwivedi
  • Kevin Herald Reierskog (khr29@cam.ac.uk)
  • Nikhil Vats (nikvats5499@gmail.com)

Please always email all mentors in the same mail if you would like to ask questions or discuss the project.

Expected outcome: Complete replacement of im-tables using modern tooling, which can be easily updated and maintained for many years to come.

Difficulty level: Easy to Hard depending on your experience with creating modern JavaScript webapps


InterMine training portal

Description: InterMine has various training materials at intermine.org/tutorials, but these tutorials are fragmented and many need updating. It would be nice to have a more comprehensive resource, similar to the Galaxy Training Network. This should still be embedded in the InterMine website (source code: https://github.com/intermine/intermine-homepage-2017, live: intermine.org). You can also use your creativity and propose novel additions, e.g. short video tutorials or other interesting approaches.

Required skills:

Possible mentors:

  • Yo Yehudi
  • Aman Dwivedi (if needed)
  • Asher Pasha

Please always email all mentors in the same mail if you would like to ask questions or discuss the project.

Expected outcome: An updated set of tutorials and/or training infrastructure for InterMine.

How could this integrate with existing InterMine tools? This will directly instruct people how to use these tools.

Difficulty level: Medium if you know Hugo and understand InterMine queries. Harder otherwise.


JavaScript Data Visualisations

Description: Last year, we began to create a suite of JavaScript-based biological data visualisations for BlueGenes, InterMine’s ClojureScript-based user interface. New visualisations would need to conform to the BlueGenes Tool API. Clojure knowledge is not required.

Required skills:

  • CSS
  • HTML
  • JavaScript
  • DOM manipulation or library of your choice (e.g. React)

Possible mentors:

  • Aman Dwivedi (if needed)
  • Kevin Herald Reierskog (khr29@cam.ac.uk)
  • Adrián Rodríguez-Bazaga (ar989@cam.ac.uk)
  • Asher Pasha

Please always email all mentors in the same mail if you would like to ask questions or discuss the project.

Expected outcome: A suite of 2-5 JavaScript-based tools that can be used in production BlueGenes.

How could this integrate with existing InterMine tools? This will be used directly within BlueGenes.

Difficulty level: Medium if familiar with JavaScript visualisation techniques, Hard if not.


End to End and integration testing with Cypress

Description: BlueGenes (GitHub, Demo) is a ClojureScript browser-based UI for InterMine. im-tables-3 (GitHub) is a dynamic result table library used to show InterMine query results in BlueGenes.

Test libraries like Cypress, which automate a series of interactions with the webapp through a controlled web browser, are very useful for avoiding regressions in functionality during development. By using selectors and methods which correspond to common actions done in a web browser, you are able to describe these automated interactions in JavaScript. The Cypress test runner will then run these interactions on the running webapp, and produce a test report along with artifacts like a video recording or images.

BlueGenes has a few integration tests written in JavaScript with Cypress, while im-tables-3 has none. Both could use many more tests of this type. These tests would also need to pass consistently in constrained environments like Travis CI.

Required skills:

  • JavaScript
  • Cypress or other frontend interaction/integration testing library

Possible mentors:

  • Kevin Herald Reierskog (khr29@cam.ac.uk)
  • Yo Yehudi (yy406@cam.ac.uk)
  • Aman Dwivedi

Expected outcome: A vast pool of tests that covers the majority of use cases.

Difficulty level: Easy to Medium, depending on your experience with JavaScript and Cypress


Improving the InterMine Data Browser

Description: The InterMine Data Browser (GitHub, Demo) is a JavaScript browser-based UI for exploring the data held in the different InterMines. It allows for easy search and exploration of the different mines available in the InterMine Registry, providing the user with an easy way of applying filters over the data without needing much knowledge of the underlying data model.

Required skills:

  • JavaScript
  • CSS

Possible mentors:

  • Adrián Rodríguez-Bazaga (ar989@cam.ac.uk)
  • Laksh Singla (lakshsingla@gmail.com)
  • Nikhil Vats (nikvats5499@gmail.com)
  • Asher Pasha

Please always email all mentors in the same mail if you would like to ask questions or discuss the project.

Expected outcome: The InterMine Data Browser is further improved with new features added, and current bugs existing in it are fixed.

How could this integrate with existing InterMine tools? This is a standalone application that works directly with the InterMine instances APIs.

Difficulty level: Easy if you know JavaScript and CSS - Medium if not.


Implementing a tuple-based backing database for InterMine

Description: InterMine is an open source integration system that uses PostgreSQL to store complex biological data. The InterMine Neo4j project attempted to replace the current relational database with a Neo4j graph database; it developed the data loaders for an InterMine Neo4j database and the RESTful API from which one can query an InterMine Neo4j database using PathQuery. However, mapping an InterMine data model to a graph is a very non-trivial task, requiring many undesirable hacks.

We’d like to try a different project in 2020 by attempting to implement a MongoDB-based datastore instead of Neo4j. A tuple-based datastore like MongoDB seems to be a much better fit for the hierarchical Java-based InterMine data model.
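To illustrate why such a datastore may suit a hierarchical object model, here is a minimal Python sketch. The `Organism` and `Gene` classes and their fields are illustrative assumptions rather than the actual InterMine model, and no MongoDB driver is involved: the nested dictionary is only the shape of record that such a store could hold.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Organism:
    # Hypothetical referenced class, for illustration only.
    name: str
    taxon_id: int


@dataclass
class Gene:
    # Hypothetical class holding a reference and a collection.
    symbol: str
    organism: Organism
    synonyms: List[str] = field(default_factory=list)


def to_document(gene: Gene) -> dict:
    """Convert an object and everything it references into one nested dict."""
    # asdict() recurses into nested dataclasses and lists.
    return asdict(gene)


gene = Gene("zen", Organism("Drosophila melanogaster", 7227), ["zerknullt"])
document = to_document(gene)
print(json.dumps(document, indent=2))
```

In a relational schema the same object would typically be spread over several tables joined by keys, whereas here the reference and the collection travel with the parent record.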

Required skills:

  • Good understanding of Java
  • Good understanding of RESTful APIs
  • Good understanding of tuple (NoSQL) databases, preferably MongoDB experience
  • GitHub
  • Note: no biology skills needed, this is straight-up database and API programming

Possible mentors:

  • primary mentor: Sam Hokin - shokin@ncgr.org
  • secondary mentor: Andrew Farmer - adf@ncgr.org

Please email both mentors for all communications. Sam and Andrew don’t usually hang out in the InterMine chat, but you’re welcome to come say hi and meet the other students anyway :)

Expected outcome: A fully functional InterMine running from a MongoDB backend rather than from PostgreSQL.

How could this integrate with existing InterMine tools? This would integrate directly with the InterMine server package.

Difficulty level: Fairly difficult.


Generating RDF from InterMine database

Description: InterMine is an open source integration system that uses PostgreSQL to store complex biological data. The data model is generated from a core model XML file and any number of additions files which define extra classes and fields. The data stored in the InterMine instances can be accessed via the web interface and RESTful APIs, and exported in several formats (JSON, XML, CSV, FASTA).
In order to improve data interoperability, we aim to:

  • provide the data available in an InterMine instance in RDF format. Because InterMine instances are updated by periodic rebuilds, we will generate the RDF at build time;
  • implement a new web service endpoint which executes a query and returns the result in JSON format, to be used by others in their federated queries.
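As a rough idea of what generating RDF at build time could involve, the following Python sketch serialises one record as N-Triples lines. The `BASE_URI` namespace, the function names and the sample record are all assumptions made for illustration. The real project would be written in Java and would need a properly designed vocabulary.

```python
from typing import Dict, List

# Hypothetical namespace; a real mine would publish its own URIs.
BASE_URI = "http://example.org/mine/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value: str) -> str:
    """Escape the characters that must be escaped inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(class_name: str, record_id: int, fields: Dict[str, str]) -> List[str]:
    """Return one rdf:type triple plus one literal triple per field."""
    subject = f"<{BASE_URI}{class_name}/{record_id}>"
    lines = [f"{subject} <{RDF_TYPE}> <{BASE_URI}{class_name}> ."]
    # Sort the fields so the output is stable between builds.
    for name, value in sorted(fields.items()):
        lines.append(f'{subject} <{BASE_URI}{name}> "{escape_literal(value)}" .')
    return lines


triples = record_to_ntriples("Gene", 1007, {"symbol": "zen", "name": 'zerknullt "zen"'})
print("\n".join(triples))
```

A line-oriented format such as N-Triples is convenient for a bulk dump because each record can be written independently while the build iterates over the database.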

Required skills:

  • RDF
  • Java
  • git
  • no biology skills needed

Possible mentors:

  • Daniela Butano (daniela@intermine.org)
  • Arunan Sugunakumar (arunans.14@cse.mrt.ac.lk)
  • Rahul Yadav (rahulyadavwk@gmail.com)

Please email Daniela for all communications.

Expected outcome: bulk download created at build time from InterMine data

How could this integrate with existing InterMine tools? This would integrate directly with the InterMine server package.

Difficulty level: Medium


Implementing InterMine model persistence with Hibernate

Description: InterMine is an open source integration system that uses PostgreSQL to store complex biological data. The data model is flexible and generated from a core model XML file and any number of additions files which define extra classes and fields. InterMine stores data using a custom Object Relational Mapping and we would like to prototype a new solution using Hibernate. In addition, because InterMine accepts queries over its data in a custom format called PathQuery, we need to convert the PathQuery into the Hibernate query language / Criteria Query.
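The PathQuery conversion step can be pictured with a small Python sketch. It assumes a heavily simplified query representation (a list of dotted view paths plus `(path, operator, value)` constraints) and produces an HQL string with named parameters. The function name and input shape are hypothetical, real PathQueries carry more features (joins, constraint logic, sort order), and the actual prototype would be written in Java.

```python
from typing import Dict, List, Tuple

# Operators this sketch accepts; anything else is rejected.
ALLOWED_OPS = {"=", "!=", "<", "<=", ">", ">="}


def path_query_to_hql(
    views: List[str], constraints: List[Tuple[str, str, str]]
) -> Tuple[str, Dict[str, str]]:
    """Translate a simplified path query into an HQL string and its parameters."""
    root = views[0].split(".")[0]
    alias = root[0].lower()

    def rewrite(path: str) -> str:
        # "Gene.organism.name" becomes "g.organism.name".
        head, _, rest = path.partition(".")
        if head != root or not rest:
            raise ValueError(f"unsupported path: {path}")
        return f"{alias}.{rest}"

    hql = f"select {', '.join(rewrite(v) for v in views)} from {root} {alias}"
    params: Dict[str, str] = {}
    clauses = []
    for index, (path, op, value) in enumerate(constraints):
        if op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {op}")
        name = f"p{index}"
        # Values are bound as named parameters, never spliced into the string.
        clauses.append(f"{rewrite(path)} {op} :{name}")
        params[name] = value
    if clauses:
        hql += " where " + " and ".join(clauses)
    return hql, params


hql, params = path_query_to_hql(
    ["Gene.symbol", "Gene.organism.name"],
    [("Gene.organism.name", "=", "Drosophila melanogaster")],
)
print(hql)
print(params)
```

The sketch relies on HQL path expressions for traversing references, which keeps the translation close to the dotted paths in the input.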

Required skills:

  • Hibernate
  • Java
  • git
  • no biology skills needed

Possible mentors:

  • Daniela Butano (daniela@intermine.org)
  • Arunan Sugunakumar (arunans.14@cse.mrt.ac.lk)
  • Rahul Yadav (rahulyadavwk@gmail.com)

Please email Daniela for all communications.

Expected outcome: biotestmine prototype

How could this integrate with existing InterMine tools? This would be deeply integrated with the InterMine server package.

Difficulty level: Medium


Improve UI & UX of BlueGenes

Description: BlueGenes (GitHub, Demo) is a ClojureScript browser-based UI for InterMine. The webapp has been designed by software engineers with limited experience in creating beautiful and joyful interfaces. We want a more UI/UX inclined person to evaluate and improve the interface of BlueGenes.

This can be either a frontend engineer with skills in UI/UX (GSoC and Outreachy), or a designer (coding skills are optional in this case - Outreachy only).

Tasks:

  • Create responsive (mobile-friendly) designs of existing pages
  • Create new designs of existing pages with improvements in UI/UX
    • This can include im-tables-3, a dynamic result table used in BlueGenes.
    • We particularly want design suggestions for report pages of genes, proteins, etc.
  • Address UX concerns raised during user testing (we have a list of these)
  • Iterative user testing with volunteers in the InterMine community
  • Other UI related issues in the BlueGenes repository can be tackled as a stretch goal

Required skills:

  • Graphics, UI and UX design (preferably with a portfolio)
  • HTML and CSS (optional if you’re a designer)
  • Bonus: Interest in learning and working with the Hiccup HTML syntax

Possible mentors:

  • Kevin Herald Reierskog (khr29@cam.ac.uk)
  • Yo Yehudi (yy406@cam.ac.uk)
  • Nikhil Vats (nikvats5499@gmail.com)

Please always email all mentors in the same mail if you would like to ask questions or discuss the project.

Expected outcome: A new BlueGenes which is more beautiful and joyful to use.

Difficulty level: Medium


InterMine Boot

Description: InterMine Boot is a part of the InterMine cloud project. It is a command line interface (CLI) tool written in Python. We intend to provide some important convenience features to users, such as single-command setup of InterMine instances locally and in CI environments, and interaction with the InterMine Cloud API to upload data files. We have a document that lists all the requirements here. An initial implementation of the CI use case can be found here.

Tasks:

  • Read and understand the requirements.
  • Then, talk to us about it and ask as many questions as you can!
  • Implement the single-command setup of InterMine instances for the CI use case first
  • Use wizard, compose and configurator to implement the local use case
  • Get familiar with the InterMine Cloud API and implement data file uploads
  • If you are feeling adventurous, you can optionally use the intermine-cloud repo to implement setting up InterMine instances on public clouds like AWS and GCP.

Required Skills:

  • Python
  • Docker
  • REST APIs and data handling
  • Git and Travis CI
  • no biology skills needed

Possible Mentors:

  • Kevin Herald Reierskog (khr29@cam.ac.uk)
  • Ankur Kumar (leoank98@gmail.com)

Expected Outcome: A CLI tool that will launch InterMine instances on the cloud or locally.

Difficulty: Medium if you are comfortable with Docker, Python and a Linux environment. Harder otherwise.