Querying over genomic ranges
InterMine includes functionality for querying features with overlapping genome coordinates. An index is created on the
Location table. This index is used by a 'virtual'
SequenceFeature.overlappingFeatures collection, which is implemented as a
view in the PostgreSQL database. For a given feature, the view uses the native PostgreSQL index to find the other features that overlap it.
In modMine (the InterMine for the modENCODE project), we also create
GeneFlankingRegion features to represent specific distances upstream and downstream of genes, so that you can query for genes that are near other features.
#Create the index
You need to create the index on the location table in your production database by adding the
create-location-range-index post-process step to your
project.xml file.
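As a sketch of what this entry might look like, assuming the usual `<post-process>` element syntax of project.xml and that it sits inside the file's existing `<post-processing>` section (other steps omitted here):

```xml
<post-processing>
  <!-- builds the range index on the location table -->
  <post-process name="create-location-range-index"/>
</post-processing>
```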
#Create the overlappingFeatures view
Next, create the SequenceFeature.overlappingFeatures view in the database. This allows you to query for features that overlap features of any other type, in the web interface or through the query API. Add the
create-overlap-view post-process step, which must be located after
create-location-range-index in your project.xml file.
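A minimal sketch of the resulting ordering, again assuming the standard `<post-process>` element syntax; any other post-process steps in your own project.xml are left out for brevity:

```xml
<post-processing>
  <!-- the index must be created first -->
  <post-process name="create-location-range-index"/>
  <!-- the view depends on the index, so it comes afterwards -->
  <post-process name="create-overlap-view"/>
</post-processing>
```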
Now any queries on the
overlappingFeatures collections will use this view and the new index.
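To illustrate, a query over the collection might look roughly like the following query XML. This is only a sketch: it assumes the standard genomic model, in which Gene is a SequenceFeature with a primaryIdentifier field, and YOUR_GENE_ID is a placeholder rather than a real identifier.

```xml
<query model="genomic"
       view="Gene.primaryIdentifier Gene.overlappingFeatures.primaryIdentifier"
       sortOrder="Gene.primaryIdentifier asc">
  <!-- placeholder value: replace with an identifier from your own mine -->
  <constraint path="Gene.primaryIdentifier" op="=" value="YOUR_GENE_ID"/>
</query>
```

Each result row would pair the chosen gene with one feature whose location overlaps it.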