Python Webservice API Client
Purpose
To help those who would like to automate queries, InterMine provides a client library to simplify access to our REST webservice API. This library is designed to facilitate access to information when you already know what you are looking for, or when you are doing lots of similar queries on a repetitive basis.
Prerequisites
The InterMine client package is pure Python and has been tested successfully on Python 2.5, 2.6 and 2.7, PyPy and Jython, and on Linux (Ubuntu 10.10, 11.04), Mac OS 10.5 and Windows XP. Python 2.4 is not supported, and neither is Python 3. If you absolutely must have Python 3 support, let us know, and we will try to sort something out.
There are some very minimal prerequisites - if you run a modern Linux, you probably already meet them:
- Python (preferably v2.6 or above)
- Setuptools/Distutils - Windows users will need to install this if they want to use easy_install.
- If you have v2.5, or you are using Jython (this primarily concerns Mac OS users), you will also need simplejson. On Linux/Mac OS X:
sudo easy_install simplejson
On Windows:
easy_install.exe simplejson
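If your own scripts also need to handle JSON on these older interpreters, the conventional fallback import looks like the sketch below. It is a general Python idiom, not necessarily what the client library does internally, and the sample string is only a placeholder.

```python
# Prefer the standard-library json module (bundled from Python 2.6 on);
# fall back to simplejson on interpreters where json is not available.
try:
    import json
except ImportError:
    import simplejson as json

# Both modules expose the same loads/dumps functions.
record = json.loads('{"symbol": "zen"}')
print(record["symbol"])
```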
Installation
The webservice client package is easily installed with the following command in Linux and Mac OS X:
sudo easy_install intermine
On Windows, you can use the graphical installer or easy_install (see above, and remember to add setuptools to your PATH):
easy_install.exe intermine
Downloading the Source
If you do not have access to easy_install, you can also download our source code in several ways:
From intermine.org:
wget http://www.intermine.org/lib/python-webservice-client-0.97.01.tar.gz
From PyPI:
wget http://pypi.python.org/packages/source/i/intermine/intermine-0.97.01.tar.gz
From GitHub:
git clone git://github.com/FlyMine/intermine-ws-python.git
From the InterMine svn repository:
svn co svn://subversion.flymine.org/trunk/flymine/intermine/python
If you download the source code, you will need to unpack it (in the case of tarred distributions), enter the main source directory, and then install with the following commands (testing is optional):
python setup.py test
sudo python setup.py install
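Whichever installation route you take, a quick way to confirm that the package is visible to your interpreter is to try importing it. The helper below is a minimal sketch; is_importable is a hypothetical name used for illustration and is not part of the client library.

```python
def is_importable(module_name):
    """Return True if module_name can be imported in this interpreter."""
    try:
        __import__(module_name)
    except ImportError:
        return False
    return True


if is_importable("intermine"):
    print("intermine client is installed")
else:
    print("intermine client not found; check the installation steps")
```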
Documentation
Once installed, documentation is available with the pydoc command:
pydoc intermine.webservice
Useful documentation pages include:
- intermine.webservice
- intermine.query
You can view documentation online at:
Constructing Queries
Your main interface to the library will be through the intermine.webservice.Service class. You will probably need the following two lines at the top of every Python script you write:
from intermine.webservice import Service
service = Service("http://www.flymine.org/query/service")
If you want to access private templates or make queries that refer to your private lists, you will need to provide your log-in details as well:
from intermine.webservice import Service
service = Service("http://www.flymine.org/query/service", "YOUR_LOGIN", "YOUR_PASSWORD")
Obviously, you should change the URL in the example above to that of the service you wish to query. Some examples of webservices that support this API include:
FlyMine
RatMine
YeastMine
modMine
MetabolicMine
Constructing a Query
Once you have specified a webservice, you can construct a query:
query = service.new_query()
The query will fetch its data model from the service you specify, which it uses to validate the query you construct. The query will also fetch its results from the service you specify.
Queries are composed of three central concepts:
- The kind of thing you want back (the root class)
- The information about those things you want to see (the view)
- Rules to filter which things to include or not (the constraints)
Optionally, you can also specify how the results are sorted (the sort order) and how the rules are combined (the constraint logic).
The root class
Generally you want to get back information about one kind of thing. Say you have genes and you want the residues of their proteins: what you want is a list of proteins, together with their residues. All paths in the query should therefore descend from the same class, here Protein.
Paths are strings which are used to specify which information to supply and receive, such as Gene.organism.name (the name of the organism for a gene) and Protein.proteinDomains.name (the name of the protein domain for a protein). The root class refers to the first element in these dotted path strings.
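To make the idea of a root class concrete, the sketch below extracts the first element of each dotted path and checks that a set of paths shares a single root. The function names are hypothetical and only illustrate the rule; the library performs its own validation against the data model.

```python
def root_class(path):
    """Return the first element of a dotted path, e.g. 'Gene' for 'Gene.organism.name'."""
    return path.split(".")[0]


def common_root(paths):
    """Return the single root class shared by all paths, or raise ValueError."""
    roots = set(root_class(p) for p in paths)
    if len(roots) != 1:
        raise ValueError("paths must share one root class, got: %s" % sorted(roots))
    return roots.pop()


print(common_root(["Protein.name", "Protein.proteinDomains.name"]))  # Protein
```

Mixing roots, for example "Gene.organism.name" with "Protein.name", raises ValueError in this sketch.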
The view
If you want to see these proteins' residues, then you must specify them as output columns:
query.add_view("Protein.name", "Protein.symbol", "Protein.sequence.residues", "Protein.genes.symbol")
This view will produce a table where each row has four columns: the name of the protein, its symbol, its residues, and the symbol of the gene associated with the protein.
The constraints
Constraints are filters on the data we get back - if we don't add any, we will get back information about every protein in the database. Instead, we will filter the results so that they only contain information about proteins connected to the genes we are interested in:
query.add_constraint("Protein.genes.symbol", "ONE OF", ["zen", "hairy", "eve"])
Now we will only get back information for proteins associated with one of these three genes.
See pydoc intermine.constraints for information about the different constraint types.
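The root class, view and constraint steps can be gathered into one small function. This is a sketch that uses only the calls shown in this section (new_query, add_view, add_constraint); build_protein_query is a hypothetical helper name, and the function expects a Service object you have already created.

```python
def build_protein_query(service, gene_symbols):
    """Build the walk-through query on an existing Service object."""
    query = service.new_query()
    # The view: four output columns, all rooted at Protein.
    query.add_view(
        "Protein.name",
        "Protein.symbol",
        "Protein.sequence.residues",
        "Protein.genes.symbol"
    )
    # The constraint: keep only proteins linked to the requested genes.
    query.add_constraint("Protein.genes.symbol", "ONE OF", list(gene_symbols))
    return query


# Usage (requires the intermine package and network access):
#   from intermine.webservice import Service
#   service = Service("http://www.flymine.org/query/service")
#   query = build_protein_query(service, ["zen", "hairy", "eve"])
```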
Results
Results can be returned in a number of ways. The default is an iterator of lists, where each row is a list of columns.
for row in query.results():
    for col in row:
        process(col)
You can also get dictionaries, where the key is the view path you selected as the column header:
for row in query.results("dict"):
    if row["Protein.genes.symbol"] == "zen":
        process_zen_row(row)
And if you prefer, you can get access to the raw strings, where each row is a comma-delimited line of fields. This is most useful if you are printing, or piping the output:
for row in query.results("string"):
    print(row)
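To see how the three row formats relate to each other, the sketch below converts one list-style row into the dictionary and string shapes by hand. The helper names are hypothetical, the values are placeholders rather than real FlyMine data, and the strings returned by the service may quote or escape fields differently.

```python
VIEW = [
    "Protein.name",
    "Protein.symbol",
    "Protein.sequence.residues",
    "Protein.genes.symbol",
]


def row_as_dict(view, row):
    """Pair each view path with the value in the same column."""
    return dict(zip(view, row))


def row_as_string(row):
    """Join the columns into one comma-delimited line."""
    return ",".join(str(col) for col in row)


# Placeholder values standing in for one row of results.
sample_row = ["<name>", "<symbol>", "<residues>", "zen"]
as_dict = row_as_dict(VIEW, sample_row)
print(as_dict["Protein.genes.symbol"])  # zen
print(row_as_string(sample_row))        # <name>,<symbol>,<residues>,zen
```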
Logging In to Get Access to Private Lists and Templates
You can access any of your private lists and templates by authenticating with the service:
from intermine.webservice import Service
service = Service("http://www.flymine.org/query/service", "YOUR_USER_NAME", "YOUR_PASSWORD")
template = service.get_template("my_private_template")
...
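If you would rather not hard-code your log-in details in a script, one option is to read them from environment variables. The sketch below assumes two variable names, INTERMINE_USER and INTERMINE_PASSWORD, which are hypothetical and not something the library itself looks for; service_args is likewise a helper invented for this example.

```python
import os


def service_args(url, environ=None):
    """Return positional arguments for Service: (url,) or (url, user, password)."""
    if environ is None:
        environ = os.environ
    user = environ.get("INTERMINE_USER")          # hypothetical variable name
    password = environ.get("INTERMINE_PASSWORD")  # hypothetical variable name
    if user and password:
        return (url, user, password)
    # Without both values, fall back to anonymous access.
    return (url,)


args = service_args("http://www.flymine.org/query/service")

# Usage (requires the intermine package):
#   from intermine.webservice import Service
#   service = Service(*args)
```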
Examples
The following is an example of a complex arbitrary query:
#!/usr/bin/env python
# The following two lines will be needed in every python script:
from intermine.webservice import Service
service = Service("http://www.flymine.org/query/service")
# Get a new query from the service you will be querying:
query = service.new_query()
# The view specifies the output columns
query.add_view(
    "Gene.symbol", "Gene.name", "Gene.organism.shortName",
    "Gene.microArrayResults.affyCall", "Gene.microArrayResults.enrichment",
    "Gene.microArrayResults.mRNASignal", "Gene.microArrayResults.tissue.name"
)
# The default sort order is sorting by the first column value, ascending
# You can specify a custom sort order:
query.add_sort_order("Gene.symbol", "ASC")
# Multiple sort orders specify which column to do secondary sorting on:
query.add_sort_order("Gene.name", "DESC")
# There are various different constraint types...
# You can constrain a path to represent a more specific class (Subclass Constraint)
query.add_constraint("Gene.microArrayResults", "FlyAtlasResult")
# You can constrain an element to be one of a set of elements (Multi Constraint)
query.add_constraint("Gene.microArrayResults.tissue.name", "ONE OF", ["Adult carcass", "Adult eye", "Adult fat body"])
# You can constrain a value in relation to standard algebraic operators:
# You can use: =, !=, <, >, <=, >=
# These also operate on strings
query.add_constraint("Gene.length", "<", "1000")
# You can constrain a table to have any field match a value, optionally supplying an origin:
query.add_constraint("Gene", "LOOKUP", "zen", "D. melanogaster")
# You can constrain a value to have/not have a value, regardless of what it is:
query.add_constraint("Gene.interactions.interactionType", "IS NULL")
# The default logic is "A and B and C ..."
# Available operators are "and" and "or"
query.set_logic("(A or B) and C and D")
# results() returns an iterator, so print each row in turn:
for row in query.results():
    print(row)
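The logic string passed to set_logic combines the outcome of each coded constraint for a given row. The sketch below is a local model of that combination, written only to show how "and", "or" and parentheses interact. evaluate_logic is a hypothetical helper, not part of the library, and the True/False outcomes are made up for illustration.

```python
import ast


def evaluate_logic(logic, outcomes):
    """Evaluate a logic string such as '(A or B) and C' for one row.

    outcomes maps each constraint code to True or False for that row.
    Only names, 'and', 'or' and parentheses are accepted.
    """
    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.BoolOp):
            values = [visit(v) for v in node.values]
            if isinstance(node.op, ast.And):
                return all(values)
            return any(values)
        if isinstance(node, ast.Name):
            return bool(outcomes[node.id])
        raise ValueError("unsupported expression in logic string")

    return visit(ast.parse(logic, mode="eval"))


# Made-up outcomes: the row fails constraint A but passes B, C and D.
row = {"A": False, "B": True, "C": True, "D": True}
print(evaluate_logic("(A or B) and C and D", row))  # True
print(evaluate_logic("A and B and C and D", row))   # False
```

With the default logic every constraint must hold, so this row would be excluded; with "(A or B) and C and D" it is kept because B makes up for A.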
A webservice's collection of predefined queries (templates) can also be accessed from the client library:
#!/usr/bin/env python
# The following two lines will be needed in every python script:
from intermine.webservice import Service
service = Service("http://www.flymine.org/query/service")
# Find all screen results for a specified amplicon from a specified DRSC RNAi
# screen. Result types can be: not screened, not a hit, weak, medium or strong
# hit. (Data Source: DRSC).
template = service.get_template('AmpliconScreen_RNAiResults')
results = template.results(
    # From the RNAi screen published in this paper:
    A = {"op": "=", "value": "16452979"},
    # Show all results for the following amplicon:
    B = {"op": "LOOKUP", "value": "DRSC04651", "extra_value": ""}
)
for row in results:
    print(row)
Templates can have the advantage of being more concise. You can write your own predefined queries, save them as templates and use them from the webservice by providing your log-in details.
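When you override several template constraints, it can be tidier to build the keyword arguments first. The sketch below produces dictionaries in the same shape as the example above; template_constraint is a hypothetical helper, not a library function.

```python
def template_constraint(op, value, extra_value=None):
    """Build one constraint dictionary in the shape passed to template.results()."""
    constraint = {"op": op, "value": value}
    if extra_value is not None:
        constraint["extra_value"] = extra_value
    return constraint


# Keys are the constraint codes used by the template.
overrides = {
    "A": template_constraint("=", "16452979"),
    "B": template_constraint("LOOKUP", "DRSC04651", extra_value=""),
}

# Usage (requires a template fetched with service.get_template):
#   results = template.results(**overrides)
```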
Further Examples
You do not have to write your own code - you can get the webapp for your preferred webservice to generate the code for you. Simply click on one of the code generation links at the bottom of any query-builder or template-form page. This will produce code you can run immediately, and perhaps later edit to refine the query.
Other clients
We also have client libraries for:


