BioRange Human Media Interaction - University of Twente

Help - Rshell

Introduction

Bioinformatics experiments often involve a lot of statistics, which are commonly computed with the statistical tool R (www.r-project.org). Rshell is a Taverna plugin for running R scripts as part of a workflow.

Requirements

Installation

You have to download Rshell.jar and JRClientRF_UT.jar from the download page (BioRangeDownloads). The first is the Taverna plugin for running R scripts; the latter is a library for establishing connections with an Rserve server, which executes the R scripts. Our version of JRClient corrects a Java Serializable bug in the original version. This will hopefully be corrected in the original version as well.

Place these two files in the ./lib directory of your Taverna installation. Make sure that this modified version of the JRClient comes before the original version in the Java classpath. If you don't know how to do this, you can simply delete the old JRClientXXX.jar file in the ./lib directory, since the new one is backwards compatible with the original library.

The R scripts can be executed on an external server. It is also possible to execute them locally. To do so, you need a local installation of R (www.r-project.org) and an installation of the Rserve server.
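For a local setup, Rserve can be installed and started from within R itself. The lines below are a minimal sketch, assuming that the Rserve package is installed from CRAN; the helper name start_local_rserve is hypothetical and not part of Rshell.

```r
# Hypothetical helper (not part of Rshell): install the Rserve package
# if it is missing and start a local Rserve instance.
start_local_rserve <- function() {
  if (!requireNamespace("Rserve", quietly = TRUE)) {
    install.packages("Rserve")  # assumes a configured CRAN mirror
  }
  # Start the server; "--no-save" is handed on to the R process behind it.
  Rserve::Rserve(args = "--no-save")
}

# Call this once before running a workflow that connects to localhost:
# start_local_rserve()
```

The function is only defined here, not called, so nothing is installed or started until start_local_rserve() is run explicitly.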

Creating a new Rshell process

The factory for making Rshell processes is available in the service explorer, which is the frame with the title "Available Services". To create a new Rshell process, select the Rshell symbol and drag it to the "Advanced Model Explorer".
rshellProcess.png
To set the script, add ports or modify the connection settings, the Rshell needs to be configured. To do this, right-click on the Rshell process in the "Advanced Model Explorer" and select "Configure Rshell...".
rshellConfigPanel.png

Rshell Script

The script tab is used for entering the R script to be executed by the Rshell process. The script can be typed manually or read from an R file (ending with ".R"). The input and output variables do not have to be declared here, since this is done in the input port tab and the output port tab. In our example, the Rshell generates a list of random numbers. The length of the list is equal to numberOfRandomValues.
rshellRScript.png
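A script of this kind could look roughly like the following sketch. numberOfRandomValues is the input variable from the example; the output variable name randomValues and the use of runif are assumptions, since the original script is only shown in the screenshot. The first line merely simulates the value that Taverna would supply, so that the sketch runs on its own.

```r
# Simulated input: in Rshell this variable is filled from the input port.
numberOfRandomValues <- 10

# Generate as many uniformly distributed random numbers as requested.
# 'randomValues' is a hypothetical output port name.
randomValues <- runif(numberOfRandomValues)
```

In the Rshell script tab, only the last statement would be needed, because the input variable is provided through its port.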

Creating ports

The Rshell has no ports by default. To add input ports or output ports, select the input port tab or the output port tab, respectively. Each port has a name and is represented as a variable in the R script. Besides a name, every port has a type, which specifies the type of data it expects.

Port types

Input ports

Input ports are the ports through which the Rshell receives data from other processes in Taverna. Modifying the variable that represents an input port in the R script has no influence on the input port itself.
To add a new input port, click on the "Create input port" button. A popup box will appear in which the port name can be entered. Once the name is entered, the created port appears in the port table.
rShellInputPort.png
By default, a port is of type "R-expression". To modify the type, click on the current port type. A drop-down list will appear, from which one of the types can be selected.
rshellInputPortTypeSelection.png
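The remark that changing the input variable does not affect the input port can be illustrated by analogy in plain R. In the sketch below, the function run_script is a hypothetical stand-in for an Rshell process and upstreamValue for the data delivered by another Taverna process; neither name comes from Rshell.

```r
# Hypothetical stand-in for an Rshell process with one input port.
run_script <- function(numberOfRandomValues) {
  # The script overwrites its own copy of the input variable ...
  numberOfRandomValues <- numberOfRandomValues * 2
  # ... and uses the modified copy for its computation.
  runif(numberOfRandomValues)
}

upstreamValue <- 5                  # value produced by another process
result <- run_script(upstreamValue)

length(result)   # 10: the script worked on its modified copy
upstreamValue    # still 5: the data on the input port is unchanged
```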

Output ports

Output ports are used for passing data to other processes in Taverna. They represent the processor's output. Output ports are created and modified in the same way as input ports. The name of the output port is also the name of the variable representing it in the R script. Only the last value assigned to this variable is passed on to Taverna.
rshellOutputPort.png

Configuring connection to the Rserve

By default, the Rshell is configured to connect to an Rserve on the same computer, in other words localhost. If the Rserve runs on another system, the connection settings have to be changed.

The connection settings are:

  • Hostname: the URL of the computer running Rserve
  • Port: the port number on which the client can connect to the Rserve
  • Username: the username, if required; otherwise leave it blank
  • Password: the password required for logging in; if not needed, leave it blank
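Before entering these settings in Rshell, it can help to check from an R session that the Rserve is reachable. The sketch below assumes the RSclient package from CRAN, which is not part of Rshell; check_rserve is a hypothetical helper, and 6311 is Rserve's usual default port.

```r
# Hypothetical helper (not part of Rshell): try to reach an Rserve with
# the same settings that would be entered in the connection tab.
check_rserve <- function(hostname = "localhost", port = 6311L,
                         username = "", password = "") {
  conn <- RSclient::RS.connect(host = hostname, port = port)
  on.exit(RSclient::RS.close(conn))
  # Log in only when credentials are given; blank means no authentication.
  if (nzchar(username)) {
    RSclient::RS.login(conn, username, password)
  }
  # Ask the server for its R version as a simple round trip.
  RSclient::RS.eval(conn, R.version.string)
}

# Example call (requires a running Rserve and the RSclient package):
# check_rserve("localhost", 6311L)
```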

Besides these settings, there is an option called "Keep session alive". All Rshell processes that have this option selected and that have the same connection settings, including username and password, share the same connection. The connection is closed when all running workflows are finished.

This option is useful for keeping and reusing variables stored at the Rserve. Normally, when a connection is closed, the session ends and all its variables are discarded. Keeping the session alive can therefore prevent transferring huge amounts of data between Rshell processes. When an Rshell process A needs variables created in another Rshell process B, one needs to add a constraint between the processes A and B. This constraint prevents process A from being executed before process B is finished.
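As an illustration, assume two Rshell processes B and A with "Keep session alive" selected and identical connection settings. The sketch below simulates the shared Rserve session with a single R environment; the variable names largeMatrix and columnMeans and the helper run_in_session are hypothetical.

```r
# One environment stands in for the shared Rserve session.
session <- new.env()

# Hypothetical helper: evaluate a script text inside the shared session.
run_in_session <- function(script) {
  eval(parse(text = script), envir = session)
}

# Process B runs first and leaves a large object on the server.
run_in_session("largeMatrix <- matrix(rnorm(1e4), ncol = 10)")

# Process A runs afterwards (enforced by the constraint between A and B)
# and reuses the object without it passing through Taverna.
run_in_session("columnMeans <- colMeans(largeMatrix)")

length(get("columnMeans", envir = session))   # 10
```

Without the constraint, A might run before largeMatrix exists in the session, and its script would fail.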

rshellConnectionSettings.png

The result workflow

The figure below illustrates the resulting workflow. This workflow generates a list of N random numbers, where N is the number supplied as workflow input. The resulting list of random numbers is passed to the output.
rshellWorkflowExample.png

Transferring images

Rshell supports transferring images to Taverna. This is done using normal ports: a port needs to be added whose name corresponds to the image.
rshellImageOutput.png
To write to the image, the graphics device needs to be linked to the png device. This is done using the png command, with the port name as its parameter. When plotting is finished, the device needs to be closed using the dev.off() command.

An example of the script is given below:

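# Open a PNG graphics device linked to the image port
png(PNGRandomPlot)
# Plot 100 normally distributed random values
plot(rnorm(100))
# Close the device so that the image is written
dev.off()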

The result of the script is given in the image below.

rshellImagePlot.png

Download the example

The example workflow can be downloaded below. In addition, a workflow is listed in which the connection with the server is kept alive between the Rshell processes.


BioRange @ University of Twente - Human Media Interaction (HMI) - Ingo Wassink
http://www.ewi.utwente.nl/~biorange