Application Instructions:

Input Data

There are currently two ways to input your RNA-seq data to make kinase inhibitor cell viability predictions. The first is to upload the "quant.sf" file from the salmon (or compatible) read aligner, while the second is to specify a GEO ID for a human RNA-seq data sample.

Salmon Upload

Two transcript databases are supported: Ensembl (ENST identifiers) and RefSeq (NM_ and NR_ identifiers). Please contact us if you would like to see a different type of processed RNA-seq data format supported. The server is designed to work on only a single RNA-seq data set at a time. If you have a substantially larger set of RNA-seq profiles for which you would like to make predictions, please get in touch with me (matthew dot berginski at gmail dot com).

You can download an example of a sample salmon output file if you would like to test the server or see what the file looks like.
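
If you would like to check a file before uploading it, the following minimal Python sketch reads a "quant.sf" file and counts which identifier style its transcript names use. It relies on the standard salmon column layout (Name, Length, EffectiveLength, TPM, NumReads). The helper names and the two example transcript names are made up for illustration, and this is not the validation the server itself performs.

```python
import csv
import io

# Column names that salmon writes as the quant.sf header line.
EXPECTED_COLUMNS = ["Name", "Length", "EffectiveLength", "TPM", "NumReads"]


def classify_transcript_id(name):
    """Return 'ensembl', 'refseq', or 'unknown' for one transcript name."""
    if name.startswith("ENST"):
        return "ensembl"
    if name.startswith(("NM_", "NR_")):
        return "refseq"
    return "unknown"


def check_quant_sf(handle):
    """Count identifier styles in an open quant.sf file (hypothetical helper)."""
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    counts = {"ensembl": 0, "refseq": 0, "unknown": 0}
    for row in reader:
        counts[classify_transcript_id(row["Name"])] += 1
    return counts


# Tiny in-memory example; the transcript names and values are invented.
example = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "ENST00000000001.1\t1000\t850.0\t12.5\t40.0\n"
    "NM_000001.1\t2000\t1850.0\t3.2\t25.0\n"
)
print(check_quant_sf(io.StringIO(example)))
```

For a real file, replace the in-memory example with open("quant.sf", newline=""). A file in which most names fall under "unknown" is probably quantified against a transcript database other than the two listed above.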

GEO ID

Alternatively, you can input a GEO database ID. This input method is enabled by the ARCHS4 project. As you start typing, you will see a list of GEO IDs that are available for processing.

Processing Steps

After inputting a data set, the processing pipeline will organize your data and search for the data related to the genes used in the model. Then the model will be loaded, and cell viability predictions will be made for your data set for all 229 compounds from the Klaeger et al. set. Finally, a preview of the results will be displayed with an option to download the full predictions and a summary document highlighting some of the most interesting compounds.

The processing should take less than a minute, and progress indicators will appear in the bottom right-hand corner.

Data Set Submission Success

The server has received your data, and it is now in the queue to be processed. Once processing has begun, progress notifications will appear in the bottom right-hand corner. It typically takes about 45 seconds to produce prediction results.

Data Set Submission Problem

There is a problem with the data set or GEO ID you have submitted to the server. Please try again.

The model has finished running and a summary of your results follows. You will find two buttons at the bottom of this section to download a CSV file with the model predictions and a report with more background on understanding your results.


Compound Viability Predictions

To provide context for the predictions from your data, all of the following plots also show a summary of the predictions from the cell lines in the CCLE. The gray shaded region shows the range (95% coverage) of predictions for that compound, while the black line shows the average prediction. Predictions for your data appear as a blue line.
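
As a rough illustration of how such a reference band could be computed, the sketch below summarizes a list of predictions for one compound at one point on the plot. It assumes that "95% coverage" means the central interval between the 2.5th and 97.5th percentiles, which may differ from the calculation used here. The function names and the example values are hypothetical.

```python
from statistics import mean


def percentile(sorted_values, fraction):
    """Linearly interpolated percentile of an already sorted list."""
    position = fraction * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def reference_band(predictions):
    """Return (low, average, high) for one compound at one plotted point.

    Assumption: low and high are the 2.5th and 97.5th percentiles.
    """
    ordered = sorted(predictions)
    return percentile(ordered, 0.025), mean(ordered), percentile(ordered, 0.975)


# Invented viability predictions for 41 reference cell lines (0.0 to 1.0).
reference_predictions = [i / 40 for i in range(41)]
low, average, high = reference_band(reference_predictions)
print(round(low, 3), round(average, 3), round(high, 3))
```

Repeating this for every plotted point of a compound would give the lower edge, center line, and upper edge of a band like the one shown in gray and black.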

To help pick out potentially interesting compounds from the model predictions, we've sorted the predictions using four different methods:

  • The compound is predicted to have minimal effects on cell viability.
  • The compound is predicted to have high average effect on cell viability.
  • The compound is predicted to show a wide range of effect on cell viability.
  • The compound is predicted to vary from the CCLE average effect.

These categories are not mutually exclusive, so a single compound may be present in multiple compound sets. In each set, the results are displayed as small multiple graphs with the compound name in the title section.
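
The four orderings can be sketched in Python as follows. This is only an illustration under stated assumptions: viability is taken to be a value between 0 and 1 where 1 means no effect, "range" is the difference between the largest and smallest prediction, and "difference" from the CCLE average is the mean absolute gap between the two curves. The function name, compound names, and values are all hypothetical, and the scoring used here may differ.

```python
from statistics import mean


def rank_compounds(user, reference_mean, top_n=1):
    """Pick the top compounds under four assumed scoring rules."""
    mean_viability = {c: mean(v) for c, v in user.items()}
    spread = {c: max(v) - min(v) for c, v in user.items()}
    difference = {
        c: mean(abs(u - r) for u, r in zip(v, reference_mean[c]))
        for c, v in user.items()
    }

    def top(scores, reverse):
        # Sort compound names by score and keep the first top_n.
        return sorted(scores, key=scores.get, reverse=reverse)[:top_n]

    return {
        "minimal_effect": top(mean_viability, reverse=True),
        "highest_effect": top(mean_viability, reverse=False),
        "widest_range": top(spread, reverse=True),
        "largest_difference": top(difference, reverse=True),
    }


# Invented predictions for three placeholder compounds.
user = {
    "compound_A": [1.0, 0.98, 0.96],
    "compound_B": [0.9, 0.5, 0.1],
    "compound_C": [0.4, 0.3, 0.2],
}
ccle_average = {
    "compound_A": [1.0, 0.97, 0.95],
    "compound_B": [0.95, 0.8, 0.6],
    "compound_C": [0.9, 0.8, 0.7],
}
print(rank_compounds(user, ccle_average))
```

In this made-up example, compound_C is selected both for highest effect and for largest difference, which shows how one compound can land in more than one set.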

Your results appear in blue, while a set of reference predictions appears in gray.

Minimal Predicted Effect

Highest Predicted Effect

Highest Range of Predicted Effect

Largest Difference with CCLE Lines