Release History
For those who do not use Unix, Perl or R, here is a Windows version made available on Oct 28, 2008.
All C# source code is also made available.
Installation
Download the installation .MSI file and run it on your Windows machine. The installer will first prompt you to install the free Microsoft .NET Framework 3.5 package if it is not already installed. A successful installation creates a program folder called "GNF" under your Start menu. Feel free to create a desktop shortcut afterwards.
The Windows package includes two executables: RSAGUI.exe is a Windows application, and RSAConsole.exe is a command-line program.
Usage of Graphical Interface
Step 1. Click "Input" to choose an input file. The file must be in CSV format.
The spreadsheet must contain at least three columns:
Gene_ID: the gene identifier for the well
Well_ID: the well identifier
Score: numerical value for hit picking
Notice:
1) the order of these three columns can be arbitrary
2) wells that share the same Gene_ID are considered independent siRNAs for the same gene
3) wells are ignored if Gene_ID or Score is not defined
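The input rules above can be checked before running the program. The following is a minimal Python sketch, not part of the GNF package: the function name load_wells, the sample rows, and the choice to treat a non-numeric Score as "not defined" are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

REQUIRED = ("Gene_ID", "Well_ID", "Score")

def load_wells(handle):
    """Group usable wells by Gene_ID, skipping rows without a Gene_ID or Score."""
    reader = csv.DictReader(handle)  # columns are found by name, so order is free
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    wells = defaultdict(list)
    for row in reader:
        gene = (row["Gene_ID"] or "").strip()
        raw = (row["Score"] or "").strip()
        if not gene or not raw:
            continue  # rule 3: ignore wells with no Gene_ID or no Score
        try:
            score = float(raw)
        except ValueError:
            continue  # assumption: a non-numeric Score counts as "not defined"
        # rule 2: wells sharing a Gene_ID are collected under the same gene
        wells[gene].append((row["Well_ID"], score))
    return dict(wells)

# Hypothetical input: extra columns are allowed, column order is arbitrary.
sample = io.StringIO(
    "Well_ID,Score,Gene_ID,Plate\n"
    "7_O20,0.0541,1221200,7\n"
    "18_A21,0.0626,1221200,18\n"
    "5_B02,,3620,5\n"
    "9_C03,0.5,,9\n"
)
print(load_wells(sample))
```

Only the two wells of gene 1221200 survive in this sample; the other two rows lack a Score or a Gene_ID.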
Step 2. Specify lower boundary and upper boundary.
Check "Reverse" only if you are looking for high-score wells.
Step 3. Specify the output file name and click "Run".
To understand the settings for "Boundaries" and "Reverse", please read the two "Command line examples" below.
Usage of The Command Line Version
The command line version is useful if you have many data files to analyze. You can put the commands in a Windows batch file and run them automatically.
Syntax: RSAConsole.exe [Options]
-lb: lower bound, defaults to 0
-ub: upper bound, defaults to 1
-isReverse: reverse hit picking; the higher the score, the better
if the -isReverse flag is off, the lower the score, the better
-inputFN: input file name
-outputFN: output file name
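For batch use, the command lines can also be generated by a script. The sketch below is a hypothetical helper, not part of the package. It assumes the option spelling shown in the "Command Line Examples" section (double dashes, file names passed with "="), and the "_rsa.csv" output naming is an arbitrary choice made here.

```python
from pathlib import PureWindowsPath

def build_command(input_csv, lb=0.0, ub=1.0, exe="RSAConsole.exe"):
    """Return the argument list for one RSAConsole.exe run.

    The output file is placed next to the input with a "_rsa" suffix
    (a naming convention assumed for this sketch only).
    """
    src = PureWindowsPath(input_csv)
    out = src.with_name(src.stem + "_rsa.csv")
    return [
        exe,
        "--lb", str(lb),
        "--ub", str(ub),
        f"--inputFN={src}",
        f"--outputFN={out}",
    ]

# Hypothetical input files; print the commands instead of running them.
for name in (r"C:\data\plate1.csv", r"C:\data\plate2.csv"):
    print(" ".join(build_command(name, lb=0.2, ub=0.8)))
```

On a Windows machine each list could be passed to subprocess.run(cmd, check=True), which takes care of quoting paths that contain spaces.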
Command Line Examples
RSAConsole.exe --lb 0.2 --ub 0.8 --inputFN="C:\input.csv" --outputFN="C:\output.csv"
wells with lower scores are considered more active
wells <=0.2 are guaranteed hits, wells >0.8 are guaranteed non-hits
wells (0.2,0.8] are determined by RSA algorithm
output results in output.csv file
RSAConsole.exe --lb 1.2 --ub 2.0 --isReverse --inputFN="C:\input.csv" --outputFN="C:\output.csv"
wells with higher scores are considered more active (specified by the -isReverse flag)
wells >=2.0 are guaranteed hits, wells <1.2 are guaranteed non-hits
wells in [1.2,2.0) are determined by the RSA algorithm
output results in output.csv file
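The boundary rules described in the two examples can be written out as a small function. This Python sketch only mirrors the stated intervals; it does not implement the RSA algorithm itself, and the name classify and its return labels are made up for illustration.

```python
def classify(score, lb, ub, reverse=False):
    """Apply the boundary rules from the two command line examples.

    Returns "hit" or "non-hit" for wells decided by the boundaries alone,
    and "rsa" for wells whose status is left to the RSA algorithm.
    """
    if reverse:
        # higher scores are more active
        if score >= ub:
            return "hit"
        if score < lb:
            return "non-hit"
    else:
        # lower scores are more active
        if score <= lb:
            return "hit"
        if score > ub:
            return "non-hit"
    return "rsa"

# First example: lb=0.2, ub=0.8, lower scores are better.
print(classify(0.15, 0.2, 0.8))            # hit
print(classify(0.50, 0.2, 0.8))            # rsa
print(classify(0.95, 0.2, 0.8))            # non-hit
# Second example: lb=1.2, ub=2.0, higher scores are better.
print(classify(2.5, 1.2, 2.0, reverse=True))   # hit
print(classify(1.0, 1.2, 2.0, reverse=True))   # non-hit
```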
Output Format
Gene_ID,Well_ID,Score: columns from input spreadsheet
LogP: RSA p-value in log10, e.g., -2 means 0.01
RSA_Hit: whether the well is a hit, 1 means yes, 0 means no
#hitWell: number of hit wells for the gene
#totalWell: total number of wells for the gene
if gene A has three wells w1, w2 and w3, and w1 and w2 are hits,
#totalWell should be 3, #hitWell should be 2, w1 and w2 should have RSA_Hit set as 1
and w3 should have RSA_Hit set as 0.
OPI_Rank: ranking column to sort all wells for hit picking
Cutoff_Rank: ranking column to sort all wells based on Score in the simple activity-based method
Note: a rank value of 999999 means the well is not a hit. We put a large rank number here
for the convenience of spreadsheet sorting.
Example A in output.csv:
-------------------------
1221200,7_O20,0.0541,-6.810,1,3,3,1,33
1221200,18_A21,0.0626,-6.810,1,3,3,2,43
1221200,41_A21,0.0765,-6.810,1,3,3,3,72
Gene ID 1221200 has three wells, 7_O20, 18_A21 and 41_A21. All show good scores.
Therefore 3 out of 3 wells are hits (#totalWell=3, #hitWell=3, RSA_Hit=1 for all three wells).
LogP is -6.810. These three wells are ranked as the best three wells by RSA.
However, they are ranked as the 33rd, 43rd and 72nd wells by the traditional cutoff method.
Example B in output.csv:
-------------------------
3620,21_I17,0.0537,-2.344,1,1,2,162,31
3620,44_I17,0.7335,-2.344,0,1,2,999999,4113
Gene ID 3620 has two wells: 21_I17 is active, while 44_I17 is relatively inactive.
RSA decides that only 1 out of the 2 wells is a hit. Therefore one well has RSA_Hit set as 1,
and the other 0. #totalWell=2, but #hitWell=1.
The first well is the 162nd hit by RSA, 31st by the cutoff method.
The second well is not a hit by RSA, 4113th by the cutoff method.
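The rows of Examples A and B can be read back with a few lines of Python. This is a sketch under stated assumptions: the column order is taken from the "Output Format" list, a header row starting with Gene_ID is skipped if present, and the names parse_output and NOT_A_HIT are introduced here for illustration.

```python
import csv
import io

COLUMNS = ["Gene_ID", "Well_ID", "Score", "LogP", "RSA_Hit",
           "#hitWell", "#totalWell", "OPI_Rank", "Cutoff_Rank"]
NOT_A_HIT = 999999  # rank value used for wells that are not hits

def parse_output(handle):
    """Read output rows into dicts with typed values."""
    rows = []
    for rec in csv.reader(handle):
        if not rec or rec[0] == "Gene_ID":
            continue  # skip blank lines and a header row, if there is one
        raw = dict(zip(COLUMNS, rec))
        rows.append({
            "Gene_ID": raw["Gene_ID"],
            "Well_ID": raw["Well_ID"],
            "Score": float(raw["Score"]),
            "p_value": 10 ** float(raw["LogP"]),  # LogP is log10 of the p-value
            "RSA_Hit": raw["RSA_Hit"] == "1",
            "hitWell": int(raw["#hitWell"]),
            "totalWell": int(raw["#totalWell"]),
            "OPI_Rank": int(raw["OPI_Rank"]),
            "Cutoff_Rank": int(raw["Cutoff_Rank"]),
        })
    return rows

# The five rows shown in Examples A and B.
text = io.StringIO(
    "1221200,7_O20,0.0541,-6.810,1,3,3,1,33\n"
    "1221200,18_A21,0.0626,-6.810,1,3,3,2,43\n"
    "1221200,41_A21,0.0765,-6.810,1,3,3,3,72\n"
    "3620,21_I17,0.0537,-2.344,1,1,2,162,31\n"
    "3620,44_I17,0.7335,-2.344,0,1,2,999999,4113\n"
)
rows = parse_output(text)
hits = sorted((r for r in rows if r["RSA_Hit"]), key=lambda r: r["OPI_Rank"])
for r in hits:
    print(r["Gene_ID"], r["Well_ID"], r["OPI_Rank"], r["Cutoff_Rank"])
```

Sorting by OPI_Rank alone gives the same order, because non-hit wells carry the rank 999999 and fall to the bottom.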
Credits
Bin Zhou, bzhou_at_gnf_dot_org & Yingyao Zhou, yzhou_at_gnf_dot_org, Oct 28, 2008