Raw Data file formats:
1) Suffix: no limitations.
2) Separating token: tab delimiter.
3) Format:
1st line: contains the string "probeId" and a tab delimiter, followed by the string "geneSymbol" and a tab delimiter, followed by the names of all conditions separated by tab delimiters. Each condition name can appear either as a stand-alone name or in the format <category name>/<condition name>. In the second case, Expander uses the condition categories as condition classifications.
Next lines: Each subsequent line consists of the probe ID (an identifier string that is unique to each probe on the chip), followed by a string that represents the gene full name (if missing, it can be left empty by adding an additional tab delimiter), followed by the probe's expression values (all tab delimited). If the expression file contains missing values, Expander replaces them with 0 values when the data is oligonucleotide array data. Expander currently does not handle missing values in cDNA microarray data.
*For example see files “expressionData1.txt” and “expressionData2.txt” in the Expander/sample_input_files/ directory.
If the data is not in the above format, it may be possible to load it using the “Advanced” dialog box, which appears upon pressing the “Advanced” button in the Expression Data load dialog box (see Loading Input Data).
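The raw data layout described above can be checked with a short script before loading a file into Expander. The following is a minimal sketch in Python, not part of Expander itself: the function name parse_expression_data is hypothetical, and representing a missing value as None is an assumption made for illustration (Expander applies its own handling, as described above).

```python
import csv
import io


def parse_expression_data(handle):
    """Parse a tab-delimited expression file into (conditions, rows).

    Hypothetical helper for illustration; not an Expander API.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    if header[:2] != ["probeId", "geneSymbol"]:
        raise ValueError("header must start with probeId<TAB>geneSymbol")

    # A condition label is either "name" or "<category name>/<condition name>".
    conditions = []
    for label in header[2:]:
        category, sep, name = label.partition("/")
        conditions.append((category, name) if sep else (None, label))

    rows = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        probe_id, gene = fields[0], fields[1]
        # Assumption: an empty field marks a missing value, kept here as None.
        values = [float(v) if v.strip() else None for v in fields[2:]]
        rows.append((probe_id, gene, values))
    return conditions, rows


sample = (
    "probeId\tgeneSymbol\tctrl/t0\tctrl/t1\ttreated\n"
    "p1\tTP53\t1.5\t\t3.0\n"
    "p2\t\t0.5\t2.0\t4.0\n"
)
conditions, rows = parse_expression_data(io.StringIO(sample))
```

In this sample, the first two conditions carry the category "ctrl", the third is a stand-alone name, and the second probe has an empty gene field.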
Gene Sets file:
1) Suffix: no limitations.
2) Format: Each line contains a gene ID, a gene symbol (optional) and
the number of its set. Each field in the line should be tab separated from the
previous field.
The gene IDs are expected to be of the same convention used in the GO
annotation and TF fingerprint files. For details regarding the Gene ID
convention that is used for each organism, refer to the Supplied
files section.
*For example see file “geneSetsData1.txt” under the Expander/sample_input_files/ directory (see Sample input files for more details).
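As an illustration of the gene sets format, the sketch below groups gene IDs by set number. It is a hedged example rather than Expander code: the function name parse_gene_sets is hypothetical, and it assumes that the gene ID is always the first field and the set number is always the last field, whether or not the optional gene symbol is present.

```python
from collections import defaultdict


def parse_gene_sets(lines):
    """Group gene IDs by set number (hypothetical helper, not an Expander API)."""
    sets = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 2 or not fields[0]:
            continue  # skip blank or incomplete lines
        # Assumption: first field is the gene ID, last field is the set number;
        # the optional gene symbol, if present, sits between them.
        gene_id, set_number = fields[0], int(fields[-1])
        sets[set_number].append(gene_id)
    return dict(sets)


gene_sets = parse_gene_sets(["7157\tTP53\t1", "672\t1", "1956\tEGFR\t2"])
```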
Suffix: Currently, there are no limitations regarding the file name suffix.
Format: Each line contains the probe id as it appears in the data file,
a tab separator and the corresponding gene ID (e.g. Entrez/Locus-Link ids for
mouse and human genes and ORF codes for yeast). The second field can be left
blank, indicating no conversion for that probe ID.
* Several probe IDs in the data file may be mapped to the
same gene ID (e.g. several ESTs from the same gene).
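The conversion format above amounts to a probe-to-gene lookup table in which several probes may share one gene ID. The following minimal sketch shows one way to read it; the function names are hypothetical, and storing a blank second field as None (meaning "no conversion") is an assumption made for illustration.

```python
def parse_id_conversion(lines):
    """Map each probe ID to a gene ID, or None when no conversion is given."""
    mapping = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        probe_id, _, gene_id = line.partition("\t")
        # Assumption: a blank (or absent) second field means "no conversion".
        mapping[probe_id] = gene_id.strip() or None
    return mapping


def probes_by_gene(mapping):
    """Invert the mapping to list all probes that share a gene ID."""
    grouped = {}
    for probe_id, gene_id in mapping.items():
        if gene_id is not None:
            grouped.setdefault(gene_id, []).append(probe_id)
    return grouped


conversion = parse_id_conversion(["p1\t7157", "p2\t7157", "p3\t", "p4"])
shared = probes_by_gene(conversion)
```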
1) Suffix: no limitations.
2) Format: Each line contains the probeID, a tab separator and the
number of its cluster.
Cluster number 0 is reserved for genes that are left unclustered. The file does
not have to contain all genes in the data. If a gene does not appear in the
file, it is automatically set as unclustered.
*For example see file “expressionData1Clustering.sol” (a clustering solution for the data file “expressionData1.txt”) under the Expander/sample_input_files/ directory (see Sample input files section for more details).
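The rule that absent genes are treated as unclustered can be sketched as follows. This is an illustrative Python example under stated assumptions, not Expander's implementation: the function name load_clustering and the all_probe_ids argument are hypothetical, and the sketch simply keeps any probe that is listed in the solution file but not in the data.

```python
UNCLUSTERED = 0  # cluster number 0 is reserved for unclustered genes


def load_clustering(lines, all_probe_ids):
    """Return {probe ID: cluster number}; unlisted probes default to 0."""
    # Start with every probe in the data marked as unclustered.
    assignment = {probe_id: UNCLUSTERED for probe_id in all_probe_ids}
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        # Probes not present in all_probe_ids are kept as well (assumption).
        assignment[fields[0]] = int(fields[1])
    return assignment


clusters = load_clustering(["p1\t1", "p2\t2", "p3\t0"], ["p1", "p2", "p3", "p4"])
```

Here p4 does not appear in the solution lines, so it ends up with cluster number 0, the same as the explicitly unclustered p3.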
1) Suffix: `.bic`.
2) Format: the file is composed of two parts, presented here.
Part 1 presents a summary of the biclusters found.
Part 2 presents the probesets and the conditions contained in each bicluster.
1) Suffix: no limitations.
2) Format: each line should contain one gene ID. The gene IDs are expected to be of the same convention used in the annotation and TF fingerprint files for the organism you are working on (please refer to the Supplied files section).