Importing Data
CyDAS stores cytogenetic data along with data on patients and their examinations (investigations) in the CyDAS database.
Import
The data import utility is accessed from the main window via the File-Import menu. On the Import form, choose the type of data you wish to import.
There are default values for the file name containing the data for each type of import filter. If the data are stored in a file with a different name, you can select it by clicking the "..." button which opens a common file open dialog.
Click "OK" to start the import. When the import has finished, you will be informed of the quantities of data imported.
All new data are imported into the system group "NEW" and can be transferred into one or more groups / subgroups from this location (see Working with Groups).
Mitelman
Data downloaded from the Mitelman database and stored as a plain text file can be imported here (see also "How to download data from the Mitelman database").
You may also use this option for importing your own data, if they are structured as are the Mitelman data:
- 1st field: Reference Number: identifier of a publication
- 2nd field: Case Number
- 3rd field: Investigation Number
- 4th field: Author, Year (publication data)
- 5th field: Journal Name (publication data)
- 6th field: Volume, Page (publication data)
- 7th field: Morphology
- 8th field: Topography
- 9th field: Short Karyotype (ISCN formula of all clones of the karyotypes).
Fields must be separated by a tabulator. If no data are present for a field, that field must still be present but left empty or 0 for number fields. Data must not be surrounded by quotes. A header line is expected to be present.
Example:
Reference Number	Case Number	Investigation Number	Author, Year	Journal Name	Volume, Page	Morphology	Topography	Short Karyotype
3409	1	1	Abdi et al 1990	J Pakistan Med Ass	40:9-11	Acute lymphoblastic leukemia, FAB type L1		48,XY,+2,+8,t(13;22)(q?;q?)
3409	3	1	Abdi et al 1990	J Pakistan Med Ass	40:9-11	Acute lymphoblastic leukemia, FAB type L1		48,XY,-11,+3mar
3409	6	1	Abdi et al 1990	J Pakistan Med Ass	40:9-11	Acute lymphoblastic leukemia, FAB type L2		47,XX,+18
1139	1	1	Abe & Sandberg 1984	Cancer Genet Cytogenet	13:121-127	Acute lymphoblastic leukemia, NOS		46,XY,t(4;11)(q21;q23)
606	1	1	Abe et al 1979	Am J Hematol	6:259-266	Acute lymphoblastic leukemia, NOS		46,XY,del(5)(q12q23),del(9)(p21)
410	1	1	Abe et al 1982	Cancer Genet Cytogenet	7:185-195	Acute lymphoblastic leukemia, FAB type L3		46,XY,t(8;22)(q24;q12)/46,idem,+del(1)(p22),-22/46,idem,add(1)(q?),+del(1),-5
838	1	1	Abe et al 1983	Cancer Genet Cytogenet	9:139-144	Acute lymphoblastic leukemia, NOS		46,XX,del(11)(q13q23),ins(19;11)(p13;q13q23)
1162	1	1	Abe et al 1985	Cancer Genet Cytogenet	14:45-59	Acute lymphoblastic leukemia, FAB type L2		48-52,XX,+7,+11,+12,+13,+14,i(17)(q10),+20,+22
1162	1	2	Abe et al 1985	Cancer Genet Cytogenet	14:45-59	Acute lymphoblastic leukemia, FAB type L2		103,XXXX,+2,-4,+7,+7,+11,+12,+12,+13,+14,+16,i(17)(q10)x2,+20,+20,+22/53,XX,+X,+7,+11,+12,+13,i(17)(q10),+20,+22
1303	1	1	Abe et al 1985	Cancer Genet Cytogenet	18:49-54	Acute lymphoblastic leukemia, FAB type L2		46,XX,t(9;22)(q34;q11)
2398	1	1	Abe et al 1988	Cancer Genet Cytogenet	31:279-283	Acute lymphoblastic leukemia, NOS		46,XY,del(9)(p13p22),add(10)(p11),del(11)(q21q23)
5513	1	1	Abeliovich et al 1994	Cancer Genet Cytogenet	76:70-71	Acute lymphoblastic leukemia, FAB type L2		45,X,-Y/46,XY,t(9;22)(q34;q11)
1059	1	1	Abromowitch et al 1984	Br J Haematol	56:409-416	Acute lymphoblastic leukemia, FAB type L1		46,XY,t(1;19)(q23;p13),add(13)(q?)
1059	1	2	Abromowitch et al 1984	Br J Haematol	56:409-416	Acute lymphoblastic leukemia, FAB type L1		85,XXYY,-1,t(1;19),-2,-3,-4,del(4)(q23),-5,del(5)(p13),del(6)(q15),-7,+8,+8,+del(8)(p21),-9,-10,-12,dup(14)(q13q32)x2,-16,-17,-18,der(19)t(1;19),+20,+21,+21,-22,-22,+mar
4455	1	1	Abshire et al 1992	Leukemia	6:357-362	Acute lymphoblastic leukemia, NOS		44,X,-X,-20,t(20;22)(p?;q?),-22
4455	1	2	Abshire et al 1992	Leukemia	6:357-362	Acute lymphoblastic leukemia, NOS		44,X,-X,del(2)(q?),t(6;9)(q?;q?),-20,+t(20;22),-22
879	8	1	Aide et al 1981	Acta Acad Med Wuhan	1:7-15	Acute lymphoblastic leukemia, NOS		46,XY,t(9;22)(q34;q11)
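Before importing your own data through the Mitelman option, it can help to check that every line really has nine tab-separated fields and no quoted values. The following is a minimal sketch of such a check, not part of CyDAS; the function name check_mitelman and the wording of the messages are assumptions made for illustration.

```python
MITELMAN_FIELD_COUNT = 9  # taken from the field list of the Mitelman format


def check_mitelman(text):
    """Return a list of (line_number, problem) for Mitelman-style text.

    Hypothetical helper: it only checks the layout rules stated in this
    chapter (tab separator, nine fields, no quotes, one header line).
    """
    problems = []
    lines = text.splitlines()
    # Line 1 is the expected header line, so checking starts at line 2.
    for number, line in enumerate(lines[1:], start=2):
        fields = line.split("\t")
        if len(fields) != MITELMAN_FIELD_COUNT:
            problems.append((number, f"expected 9 fields, found {len(fields)}"))
        elif any(len(f) > 1 and f[0] == '"' and f[-1] == '"' for f in fields):
            problems.append((number, "quoted field"))
    return problems
```

An empty list means that no layout problem was found; the ISCN formulas themselves are not checked by this sketch.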
Custom Filters
Several options for importing your own data have been prepared and are described below. You can write your own import filters based on the basic concepts of the filters for the Mitelman data, the CDB structure and the CGH structure. You may also remove, rename or modify existing filters. All filters expect the data to be stored in flat text files. For more details, see the chapter on Import Filters.
For simplicity, it is suggested that you use the pre-defined filters described below.
CDB1 (custom database #1)
The pre-defined CDB types differ in the field separator (pipe "|" or tabulator) and in the notation of polyclonal karyotypes (one clone of a karyotype per line, or all clones of a karyotype in one field).
With CDB1, one line with the following configuration is required for each clone of a karyotype:
- 1st field: patient identifier
- 2nd field: name of patient
- 3rd field: date of birth
- 4th field: sex (M male, F female)
- 5th field: diagnosis (morphology)
- 6th field: case number (ID of investigation/examination)
- 7th field: date of investigation/examination
- 8th field: age (in years; if 0, it will be calculated from date of investigation and date of birth)
- 9th field: sample / topography
- 10th field: total number of metaphases examined
- 11th field: number of the clone of the karyotype
- 12th field: number of modal chromosomes
- 13th field: ISCN formula of a clone of the karyotype
- 14th field: clone size (number of metaphases; if not specified, it may be calculated from the ISCN formula)
Fields must be separated by a pipe character ("|"). If no data are present for a field, that field must still be present but left empty or 0 for number fields. Data may optionally be surrounded by quotes ("). No header line is present.
If your database does not have a patient identifier, you may use the case number instead. The few patients with more than one examination will then be imported as several distinct patients, which generally does not cause problems.
Example:
1 | 2 | 3 |4| 5 | 6 | 7 |8| 9 |10|11|12| 13 |14
PID01|Hans Miller|15.08.1937|M|Adenocarcinoma|IID001|21.11.1997|0|Bladder|32|1|47|47,XY,+7|15
PID01|Hans Miller|15.08.1937|M|Adenocarcinoma|IID002|11.01.1998|0|Bladder|40|1|47|47,XY,+7|22
PID01|Hans Miller|15.08.1937|M|Adenocarcinoma|IID002|11.01.1998|0|Bladder|40|2|48|48,XY,+7,+8|12
PID02|Maria Meier|23.04.1944|F|Adenoma|IID003|15.02.1998|0|Bladder|0|1|0|47,XY,+X|11
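The rule for the 8th field (an age of 0 is calculated from the two dates) can be illustrated with a short sketch. It is not the CyDAS implementation: the names CDB1_FIELDS, parse_cdb1_line and age_in_years are hypothetical, the date format DD.MM.YYYY is taken from the examples in this chapter, and counting completed years is an assumption about how the age is derived.

```python
from datetime import datetime

# Hypothetical field names, in the order of the CDB1 field list.
CDB1_FIELDS = [
    "patient_id", "name", "birth_date", "sex", "diagnosis", "case_id",
    "exam_date", "age", "sample", "metaphases", "clone_no",
    "modal_number", "iscn", "clone_size",
]


def parse_cdb1_line(line):
    """Split one CDB1 line at the pipes and strip optional quotes."""
    values = [v.strip().strip('"') for v in line.rstrip("\n").split("|")]
    if len(values) != len(CDB1_FIELDS):
        raise ValueError(f"expected 14 fields, found {len(values)}")
    return dict(zip(CDB1_FIELDS, values))


def age_in_years(record):
    """Return the given age, or completed years between the dates if it is 0."""
    given = int(record["age"] or 0)
    if given:
        return given
    born = datetime.strptime(record["birth_date"], "%d.%m.%Y")
    seen = datetime.strptime(record["exam_date"], "%d.%m.%Y")
    # Subtract one year if the birthday had not yet been reached.
    before_birthday = (seen.month, seen.day) < (born.month, born.day)
    return seen.year - born.year - int(before_birthday)
```

For the first example line (born 15.08.1937, examined 21.11.1997, age field 0), this sketch yields 60. Note that the plain split would break on a pipe character inside a quoted value.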
CDB2 (custom database #2)
Custom database #2 has essentially the same data structure as CDB1, but the field separator is a tabulator. As with CDB1, data may optionally be surrounded by quotes ("). No header line is present.
Example:
1	2	3	4	5	6	7	8	9	10	11	12	13	14
PID01	Hans Miller	15.08.1937	M	Adenocarcinoma	IID001	21.11.1997	0	Bladder	32	1	47	47,XY,+7	15
PID01	Hans Miller	15.08.1937	M	Adenocarcinoma	IID002	11.01.1998	0	Bladder	40	1	47	47,XY,+7	22
PID01	Hans Miller	15.08.1937	M	Adenocarcinoma	IID002	11.01.1998	0	Bladder	40	2	48	48,XY,+7,+8	12
PID02	Maria Meier	23.04.1944	F	Adenoma	IID003	15.02.1998	0	Bladder	22	1	47	47,XY,+X	11
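Because the pipe-separated and tabulator-separated variants share the same field order, a file in one layout can be rewritten into the other. A minimal sketch, assuming that no value contains a pipe or a tabulator; the function name pipe_to_tab is hypothetical and not part of CyDAS:

```python
def pipe_to_tab(line):
    """Rewrite one pipe-separated line as a tabulator-separated line."""
    fields = line.rstrip("\n").split("|")
    # A tabulator inside a value would shift all following fields.
    if any("\t" in field for field in fields):
        raise ValueError("a field already contains a tabulator")
    return "\t".join(fields)
```

Empty fields stay in place, so the number of fields per line does not change.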
CDB3 (custom database #3)
Again, custom database 3 is similar to CDB1. But here, all clones of a karyotype are given in one field. The clones are separated from each other with forward slashes ("/") and their respective clone sizes are optionally given in square brackets ("[]").
One line with the following configuration is required for each case (or examination or investigation):
- 1st field: patient identifier
- 2nd field: name of patient
- 3rd field: date of birth
- 4th field: sex (M male, F female)
- 5th field: diagnosis (morphology)
- 6th field: case number (ID of investigation/examination)
- 7th field: date of investigation/examination
- 8th field: age (in years; if 0, it will be calculated from date of investigation and date of birth)
- 9th field: sample / topography
- 10th field: total number of metaphases examined
- 11th field: number of modal chromosomes
- 12th field: ISCN formula of all clones of the karyotype (if there is more than one clone, the clones are separated by a "/"; clone sizes are optionally given in square brackets ("[]"))
Fields are separated by a pipe character ("|"). If no data are present for a field, that field must still be present but left empty or 0 for number fields. Data may optionally be surrounded by quotes ("). No header line is present.
Example:
1 | 2 | 3 |4| 5 | 6 | 7 |8| 9 |10|11| 12
PID01|Hans Miller|15.08.1937|M|Adenocarcinoma|IID001|21.11.1997|0|Bladder|32|47|47,XY,+7[15]
PID01|Hans Miller|15.08.1937|M|Adenocarcinoma|IID002|11.01.1998|0|Bladder|40|47|47,XY,+7[22]/48,XY,+7,+8[12]
PID02|Maria Meier|23.04.1944|F|Adenoma|IID003|15.02.1998|0|Bladder|22|47|47,XY,+X[11]
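The notation of the 12th field can be taken apart as follows. This is an illustrative sketch only, not the parser used by CyDAS: the function name split_clones is hypothetical, and it assumes that a slash occurs only between clones and that a clone size is a plain number in square brackets at the end of a clone.

```python
import re

# A clone size such as "[22]" at the very end of a clone.
CLONE_SIZE = re.compile(r"\[(\d+)\]\s*$")


def split_clones(karyotype):
    """Return a list of (ISCN formula, clone size or None), one per clone."""
    clones = []
    for part in karyotype.split("/"):
        part = part.strip()
        match = CLONE_SIZE.search(part)
        if match:
            clones.append((part[:match.start()].strip(), int(match.group(1))))
        else:
            # No size given; the field list marks the size as optional.
            clones.append((part, None))
    return clones
```

Each returned pair corresponds to one line of the CDB1 layout, where the formula and the clone size of a clone are stored in separate fields.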
CDB4 (custom database #4)
Custom database 4 corresponds to custom database 3, but the field separator is a tabulator. As with CDB3, data may optionally be surrounded by quotes ("). No header line is present.
Example:
1	2	3	4	5	6	7	8	9	10	11	12
PID01	Hans Miller	15.08.1937	M	Adenocarcinoma	IID001	21.11.1997	0	Bladder	32	47	47,XY,+7[15]
PID01	Hans Miller	15.08.1937	M	Adenocarcinoma	IID002	11.01.1998	0	Bladder	40	47	47,XY,+7[22]/48,XY,+7,+8[12]
PID02	Maria Meier	23.04.1944	F	Adenoma	IID003	15.02.1998	0	Bladder	22	47	47,XY,+X[11]
CGH1
The CGH option closely resembles the CDB options, but is adapted for use with CGH data.
One line with the following configuration is required for each case (or examination or investigation):
- 1st field: patient identifier
- 2nd field: name of patient
- 3rd field: date of birth
- 4th field: sex (M male, F female)
- 5th field: diagnosis (morphology)
- 6th field: case number (ID of investigation/examination)
- 7th field: date of investigation/examination
- 8th field: age (in years; if 0, it will be calculated from date of investigation and date of birth)
- 9th field: sample / topography
- 10th field: CGH formula
Fields are separated by a pipe character ("|"). If no data are present for a field, that field must still be present but left empty or 0 for number fields. Data may optionally be surrounded by quotes ("). No header line is present.
If your database does not have a patient identifier, you may use the case number instead. The few patients with more than one examination will then be imported as several distinct patients, which generally does not cause problems.
Example:
1| 2 | 3 |4| 5 | 6| 7 |8| 9| 10
74|Hans Miller|15.08.1937|M|Retinoblastoma|50|21.6.2004||Eye|Rev ish enh(1q21q32,7, 6p22p25, 19, 20q), dim(10q23q26,16q)
74|Hans Miller|15.08.1937|M|Retinoblastoma|36|10.1.2005||Eye|Rev ish enh(1q21q32,1q43qter, 6p)
461|Maria Meier|23.04.1944|F|Retinoblastoma|7|15.1.2005|40|Eye|Rev ish enh(2p12pter; 6p)
734|Joseph Schulz|24.11.1951|M|Retinoblastoma|43|8.1.2005||Eye|Rev ish enh(1q,4q27q35, 6p, 14q22q32.3, 17q22q25,19), dim(16q12.2qter)
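To check the 10th field before an import, the regions listed inside the enh(...) and dim(...) groups can be pulled out of the formula. The sketch below is not the CyDAS parser; the function name cgh_regions is hypothetical, and it assumes that regions inside a group are separated by commas or semicolons, as in the examples of this chapter.

```python
import re

# One "enh(...)" or "dim(...)" group with its content.
CGH_GROUP = re.compile(r"(enh|dim)\(([^)]*)\)")


def cgh_regions(formula):
    """Return the regions of the enh and dim groups of a CGH formula."""
    regions = {"enh": [], "dim": []}
    for kind, body in CGH_GROUP.findall(formula):
        for part in re.split(r"[,;]", body):
            part = part.strip()
            if part:
                regions[kind].append(part)
    return regions
```

A formula without a dim group simply yields an empty list under "dim".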
CGH2
CGH2 corresponds to CGH1, but the field separator is a tabulator. As with CGH1, data may optionally be surrounded by quotes ("). No header line is present.
Example:
1	2	3	4	5	6	7	8	9	10
74	Hans Miller	15.08.1937	M	Retinoblastoma	50	21.6.2004		Eye	Rev ish enh(1q21q32,7, 6p22p25, 19, 20q), dim(10q23q26,16q)
74	Hans Miller	15.08.1937	M	Retinoblastoma	36	10.1.2005		Eye	Rev ish enh(1q21q32,1q43qter, 6p)
461	Maria Meier	23.04.1944	F	Retinoblastoma	7	15.1.2005	40	Eye	Rev ish enh(2p12pter; 6p)
734	Joseph Schulz	24.11.1951	M	Retinoblastoma	43	8.1.2005		Eye	Rev ish enh(1q,4q27q35, 6p, 14q22q32.3, 17q22q25,19), dim(16q12.2qter)