Importing Data

CyDAS stores cytogenetic data along with data on patients and their examinations (investigations) in the CyDAS database.

Import

The data import utility is accessed from the main window via the File-Import menu. On the Import form, choose the type of data you wish to import.

Each import filter has a default name for the file containing the data. If your data are stored in a file with a different name, select that file by clicking the "..." button, which opens a standard file-open dialog.

Click "OK" to start the import. When the import has finished, you will be informed of the quantities of data imported.

All new data are imported into the system group "NEW" and can be transferred into one or more groups / subgroups from this location (see Working with Groups).

Mitelman

Data downloaded from the Mitelman database and stored as a plain text file can be imported here (see also "How to download data from the Mitelman database").

You may also use this option to import your own data, provided they are structured like the Mitelman data:

  • 1st field: Reference Number: identifier of a publication
  • 2nd field: Case Number
  • 3rd field: Investigation Number
  • 4th field: Author, Year (publication data)
  • 5th field: Journal Name (publication data)
  • 6th field: Volume, Page (publication data)
  • 7th field: Morphology
  • 8th field: Topography
  • 9th field: Short Karyotype (ISCN formula of all clones of the karyotype)

Fields must be separated by a tabulator. If no data are present for a field, the field must still be present: leave it empty or, for number fields, set it to 0. Data must not be surrounded by quotes. A header line is expected to be present.

Example:

Reference Number    Case Number     Investigation Number    Author, Year    Journal Name    Volume, Page    Morphology      Topography      Short Karyotype
3409    1       1       Abdi et al 1990 J Pakistan Med Ass      40:9-11 Acute lymphoblastic leukemia, FAB type L1               48,XY,+2,+8,t(13;22)(q?;q?)
3409    3       1       Abdi et al 1990 J Pakistan Med Ass      40:9-11 Acute lymphoblastic leukemia, FAB type L1               48,XY,-11,+3mar
3409    6       1       Abdi et al 1990 J Pakistan Med Ass      40:9-11 Acute lymphoblastic leukemia, FAB type L2               47,XX,+18
1139    1       1       Abe & Sandberg 1984     Cancer Genet Cytogenet  13:121-127      Acute lymphoblastic leukemia, NOS               46,XY,t(4;11)(q21;q23)
606     1       1       Abe et al 1979  Am J Hematol    6:259-266       Acute lymphoblastic leukemia, NOS               46,XY,del(5)(q12q23),del(9)(p21)
410     1       1       Abe et al 1982  Cancer Genet Cytogenet  7:185-195       Acute lymphoblastic leukemia, FAB type L3               46,XY,t(8;22)(q24;q12)/46,idem,+del(1)(p22),-22/46,idem,add(1)(q?),+del(1),-5
838     1       1       Abe et al 1983  Cancer Genet Cytogenet  9:139-144       Acute lymphoblastic leukemia, NOS               46,XX,del(11)(q13q23),ins(19;11)(p13;q13q23)
1162    1       1       Abe et al 1985  Cancer Genet Cytogenet  14:45-59        Acute lymphoblastic leukemia, FAB type L2               48-52,XX,+7,+11,+12,+13,+14,i(17)(q10),+20,+22
1162    1       2       Abe et al 1985  Cancer Genet Cytogenet  14:45-59        Acute lymphoblastic leukemia, FAB type L2               103,XXXX,+2,-4,+7,+7,+11,+12,+12,+13,+14,+16,i(17)(q10)x2,+20,+20,+22/53,XX,+X,+7,+11,+12,+13,i(17)(q10),+20,+22
1303    1       1       Abe et al 1985  Cancer Genet Cytogenet  18:49-54        Acute lymphoblastic leukemia, FAB type L2               46,XX,t(9;22)(q34;q11)
2398    1       1       Abe et al 1988  Cancer Genet Cytogenet  31:279-283      Acute lymphoblastic leukemia, NOS               46,XY,del(9)(p13p22),add(10)(p11),del(11)(q21q23)
5513    1       1       Abeliovich et al 1994   Cancer Genet Cytogenet  76:70-71        Acute lymphoblastic leukemia, FAB type L2               45,X,-Y/46,XY,t(9;22)(q34;q11)
1059    1       1       Abromowitch et al 1984  Br J Haematol   56:409-416      Acute lymphoblastic leukemia, FAB type L1               46,XY,t(1;19)(q23;p13),add(13)(q?)
1059    1       2       Abromowitch et al 1984  Br J Haematol   56:409-416      Acute lymphoblastic leukemia, FAB type L1               85,XXYY,-1,t(1;19),-2,-3,-4,del(4)(q23),-5,del(5)(p13),del(6)(q15),-7,+8,+8,+del(8)(p21),-9,-10,-12,dup(14)(q13q32)x2,-16,-17,-18,der(19)t(1;19),+20,+21,+21,-22,-22,+mar
4455    1       1       Abshire et al 1992      Leukemia        6:357-362       Acute lymphoblastic leukemia, NOS               44,X,-X,-20,t(20;22)(p?;q?),-22
4455    1       2       Abshire et al 1992      Leukemia        6:357-362       Acute lymphoblastic leukemia, NOS               44,X,-X,del(2)(q?),t(6;9)(q?;q?),-20,+t(20;22),-22
879     8       1       Aide et al 1981 Acta Acad Med Wuhan     1:7-15  Acute lymphoblastic leukemia, NOS               46,XY,t(9;22)(q34;q11)
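
If you prepare your own data in this layout, it can help to check a file before importing it. The following Python sketch is not part of CyDAS; it only illustrates the rules stated above (tab-separated, nine fields, one header line, no quotes, 0 or empty for missing numbers). The field labels and the function name read_mitelman are hypothetical names chosen for this illustration.

```python
import csv
import io

# Hypothetical labels for the nine columns described above.
MITELMAN_FIELDS = [
    "reference_number", "case_number", "investigation_number",
    "author_year", "journal", "volume_page",
    "morphology", "topography", "short_karyotype",
]

def read_mitelman(stream):
    """Yield one dict per data line of a tab-separated Mitelman-style file."""
    # QUOTE_NONE: quotes are not field delimiters in this format.
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the expected header line
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        if len(row) != len(MITELMAN_FIELDS):
            raise ValueError(f"expected 9 fields, got {len(row)}: {row!r}")
        record = dict(zip(MITELMAN_FIELDS, row))
        # Number fields may be empty; treat an empty field as 0.
        for key in ("reference_number", "case_number", "investigation_number"):
            record[key] = int(record[key] or 0)
        yield record

sample = (
    "Reference Number\tCase Number\tInvestigation Number\tAuthor, Year\t"
    "Journal Name\tVolume, Page\tMorphology\tTopography\tShort Karyotype\n"
    "1139\t1\t1\tAbe & Sandberg 1984\tCancer Genet Cytogenet\t13:121-127\t"
    "Acute lymphoblastic leukemia, NOS\t\t46,XY,t(4;11)(q21;q23)\n"
)
records = list(read_mitelman(io.StringIO(sample)))
print(records[0]["short_karyotype"])
```

Note that the empty Topography field is still present between two tabulators, as required.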

Custom Filters

Several options for importing your own data are provided and are described below. You can write your own import filters based on the concepts behind the filters for the Mitelman data, the CDB structure, and the CGH structure. You may also remove, rename, or modify existing filters. All filters expect the data to be stored in flat text files. For more details, see the chapter on Import Filters.

For simplicity, it is suggested that you use the pre-defined filters described below.

CDB1 (custom database #1)

The pre-defined CDB types differ in the field separator (pipe "|" or tabulator) and in the notation of polyclonal karyotypes (one clone of a karyotype per line, or all clones of a karyotype in one field).

With CDB1, one line with the following configuration is required for each clone of a karyotype:

  • 1st field: patient identifier
  • 2nd field: name of patient
  • 3rd field: date of birth
  • 4th field: sex (M male, F female)
  • 5th field: diagnosis (morphology)
  • 6th field: case number (ID of investigation/examination)
  • 7th field: date of investigation/examination
  • 8th field: age (in years; if 0, it will be calculated from date of investigation and date of birth)
  • 9th field: sample / topography
  • 10th field: total number of metaphases examined
  • 11th field: number of the clone of the karyotype
  • 12th field: number of modal chromosomes
  • 13th field: ISCN formula of a clone of the karyotype
  • 14th field: clone size (number of metaphases; if not specified, it may be calculated from the ISCN formula)

Fields must be separated by a pipe character ("|"). If no data are present for a field, the field must still be present: leave it empty or, for number fields, set it to 0. Data may optionally be surrounded by quotes ("). No header line is present.

If your database does not have a patient identifier, you may use the case number instead. However, patients with more than one examination will then be imported as several distinct patients. Because such patients are usually few, this generally does not cause problems.

Example (the first line only numbers the columns and is not part of the data):

  1  |    2      |     3    |4|     5        |  6   |    7     |8|   9   |10|11|12|  13   |14
PID01|Hans Miller|15.08.1937|M|Adenocarcinoma|IID001|21.11.1997|0|Bladder|32|1|47|47,XY,+7|15
PID01|Hans Miller|15.08.1937|M|Adenocarcinoma|IID002|11.01.1998|0|Bladder|40|1|47|47,XY,+7|22
PID01|Hans Miller|15.08.1937|M|Adenocarcinoma|IID002|11.01.1998|0|Bladder|40|2|48|48,XY,+7,+8|12
PID02|Maria Meier|23.04.1944|F|Adenoma|IID003|15.02.1998|0|Bladder|0|1|0|47,XY,+X|11

CDB2 (custom database #2)

Custom database #2 has essentially the same data structure as CDB1, but the field separator is a tabulator. As with CDB1, data may optionally be surrounded by quotes ("). No header line is present.

Example (the first line only numbers the columns and is not part of the data):

  1           2             3        4         5               6          7        8       9      10   11    12        13     14
PID01    Hans Miller    15.08.1937   M    Adenocarcinoma    IID001   21.11.1997    0    Bladder   32    1    47    47,XY,+7   15
PID01    Hans Miller    15.08.1937   M    Adenocarcinoma    IID002   11.01.1998    0    Bladder   40    1    47    47,XY,+7   22
PID01    Hans Miller    15.08.1937   M    Adenocarcinoma    IID002   11.01.1998    0    Bladder   40    2    48    48,XY,+7,+8   12
PID02    Maria Meier    23.04.1944   F    Adenoma    IID003   15.02.1998    0    Bladder   22    1    47    47,XY,+X   11
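
Because CDB2 differs from the pipe-separated layout only in the field separator, an existing pipe-separated file could be converted with a few lines of Python. This is a sketch for illustration, not a CyDAS feature; the function name cdb1_to_cdb2 is hypothetical.

```python
import csv
import io

def cdb1_to_cdb2(pipe_text):
    """Rewrite pipe-separated lines as tab-separated lines."""
    source = csv.reader(io.StringIO(pipe_text), delimiter="|", quotechar='"')
    target = io.StringIO()
    writer = csv.writer(target, delimiter="\t", lineterminator="\n")
    for row in source:
        if row:  # skip blank lines
            writer.writerow(row)
    return target.getvalue()

cdb1_line = (
    "PID02|Maria Meier|23.04.1944|F|Adenoma|IID003|15.02.1998|0|"
    "Bladder|22|1|47|47,XY,+X|11\n"
)
cdb2_text = cdb1_to_cdb2(cdb1_line)
print(cdb2_text.count("\t"))  # number of separators in the converted line
```

The commas inside the ISCN formula are left untouched, because only the pipe character is treated as a separator.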

CDB3 (custom database #3)

Again, custom database 3 is similar to CDB1, but here all clones of a karyotype are given in one field. The clones are separated from each other by forward slashes ("/"), and their respective clone sizes are optionally given in square brackets ("[]").

One line with the following configuration is required for each case (or examination or investigation):

  • 1st field: patient identifier
  • 2nd field: name of patient
  • 3rd field: date of birth
  • 4th field: sex (M male, F female)
  • 5th field: diagnosis (morphology)
  • 6th field: case number (ID of investigation/examination)
  • 7th field: date of investigation/examination
  • 8th field: age (in years; if 0, it will be calculated from date of investigation and date of birth)
  • 9th field: sample / topography
  • 10th field: total number of metaphases examined
  • 11th field: number of modal chromosomes
  • 12th field: ISCN formula of all clones of the karyotype (if there is more than one clone, the clones are separated by a "/"; clone sizes are optionally given in square brackets ("[]"))

Fields must be separated by a pipe character ("|"). If no data are present for a field, the field must still be present: leave it empty or, for number fields, set it to 0. Data may optionally be surrounded by quotes ("). No header line is present.

Example (the first line only numbers the columns and is not part of the data):

  1  |    2      |     3    |4|     5        |  6   |    7     |8|   9   |10|11|     12
PID01|Hans Miller|15.08.1937|M|Adenocarcinoma|IID001|21.11.1997|0|Bladder|32|47|47,XY,+7[15]
PID01|Hans Miller|15.08.1937|M|Adenocarcinoma|IID002|11.01.1998|0|Bladder|40|47|47,XY,+7[22]/48,XY,+7,+8[12]
PID02|Maria Meier|23.04.1944|F|Adenoma|IID003|15.02.1998|0|Bladder|22|47|47,XY,+X[11]
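
How the 12th field breaks down into clones can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the parser used by CyDAS: it assumes that "/" occurs only between clones and that a clone size is a plain number in square brackets at the end of a clone. The function name split_clones is hypothetical.

```python
import re

# Formula, optionally followed by a numeric clone size in square brackets.
CLONE_PATTERN = re.compile(r"^(?P<formula>.*?)\s*(?:\[(?P<size>\d+)\])?$")

def split_clones(karyotype_field):
    """Return a list of (ISCN formula, clone size or None) per clone."""
    clones = []
    for part in karyotype_field.split("/"):
        part = part.strip()
        if not part:
            continue  # ignore empty pieces, e.g. from a trailing slash
        match = CLONE_PATTERN.match(part)
        size = match.group("size")
        clones.append((match.group("formula"), int(size) if size else None))
    return clones

print(split_clones("47,XY,+7[22]/48,XY,+7,+8[12]"))
print(split_clones("46,XX"))
```

A clone without a bracketed size yields None, reflecting that clone sizes are optional in this layout.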

CDB4 (custom database #4)

Custom database 4 corresponds to custom database 3, but the field separator is a tabulator. As with CDB3, data may optionally be surrounded by quotes ("). No header line is present.

Example (the first line only numbers the columns and is not part of the data):

  1           2              3        4          5             6           7         8       9       10    11    12  
PID01    Hans Miller    15.08.1937    M    Adenocarcinoma    IID001    21.11.1997    0    Bladder    32    47    47,XY,+7[15]
PID01    Hans Miller    15.08.1937    M    Adenocarcinoma    IID002    11.01.1998    0    Bladder    40    47    47,XY,+7[22]/48,XY,+7,+8[12]
PID02    Maria Meier    23.04.1944    F    Adenoma    IID003    15.02.1998    0    Bladder    22    47    47,XY,+X[11]

CGH1

The CGH option closely resembles the CDB options, but is adapted for use with CGH data.

One line with the following configuration is required for each case (or examination or investigation):

  • 1st field: patient identifier
  • 2nd field: name of patient
  • 3rd field: date of birth
  • 4th field: sex (M male, F female)
  • 5th field: diagnosis (morphology)
  • 6th field: case number (ID of investigation/examination)
  • 7th field: date of investigation/examination
  • 8th field: age (in years; if 0, it will be calculated from date of investigation and date of birth)
  • 9th field: sample / topography
  • 10th field: CGH formula

Fields must be separated by a pipe character ("|"). If no data are present for a field, the field must still be present: leave it empty or, for number fields, set it to 0. Data may optionally be surrounded by quotes ("). No header line is present.

If your database does not have a patient identifier, you may use the case number instead. However, patients with more than one examination will then be imported as several distinct patients. Because such patients are usually few, this generally does not cause problems.

Example (the first line only numbers the columns and is not part of the data):

 1|     2     |     3    |4|     5        | 6|   7     |8| 9|           10
74|Hans Miller|15.08.1937|M|Retinoblastoma|50|21.6.2004||Eye|Rev ish enh(1q21q32,7, 6p22p25, 19, 20q), dim(10q23q26,16q)
74|Hans Miller|15.08.1937|M|Retinoblastoma|36|10.1.2005||Eye|Rev ish enh(1q21q32,1q43qter, 6p)
461|Maria Meier|23.04.1944|F|Retinoblastoma|7|15.1.2005|40|Eye|Rev ish enh(2p12pter; 6p)
734|Joseph Schulz|24.11.1951|M|Retinoblastoma|43|8.1.2005||Eye|Rev ish enh(1q,4q27q35, 6p, 14q22q32.3, 17q22q25,19), dim(16q12.2qter)
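
To inspect the CGH formulas in the 10th field before importing, the gains (enh) and losses (dim) can be pulled apart with a small Python sketch. This is only an illustration and not how CyDAS interprets CGH formulas: it assumes that the formula contains no nested parentheses and that regions are separated by commas or semicolons, as in the examples above. The function name split_cgh_formula is hypothetical.

```python
import re

# Matches enh(...) and dim(...) groups; nested parentheses are not supported.
GROUP_PATTERN = re.compile(r"\b(enh|dim)\(([^)]*)\)")

def split_cgh_formula(formula):
    """Return the regions listed under enh(...) and dim(...)."""
    result = {"enh": [], "dim": []}
    for kind, body in GROUP_PATTERN.findall(formula):
        # Assumption: both "," and ";" separate regions.
        regions = [region.strip() for region in re.split(r"[,;]", body)]
        result[kind].extend(region for region in regions if region)
    return result

formula = "Rev ish enh(1q21q32,7, 6p22p25, 19, 20q), dim(10q23q26,16q)"
print(split_cgh_formula(formula))
```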

CGH2

CGH2 corresponds to CGH1, but the field separator is a tabulator. As with CGH1, data may optionally be surrounded by quotes ("). No header line is present.

Example (the first line only numbers the columns and is not part of the data):

 1           2              3           4             5          6         7            8        9               10
74      Hans Miller     15.08.1937      M       Retinoblastoma  50      21.6.2004               Eye     Rev ish enh(1q21q32,7, 6p22p25, 19, 20q), dim(10q23q26,16q)
74      Hans Miller     15.08.1937      M       Retinoblastoma  36      10.1.2005               Eye     Rev ish enh(1q21q32,1q43qter, 6p)
461     Maria Meier     23.04.1944      F       Retinoblastoma  7       15.1.2005       40      Eye     Rev ish enh(2p12pter; 6p)
734     Joseph Schulz   24.11.1951      M       Retinoblastoma  43      8.1.2005                Eye     Rev ish enh(1q,4q27q35, 6p, 14q22q32.3, 17q22q25,19), dim(16q12.2qter)