Evolution Tree

The evolution tree shows putative pathways of karyotype development during tumour progression.

All suggested pathways should be regarded as "putative", because only repeated analysis of a single patient's karyotype over a long period of time can provide a sound basis for a "real" pathway. Here, the evolution steps have to be reconstructed from the data of many patients at different stages of tumour progression.

The first rearrangement to become cytogenetically visible is not always the start event of tumour progression. It may be a later event, which then starts its own typical evolution pathway. Such cases are meant to be detectable as well.

The Evolution Tree Window

The evolution tree is opened from the menu of the main window. Select "Mine - Evolution Tree" from its menu.

Thereupon the evolution tree window opens, but the evolution tree is not yet calculated. You may edit the parameters first. To show the evolution tree, select "Show - Tree" from the menu (the "Show - Data" menu is not yet functional).

The evolution tree is calculated and shown in the window:

The window is divided into two sections: on the left panel, the evolution tree proper is shown as a tree view. On the right, statistical information is shown on a selected node (point in evolution).

Between the two sections, there is a splitter which can be moved with the mouse (click on the blue bar with the left button, and move the mouse while keeping the button pressed).

Nodes in the evolution tree start with a boxed "+" or "-". A "+" means that the node can be expanded, thus making more information (later steps in evolution) visible. Clicking on "-" hides the later steps of evolution.

A point of an evolution pathway (node) is selected by clicking on it. The right panel is updated after each selection.

Technical information on the EvolutionTree control is available from the Data Mining documentation.

The Meaning of the Data

Start points of evolution are depicted on the left side; there may exist alternative start points which are shown on the same (left) level. In the example above, only one start point was found: "t(9;22)(q34;q11)".

The next evolutionary step is shown below its parent node and indented once. Here too, alternative next steps may exist and are shown below each other. In the above example, the tumour progressed with either "+8", "i(17)(q10)", "-Y" or "+der(22)t(9;22)(q34;q11)" after the initial "t(9;22)(q34;q11)"; i.e. there are four alternatives.

In the above example, the last node is selected. For clarity, its text is shown on the right panel as first item.

The "Full Path" shows all the steps and their series towards the selected node. Here, after intitial "t(9;22)(q34;q11)" followed "+der(22)t(9;22)(q34;q11)", and finally "i(17)(q10)". That is, the respective karyotypes were initially  "...,t(9;22)(q34;q11)", "  "...,t(9;22)(q34;q11),+der(22)t(9;22)(q34;q11)" and finally "...,t(9;22)(q34;q11),i(17)(q10),+der(22)t(9;22)(q34;q11)" (here, no information is given on the sex chromosomes; hence the points ("...") may stand for "46,XX" and "47,XX" or "46,XY" and "47,XY", resp.).

"Support" means the number of cases (karyotypes) which contains the above combination of rearrangements; those karyotypes may contain further rearrangements which may not be shown. In this example, only two karyotypes were found with "t(9;22)(q34;q11),i(17)(q10),+der(22)t(9;22)(q34;q11)", which corresponds to 0.76% of all karyotypes.

"Parent Node Support" means the number of cases (karyotypes) which contains the combination of rearrangements of the previous evolution step (here: "t(9;22)(q34;q11),+der(22)t(9;22)(q34;q11)"); those karyotypes may contain further rearrangements (e.g. the subsequent evolution steps or random aberrations) which may not be shown. Here, there are 12 karyotypes at the parent node.
The 2 karyotypes found with the selected node correspond to 16.67% of all karyotypes with the parent aberrations.
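How the two support figures relate can be illustrated with a small Python sketch. It assumes that each karyotype is available as a set of aberration strings; the function `support` and the four toy cases are made up for this example and do not come from CyDAS or from the data set shown above.

```python
def support(karyotypes, events):
    """Count the karyotypes that contain all given events (extra events are allowed)."""
    wanted = set(events)
    return sum(1 for karyotype in karyotypes if wanted <= set(karyotype))


# Hypothetical toy data: four cases, each a set of aberrations.
cases = [
    {"t(9;22)(q34;q11)"},
    {"t(9;22)(q34;q11)", "+8"},
    {"t(9;22)(q34;q11)", "+der(22)t(9;22)(q34;q11)"},
    {"t(9;22)(q34;q11)", "+der(22)t(9;22)(q34;q11)", "i(17)(q10)"},
]

parent_events = ["t(9;22)(q34;q11)", "+der(22)t(9;22)(q34;q11)"]
node_events = parent_events + ["i(17)(q10)"]

node_support = support(cases, node_events)        # 1 case
parent_support = support(cases, parent_events)    # 2 cases

percent_of_all = 100 * node_support / len(cases)          # 25.0
percent_of_parent = 100 * node_support / parent_support   # 50.0

# The figures quoted in the text: 2 of 12 parent cases.
quoted = round(100 * 2 / 12, 2)                            # 16.67
```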

"Expected Support at Independence" shows how often the selected combination of events was expected to be found when the last event and its parent were statistically independent from each, with the pre-condition of the events of the grand-partent node present. The dependence factor gives a measure for the reliabilty of the dependence; it is the chi-square value for the above statistics.

"Alternative Pathways" may lead to the same final combination of rearrangements. Here, the final point could be reached alternatively by "t(9;22)(q34;q11)\i(17)(q10)\+der(22)t(9;22)(q34;q11)"; the mining algorithm could not decide which pathway was more important, with the current set of mining parameters.

In any case, do not assume that the pathways shown are correct; they could still be sheer nonsense. Always assess them in the light of your knowledge of tumours and tumour progression.

Mining Parameters

The mining parameters can be accessed from the "Edit" menu. Both submenus open the same edit window, but with the appropriate tab selected. Editing the parameters via this menu changes the parameters for subsequent calculations in this window only.

Alternatively, the parameters can be accessed from the CyDAS main window using the "Edit - Mining Parameters" menu. Changing the parameters from the main window will influence all mining windows opened later.

The edit form shows the parameters and their present values:

First Event Frequency

This value denotes the minimum frequency with which a start event of evolution must be encountered. The default value is 0.2, i.e. the start event must be present in at least 20% of all karyotypes. A low value will give rise to many alternative start points, while a very high value might prevent a start point from being found at all.

Frequency Decrease Factor

Later events often happen at very low frequency. Hence, the Frequency Decrease Factor was established, which denotes the minimum frequency an event must have among all cases with the parent events. The default value is 0.1.

E.g. if the start event was found in 200 cases, at least 20 cases among them must have the second event. If that combination of start event and second event was found in 30 cases, at least 3 cases among them must have the third event, and so on.
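The arithmetic of both frequency parameters can be written down as a short Python sketch. The function names are hypothetical, and reading the thresholds as "at least" is taken from the wording above; `Fraction` is used only to keep the comparison exact.

```python
from fractions import Fraction

# Default values as described above.
FIRST_EVENT_FREQUENCY = Fraction("0.2")
FREQUENCY_DECREASE_FACTOR = Fraction("0.1")


def is_start_event(count, total_karyotypes, frequency=FIRST_EVENT_FREQUENCY):
    """A start event must occur in at least `frequency` of all karyotypes."""
    return count >= frequency * total_karyotypes


def is_next_step(count, parent_support, factor=FREQUENCY_DECREASE_FACTOR):
    """A later event must occur in at least `factor` of the parent's cases."""
    return count >= factor * parent_support


# The example from the text: 200 cases with the start event.
second_ok = is_next_step(20, 200)       # True: 20 is 10% of 200
second_too_rare = is_next_step(19, 200)  # False
third_ok = is_next_step(3, 30)          # True: 3 is 10% of 30
```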

Minimum Event Count

The Minimum Event Count was established to prevent rare cases from showing up in the evolution tree. By default, the outermost node of any pathway of evolution must be supported by at least three cases.

Binning Threshold

Events which always or almost always occur together can be "binned", i.e. treated as one event consisting of both original events.

The default value of 1 means that both events must always have occurred together. A value greater than 1 will prevent binning. Values less than 1 do not require full mutual dependence of the events; such values may be useful when calculating evolution trees from cytoband data.
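This page does not state which measure is compared with the Binning Threshold. The following Python sketch therefore uses one plausible stand-in, the share of cases having both events among the cases having at least one of them. That measure is 1 exactly when the two events always occur together and can never exceed 1, which matches the behaviour described above, but it is an assumption and not necessarily what CyDAS calculates. The function names and toy cases are hypothetical.

```python
def co_occurrence(karyotypes, event_a, event_b):
    """Share of cases with both events among cases with at least one of them."""
    both = sum(1 for k in karyotypes if event_a in k and event_b in k)
    either = sum(1 for k in karyotypes if event_a in k or event_b in k)
    return both / either if either else 0.0


def can_bin(karyotypes, event_a, event_b, threshold=1.0):
    """Treat two events as one if they co-occur at least as often as required."""
    return co_occurrence(karyotypes, event_a, event_b) >= threshold


# Hypothetical toy data.
cases = [
    {"+8", "-Y"},
    {"+8", "-Y", "i(17)(q10)"},
    {"i(17)(q10)"},
]

always_together = can_bin(cases, "+8", "-Y")                     # True at threshold 1
never_binned = can_bin(cases, "+8", "-Y", threshold=1.1)         # False above 1
partly_together = can_bin(cases, "+8", "i(17)(q10)")             # False: 1 of 3 cases
relaxed = can_bin(cases, "+8", "i(17)(q10)", threshold=0.3)      # True
```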

Dependence Threshold

The Dependence Threshold is used to determine if an event belongs to a level farther down the evolution pathway or if it starts a distinct branch. The value is the (signed) chi-square value for the dependence of the present event with any other event of the same level in the evolution pathway.

High values lead to highly branched evolution trees, while small values give rise to a straight line of evolution.

Maximum Depth

The Maximum Depth denotes the number of evolutionary steps which are to be calculated. By default, only up to three steps are calculated. A pathway may stop earlier if it is not supported by enough cases.

High values may require a lot of time to produce a result; the calculation time may increase exponentially with the depth.

Calculation of the Evolution Tree

More background information on the algorithms used for calculating the evolution tree can be found in the technical documentation for the Miner class.