NeighborJoining Functions
NeighborJoining.NJClustNeighborJoining.heightsNeighborJoining.mergesNeighborJoining.newickstringNeighborJoining.orderNeighborJoining.RegularNeighborJoining.regNJNeighborJoining.FastNeighborJoining.fastNJ
NeighborJoining.NJClust — TypeNJClust(merges::Matrix{M}, heights::Matrix{H})fields:
- merges is a n-1 x 2 matrix of integers: absolute values of negative integers indicate index into the distance matrix (i.e., leaves). positive integers are the index into the merge list (i.e., the kth internal node)
- heights is an n-1 x 2 matrix where each value is the distance from the left (1) or right (2) chield from its parent. Specifically
heights[i,j]is thejth childs distance to the parent node, rowi.
NeighborJoining.heights — Methodheights(t::NJClust)extracts the heights list from NJClust object. rowindex is the internal node id, and each column represents the distance from the internal node to it's left and right child respectively.
NeighborJoining.merges — Methodmerges(t::NJClust)extracts the merge list from NJClust object. rowindex is the internal node id, and each column represents the children nodes. Leaf nodes are represented by negative integers corresponding to the index in the original distance matrix
NeighborJoining.newickstring — Functionnewickstring(njc::NJClust, tiplabels=AbstractVector{<:String}; labelinternalnodes=false)
newickstring(merges::AbstractArray{<:Integer}, heights::AbstractArray{<:AbstractFloat}, tiplabels::AbstractVector{<:String}; labelinternalnodes=false)Converts a list of merges into a newicktree formatted string.
args:
- njc: is a struct that has merges and heights
- merges and heights from a NJClust struct:
- merges is a n-1 x 2 matrix of integers: absolute values of negative integers indicate index into the distance matrix (i.e., leaves). positive integers are the index into the merge list (i.e., the kth internal node)
- heights is an n-1 x 2 matrix where each value is the distance from the left (1) or right (2) chield from its parent. Specifically
heights[i,j]is thejth childs distance to the parent node, rowi.
- tiplabels: vector of string labels corresponding to the order of leaves in the distance matrix
- labelinternalnodes: whether to generate node labels for the internal nodes. defaults to
false.
returns:
- newicktree formatted string
example:
julia> d = [
0 5 9 9 8
5 0 10 10 9
9 10 0 8 7
9 10 8 0 3
8 9 7 3 0
];
julia> njclusts = regNJ(d)
NJClust{Int64, Float64}([-2 -1; -3 1; -4 2; -5 3], [3.0 2.0; 4.0 3.0; 2.0 2.0; 0.5 0.5])
julia> nwstring = newickstring(njclusts)
"(5:5.000000e-01,(4:2.000000e+00,(3:4.000000e+00,(2:3.000000e+00,1:2.000000e+00):3.000000e+00):2.000000e+00):5.000000e-01):0.000000e+00;"NeighborJoining.order — Methodorder(clust::NJClust)Returns the left-to-right order of leaf nodes (tips) in the phylogenetic tree.
Performs a depth-first traversal of the tree structure to determine the order in which leaf nodes appear when reading the tree from left to right. This is useful for plotting or arranging data according to the tree topology.
Arguments
clust::NJClust: A neighbor-joining clustering result containing the tree structure with merges and heights matrices
Returns
Vector{Int}: A vector of integers representing the indices of leaf nodes in their left-to-right order in the tree. The indices correspond to the original positions in the distance matrix used to construct the tree.
Example
julia> d = [
0 5 9 9 8
5 0 10 10 9
9 10 0 8 7
9 10 8 0 3
8 9 7 3 0
];
julia> njclusts = regNJ(d)
NJClust{Int64, Float64}([-2 -1; -3 1; -4 2; -5 3], [3.0 2.0; 4.0 3.0; 2.0 2.0; 0.5 0.5])
julia> leaf_order = order(njclusts)
5-element Vector{Int64}:
5
4
3
2
1NeighborJoining.RegularNeighborJoining.regNJ — MethodregNJ(d::AbstractMatrix{<:Number})regNJ algorithm is the traditional NeighborJoining algorithm from
Saitou, N. & Nei, M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4, 406-425 (1987).
This algorithm is guarenteed to infer the tree for additive distance matrices, but it does have an algorithmic complexity of O(n^3), so it can be slow for distance matrices on the order of >10³.
args:
- d is an n by n square symetric distance matrix
returns:
- NJClust struct with fields merges and heights
examples:
julia> d = [
0 5 9 9 8
5 0 10 10 9
9 10 0 8 7
9 10 8 0 3
8 9 7 3 0
];
julia> njclusts = regNJ(d)
NJClust{Int64, Float64}([-2 -1; -3 1; -4 2; -5 3], [3.0 2.0; 4.0 3.0; 2.0 2.0; 0.5 0.5])
julia> nwstring = newickstring(njclusts)
"(5:5.000000e-01,(4:2.000000e+00,(3:4.000000e+00,(2:3.000000e+00,1:2.000000e+00):3.000000e+00):2.000000e+00):5.000000e-01):0.000000e+00;"NeighborJoining.FastNeighborJoining.fastNJ — MethodfastNJ(d::AbstractMatrix{<:Number})fastNJ algorithm finds k independent pairs to merge and merges them for each iteration. This is significantly faster because it is not recalculating the full pairwise Q for each pair joined.
This algorithm is nearly additive, but there are instances where the topology '(a,(b,(c,d)))' is inferred as '((a,b), (c,d))'. This happens when 'b' is not as close to 'c' or 'd' as it is to 'a', but 'b' is closer to the common ancestor '(c,d)' than it is to 'a'.
args:
- d is an n by n square symetric distance matrix
returns:
- NJClust struct with fields merges and heights
julia> d = [
0 5 9 9 8
5 0 10 10 9
9 10 0 8 7
9 10 8 0 3
8 9 7 3 0
];
julia> njclusts = fastNJ(d)
NJClust{Int64, Float64}([-2 -1; -5 -4; 1 -3; 3 2], [3.0 2.0; 1.0 2.0; 3.0 4.0; 1.0 1.0])
julia> nwstring = newickstring(njclusts)
"(5:5.000000e-01,(4:2.000000e+00,(3:4.000000e+00,(2:3.000000e+00,1:2.000000e+00):3.000000e+00):2.000000e+00):5.000000e-01):0.000000e+00;"<!– @docs NJClust regNJ fastNJ newickstring –>