This was part of
Algebraic Statistics for Ecological and Biological Systems
Inferring the tree-like parts of a species network under the coalescent
John Rhodes, University of Alaska Fairbanks
Monday, October 9, 2023
Abstract: If the evolutionary history of a collection of organisms involves hybridization or gene flow, a network is needed to depict it graphically. Inferring such a network from biological data, however, is a demanding problem, both practically and theoretically. The size of network space, the size of genomic data sets, the confounding signal arising from incomplete lineage sorting, and the poorly understood identifiability properties of the Network Multispecies Coalescent (NMSC) model all cause difficulties. Bayesian inference has been effective only on tiny problems. Current tractable methods either apply pseudolikelihood to subnetwork summary statistics or infer small subnetworks and then assemble them by combinatorial network building. These methods also generally assume a simple network structure, such as being level-1, that may be hard to justify biologically. This talk describes recent results on identifiability of the "tree of blobs" of a species network (in which all biconnected components of the network are collapsed to nodes), as well as an algorithm for its inference. No assumption on network structure is needed, and the estimate of the tree of blobs is consistent under the NMSC regardless of whether the internal structure of the blobs is identifiable.
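To make the parenthetical definition of the tree of blobs concrete, the sketch below collapses the blobs of a small, already known network. It is not the inference algorithm from the talk, which works from data under the NMSC. It assumes that edge directions are ignored, that there are no parallel edges, and that a blob is a maximal connected piece containing no cut edge (one common way to formalize the collapsing step). The function name `tree_of_blobs` and the toy network are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def tree_of_blobs(edges):
    """Collapse every blob of an undirected graph to a single node.

    edges: iterable of (u, v) pairs; direction is ignored and parallel
    edges are assumed absent (an assumption of this sketch).
    Returns (blob_of, tree_edges): blob_of maps each node to a blob id,
    and tree_edges lists the cut edges as pairs of blob ids.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Step 1: find cut edges (bridges) with a depth-first search.
    disc, low, bridges = {}, {}, set()

    def dfs(node, parent):
        disc[node] = low[node] = len(disc)
        for nxt in adj[node]:
            if nxt == parent:
                continue
            if nxt in disc:
                # Back edge: a cycle reaches an earlier node.
                low[node] = min(low[node], disc[nxt])
            else:
                dfs(nxt, node)
                low[node] = min(low[node], low[nxt])
                if low[nxt] > disc[node]:
                    # No cycle passes through this edge, so it is a cut edge.
                    bridges.add(frozenset((node, nxt)))

    for node in list(adj):
        if node not in disc:
            dfs(node, None)

    # Step 2: the connected pieces left after deleting cut edges are the blobs.
    blob_of, next_id = {}, 0
    for start in list(adj):
        if start in blob_of:
            continue
        blob_of[start] = next_id
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in adj[node]:
                if nxt not in blob_of and frozenset((node, nxt)) not in bridges:
                    blob_of[nxt] = next_id
                    stack.append(nxt)
        next_id += 1

    # Step 3: each cut edge becomes an edge between two collapsed blobs.
    tree_edges = sorted(tuple(sorted(blob_of[x] for x in b)) for b in bridges)
    return blob_of, tree_edges


# Toy network: a 3-cycle x-y-z (one reticulation cycle) with taxa a, b, c, d.
edges = [("a", "x"), ("b", "y"), ("x", "y"), ("y", "z"), ("z", "x"),
         ("z", "w"), ("w", "c"), ("w", "d")]
blob_of, tree_edges = tree_of_blobs(edges)
```

In this toy network the cycle x, y, z collapses to a single node, while every other node stays as its own trivial blob, so the result is a tree on the taxa a, b, c, d that records no detail of the cycle's internal structure.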