Welcome to Pyrea’s documentation!

Pyrea is a Python package for multi-view hierarchical clustering with flexible ensemble structures.

The name Pyrea is derived from the Greek word Parea, meaning a group of friends who gather to share experiences, values, and ideas.

Pyrea is licensed under the terms of the MIT License. See the license section below for details.

Installation is via pip:

pip install pyrea
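
After installing, it can be useful to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch that relies only on the Python standard library; the helper name `installed_version` is hypothetical and is not part of Pyrea.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if pip installed the package into this environment,
# otherwise prints None.
print("pyrea version:", installed_version("pyrea"))
```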

Authors: Marcus D. Bloice and Bastian Pfeifer, Medical University of Graz.

Overview

Pyrea allows complex, layered ensembles to be created using an easy-to-use API.

Formally, ensembles are created as follows. A view \(V \in \mathbb{R}^{n \times p}\), where \(n\) is the number of samples and \(p\) is the number of predictors, is associated with a clustering method, \(c\).

An ensemble, \(\mathcal{E}\), can be modelled using a set of views \(\mathcal{V}\) and an associated fusion algorithm, \(f\).

\[\mathcal{V} \leftarrow \{(V \in \mathbb{R}^{n\times p}, c)\}\]
\[\mathcal{E}(\mathcal{V}, f) \rightarrow \widetilde{V}\in \mathbb{R}^{n\times n}\]
\[\mathcal{V} \leftarrow \{(\widetilde{V}\in \mathbb{R}^{n\times n}, c)\}\]

From the above equations we can see that a specified ensemble \(\mathcal{E}\) creates a view \(\widetilde{V} \in \mathbb{R}^{n\times n}\), which can in turn be used to specify \(\mathcal{V}\), together with an associated clustering algorithm \(c\). With this concept it is possible to stack views and ensembles layer by layer into an arbitrarily complex ensemble architecture. It should be noted, however, that the resulting view of a specified ensemble \(\mathcal{E}\) is an affinity matrix between the \(n\) samples, of dimension \(n \times n\), and thus only clustering methods that accept an affinity or a distance matrix as input are applicable to it.
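
The idea of a single ensemble can be sketched with NumPy and SciPy alone. This is an illustration under stated assumptions, not Pyrea's implementation: the helper names `cluster_view` and `fuse_agreement` are hypothetical, the fusion is assumed to be a simple co-association ("agreement") matrix between samples, and the affinity is turned into a distance as one minus the affinity.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_view(data, method, k):
    """Cluster the rows (samples) of an n x p view into at most k clusters."""
    return fcluster(linkage(data, method=method), t=k, criterion="maxclust")


def fuse_agreement(labelings):
    """Assumed fusion f: entry (i, j) is the fraction of labelings that
    place samples i and j in the same cluster."""
    labelings = np.asarray(labelings)
    same = labelings[:, :, None] == labelings[:, None, :]
    return same.mean(axis=0)


rng = np.random.default_rng(0)

# Two views of the same 30 samples, each paired with a clustering method.
views = [(rng.random((30, 5)), "ward"), (rng.random((30, 8)), "complete")]

# Executing the ensemble: cluster every view, then fuse the results.
labelings = [cluster_view(data, method, k=3) for data, method in views]
fused = fuse_agreement(labelings)  # sample-by-sample affinity matrix

# The fused view can be clustered again, but only by a method that accepts
# a precomputed distance matrix (here: 1 - affinity, in condensed form).
condensed = squareform(1.0 - fused, checks=False)
final = fcluster(linkage(condensed, method="average"), t=3, criterion="maxclust")
```

Note that the second-stage linkage uses `average` rather than `ward`, since Ward linkage on a precomputed matrix is only meaningful for Euclidean distances.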

Example “Parea”

In the paper by Pfeifer et al.[1], a method called Pareahc was introduced. We show here how the Parea workflow from this paper can be reproduced using Pyrea.

The Pareahc method supports two different hierarchical ensemble architectures. Pareahc1 clusters multiple data views using two hierarchical clustering methods, \(hc_{1}\) and \(hc_{2}\). The resulting fused matrices \(\widetilde{V}\) are clustered with the same methods, and the results are combined into a final consensus. A formal description of Pareahc1 is:

\[\mathcal{V}_{1} \leftarrow \{(V_{1},hc_{1}),(V_{2},hc_{1}),\ldots, (V_{N},hc_{1})\}, \quad \mathcal{V}_{2} \leftarrow \{(V_{1},hc_{2}),(V_{2},hc_{2}),\ldots, (V_{N},hc_{2})\}\]
\[\mathcal{E}_{1}(\mathcal{V}_{1}, f) \rightarrow \widetilde{V}_{1}, \quad \mathcal{E}_{2}(\mathcal{V}_{2}, f) \rightarrow \widetilde{V}_{2}\]
\[\mathcal{V}_{3} \leftarrow \{(\widetilde{V}_{1},hc_{1}),(\widetilde{V}_{2},hc_{2})\}\]
\[\mathcal{E}_{3}(\mathcal{V}_{3}, f) \rightarrow \widetilde{V}_{3}.\]

The affinity matrix \(\widetilde{V}_{3}\) is then clustered with \(hc_{1}\) and \(hc_{2}\) from the first layer, and the consensus of the obtained clustering solutions reflects the final cluster assignments.
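
The layered structure described above can be sketched in plain NumPy and SciPy. This is a hedged illustration rather than the authors' implementation: the helpers `hc`, `fuse` and `ensemble` are hypothetical names, the fusion \(f\) is assumed to be a co-association matrix, the linkage methods chosen for \(hc_{1}\) and \(hc_{2}\) are arbitrary, and the final consensus step (fusing the two solutions and clustering once more) is one possible reading of "consensus", not something the text specifies.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def hc(data, method, k, precomputed=False):
    """Hierarchical clustering of samples into at most k clusters.
    With precomputed=True, `data` is an n x n affinity matrix."""
    if precomputed:
        # Assumed conversion: distance = 1 - affinity, in condensed form.
        data = squareform(1.0 - data, checks=False)
    return fcluster(linkage(data, method=method), t=k, criterion="maxclust")


def fuse(labelings):
    """Assumed fusion f: fraction of labelings that co-cluster each sample pair."""
    labelings = np.asarray(labelings)
    return (labelings[:, :, None] == labelings[:, None, :]).mean(axis=0)


def ensemble(views, k):
    """Execute an ensemble over (matrix, method, precomputed) triples."""
    return fuse([hc(m, method, k, pre) for m, method, pre in views])


rng = np.random.default_rng(0)
data_views = [rng.random((40, p)) for p in (6, 9, 4)]  # V_1 ... V_N
hc1, hc2, k = "average", "complete", 3  # illustrative choices only

# Layer 1: every data view is clustered with hc1 (E_1) and with hc2 (E_2).
v1 = ensemble([(d, hc1, False) for d in data_views], k)
v2 = ensemble([(d, hc2, False) for d in data_views], k)

# Layer 2: the fused matrices become views of a third ensemble (E_3).
v3 = ensemble([(v1, hc1, True), (v2, hc2, True)], k)

# Final step: cluster V_3 with both methods and form a consensus.
consensus = fuse([hc(v3, hc1, k, True), hc(v3, hc2, k, True)])
final_labels = hc(consensus, hc1, k, True)
```

Because each data view may have a different number of predictors, the only dimension the layers share is the number of samples, which is why every fused matrix here is sample-by-sample.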

Pyrea Implementation

To implement the method described in the paper by Pfeifer et al.[1] and to demonstrate the API with an example, we provide the following source code:

Implementing the Parea method using Pyrea
import numpy as np
import pyrea

# Create one clusterer per hierarchical linkage method
c1 = pyrea.clusterer('ward')
c2 = pyrea.clusterer('complete')
c3 = pyrea.clusterer('single')

# Make some datasets (100 samples, 10 predictors each)
d1 = np.random.rand(100,10)
d2 = np.random.rand(100,10)
d3 = np.random.rand(100,10)

# A view consists of a dataset (2d array/matrix) and a clustering algorithm
v1 = pyrea.view(d1, c1)
v2 = pyrea.view(d2, c2)
v3 = pyrea.view(d3, c3)

# Create a fusion algorithm object
f = pyrea.fuser('agreement')

# An ensemble consists of views and a fusion algorithm,
# and once executed returns a new view
v_res_1 = pyrea.execute_ensemble([v1,v2,v3], f, c1)
v_res_2 = pyrea.execute_ensemble([v1,v2,v3], f, c1)

# These views can then be used to create another ensemble
# (c1 is used here because the original listing passed an undefined name, c)
v_final = pyrea.execute_ensemble([v_res_1, v_res_2], f, c1)

For complete documentation of all modules, classes, and functions, see the sections below.

Main Documentation

Indices and tables

References