Pyrea Module, Class, and Function Documentation

This section contains documentation of every class and function in Pyrea.

Generally speaking, the tutorials and the project’s README should contain enough information to get started. The auto-generated documentation below provides a comprehensive reference.

The core Module

The pyrea.core module contains all the user-facing API functions required to use Pyrea. Generally, users will only need to interact with the functions within this module in order to create their ensemble structures.

Developers, especially those who wish to extend Pyrea, may want to look at the classes and functions defined in the pyrea.structure module.

pyrea.core.clusterer(clusterer: str, precomputed: bool = False, **kwargs) -> Clusterer

Creates a Clusterer object to be used when creating a View or Ensemble. The clusterer argument can be one of: 'spectral', 'hierarchical', 'dbscan', or 'optics'.

c = pyrea.clusterer('hierarchical', n_clusters=2)

Then, c can be used when creating a view:

v = pyrea.view(d, c)

Where d is a data source.

See also

The view() and execute_ensemble() functions.

Each clustering algorithm has a different set of parameters. Default values are used throughout and can be overridden if required. For example, hierarchical and spectral clustering allow you to specify the number of clusters to find using n_clusters, while DBSCAN and OPTICS do not.

Also, hierarchical clustering allows for a distance_metric to be set, which can be one of: 'braycurtis', 'canberra', 'chebyshev', 'cityblock', 'correlation', 'cosine', 'dice', 'euclidean', 'hamming', 'jaccard', 'jensenshannon', 'kulczynski1', 'mahalanobis', 'matching', 'minkowski', 'rogerstanimoto', 'russellrao', 'seuclidean', 'sokalmichener', 'sokalsneath', 'sqeuclidean', or 'yule'.

Likewise, hierarchical clustering allows the linkage method to be adjusted. It can be one of: 'single', 'complete', 'average', 'weighted', 'centroid', 'median', or 'ward'.
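The linkage names above follow the usual agglomerative clustering definitions: each one is a rule for turning the pairwise distances between the members of two clusters into a single cluster-to-cluster distance. The sketch below illustrates the three simplest rules with plain NumPy. It illustrates the definitions only and is not Pyrea's implementation; the helper name cluster_distance is hypothetical.

```python
import numpy as np

def cluster_distance(a, b, method="single"):
    """Distance between clusters a and b (2D arrays of points) under a linkage rule."""
    # Pairwise Euclidean distances between every point in a and every point in b.
    diffs = a[:, None, :] - b[None, :, :]
    pairwise = np.sqrt((diffs ** 2).sum(axis=-1))
    if method == "single":      # closest pair of points
        return float(pairwise.min())
    if method == "complete":    # farthest pair of points
        return float(pairwise.max())
    if method == "average":     # mean over all pairs of points
        return float(pairwise.mean())
    raise ValueError(f"unsupported method: {method}")

a = np.array([[0.0, 0.0], [1.0, 0.0]])
b = np.array([[3.0, 0.0], [5.0, 0.0]])

single = cluster_distance(a, b, "single")      # 2.0
complete = cluster_distance(a, b, "complete")  # 5.0
average = cluster_distance(a, b, "average")    # 3.5
```

The choice of rule changes which clusters are merged first, which is why the same data can yield different labels under different linkage methods.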

For complete documentation of each clustering algorithm’s parameters, see the classes listed under **kwargs in the parameter list below.

Parameters:
  • clusterer – The type of clusterer to use. Can be one of: 'spectral', 'hierarchical', 'dbscan', or 'optics'.

  • precomputed – Whether the clusterer should assume the data is a distance matrix.

  • **kwargs – Keyword arguments to be passed to the clusterer. See each clustering algorithm’s documentation for full details: Spectral: SpectralClusteringPyrea, Hierarchical: HierarchicalClusteringPyrea, DBSCAN: DBSCANPyrea, and OPTICS: OPTICSPyrea.
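When precomputed is True, the data is expected to be a distance matrix rather than raw observations. A minimal sketch of building such a matrix with NumPy follows. Euclidean distance is an assumption made for illustration, and the variable names are hypothetical.

```python
import numpy as np

# Four observations with two features each (illustrative data).
X = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [6.0, 8.0],
              [0.0, 1.0]])

# Pairwise Euclidean distances: entry (i, j) is the distance between rows i and j.
diffs = X[:, None, :] - X[None, :, :]
D = np.sqrt((diffs ** 2).sum(axis=-1))

# A valid distance matrix is square, symmetric, and has a zero diagonal.
is_square = D.shape == (len(X), len(X))
is_symmetric = bool(np.allclose(D, D.T))
has_zero_diagonal = bool(np.allclose(np.diag(D), 0.0))
```

A matrix such as D would then presumably take the place of the raw data in a view whose clusterer was created with precomputed=True.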

pyrea.core.consensus(labels: list)

pyrea.core.execute_ensemble(views: List[View], fuser: Fusion) -> list

Executes an ensemble and returns a new View object.

Parameters:
  • views – The ensemble’s views.

  • fuser – The fusion algorithm used to fuse the clustered data.

  • clusterers – A clustering algorithm or list of clustering algorithms used to cluster the fused matrix created by the fusion algorithm.

v = pyrea.execute_ensemble([view1, view2, view3], fusion, clusterer)

Returns a View object which can consequently be included in a further ensemble.

See also

The view() and clusterer() functions.
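The flow described above (cluster each view, fuse the resulting labels, then cluster the fused matrix) can be sketched without Pyrea. Everything in this block is illustrative: the naive complete-linkage routine, the disagreement-count fusion, and all function names are assumptions and not the library's code.

```python
import math

def distance_matrix(points):
    """Pairwise Euclidean distances between points given as tuples."""
    return [[math.dist(p, q) for q in points] for p in points]

def agglomerate(dist, n_clusters):
    """Naive complete-linkage clustering on a precomputed distance matrix."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: distance of the farthest pair of members.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))  # b > a, so index a stays valid
    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def disagreement(labelings):
    """Entry (i, j) counts the labelings that put samples i and j in different clusters."""
    n = len(labelings[0])
    return [[sum(l[i] != l[j] for l in labelings) for j in range(n)] for i in range(n)]

# Two views of the same six samples (illustrative data).
view_1 = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
view_2 = [(1.0,), (1.2,), (0.9,), (9.0,), (9.1,), (8.8,)]

labelings = [agglomerate(distance_matrix(v), 2) for v in (view_1, view_2)]
fused = disagreement(labelings)          # fusion step
final_labels = agglomerate(fused, 2)     # cluster the fused matrix
```

Because the fused matrix is itself a sample-by-sample matrix, it can be treated as the data of a new view, which is what makes chaining ensembles possible.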

pyrea.core.fuser(fuser: str)

Creates a Fusion object, which is used to fuse the results of an arbitrarily long list of clusterings.

f = pyrea.fuser('agreement')

Parameters:
  • fuser – The fusion algorithm to use. Must be one of: 'agreement', 'disagreement', or 'consensus'.
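This section does not define the fusion algorithms formally. Under one common reading, assumed here, the agreement matrix counts, for each pair of samples, the clusterings that place them in the same cluster, and the disagreement matrix counts the clusterings that separate them. The NumPy sketch below illustrates that reading only; the helper names are hypothetical.

```python
import numpy as np

def agreement_matrix(labelings):
    """Entry (i, j): number of clusterings in which samples i and j share a label."""
    labelings = np.asarray(labelings)  # shape: (n_clusterings, n_samples)
    same = labelings[:, :, None] == labelings[:, None, :]
    return same.sum(axis=0)

def disagreement_matrix(labelings):
    """Entry (i, j): number of clusterings in which samples i and j have different labels."""
    labelings = np.asarray(labelings)
    differ = labelings[:, :, None] != labelings[:, None, :]
    return differ.sum(axis=0)

# Three clusterings of the same four samples (illustrative labels).
labelings = [[0, 0, 1, 1],
             [0, 1, 1, 1],
             [1, 1, 0, 0]]

A = agreement_matrix(labelings)
D = disagreement_matrix(labelings)
```

Under this reading the two matrices are complementary: every entry of A + D equals the number of clusterings.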

pyrea.core.get_ensemble(views: List[View], fuser: Fusion, clusterers: List[Clusterer]) -> Ensemble

Creates and returns an Ensemble object which must be executed later to get the ensemble’s computed view.

pyrea.core.parea_1(views: list | None = None, c_1_type='hierarchical', c_1_method='ward', c_2_type='hierarchical', c_2_method='complete', c_1_pre_type='hierarchical', c_1_pre_method='ward', c_2_pre_type='hierarchical', c_2_pre_method='complete', fusion_method='disagreement', k=2)

Implements the PAREA-1 algorithm.

The function accepts a list of parameters for the Parea 1 algorithm, which can optionally be optimised using a genetic algorithm.

The default values are those described in the package’s paper and README documentation.

pyrea.core.parea_1_genetic(views: list, k: int)

Genetic algorithm optimised implementation of Parea 1.
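The genetic algorithm itself is not described here. As a rough illustration of the general idea, the sketch below evolves a tuple of linkage method names using selection, one-point crossover, and mutation. The fitness function is a hypothetical placeholder; a real one would score the clustering produced by each configuration. None of the names below come from Pyrea.

```python
import random

METHODS = ["single", "complete", "average", "weighted", "centroid", "median", "ward"]
N_GENES = 4  # illustrative: one linkage method per clustering step

def fitness(candidate):
    # Hypothetical placeholder score: number of genes matching an arbitrary target.
    target = ("ward", "complete", "ward", "complete")
    return sum(g == t for g, t in zip(candidate, target))

def evolve(generations=30, population_size=20, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    population = [tuple(rng.choice(METHODS) for _ in range(N_GENES))
                  for _ in range(population_size)]
    for _ in range(generations):
        # Selection: keep the fitter half unchanged as parents (elitism).
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, N_GENES)        # one-point crossover
            child = list(mother[:cut] + father[cut:])
            if rng.random() < mutation_rate:       # mutate a single gene
                child[rng.randrange(N_GENES)] = rng.choice(METHODS)
            children.append(tuple(child))
        population = parents + children
    return max(population, key=fitness)

best = evolve()
```

Because the fitter half of each generation is carried over unchanged, the best score in the population never decreases from one generation to the next.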

pyrea.core.parea_2(c_1_type='hierarchical', c_1_method='ward', c_2_type='hierarchical', c_2_method='complete', c_3_type='hierarchical', c_3_method='single', c_1_pre_type='hierarchical', c_1_pre_method='ward', c_2_pre_type='hierarchical', c_2_pre_method='complete', c_3_pre_type='hierarchical', c_3_pre_method='single', fusion_method='disagreement', k=2)

pyrea.core.parea_2_genetic(views: list, k: int)

Genetic algorithm optimised implementation of Parea 2.

pyrea.core.silhouette(labels: list)

pyrea.core.summary()

Not yet implemented.

Prints a summary of the current ensemble structure, including any already calculated statistics.

pyrea.core.view(data: array, clusterer: Clusterer) -> View

Creates a View object that can subsequently be used to create an Ensemble.

Views are created using some data in the form of a NumPy matrix or 2D array, and a clustering algorithm:

import numpy
d = numpy.random.rand(100, 10)
v = pyrea.view(d, c)

Views are used to create ensembles. They consist of some data, d above, and a clustering algorithm, c above.

The structure Module

The pyrea.structure module contains the classes used for the internal functionality of Pyrea. The classes contained here are not generally called or instantiated by the user; see the pyrea.core module for the user-facing API.

Developers who wish to extend Pyrea, for example by creating a custom clustering algorithm, should consult the documentation of the Clusterer abstract base class. The Fusion class is another such abstract base class, which must be used if a developer wishes to create a custom fusion algorithm for use within Pyrea.

class pyrea.structure.AgglomerativeClusteringPyrea(n_clusters=2, linkage: str = 'ward', affinity: str = 'euclidean', memory: None | Any = None, connectivity=None, compute_full_tree='auto', distance_threshold=None, compute_distances=False)

Perform agglomerative clustering.

See https://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html

execute(data: list) -> list

Execute the clustering algorithm with the given data.

class pyrea.structure.Agreement

Agreement fusion function.

Creates the agreement of two clusterings.

Fusion is the Abstract Base Class for all fusion algorithms. All fusion algorithms must be a subclass of this class in order to be accepted by functions such as execute_ensemble(). To extend Pyrea with a custom fusion algorithm, create a new class that is a subclass of Fusion, and implement the Fusion.execute() function.

execute(views: list) -> list

Executes the agreement fusion algorithm on the provided clusterings, views.

class pyrea.structure.Average

Implements the ‘average’ clustering algorithm.

execute(data)

Perform the clustering and return the results.

class pyrea.structure.Clusterer

Clusterer is the Abstract Base Class for all clustering algorithms. All clustering algorithms must be a subclass of this class in order to be accepted by functions such as execute_ensemble(). To extend Pyrea with a custom clustering algorithm, create a new class that is a subclass of Clusterer, and implement the Clusterer.execute() function.

execute() -> list

Execute the clustering algorithm with the given data.
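The subclassing pattern described above can be sketched as follows. So that the block runs on its own, it defines a stand-in Clusterer base class rather than importing the real one; in practice the subclass would derive from pyrea.structure.Clusterer. The MedianSplit class is hypothetical, and passing the data to execute() is an assumption based on the concrete clusterers documented in this module.

```python
from abc import ABC, abstractmethod
from statistics import median

class Clusterer(ABC):
    """Stand-in for pyrea.structure.Clusterer, used only to keep this sketch runnable."""

    @abstractmethod
    def execute(self, data) -> list:
        """Cluster the data and return one label per row."""

class MedianSplit(Clusterer):
    """Hypothetical clusterer: splits rows into two clusters by their row mean."""

    def execute(self, data) -> list:
        means = [sum(row) / len(row) for row in data]
        threshold = median(means)
        # Rows with a mean above the median get label 1, all others label 0.
        return [int(m > threshold) for m in means]

labels = MedianSplit().execute([[1, 2], [2, 3], [8, 9], [9, 10]])
```

Since execute() is abstract, the base class cannot be instantiated directly, and a subclass that forgets to implement it fails at construction time rather than when the ensemble runs.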

class pyrea.structure.Complete

Implements the ‘complete’ clustering algorithm.

execute(data)

Perform the clustering and return the results.

class pyrea.structure.Consensus

Consensus fusion function.

Creates the consensus of two clusterings.

Fusion is the Abstract Base Class for all fusion algorithms. All fusion algorithms must be a subclass of this class in order to be accepted by functions such as execute_ensemble(). To extend Pyrea with a custom fusion algorithm, create a new class that is a subclass of Fusion, and implement the Fusion.execute() function.

execute(views: list)

Executes the consensus fusion algorithm on the provided clusterings, views.

class pyrea.structure.DBSCANPyrea(eps=0.5, min_samples=5, metric='euclidean', metric_params=None, algorithm='auto', leaf_size=30, p=None, n_jobs=None)

execute(data) -> list

Execute the clustering algorithm with the given data.

class pyrea.structure.Disagreement

Disagreement fusion function.

Creates the disagreement of two clusterings.

Fusion is the Abstract Base Class for all fusion algorithms. All fusion algorithms must be a subclass of this class in order to be accepted by functions such as execute_ensemble(). To extend Pyrea with a custom fusion algorithm, create a new class that is a subclass of Fusion, and implement the Fusion.execute() function.

execute(views: list) -> list

Executes the disagreement fusion algorithm on the provided clusterings, views.

class pyrea.structure.Ensemble(views: List[View], fuser: Fusion)

The Ensemble class encapsulates the views, fusion algorithm and clustering methods required to perform a multi-view clustering.

Parameters:
  • views – The views that constitute the ensemble’s multi-view data.

  • fuser – The fusion algorithm to use.

  • clusterers – The clustering algorithms to use on the fused matrix.

execute()

Executes the ensemble, returning a View object.

The new View can then be passed to subsequent ensembles.

class pyrea.structure.Fusion

Fusion is the Abstract Base Class for all fusion algorithms. All fusion algorithms must be a subclass of this class in order to be accepted by functions such as execute_ensemble(). To extend Pyrea with a custom fusion algorithm, create a new class that is a subclass of Fusion, and implement the Fusion.execute() function.

execute(views: list) -> list

Execute the fusion algorithm on the provided views.
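A custom fusion algorithm follows the same pattern. The sketch below uses a stand-in Fusion base class so that it runs without Pyrea; in practice the subclass would derive from pyrea.structure.Fusion. The AgreementFraction class is hypothetical, and the sketch assumes each view exposes its computed labels through the labels variable documented for the View class.

```python
from abc import ABC, abstractmethod
from types import SimpleNamespace

class Fusion(ABC):
    """Stand-in for pyrea.structure.Fusion, used only to keep this sketch runnable."""

    @abstractmethod
    def execute(self, views: list) -> list:
        """Fuse the clusterings of the given views into a single matrix."""

class AgreementFraction(Fusion):
    """Hypothetical fusion: fraction of views that put two samples in the same cluster."""

    def execute(self, views: list) -> list:
        labelings = [view.labels for view in views]
        n = len(labelings[0])
        return [[sum(l[i] == l[j] for l in labelings) / len(labelings)
                 for j in range(n)]
                for i in range(n)]

# Minimal stand-ins for already-clustered views (assumption: labels are populated).
views = [SimpleNamespace(labels=[0, 0, 1]), SimpleNamespace(labels=[0, 1, 1])]
fused = AgreementFraction().execute(views)
```

The result is a square matrix with one row and one column per sample, which is the shape a downstream clusterer would need in order to treat it as a similarity matrix.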

class pyrea.structure.HierarchicalClusteringPyrea(precomputed, method='single', metric='euclidean', optimal_ordering=False, distance_metric='euclidean', out=None, n_clusters=None, height=None)

execute(data) -> list

Execute the clustering algorithm with the given data.

class pyrea.structure.OPTICSPyrea(min_samples=5, max_eps=inf, metric='minkowski', p=2, metric_params=None, cluster_method='xi', eps=None, xi=0.05, predecessor_correction=True, min_cluster_size=None, algorithm='auto', leaf_size=30, n_jobs=None)

execute(data: list) -> list

Execute the clustering algorithm with the given data.

class pyrea.structure.Parea

Parea fusion algorithm. This functionality is not yet implemented.

Fusion is the Abstract Base Class for all fusion algorithms. All fusion algorithms must be a subclass of this class in order to be accepted by functions such as execute_ensemble(). To extend Pyrea with a custom fusion algorithm, create a new class that is a subclass of Fusion, and implement the Fusion.execute() function.

execute(views: list) -> list

Performs the fusion of a set of views.

Not yet implemented.

class pyrea.structure.Single

Implements the ‘single’ clustering algorithm.

execute(data)

Perform the clustering and return the results.

class pyrea.structure.SpectralClusteringPyrea(n_clusters=8, eigen_solver=None, n_components=None, random_state=None, n_init=10, gamma=1.0, affinity='rbf', n_neighbors=10, eigen_tol=0.0, assign_labels='kmeans', degree=3, coef0=1, kernel_params=None, n_jobs=None, verbose=False)

Perform spectral clustering.

See: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.SpectralClustering.html

class pyrea.structure.View(data, clusterer: List[Clusterer])

Represents a View, which consists of some data and a clustering algorithm, clusterer.

Requires a data source, data, which is used to create the view, and a clustering method, clusterer. The data source can be a Python matrix (a list of lists), a NumPy 2D array, or a Pandas DataFrame.

Some examples follow (using a list of lists):

import pyrea

data = [[1, 5, 3, 7],
        [4, 2, 9, 4],
        [8, 6, 1, 9],
        [7, 1, 8, 1]]

v = pyrea.view(data, pyrea.clusterer('hierarchical', method='ward', n_clusters=2))

Or by passing a Pandas DataFrame (pandas.core.frame.DataFrame):

import pyrea
import pandas

data = pandas.read_csv('iris.csv')

v = pyrea.view(data, pyrea.clusterer('hierarchical', method='ward', n_clusters=2))

Or by passing a NumPy 2D array or matrix (numpy.ndarray or numpy.matrix):

import pyrea
import numpy

data = numpy.random.randint(0, 10, (4,4))

v = pyrea.view(data, pyrea.clusterer('hierarchical', method='ward', n_clusters=2))

See also

The Clusterer class.

Parameters:
  • data – The data from which to create your View.

  • clusterer – The clustering algorithm to use to cluster your data.

Variables:

labels – Contains the calculated labels when the clusterer is run on the data.

execute() -> list

Clusters the data using the clusterer specified at initialisation.

class pyrea.structure.Ward

Implements the ‘Ward’ clustering algorithm.

execute(data)

Perform the clustering and return the results.