Pyrea Module, Class, and Function Documentation
This section contains documentation of every class and function in Pyrea.
Generally speaking, the tutorials and the project’s README should contain enough information to get started. The auto-generated documentation below serves as a comprehensive reference.
The core Module
The pyrea.core module contains all the user-facing API functions required to use Pyrea. Generally, users will only need to interact with the functions within this module in order to create their ensemble structures.
Developers, especially those who wish to extend Pyrea, may want to look at the classes and functions defined in the pyrea.structure module.
- pyrea.core.clusterer(clusterer: str, precomputed: bool = False, **kwargs) → Clusterer
Creates a Clusterer object to be used when creating a View or Ensemble. Can be one of: 'spectral', 'hierarchical', 'dbscan', or 'optics'.
    c = pyrea.clusterer('hierarchical', n_clusters=2)
Then, c can be used when creating a view:
    v = pyrea.view(d, c)
Where d is a data source.
See also
The view() function.
See also
The execute_ensemble() function.
Each clustering algorithm has a different set of parameters. Default values are used throughout and can be overridden if required. For example, hierarchical and spectral clustering allow you to specify the number of clusters to find using n_clusters, while DBSCAN and OPTICS do not.
Also, hierarchical clustering allows for a distance_metric to be set, which can be one of: 'braycurtis', 'canberra', 'chebyshev', 'cityblock', 'correlation', 'cosine', 'dice', 'euclidean', 'hamming', 'jaccard', 'jensenshannon', 'kulczynski1', 'mahalanobis', 'matching', 'minkowski', 'rogerstanimoto', 'russellrao', 'seuclidean', 'sokalmichener', 'sokalsneath', 'sqeuclidean', or 'yule'.
Likewise, hierarchical clustering allows the linkage method to be adjusted. It can be one of: 'single', 'complete', 'average', 'weighted', 'centroid', 'median', or 'ward'.
For complete documentation of each clustering algorithm’s parameters see the following:
Spectral: SpectralClusteringPyrea
Hierarchical: HierarchicalClusteringPyrea
DBSCAN: DBSCANPyrea
OPTICS: OPTICSPyrea
- Parameters:
clusterer – The type of clusterer to use. Can be one of: 'spectral', 'hierarchical', 'dbscan', or 'optics'.
precomputed – Whether the clusterer should assume the data is a distance matrix.
**kwargs – Keyword arguments to be passed to the clusterer. See each clustering algorithm’s documentation for full details: Spectral: SpectralClusteringPyrea, Hierarchical: HierarchicalClusteringPyrea, DBSCAN: DBSCANPyrea, and OPTICS: OPTICSPyrea.
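To illustrate what a distance matrix looks like, the following minimal sketch builds a square, symmetric matrix of pairwise distances from raw observations in pure Python. The helper names (euclidean, cityblock, chebyshev, pairwise_distances) are hypothetical and are not part of Pyrea; only the metric names echo the distance_metric options listed above. A matrix of this shape is the kind of input that a clusterer created with precomputed=True would presumably expect.

```python
import math

# Hypothetical helpers for illustration only; none of these are Pyrea functions.
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cityblock(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

METRICS = {"euclidean": euclidean, "cityblock": cityblock, "chebyshev": chebyshev}

def pairwise_distances(data, metric="euclidean"):
    """Return an n x n distance matrix (list of lists) for n observations."""
    dist = METRICS[metric]
    n = len(data)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Distances are symmetric, so fill both halves at once.
            matrix[i][j] = matrix[j][i] = float(dist(data[i], data[j]))
    return matrix

data = [[0, 0], [3, 4], [6, 8]]
D = pairwise_distances(data, metric="euclidean")
```

The diagonal is zero and the matrix is symmetric, which is what distinguishes a distance matrix from a plain data matrix of samples by features.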
- pyrea.core.execute_ensemble(views: List[View], fuser: Fusion) → list
Executes an ensemble and returns a new View object.
- Parameters:
views – The ensemble’s views.
fuser – The fusion algorithm used to fuse the clustered data.
clusterers – A clustering algorithm or list of clustering algorithms used to cluster the fused matrix created by the fusion algorithm.
    v = pyrea.execute_ensemble([view1, view2, view3], fusion, clusterer)
Returns a View object which can consequently be included in a further ensemble.
See also
The view() function.
See also
The clusterer() function.
- pyrea.core.fuser(fuser: str)
Creates a Fusion object, which is used to fuse the results of an arbitrarily long list of clusterings.
    f = pyrea.fuser('agreement')
- Parameters:
fuser – The fusion algorithm to use. Must be one of 'agreement', 'disagreement', or 'consensus'.
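One common way to fuse several clusterings is through co-association counts. The sketch below assumes that "agreement" counts, for every pair of samples, the clusterings in which both samples receive the same label, and that "disagreement" counts the clusterings in which they receive different labels. These definitions are assumptions made for illustration: Pyrea's own fusion classes may compute or normalise the fused matrix differently, and the function names here are hypothetical.

```python
# Illustrative, pure-Python fusion of label vectors; not Pyrea's implementation.
def agreement_matrix(labelings):
    """Count, per sample pair, how many clusterings put both in the same cluster."""
    n = len(labelings[0])
    fused = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    fused[i][j] += 1
    return fused

def disagreement_matrix(labelings):
    """Count, per sample pair, how many clusterings separate the two samples."""
    total = len(labelings)
    agree = agreement_matrix(labelings)
    return [[total - count for count in row] for row in agree]

# Three clusterings of the same four samples.
labelings = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
A = agreement_matrix(labelings)
D = disagreement_matrix(labelings)
```

Under these assumptions the disagreement matrix has a zero diagonal and grows for samples that are rarely clustered together, so it behaves like a distance matrix and could be handed to a clusterer that expects precomputed distances.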
- pyrea.core.get_ensemble(views: List[View], fuser: Fusion, clusterers: List[Clusterer]) → Ensemble
Creates and returns an Ensemble object which must be executed later to get the ensemble’s computed view.
- pyrea.core.parea_1(views: list | None = None, c_1_type='hierarchical', c_1_method='ward', c_2_type='hierarchical', c_2_method='complete', c_1_pre_type='hierarchical', c_1_pre_method='ward', c_2_pre_type='hierarchical', c_2_pre_method='complete', fusion_method='disagreement', k=2)
Implements the PAREA-1 algorithm.
The function accepts a list of parameters for the Parea 1 algorithm, which can optionally be optimised using a genetic algorithm.
The default values are those described in the package’s paper and README documentation.
- pyrea.core.parea_1_genetic(views: list, k: int)
Genetic algorithm optimised implementation of Parea 1.
- pyrea.core.parea_2(c_1_type='hierarchical', c_1_method='ward', c_2_type='hierarchical', c_2_method='complete', c_3_type='hierarchical', c_3_method='single', c_1_pre_type='hierarchical', c_1_pre_method='ward', c_2_pre_type='hierarchical', c_2_pre_method='complete', c_3_pre_type='hierarchical', c_3_pre_method='single', fusion_method='disagreement', k=2)
- pyrea.core.parea_2_genetic(views: list, k: int)
Genetic algorithm optimised implementation of Parea 2.
- pyrea.core.summary()
Not yet implemented.
Prints a summary of the current ensemble structure, including any already calculated statistics.
- pyrea.core.view(data: array, clusterer: Clusterer) → View
Creates a View object that can subsequently be used to create an Ensemble.
Views are created using some data in the form of a NumPy matrix or 2D array, and a clustering algorithm:
    d = numpy.random.rand(100,10)
    v = pyrea.view(d, c)
Views are used to create ensembles. They consist of some data, d above, and a clustering algorithm, c above.
The structure Module
The pyrea.structure module contains the classes used for the internal functionality of Pyrea. The classes contained here are not generally called or instantiated by the user; see the pyrea.core module for the user-facing API.
Developers who wish to extend Pyrea, such as by creating a custom clustering algorithm, should consult the documentation of the Clusterer abstract base class, for example. The Fusion class is another such abstract base class that must be used if a developer wishes to create a custom fusion algorithm for use within Pyrea.
- class pyrea.structure.AgglomerativeClusteringPyrea(n_clusters=2, linkage: str = 'ward', affinity: str = 'euclidean', memory: None | Any = None, connectivity=None, compute_full_tree='auto', distance_threshold=None, compute_distances=False)
Perform agglomerative clustering.
See https://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html
- class pyrea.structure.Agreement
Agreement fusion function.
Creates the agreement of two clusterings.
Fusion is the Abstract Base Class for all fusion algorithms. All fusion algorithms must be a subclass of this class in order to be accepted by functions such as execute_ensemble(). To extend Pyrea with a custom fusion algorithm, create a new class that is a subclass of Fusion, and implement the Fusion.execute() function.
- class pyrea.structure.Clusterer
Clusterer is the Abstract Base Class for all clustering algorithms. All clustering algorithms must be a subclass of this class in order to be accepted by functions such as execute_ensemble(). To extend Pyrea with a custom clustering algorithm, create a new class that is a subclass of Clusterer, and implement the Clusterer.execute() function.
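The subclassing pattern described above can be sketched as follows. To keep the snippet runnable without Pyrea installed, it defines a stand-in abstract base class named Clusterer; in real use the subclass would derive from pyrea.structure.Clusterer instead. The execute(data) signature, its return value (one integer label per row), and the ThresholdClusterer class are assumptions made for illustration and are not taken from Pyrea's source.

```python
from abc import ABC, abstractmethod

class Clusterer(ABC):
    """Stand-in for pyrea.structure.Clusterer (assumed interface)."""

    @abstractmethod
    def execute(self, data):
        """Cluster `data` and return one label per row."""

class ThresholdClusterer(Clusterer):
    """Hypothetical clusterer: labels a row 1 if its mean exceeds a threshold."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def execute(self, data):
        # A row's label depends only on whether its mean is above the threshold.
        return [1 if sum(row) / len(row) > self.threshold else 0 for row in data]

c = ThresholdClusterer(threshold=5.0)
labels = c.execute([[1, 5, 3, 7], [4, 2, 9, 4], [8, 6, 1, 9], [7, 1, 8, 1]])
```

Because execute() is declared abstract, Python refuses to instantiate a subclass that forgets to implement it, which catches an incomplete custom clusterer before it reaches an ensemble.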
- class pyrea.structure.Consensus
Consensus fusion function.
Creates the consensus of two clusterings.
Fusion is the Abstract Base Class for all fusion algorithms. All fusion algorithms must be a subclass of this class in order to be accepted by functions such as execute_ensemble(). To extend Pyrea with a custom fusion algorithm, create a new class that is a subclass of Fusion, and implement the Fusion.execute() function.
- class pyrea.structure.DBSCANPyrea(eps=0.5, min_samples=5, metric='euclidean', metric_params=None, algorithm='auto', leaf_size=30, p=None, n_jobs=None)
- class pyrea.structure.Disagreement
Disagreement fusion function.
Creates the disagreement of two clusterings.
Fusion is the Abstract Base Class for all fusion algorithms. All fusion algorithms must be a subclass of this class in order to be accepted by functions such as execute_ensemble(). To extend Pyrea with a custom fusion algorithm, create a new class that is a subclass of Fusion, and implement the Fusion.execute() function.
- class pyrea.structure.Ensemble(views: List[View], fuser: Fusion)
The Ensemble class encapsulates the views, fusion algorithm and clustering methods required to perform a multi-view clustering.
- Parameters:
views – The views that constitute the ensemble’s multi-view data.
fuser – The fusion algorithm to use.
clusterers – The clustering algorithms to use on the fused matrix.
- class pyrea.structure.Fusion
Fusion is the Abstract Base Class for all fusion algorithms. All fusion algorithms must be a subclass of this class in order to be accepted by functions such as execute_ensemble(). To extend Pyrea with a custom fusion algorithm, create a new class that is a subclass of Fusion, and implement the Fusion.execute() function.
- class pyrea.structure.HierarchicalClusteringPyrea(precomputed, method='single', metric='euclidean', optimal_ordering=False, distance_metric='euclidean', out=None, n_clusters=None, height=None)
- class pyrea.structure.OPTICSPyrea(min_samples=5, max_eps=inf, metric='minkowski', p=2, metric_params=None, cluster_method='xi', eps=None, xi=0.05, predecessor_correction=True, min_cluster_size=None, algorithm='auto', leaf_size=30, n_jobs=None)
- class pyrea.structure.Parea
Parea fusion algorithm. This functionality is not yet implemented.
Fusion is the Abstract Base Class for all fusion algorithms. All fusion algorithms must be a subclass of this class in order to be accepted by functions such as execute_ensemble(). To extend Pyrea with a custom fusion algorithm, create a new class that is a subclass of Fusion, and implement the Fusion.execute() function.
- class pyrea.structure.SpectralClusteringPyrea(n_clusters=8, eigen_solver=None, n_components=None, random_state=None, n_init=10, gamma=1.0, affinity='rbf', n_neighbors=10, eigen_tol=0.0, assign_labels='kmeans', degree=3, coef0=1, kernel_params=None, n_jobs=None, verbose=False)
Perform spectral clustering.
See: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.SpectralClustering.html
- class pyrea.structure.View(data, clusterer: List[Clusterer])
Represents a View, which consists of some data and a clustering algorithm, clusterer.
Requires a data source, data, which is used to create the view (the data source can be a Python matrix (a list of lists), a NumPy 2D array, or a Pandas DataFrame) and a clustering method, clusterer.
Some examples follow (using a list of lists):
    import pyrea
    data = [[1, 5, 3, 7], [4, 2, 9, 4], [8, 6, 1, 9], [7, 1, 8, 1]]
    v = pyrea.view(data, pyrea.clusterer('hierarchical', method='ward', n_clusters=2))
Or by passing a Pandas DataFrame (pandas.core.frame.DataFrame):
    import pyrea
    import pandas
    data = pandas.read_csv('iris.csv')
    v = pyrea.view(data, pyrea.clusterer('hierarchical', method='ward', n_clusters=2))
Or by passing a NumPy 2D array or matrix (numpy.matrix or numpy.ndarray):
    import pyrea
    import numpy
    data = numpy.random.randint(0, 10, (4,4))
    v = pyrea.view(data, pyrea.clusterer('hierarchical', method='ward', n_clusters=2))
See also
The Clusterer class.
- Parameters:
data – The data from which to create your View.
clusterer – The clustering algorithm to use to cluster your data.
- Variables:
labels – Contains the calculated labels when the clusterer is run on the data.