[pyclustering.cluster.xmeans] Specify probabilistic bounds for MNDL #624

Closed
annoviko opened this issue Aug 18, 2020 · 1 comment
Labels: Enhancement (tasks related to enhancement and development), Good First Issue (tasks that can be easily done by contributors)

Introduction
alpha and beta are 0.9 by default. These values can affect X-Means results, so it would be useful to make them accessible via **kwargs.

Description
Introduce alpha and beta parameters for the MNDL splitting criterion in X-Means.
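
A minimal sketch of how optional alpha and beta keyword arguments with a default of 0.9 could be read from **kwargs. The class XMeansSketch and its attribute names are hypothetical and only illustrate the pattern; this is not the actual pyclustering implementation.

```python
class XMeansSketch:
    """Hypothetical stand-in used to illustrate the **kwargs pattern only."""

    def __init__(self, data, initial_centers=None, kmax=20, **kwargs):
        self.data = data
        self.initial_centers = initial_centers
        self.kmax = kmax

        # Fall back to 0.9 when the caller does not provide a bound.
        self.alpha = kwargs.get('alpha', 0.9)
        self.beta = kwargs.get('beta', 0.9)


points = [[0.0, 0.0], [1.0, 1.0]]

# No bounds passed: both keep the default value.
default_instance = XMeansSketch(points)

# Bounds passed explicitly as keyword arguments.
tuned_instance = XMeansSketch(points, alpha=0.5, beta=0.5)

print(default_instance.alpha, default_instance.beta)
print(tuned_instance.alpha, tuned_instance.beta)
```

With this pattern existing callers that do not pass alpha or beta keep the current behavior, which is the main reason to prefer optional keyword arguments over new positional ones.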

annoviko added the Enhancement and Good First Issue labels on Aug 18, 2020
annoviko self-assigned this on Aug 18, 2020

annoviko commented Aug 20, 2020

Usage example in Python (passing the alpha and beta arguments to the constructor):

from pyclustering.cluster import cluster_visualizer
from pyclustering.cluster.xmeans import xmeans, splitting_type
from pyclustering.cluster.center_initializer import kmeans_plusplus_initializer
from pyclustering.utils import read_sample
from pyclustering.samples.definitions import FCPS_SAMPLES

# Read sample 'Target' from file.
sample = read_sample(FCPS_SAMPLES.SAMPLE_TARGET)

# Random state.
seed = 1000

# Prepare initial centers - the number of initial centers defines the number of clusters from which X-Means starts its analysis.
amount_initial_centers = 3
initial_centers = kmeans_plusplus_initializer(sample, amount_initial_centers, random_state=seed).initialize()

# Create instance of X-Means algorithm.
xmeans_mndl = xmeans(
    sample,
    initial_centers,
    20,
    splitting_type=splitting_type.MINIMUM_NOISELESS_DESCRIPTION_LENGTH,
    alpha=0.5,
    beta=0.5,
    random_state=seed,
).process()

# Extract X-Means MNDL clustering results:
mndl_clusters = xmeans_mndl.get_clusters()

# Visualize clustering results
visualizer = cluster_visualizer(1, titles=['MNDL'])
visualizer.append_clusters(mndl_clusters, sample, 0)
visualizer.show()

Usage example in C++ (methods set_mndl_alpha_bound and set_mndl_beta_bound):

pyclustering::clst::xmeans solver(centers, p_kmax, p_tolerance, pyclustering::clst::splitting_type::MINIMUM_NOISELESS_DESCRIPTION_LENGTH);
solver.set_mndl_alpha_bound(p_alpha);   // set the alpha probabilistic bound for the MNDL splitting criterion of X-Means.
solver.set_mndl_beta_bound(p_beta);     // set the beta probabilistic bound for the MNDL splitting criterion of X-Means.

Figure_1

annoviko added a commit that referenced this issue Aug 20, 2020
…. Introduced new parameters 'alpha' and 'beta' in order to control MNDL.