Q-GEV Based Novel Trainable Clustering Scheme for Reducing Complexity of Data Clustering

Authors

  • Mohamed Abd Elaziz Faculty of Computer Science and Engineering, Galala University, Suez, Egypt; Academy of Scientific Research and Technology (ASRT), Cairo, Egypt; Department of Mathematics, Faculty of Science, Zagazig University, Zagazig, Egypt Author
  • Esraa Osama Abo Zaid Department of Mathematics and Computer Science, Faculty of Science, Suez University, Suez, Egypt Author
  • Mohammed A. A. Al-qaness Emirates International University Author
  • Amjad Ali College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar Author
  • Ali Kashif Bashir Department of Computing and Mathematics, Manchester Metropolitan University, Manchester, UK Author
  • Ahmed A. Ewees Department of Computer, Damietta University, Damietta, Egypt Author
  • Yasser D. Al-Otaibi Department of Information Systems, Faculty of Computing and Information Technology in Rabigh, King Abdulaziz University, Jeddah, Saudi Arabia Author
  • Ala Al-Fuqaha College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar Author

DOI:

https://doi.org/10.1111/exsy.70011

Keywords:

artificial intelligence; continual learning; data clustering; density peak clustering; generalised extreme value; learning model; machine learning

Abstract

This paper presents a new data clustering technique that aims to improve the performance of the trainable path-cost algorithm and to reduce the computational complexity of data clustering models. The proposed method identifies natural groupings within a set of features and detects the clusters whose members behave most similarly, which is crucial for effective coordination in complex environments and addresses limitations of traditional methods. The trainable path-cost algorithm uses a density peak clustering method to determine cluster centers and then extracts features from the paths passing through these peak points (centers). These features are used to train a support vector machine (SVM) that predicts the labels of the remaining points. The proposed method enhances this algorithm in two ways. First, it employs the Q-Generalised Extreme Value (Q-GEV) distribution under power normalisation instead of the traditional generalised extreme value distribution, which increases modelling flexibility. Second, it replaces the SVM with a random vector functional link (RVFL) network, which helps avoid overfitting and improves label prediction accuracy. The proposed clustering algorithm is evaluated on UCI benchmark datasets and real-world data, where it shows significant improvements in F1 measure, Jaccard index, purity, and accuracy, highlighting its ability to identify paths between similar clusters accurately. Its average F1 measure, Jaccard index, purity, and accuracy are 76.87%, 56.29%, 80.29%, and 79.64%, respectively.
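The center-selection step mentioned in the abstract can be illustrated with a minimal sketch of density peak clustering in the style of Rodriguez and Laio (2014). This is not the authors' implementation: the Gaussian kernel, the cutoff `d_c`, the fixed `n_centers`, and the function name `density_peaks` are assumptions made for illustration. In particular, the paper selects peaks with a Q-GEV model, whereas this sketch simply takes the points with the largest density-times-separation score.

```python
import numpy as np

def density_peaks(X, d_c, n_centers):
    """Illustrative density-peak center selection (hypothetical helper).

    X         : (n, d) array of points
    d_c       : kernel cutoff distance (assumed hyperparameter)
    n_centers : number of centers to return (assumed fixed here)
    """
    # Pairwise Euclidean distances between all points.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Local density rho: Gaussian kernel sum, minus the self term exp(0) = 1.
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0

    # delta: distance to the nearest point of strictly higher density.
    # The globally densest point gets its largest distance instead.
    n = len(X)
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()

    # Centers are the points that are both dense and far from denser points.
    gamma = rho * delta
    centers = np.argsort(gamma)[::-1][:n_centers]
    return rho, delta, centers

# Toy data (an assumption for the demo): two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(50, 2)),
])
rho, delta, centers = density_peaks(X, d_c=1.0, n_centers=2)
print(centers)  # indices of the two selected peak points
```

In this toy setting one selected index falls in each blob, because only the densest point of each blob has a large `delta`. The later stages described in the abstract (path features and the RVFL classifier) are not covered by this sketch.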




Published

2025-02-27

Section

Articles

How to Cite

Abd Elaziz, M., Abo Zaid, E. O., Al-qaness, M. A. A., Ali, A., Bashir, A. K., Ewees, A. A., Al-Otaibi, Y. D., & Al-Fuqaha, A. (2025). Q-GEV Based Novel Trainable Clustering Scheme for Reducing Complexity of Data Clustering. Emirates International University Digital Repository, 1(1). https://doi.org/10.1111/exsy.70011
