Research Article

CLUSTERING MIXED-TYPE DATA WITH THE KAMILA, K-MEANS, K-MEDOIDS AND K-PROTOTYPES ALGORITHMS: A COMPARATIVE APPLICATION

Volume: 20 Issue: 2, 30 November 2019

A Comparative Application on Clustering of Mixed-type Data Sets with kamila, k-means, k-medoids and k-prototypes Algorithms

Abstract

Cluster analysis is a crucial tool used in many areas of scientific research, and many algorithms exist for performing it. Currently, the two main debates about these algorithms concern which one to use for mixed-type data sets and how to select the best number of clusters. In this study, the ambitiously designed KAMILA algorithm and algorithms that predate it, namely k-means, k-medoids and k-prototypes, are applied to cluster the values of variables measured on different scales. To this end, a data set from a grocery store in Istanbul is analyzed. The company has stores in different districts of Istanbul, and its customers differ in demographic characteristics and purchasing behavior. The data set, provided for 999 customers, includes information such as whether each customer purchases the product categories that are crucial for the company's profitability and the total price of the items purchased. These data were subjected to cluster analysis for customer segmentation. The results show that the KAMILA algorithm can successfully identify the customers in what can be called the gold segment.

Keywords

Details

Primary Language

Turkish

Subjects

-

Section

Research Article

Publication Date

30 November 2019

Submission Date

2 January 2019

Acceptance Date

1 November 2019

Published in Issue

Year 2019 Volume: 20 Issue: 2

Cite

APA
Bilgiç, E. (2019). KARMA TİPTEKİ VERİLERİ KAMILA, K-ORTALAMALAR, KORTAYLAR ve K-PROTOTİPLER ALGORİTMALARIYLA KÜMELEME: KARŞILAŞTIRMALI BİR UYGULAMA. Cumhuriyet Üniversitesi İktisadi ve İdari Bilimler Dergisi, 20(2), 48-70. https://doi.org/10.37880/cumuiibf.507182

Cumhuriyet Üniversitesi İktisadi ve İdari Bilimler Dergisi is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY NC).