Data Mining Techniques in Direct Marketing on Imbalanced Data Using Tomek Link Combined With Random Under-Sampling

dc.contributor.author Yilmaz, Umit
dc.contributor.author Gezer, Cengiz
dc.contributor.author Aydin, Zafer
dc.contributor.author Gungor, V. Cagri
dc.date.accessioned 2025-09-25T10:43:23Z
dc.date.available 2025-09-25T10:43:23Z
dc.date.issued 2021
dc.description Yilmaz, Umit/0000-0003-2918-7799; en_US
dc.description.abstract Identifying potential customers is very important in direct marketing. Data mining techniques are among the most important methods companies use to identify them. However, because potential customers are far fewer than non-potential customers, there is a class imbalance problem that significantly degrades the performance of data mining techniques. In this paper, different combinations of basic and advanced resampling techniques, such as Synthetic Minority Over-sampling Technique (SMOTE), Tomek Link, Random Under-Sampling (RUS), and Random Over-Sampling (ROS), were evaluated to improve the performance of customer classification. Different feature selection techniques, such as Information Gain, Gain Ratio, Chi-squared, and Relief, were used to reduce the number of non-informative features in the data. Classification performance was compared across several data mining techniques, namely LightGBM, XGBoost, Gradient Boost, Random Forest, AdaBoost, ANN, Logistic Regression, Decision Trees, SVC, and Bagging Classifier, based on ROC AUC and sensitivity metrics. A combination of Tomek Link and Random Under-Sampling as the resampling technique and the Chi-squared method as the feature selection algorithm outperformed the other combinations. Detailed performance evaluations demonstrated that, with the proposed approach, LightGBM, a gradient boosting algorithm based on decision trees, gave the best results among the classifiers, with 0.947 sensitivity and a ROC AUC of 0.896. en_US
dc.identifier.doi 10.1145/3471287.3471299
dc.identifier.isbn 9781450389549
dc.identifier.scopus 2-s2.0-85116076075
dc.identifier.uri https://doi.org/10.1145/3471287.3471299
dc.identifier.uri https://hdl.handle.net/20.500.12573/3559
dc.language.iso en en_US
dc.publisher Assoc Computing Machinery en_US
dc.relation.ispartof 5th International Conference on Information System and Data Mining (ICISDM) -- MAY 27-29, 2021 -- ELECTR NETWORK en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject Direct Marketing en_US
dc.subject Data Mining en_US
dc.subject Tomek Link en_US
dc.subject Machine Learning en_US
dc.subject Imbalanced Data en_US
dc.title Data Mining Techniques in Direct Marketing on Imbalanced Data Using Tomek Link Combined With Random Under-Sampling en_US
dc.type Conference Object en_US
dspace.entity.type Publication
gdc.author.id Yilmaz, Umit/0000-0003-2918-7799
gdc.author.scopusid 57281003300
gdc.author.scopusid 57212210999
gdc.author.scopusid 7003852510
gdc.author.scopusid 10739803300
gdc.author.wosid Yılmaz, Ülkü/Aaa-7545-2020
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C5
gdc.coar.access metadata only access
gdc.coar.type text::conference output
gdc.collaboration.industrial false
gdc.description.department Abdullah Gül University en_US
gdc.description.departmenttemp [Yilmaz, Umit; Aydin, Zafer; Gungor, V. Cagri] Abdullah Gul Univ, Dept Comp Engn, Kayseri, Turkey; [Gezer, Cengiz] Adesso Turkey, Res & Dev Ctr, Istanbul, Turkey en_US
gdc.description.endpage 73 en_US
gdc.description.publicationcategory Conference Item - International - Institutional Faculty Member en_US
gdc.description.scopusquality N/A
gdc.description.startpage 67 en_US
gdc.description.woscitationindex Conference Proceedings Citation Index - Science
gdc.description.wosquality N/A
gdc.identifier.openalex W3203718418
gdc.identifier.wos WOS:000794191400010
gdc.index.type WoS
gdc.index.type Scopus
gdc.oaire.diamondjournal false
gdc.oaire.impulse 1.0
gdc.oaire.influence 2.6025408E-9
gdc.oaire.isgreen false
gdc.oaire.popularity 3.0683875E-9
gdc.oaire.publicfunded false
gdc.oaire.sciencefields 0202 electrical engineering, electronic engineering, information engineering
gdc.oaire.sciencefields 02 engineering and technology
gdc.openalex.collaboration National
gdc.openalex.fwci 0.2719
gdc.openalex.normalizedpercentile 0.64
gdc.opencitations.count 2
gdc.plumx.crossrefcites 2
gdc.plumx.mendeley 16
gdc.plumx.scopuscites 3
gdc.scopus.citedcount 3
gdc.virtual.author Aydın, Zafer
gdc.wos.citedcount 1
relation.isAuthorOfPublication a26c06af-eae3-407c-a21a-128459fa4d2f
relation.isAuthorOfPublication.latestForDiscovery a26c06af-eae3-407c-a21a-128459fa4d2f
relation.isOrgUnitOfPublication 665d3039-05f8-4a25-9a3c-b9550bffecef
relation.isOrgUnitOfPublication 52f507ab-f278-4a1f-824c-44da2a86bd51
relation.isOrgUnitOfPublication ef13a800-4c99-4124-81e0-3e25b33c0c2b
relation.isOrgUnitOfPublication.latestForDiscovery 665d3039-05f8-4a25-9a3c-b9550bffecef
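
The resampling step named in the abstract (Tomek Link combined with Random Under-Sampling) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say which member of a Tomek link is removed or what class ratio is targeted, so the choices below (dropping only the majority-class member, under-sampling to a 1:1 ratio) and all function names and toy data are assumptions made for illustration.

```python
import math
import random


def nearest_neighbor(i, X):
    """Index of the point closest to X[i] (Euclidean), excluding i itself."""
    best, best_d = None, math.inf
    for j, x in enumerate(X):
        if j == i:
            continue
        d = math.dist(X[i], x)
        if d < best_d:
            best, best_d = j, d
    return best


def tomek_link_majority_indices(X, y, majority):
    """Indices of majority samples that form a Tomek link with a minority sample."""
    nn = [nearest_neighbor(i, X) for i in range(len(X))]
    drop = set()
    for i, j in enumerate(nn):
        # A Tomek link is a pair of mutual nearest neighbours with different labels.
        if nn[j] == i and y[i] != y[j]:
            # Assumption: only the majority-class member of the link is removed.
            drop.add(i if y[i] == majority else j)
    return drop


def tomek_then_rus(X, y, majority, ratio=1.0, seed=0):
    """Remove majority members of Tomek links, then randomly under-sample the
    majority class to at most ratio * (number of minority samples)."""
    drop = tomek_link_majority_indices(X, y, majority)
    keep = [i for i in range(len(X)) if i not in drop]
    maj = [i for i in keep if y[i] == majority]
    mino = [i for i in keep if y[i] != majority]
    target = min(len(maj), int(ratio * len(mino)))
    rng = random.Random(seed)
    chosen = sorted(rng.sample(maj, target) + mino)
    return [X[i] for i in chosen], [y[i] for i in chosen]


# Hypothetical toy data: label 1 = potential customer (minority), 0 = majority.
X = [(1.0, 1.0), (1.1, 1.0), (5.0, 5.0), (5.2, 5.0),
     (9.0, 9.0), (9.2, 9.0), (3.0, 0.0), (3.1, 0.0)]
y = [1, 0, 0, 0, 0, 0, 1, 1]

X_res, y_res = tomek_then_rus(X, y, majority=0, ratio=1.0, seed=0)
print(len(X_res), y_res.count(0), y_res.count(1))
```

On the toy data, the majority point at (1.1, 1.0) forms a Tomek link with the minority point at (1.0, 1.0) and is removed first; random under-sampling then trims the remaining majority points to match the minority count. In practice a library such as imbalanced-learn would typically be used for both steps; the pure-Python version is shown only to make the two stages explicit.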

Files

Original bundle

Name:
Data Mining Techniques in Direct Marketing on Imbalanced.pdf
Size:
466.97 KB
Format:
Adobe Portable Document Format
Description:
Conference Item

License bundle

Name:
license.txt
Size:
1.44 KB
Format:
Item-specific license agreed upon to submission