Predicting Seminal Quality and its Dependence on Life Style Factors Through Ensemble Learning

OAI: oai:igi-global.com:246079
DOI: 10.4018/IJEHMC.2020040105
Published by: IGI Global

Abstract

Awareness of fertility is increasingly important as lifestyle habits change. Semen analysis is a reliable confirmatory test of male fertility. The base classifiers evaluated are the supervised machine learning models Decision Tree, Logistic Regression, and Naive Bayes, of which Logistic Regression shows a promising accuracy of 88%. Applying the bagging ensemble method to the weakest classifier raises its accuracy from 78.80% to 90.02%. The authors have also attempted to design a novel voting classifier that votes over the ensemble learners, producing a more complex model with an accuracy of 89%. In addition, the authors have analyzed the receiver operating characteristic (ROC) curve for the Extra Tree classifier, which yields an area under the curve (AUC) of 66%. The validation procedure used is 5-fold cross-validation. Finally, the authors have analyzed which lifestyle habits contribute to the problem using impurity-based feature selection and identify 'Age' as the most crucial factor in declining seminal quality.
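
To make three of the building blocks named in the abstract concrete (5-fold cross-validation, voting over several learners, and impurity-based feature scoring), the following is a minimal, dependency-free sketch. It is not the authors' code: the function names, the hard (majority) voting rule, the choice of Gini impurity, and the labels "N" and "O" are all illustrative assumptions, since the abstract does not specify these details.

```python
from collections import Counter


def k_fold_indices(n_samples, k=5):
    """Partition indices 0..n_samples-1 into k contiguous, near-equal folds."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def majority_vote(predictions):
    """Hard voting: the most frequent label wins (ties go to the first seen)."""
    return Counter(predictions).most_common(1)[0][0]


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_c^2) over the class proportions p_c."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Weighted drop in Gini impurity when `parent` is split into two children."""
    n = len(parent)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted


if __name__ == "__main__":
    # Hypothetical sample count; each fold serves once as the held-out test set.
    folds = k_fold_indices(12, k=5)
    for test_idx in folds:
        train_idx = [i for fold in folds if fold is not test_idx for i in fold]
        print(len(train_idx), "train /", len(test_idx), "test")

    # Three hypothetical learners vote on one sample.
    print(majority_vote(["N", "O", "N"]))

    # A split that separates the two hypothetical classes perfectly.
    print(impurity_decrease(["N", "N", "O", "O"], ["N", "N"], ["O", "O"]))
```

In a tree-based learner, a feature's impurity-based importance is typically accumulated from impurity decreases of this kind over every split that uses the feature, which is how a single feature such as 'Age' can emerge as the top-ranked factor.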