Cover Image for System.Linq.Enumerable+EnumerablePartition`1[System.Char]

Intelligent Information Retrieval for Reducing Missed Cancer and Improving the Healthcare System

OAI: oai:igi-global.com:284488 DOI: 10.4018/IJIRR.2022010102
Published by: IGI Global

Abstract

This study presents an intelligent information retrieval system that will effectively extract useful information from breast cancer datasets and utilized that information to build a classification model. The proposed model will reduce the missed cancer rate by providing a comprehensive decision support to the radiologist. The model is built on two datasets, Wisconsin Breast Cancer Dataset (WBCD) and 365 free text mammography reports from a hospital. Effective pre-processing techniques including filling missing values with regression, an effective Natural Language Processing (NLP) Parser is developed to handle free text mammography reports, balancing the dataset with Synthetic Minority Oversampling (SMOTE) was applied to prepare the dataset for learning. Most relevant features were selected with the help of filter method and tf-idf scores. K-NN and SGD classifiers are optimized with optimum value of k for K-NN and hyper tuning the SGD parameters with grid search technique.