A Cluster-Based Machine Learning Model for Large Healthcare Data Analysis [abstract]

Fatemeh Sharifi, Emad Mohammed, Trafford Crump, Behrouz H. Far


The volume of patient survey data generated by hospitals and the wider healthcare industry is growing rapidly. The curse of dimensionality is a barrier to extracting useful information from this data, information that could help in the treatment and care of patients. Methods are therefore needed to determine which features, within such large volumes of stored information, matter most for a desired output. Health-related quality of life (HRQOL) is a powerful paradigm for defining such an output, as a measure of patient satisfaction. With high-dimensional data, it is difficult to identify the features that best represent the desired output and to explain them so that they can be used in the future at the point of care. In this paper we propose a Cluster-based Random Forest (CB-RF) method to identify the most important features for the desired output, which is the set of Expanded Prostate Cancer Index Composite-26 (EPIC-26) domain scores. EPIC-26 is used to assess a range of HRQOL issues related to the diagnosis and treatment of prostate cancer. Several feature extraction methods were applied, and the best was the proposed CB-RF model, which found the most important features (10 or fewer) out of more than 1500. These features can be used to estimate patients' EPIC-26 values accurately, with an average coefficient of correlation of 85% between predicted and observed values on a real dataset of 5093 patients.
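To make the reported evaluation metric concrete, here is a minimal sketch of how a coefficient of correlation between observed and predicted scores can be computed. It assumes the abstract means the Pearson correlation coefficient, which the abstract does not state explicitly. The function name `pearson_r` and the sample scores are hypothetical and are not taken from the study.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    n = len(observed)
    if n < 2:
        raise ValueError("need at least two paired values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sum of co-deviations and sums of squared deviations about the means.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)


# Made-up domain scores for five hypothetical patients (not study data).
observed = [85.0, 60.0, 72.5, 90.0, 45.0]
predicted = [80.0, 65.0, 70.0, 88.0, 50.0]
r = pearson_r(observed, predicted)
print(f"r = {r:.3f}")
```

A value near 1 means the predicted scores rise and fall with the observed ones. Under this reading, the 85% figure in the abstract would correspond to r of about 0.85, averaged over the EPIC-26 domains.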


Keywords: Machine learning, Big data, Patient quality of life, Dimension reduction

Part of the Communications in Computer and Information Science book series (CCIS, volume 1054)

It’s Movember, time to Grow a Mo for a Bro!

It’s Movember, time to grow your moustache to raise funds and awareness of some serious health risks that men face, like suicide, testicular cancer and prostate cancer. Maybe growing a moustache isn’t your thing? No problem, host a Mo-ment for the men in your life instead!

APCaRI is a key stakeholder in the TrueNTH Global Registry, contributing 92% of the submitted patients in February 2018. Recently described by Evans et al. (2017) in an article published in BMJ Open, the project was established as an international registry to monitor the care of men with localized prostate cancer in 13 Movember-fundraising countries. Prostate cancer treatment and outcomes vary according to where men live, their race and the care they receive. The TrueNTH Global Registry is collecting a dataset based on the International Consortium for Health Outcomes Measurement (ICHOM) so that we can better understand how to improve the care and treatment of men with localized prostate cancer, regardless of ethnicity and geography.

Please check out previous APCaRI blog posts about Movember (@Movember), the international men's health awareness charity, and about TrueNTH (@TrueNTH_Canada), a program funded by Prostate Cancer Canada (PCC) and the Movember Foundation that aims to improve the quality of life of men with prostate cancer and their families.

So start growing (or attach) your moustache today to raise funds and awareness to improve men's health!

- Perrin Beatty