A Cluster-Based Machine Learning Model for Large Healthcare Data Analysis

Fatemeh Sharifi, Emad Mohammed, Trafford Crump, Behrouz H. Far


The amount of patient survey data generated by hospitals and the wider healthcare industry is growing rapidly. The curse of dimensionality is a barrier to extracting useful information from such data, information that could help in the treatment and care of patients. Methods are therefore needed to determine, from these large volumes of stored information, which features matter most for a desired output. Health-related quality of life (HRQOL) is a powerful paradigm for defining such an output, measured here as patient satisfaction. With high-dimensional data it is difficult to identify the features that best represent the desired output, and to explain them so that they can be used in the future at the point of care. In this paper we propose a Cluster-Based Random Forest (CB-RF) method to identify the most important features for the desired output, namely the Expanded Prostate Cancer Index Composite-26 (EPIC-26) domain scores. EPIC-26 is used to assess a range of HRQOL issues related to the diagnosis and treatment of prostate cancer. Several feature extraction methods were compared, and the proposed CB-RF model performed best, identifying the most important features (10 or fewer) out of more than 1,500. These features can be used to estimate patients' EPIC-26 values accurately: on a real dataset of 5,093 patients, the correlation coefficient between predicted and observed values averaged 85%.
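The abstract does not say how CB-RF clusters patients, ranks features, or validates its predictions, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes k-means clustering, impurity-based random forest importances from scikit-learn, a 75/25 hold-out split within each cluster, and Pearson correlation as the "coefficient of correlation". The function name `cluster_based_rf` and all of its parameters are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor


def cluster_based_rf(X, y, n_clusters=2, top_k=10, seed=0):
    """Return {cluster: (top feature indices, Pearson r on held-out rows)}.

    Hypothetical sketch: the clustering algorithm, importance measure,
    and hold-out scheme are assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Assumed step 1: group patients by k-means on the survey features.
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)
    results = {}
    for c in range(n_clusters):
        idx = rng.permutation(np.flatnonzero(labels == c))
        if idx.size < 8:  # too few patients to split and evaluate
            continue
        cut = int(0.75 * idx.size)
        train, test = idx[:cut], idx[cut:]
        # Assumed step 2: rank all features by impurity-based importance.
        ranker = RandomForestRegressor(n_estimators=100, random_state=seed)
        ranker.fit(X[train], y[train])
        top = np.argsort(ranker.feature_importances_)[::-1][:top_k]
        # Assumed step 3: refit on the top_k features only and score the
        # held-out rows by correlation between predicted and observed values.
        model = RandomForestRegressor(n_estimators=100, random_state=seed)
        model.fit(X[train][:, top], y[train])
        pred = model.predict(X[test][:, top])
        r = float(np.corrcoef(pred, y[test])[0, 1])
        results[c] = (top.tolist(), r)
    return results
```

Called with a feature matrix `X` (patients by survey features) and one EPIC-26 domain score `y`, the function returns, for each cluster, the indices of the selected features and the held-out correlation. Ranking features separately within each cluster is a design choice of this sketch: it lets different patient groups have different important features.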


Keywords: Machine learning; Big data; Patient quality of life; Dimension reduction

Part of the Communications in Computer and Information Science book series (CCIS, volume 1054)

Annual Terwillegar Trail Run and Walk Fundraiser

It was a beautiful, crisp fall morning for a 10 km trail run or 7.5 km walk through the Terwillegar ravine on Saturday, September 29th. The run/walk, hosted by the Terwillegar Trail Run/Walk and the Alberta Cancer Foundation, is in its 7th year. Its goal is to bring families and friends together to enjoy the outdoors and ultimately raise funds for prostate cancer research.

John Lewis’ research group was out in force, represented by John Lewis, Catalina Vasquez, Arun Raturi, Perrin Beatty and Abbie Coros. Even though John ran in 15-year-old tennis shoes, as run/walk organizer Doug Mitchell pointed out to the participants, the Lewis group runners ran well and had a great time!

Funds raised by the Terwillegar Trail Run and Walk go to support cancer research in Alberta. Check out the Alberta Cancer Foundation’s “Dollars at Work” to read about how these funds have been used to support research in the labs of APCaRI members Dr. Frank Wuest and Dr. John Lewis!

With just over 100 participants this year, the 2018 Terwillegar Trail Run/Walk raised over $21,000 for prostate cancer research! You can still donate to this awesome fundraiser: just go to Alberta Cancer Foundation TTRW and click on the Donate Now button!

- Perrin Beatty