Thesis Defence: Predicting remaining time for an energy regulator’s highly parallel process
May 24, 2023 at 3:00 pm - 6:00 pm
Corey Bond, supervised by Dr. Patricia Lasserre and Dr. Yves Lucet, will defend their thesis titled “Predicting remaining time for an energy regulator’s highly parallel process” in partial fulfillment of the requirements for the degree of Master of Science in Computer Science.
An abstract for Corey Bond’s thesis is included below.
Examinations are open to all members of the campus community as well as the general public. Registration is not required for in-person defences.
ABSTRACT
Business process mining is a field that allows businesses to leverage data collected by business systems to gain insight into their processes. While process mining is a broad field, this thesis concentrates on process duration prediction based on information gained by reviewing historical data. Specifically, we investigate a provincial energy regulator’s permit application process, and explore methods to predict the duration of an application, along with that of one of its sub-processes. The primary concern is developing an interpretable model to support the regulatory body in explaining the results of the predictions.
Our goals are to estimate how long a permit application might take to process, and to support engagement with local First Nations communities by providing accurate estimates of the duration of the consultation process. These predictions are complicated by the nature of the processes themselves, which include parallel execution of process tasks and therefore add significant complexity to the data.
We solve the challenge of predicting a highly complex process by bridging the gap between traditional process mining and machine learning methods, while focusing on an interpretable and explainable model. We modify a state-of-the-art random forest method called eXplainable Reasonably Randomized Forest (XRRF) by implementing a comprehensive data cleaning and engineering strategy. This modification also extends the feature selection method that provides interpretability of the results, and increases the hyperparameters available to improve the model’s accuracy.
In this thesis, we present the modified method, called Extended eXplainable Reasonably Randomized Forest (EXRRF). With this method, we were able to identify the factors that have the greatest impact on the duration of the consultation and application processes, while maintaining accuracy within 29-35% of the mean duration for each of these highly complex and variable processes.