
Thesis Defence: A Machine Learning Approach to Survival Analysis for Sustainable Gas Well Decommissioning
July 3 at 9:00 am - 1:00 pm

Christina John Mjema, supervised by Dr. Yves Lucet, will defend their thesis titled “A Machine Learning Approach to Survival Analysis for Sustainable Gas Well Decommissioning” in partial fulfillment of the requirements for the degree of Master of Science in Computer Science.
An abstract for Christina John Mjema’s thesis is included below.
Defences are open to all members of the campus community as well as the general public. Please email yves.lucet@ubc.ca to receive the Zoom link for this defence.
Abstract
The ability to predict the productive lifespan of gas wells is essential for informed regulatory oversight and strategic operational planning. Traditional decline curve analysis methods, though widely used, fit mathematical models to individual well production histories, often neglecting broader behavioural patterns across well populations and struggling with missing or irregular data.
This thesis presents a survival analysis framework that integrates domain knowledge, formation-based clustering, and threshold-based data segmentation to improve the accuracy and interpretability of end-of-life predictions for approximately 20,000 wells. Survival models—including the time-varying Cox proportional hazards model, the random survival forest model, and a hybrid model combining both—are applied to both the full dataset and geologically homogeneous clusters. The framework incorporates 12 static geological attributes and 6 time-dependent production features. Production thresholds are used to define survival labels, representing economically meaningful definitions of low production, such as daily rates below 50,000 cubic meters.
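As a rough illustration of threshold-based survival labelling, the sketch below turns one well's production history into a (duration, event) pair. The abstract does not state the exact labelling rule, so everything here is an assumption made for illustration: the function name `survival_label`, the use of evenly spaced reporting periods, and the requirement that production stay below the threshold for `min_run` consecutive periods before the well counts as having reached low production. Only the 50,000 cubic meters per day figure comes from the abstract.

```python
from typing import Sequence, Tuple


def survival_label(daily_rates: Sequence[float],
                   threshold: float = 50_000.0,
                   min_run: int = 3) -> Tuple[int, bool]:
    """Return (duration, event_observed) for one well.

    daily_rates: average daily production per reporting period
                 (cubic meters per day), in time order.
    threshold:   low-production cutoff; 50,000 is the example in the abstract.
    min_run:     hypothetical rule: the event is the first period that
                 starts a run of `min_run` consecutive periods below
                 `threshold`.

    A well that never produces such a run is right-censored at the end
    of its recorded history.
    """
    run = 0  # length of the current below-threshold streak
    for i, rate in enumerate(daily_rates):
        run = run + 1 if rate < threshold else 0
        if run == min_run:
            # Duration is the number of periods before the streak began.
            return i - min_run + 1, True
    return len(daily_rates), False


if __name__ == "__main__":
    history = [90e3, 70e3, 40e3, 60e3, 45e3, 30e3, 20e3]
    print(survival_label(history))           # event observed
    print(survival_label([90e3, 80e3, 70e3]))  # censored
```

Requiring a sustained run rather than a single low reading is one way to keep a temporary shut-in from being counted as end of life; a different rule would change the labels but not the overall shape of the pipeline.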
Experimental results show that clustering reduces the mean absolute error in predicted well lifespan to 1.8 years for wells with an average productive life of 20 years. For example, in Jean Marie Formation wells, clustered models outperform full-dataset models by up to 15 percent in relative error. Clustering also improves computational efficiency: training time for random survival forest models on 3,000 wells falls from 9 hours to 1.5 hours, a sixfold reduction. Among all models, the time-varying Cox proportional hazards model achieves the highest accuracy, especially for wells with strong recent production trends. Hybrid models perform comparably well, while standard Cox models underperform in scenarios involving time-dependent behaviour.
To validate predictions, classical exponential decline curves are fitted and plotted alongside survival model outputs and economic limits. The visual alignment is strong for most wells, though deviations occur for wells with scattered or atypical production, such as wells 89 and 38557. Random survival forest models are also identified as the most computationally demanding, and current software packages lack graphics processing unit support, suggesting a clear direction for future development.
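The decline curve check described above can be sketched as follows, assuming the classical exponential form q(t) = q_i * exp(-D * t) and a log-linear least-squares fit. The function names `fit_exponential_decline` and `time_to_limit`, the fitting method, and the synthetic numbers are illustrative assumptions, not the procedure used in the thesis.

```python
import math
from typing import Sequence, Tuple


def fit_exponential_decline(times: Sequence[float],
                            rates: Sequence[float]) -> Tuple[float, float]:
    """Fit q(t) = q_i * exp(-D * t) by least squares on ln(q).

    Returns (q_i, D). All rates must be strictly positive and at least
    two distinct time values are needed.
    """
    logs = [math.log(q) for q in rates]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), -slope


def time_to_limit(q_i: float, decline: float, q_limit: float) -> float:
    """Time at which the fitted curve reaches the economic limit."""
    if q_limit >= q_i:
        return 0.0        # already at or below the limit
    if decline <= 0:
        return math.inf   # a non-declining curve never reaches it
    return math.log(q_i / q_limit) / decline


if __name__ == "__main__":
    # Synthetic history: 200,000 m^3/day declining at 10 percent per year.
    years = list(range(10))
    rates = [200_000 * math.exp(-0.1 * t) for t in years]
    q_i, d = fit_exponential_decline(years, rates)
    print(round(time_to_limit(q_i, d, 50_000), 2))
```

The time returned by `time_to_limit` is the quantity that would be compared against a survival model's predicted end of life for the same well. Scattered or atypical production histories fit this curve poorly, which is consistent with the deviations the abstract reports.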
This research demonstrates that combining survival analysis with domain-informed clustering and decline curve validation offers a practical, interpretable, and scalable approach to predicting gas well end-of-life, supporting timely abandonment planning and strategic resource allocation.