Thesis Defence—Exploring Code Clones in Software Development: A Study of PyTorch on GitHub and Stack Overflow

June 20 at 9:00 am - 1:00 pm

Md Jumar Alam, supervised by Dr. Fatemeh Fard, will defend their thesis titled “Exploring Code Clones in Software Development: A Study of PyTorch on GitHub and Stack Overflow” in partial fulfillment of the requirements for the degree of Master of Science in Computer Science.

An abstract for Md Jumar Alam’s thesis is included below.

Defences are open to all members of the campus community as well as the general public. Registration is not required for in-person defences.


ABSTRACT

Code cloning, the practice of duplicating identical or highly similar source code fragments within or across different projects, is a prevalent phenomenon in software development. This practice is not exclusive to traditional software development but extends to deep learning frameworks. Developers often clone code both from files within their own repositories and from distant repositories in the open-source ecosystem. Platforms like GitHub and Stack Overflow serve as rich ecosystems for such practices. This thesis examines code cloning in the context of the PyTorch framework. The research addresses the distribution of PyTorch code clones, the relationship between code cloning practices and user and repository metadata, and the phases of deep learning development where code cloning primarily occurs. Findings reveal that function cloning is more prevalent than block cloning in GitHub-GitHub clones, with Type I and Type II clones being more common. However, for GitHub-Stack Overflow and Stack Overflow-Stack Overflow clones, Type III clones are more prevalent, indicating that users often modify code cloned from different platforms. The research also finds that user contributions, follower count, following count, organizational membership, and repository popularity do not strongly influence code cloning practices. Findings from the clones also show that the data processing, model construction, and model evaluation phases of deep learning development have more clones than other phases. Comparisons show that most cloning occurs in the preliminary preparation and data collection stage by different users, whereas clones in the data processing, model construction, and model evaluation stages are mostly done by the same users in their repositories. Across all stages, Type III clones are more prominent in both categories of users.
Despite limitations, such as the inability to use data from the PyTorch forum, the study provides valuable insights into code cloning practices in PyTorch. The findings could guide future research, such as analyzing the usage of PyTorch APIs during code cloning.

Details

Date:
June 20
Time:
9:00 am - 1:00 pm

Venue

Arts and Sciences Centre (ASC)
3187 University Way
Kelowna, BC V1V 1V7 Canada

Additional Info

Room Number
ASC 301
Registration/RSVP Required
No
Event Type
Thesis Defence
Topic
Research and Innovation, Science, Technology and Engineering
Audiences
Alumni, Community, Faculty, Staff, Families, Partners and Industry, Students, Postdoctoral Fellows and Research Associates