ConvE is a convolutional neural network used to predict missing links in knowledge graphs. Each link represents a relationship between a subject and an object; for example, the subject "Acorn" has the relationship "IsSeedOf" with the object "OakTree." ConvE reshapes the embeddings of a subject and a relation into 2D grids, applies convolutional filters over them, and uses the resulting features to score how likely each candidate object is to complete the link, which makes it possible to identify relationships that may be missing from the graph. Predicting missing knowledge graph links is a difficult and complex task, and training a neural network to perform it can take many, many hours. In 2023, I wanted to see whether transfer learning could be applied to this task to speed up the training process.
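To make the scoring step concrete, here is a minimal PyTorch sketch of a ConvE-style model. It is illustrative rather than the exact published architecture: the embedding size, reshape dimensions, and filter count are placeholder choices, and details like dropout and batch normalization are omitted.

```python
import torch
import torch.nn as nn

class ConvE(nn.Module):
    """Minimal ConvE-style link scorer (illustrative hyperparameters)."""
    def __init__(self, num_entities, num_relations, emb_dim=200, h=10, w=20):
        super().__init__()
        assert h * w == emb_dim
        self.h, self.w = h, w
        self.entity_emb = nn.Embedding(num_entities, emb_dim)
        self.relation_emb = nn.Embedding(num_relations, emb_dim)
        self.conv = nn.Conv2d(1, 32, kernel_size=3)        # 2D convolution over stacked embeddings
        conv_out = 32 * (2 * h - 2) * (w - 2)               # flattened feature-map size
        self.fc = nn.Linear(conv_out, emb_dim)

    def forward(self, subject_idx, relation_idx):
        # Reshape subject and relation embeddings into 2D grids and stack them vertically.
        s = self.entity_emb(subject_idx).view(-1, 1, self.h, self.w)
        r = self.relation_emb(relation_idx).view(-1, 1, self.h, self.w)
        x = torch.cat([s, r], dim=2)                        # (batch, 1, 2h, w)
        x = torch.relu(self.conv(x))
        x = torch.relu(self.fc(x.view(x.size(0), -1)))
        # Score every candidate object entity with a dot product against its embedding.
        return x @ self.entity_emb.weight.t()               # (batch, num_entities)
```

Given a (subject, relation) pair, the model returns one score per entity in the graph, and the highest-scoring entities are the predicted objects for the missing link.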
Transfer learning is a relatively simple concept. First, you train a model on one dataset until it is effective and reliable. Then, instead of initializing subsequent models trained on different data with random weights, you initialize them with the weights of the already-trained model. This often cuts training time significantly, because some learned weights generalize across datasets and some parameters have only a narrow range of good values. Starting from pre-trained weights therefore puts a portion of the model's parameters close to where they need to be, so the time needed to bring all of the weights into good ranges is dramatically reduced. In this project, my group was able to reduce the number of epochs needed to train ConvE by over 70% in every case we tested, showing that transfer learning is applicable to knowledge graph link prediction. To view the rest of our findings, please check out the full report.
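The sketch below shows one way this initialization could look, reusing the ConvE class from the sketch above. The dataset sizes are hypothetical, and the rule used here (copy any parameter tensor whose shape matches) is just one reasonable choice: the shared convolution and dense layers carry over, while the dataset-specific embedding tables keep their fresh random initialization.

```python
import torch

# Hypothetical dataset sizes, purely for illustration.
source_model = ConvE(num_entities=40943, num_relations=18)    # assume this was already trained
target_model = ConvE(num_entities=14541, num_relations=237)   # new dataset, different vocabulary

pretrained = source_model.state_dict()
target_state = target_model.state_dict()

# Keep only parameters whose shapes match (the conv filters and dense layer);
# the entity/relation embedding tables differ in size, so they stay randomly initialized.
transferable = {k: v for k, v in pretrained.items()
                if k in target_state and v.shape == target_state[k].shape}
target_state.update(transferable)
target_model.load_state_dict(target_state)
print(f"Transferred {len(transferable)} of {len(target_state)} parameter tensors")
```

Training then proceeds on the new dataset as usual, just starting from this partially pre-trained state instead of a fully random one.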
Because we were able to show that transfer learning is applicable to knowledge graph link prediction, further research can isolate which weights remain consistent across datasets and are responsible for the improved training speed. That knowledge could then be leveraged to identify good initialization values for this task, and ideally to explain why those values work well.