Deep Learning (DL) applications have become pervasive in almost every field, including many domains of science and technology. Large-scale, memory-intensive DL applications, which train complex models on high-resolution data with large batch sizes, consume large amounts of memory during the training phase, and the memory consumed often exceeds the available system resources. Reliably predicting a DL model's memory consumption before training begins is therefore valuable: it helps avoid OutOfMemory errors and conserves limited system resources and computation budget (for example, when running on the cloud).
In this paper, we adopt a modeling approach based on symbolic regression to generate an accurate memory consumption model for a DL application and predict its peak memory usage during training. We evaluated our approach using 3D U-Net as a case study, because it exhibits high memory consumption during the training phase (close to 1 TB for large image sizes). The resulting memory consumption model predicted the peak memory consumed during training across different input and batch sizes with less than 5% Mean Absolute Percentage Error (MAPE). We then compared our approach against models generated by other machine-learning-based regression methods, demonstrating the superior accuracy of our modeling approach.
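To illustrate the general idea (not the paper's actual model or data), the following sketch fits a hypothesized symbolic form for peak memory as a function of batch size and input volume, and scores it with MAPE. The functional form, the coefficients, and the synthetic measurements below are all illustrative assumptions.

```python
import numpy as np

# Hypothetical symbolic form assumed for illustration:
#   peak_mem ≈ a * (batch_size * voxels) + b
# where a, b are fitted coefficients. Measurements are synthetic.

# Columns: batch size, cubic image side length, observed peak memory (GB).
runs = np.array([
    [1, 64, 10.7],
    [2, 64, 19.3],
    [1, 128, 71.2],
    [2, 128, 140.4],
    [4, 64, 36.6],
])
batch, side, peak = runs[:, 0], runs[:, 1], runs[:, 2]
voxels = side ** 3  # number of voxels per 3D input

# Design matrix for the assumed symbolic form; fit a, b by least squares.
X = np.column_stack([batch * voxels, np.ones_like(batch)])
coef, *_ = np.linalg.lstsq(X, peak, rcond=None)

# Mean Absolute Percentage Error of the fitted model on these runs.
pred = X @ coef
mape = np.mean(np.abs((peak - pred) / peak)) * 100
print(f"MAPE: {mape:.2f}%")
```

In a full symbolic regression pipeline, the functional form itself would also be searched over (e.g. with a genetic-programming library), rather than fixed in advance as done here for brevity.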