New approach reduces training time for deep learning networks by up to 69%
Researchers at North Carolina State University have developed a model that cuts the training time for deep learning networks by up to 69%.
A team led by Xipeng Shen, a professor of computer science at NC State, aimed to accelerate the AI training process, facilitating the development of deep learning solutions down the line.
Traditionally, training a deep learning network involves a computer breaking each data sample into chunks of consecutive data points, such as dividing a digital image into blocks of adjacent pixels, which are then run through a series of computational filters. Shen's approach instead groups similar data points together, allowing the network to run filters on multiple chunks of data at once.
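The release does not include code, but the reuse idea can be pictured with a short sketch. The snippet below is a hypothetical illustration rather than the team's implementation: it groups image patches whose values fall within an assumed distance-based similarity threshold, runs the filter once per group, and reuses that output for every patch in the group.

```python
import numpy as np

def reuse_filter_pass(patches, filt, similarity_threshold):
    """Apply a filter to data chunks (here, image patches), computing the
    result only once per group of sufficiently similar chunks."""
    reps = []         # representative patch for each group
    assignment = []   # group index for each incoming patch

    for patch in patches:
        # Reuse an existing group if its representative is close enough.
        match = next((g for g, rep in enumerate(reps)
                      if np.linalg.norm(patch - rep) < similarity_threshold), None)
        if match is None:
            reps.append(patch)
            match = len(reps) - 1
        assignment.append(match)

    # Run the filter once per group rather than once per patch ...
    group_out = [float(np.dot(rep.ravel(), filt.ravel())) for rep in reps]
    # ... and reuse each group's output for every patch assigned to it.
    return [group_out[g] for g in assignment], len(reps)

# Tiny demo: 1,000 patches drawn from 10 noisy prototypes, one 3x3 filter.
rng = np.random.default_rng(0)
prototypes = rng.standard_normal((10, 3, 3))
patches = prototypes[rng.integers(0, 10, 1000)] + 0.02 * rng.standard_normal((1000, 3, 3))
filt = rng.standard_normal((3, 3))
outputs, n_groups = reuse_filter_pass(patches, filt, similarity_threshold=0.5)
print(f"{len(outputs)} patch outputs computed with only {n_groups} filter evaluations")
```

In this toy setup the 1,000 patches collapse into roughly ten groups, so the filter is evaluated only about ten times instead of a thousand, which is the kind of saving the researchers exploit at much larger scale.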
Considering that datasets can consist of millions of data samples and training a network involves running the same dataset hundreds of times, the researchers’ method could save a lot of computing power.
“One of the biggest challenges facing the development of new AI tools is the amount of time and computing power it takes to train deep learning networks to identify and respond to the data patterns that are relevant to their applications,” Shen said in a release. “We’ve come up with a way to expedite that process, which we call Adaptive Deep Reuse.”
The team refined the model by first examining large chunks of data with a relatively lenient threshold for judging similarity between samples, then progressively shrinking the chunks and applying a more stringent similarity threshold. Shen and colleagues also designed an adaptive algorithm that makes these adjustments automatically during training.
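As a rough, hypothetical sketch rather than the published algorithm, that adaptive behavior can be pictured as a schedule that starts with large chunks and a lenient similarity test and gradually moves to small chunks and a strict one. Here similarity is assumed to be measured as a maximum allowed distance between chunks, and the linear interpolation and specific values are illustrative assumptions.

```python
def adaptive_reuse_settings(step, total_steps,
                            start_chunk=32, end_chunk=4,
                            loose_dist=2.0, tight_dist=0.2):
    """Illustrative schedule: begin with large chunks and a lenient
    similarity test (more reuse, coarser updates), then move toward
    small chunks and a strict test as training progresses."""
    progress = step / max(total_steps, 1)
    chunk_size = round(start_chunk + (end_chunk - start_chunk) * progress)
    max_distance = loose_dist + (tight_dist - loose_dist) * progress
    return chunk_size, max_distance

# Example: how the settings tighten over a 10,000-step training run.
for step in (0, 2500, 5000, 7500, 10000):
    print(step, adaptive_reuse_settings(step, total_steps=10000))
```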
The researchers tested the Adaptive Deep Reuse technique using three popular deep learning networks and datasets: CifarNet using Cifar10, AlexNet using ImageNet and VGG-19 using ImageNet. Their method cut training time for AlexNet by 69%, for VGG-19 by 68% and for CifarNet by 63%, all without sacrificing accuracy.
“This demonstrates that the technique drastically reduces training times,” Hui Guan, a PhD candidate at NC State and a researcher on the project, said in the release. “It also indicates that the larger the network, the more Adaptive Deep Reuse is able to reduce training times, since AlexNet and VGG-19 are both substantially larger than CifarNet.”
Shen, Guan and colleagues will present their findings at the IEEE International Conference on Data Engineering (ICDE), which began April 8 in Macau, China.