Microsoft and OneFlow Leverage Efficient Coding Principle to Design Unsupervised DNN Structure Learning That Outperforms Human-Designed Structures
The performance of deep neural networks (DNNs) depends heavily on their structures, and designing a good structure (aka architecture) typically requires considerable effort from human experts. A machine learning algorithm that can learn structures achieving performance comparable to the best human-designed ones is therefore increasingly attractive to machine learning researchers.
In the paper Learning Structures for Deep Neural Networks, a team from OneFlow and Microsoft explores unsupervised structure learning. Leveraging the efficient coding principle from information theory and computational neuroscience, they design a structure learning method that requires no labeled information, and empirically demonstrate that higher-entropy outputs in a deep neural network lead to better performance.
The researchers hypothesize that the optimal structure of a neural network can be derived from the characteristics of its inputs, without labels. Their study asks whether good DNN structures can be learned from scratch in a fully automatic manner, and what a principled method for achieving this goal would look like.
The team draws on a principle borrowed from the study of biological nervous systems, the efficient coding principle, which postulates that a good brain structure "forms an efficient internal representation of external environments." Applying this principle to DNN architectures, they propose that the structure of a well-designed network should match the statistical structure of its input signals.
The efficient coding principle suggests that the mutual information between a model's inputs and outputs should be maximized, and the team supports this with a theoretical result on optimal Bayesian classification. More precisely, they show that for any neural network whose top layer is a softmax linear classifier, independence among the nodes of the top hidden layer is a sufficient condition for that classifier to act as an optimal Bayesian classifier. This result not only supports the efficient coding principle but also provides a way to determine the depth of a DNN.
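To see why maximizing output entropy connects to efficient coding: for a deterministic network, the mutual information I(X; Y) between inputs and outputs equals the output entropy H(Y), so the objective reduces to producing high-entropy representations. The toy sketch below (purely illustrative, not from the paper) compares the empirical entropy of two binarized hidden representations: one whose units are redundant copies of each other, and one whose units respond to varied directions of the input.

```python
import numpy as np

def empirical_entropy(codes):
    """Shannon entropy (in bits) of the empirical distribution over code words."""
    _, counts = np.unique(codes, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 4))  # stand-in for input signals

# Three identical units: only code words 000 and 111 occur, so entropy <= 1 bit.
collapsed = (X @ np.ones((4, 3)) > 0).astype(int)
# Three varied units: more distinct code words, hence higher output entropy.
diverse = (X @ rng.standard_normal((4, 3)) > 0).astype(int)

print(empirical_entropy(collapsed))
print(empirical_entropy(diverse))
```

The redundant representation wastes capacity and yields low entropy; the varied one uses its three binary units to distinguish more input states, which is the behavior the efficient coding principle rewards.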
The team then investigates how to exploit the efficient coding principle in the design of a structure learning algorithm, and shows that sparse coding can implement the principle under the assumption of a zero-peaked, heavy-tailed prior distribution. This suggests that an effective structure learning algorithm can be designed based on global group sparse coding.
The proposed sparse-coding-based structure learning algorithm learns the structure layer by layer in a bottom-up manner: the raw features sit at layer one, and given a predefined number of nodes in layer two, the algorithm learns the connections between these two layers, and so on.
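The paper does not ship an implementation with this article, but the layer-wise idea can be sketched roughly as follows: learn a sparse-coding dictionary mapping a fixed number of hidden nodes to the raw features, then keep only the strong dictionary entries as inter-layer connections. All function names, hyperparameters, and the thresholding rule below are illustrative assumptions, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)

def ista_codes(X, D, lam=0.1, lr=0.01, steps=200):
    """Sparse codes for inputs X under dictionary D via ISTA
    (iterative shrinkage-thresholding for the L1-penalized objective)."""
    Z = np.zeros((X.shape[0], D.shape[0]))
    for _ in range(steps):
        grad = (Z @ D - X) @ D.T  # gradient of the reconstruction loss
        Z = Z - lr * grad
        Z = np.sign(Z) * np.maximum(np.abs(Z) - lr * lam, 0.0)  # soft-threshold
    return Z

def learn_layer(X, n_hidden=8, epochs=20):
    """Alternate sparse coding and dictionary updates, then prune weak
    dictionary entries to obtain a binary inter-layer connection mask."""
    D = rng.standard_normal((n_hidden, X.shape[1])) * 0.1
    for _ in range(epochs):
        Z = ista_codes(X, D)
        # Ridge-regularized least-squares dictionary update for stability.
        D = np.linalg.solve(Z.T @ Z + 1e-3 * np.eye(n_hidden), Z.T @ X)
        D /= np.maximum(np.linalg.norm(D, axis=1, keepdims=True), 1e-8)
    mask = (np.abs(D) > 0.05).astype(int)  # keep only strong connections
    return D, mask

X = rng.standard_normal((100, 16))  # stand-in for raw layer-one features
D, mask = learn_layer(X)
print(mask.shape)  # (8, 16): hidden-node-by-input connection mask
```

In the layer-by-layer scheme described above, the codes `Z` would then serve as the input features for learning the next layer's connections, repeating the same procedure upward.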
The researchers also describe how the proposed algorithm can learn inter-layer connections, handle invariance, and determine DNN depth. Finally, they conduct extensive experiments on the popular CIFAR-10 dataset to assess the classification accuracy of their proposed structure learning method, the role of inter-layer connections, and the roles of structure masks and network depth.
The results show that a learned single-layer structure achieves an accuracy of 63.0 percent, surpassing the single-layer baseline of 60.4 percent. In an inter-layer connection density evaluation, structures generated by the sparse coding approach outperform random structures and, at the same density level, also outperform a sparse restricted Boltzmann machine (RBM) baseline. In the structure mask evaluation, the structure prior provided by sparse coding is shown to improve performance. The network depth experiment empirically justifies the proposed approach of determining DNN depth via coding efficiency.
Overall, the research demonstrates the effectiveness of the efficient coding principle for unsupervised structure learning, and shows that the proposed sparse-coding-based structure learning algorithms can achieve performance comparable to the best human-designed structures.
The paper Learning Structures for Deep Neural Networks is on arXiv.
Author: Hecate Il | Editor: Michael Sarazen, Zhang Channel
We know you don’t want to miss any news or research breakthroughs. Subscribe to our popular newsletter Synced Global AI Weekly to get weekly AI updates.