Evolving Neural Architecture Using One Shot Model

Authors: Nilotpal Sinha, Kuan-Wen Chen
ACM SIGEVO Genetic and Evolutionary Computation Conference (GECCO) 2021

Abstract: Previous evolution-based architecture search methods require high computational resources, resulting in long search times. In this work, we propose a novel way of applying a simple genetic algorithm to the neural architecture search problem, called EvNAS (Evolving Neural Architecture using One Shot Model), which reduces the search time significantly while still achieving better results than previous evolution-based methods. The architectures are represented by the architecture parameters of a one-shot model, which results in weight sharing among the given population of architectures and weight inheritance from one generation of architectures to the next. We use the accuracy of a partially trained architecture on the validation data as a prediction of its fitness, which reduces the search time. We also propose a decoding technique for the architecture parameters that diverts the majority of the gradient information towards the given architecture and improves the fitness prediction of that architecture from the one-shot model during the search process. EvNAS searches for an architecture on CIFAR-10 in 3.83 GPU days on a single GPU with a top-1 test error of 2.47%, which is then transferred to CIFAR-100 and ImageNet, achieving a top-1 error of 16.37% and a top-5 error of 7.4%, respectively. [Full Paper] [CODE] [Video Link]
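For intuition, below is a minimal Python sketch of an EvNAS-style search loop: a population of real-valued architecture parameters is evolved with a simple genetic algorithm, and each individual's fitness is the validation accuracy predicted through a shared, partially trained one-shot model. The cell dimensions, GA operators, and the `one_shot_model.validate` / `train_step` interfaces are hypothetical stand-ins, not the paper's actual API.

```python
# A minimal sketch of an EvNAS-style search loop, assuming a DARTS-like
# cell encoding. `one_shot_model`, `val_loader`, and `train_step` are
# hypothetical stand-ins for the shared one-shot model, validation data,
# and the partial-training step; this is not the authors' API.
import numpy as np

N_EDGES, N_OPS = 14, 8        # assumed cell dimensions, not from the paper
POP_SIZE, GENERATIONS = 50, 30

def decode(alpha):
    """Discretize architecture parameters by picking the strongest op per
    edge. The paper's decoding additionally concentrates the gradient flow
    on the chosen ops; only the argmax selection is shown here."""
    return alpha.argmax(axis=1)

def fitness(alpha, one_shot_model, val_loader):
    """Predicted fitness: validation accuracy of the decoded architecture,
    evaluated with weights borrowed from the shared one-shot model."""
    return one_shot_model.validate(decode(alpha), val_loader)  # hypothetical call

def search(one_shot_model, val_loader, train_step):
    pop = [np.random.randn(N_EDGES, N_OPS) for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        train_step(one_shot_model, pop)   # partial training; the one-shot
                                          # weights persist across generations
        scores = [fitness(a, one_shot_model, val_loader) for a in pop]

        def tournament():
            i, j = np.random.randint(POP_SIZE, size=2)
            return pop[i] if scores[i] >= scores[j] else pop[j]

        nxt = []
        while len(nxt) < POP_SIZE:
            p1, p2 = tournament(), tournament()
            mask = np.random.rand(N_EDGES, N_OPS) < 0.5     # uniform crossover
            child = np.where(mask, p1, p2)
            child += 0.1 * np.random.randn(N_EDGES, N_OPS)  # Gaussian mutation
            nxt.append(child)
        pop = nxt
    return max(pop, key=lambda a: fitness(a, one_shot_model, val_loader))
```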

Neural Architecture Search using Covariance Matrix Adaptation Evolution Strategy

Authors: Nilotpal Sinha, Kuan-Wen Chen
submitted to Evolutionary Computation (MIT Press)

Abstract: Evolution-based neural architecture search requires high computational resources, resulting in long search times. In this work, we propose a framework for applying the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) to the neural architecture search problem, called CMANAS, which achieves better results than previous evolution-based methods while reducing the search time significantly. The architectures are modelled using a normal distribution, which is updated using CMA-ES based on the fitness of the sampled population. We use the accuracy of a trained one-shot model (OSM) on the validation data as a prediction of the fitness of an individual architecture, which reduces the search time. We also use an architecture-fitness table (AF table) to keep a record of already evaluated architectures, further reducing the search time. CMANAS finished the architecture search on CIFAR-10 with a top-1 test accuracy of 97.44% in 0.45 GPU days and on CIFAR-100 with a top-1 test accuracy of 83.24% in 0.6 GPU days on a single GPU. The top architectures from the searches on CIFAR-10 and CIFAR-100 were then transferred to ImageNet, achieving top-5 accuracies of 92.6% and 92.1%, respectively. [Evolutionary Computation (MIT Press)] [arXiv] [CODE]
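The core CMANAS loop can be sketched with the off-the-shelf `cma` package: architectures are sampled from a multivariate normal distribution that CMA-ES adapts from the sampled population's fitness, and the AF table (a plain dictionary here) avoids re-evaluating architectures. `osm_accuracy` is a hypothetical stub for the validation accuracy of an architecture under the trained one-shot model; the cell dimensions are assumed, not taken from the paper.

```python
# A sketch of the CMANAS loop built on the off-the-shelf `cma` package
# (pip install cma). `osm_accuracy` is a hypothetical stub returning the
# validation accuracy of an architecture under the trained one-shot model.
import cma
import numpy as np

N_EDGES, N_OPS = 14, 8   # assumed cell dimensions, not from the paper
af_table = {}            # AF table: decoded architecture -> cached fitness

def decode(x):
    """Map a real-valued sample to a discrete architecture (one op per edge)."""
    return tuple(np.asarray(x).reshape(N_EDGES, N_OPS).argmax(axis=1))

def neg_fitness(x, osm_accuracy):
    arch = decode(x)
    if arch not in af_table:           # evaluate each architecture only once
        af_table[arch] = osm_accuracy(arch)
    return -af_table[arch]             # CMA-ES minimizes, so negate accuracy

def cmanas(osm_accuracy, iterations=100):
    # The distribution over architectures starts as N(0, 0.5^2 * I) and is
    # adapted by CMA-ES from the fitness of each sampled population.
    es = cma.CMAEvolutionStrategy(np.zeros(N_EDGES * N_OPS), 0.5)
    for _ in range(iterations):
        xs = es.ask()                                           # sample population
        es.tell(xs, [neg_fitness(x, osm_accuracy) for x in xs])
    return decode(es.result.xbest)     # best architecture seen during the search
```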

The following is a visualization of the architecture search on the CIFAR-10 dataset by the weight-sharing-based CMANAS. Here we visualize the mean of the normal distribution and its progression during the architecture search.

Similarly, the following is a visualization of the architecture search on the CIFAR-10 dataset by the non-weight-sharing-based CMANAS.

Neural Architecture Search using Progressive Evolution

Authors: Nilotpal Sinha, Kuan-Wen Chen
ACM SIGEVO Genetic and Evolutionary Computation Conference (GECCO) 2022

Abstract: Vanilla neural architecture search using evolutionary algorithms (EA) involves evaluating each architecture by training it from scratch, which is extremely time-consuming. This cost can be reduced by using a supernet to estimate the fitness of every architecture in the search space, owing to its weight-sharing nature. However, the estimated fitness is very noisy due to the co-adaptation of the operations in the supernet. In this work, we propose a method called pEvNAS, wherein the whole neural architecture search space is progressively reduced to smaller search-space regions containing good architectures. This is achieved by using a trained supernet for architecture evaluation during a genetic-algorithm-based search that identifies search-space regions with good architectures. Upon reaching the final reduced search space, the supernet is used to search for the best architecture in that space using evolution. The search is further enhanced by weight inheritance, wherein the supernet for the smaller search space inherits its weights from the previously trained supernet for the larger search space. Experimentally, pEvNAS gives better results on CIFAR-10 and CIFAR-100 while using significantly fewer computational resources compared to previous EA-based methods. [arXiv] [Paper] [CODE]
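The progressive reduction with weight inheritance can be illustrated with a small self-contained toy: at each stage, the lowest-scoring operations on every edge are dropped, and the surviving operations carry their weights over to the next, smaller supernet. In the paper, operations are scored via a GA that uses the trained supernet as a fitness estimator; `score_op` below is a random stand-in so the control flow runs end to end, and the operation set and edge count are assumed.

```python
# A self-contained toy of progressive search-space reduction with weight
# inheritance. `score_op` randomly stands in for the GA-based evaluation
# through the trained supernet described in the paper.
import random

OPS = ["skip", "sep3x3", "sep5x5", "dil3x3", "dil5x5", "maxpool", "avgpool", "none"]
N_EDGES = 14  # assumed number of cell edges

def score_op(edge, op, weights):
    """Stand-in for GA-based scoring of an operation through the supernet."""
    return random.random()

def progressive_search(stages=3, keep_frac=0.5):
    space = {e: list(OPS) for e in range(N_EDGES)}  # full search space
    # Placeholder supernet weights, one entry per (edge, operation).
    weights = {(e, op): f"w[{e},{op}]" for e in range(N_EDGES) for op in OPS}
    for _ in range(stages):
        new_space, new_weights = {}, {}
        for e in range(N_EDGES):
            ranked = sorted(space[e], reverse=True,
                            key=lambda op: score_op(e, op, weights))
            kept = ranked[:max(1, int(len(ranked) * keep_frac))]
            new_space[e] = kept
            # Weight inheritance: the smaller supernet reuses the weights
            # of the operations that survived the reduction.
            for op in kept:
                new_weights[(e, op)] = weights[(e, op)]
        space, weights = new_space, new_weights
    return space  # evolution then searches this final reduced space

if __name__ == "__main__":
    print(progressive_search())
```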

Novelty Driven Evolutionary Neural Architecture Search

Authors: Nilotpal Sinha, Kuan-Wen Chen
ACM SIGEVO Genetic and Evolutionary Computation Conference (GECCO) 2022

Abstract: Evolutionary algorithm (EA) based neural architecture search (NAS) involves evaluating each architecture by training it from scratch, which is extremely time-consuming. This cost can be reduced by using a supernet to estimate the fitness of an architecture, owing to the weight sharing among all architectures in the search space. However, the estimated fitness is very noisy due to the co-adaptation of the operations in the supernet, which causes NAS methods to get trapped in local optima. In this paper, we propose a method called NEvoNAS, wherein the NAS problem is posed as a multi-objective problem with two objectives: (i) maximize architecture novelty, and (ii) maximize architecture fitness/accuracy. Novelty search is used to maintain a diverse set of solutions at each generation, which helps avoid local optima, while the architecture fitness is calculated using the supernet. NSGA-II is used to find the Pareto-optimal front for the NAS problem, and the best architecture in the Pareto front is returned as the searched architecture. Experimentally, NEvoNAS gives better results on two different search spaces while using significantly fewer computational resources compared to previous EA-based methods. [arXiv] [Poster] [CODE]
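Below is a minimal sketch of the two objectives, assuming architectures are encoded as integer vectors of per-edge operation choices (an assumption for illustration): novelty is the mean distance to the k nearest neighbours in an archive of previously seen architectures, and fitness comes from the supernet, stubbed here as `supernet_accuracy`. An off-the-shelf NSGA-II implementation (e.g. pymoo) would then evolve the population against these two objectives and return the Pareto front, from which the most accurate architecture is taken.

```python
# A minimal sketch of the two NEvoNAS objectives. Novelty is the mean
# Hamming distance to the k nearest neighbours in an archive of past
# architectures; `supernet_accuracy` is a hypothetical fitness stub.
import numpy as np

def novelty(arch, archive, k=10):
    """Mean distance from `arch` to its k nearest archive members."""
    if not archive:
        return float("inf")
    dists = sorted(int(np.sum(arch != a)) for a in archive)
    return float(np.mean(dists[:k]))

def objectives(arch, archive, supernet_accuracy):
    """NSGA-II minimizes, so both maximization objectives are negated."""
    return (-novelty(arch, archive), -supernet_accuracy(arch))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy_accuracy = lambda a: float((a == 3).mean())  # toy stand-in fitness
    archive = [rng.integers(0, 8, size=14) for _ in range(5)]
    pop = [rng.integers(0, 8, size=14) for _ in range(20)]
    print([objectives(a, archive, toy_accuracy) for a in pop[:3]])
```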