This repo provides an up-to-date list of progress made in neural architecture search (NAS), including but not limited to papers, datasets, codebases, and frameworks. Please feel free to open an issue to add new progress.
Note: The papers are grouped by publication year. Within each group, papers are sorted by citation count. An underlined paper marks a milestone in the field. For third-party code, PyTorch implementations are preferred. If you are interested in manually-designed architectures, please refer to my other repo, awesome-vision-architecture.
- FairNAS: Rethinking Evaluation Fairness of Weight Sharing Neural Architecture Search | Cited by 167 | ICCV 2021 | Xiaomi AI Lab | FairNAS | Supernet Training | PDF | Official Code (Stars 297). TL;DR: Based on the inherent unfairness in supernet training, the authors propose two levels of constraints: expectation fairness and strict fairness. In particular, strict fairness ensures equal optimization opportunities for all choice blocks throughout training, which neither overestimates nor underestimates their capacity.
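
  A minimal sketch of the strict-fairness sampling idea, under the assumption of a supernet with `num_layers` layers, each offering `num_choices` candidate blocks; the function and variable names are illustrative, not from the official code:

  ```python
  import random

  def strict_fairness_paths(num_layers: int, num_choices: int):
      """Yield `num_choices` single-path architectures per training step such that
      every choice block in every layer is activated exactly once (strict fairness)."""
      # Independently shuffle the choice indices of each layer.
      perms = [random.sample(range(num_choices), num_choices) for _ in range(num_layers)]
      # The k-th path takes the k-th shuffled choice in every layer.
      for k in range(num_choices):
          yield [perms[layer][k] for layer in range(num_layers)]

  # Example: one training step over a 4-layer supernet with 3 choices per layer.
  for path in strict_fairness_paths(num_layers=4, num_choices=3):
      print(path)  # e.g. [2, 0, 1, 2]; gradients for each path can be accumulated
                   # and the supernet updated once per step.
  ```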
- Zero-Cost Proxies for Lightweight NAS | Cited by 51 | ICLR 2021 | Samsung AI Center, Cambridge | Zero-Cost NAS | PDF | Official Code (Stars 80). TL;DR: In this paper, the authors evaluate conventional reduced-training proxies and quantify how well they preserve ranking between neural network models during search when compared with the rankings produced by final trained accuracy.
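
  A small illustration of how a proxy's ranking fidelity can be quantified, here with Spearman's rank correlation from SciPy; the proxy scores and accuracies below are made-up numbers, not results from the paper:

  ```python
  # Spearman's rank correlation between proxy scores (e.g., a zero-cost metric
  # computed from a single minibatch) and final trained accuracies.
  from scipy.stats import spearmanr

  proxy_scores = [0.12, 0.45, 0.33, 0.80, 0.27]      # hypothetical proxy values, one per candidate
  final_accuracies = [71.2, 74.9, 73.1, 76.4, 72.0]  # hypothetical trained accuracies (%)

  rho, _ = spearmanr(proxy_scores, final_accuracies)
  print(f"Spearman rank correlation: {rho:.3f}")  # 1.0 means the proxy preserves the ranking perfectly
  ```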
- AutoFormer: Searching Transformers for Visual Recognition | Cited by 32 | ICCV 2021 | Stony Brook University | Microsoft Research Asia | AutoFormer | PDF | Official Code (Stars 554). TL;DR: The authors propose a new one-shot architecture search framework, namely AutoFormer, dedicated to vision transformer search. AutoFormer entangles the weights of different blocks in the same layers during supernet training. The performance of these subnets with weights inherited from the supernet is comparable to those retrained from scratch.
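
  A toy sketch of the weight-entanglement idea, assuming a simplified linear layer whose smaller candidates reuse slices of the largest candidate's weight tensor; shapes and names are illustrative, not the official implementation:

  ```python
  import torch
  import torch.nn as nn

  class EntangledLinear(nn.Module):
      """Candidate blocks of different sizes share one weight tensor instead of separate copies."""
      def __init__(self, max_in=512, max_out=512):
          super().__init__()
          self.weight = nn.Parameter(torch.randn(max_out, max_in) * 0.02)
          self.bias = nn.Parameter(torch.zeros(max_out))

      def forward(self, x, out_dim):
          # A smaller candidate uses the leading slice of the largest candidate's weights,
          # so training any sub-block also updates the shared parameters.
          w = self.weight[:out_dim, :x.size(-1)]
          b = self.bias[:out_dim]
          return nn.functional.linear(x, w, b)

  layer = EntangledLinear()
  out_small = layer(torch.randn(4, 384), out_dim=192)  # one candidate embedding dim
  out_large = layer(torch.randn(4, 512), out_dim=512)  # another candidate, same shared weights
  ```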
- Vision Transformer Architecture Search | Cited by 10 | arXiv 2021 | The University of Sydney | SenseTime Research | Vision Transformer | Superformer | PDF | Official Code (Stars 40). TL;DR: This paper presents a new cyclic weight-sharing mechanism for the token embeddings of Vision Transformers, which enables each channel to contribute more evenly to all candidate architectures.
- Searching the Search Space of Vision Transformer | Cited by 0 | NeurIPS 2021 | Institute of Automation, CAS | Microsoft Research | AutoFormerV2 | S3 | PDF | Official Code (Stars 554). TL;DR: The authors propose to use neural architecture search to automate the design of the search space itself, searching not only the architecture but also the search space. The central idea is to gradually evolve different search dimensions guided by their E-T Error computed using a weight-sharing supernet.
- Once-for-All: Train One Network and Specialize it for Efficient Deployment | Cited by 508 | ICLR 2020 | Massachusetts Institute of Technology | Once-for-All | OFA | PDF | Official Code (Stars 1.5k). TL;DR: Conventional NAS approaches find a specialized neural network and need to train it from scratch for each deployment case. The authors propose to train a once-for-all (OFA) network that supports diverse architectural settings by decoupling training and search to reduce the cost; a specialized sub-network is then quickly obtained by selecting from the OFA network without additional training.
- Designing Network Design Spaces | Cited by 476 | CVPR 2020 | FAIR | RegNet | PDF | Third-party Code (Stars 162). TL;DR: Instead of focusing on designing individual network instances, the authors design network design spaces that parametrize populations of networks. The overall process is analogous to classic manual design of networks, but elevated to the design space level.
- Single Path One-Shot Neural Architecture Search with Uniform Sampling | Cited by 436 | ECCV 2020 | MEGVII Technology | SPOS | Supernet Training | PDF | Third-party Code (Stars 208). TL;DR: The authors construct a simplified supernet in which all architectures are single paths, so that the weight co-adaptation problem is alleviated. Training is performed by uniform path sampling, and all architectures (and their weights) are trained fully and equally.
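
  A minimal PyTorch sketch of single-path supernet training with uniform path sampling; the toy model, shapes, and loss below are illustrative, not the paper's actual setup:

  ```python
  import random
  import torch
  import torch.nn as nn

  class SinglePathSupernet(nn.Module):
      """Toy single-path supernet: each layer holds several candidate blocks,
      and a forward pass activates exactly one block per layer."""
      def __init__(self, num_layers=4, num_choices=3, width=16):
          super().__init__()
          self.layers = nn.ModuleList(
              nn.ModuleList(nn.Linear(width, width) for _ in range(num_choices))
              for _ in range(num_layers)
          )

      def forward(self, x, path):
          for layer, choice in zip(self.layers, path):
              x = torch.relu(layer[choice](x))
          return x

  supernet = SinglePathSupernet()
  optimizer = torch.optim.SGD(supernet.parameters(), lr=0.1)

  # One training step with a uniformly sampled single path.
  x, y = torch.randn(8, 16), torch.randn(8, 16)
  path = [random.randrange(3) for _ in range(4)]   # uniform path sample
  loss = nn.functional.mse_loss(supernet(x, path), y)
  optimizer.zero_grad()
  loss.backward()
  optimizer.step()
  ```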
- FBNetV2: Differentiable Neural Architecture Search for Spatial and Channel Dimensions | Cited by 151 | CVPR 2020 | UC Berkeley | Facebook Inc. | FBNetV2 | PDF | Official Code (Stars 724). TL;DR: The authors propose a memory- and computation-efficient DNAS variant: DMaskingNAS. This algorithm expands the search space by up to 10^14x over conventional DNAS, supporting searches over spatial and channel dimensions that are otherwise prohibitively expensive: input resolution and number of filters.
- EcoNAS: Finding Proxies for Economical Neural Architecture Search | Cited by 56 | CVPR 2020 | The University of Sydney | SenseTime Computer Vision Research Group | EcoNAS | PDF. TL;DR: The authors observe that most existing proxies exhibit different behaviors in maintaining the rank consistency among network candidates, and that some proxies are more reliable than others. Inspired by these observations, the authors present a reliable proxy and further formulate a hierarchical proxy strategy that spends more computation on candidate networks that are potentially more accurate.
- FBNetV3: Joint Architecture-Recipe Search using Neural Acquisition Function | Cited by 41 | arXiv 2020 | Facebook Inc. | UC Berkeley | UNC Chapel Hill | FBNetV3 | PDF. TL;DR: Previous NAS methods search for architectures under one set of training hyper-parameters (i.e., a training recipe), overlooking superior architecture-recipe combinations. To address this, the paper presents Neural Architecture-Recipe Search (NARS), which searches both architectures and their corresponding training recipes simultaneously.
- Semi-Supervised Neural Architecture Search | Cited by 27 | NeurIPS 2020 | University of Science and Technology of China | Microsoft Research Asia | Semi-Supervised NAS | PDF. TL;DR: Neural architecture search (NAS) relies on a good controller to generate promising architectures. However, training the controller requires both abundant and high-quality pairs of architectures and their accuracies, which are costly to obtain. In this paper, the authors propose SemiNAS, a semi-supervised NAS approach that leverages numerous unlabeled architectures (without evaluation and thus nearly no cost).
- DARTS: Differentiable Architecture Search | Cited by 2.5k | ICLR 2019 | Carnegie Mellon University | Google DeepMind | DARTS | PDF | Official Code (Stars 3.6k). TL;DR: This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Unlike conventional approaches that apply evolution or reinforcement learning over a discrete, non-differentiable search space, the proposed method is based on a continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent.
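
  A short sketch of the continuous relaxation at the heart of DARTS: each edge computes a softmax-weighted mixture of candidate operations, so architecture parameters can be learned by gradient descent. The operations and sizes below are illustrative, and the bilevel optimization of the full method is omitted:

  ```python
  import torch
  import torch.nn as nn
  import torch.nn.functional as F

  class MixedOp(nn.Module):
      """Continuous relaxation of a categorical op choice:
      the edge output is a softmax-weighted sum of all candidate operations."""
      def __init__(self, ops):
          super().__init__()
          self.ops = nn.ModuleList(ops)
          # One architecture parameter (alpha) per candidate operation.
          self.alpha = nn.Parameter(torch.zeros(len(ops)))

      def forward(self, x):
          weights = F.softmax(self.alpha, dim=0)
          return sum(w * op(x) for w, op in zip(weights, self.ops))

  # Example edge with three candidate operations on 16-channel feature maps.
  edge = MixedOp([
      nn.Conv2d(16, 16, 3, padding=1),
      nn.Conv2d(16, 16, 5, padding=2),
      nn.Identity(),
  ])
  out = edge(torch.randn(2, 16, 8, 8))  # alphas are learned jointly with the weights by gradient descent
  ```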
- Regularized Evolution for Image Classifier Architecture Search | Cited by 2.0k | AAAI 2019 | Google Brain | Evolution | AmoebaNet | PDF | Official Code (Stars 23.4k). TL;DR: The authors evolve an image classifier, AmoebaNet-A, that surpasses hand-designs for the first time. To do this, they modify the tournament selection evolutionary algorithm by introducing an age property to favor the younger genotypes.
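
  A compact sketch of aging (regularized) evolution with tournament selection; `evaluate`, `random_arch`, and `mutate` are placeholders the caller must supply, and the toy usage at the end is purely illustrative:

  ```python
  import collections
  import random

  def aging_evolution(evaluate, random_arch, mutate, cycles=200, population_size=20, sample_size=5):
      """Tournament selection plus a FIFO population: each cycle removes the oldest
      genotype rather than the worst one (the 'age' regularization)."""
      population = collections.deque()
      history = []
      # Seed the population with random architectures.
      while len(population) < population_size:
          arch = random_arch()
          population.append((arch, evaluate(arch)))
      # Evolve: mutate the best of a random sample, then age out the oldest member.
      for _ in range(cycles):
          sample = random.sample(list(population), sample_size)
          parent = max(sample, key=lambda pair: pair[1])
          child = mutate(parent[0])
          population.append((child, evaluate(child)))
          population.popleft()              # aging: discard the oldest, not the worst
          history.append(population[-1])
      return max(history, key=lambda pair: pair[1])

  # Toy usage: "architectures" are bit strings, fitness counts ones.
  best = aging_evolution(
      evaluate=lambda a: sum(a),
      random_arch=lambda: [random.randint(0, 1) for _ in range(10)],
      mutate=lambda a: [b ^ 1 if random.random() < 0.1 else b for b in a],
  )
  print(best)
  ```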
- MnasNet: Platform-Aware Neural Architecture Search for Mobile | Cited by 1.8k | CVPR 2019 | Google Brain | Google Inc. | MNASNet | PDF | Official Code (Stars 4.8k). TL;DR: The authors propose an automated mobile neural architecture search (MNAS) approach that explicitly incorporates model latency into the main objective, where latency is directly measured as real-world inference latency by executing the model on mobile phones.
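
  A one-function sketch of a latency-aware reward in the spirit of the paper's objective, reward = ACC * (LAT / TARGET) ** w; the default target latency and exponent below are illustrative placeholders, not guaranteed to match the paper's exact settings:

  ```python
  def latency_aware_reward(accuracy: float, latency_ms: float,
                           target_ms: float = 75.0, w: float = -0.07) -> float:
      """Soft latency penalty: models slower than the target are discounted,
      faster ones are mildly rewarded (w is negative)."""
      return accuracy * (latency_ms / target_ms) ** w

  print(latency_aware_reward(accuracy=0.75, latency_ms=90.0))  # slower than target -> discounted
  print(latency_aware_reward(accuracy=0.75, latency_ms=60.0))  # faster than target -> boosted
  ```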
- ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware | Cited by 1.2k | ICLR 2019 | Massachusetts Institute of Technology | ProxylessNAS | PDF | Official Code (Stars 1.3k). TL;DR: This paper presents ProxylessNAS, which directly learns architectures for large-scale target tasks and target hardware platforms. The proposed method addresses the high memory consumption of differentiable NAS and reduces the computational cost (GPU hours and GPU memory) to the same level as regular training, while still allowing a large candidate set.
- FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search | Cited by 811 | CVPR 2019 | UC Berkeley | Princeton University | Facebook Inc. | FBNet | Latency Table | PDF | Official Code (Stars 724). TL;DR: The authors propose a differentiable neural architecture search (DNAS) framework that uses gradient-based methods to optimize ConvNet architectures, directly considering latency on target devices.
- Efficient Neural Architecture Search via Parameters Sharing | Cited by 1.9k | ICML 2018 | Google Brain | Carnegie Mellon University | ENAS | Reinforcement Learning | PDF | Third-party Code (Stars 2.6k). TL;DR: The proposed method (ENAS) constructs a large computational graph (a supernet), where each subgraph represents a neural network architecture, hence forcing all architectures to share their parameters. Evaluating candidate architectures with these subgraphs and their corresponding parameters leads to far lower GPU cost (1000x less expensive than existing methods).
- Neural Architecture Search with Reinforcement Learning | Cited by 4.0k | ICLR 2017 | Google Brain | Reinforcement Learning | PDF | Third-party Code (Stars 395). TL;DR: This pioneering work exploits the paradigm of reinforcement learning (RL) to solve the NAS problem. Specifically, the authors use a recurrent network to generate the model descriptions of neural networks and train this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set.
- Designing Neural Network Architectures using Reinforcement Learning | Cited by 1.2k | ICLR 2017 | Massachusetts Institute of Technology | Q-learning | Reinforcement Learning | PDF | Official Code (Stars 127). TL;DR: The authors introduce MetaQNN, a meta-modeling algorithm based on reinforcement learning that automatically generates high-performing CNN architectures for a given learning task. The learning agent is trained to sequentially choose CNN layers using Q-learning with an ε-greedy exploration strategy and experience replay.
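
  A toy sketch of the tabular Q-learning update with ε-greedy exploration that drives MetaQNN; the states, actions, and reward below are illustrative, not the paper's actual state space:

  ```python
  import random

  Q = {}  # Q[(state, action)] -> estimated value

  def choose_action(state, actions, epsilon=0.1):
      if random.random() < epsilon:                                   # explore
          return random.choice(actions)
      return max(actions, key=lambda a: Q.get((state, a), 0.0))       # exploit

  def q_update(state, action, reward, next_state, next_actions, alpha=0.1, gamma=1.0):
      best_next = max((Q.get((next_state, a), 0.0) for a in next_actions), default=0.0)
      old = Q.get((state, action), 0.0)
      Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

  # Example transition: after a conv layer, choose the next layer type; the validation
  # accuracy of the fully specified, trained network serves as the (delayed) reward.
  a = choose_action("conv3x3", ["conv5x5", "maxpool", "terminate"])
  q_update("conv3x3", a, reward=0.72, next_state=a, next_actions=["terminate"])
  ```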
- Neural Architecture Search: A Survey | Cited by 1.5k | JMLR 2019 | University of Freiburg | Survey | PDF. TL;DR: The authors provide an overview of existing work in this field of research and categorize it according to three dimensions: search space, search strategy, and performance estimation strategy.
- NAS-Bench-101 | Download Link. TL;DR: This dataset contains 423,624 unique neural networks exhaustively generated and evaluated from a fixed graph-based search space. Each network is trained and evaluated multiple times on CIFAR-10 at various training budgets, and the metrics are presented in a queryable API. The current release contains over 5 million trained and evaluated models. How to cite: NAS-Bench-101: Towards Reproducible Neural Architecture Search | Cited by 340 | ICML 2019 | Google Brain | NAS-Bench-101 | PDF.
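
  A short sketch of querying the benchmark through the public `nasbench` package (github.com/google-research/nasbench); the file path is a placeholder, and the exact op strings and returned field names should be checked against that repository:

  ```python
  from nasbench import api

  # Load the downloaded tfrecord (path is a placeholder).
  nasbench = api.NASBench('/path/to/nasbench_only108.tfrecord')

  spec = api.ModelSpec(
      # Upper-triangular adjacency matrix of the 7-node cell DAG.
      matrix=[[0, 1, 1, 1, 0, 1, 0],
              [0, 0, 0, 0, 0, 0, 1],
              [0, 0, 0, 0, 0, 0, 1],
              [0, 0, 0, 0, 1, 0, 0],
              [0, 0, 0, 0, 0, 0, 1],
              [0, 0, 0, 0, 0, 0, 1],
              [0, 0, 0, 0, 0, 0, 0]],
      ops=['input', 'conv1x1-bn-relu', 'conv3x3-bn-relu', 'conv3x3-bn-relu',
           'conv3x3-bn-relu', 'maxpool3x3', 'output'])

  metrics = nasbench.query(spec)            # a table lookup, no training involved
  print(metrics['validation_accuracy'])
  ```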
- D-X-Y/AutoDL-Projects | Stars 1.4k | AutoDL. TL;DR: Automated Deep Learning Projects (AutoDL-Projects) is an open-source, lightweight, but useful project for everyone. It implements several neural architecture search (NAS) and hyper-parameter optimization (HPO) algorithms.
- CVPR 2021 Workshop - 1st Lightweight NAS Challenge and Moving Beyond | NAS Challenge. TL;DR: The goals of this workshop are to 1) bring together emerging research in NAS and related areas to discuss open challenges and opportunities ahead, and 2) benchmark lightweight NAS in a systematic and realistic way. The workshop provides the winning solutions of the lightweight NAS competitions and the corresponding submitted papers.