2019 is coming to an end. This year the flurry of papers grew denser than ever, with researchers around the world exploring new problems. Drawing on the 2019 paper lists from TopBots, Heartbeat, New World AI, and others, we summarize the representative, academically influential, and thought-provoking AI papers published in 2019. Some of them improve existing technical ideas, some deepen our understanding of machine learning and deep learning as a whole, and some test new hypotheses and open up new directions for exploration.
Of course, many papers had significant academic value this year, and the following is just the tip of the iceberg. If you think other papers are equally worth reviewing, please leave a comment and discuss them with us.
In addition, we have prepared a “Top 10 Novel Papers of 2019” list, which covers the year’s particularly interesting, even offbeat, papers, including some that drew criticism.
Top 10 academic papers of 2019 (in alphabetical order by title)
A Style-Based Generator Architecture for Generative Adversarial Networks (CVPR 2019)
A style-based GAN generator architecture
Authors: Tero Karras, Samuli Laine, Timo Aila (NVIDIA)
Reason for recommendation: StyleGAN was undoubtedly the hottest GAN model of 2019. Before StyleGAN, GAN research had run into several difficulties: conditional generation was hard, simply increasing model size brought limited benefit, and realistic high-resolution images were out of reach. StyleGAN broke through this bottleneck, bringing significant improvements in controlling how different attributes are combined during generation and in producing high-resolution, sharp, and consistent images. For this, StyleGAN received a Best Paper Honorable Mention at CVPR 2019.
StyleGAN generated a lot of discussion online. Its striking face-generation results not only amazed onlookers but also prompted many people to write their own implementations and open them up for everyone to try, including generators for faces (thispersondoesnotexist.com), cats (thiscatdoesnotexist.com), anime girls (thiswaifudoesnotexist.net), and room photos (thisairbnbdoesnotexist.com).
Just recently, NVIDIA researchers, including the original authors of the paper, published StyleGAN2 (Analyzing and Improving the Image Quality of StyleGAN, arxiv.org/abs/1912.04958), which specifically corrects the artifacts in StyleGAN-generated images and improves the consistency of elements within an image, taking image generation quality to a new peak.
Paper address: StyleGAN arxiv.org/abs/1812.04948, StyleGAN2 arxiv.org/abs/1912.04958
Code Open Source: https://github.com/NVlabs/stylegan2
Bridging the Gap between Training and Inference for Neural Machine Translation (ACL 2019)
Bridging the gap between training and inference in neural machine translation
Authors: Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences; WeChat AI Pattern Recognition Center; Worcester Polytechnic Institute; Huawei Noah’s Ark Lab
Reason for recommendation: During training, a neural machine translation model predicts each word given the ground-truth context, but at inference time (the actual translation process) it must generate the entire sentence from scratch, conditioning on its own earlier outputs. This discrepancy has long been a common problem in sequence-to-sequence tasks. The paper studies the discrepancy and explores how to bridge it.
The authors’ solution is to switch the context between words taken from the reference text and words pre-selected from the decoder’s own output. The experiments are well done and the results are convincing. According to the ACL 2019 best paper committee, the approach fits the current teacher-forcing training paradigm and also improves on scheduled sampling; it may influence future research and applications not only in machine translation, the task it targets, but also in sequence-to-sequence models more generally. The paper was selected as the Best Long Paper at ACL 2019.
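The switching idea can be illustrated with a minimal sketch. It is a simplification, not the authors’ implementation: the decoder output is a fixed list, the helper names are hypothetical, and the inverse-sigmoid decay with `mu = 12.0` is an assumed schedule chosen only so that the model sees mostly reference words early in training and mostly its own predictions later.

```python
import math
import random


def ground_truth_probability(epoch, mu=12.0):
    """Probability of feeding the reference word at a given epoch.

    Inverse-sigmoid decay: close to 1 at the start of training and
    approaching 0 later. The value of mu is an illustrative assumption.
    """
    return mu / (mu + math.exp(epoch / mu))


def mix_context(reference, predicted, p_truth, rng):
    """Choose each context word from the reference with probability
    p_truth, otherwise from the decoder's own predicted words."""
    return [ref if rng.random() < p_truth else pred
            for ref, pred in zip(reference, predicted)]


rng = random.Random(0)
reference = ["the", "cat", "sat", "on", "the", "mat"]
predicted = ["a", "cat", "sits", "on", "a", "rug"]  # stand-in for decoder output

early = mix_context(reference, predicted, ground_truth_probability(0), rng)
late = mix_context(reference, predicted, ground_truth_probability(100), rng)
print("epoch 0:  ", early)
print("epoch 100:", late)
```

How the predicted words are selected (the paper’s pre-selection step) is deliberately left out here; only the mixing of the two context sources is shown.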
Grandmaster level in StarCraft II using multi-agent reinforcement learning (Nature)
Reaching the “Grandmaster” tier in StarCraft II with multi-agent reinforcement learning
Authors: Oriol Vinyals, Demis Hassabis, Chris Apps, David Silver, et al. (DeepMind)
Reason for recommendation: DeepMind’s StarCraft II AI “AlphaStar” debuted in January 2019 and beat human professional players. Although the rules of those games clearly favored the AI, it was already apparent that the AI won not through speed but primarily through good strategy. Later, in large-scale 1v1 human-versus-machine matches played under fair rules, AlphaStar continued to perform well and reached the “Grandmaster” tier, which covers roughly the top 0.15% of all active players. This was also the last experiment needed for the AlphaStar paper, which was published in Nature in October 2019.
This is certainly not the first time AI has outperformed humans at a game, but DeepMind did not build AlphaStar simply by throwing a lot of computing power at the problem (as with other game AIs). Designs such as population-based reinforcement learning (evolving a population while retaining many different strategies) also mitigate problems common in reinforcement learning practice and improve the agent’s performance in complex environments. Solutions for long-sequence modeling under incomplete information and for high-dimensional continuous action spaces are becoming more and more mature.
Paper address: https://www.nature.com/articles/s41586-019-1724-z (open-access version: https://storage.googleapis.com/deepmind-media/research/alphastar/AlphaStar_unformatted.pdf)
Learning the Depths of Moving People by Watching Frozen People (CVPR 2019)
Learn to predict the depth of a moving person by observing a still person
Authors: Zhengqi Li, Tali Dekel, Forrester Cole, Richard Tucker, Noah Snavely, Ce Liu, William T. Freeman
Reason for recommendation: The task this paper tackles, estimating the depth of moving objects from a single camera, may seem impossible at first glance. The paper uses a very clever method. On the one hand, the authors take “frozen in time” videos uploaded by YouTube users as a dataset. These videos provide a huge, natural collection of scenes in which people hold still while the camera moves through three-dimensional space; depth recovered from them with traditional reconstruction methods can serve as label data, sparing the pain of collecting it. This is also a reminder that, besides collecting datasets through traditional crowdsourcing, much publicly available data on the web can become valuable training data after processing.
On the other hand, while using deep models to learn spatial common sense and to predict depth, the authors add extra structure that lets the network extract information about changes between adjacent frames, improving its ability to handle moving objects. The end result is a model that outputs stable, highly accurate depth predictions from a single camera viewpoint and works very well on moving objects too. The paper also received a Best Paper Honorable Mention at CVPR 2019.
Code Open Source: https://github.com/google/mannequinchallenge
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks (ICLR 2019)
The lottery ticket hypothesis: finding sparse, trainable neural networks
Authors: Jonathan Frankle, Michael Carbin (MIT Computer Science and Artificial Intelligence Laboratory)
Reason for recommendation: As technical routes for shrinking networks and reducing their demand for computing resources, network sparsification and knowledge distillation have received more and more attention. Currently the most common sparsification method is to train a large network and then prune it; the resulting sparse network can match the performance of the dense one.
Since sparse networks can match dense ones, the authors make a bold hypothesis: the desired sparse network already exists inside the dense network, and we only need to find it. More specifically, if training a randomly initialized dense network for n iterations yields a well-trained network, then there is a sparse subnetwork that, trained from the same random initialization for a similar number of iterations, performs comparably. Finding that sparse network, however, depends heavily on good initial values, and getting good initial values at random is like drawing a winning lottery ticket. This is the “lottery ticket hypothesis” at the heart of the paper.
The authors designed an algorithm to check “whether a winning number was drawn” and ran a series of experiments to verify the hypothesis and demonstrate the importance of good initial values. A sparse network that starts from good initial values can even outperform the dense network. This paper won a Best Paper Award at ICLR 2019.
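The train, prune, and reset loop usually associated with this hypothesis can be sketched on a toy problem. The sketch below is an illustration under loud assumptions, not the paper’s experimental setup: the “network” is a linear model with eight weights, only two of which matter, and all function names and hyperparameters are hypothetical.

```python
import random


def mse(w, data):
    """Mean squared error of a linear model with weights w."""
    return sum((sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2
               for x, y in data) / len(data)


def train(init, mask, data, lr=0.05, steps=500):
    """Gradient descent from `init`; weights with mask 0 stay at zero."""
    w = [wi * mi for wi, mi in zip(init, mask)]
    n = len(data)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for i, xi in enumerate(x):
                grad[i] += 2.0 * err * xi / n
        w = [(wi - lr * gi) * mi for wi, gi, mi in zip(w, grad, mask)]
    return w


def prune(trained, mask, fraction):
    """Remove the given fraction of surviving weights with the
    smallest magnitude after training."""
    alive = sorted((abs(trained[i]), i) for i, m in enumerate(mask) if m)
    new_mask = list(mask)
    for _, i in alive[: int(len(alive) * fraction)]:
        new_mask[i] = 0
    return new_mask


rng = random.Random(0)
dim = 8
data = []
for _ in range(64):
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    data.append((x, 3.0 * x[0] - 2.0 * x[1]))  # only two inputs matter

init = [rng.uniform(-0.5, 0.5) for _ in range(dim)]  # the "ticket" drawn
mask = [1] * dim
for _ in range(2):                       # two pruning rounds: 8 -> 4 -> 2
    trained = train(init, mask, data)
    mask = prune(trained, mask, fraction=0.5)

# Reset the surviving weights to their ORIGINAL initial values and retrain.
ticket = train(init, mask, data)
print("mask:", mask, "loss:", mse(ticket, data))
```

The key design point is the reset step: the sparse network is retrained from the same initial values the dense network started with, rather than from fresh random values.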
The bold “lottery ticket hypothesis” immediately sparked heated debate. The authors published follow-up research, Stabilizing the Lottery Ticket Hypothesis (arxiv.org/abs/1903.01611); Uber AI Labs published Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask (arxiv.org/abs/1905.01067), describing their in-depth exploration of the phenomenon and revealing what makes a ticket a “winner”; Sparse Networks from Scratch: Faster Training without Losing Performance (arxiv.org/abs/1907.04840) followed, arguing that sparse-network methods such as the “lottery ticket hypothesis” are too expensive to compute and proposing a new approach that starts directly from a sparse network structure, requires fewer computing resources, trains faster, and matches the performance of dense networks; and One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers (arxiv.org/abs/1906.02773) was accepted to NeurIPS 2019.
Code Open Source: https://github.com/google-research/lottery-ticket-hypothesis
On the Variance of the Adaptive Learning Rate and Beyond
On the variance of the adaptive learning rate, and beyond
Authors: Liyuan Liu and Jiawei Han (UIUC), Jianfeng Gao (Microsoft Research), et al.
Reason for recommendation: This paper from Jiawei Han’s team studies how to manage variance in deep learning optimizers. When training neural networks, optimizers with adaptive learning rates such as Adam and RMSProp need a warm-up phase; otherwise they easily fall into bad, potentially problematic local optima early in training. The RAdam (Rectified Adam) optimizer presented in this paper gives the optimizer a good start. Using a dynamic rectification term, RAdam adjusts Adam’s adaptive learning rate according to its variance, providing an efficient automatic warm-up that adapts to the current dataset and a solid starting point for training deep neural networks.
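As a rough illustration of the automatic warm-up, the sketch below computes a rectification factor of the kind RAdam applies to the adaptive step. The function name is hypothetical, the threshold of 4 follows the formulation as we understand it (some implementations use a slightly different cutoff), and the rest of the optimizer update is omitted.

```python
import math


def radam_rectification(step, beta2=0.999):
    """Return the rectification factor for a 1-based step, or None when
    the variance of the adaptive learning rate is not yet tractable and
    the update should fall back to a plain momentum step."""
    rho_inf = 2.0 / (1.0 - beta2) - 1.0
    beta2_t = beta2 ** step
    rho_t = rho_inf - 2.0 * step * beta2_t / (1.0 - beta2_t)
    if rho_t <= 4.0:
        return None
    return math.sqrt(
        ((rho_t - 4.0) * (rho_t - 2.0) * rho_inf)
        / ((rho_inf - 4.0) * (rho_inf - 2.0) * rho_t)
    )


for step in (1, 10, 1000, 100000):
    print(step, radam_rectification(step))
```

The factor is undefined for the very first steps, small shortly afterwards, and approaches 1 as training proceeds, which behaves like a warm-up schedule that needs no manual tuning.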
Another paper from the same period also improves the optimization process: Lookahead Optimizer: k steps forward, 1 step back (arxiv.org/abs/1907.08610). Its core idea is to maintain two sets of weights and interpolate between them, which lets the faster set of weights “look ahead” (that is, explore) while the slower set stays behind for better long-term stability. According to the authors themselves, the approach reduces variance during training, lessens the need for extensive hyperparameter tuning, and brings faster convergence with minimal computational overhead across many different deep learning tasks.
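The two-sets-of-weights idea fits in a few lines. The sketch below is a minimal illustration on a toy quadratic, assuming plain gradient descent as the inner (fast) optimizer; the function names and the values of `k`, `alpha`, and `inner_lr` are illustrative assumptions, not recommended settings.

```python
def lookahead_minimize(grad_fn, w0, k=5, alpha=0.5, inner_lr=0.05, outer_steps=50):
    """Fast weights take k gradient steps; slow weights then move a
    fraction alpha toward where the fast weights ended up."""
    slow = list(w0)
    for _ in range(outer_steps):
        fast = list(slow)                      # fast weights restart from slow
        for _ in range(k):                     # k steps forward
            g = grad_fn(fast)
            fast = [w - inner_lr * gi for w, gi in zip(fast, g)]
        # 1 step back: interpolate between slow and fast weights
        slow = [s + alpha * (f - s) for s, f in zip(slow, fast)]
    return slow


def quadratic_grad(w):
    """Gradient of (w0 - 3)^2 + 10 * (w1 + 1)^2."""
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]


solution = lookahead_minimize(quadratic_grad, [0.0, 0.0])
print(solution)
```

Because the slow weights only ever move part of the way toward the fast weights, a noisy or overshooting inner optimizer is damped without extra gradient evaluations.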
These two papers not only make effective improvements to the optimization of neural networks but can also be used together. The results both deepen our understanding of the loss landscape of neural networks and serve as very effective tools.
Code open source: https://github.com/LiyuanLucasLiu/RAdam, https://github.com/lonePatient/lookahead_pytorch/blob/master/ (LookAhead)
Further reading: RAdam and LookAhead can be combined into one optimizer, see https://www.leiphone.com/news/201908/SAFF4ESD8CCXaCxM.html
Reasoning-RCNN: Unifying Adaptive Global Reasoning into Large-scale Object Detection (CVPR 2019)
Reasoning-RCNN: unifying adaptive global reasoning into large-scale object detection
Authors: Huawei Noah’s Ark Lab, Sun Yat-sen University
Reason for recommendation: As object detection grows in scale and granularity, problems such as class imbalance, occlusion, classification ambiguity, and differences in object scale become more and more apparent. It is natural to observe that an important part of human visual recognition is “common-sense reasoning”; for example, once we recognize that object A is occluded by object B, we can recognize both objects more accurately. This paper builds that idea into the RCNN model: the authors give the model explicit common-sense knowledge and represent the semantic knowledge of objects in the image with a category-level knowledge graph.
On the one hand, adding common sense and basic reasoning ability to perception models is the trend toward building “visual intelligence”. On the other hand, earlier studies had proposed generating a graph from the objects recognized in an image, but left open what the graph is good for once generated. This paper shows that the graph can be used to further improve the performance of the object detection task itself.
In addition, the authors made many improvements that make the model better suited to large-scale object detection, strengthen the connections between stages, and improve recognition results. In the end, the model’s mAP increases significantly on multiple datasets. The approach is lightweight, can be used with a variety of object detection backbone networks, and can integrate knowledge from a variety of sources.
Code Open Source: https://github.com/chanyn/reasoning-RCNN
Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning (ICML 2019)
Social influence as an intrinsic motivation in multi-agent deep reinforcement learning
Authors: MIT, DeepMind, Princeton University
Reason for recommendation: As research on multi-agent reinforcement learning grows, getting a society of agents to coordinate their actions and exchange information has become an important subject. The focus of this paper is to let agents learn an intrinsic social motivation in a multi-agent environment. The approach is to reward an agent when it influences other agents, which leads them to collaborate and communicate better. More specifically, the authors show that rewarding an agent for strongly changing the behavior of other agents tends to encourage higher mutual information between the agents’ actions. Such a mechanism gives the agents an inductive bias toward learning coordinated behavior, even when they are trained independently. The influence reward can also be computed in a decentralized way, which effectively addresses the problem of emergent communication. The paper received a Best Paper Honorable Mention at ICML 2019.
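One way to quantify “influencing other agents” is counterfactual: compare how agent j acts given the action agent k actually took with how j would act on average over k’s alternatives. The sketch below is a simplified illustration under that assumption; the function name and the small probability tables are hypothetical, and in practice the conditional policy would come from a learned model.

```python
import math


def influence_reward(chosen_action, policy_k, policy_j_given_k):
    """KL divergence between agent j's action distribution given agent
    k's chosen action and j's distribution averaged over all actions k
    could have taken (weighted by k's own policy)."""
    n_j = len(policy_j_given_k[0])
    marginal = [
        sum(policy_k[a] * policy_j_given_k[a][b] for a in range(len(policy_k)))
        for b in range(n_j)
    ]
    conditional = policy_j_given_k[chosen_action]
    return sum(p * math.log(p / q)
               for p, q in zip(conditional, marginal) if p > 0.0)


policy_k = [0.5, 0.5]                       # agent k picks between 2 actions

ignores_k = [[0.3, 0.7], [0.3, 0.7]]        # j acts the same whatever k does
copies_k = [[1.0, 0.0], [0.0, 1.0]]         # j always mirrors k's action

print(influence_reward(0, policy_k, ignores_k))  # no influence
print(influence_reward(0, policy_k, copies_k))   # strong influence
```

An agent whose actions leave others unchanged earns no influence reward, while an agent whose actions others react to earns a positive one.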
Another paper, from Facebook AI Research, Learning Existing Social Conventions via Observationally Augmented Self-Play (arxiv.org/abs/1806.10071), approaches the problem from a different angle: before joining a group, a new agent learns the group’s existing patterns of behavior (what humans would call “customs”) through observation and self-play so that it can fit in. This avoids entering the group with a strategy that goes unrewarded (even in non-cooperative, competitive environments). But perhaps the intrinsic social motivation learned in the previous paper is a little more elegant? After all, it explicitly promotes more coordinated and active communication among agents (laughs).
Weight Agnostic Neural Networks
Weight-agnostic neural networks
Authors: Adam Gaier and David Ha (Google AI)
Reason for recommendation: Modern neural network research follows a fixed pattern: fix the network architecture, then find good connection weights through optimization (training). This practice has prompted a question: if we regard the network structure as a prior and the connection weights as what is learned, how far can we go in building knowledge into the model in the form of structure (the prior)? And is doing so good or bad?
This paper explores the question directly. Its training process does not search for weights; instead, it searches for better network structures under relatively fixed, random weights. A network structure that embodies a good prior can perform well even when all the weights in the network share a single random value. The prior knowledge found this way is also expressed directly in the network structure, making it easier to interpret.
If “fix the structure, search for weights” and “fix the weights, search for structure” are like the rival “Qi” and “Sword” schools, then both sides have now finally taken the stage, and we can expect more good drama in the future.
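To make “fix the weights, search for structure” concrete, the sketch below scores two hand-written candidate structures using a single weight value shared by every connection, averaged over several such values. It is only an illustration: the two toy structures, the target function, and the list of shared weight values are assumptions, and the structure search itself is not shown.

```python
import math

# Shared weight values to average over (an illustrative assumption).
SHARED_WEIGHTS = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]


def target(x):
    """Toy regression target: a bump centered at zero."""
    return math.exp(-x * x / 2.0)


def gaussian_net(x, w):
    """Structure A: input -> output node with Gaussian activation."""
    return math.exp(-((w * x) ** 2) / 2.0)


def linear_net(x, w):
    """Structure B: input -> output node with linear activation."""
    return w * x


def mean_error(net, weights=SHARED_WEIGHTS):
    """Mean squared error of a structure, averaged over shared weights."""
    xs = [i / 10.0 for i in range(-20, 21)]
    per_weight = [
        sum((net(x, w) - target(x)) ** 2 for x in xs) / len(xs)
        for w in weights
    ]
    return sum(per_weight) / len(per_weight)


scores = {"gaussian": mean_error(gaussian_net), "linear": mean_error(linear_net)}
best = min(scores, key=scores.get)
print(scores, "->", best)
```

A structure that suits the task keeps a low error whatever the shared weight is, which is the sense in which the structure, rather than the weights, carries the knowledge.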
Code Open Source: https://weightagnostic.github.io/
XLNet: Generalized Autoregressive Pretraining for Language Understanding
XLNet: generalized autoregressive pre-training for language understanding
Authors: CMU, Google AI
Reason for recommendation: Many models improve on BERT, and XLNet is one of the most successful. XLNet’s improvements focus on three things: 1. replacing BERT’s masked bidirectional prediction with a new permutation-based prediction objective (BERT’s masking mechanism makes it more like a text denoising model, and it performs poorly on generation tasks); 2. using a two-stream self-attention mechanism that separates content from position; 3. adopting a new masking scheme to match 2. These designs give XLNet both sequence generation ability (like a traditional language model) and the ability to draw on context from both directions.
Together with other improvements such as a larger training dataset, the longer-sequence Transformer-XL as the backbone network, higher utilization of masks during training, and support for partial prediction training, XLNet’s technical improvements over BERT can be said to run from top to bottom. It is therefore no surprise that it outperforms BERT on all the tasks the authors tested (although on some tasks the gain is not significant).
The emergence of models such as XLNet signals that NLP pre-trained models are maturing, that they can adapt to more and more downstream tasks, and that a unified model architecture solving a variety of different NLP tasks is becoming possible.
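The permutation-based objective and the matching masks can be sketched as follows. This is a simplified illustration, not the released implementation: it only builds boolean attention masks for one sampled prediction order, and the function name is hypothetical. The content stream may see a token’s own content, while the query stream used for prediction may only see tokens that come earlier in the prediction order.

```python
def permutation_masks(order):
    """order[t] is the position predicted at step t.

    Returns (content_mask, query_mask), where mask[i][j] is True if
    position i may attend to position j.
    """
    n = len(order)
    rank = [0] * n
    for t, pos in enumerate(order):
        rank[pos] = t
    # Content stream: earlier positions in the order, plus itself.
    content = [[rank[j] <= rank[i] for j in range(n)] for i in range(n)]
    # Query stream: strictly earlier positions only, so a token never
    # sees its own content when it is being predicted.
    query = [[rank[j] < rank[i] for j in range(n)] for i in range(n)]
    return content, query


content_mask, query_mask = permutation_masks([2, 0, 3, 1])
for row in query_mask:
    print(["x" if allowed else "." for allowed in row])
```

Because the order is a permutation rather than left to right, a token can be predicted from context on both sides while the model remains autoregressive.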
Code open source: https://github.com/zihangdai/xlnet
In addition, the following 10 papers are also on our candidate list, each of which stands out, and we list them below:
AI surpasses humans at six-player poker (Science Magazine)
A poker AI that outperforms humans at six-player Texas hold’em (also one of the ten scientific breakthroughs of 2019 selected by Science)
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
A lighter version of BERT that does not simply shrink the model: it achieves better performance with fewer parameters.
A Theory of Fermat Paths for Non-Line-of-Sight Shape Reconstruction
“Reconstructing the shape of objects outside the line of sight”, or “how to see something around a corner”, is the subject of this paper. Although the task is a bit niche, the paper shows that computer vision technology has the potential to make more seemingly impossible things possible. Best Paper at CVPR 2019.
Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems (ACL 2019)
Task-oriented multi-turn dialogue systems usually rely on predefined templates designed for different tasks, but sharing and transferring data between different templates is a major challenge. This paper proposes an effective method for tracking, sharing, and transferring knowledge.
Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos
Recovering three-dimensional structure from single-view video based on the motion of moving objects has been studied extensively in traditional computer vision. This paper combines that approach with deep learning and achieves better results, and the online learning ability added by the authors makes the method more adaptable to different datasets and scenarios.
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
A study of how CNN models scale that obtains higher accuracy with smaller models and provides a family of optimized models for computing budgets of different sizes. ICML 2019 spotlight paper.
Emergent Tool Use From Multi-Agent Autocurricula
Through implicit curriculum learning, in an environment with interaction and competition, agents keep discovering new tasks among themselves and keep learning new strategies.
RoBERTa: A Robustly Optimized BERT Pretraining Approach
A dedicated study of BERT’s pre-training process that proposes a new way of improving it, using new pre-training objectives to train more thoroughly. In other words, designing a big model is easy, but you should work out whether you have trained it enough.
SinGAN: Learning a Generative Model from a Single Natural Image
This paper learns a GAN from a single image. A pyramid of GANs at different scales learns from image patches of different sizes, so that what the whole model learns captures both the overall structure and the detailed texture of the image. Best Paper at ICCV 2019.
Towards artificial general intelligence with hybrid Tianjic chip architecture
A team from Tsinghua University designed the Tianjic chip, whose hybrid architecture supports both numerical artificial neural networks based on nonlinear transformations and spiking neural networks from neuroscience based on signal responses. The paper was published in Nature.