Hi! A new AI Weekly is here! Enjoy your weekend reading AI news, and don’t forget to share it with your friends 😉
Neural scene representation and rendering – in this work, the authors introduce the Generative Query Network (GQN), a framework within which machines learn to perceive their surroundings by training only on data they obtain themselves as they move around scenes. Much like infants and animals, the GQN learns by trying to make sense of its observations of the world around it. In doing so, the GQN learns about plausible scenes and their geometrical properties, without any human labelling of the contents of scenes.
Jens Ludwig: “Machine Learning in the Criminal Justice System” | Talks at Google
Speech_Recognition_with_Tensorflow – an implementation of a seq2seq model for speech recognition, with an architecture similar to Listen, Attend and Spell.
Apple CreateML vs Kaggle – During the recent WWDC, Apple presented its newest tool, CreateML. As an ML enthusiast, I was really impressed by what I saw in the dedicated session, so I thought it would be worth investigating how powerful it really is.
Smoothgrad in Tensorflow.js – Visualise the saliency of the predictions made by a tensorflow.js model using SmoothGrad.
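For readers new to SmoothGrad, the core idea is to average a model's input gradient over several noisy copies of the same input, which tends to give a cleaner saliency map than a single gradient. Below is a minimal NumPy sketch of that idea, not the code from the linked project: the `smoothgrad` helper, the toy quadratic "model" and its hand-written gradient `toy_grad` are all assumptions made purely for illustration.

```python
import numpy as np


def smoothgrad(grad_fn, x, n_samples=50, noise_level=0.15, seed=0):
    """Average grad_fn over noisy copies of x (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Noise scale chosen relative to the input's value range (an assumption
    # of this sketch; the right level depends on the model and data).
    sigma = noise_level * (x.max() - x.min())
    total = np.zeros_like(x, dtype=float)
    for _ in range(n_samples):
        noisy = x + rng.normal(0.0, sigma, size=x.shape)
        total += grad_fn(noisy)
    return total / n_samples


# Toy stand-in for a model: score(x) = sum(w * x**2),
# so the gradient with respect to x is 2 * w * x.
w = np.array([1.0, 0.0, -2.0])


def toy_grad(x):
    return 2.0 * w * x


x = np.array([0.5, 1.0, -1.0])
saliency = np.abs(smoothgrad(toy_grad, x))
```

In this toy case the second feature has zero weight, so its saliency is zero, while the third feature, which has the largest weight magnitude, gets the highest saliency. With a real network, `grad_fn` would be replaced by the framework's automatic differentiation.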
Improving Language Understanding with Unsupervised Learning – OpenAI obtained state-of-the-art results on a suite of diverse language tasks with a scalable, task-agnostic system. Their approach is a combination of two existing ideas: transformers and unsupervised pre-training. These results provide a convincing example that pairing supervised learning methods with unsupervised pre-training works very well.
A Connectome Based Hexagonal Lattice Convolutional Network Model of the Drosophila Visual System – What can we learn from a connectome? The authors constructed a simplified model of the first two stages of the fly visual system, the lamina and medulla. The resulting hexagonal lattice convolutional network was trained using backpropagation through time to perform object tracking in natural scene videos. Networks initialized with weights from connectome reconstructions automatically discovered well-known orientation and direction selectivity properties in T4 neurons and their inputs, while networks initialized at random did not. This work is the first demonstration that knowledge of the connectome can enable in silico predictions of the functional properties of individual neurons in a circuit, leading to an understanding of circuit function from structure alone.
Massively Parallel Video Networks – the authors introduce a class of causal video understanding models that aims to improve the efficiency of video processing by maximising throughput, minimising latency, and reducing the number of clock cycles. Leveraging operation pipelining and multi-rate clocks, these models perform a minimal amount of computation (e.g. as few as four convolutional layers) for each frame per timestep to produce an output. The models are still very deep, with dozens of such operations being performed, but in a pipelined fashion that enables depth-parallel computation.
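To make the pipelining idea concrete, here is a toy Python sketch of depth-parallel processing, written under simplifying assumptions: each "layer" is a plain function on integers, and `run_pipelined` is a hypothetical helper, not the paper's architecture (which also involves multi-rate clocks). At every timestep each layer reads only what the layer below produced one step earlier, so all layers could in principle run at the same time.

```python
def run_pipelined(layers, frames):
    """Toy depth-parallel pipeline: one layer step per frame per timestep."""
    state = [None] * len(layers)  # state[i] holds the latest output of layer i
    outputs = []
    for frame in frames:
        # Inputs for this timestep come from the previous timestep's snapshot,
        # so the layer updates below do not depend on each other.
        inputs = [frame] + state[:-1]
        state = [
            layer(inp) if inp is not None else None
            for layer, inp in zip(layers, inputs)
        ]
        outputs.append(state[-1])
    return outputs


def add_one(v):
    return v + 1


def double(v):
    return v * 2


def minus_three(v):
    return v - 3


layers = [add_one, double, minus_three]
outputs = run_pipelined(layers, [1, 2, 3, 4])
print(outputs)  # [None, None, 1, 3]
```

The first two outputs are empty while the pipeline fills. After that, each timestep emits the full-depth result for the frame seen two steps earlier, which is the trade-off in this sketch: every timestep costs only one layer step per layer, but the output lags the newest frame by the pipeline depth minus one.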
Through-Wall Human Pose Estimation Using Radio Signals – RF-Pose provides accurate human pose estimation through walls and occlusions. It leverages the fact that wireless signals in the WiFi frequencies traverse walls and reflect off the human body. It uses a deep neural network approach that parses such radio signals to estimate 2D poses. RF-Pose is trained using a state-of-the-art vision model to provide cross-modal supervision. Once trained, RF-Pose uses only the wireless signal for pose estimation. Experimental results show that, when tested on visible scenes, the radio-based system is almost as accurate as the vision-based system used to train it. Yet, unlike vision-based pose estimation, the radio-based system can estimate 2D poses through walls despite never being trained on such scenarios.