Cognitive Edge Computing

Confirmed Speakers

Time	Title	Speaker
8:45 AM - 8:55 AM	Opening Remarks	Organizers
9:00 AM - 9:40 AM	Living Intelligently on the Edge	Mondira Pant, Academic Research Director and Principal Engineer, Intel
9:40 AM - 10:20 AM	“OK edge”: Efficient Inference at the Edge	Matt Mattina, Senior Director of Machine Learning Research, ARM Research
10:20 AM - 10:40 AM	Morning Break
10:40 AM - 11:20 AM	Edge only Cognition	Cormac Brick, Director of Embedded Machine Intelligence, Movidius/Intel
11:20 AM - 12:00 PM	The Magic Inside HoloLens – Team and Technology	Rob Shearer, Sr. Director of Silicon and System Architecture, Microsoft
12:00 PM - 1:00 PM	Lunch Break
1:00 PM - 1:40 PM	Deep Learning for Real Time Human Activity Recognition on Mobile Phones	Cait Crawford, Distinguished Engineer, IBM Research
1:40 PM - 2:20 PM	Connectivity and Computing	Tom Rondeau, Program Manager, DARPA
2:20 PM - 3:00 PM	Cognitive Computing for Sensor Data Analytics	Bing C. Li, Lead for ATR and Image Processing Group, Lockheed Martin
3:00 PM - 3:30 PM	Coffee Break
3:30 PM - 4:10 PM	Bandwidth-Efficient Deep Learning on Edge Devices	Song Han, Research Scientist, Google/MIT
4:10 PM - 4:50 PM	Cognitive Edge Computing White Paper Discussion	Organizers, Speakers, Audience

Talk Details

Edge only Cognition

Today most people have experienced cloud-based AI in some way. In the last few months we have started to see more and more completely edge-based cognitive devices. This talk will assess where we are today, looking at deep learning silicon, toolchains, and algorithms. We will also look at how rapid improvements in each of these areas will drive new products, and highlight a few areas where more development is needed to take things to the next level.

Living Intelligently on the Edge

In this talk, Mandy will discuss research tasks and opportunities that will need to be pursued to realize the performance and power levels of the human brain. This entails not only a full-system approach, including information processing, programming paradigms, algorithms, architectures, circuits, device technologies, and materials development, but also the development of a full network of cognitive systems that connect the billions of edge devices to solve society-scale applications.

“OK edge”: Efficient Inference at the Edge

As machine learning applications migrate from the cloud to the edge of the network, the platform on which these intelligent applications execute will be resource constrained. Understanding the tradeoff between accuracy and resource requirements for edge applications is essential. In this talk, we’ll compare several neural networks designed for keyword spotting (KWS) on edge platforms. In particular, we’ll examine the tradeoff between accuracy and compute/memory requirements for DNNs, CNNs, and LSTMs for the KWS use case.

The Magic Inside HoloLens – Team and Technology

Microsoft invented a new world of mixed reality via tightly coupled hardware/software co-design, yielding hardware that enabled the first fully untethered holographic computer with see-through lenses. We will explore the vision that drove us; the unique architecture and design methodology; and then delve into the magic of the technology that made the HoloLens a reality.

Cognitive Computing for Sensor Data Analytics

As new sensors are invented and the performance of existing sensors improves, the quantity of data they produce becomes significantly “big”, forming the big data that has recently attracted many researchers. In this presentation, we discuss the characteristics of big data generated from physical sensors and then suggest possible approaches for analyzing this sensor data. First, we discuss the data generation process and show that multiple complex sensors and high-density environments in practical applications generate very large quantities of data. Then we discuss data characteristics and explain that the “big” in big data comes not only from “more” data but also from the data’s high dimensionality and high complexity. Sensor data is big, but it is sparse, and hierarchically sparse. We recommend hierarchical structures to describe big data and propose a three-layer model for sensor data processing. The first layer is front-end processing, which processes a significant amount of data to create low-level features. The low-level features are taken as a sparse representation of the raw data. In the second layer, the low-level features are processed to create middle-level features (objects), which are denser than the low-level features and can be treated as a sparse representation of the low-level features. In the third layer, the middle-level features are combined and processed to produce intelligent information as output. Finally, we demonstrate 3D LIDAR sensor data analytics for power company disaster management.

Connectivity and Computing

In the more than 130 years since Hertz's initial wireless experiments, we have come a long way in manipulating and using the electromagnetic spectrum. The range of uses and applications we have yet to invent is nearly endless. In particular, DARPA is interested in the area of large-scale software-defined arrays to manage multi-beam, multi-band, and multi-function systems. All of these applications require improvements in both RF electronics and new computer processors that can compute more for less power. This talk will review some of the technologies being developed by DARPA's Microsystems Technology Office to address these technology issues. The talk will start by covering one result from the RF-FPGA program, the use of this technology to build a next-generation embedded software-defined radio system called Hedgehog, and a future architecture for embedded processing in the Domain-Specific System on Chip (DSSoC) program.

Deep Learning for Real Time Human Activity Recognition on Mobile Phones

In this talk we present a deep-learning-based technique for human activity classification that runs in real time on mobile devices. Our technique minimizes the model size and computational overhead so that it can run on the embedded processor and preserve battery life.

Bandwidth-Efficient Deep Learning on Edge Devices

Deep learning has spawned a wide range of AI applications that are changing our lives. However, deep neural networks are both computationally and memory intensive. Thus they are power hungry when deployed on embedded systems with a limited power budget. To address this problem, I will present an algorithm and hardware co-design methodology for improving the efficiency of deep learning.

I will first introduce “Deep Compression”, which can compress deep neural network models by 18-49× without loss of prediction accuracy for a broad range of CNNs, RNNs, and LSTMs. The compression reduces memory bandwidth. I’ll then introduce “Deep Gradient Compression”, which can reduce the communication bandwidth by 500× and so alleviates the network pressure for distributed training. Next, by changing the hardware architecture and efficiently implementing Deep Compression, I will introduce EIE, the Efficient Inference Engine, which can perform decompression and inference simultaneously, saving a significant amount of memory bandwidth. By taking advantage of the compressed model and being able to deal with an irregular computation pattern efficiently, EIE achieves 13× speedup and 3000× better energy efficiency over a GPU. Finally, I will revisit the inefficiencies in current learning algorithms, present DSD training, and discuss the challenges and future work in bandwidth-efficient deep learning.