Cognitive computing is transforming the way we perceive the world. Today, applications of cognitive computing rely on machine learning models that are trained in the cloud, while inference requests are made in real time from end-user edge devices. Thus far, this has largely been how cognitive services are deployed. However, concerns over security, privacy, and network connectivity, among others, are sparking a paradigm shift in cognitive computing. Interest is rapidly growing in deploying training and inference engines at the edge, which is itself evolving at an astounding pace.
Bringing cognitive computing to edge devices opens up several new opportunities and challenges for researchers. Edge devices are short on resources such as power, networking, storage, and compute. Therefore, cognitive edge computing requires us to understand the resulting issues and discover novel solutions.
The workshop is meant to foster an interactive discussion about emerging application domains and the role of cognitive computing in these new domains. In addition, the workshop is meant to lead to new discussions and insights on cognitive algorithms, architectures and system-level design.
|8:45 AM - 8:55 AM||Opening Remarks||Organizers|
|9:00 AM - 9:40 AM||Living Intelligently on the Edge||Mondira Pant, Academic Research Director and Principal Engineer, Intel|
|9:40 AM - 10:20 AM||“OK edge”: Efficient Inference at the Edge||Matt Mattina, Senior Director of Machine Learning Research, ARM Research|
|10:20 AM - 10:40 AM||Morning Break|
|10:40 AM - 11:20 AM||Edge only Cognition||Cormac Brick, Director of Embedded Machine Intelligence, Movidius/Intel|
|11:20 AM - 12:00 PM||The Magic Inside HoloLens – Team and Technology||Rob Shearer, Sr. Director of Silicon and System Architecture, Microsoft|
|12:00 PM - 1:00 PM||Lunch Break|
|1:00 PM - 1:40 PM||Deep Learning for Real Time Human Activity Recognition on Mobile Phones||Cait Crawford, Distinguished Engineer, IBM Research|
|1:40 PM - 2:20 PM||Connectivity and Computing||Tom Rondeau, Program Manager, DARPA|
|2:20 PM - 3:00 PM||Cognitive Computing for Sensor Data Analytics||Bing C. Li, Lead for ATR and Image Processing Group, Lockheed Martin|
|3:00 PM - 3:30 PM||Coffee Break|
|3:30 PM - 4:10 PM||Bandwidth-Efficient Deep Learning on Edge Devices||Song Han, Research Scientist, Google/MIT|
|4:10 PM - 4:50 PM||Cognitive Edge Computing White Paper Discussion||Organizers, Speakers, Audience|
Edge only Cognition
Today most people have experienced cloud-based AI in some way. In the last few months we have started to see more and more completely edge-based cognitive devices. This talk will assess where we are today, looking at deep learning silicon, toolchains and algorithms. We will also look at how rapid improvements in each of these areas will drive new products and highlight a few areas where more development is needed to take things to the next level.
Living Intelligently on the Edge
In this talk, Mandy will discuss research tasks and opportunities that will need to be pursued to realize the performance and power levels of the human brain. This entails not only a full-system approach, including information processing, programming paradigms, algorithms, architectures, circuits, device technologies, and materials development, but also the development of a full network of cognitive systems that connect the billions of edge devices to solve society-scale applications.
“OK edge”: Efficient Inference at the Edge
As machine learning applications migrate from the cloud to the edge of the network, the platform on which these intelligent applications execute will be resource constrained. Understanding the tradeoff between accuracy and resource requirements for edge applications is essential. In this talk, we’ll compare several neural networks designed for keyword spotting (KWS) on edge platforms. In particular, we’ll examine the tradeoff between accuracy and compute/memory requirements for DNNs, CNNs, and LSTMs for the KWS use case.
The Magic Inside HoloLens – Team and Technology
Microsoft invented a new world of mixed reality via tightly coupled hardware/software co-design, yielding hardware that enabled the first fully untethered, holographic computer with see-through lenses. We will explore the vision that drove us; the unique architecture and design methodology; and then delve into the magic of the technology that made the HoloLens a reality.
Cognitive Computing for Sensor Data Analytics
As new sensors are invented and the performance of existing sensors improves, the quantity of data they produce has grown into “big data,” which has recently attracted many researchers. In this presentation, we discuss the characteristics of big data generated by physical sensors and then suggest possible approaches to analyzing it. First, we discuss the data generation process and show that multiple, complex sensors operating in highly dense environments generate very large quantities of data in practical applications. Then we discuss data characteristics and explain that the “big” in big data comes not only from “more” data but also from the data’s high dimensionality and high complexity. Sensor data is big, but it is also sparse, and hierarchically so. We recommend hierarchical structures to describe big data and propose a three-layer model for sensor data processing. The first layer is front-end processing, which processes a significant amount of data to create low-level features. These low-level features are taken as a sparse representation of the raw data. In the second layer, the low-level features are processed to create middle-level features (objects), which are denser than the low-level features and can be treated as a sparse representation of them. In the third layer, the middle-level features are combined and processed to produce intelligent information as output. Finally, we demonstrate 3D LIDAR sensor data analytics for power company disaster management.
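The three-layer structure described in this abstract can be illustrated with a toy sketch. This is not the speaker's implementation: the function names (`front_end`, `group_objects`, `summarize`), the threshold, and the 1D sample data are all hypothetical, chosen only to show raw data being reduced to sparse low-level features, then to objects, then to a short report.

```python
from typing import Dict, List, Tuple


def front_end(samples: List[float], threshold: float) -> List[int]:
    """Layer 1 (hypothetical): keep only indices of samples above a threshold.

    The returned indices act as a sparse representation of the raw data.
    """
    return [i for i, s in enumerate(samples) if s > threshold]


def group_objects(indices: List[int], max_gap: int = 1) -> List[Tuple[int, int]]:
    """Layer 2 (hypothetical): merge nearby feature indices into (start, end) objects."""
    objects: List[Tuple[int, int]] = []
    for i in indices:
        if objects and i - objects[-1][1] <= max_gap:
            # Extend the current object to include this index.
            objects[-1] = (objects[-1][0], i)
        else:
            # Start a new object.
            objects.append((i, i))
    return objects


def summarize(objects: List[Tuple[int, int]]) -> Dict[str, int]:
    """Layer 3 (hypothetical): combine objects into a small output report."""
    widths = [end - start + 1 for start, end in objects]
    return {"object_count": len(objects), "widest": max(widths, default=0)}


# Made-up raw sensor readings, for illustration only.
raw = [0.1, 0.0, 0.9, 0.8, 0.1, 0.0, 0.0, 0.7, 0.0, 0.95, 0.9, 0.85]

features = front_end(raw, threshold=0.5)   # [2, 3, 7, 9, 10, 11]
objects = group_objects(features)          # [(2, 3), (7, 7), (9, 11)]
report = summarize(objects)                # {'object_count': 3, 'widest': 3}
print(features, objects, report)
```

Each layer consumes only the output of the layer before it, so twelve raw samples shrink to six feature indices, then three objects, then one report.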
Connectivity and Computing
Over 130 years since Hertz's initial wireless experiments, we have come a long way in manipulating and using the electromagnetic spectrum. The range of uses and applications we have yet to invent is nearly endless. In particular, DARPA is interested in the area of large-scale software-defined arrays to manage multi-beam, multi-band, and multi-function systems. All of these applications require improvements in both RF electronics and new computer processors that can compute more for less power. This talk will review some of the technologies being developed by DARPA's Microsystems Technology Office to address these technology issues. The talk will start by covering one result from the RF-FPGA program, then the use of this technology to build a next-generation embedded software-defined radio system called Hedgehog, and finally a future architecture for embedded processing in the Domain-Specific System on Chip (DSSoC) program.
Deep Learning for Real Time Human Activity Recognition on Mobile Phones
In this talk we present a deep learning based technique for human activity classification that runs in real time on mobile devices. Our technique minimizes the size of the model and computational overhead in order to run on the embedded processor and preserve battery life.
Bandwidth-Efficient Deep Learning on Edge Devices
Deep learning has spawned a wide range of AI applications that are changing our lives. However, deep neural networks are both computationally and memory intensive. Thus they are power hungry when deployed on embedded systems with a limited power budget. To address this problem, I will present an algorithm and hardware co-design methodology for improving the efficiency of deep learning.
I will first introduce "Deep Compression", which can compress deep neural network models by 18-49× without loss of prediction accuracy for a broad range of CNNs, RNNs, and LSTMs. The compression reduces memory bandwidth. I’ll then introduce “Deep Gradient Compression”, which can reduce the communication bandwidth by 500×, alleviating the network pressure for distributed training. Next, by changing the hardware architecture and efficiently implementing Deep Compression, I will introduce EIE, the Efficient Inference Engine, which can perform decompression and inference simultaneously, saving a significant amount of memory bandwidth. By taking advantage of the compressed model and being able to deal with an irregular computation pattern efficiently, EIE achieves 13× speedup and 3000× better energy efficiency over GPU. Finally, I will revisit the inefficiencies in current learning algorithms, present DSD training, and discuss the challenges and future work in bandwidth-efficient deep learning.
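The abstract does not say how the communication bandwidth is reduced. As published, Deep Gradient Compression builds on gradient sparsification with local accumulation: each worker sends only the largest-magnitude gradient entries and keeps the rest locally until they grow large enough to send. The sketch below illustrates only that core idea in plain Python. The function name `sparsify`, the choice of `k`, and the gradient values are hypothetical, and the published method's further components (such as momentum correction and gradient clipping) are omitted.

```python
from typing import Dict, List, Tuple


def sparsify(
    grad: List[float], residual: List[float], k: int
) -> Tuple[Dict[int, float], List[float]]:
    """Pick the k largest-magnitude accumulated gradient entries to send.

    Entries that are not sent stay in the local residual and are added to
    the next step's gradient, so no update is permanently dropped.
    """
    # Add the locally stored residual to the fresh gradient.
    acc = [g + r for g, r in zip(grad, residual)]
    # Indices of the k entries with the largest absolute value.
    top = sorted(range(len(acc)), key=lambda i: abs(acc[i]), reverse=True)[:k]
    sent = {i: acc[i] for i in top}
    # Sent entries are cleared locally; everything else is kept for later.
    new_residual = [0.0 if i in sent else a for i, a in enumerate(acc)]
    return sent, new_residual


# Made-up gradients for two training steps, sending one entry per step.
residual = [0.0, 0.0, 0.0, 0.0]

sent1, residual = sparsify([0.5, -0.125, 0.25, 0.0625], residual, k=1)
print(sent1)      # {0: 0.5}

sent2, residual = sparsify([0.0, -0.125, 0.25, 0.0625], residual, k=1)
print(sent2)      # {2: 0.5}  (0.25 kept from step 1 plus 0.25 from step 2)
print(residual)   # [0.0, -0.25, 0.0, 0.125]
```

In the second step, entry 2 is sent only because its skipped value from the first step was accumulated rather than discarded.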
Vijay Janapa Reddi is a Research Scientist at Google in the Mobile SoC architecture team, on leave from his position as an associate professor in the Department of Electrical and Computer Engineering at the University of Texas at Austin. His research interests include architecture and software design to enhance processor performance, user experience, energy efficiency and reliability for consumer devices and autonomous systems.
Ravi Iyer is an Intel Fellow and Director for Datacenter Technologies in Intel’s Datacenter Group. He has led technology innovation and incubation efforts and has made significant contributions from low-power system-on-chip wearable/IOT devices to high performance multi-core server architectures including novel cores, innovative cache/memory hierarchies, QoS, accelerators, algorithms/workloads and performance/power analysis. He has published 150+ papers and has 40+ patents. Ravi received his Ph.D. from Texas A&M University and is also an IEEE Fellow.
Nilesh Jain is a Principal Engineer in Intel’s Data Center Group. His research interests include machine learning applications, system architecture, and acceleration technologies that enhance power, performance and the overall experience. His recent focus has been deep learning, visual computing and immersive experiences. Prior to this, he developed technologies for ultra-low power edge analytics.
Yuhao Zhu is an Assistant Professor in the Computer Science Department at the University of Rochester and a visiting researcher in the Machine Learning Group at ARM Research. His research interests are computer architecture and software design to enable next-generation mobile systems that are energy-efficient, intelligent, and deliver a desirable end-user experience. His recent work has focused on domain-specific systems for visual computing and Web technologies.