Deep Learning: The Task
𝒪(N³) computational complexity
Matrices must be shuttled between memory and the processor
Analog Deep Learning
Local (in-memory) processing
𝒪(N) computational complexity (fully parallel operation)
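The contrast above can be sketched with a simple operation count (illustrative numbers only, not from the source): a digital matrix-matrix multiply of two N x N matrices costs N³ multiply-accumulates, each touching memory, while an in-memory analog crossbar holds the weight matrix in place and computes a full matrix-vector product in one parallel step, so applying it to N vectors takes only N steps.

```python
def digital_macs(n: int) -> int:
    """Multiply-accumulate count for an n x n by n x n digital matmul."""
    return n ** 3

def analog_steps(n: int) -> int:
    """Parallel crossbar reads to apply an n x n weight matrix to n vectors
    (one analog step per vector; all rows and columns operate at once)."""
    return n

for n in (64, 256, 1024):
    print(f"N={n}: digital {digital_macs(n):,} MACs vs analog {analog_steps(n)} steps")
```

The gap grows as N², which is why in-memory processing pays off most for large layers.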
Previous Roadblocks

Unit Devices (Programmable Resistive Elements):
Si-incompatible, slow, or uncontrollable

Architectures (Analog Core & Digital Periphery):
Redundant circuitry or serial operations

Algorithms (Gradient-Descent-Type Optimizers):
Highly sensitive to nonidealities
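The sensitivity of gradient-descent-type optimizers to device nonidealities can be illustrated with a minimal sketch (an illustrative toy model, not the dissertation's algorithm): gradient descent on a noisy 1-D quadratic loss, once with exact updates and once with an asymmetric "device" update in which decremental weight changes are weaker than incremental ones. The asymmetry biases the converged weight away from the optimum.

```python
import random

def train(asymmetry: float, steps: int = 4000, lr: float = 0.05) -> float:
    """Return the average weight over the final 1000 steps.

    asymmetry = 1.0 models an ideal symmetric device; values < 1.0 model
    a nonideal device whose negative increments are weaker than positive.
    """
    random.seed(0)
    w, target, tail = 0.0, 1.0, []
    for t in range(steps):
        grad = (w - target) + random.gauss(0.0, 1.0)  # noisy gradient
        step = -lr * grad
        if step < 0:
            step *= asymmetry  # decrements are weakened on a nonideal device
        w += step
        if t >= steps - 1000:
            tail.append(w)
    return sum(tail) / len(tail)

ideal = train(asymmetry=1.0)     # symmetric updates: settles near the optimum
nonideal = train(asymmetry=0.2)  # asymmetric updates: systematically biased
print(f"ideal ~ {ideal:.2f}, nonideal ~ {nonideal:.2f} (optimum is 1.0)")
```

With symmetric updates the gradient noise averages out; with asymmetric updates it no longer cancels, so the weight drifts away from the minimum even though each step follows the gradient sign.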
Analog Training Processors
All key components are finally here.
3 Major Breakthroughs
1. Si-compatible technology: P-SiO2 solid-state proton electrolyte
2. Ultrafast, ideal devices: nanosecond-femtojoule protonics
3. Novel training algorithm: high-accuracy deep learning
Nanosecond Protonic Programmable Resistors for Analog Deep Learning
MIT Best PhD Dissertation Award