Complementary metal-oxide-semiconductor (CMOS) image sensors, such as those found in smartphone cameras, have become indispensable tools of the digital age for capturing visual information. These sensors generally comprise a front-end, silicon photodiode array (pixels) that converts incoming light into analog electrical currents, which are then digitized. However, the sensors themselves cannot perform processing on captured images, so data must be shuttled to a separate, back-end processor for subsequent operations. For more involved tasks, such as feature recognition, the separation of the sensor and the processor can introduce overhead in both throughput and power. Accordingly, processing data within the sensor itself can be beneficial when energy expenditure, latency, bandwidth, and memory usage are critical, such as in Internet of Things (IoT) edge devices. This paradigm, known as in-sensor processing, has taken the research community by storm, with new sensor architectures being actively explored to perform operations at the front-end.
Recent works have pioneered in-sensor, optoelectronic computing using electrostatically doped photodiodes constructed from two-dimensional transition metal dichalcogenide layers, where applied voltages are used to control the sensitivity of individual pixels to incoming light. By connecting multiple pixels together to sum their photocurrents, these devices can be configured to perform an analog “multiply-accumulate” (MAC) operation common to many image processing pipelines, thereby allowing some of the visual information to be refined as it is captured. In contrast, the chemically doped photodiodes at the heart of CMOS image sensors have a fixed photoresponsivity, and thus cannot readily perform in-sensor computing. Therefore, modifying the silicon photodiode array to enable pixel-level programmability could harness the massive scale of the mainstream CMOS industry to bring in-sensor computing to a wide variety of real-world applications.
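To make the multiply-accumulate idea concrete, the operation can be sketched numerically. In the sketch below, each pixel's programmable responsivity acts as a signed weight on the optical power it receives, and wiring the pixels to a shared output line sums the resulting photocurrents. All numbers are illustrative placeholders, not values from the article:

```python
import numpy as np

def analog_mac(light_intensities, responsivities):
    """Model of the in-sensor MAC: the summed photocurrent is the
    dot product of incident light with the programmed weights
    (arbitrary units)."""
    photocurrents = responsivities * light_intensities
    return photocurrents.sum()

# Illustrative values only: optical power on three interconnected pixels
light = np.array([0.2, 0.8, 0.5])
# Gate voltages program each pixel's signed responsivity (weight)
weights = np.array([1.0, -1.0, 0.5])

# 0.2*1.0 + 0.8*(-1.0) + 0.5*0.5, i.e. approximately -0.35
result = analog_mac(light, weights)
print(result)
```

The key point is that the multiplication (light times responsivity) and the accumulation (current summation) both happen in the analog domain, before any digitization.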
To overcome this obstacle, we have developed a CMOS-compatible, electrostatically doped silicon photodiode array, which can be fabricated en masse at the wafer scale. Instead of abutting chemically doped p and n regions to form diode junctions, we form metallic contacts on an intrinsic silicon wafer. By applying bias voltages to dedicated gate contacts, we can form electrically programmable p-i-n (or n-i-p) regions. These dynamically reversible photodiodes can perform analog, optoelectronic MAC operations in the same manner as described above, thereby concurrently filtering images as they are captured. In sum, this allows for the first stage of vision processing to be moved to the front-end sensor. We first evaluate the operational uniformity of thousands of measured photodiodes on a single wafer and confirm their re-programmability and stability; multiphysics simulation further confirms the operating principle. Then, in our highlight demonstration, we show how interconnected 3 × 3 arrays of these devices can apply different image filters to incoming optical images, and also detect moving features.
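The 3 × 3 array demonstration can be understood as a convolution: each interconnected patch of pixels is programmed with a kernel of responsivities, and its summed photocurrent yields one filtered output value. The sketch below models this digitally; the edge-detection kernel and test image are illustrative assumptions, not taken from the article:

```python
import numpy as np

def in_sensor_filter(image, kernel):
    """Valid-mode 2D correlation: each output value corresponds to one
    analog MAC over a 3x3 patch of incident light, with kernel entries
    standing in for the programmed pixel responsivities."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Illustrative Laplacian-style kernel: responds to edges, zero on
# uniform illumination
edge_kernel = np.array([[-1, -1, -1],
                        [-1,  8, -1],
                        [-1, -1, -1]])

# Toy scene with a vertical bright edge
image = np.zeros((5, 5))
image[:, 2:] = 1.0

print(in_sensor_filter(image, edge_kernel))
```

Because the weights are set by gate voltages rather than fixed doping, the same array can be reprogrammed on the fly between different kernels, which is also how frame-to-frame changes (moving features) can be picked out.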
Looking ahead, we foresee the use of CMOS-compatible, in-sensor processing as especially useful in not only machine vision and edge-computing applications, but also in bio-inspired applications, wherein early information processing allows for the co-location of memory and compute units, like in the brain. The next step toward realizing this is to increase device density and integrate the arrays with CMOS electronics. By replacing the standard pixels in CMOS image sensors with programmable ones that can intelligently trim out unneeded data, imaging devices could be made more efficient in both energy and bandwidth to address the demands of the next generation of sensory applications.
If you are interested in reading more about our work, please refer to the full article in Nature Electronics: "In-sensor optoelectronic computing using electrostatically doped silicon".