Monday 17 January 2022
IS&T Welcome & PLENARY: Quanta Image Sensors: Counting Photons Is the New Game in Town
07:00 – 08:10
The Quanta Image Sensor (QIS) was conceived as a different kind of image sensor—one that counts photoelectrons one at a time using millions or billions of specialized pixels read out at high frame rate, with computational imaging used to create gray-scale images. QIS devices have been implemented in a CMOS image sensor (CIS) baseline room-temperature technology without using avalanche multiplication, and also with SPAD arrays. This plenary details the QIS concept, how it has been implemented in CIS and in SPADs, and what the major differences are. Applications that can be disrupted or enabled by this technology are also discussed, including smartphones, where CIS-QIS technology could be employed within just a few years.
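As a rough illustration of the computational-imaging step described above, the sketch below sums stacks of single-bit jot read-outs over small space-time neighborhoods to form a gray-scale image. It is a minimal NumPy sketch under assumed dimensions (256 binary frames of 512 x 512 jots, aggregated in 16 x 4 x 4 cubicles); none of these numbers come from the talk.

import numpy as np

# Illustrative stack of binary "bit planes": 256 frames of 512 x 512 jots,
# each element 1 if the jot detected at least one photoelectron, else 0.
rng = np.random.default_rng(0)
bit_planes = (rng.random((256, 512, 512)) < 0.2).astype(np.uint8)

# Sum each space-time cubicle of 16 frames x 4 x 4 jots into one pixel count.
T, H, W = 16, 4, 4
frames, rows, cols = bit_planes.shape
counts = bit_planes.reshape(frames // T, T, rows // H, H, cols // W, W).sum(axis=(1, 3, 5))

# Scale photon counts to an 8-bit gray-scale image per aggregated frame.
gray8 = (255 * counts / (T * H * W)).astype(np.uint8)
print(gray8.shape)  # (16, 128, 128)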
Eric R. Fossum, Dartmouth College (United States)
Eric R. Fossum is best known for the invention of the CMOS image sensor “camera-on-a-chip” used in billions of cameras. He is a solid-state image sensor device physicist and engineer, and his career has included academic and government research and entrepreneurial leadership. At Dartmouth he is a professor of engineering and vice provost for entrepreneurship and technology transfer. Along with three others, Fossum received the 2017 Queen Elizabeth Prize for Engineering, considered by many the Nobel Prize of engineering, from HRH Prince Charles “for the creation of digital imaging sensors.” He was inducted into the National Inventors Hall of Fame and elected to the National Academy of Engineering, among other honors including a recent Emmy Award. He has published more than 300 technical papers and holds more than 175 US patents. He co-founded several startups and co-founded the International Image Sensor Society (IISS), serving as its first president. He is a Fellow of IEEE and OSA.
08:10 – 08:40 EI 2022 Welcome Reception
KEYNOTE: Vision-based Navigation
Session Chairs: Patrick Denny, Valeo Vision Systems (Ireland) and Peter van Beek, Intel Corporation (United States)
08:40 – 09:45
Green Room
08:40
Conference Introduction
08:45 AVM-100
KEYNOTE: Deep drone navigation and advances in vision-based navigation [PRESENTATION-ONLY], Matthias Müller, Embodied AI Lab at Intel (Germany)
This talk will be divided into two parts. In the first part, I will present our recent line of work on deep drone navigation in collaboration with the University of Zurich. We have developed vision-based navigation algorithms that can be trained entirely in simulation via privileged learning and then transferred to a real drone that performs acrobatic maneuvers or flies through complex indoor and outdoor environments at high speeds. This is achieved by using appropriate abstractions of the visual input and relying on an end-to-end pipeline instead of a modular system. Our approach works with only onboard sensing and computation. In the second part, I will present some interesting advances in graphics, computer vision and robotics from our lab with an outlook of their application to vision-based navigation.
Matthias Müller holds a BSc in electrical engineering with a minor in mathematics from Texas A&M University. Early in his career, he worked at P+Z Engineering as an electrical engineer developing mild-hybrid electric machines for BMW. He later obtained an MSc and PhD in electrical engineering from KAUST, focusing on persistent aerial tracking and sim-to-real transfer for autonomous navigation. Müller has contributed to more than 15 publications in top-tier conferences and journals such as CVPR, ECCV, ICCV, ICML, PAMI, Science Robotics, RSS, CoRL, ICRA, and IROS. He has extensive experience in object tracking and autonomous navigation of embodied agents such as cars and UAVs. He was recognized as an outstanding reviewer for CVPR’18 and won the best paper award at the ECCV’18 workshop UAVision.
09:25 AVM-101
Spatial precision and recall indices to assess the performance of instance segmentation algorithms, Mattis Brummel, Patrick Müller, and Alexander Braun, Düsseldorf University of Applied Sciences (Germany)
Since it is essential for computer vision systems to perform reliably in safety-critical applications such as autonomous vehicles, there is a need to evaluate their robustness to image perturbations. Optical aberrations of the imaging system are introduced both by design and due to production tolerances. Aberrations of a camera system are always spatially variable over the field of view, and may thus influence the performance of computer vision systems depending on the degree of local aberration. The goal is therefore to evaluate the performance of computer vision systems under optical aberration effects (e.g., defocus) while taking the spatial domain into account. In this work, large-scale autonomous driving datasets are degraded by a parameterized optical model to simulate driving scenes under physically realistic defocus. Comparing standard evaluation metrics with the Spatial Recall Index (SRI) and a novel Spatial Precision Index (SPI), the performance of computer vision systems on these degraded datasets is further compared with the optical performance of the applied optical model. A correlation is observed between the spatially varying optical performance and the spatial performance of instance segmentation systems.
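The exact SRI and SPI definitions are given in the paper; as a hedged illustration of the underlying idea only, the sketch below bins ground-truth instances over an image-plane grid and reports per-cell recall. The function name, grid size, and matching input are assumptions for illustration, not the authors' formulation.

import numpy as np

def spatial_recall_map(gt_centers, matched, img_w, img_h, grid=(8, 8)):
    """Fraction of ground-truth instances recovered, binned over the image
    plane (indexed as [x_bin, y_bin]); NaN where a cell has no ground truth."""
    gx = np.clip((gt_centers[:, 0] / img_w * grid[0]).astype(int), 0, grid[0] - 1)
    gy = np.clip((gt_centers[:, 1] / img_h * grid[1]).astype(int), 0, grid[1] - 1)
    total = np.zeros(grid)
    hits = np.zeros(grid)
    np.add.at(total, (gx, gy), 1.0)
    np.add.at(hits, (gx, gy), matched.astype(float))
    with np.errstate(invalid="ignore"):
        return hits / total

# Example: three ground-truth objects, two of them detected.
centers = np.array([[100.0, 200.0], [800.0, 500.0], [1500.0, 900.0]])
found = np.array([True, True, False])
print(spatial_recall_map(centers, found, img_w=1920, img_h=1080))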
Quality Metrics for Automated Vehicles
Session Chairs: Patrick Denny, Valeo Vision Systems (Ireland) and Robin Jenkin, NVIDIA Corporation (United States)
10:10 – 11:30
Green Room
10:10 AVM-107
IEEE P2020 Automotive Image Quality Working Group [PRESENTATION-ONLY], Sara Sargent, Independent (United States)
The P2020 working group is tasked with forming standards for the image quality of devices used in automotive applications. New vehicles have several onboard cameras currently used in both passive and active applications, from backup cameras for human users to L2 lane keeping. In the future, companies intend to rely on cameras for L4 driving applications, which require no human user to be present. While other image quality standards exist, none comprehensively cover these safety-critical applications. The roadway environment presents combinations of unique challenges such as illumination, temperature ranges, harsh weather, and debris. The industry has complex hardware to choose from, including multi-camera systems, fisheye lenses, and high dynamic range sensors. The result of failure for any of these cases or any of these diverse hardware systems can be death on the roadway. P2020 aims to publish an international standard defining the relevant set of metrics for assessing automotive image quality, breaking the investigation into seven image quality factors: Contrast Detection Probability, Resolution, LED Flicker, Geometric Calibration Verification, Noise, Dynamic Range, and Flare. P2020 targets publication in 2022, and we are looking for those interested in being involved as individual contributors. Contact: [email protected]
10:30 AVM-108
A review of IEEE P2020 flicker metrics, Brian Deegan, Valeo Vision Systems (Ireland)
In this paper, we review the LED flicker metrics as defined by the IEEE P2020 working group. The goal of these metrics is to quantify the flicker behaviour of a camera system, to enable engineers to quantify flicker mitigation, and to identify and explore challenging flicker use cases and system limitations. In brief, the Flicker Modulation Index quantifies the modulation of a flickering light source and is particularly useful for quantifying banding effects in rolling-shutter cameras. The Flicker Detection Index quantifies the ability of a camera system to distinguish a flickering light source from the background signal level. The Modulation Mitigation Probability quantifies the ability of a camera system to mitigate the modulation of a flickering light source. This paper explores various flicker use cases, shows how the IEEE P2020 metrics can be used to quantify camera system performance in these use cases, and discusses measurement and reporting considerations for lab-based flicker assessment.
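For context only, the classic modulation-depth formula (Imax - Imin) / (Imax + Imin) is sketched below as a stand-in for the formal flicker metrics; the precise IEEE P2020 definitions should be taken from the standard itself, and the sampled waveform here is a toy signal.

import numpy as np

def modulation_index(samples):
    """Classic modulation depth (Imax - Imin) / (Imax + Imin) of a sampled
    light-source signal; a stand-in for the formal P2020 flicker metrics."""
    samples = np.asarray(samples, dtype=float)
    i_max, i_min = samples.max(), samples.min()
    return (i_max - i_min) / (i_max + i_min)

# Example: pixel values of a PWM light source sampled over 100 frames.
t = np.arange(100)
pixel_values = np.where(np.sin(2 * np.pi * 0.09 * t) >= 0, 200.0, 40.0)
print(modulation_index(pixel_values))  # (200 - 40) / (200 + 40) = 0.667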
10:50 AVM-109
A review of IEEE P2020 noise metrics, Orit Skorka1 and Paul Romanczyk2; 1ON Semiconductor Corporation and 2Imatest LLC (United States)
The IEEE P2020 standard addresses fundamental image quality attributes that are specifically relevant to cameras in automotive imaging systems. The Noise standard in IEEE P2020 is mostly based on existing standards on noise in digital cameras. However, it adjusts test conditions and procedures to make them more suitable for automotive cameras, accounting for the use of fisheye lenses, 16- to 32-bit data formats when operating in high dynamic range (HDR) mode, HDR scenes, extended temperature ranges, and near-infrared imaging. The work presents methodology, procedures, and experimental results that demonstrate extraction of camera characteristics from videos of HDR and other test charts recorded in raw format, including dark and photo signals, temporal noise, fixed-pattern noise, signal-to-noise ratio curves, the photon transfer curve, conversion factor, and effective full-well capacity. The work also presents methodology and experimental results for characterization of camera noise in the dark array and of signal falloff.
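As a hedged illustration of photon-transfer-style analysis mentioned above, the sketch below estimates mean signal and temporal noise from flat-field frame pairs and fits the variance-versus-mean slope; the synthetic data, toy gain value, and exposure levels are illustrative assumptions, not measurements from the paper.

import numpy as np

def photon_transfer_point(frame_a, frame_b):
    """Mean signal and temporal-noise variance from a flat-field frame pair.
    Differencing two frames of the same scene cancels fixed-pattern noise;
    half the difference variance is the temporal variance."""
    frame_a = frame_a.astype(np.float64)
    frame_b = frame_b.astype(np.float64)
    mean_signal = 0.5 * (frame_a.mean() + frame_b.mean())
    temporal_var = np.var(frame_a - frame_b) / 2.0
    return mean_signal, temporal_var

# Sweep exposure levels and fit variance vs. mean: the slope approximates
# the conversion gain K in DN per electron (synthetic data, toy gain 0.25).
rng = np.random.default_rng(1)
means, variances = [], []
for electrons in [100, 400, 1600, 6400]:
    a = rng.poisson(electrons, (100, 100)) * 0.25
    b = rng.poisson(electrons, (100, 100)) * 0.25
    m, v = photon_transfer_point(a, b)
    means.append(m)
    variances.append(v)
print(f"estimated conversion gain ~ {np.polyfit(means, variances, 1)[0]:.3f} DN/e-")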
11:10 AVM-110
Paving the way for certified performance: Quality assessment and rating of simulation solutions for ADAS and autonomous driving, Marius Dupuis, M. Dupuis Engineering Services (Germany)
Simulation plays a key role in the development of Advanced Driver Assistance Systems (ADAS) and Autonomous Driving (AD) stacks. A growing market provides offerings at unprecedented scale and with a large variety of features. Transparency is often hard to come by, and sorting marketing claims from product performance facts is a challenge. This market attracts new players, on both the user and vendor side, which will lead to further diversification. The evolution of standards, regulatory requirements, validation schemes, etc. will add to the list of criteria that may be relevant for identifying the best-fit solution for a given task. With the method presented in this paper, we aim to lay the foundation for a structured and broadly applicable approach to assessing the quality and fitness of the market’s offerings with respect to commonly agreed definitions of use cases and performance requirements. We introduce the first step, a product rating system, and sketch the future steps toward a unified certification scheme that may be applied across the ADAS/AD simulation market.
Autonomous Driving and Robotics Systems
Session Chairs: Robin Jenkin, NVIDIA Corporation (United States) and Peter van Beek, Intel Corporation (United States)
15:00 – 16:00
Green Room
15:00 AVM-116
Efficient in-cabin monitoring solution using TI TDA2Px SOCs, Mayank Mangla1, Mihir Mody2, Kedar Chitnis2, Piyali Goswami2, Tarkesh Pande1, Shashank Dabral1, Shyam Jagannathan2, Stefan Haas3, Gang Hua1, Hrushikesh Garud2, Kumar Desappan2, Prithvi Shankar2, and Niraj Nandan1; 1Texas Instruments (United States), 2Texas Instruments India Ltd. (India), and 3Texas Instruments GmbH (Germany)
In-Cabin Monitoring Systems (ICMS) are functional safety systems designed to monitor the driver and/or passengers inside an automobile, and they are seeing increasing use with the advent of automated driving. DMS (Driver Monitoring System) and OMS (Occupant Monitoring System) are two variants of ICMS. A DMS focuses solely on the driver to monitor fatigue, attention, and health. An OMS monitors all the occupants, including unattended children. Besides safety, ICMS can also augment comfort and security through driver identification and personalization. In-cabin monitoring brings a unique set of challenges from an imaging perspective in terms of higher analytics needs, smaller form factor, low-light vision, and color accuracy. This paper discusses these challenges and provides an efficient implementation on Texas Instruments' TDA2Px automotive processor. The paper also details a novel implementation of the RGB+IR sensor format processing commonly used in these systems, enabling a premium ICMS on the TDA2Px processor.
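As a loose illustration of RGB+IR processing (not the TDA2Px implementation described in the paper), the sketch below separates a four-channel RGB-IR input into a color image and an IR image by naive IR subtraction; the channel layout and weighting are assumptions.

import numpy as np

def rgbir_to_rgb_and_ir(raw_rgbir, ir_weight=1.0):
    """Naive RGB-IR separation: subtract a weighted IR channel from each color
    channel to restore color fidelity while keeping IR for dark-cabin viewing.
    Illustrative only; real pipelines use calibrated per-channel weights and
    interpolation of the RGB-IR mosaic."""
    r, g, b, ir = (raw_rgbir[..., i].astype(np.float32) for i in range(4))
    rgb = np.stack([r - ir_weight * ir, g - ir_weight * ir, b - ir_weight * ir], axis=-1)
    return np.clip(rgb, 0.0, None), ir

raw = np.random.default_rng(4).integers(0, 1024, (4, 4, 4)).astype(np.uint16)
rgb, ir = rgbir_to_rgb_and_ir(raw, ir_weight=0.9)
print(rgb.shape, ir.shape)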
15:20 AVM-117
Sensor-aware frontier exploration and mapping with application to thermal mapping of building interiors, Zixian Zang, Haotian Shen, Lizhi Yang, and Avideh Zakhor, University of California, Berkeley (United States)
The combination of simultaneous localization and mapping (SLAM) and frontier exploration enables a robot to traverse and map an unknown area autonomously. Most prior autonomous SLAM solutions utilize information only from depth-sensing devices. However, in situations where the main goal is to collect data from auxiliary sensors such as a thermal camera, existing approaches require two passes: one pass to create a map of the environment and another to collect the auxiliary data, which is time-consuming and energy-inefficient. We propose a sensor-aware frontier exploration algorithm that enables the robot to perform map construction and auxiliary data collection in one pass. Specifically, our method uses a real-time ray-tracing technique to construct a map that encodes unvisited locations from the perspective of the auxiliary sensors rather than the depth sensors; this encourages the robot to fully explore those areas to complete the data collection and map-making in one pass. Our proposed exploration framework is deployed on a LoCoBot tasked with collecting thermal images from building envelopes. We validate our approach with experiments in both multi-room commercial buildings and cluttered residential buildings. Using a metric that evaluates the coverage of sensor data, our method significantly outperforms a baseline method with a naive SLAM algorithm.
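The paper's sensor-aware exploration relies on real-time ray tracing; purely as an illustrative toy (not the authors' algorithm), the sketch below scores candidate frontiers by how much auxiliary-sensor coverage is still missing around them, discounted by travel cost. All names and constants are invented for the example.

import numpy as np

def pick_frontier(frontiers, aux_unseen_mask, reach_cost):
    """Toy sensor-aware frontier scoring: prefer frontiers whose neighborhood
    still lacks auxiliary-sensor (e.g. thermal) coverage, discounted by the
    cost of reaching them. Purely illustrative; not the paper's algorithm."""
    scores = []
    for (r, c), cost in zip(frontiers, reach_cost):
        window = aux_unseen_mask[max(r - 5, 0):r + 6, max(c - 5, 0):c + 6]
        scores.append(window.sum() - 0.5 * cost)
    return frontiers[int(np.argmax(scores))]

unseen = np.ones((50, 50), dtype=bool)        # nothing thermally imaged yet
candidates = [(10, 10), (40, 25)]
print(pick_frontier(candidates, unseen, reach_cost=[4.0, 12.0]))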
15:40 AVM-118
Open source deep learning inference libraries for autonomous driving systems, Kumar Desappan1, Anand Pathak1, Pramod Swami1, Mihir Mody1, Yuan Zhao1, Paula Carrillo1, Praveen Eppa1, and Jianzhong Xu2; 1Texas Instruments India Ltd. (India) and 2Texas Instruments China (China)
Deep learning (DL)-based algorithms are used in many integral modules of ADAS and automated driving systems. Camera-based perception, driver monitoring, driving policy, and radar and lidar perception are a few examples built using DL algorithms in such systems. Traditionally, custom software provided by silicon vendors is used to deploy these DL algorithms on devices. This custom software is highly optimized for a limited set of supported features, but it is not flexible enough to quickly try out different deep learning model architectures. In this paper we propose using open-source deep learning inference frameworks to quickly deploy any model architecture without any performance/latency impact. We have implemented the proposed solution with three open-source inference frameworks (TensorFlow Lite, TVM/Neo-AI-DLR, and ONNX Runtime) on Linux running on Arm.
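One of the open-source frameworks named above, ONNX Runtime, can be exercised with a few lines as sketched below. The model file name and input shape are placeholders, and deployment on an embedded device would typically go through a device-specific execution provider or delegate, which is not shown here.

import numpy as np
import onnxruntime as ort

# Hypothetical ONNX model exported from any training framework; file name,
# input shape, and provider list are placeholders, not from the paper.
session = ort.InferenceSession("detector.onnx", providers=["CPUExecutionProvider"])

input_name = session.get_inputs()[0].name
dummy_frame = np.random.rand(1, 3, 384, 768).astype(np.float32)

outputs = session.run(None, {input_name: dummy_frame})
for tensor, meta in zip(outputs, session.get_outputs()):
    print(meta.name, tensor.shape)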
3D and Depth Perception
Session Chairs: Robin Jenkin, NVIDIA Corporation (United States) and Peter van Beek, Intel Corporation (United States)
16:15 – 17:15
Green Room
16:15 AVM-125
Point cloud processing technologies and standards (Invited) [PRESENTATION-ONLY], Dong Tian, InterDigital (United States)
Point clouds based on recent sensing technologies are becoming prevalent in many applications, e.g., VR/AR and autonomous driving, because they can provide an accurate and full 3D representation of the geometric information of our surroundings. However, because point samples are typically sparse and may appear at irregular positions, point cloud processing tasks, including low-level processing and high-level understanding, are often challenging. In this talk, we provide an overview of point cloud processing technologies, with a focus on AI-based point cloud technologies and emerging point cloud compression standards.
16:55 AVM-126
Efficient high-dynamic-range depth map processing with reduced precision neural net accelerator, Peter van Beek, Chyuan-tyng Wu, and Avi Kalderon, Intel Corporation (United States)
Depth sensing technology has become important in a number of consumer, robotics, and automated driving applications. However, the depth maps generated by such technologies today still suffer from limited resolution, sparse measurements, and noise, and require significant post-processing. Depth map data often has higher dynamic range than common 8-bit image data and may be represented as 16-bit values. Deep convolutional neural nets can be used to perform denoising, interpolation and completion of depth maps; however, in practical applications there is a need to enable efficient low-power inference with 8-bit precision. In this paper, we explore methods to process high-dynamic-range depth data using neural net inference engines with 8-bit precision. We propose a simple technique that attempts to retain signal-to-noise ratio in the post-processed data as much as possible and can be applied in combination with most convolutional network models. Our initial results using depth data from a consumer camera device show promise, achieving inference results with 8-bit precision that have similar quality to floating-point processing.
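The paper's specific technique is not reproduced here; as one plausible way to fit 16-bit depth into an 8-bit inference path, the sketch below applies square-root companding before quantization and expands afterwards. This mapping is an assumption for illustration, not the authors' method.

import numpy as np

def compand_depth_u16_to_u8(depth_u16):
    """Square-root companding of 16-bit depth into 8 bits; one plausible
    mapping for an 8-bit inference path, NOT necessarily the paper's method."""
    d = depth_u16.astype(np.float32)
    return np.clip(np.sqrt(d) * 255.0 / np.sqrt(65535.0), 0, 255).astype(np.uint8)

def expand_depth_u8_to_u16(depth_u8):
    """Approximate inverse of the companding above."""
    d = depth_u8.astype(np.float32) * np.sqrt(65535.0) / 255.0
    return np.clip(d * d, 0, 65535).astype(np.uint16)

depth = (np.random.default_rng(2).random((4, 4)) * 65535).astype(np.uint16)
roundtrip = expand_depth_u8_to_u16(compand_depth_u16_to_u8(depth))
print(np.abs(depth.astype(np.int32) - roundtrip.astype(np.int32)).max())  # quantization error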
Tuesday 18 January 2022
KEYNOTE: Deep Learning
Session Chairs: Patrick Denny, Valeo Vision Systems (Ireland) and Peter van Beek, Intel Corporation (United States)
07:00 – 08:00
Green Room
AVM-134
KEYNOTE: Deep learning for image and video restoration/super-resolution [PRESENTATION-ONLY], Ahmet Murat Tekalp, Koç University (Turkey)
Recent advances in neural architectures and training methods have led to significant improvements in the performance of learned image/video restoration and SR. We can consider learned image restoration and SR as learning either a mapping from the space of degraded images to ideal images, based on the universal approximation theorem, or a generative model that captures the probability distribution of ideal images. An important benefit of the data-driven deep learning approach is that neural models can be optimized for any differentiable loss function, including visual perceptual loss functions, leading to perceptual video restoration and SR, which cannot be easily handled by traditional model-based approaches. I will discuss loss functions and evaluation criteria for image/video restoration and SR, including fidelity and perceptual criteria and the relation between them, and briefly review the perception versus fidelity (distortion) trade-off. I will then discuss practical problems in applying supervised training to real-life restoration and SR, including overfitting image priors and overfitting the degradation model, and some possible ways to deal with these problems.
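As a generic illustration of combining fidelity and perceptual terms discussed in the talk, the sketch below mixes an L1 loss with a feature-space loss computed by any frozen network supplied by the caller; the weights and the choice of L1 are arbitrary, not the talk's recommendations.

import torch
import torch.nn.functional as F

def restoration_loss(pred, target, feature_extractor=None, w_fid=1.0, w_perc=0.1):
    """Weighted sum of a fidelity term (L1) and an optional perceptual term
    computed in the feature space of a frozen network supplied by the caller.
    The weights and loss choices are arbitrary illustrations."""
    loss = w_fid * F.l1_loss(pred, target)
    if feature_extractor is not None:
        with torch.no_grad():
            feat_target = feature_extractor(target)
        loss = loss + w_perc * F.l1_loss(feature_extractor(pred), feat_target)
    return loss

# Example with random tensors and no perceptual network.
pred = torch.rand(1, 3, 64, 64, requires_grad=True)
target = torch.rand(1, 3, 64, 64)
restoration_loss(pred, target).backward()
print(pred.grad.abs().mean())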
Ahmet Murat Tekalp received BS degrees in electrical engineering and mathematics from Bogazici University (1980) with high honors, and his MS and PhD in electrical, computer, and systems engineering from Rensselaer Polytechnic Institute (RPI), Troy, New York (1982 and 1984, respectively). He was with Eastman Kodak Company, Rochester, New York, from December 1984 to June 1987, and with the University of Rochester, Rochester, New York, from July 1987 to June 2005, where he was promoted to Distinguished University Professor. Since June 2001, he has been a Professor at Koç University, Istanbul, Turkey, where he served as Dean of Engineering from 2010 to 2013. His research interests are in digital image and video processing, including video compression and streaming, motion-compensated filtering, super-resolution, video segmentation, object tracking, content-based video analysis and summarization, 3D video processing, deep learning for image and video processing, video streaming and real-time video communications services, and software-defined networking. Prof. Tekalp is a Fellow of IEEE and a member of the Turkish Academy of Sciences and Academia Europaea. He was named a Distinguished Lecturer by the IEEE Signal Processing Society in 1998 and awarded a Fulbright Senior Scholarship in 1999. He received the TUBITAK Science Award (the highest scientific award in Turkey) in 2004. The new edition of his Prentice Hall book Digital Video Processing (1995) was published in June 2015. Dr. Tekalp holds eight US patents. His group contributed technology to the ISO/IEC MPEG-4 and MPEG-7 standards. He participates in several European Framework projects and is also a project evaluator for the European Commission and a panel member for the European Research Council.
Deep Learning
Session Chairs: Patrick Denny, Valeo Vision Systems (Ireland) and Peter van Beek, Intel Corporation (United States)
08:30 – 09:30
Green Room
08:30 AVM-146
Adversarial attacks on multi-task visual perception for autonomous driving (JIST-first), Varun Ravi Kumar1, Senthil Yogamani2, Ibrahim Sobh3, and Ahmed Hamed3; 1Valeo DAR Germany (Germany), 2Valeo Ireland (Ireland), and 3Valeo R&D Egypt (Egypt)
In recent years, deep neural networks (DNNs) have achieved impressive success in various applications, including autonomous driving perception tasks. On the other hand, current deep neural networks are easily fooled by adversarial attacks. This vulnerability raises significant concerns, particularly in safety-critical applications. As a result, research into attacking and defending DNNs has gained much attention. In this work, detailed adversarial attacks are applied to a diverse multi-task visual perception deep network across distance estimation, semantic segmentation, motion detection, and object detection. The experiments consider both white-box and black-box attacks for targeted and untargeted cases, attacking one task and inspecting the effect on all the others, in addition to inspecting the effect of applying a simple defense method. We conclude the paper by comparing and discussing the experimental results and proposing insights and future work. Visualizations of the attacks are available at https://drive.google.com/file/d/1NKhCL2uC_SKam3H05SqjKNDE_zgvwQS-/view?usp=sharing
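The paper evaluates a range of white-box and black-box attacks; as a representative white-box example only, the sketch below implements the classic one-step FGSM perturbation against a toy model. The epsilon value and the toy segmentation head are illustrative assumptions, not the paper's setup.

import torch

def fgsm_attack(model, images, targets, loss_fn, eps=2.0 / 255.0):
    """One-step fast gradient sign attack: perturb the input in the direction
    that increases the task loss, bounded by eps per pixel."""
    images = images.clone().detach().requires_grad_(True)
    loss = loss_fn(model(images), targets)
    loss.backward()
    adversarial = images + eps * images.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()

# Toy example: a 1x1-conv "segmentation head" with 5 classes (illustrative only).
model = torch.nn.Conv2d(3, 5, kernel_size=1)
x = torch.rand(1, 3, 64, 64)
y = torch.randint(0, 5, (1, 64, 64))
x_adv = fgsm_attack(model, x, y, torch.nn.CrossEntropyLoss())
print((x_adv - x).abs().max())  # bounded by eps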
08:50 AVM-147
FisheyePixPro: Self-supervised pretraining using Fisheye images for semantic segmentation, Ramchandra Cheke1, Ganesh Sistu2, and Senthil Yogamani2; 1University of Limerick and 2Valeo Vision Systems (Ireland)
Self-supervised learning has been an active area of research for the past few years and has yielded significant performance improvements on computer vision tasks. However, limited work has been done in the field of fisheye images. In this paper, we explore the PixPro model on the WoodScape dataset. The main aim of this work is to improve segmentation performance by leveraging contrastive learning applied at the pixel level to images with geometric distortion. We compare the results with a supervised ImageNet pre-trained model and find that our training strategy achieves similar performance: the FisheyePixPro model achieves a 65.52 mIoU score on the WoodScape dataset, while the supervised ImageNet pre-trained model achieves 66.2 mIoU. Our results show that self-supervised learning can eliminate the need for an expensive, large-scale supervised dataset; instead, a large un-annotated dataset in the same domain can be used.
09:10 AVM-148
Multi-lane modelling using convolutional neural networks and conditional random fields, Ganesh Babu1, Ganesh Sistu2, and Senthil Yogamani2; 1University College Dublin and 2Valeo Vision Systems (Ireland)
Over the years, autonomous driving has evolved by leaps and bounds, owing in large part to the involvement of deep learning in computer vision. Even in modern autonomous driving, multi-lane detection and projection remains a challenge that needs to be solved further. Several approaches have been proposed earlier, involving conventional thresholding techniques together with graphical models, or RANSAC and polynomial fitting. More recently, direct regression using deep learning models has also been explored. In this paper, we propose a blend that uses a deep learning model for initial lane detection at the pixel level and conditional random fields for modelling the lanes. The method provides a 15% improvement in lane detection and projection over conventional models.
KEYNOTE: Sensing for Autonomous Driving
Session Chairs: Patrick Denny, Valeo Vision Systems (Ireland) and Hari Tagat, Casix (United States)
10:00 – 11:00
Green Room
This session is hosted jointly by the Autonomous Vehicles and Machines 2022 and Imaging Sensors and Systems 2022 conferences.
10:00 ISS-160
KEYNOTE: Recent developments in GatedVision imaging - Seeing the unseen [PRESENTATION-ONLY], Ofer David, BrightWay Vision (Israel)
Imaging is the basic building block of automotive autonomous driving. Any computer vision system requires a good image as input under all driving conditions. GatedVision provides an extra layer on top of the regular RGB/RCCB sensor to augment these sensors at nighttime and in harsh weather conditions. GatedVision images captured in darkness and in different weather conditions will be shared. Imagine detecting a small target lying on the road with the same reflectivity as the background, meaning no contrast: GatedVision can manipulate the way an image is captured so that contrast can be extracted. Additional imaging capabilities of GatedVision will also be presented.
Ofer David has been BrightWay Vision CEO since 2010. David has more than 20 years’ experience in the area of active imaging systems and laser detection, and has produced various publications and patents. Other solutions David is involved with include fog-penetrating day/night imaging systems and visibility measurement systems. David received his BSc and MSc from the Technion – Israel Institute of Technology and his PhD in electro-optics from Ben-Gurion University.
10:40 AVM-161
Potentials of combined visible light and near infrared imaging for driving automation, Korbinian Weikl1,2, Damien Schroeder1, and Walter Stechele2; 1Bayerische Motoren Werke AG and 2Technical University of Munich (Germany)
Automated driving functions of the highest levels of automation require camera and computer vision (CV) systems which enable them to operate at a safety level that exceeds human driving. To date, the information content of the cameras’ image data does not suffice to reach those performance levels. One degree of freedom to increase the image information content is to extend the spectral range of the cameras. Near infrared (NIR) imaging on CMOS imagers is a promising candidate technology in this research direction. To assess the potentials of combined visible light (VIS) and NIR imaging for the driving automation application, we extend our camera simulation and optimization framework for camera models that include a VIS-NIR CMOS imager. We also adapt our image processing and CV models to process the additional NIR image information. We evaluate the vision system performance for our VIS-NIR camera models, in reference to an automotive VIS-only camera model. We use a data set of synthetic automotive scenes, and a neural network-based object detection system as benchmark CV. Our results give an indication for the performance increase potentials of combined VIS-NIR imaging in driving automation and highlight critical scenarios that can be perceived correctly using VIS-NIR imaging.
LIDAR and Sensing
Session Chairs: Robin Jenkin, NVIDIA Corporation (United States) and Min-Woong Seo, Samsung Electronics (Republic of Korea)
15:00 – 16:00
Red Room
This session is hosted jointly by the Autonomous Vehicles and Machines 2022 and Imaging Sensors and Systems 2022 conferences.
15:00 AVM-172
Real-time LIDAR imaging by solid-state single chip beam scanner, Jisan Lee, Kyunghyun Son, Changbum Lee, Inoh Hwang, Bongyong Jang, Eunkyung Lee, Dongshik Shim, Hyunil Byun, Changgyun Shin, Dongjae Shin, Otsuka Tatsuhiro, Yongchul Cho, Kyoungho Ha, and Hyuck Choo, Samsung Electronics Co., Ltd. (Republic of Korea)
We present real-time light detection and ranging (LIDAR) imaging enabled by a single-chip solid-state beam scanner. The beam scanner integrates a fully functional 32-channel optical phased array, 36 optical amplifiers, and a tunable laser with a central wavelength of ~1310 nm, all on a 7.5 x 3 mm^2 single chip fabricated with III-V-on-silicon processes. The phased array is calibrated with a self-evolving genetic algorithm to enable beam forming and steering in two dimensions. Distance measurement is performed with digital signal processing that measures the time of flight (TOF) of pulsed light, using a system consisting of an avalanche photodiode (APD), trans-impedance amplifier (TIA), analog-to-digital converter (ADC), and a processor. The LIDAR module utilizing this system can acquire point-cloud images with 120 x 20 resolution at a speed of 20 frames per second at distances up to 20 meters. This work presents the first demonstration of a chip-scale LIDAR solution without any moving parts or bulky external light source or amplifier, making ultra-low-cost and compact LIDAR technology a reality.
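The ranging principle is direct time of flight; as a one-line reminder (not specific to this chip), the round-trip time converts to distance as sketched below.

C = 299_792_458.0  # speed of light in m/s

def tof_distance(round_trip_seconds):
    """Pulsed-TOF range: the light travels out and back, so divide by two."""
    return C * round_trip_seconds / 2.0

# A ~133 ns round trip corresponds to roughly the 20 m maximum range quoted above.
print(tof_distance(133e-9))  # ~19.9 m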
15:20 ISS-173
A back-illuminated SOI-based 4-tap lock-in pixel with high NIR sensitivity for TOF range image sensors [PRESENTATION-ONLY], Naoki Takada1, Keita Yasutomi1, Hodaka Kawanishi1, Kazuki Tada1, Tatsuya Kobayashi1, Atsushi Yabata2, Hiroki Kasai2, Noriyuki Miura2, Masao Okihara2, and Shoji Kawahito1; 1Shizuoka University and 2LAPIS Semiconductor Co., Ltd. (Japan)
In this study, we present a backside-illuminated (BSI) 4-tap SOI-based lock-in pixel with high near-infrared sensitivity. Owing to a process customized for lock-in pixels, the size of the floating diffusion is greatly reduced using a self-aligned process. The lock-in pixel, including the 4-tap readout circuits, is successfully integrated within a single modulator size of 18 × 18 um. The prototype chip is implemented in 0.2-um SOI technology. The chip demonstrates a high QE of 65% at 950-nm wavelength with 40-ns gate/light pulse modulation. Distances up to 30 m were successfully measured using the prototype chip.
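For background only: a textbook continuous-wave iTOF depth estimate from four tap values is sketched below. The sensor in this paper uses gated pulse modulation, so this formula is general context rather than its actual processing, and the sample values and modulation frequency are invented.

import math

C = 299_792_458.0  # speed of light in m/s

def cw_itof_depth(q0, q90, q180, q270, f_mod):
    """Textbook continuous-wave 4-phase iTOF estimate from four tap charges
    sampled at 0/90/180/270 degrees; shown for context only."""
    phase = math.atan2(q90 - q270, q0 - q180) % (2.0 * math.pi)
    return C * phase / (4.0 * math.pi * f_mod)

print(cw_itof_depth(220.0, 340.0, 180.0, 60.0, f_mod=12.5e6))  # a few metres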
15:40 ISS-174
An 8-tap image sensor using tapped PN-junction diode demodulation pixels for short-pulse time-of-flight measurements [PRESENTATION-ONLY], Ryosuke Miyazawa1, Yuya Shirakawa1, Kamel Mars1, Keita Yasutomi1, Keiichiro Kagawa1, Satoshi Aoyama2, and Shoji Kawahito1; 1Shizuoka University and 2Brookman Technology, Inc. (Japan)
A novel 8-tap short-pulse (SP)-based indirect TOF (iTOF) image sensor is presented. This SP-based iTOF image sensor, using 8-tap pixels with a drain, is suitable for outdoor long-range (>10 m) applications. The designed and implemented image sensor uses a novel multi-tap demodulation pixel based on a Tapped PN-junction Diode (TPD) structure, which uses divided p+ hole-pinning areas of the pinned photodiode (p+/n/p- structure) as multiple electrodes of the multi-tap demodulator for dynamic control of the buried-channel potential. The TPD demodulator can directly modulate the channel potential of the photo-receiving area using multiple p+ electrodes; hence a large photo-receiving area (high pixel fill factor) and high-speed photo-carrier transfer in the channel can be realized, allowing the multi-tap TOF pixel to simultaneously achieve high sensitivity and high demodulation speed. Using eight consecutive time-gating windows, each 10 ns wide, provided by the 8-tap SP-based pixel, 10 m-range TOF measurements under high ambient light have been carried out. The measurement results show that a maximum non-linearity error of 1.32 %FS over the range of 1.0–11.5 m and a depth resolution of at most 16.4 cm are attained under sunlight-level background light.
Wednesday 19 January 2022
IS&T Awards & PLENARY: In situ Mobility for Planetary Exploration: Progress and Challenges
07:00 – 08:15
This year saw exciting milestones in planetary exploration with the successful landing of the Perseverance Mars rover, followed by its operation and the successful technology demonstration of the Ingenuity helicopter, the first heavier-than-air aircraft ever to fly on another planetary body. This plenary highlights new technologies used in this mission, including precision landing for Perseverance, a vision coprocessor, new algorithms for faster rover traverse, and the ingredients of the helicopter. It concludes with a survey of challenges for future planetary mobility systems, particularly for Mars, Earth’s moon, and Saturn’s moon, Titan.
Larry Matthies, Jet Propulsion Laboratory (United States)
Larry Matthies received his PhD in computer science from Carnegie Mellon University (1989), before joining JPL, where he has supervised the Computer Vision Group for 21 years, the past two coordinating internal technology investments in the Mars office. His research interests include 3-D perception, state estimation, terrain classification, and dynamic scene analysis for autonomous navigation of unmanned vehicles on Earth and in space. He has been a principal investigator in many programs involving robot vision and has initiated new technology developments that impacted every US Mars surface mission since 1997, including visual navigation algorithms for rovers, map matching algorithms for precision landers, and autonomous navigation hardware and software architectures for rotorcraft. He is a Fellow of the IEEE and was a joint winner in 2008 of the IEEE’s Robotics and Automation Award for his contributions to robotic space exploration.
EI 2022 Interactive Poster Session
08:20 – 09:20
EI Symposium
Interactive poster session for authors and attendees of all conferences.
Camera Modeling and Performance
Session Chairs: Patrick Denny, Valeo Vision Systems (Ireland) and Peter van Beek, Intel Corporation (United States)
09:30 – 10:30
Green Room
09:30 AVM-214
Original image noise reconstruction for spatially-varying filtered driving scenes, Luis Constantin Wohlers, Patrick Müller, and Alexander Braun, Hochschule Düsseldorf, University of Applied Sciences Düsseldorf (Germany)
Test drives for the development of camera-based automotive algorithms like object detection or instance segmentation are very expensive and time-consuming. Therefore, the re-use of existing databases like COCO or Berkeley Deep Drive by intentionally varying the image quality in a post-processing step promises to save time and money, while giving access to novel image quality properties. One possible variation we investigate is the sharpness of the camera system, by applying spatially varying optical blur models as low-pass filters on the image data. Any such operation significantly changes the amount and distribution of noise, a central property of image quality, which in this context is an undesired side-effect. In this article, a novel method is presented to reconstruct the original camera sensor noise for the filtered image. This is different from denoising. The method estimates the original camera sensor noise using the combination of principal component analysis (PCA) and a variance-stabilizing transformation. The noise is then reconstructed for the filtered image with the PCA applied locally on small image sections, and an inverse variance-stabilizing transformation. Although the resulting noise distribution can slightly deviate from the original, this novel method does not introduce any image artifacts as denoising would do. We present the method as applied to synthetic and real driving scenes at different noise levels and discuss the accuracy of the reconstruction visually and with statistical parameters.
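The paper pairs PCA with a variance-stabilizing transformation; the Anscombe transform shown below is the standard choice for Poisson-dominated sensor noise and is included here only as an illustration (the paper's exact transform and the PCA details are not reproduced).

import numpy as np

def anscombe(x):
    """Anscombe variance-stabilizing transform: maps Poisson-distributed
    counts to approximately unit-variance Gaussian noise."""
    return 2.0 * np.sqrt(np.asarray(x, dtype=np.float64) + 3.0 / 8.0)

def inverse_anscombe(y):
    """Simple algebraic inverse (refined unbiased inverses also exist)."""
    return (np.asarray(y, dtype=np.float64) / 2.0) ** 2 - 3.0 / 8.0

rng = np.random.default_rng(3)
for mean in [10, 100, 1000]:
    counts = rng.poisson(mean, 100_000)
    print(mean, np.var(anscombe(counts)).round(3))  # all close to 1.0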
09:50 AVM-215
Non-RGB color filter options and traffic signal detection capabilities, Eiichi Funatsu, Steve Wang, Jken Vui Kok, Lou Lu, Fred Cheng, and Mario Heid, OmniVision Technologies, Inc. (United States)
Recently, non-RGB image sensors have gained traction in automotive applications for high-sensitivity camera systems. Several color filter combinations have been proposed, such as RCCB, RYYCy, and RCCG. However, we found it difficult to differentiate yellow and red traffic lights in some cases. It was also not very clear how effective those color filter options are for low-light SNR improvement. In this work, we propose a solution for yellow/red traffic light differentiation by shifting the red color filter edge. The differentiation performance was verified by segmentation in color space using a traffic light spectrum database we built. For the SNR comparison between the color filter options, we propose an SNR10-based scheme for an apples-to-apples comparison and discuss the overall pros and cons. The result was also checked with image data using a hyperspectral camera simulation.
10:10 AVM-216
Toward metrological trustworthiness for automated and connected mobility, Paola Iacomussi and Alessandro Schiavi, INRIM (Italy)
The mobility of people and goods is moving into a new era of more automated services based on sensor networks and artificial intelligence. At present, Automated and Connected (A-C) Mobility, which in the broadest sense includes Advanced Driver Assistance Systems (ADAS) and Autonomous Vehicles (AV), is attractive for many practical advantages ranging from safety to traffic flow management, yet it still presents several concerns about the trustworthiness of the sensor networks integrated into vehicles, especially regarding data uncertainty and data fusion approaches. Currently, the trustworthiness of ADAS and A-C functions is assessed with virtual and physical simulation of functions relying on synthetic sensor models, simulated and measured sensor data, and equivalent environmental conditions. All these inputs are considered nominal: no evaluation of the reliability of the sensor and environment models, nor of the traceability and uncertainty of measured and simulated data, is provided. Therefore, sensor outputs, but also simulation reliability, can be questioned, because no specific calibration facilities, procedures, uncertainty evaluations, or traceable input datasets are currently available, starting at the National Metrology Institute (NMI) level. This paper presents an approach to lay the foundation of a metrology of trustworthiness for automated and connected complex sensor systems.