Monday 17 January 2022
IS&T Welcome & PLENARY: Quanta Image Sensors: Counting Photons Is the New Game in Town
07:00 – 08:10
The Quanta Image Sensor (QIS) was conceived as a different kind of image sensor, one that counts photoelectrons one at a time using millions or billions of specialized pixels read out at a high frame rate, with computational imaging used to create gray scale images. QIS devices have been implemented in a baseline room-temperature CMOS image sensor (CIS) technology without using avalanche multiplication, and also with SPAD arrays. This plenary details the QIS concept, how it has been implemented in CIS and in SPADs, and what the major differences are. Applications that can be disrupted or enabled by this technology are also discussed, including smartphones, where CIS-QIS technology could even be employed in just a few years.
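The counting-and-summing idea can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the speaker's implementation: it assumes an idealised single-bit pixel, Poisson photoelectron arrivals, and no read noise, and the function names are hypothetical. The pixel reports 1 whenever it collects at least one photoelectron in a frame, and the fraction of ones over many frames is inverted through P(1) = 1 - exp(-H) to estimate the mean exposure H.

```python
import math
import random

def sample_poisson(mean, rng):
    """Draw a Poisson-distributed count (Knuth's method, suitable for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_bit_frames(mean_exposure, n_frames, rng):
    """Return n_frames binary readings of one idealised single-bit pixel (assumption)."""
    return [1 if sample_poisson(mean_exposure, rng) >= 1 else 0
            for _ in range(n_frames)]

def estimate_exposure(bits):
    """Invert P(bit = 1) = 1 - exp(-H) to estimate mean photoelectrons per frame."""
    fraction_ones = sum(bits) / len(bits)
    # Clamp so that an all-ones sequence does not produce log(0).
    fraction_ones = min(fraction_ones, 1.0 - 1.0 / (2 * len(bits)))
    return -math.log(1.0 - fraction_ones)

rng = random.Random(0)
bits = simulate_bit_frames(mean_exposure=0.5, n_frames=20000, rng=rng)
estimate = estimate_exposure(bits)
print(f"estimated exposure: {estimate:.3f} photoelectrons per frame")
```

A real sensor would sum bits over a spatial neighbourhood as well as over time; the single-pixel version keeps the statistics easy to follow.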
Eric R. Fossum, Dartmouth College (United States)
Eric R. Fossum is best known for the invention of the CMOS image sensor “camera-on-a-chip” used in billions of cameras. He is a solid-state image sensor device physicist and engineer, and his career has included academic and government research, and entrepreneurial leadership. At Dartmouth he is a professor of engineering and vice provost for entrepreneurship and technology transfer. Fossum received the 2017 Queen Elizabeth Prize from HRH Prince Charles, considered by many as the Nobel Prize of Engineering “for the creation of digital imaging sensors,” along with three others. He was inducted into the National Inventors Hall of Fame, and elected to the National Academy of Engineering among other honors including a recent Emmy Award. He has published more than 300 technical papers and holds more than 175 US patents. He co-founded several startups and co-founded the International Image Sensor Society (IISS), serving as its first president. He is a Fellow of IEEE and OSA.
08:10 – 08:40 EI 2022 Welcome Reception
Wednesday 19 January 2022
IS&T Awards & PLENARY: In situ Mobility for Planetary Exploration: Progress and Challenges
07:00 – 08:15
This year saw exciting milestones in planetary exploration with the successful landing of the Perseverance Mars rover, followed by its operation and the successful technology demonstration of the Ingenuity helicopter, the first heavier-than-air aircraft ever to fly on another planetary body. This plenary highlights new technologies used in this mission, including precision landing for Perseverance, a vision coprocessor, new algorithms for faster rover traverse, and the ingredients of the helicopter. It concludes with a survey of challenges for future planetary mobility systems, particularly for Mars, Earth’s moon, and Saturn’s moon, Titan.
Larry Matthies, Jet Propulsion Laboratory (United States)
Larry Matthies received his PhD in computer science from Carnegie Mellon University (1989), before joining JPL, where he has supervised the Computer Vision Group for 21 years, the past two coordinating internal technology investments in the Mars office. His research interests include 3-D perception, state estimation, terrain classification, and dynamic scene analysis for autonomous navigation of unmanned vehicles on Earth and in space. He has been a principal investigator in many programs involving robot vision and has initiated new technology developments that impacted every US Mars surface mission since 1997, including visual navigation algorithms for rovers, map matching algorithms for precision landers, and autonomous navigation hardware and software architectures for rotorcraft. He is a Fellow of the IEEE and was a joint winner in 2008 of the IEEE’s Robotics and Automation Award for his contributions to robotic space exploration.
EI 2022 Interactive Poster Session
08:20 – 09:20
Interactive poster session for all conference authors and attendees.
Acquisition and Processing
Tyler Bell, University of Iowa (United States)
09:30 – 10:35
Investigation of demosaicing effect on digital image correlation method: A case study on paintings with natural texture [PRESENTATION-ONLY], Athanasia Papanikolaou, Malgorzata Kujawinska, and Piotr Garbat, Politechnika Warszawska (Poland) [view abstract]
3D Digital Image Correlation (3D DIC) is a full-field optical method that measures an object’s displacements with high resolution while being non-contact and non-invasive. It therefore has many potential applications in the field of Cultural Heritage (CH), as it makes it possible to monitor the influence of environmental changes on CH objects such as paintings, parchments, or sculptures and thus helps conservators prevent damage. However, DIC requires an object whose surface has a high-contrast texture, a criterion that CH objects do not always meet. In this paper, we investigate the use of color cameras for 3D DIC together with multiple demosaicing algorithms, evaluating how well each minimizes the displacement errors introduced by preprocessing of the captured images. We aim to determine which demosaicing algorithm provides the most accurate displacement maps when 3D DIC is applied to mock-up canvases as well as oil paintings with natural texture. Selecting an appropriate demosaicing algorithm and optimizing the DIC analysis parameters can lead to successful and accurate detection of displacements.
Pose estimation of teeth in pathological dental models, Maxime Chapuis1,2, Mathieu Lafourcade1, William Puech1, Noura Faraj1, and Gérard Guillerm2; 1Université de Montpellier and 2Groupe Orqual (France) [view abstract]
In this work, we present a method to estimate the pose of teeth in pathological dental models. For each tooth of a pathological model, we aim at computing its orientation and position with respect to a healthy dentition. The proposed method is based on the registration of a reference model on a patient-specific 3D segmented mesh. Dental features, such as the arch forms and their types, are derived. These features, combined with registration information, allow our system to propose a plausible target dental arrangement and thus estimate the pose of each tooth. The key contributions of this work are (a) the use of a registered reference model to derive dental features and quantify orthodontic disorders and (b) the automatic estimation of a target dental arrangement.
Segmentation in application to deformation analysis of cultural heritage surfaces, Sunita Saha and Robert Sitnik, Warsaw University of Technology (Poland) [view abstract]
Geometry comparison has proven promising for assessing the deformation of cultural heritage (CH) surfaces over the past decade. In this work, we explore the reliability of the developed change-based segmentation method for quantifying deformation on an object. The proposed method was tested on two parts of a single model, one treated as ideal in shape and the other deformed over time. The study shows how change-based segmentation can identify deformation with millimeter-level measurement accuracy and inform future conservation treatment. The method is insensitive to surface noise and to the cross-time alignment of the two models when a reference of no change is known. The technique identifies deterioration through deformation detection, with clear colormap visualization, and supports preventive measures without using a physical marker as a reference. Threshold values were supplied to the method based on the known reference of no change, and deformations were classified as minor or major relative to the object's size. The results suggest that the method can effectively enhance 3D documentation for monitoring in both indoor and outdoor environments and support preventive interventions irrespective of the object's size.
Analysis and Compression
William Puech, Laboratory d’Informatique de Robotique et de Microelectronique de Montpellier (France)
10:50 – 11:50
Scale-adaptive local intentional surface feature detection, Yujian Xu1, Matthew Gaubatz2, Stephen Pollard2, Robert Ulichney1, and Jan P. Allebach1; 1Purdue University and 2HP Labs (United States) [view abstract]
In this paper, we propose a mesh-based feature detection scheme that focuses on surface features. A class of features of key interest is intentional structures that act as fiducials and that, for instance, can assist in shape retrieval and distortion measurement. We introduce a tunable two-scale depth measurement scheme to quantify the displacement of a vertex from the local surface, which can be a strong indicator of features. We print and scan 3D models with fiducial features appearing across the surface to demonstrate the high fidelity and accuracy of the proposed feature detection scheme. The method outperforms existing 3D feature detection schemes on CAD models and 3D scans alike. We also discuss applications of data embedding enabled by the achievable detection performance.
Feature-driven 3D range geometry compression via spatially-aware depth encoding, Broderick S. Schwartz, Matthew G. Finley, and Tyler Bell, University of Iowa (United States) [view abstract]
The ever-growing variety of applications and capture methods for 3D range geometry continually increases the need for effective storage and transmission methods. Compression techniques offer reduced file sizes while keeping the precision needed for a particular application. Several such compression methods use phase-shifting principles to encode the 3D data into a 2D RGB image. In some applications, such as telepresence, high precision may only be required in a particular region within a scan. The proposed method provides a way to encode regions of interest at higher precision while encoding the remaining data at lower precision to reduce file sizes. The proposed feature-driven compression method supports both lossless and lossy compression, enabling even greater file size savings. In the case of a depth scan of a bust, an extracted bounding box of the face was used to create an encoding distribution such that only the facial region was encoded at higher precision. When using JPEG 80, the global RMS accuracy of the proposed encoding was 99.72%; however, in the region of interest, the accuracy was 99.88%. This targeted encoding achieved a 26% reduction in compressed file size compared to a fixed-precision encoding.
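The abstract does not give the encoding equations. As a rough illustration of how phase-shifting principles can store depth in a 2D RGB image, the sketch below uses a generic sine/cosine scheme with a coarse third channel for unwrapping. The fringe period `period`, the depth range, and all function names are assumptions, and this is not the authors' spatially-aware method. It only shows that a shorter period yields finer precision after 8-bit quantisation, which is the kind of parameter a region-of-interest scheme could vary.

```python
import math

def encode_depth(z, z_min, z_max, period):
    """Encode one depth value into an 8-bit (R, G, B) triple (illustrative scheme)."""
    phase = 2.0 * math.pi * (z - z_min) / period
    r = 0.5 + 0.5 * math.sin(phase)          # fine, wrapped information
    g = 0.5 + 0.5 * math.cos(phase)          # fine, wrapped information
    b = (z - z_min) / (z_max - z_min)        # coarse depth, used only for unwrapping
    return tuple(round(255 * c) for c in (r, g, b))

def decode_depth(rgb, z_min, z_max, period):
    """Recover depth from an (R, G, B) triple produced by encode_depth."""
    r, g, b = (c / 255.0 for c in rgb)
    wrapped = math.atan2(r - 0.5, g - 0.5) % (2.0 * math.pi)
    fine = wrapped / (2.0 * math.pi) * period
    coarse = b * (z_max - z_min)
    k = round((coarse - fine) / period)      # number of whole periods
    return z_min + fine + k * period

def rms_error(samples, z_min, z_max, period):
    """Root-mean-square round-trip error over a list of depth samples."""
    squared = [
        (decode_depth(encode_depth(z, z_min, z_max, period), z_min, z_max, period) - z) ** 2
        for z in samples
    ]
    return math.sqrt(sum(squared) / len(squared))

samples = [i * 0.37 for i in range(270)]     # hypothetical depths in [0, 100)
print("period 10:", rms_error(samples, 0.0, 100.0, 10.0))
print("period 50:", rms_error(samples, 0.0, 100.0, 50.0))
```

In this toy model the shorter period gives the lower round-trip error, at the cost of fringes that vary more rapidly across the image and are therefore likely to compress less well under a lossy codec.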
Quality analysis of point cloud coding solutions, Joao Prazeres1,2, Manuela Pereira1,2, and Antonio Pinheiro1,2; 1Universidade da Beira Interior (U.B.I.) and 2Instituto de Telecomunicacoes (Portugal) [view abstract]
In this paper, a subjective quality-based comparison of four point cloud codecs is presented. A set of six point clouds was chosen and coded at different bit rates with four point cloud encoding solutions: the MPEG codecs V-PCC and G-PCC, the deep learning coding solution RS-DLPCC, and Draco. A subjective test was performed in which the distorted and reference point clouds were shown rotating side by side in a video sequence, followed by a quality evaluation of the pair. The subjective results were compared with a set of four point cloud objective quality metrics that are usually reported as providing a good representation of subjective quality. It was concluded that V-PCC is the best of the studied codecs. The deep learning-based solution still performs worse than the two MPEG codecs, although it has room to improve its compression efficiency. The studied metrics tend to provide a good representation for V-PCC and G-PCC, an acceptable representation for RS-DLPCC, and a poor representation for Draco.
Processing and Applications
Robert Sitnik, Warsaw University of Technology (Poland)
15:00 – 16:00
Hand authentication from RGB-D video based on deep neural network, Ryogo Miyazaki1, Kazuya Sasaki2, Norimichi Tsumura1, and Keita Hirai1; 1Chiba University and 2MagikEye (Japan) [view abstract]
In recent years, behavioral biometric authentication, which uses habitual behavioral characteristics for personal authentication, has attracted attention as a more secure authentication method, since behavioral biometrics cannot be mimicked in the way fingerprint and face authentication can. Among behavioral biometrics, voiceprints have been studied extensively. However, few authentication technologies exploit the habits of hand and finger movements during hand gestures, and conventional hand gesture authentication methods use either color images or depth images alone. In this research, we therefore propose to extract individual habits from RGB-D images of finger movements and build a personal authentication system. A 3D CNN, a deep learning-based network, is used to extract individual habits. An F-measure of 0.97 is achieved when rock-paper-scissors is used as the authentication operation. An F-measure of 0.97 is also achieved when the disinfection operation is used. These results show the effectiveness of using RGB-D images for personal authentication.
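For readers unfamiliar with the metric, the F-measure reported above is commonly the harmonic mean of precision and recall. The sketch below assumes that balanced (F1) form, which the abstract does not state explicitly. The confusion counts in the usage line are invented for illustration and are not the authors' data.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2.0 * precision * recall / (precision + recall)

# Hypothetical counts: 97 genuine attempts accepted, 3 impostor attempts
# accepted, 3 genuine attempts rejected.
score = f_measure(true_pos=97, false_pos=3, false_neg=3)
print(f"F-measure: {score:.2f}")
```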
A 3D subtractive brush system for an immersive, multilayered archaeological map, Mike Yeates, Maxime Cordeil, and Tom Chandler, Monash University (Australia) [view abstract]
Digital archaeology is a rapidly evolving field, adapting new technologies to interpret diverse data sources. This paper details the superimposition of 2D maps and 3D data in an interactive 3D space, and their selective subtraction by a 3D brush system. The subject of study is the archaeological landscape of the medieval city of Angkor in Cambodia, an area of approximately 3500 square kilometers. By cutting through the superimposed layers of LIDAR point clouds, 2D mapping of the archaeological features, and the 3D reconstructions of the living city of Angkor, the brush system reveals both correspondences and discontinuities through interactive examination.
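One simple way to picture a subtractive brush is as a spherical region that hides every point of a chosen layer inside it, exposing the layers beneath. The sketch below assumes that model. The abstract does not describe the brush geometry, and `subtract_brush`, the layer names, and the coordinates are all hypothetical.

```python
import math

def subtract_brush(points, center, radius):
    """Return the points that lie outside a spherical brush."""
    return [p for p in points if math.dist(p, center) > radius]

# Two hypothetical superimposed layers sharing one coordinate system.
layers = {
    "lidar": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 0.0)],
    "map_2d": [(0.0, 0.0, -1.0), (5.0, 5.0, -1.0)],
}

# Brush only through the LIDAR layer; the map layer underneath stays intact.
visible = dict(layers)
visible["lidar"] = subtract_brush(layers["lidar"], center=(0.0, 0.0, 0.0), radius=2.0)
print(visible)
```

Applying the brush per layer rather than to the merged scene is what lets a user cut through one data source while keeping the others visible for comparison.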
Design of ghost-free aerial display by using prism and dihedral corner reflector array, Yuto Osada and Yue Bao, Tokyo City University (Japan) [view abstract]
The dihedral corner reflector array (DCRA) has been proposed as an aerial display whose aerial images the observer can touch directly. This optical element has the problem that the visibility of the aerial image is reduced by stray light called ghosts. Although ghost formation can be suppressed by adding a louver, the viewing angle and brightness of the aerial image are also reduced. This paper proposes a design for a ghost-free aerial display using a prism and a DCRA. In the experiments, the brightness of the ghost with the proposed method was about 14% of that with the DCRA alone when viewed from the front. In addition, the aerial image with the proposed method was brighter than with the DCRA alone and with the conventional method in the range of 0 to 20 degrees.