Monday 17 January 2022
IS&T Welcome & PLENARY: Quanta Image Sensors: Counting Photons Is the New Game in Town
07:00 – 08:10
The Quanta Image Sensor (QIS) was conceived as a different kind of image sensor: one that counts photoelectrons one at a time using millions or billions of specialized pixels read out at high frame rate, with computational imaging used to create grayscale images. QIS devices have been implemented in a baseline room-temperature CMOS image sensor (CIS) technology without using avalanche multiplication, and also with SPAD arrays. This plenary details the QIS concept, how it has been implemented in CIS and in SPADs, and what the major differences are. Applications that can be disrupted or enabled by this technology are also discussed, including smartphones, where CIS-QIS technology could be deployed within just a few years.
Eric R. Fossum, Dartmouth College (United States)
Eric R. Fossum is best known for the invention of the CMOS image sensor “camera-on-a-chip” used in billions of cameras. He is a solid-state image sensor device physicist and engineer, and his career has included academic and government research, and entrepreneurial leadership. At Dartmouth he is a professor of engineering and vice provost for entrepreneurship and technology transfer. Fossum received the 2017 Queen Elizabeth Prize from HRH Prince Charles, considered by many as the Nobel Prize of Engineering “for the creation of digital imaging sensors,” along with three others. He was inducted into the National Inventors Hall of Fame, and elected to the National Academy of Engineering among other honors including a recent Emmy Award. He has published more than 300 technical papers and holds more than 175 US patents. He co-founded several startups and co-founded the International Image Sensor Society (IISS), serving as its first president. He is a Fellow of IEEE and OSA.
08:10 – 08:40 EI 2022 Welcome Reception
Wednesday 19 January 2022
IS&T Awards & PLENARY: In situ Mobility for Planetary Exploration: Progress and Challenges
07:00 – 08:15
This year saw exciting milestones in planetary exploration with the successful landing of the Perseverance Mars rover, followed by its operation and the successful technology demonstration of the Ingenuity helicopter, the first heavier-than-air aircraft ever to fly on another planetary body. This plenary highlights new technologies used in this mission, including precision landing for Perseverance, a vision coprocessor, new algorithms for faster rover traverse, and the ingredients of the helicopter. It concludes with a survey of challenges for future planetary mobility systems, particularly for Mars, Earth’s moon, and Saturn’s moon, Titan.
Larry Matthies, Jet Propulsion Laboratory (United States)
Larry Matthies received his PhD in computer science from Carnegie Mellon University (1989), before joining JPL, where he has supervised the Computer Vision Group for 21 years, the past two coordinating internal technology investments in the Mars office. His research interests include 3-D perception, state estimation, terrain classification, and dynamic scene analysis for autonomous navigation of unmanned vehicles on Earth and in space. He has been a principal investigator in many programs involving robot vision and has initiated new technology developments that impacted every US Mars surface mission since 1997, including visual navigation algorithms for rovers, map matching algorithms for precision landers, and autonomous navigation hardware and software architectures for rotorcraft. He is a Fellow of the IEEE and was a joint winner in 2008 of the IEEE’s Robotics and Automation Award for his contributions to robotic space exploration.
EI 2022 Interactive Poster Session
08:20 – 09:20
Interactive poster session for authors and attendees of all conferences.
Thursday 20 January 2022
Juha Röning, University of Oulu (Finland)
07:00 – 08:05
Deep learning based wheat ears count in robot images for wheat phenotyping, Ehsan Ullah, Mohib Ullah, Muhammad Sajjad, and Faouzi Alaya Cheikh, Norwegian University of Science and Technology (Norway) [view abstract]
The number of spikes, the number of spikelets per spike, and the number of spikes per square meter are among the important metrics used by plant breeders and researchers to predict wheat crop yield. Evaluating crop yield by counting wheat ears is still done manually, which is a labor-intensive, tedious, and costly task. There is therefore a significant need to develop a real-time wheat spike/ear counting system that gives plant breeders effective and efficient crop yield predictions. In this paper, we adopt and modify two deep learning-based methods, Faster R-CNN and EfficientDet, for accurate and computationally efficient localization and counting of wheat spikes/ears in digital images taken with high-throughput phenotyping techniques under natural field conditions. Faster R-CNN with a ResNet50 backbone produced an overall accuracy of 88.7% on the test images. We also evaluated the recent state-of-the-art models EfficientDet-D5 and EfficientDet-D7, with EfficientNet-B5 and EfficientNet-B7 backbones, respectively. A comprehensive quantitative analysis on standard performance metrics shows that EfficientDet-D5 produces an accuracy of 92.7% on the test images and EfficientDet-D7 an accuracy of 93.6%.
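With either detector, the counting stage reduces to confidence filtering followed by non-maximum suppression (NMS) over the predicted boxes, so each physical ear is counted once. A minimal NumPy sketch of that stage, with illustrative function names and thresholds not taken from the paper:

```python
import numpy as np

def iou(box, boxes):
    """IoU of one [x1, y1, x2, y2] box against an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def count_ears(boxes, scores, score_thresh=0.5, iou_thresh=0.5):
    """Count wheat ears: drop low-confidence boxes, then greedy NMS."""
    keep = scores >= score_thresh
    boxes, scores = boxes[keep], scores[keep]
    order = np.argsort(-scores)          # highest-confidence first
    kept = []
    while order.size > 0:
        i = order[0]
        kept.append(i)
        if order.size == 1:
            break
        rest = order[1:]
        # suppress boxes overlapping the kept box too strongly
        order = rest[iou(boxes[i], boxes[rest]) < iou_thresh]
    return len(kept)
```

The ear count is simply the number of boxes surviving both filters.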
Incremental two-network approach to develop a purity analyzer system for canola seeds, Kuldeep Singh, Fernando Saccon, and Dileepan Joseph, University of Alberta (Canada) [view abstract]
Given a suitable dataset, transfer learning using deep convolutional neural networks is an effective method to develop a system to detect and classify objects. Despite having models pretrained on large general-purpose datasets, the requirement to manually label an application-specific dataset remains a limiting factor in system development. We consider this wider problem in the context of the purity analysis of canola seeds, where end users wish to distinguish species of interest from contaminants in images taken with optical microscopes. We use a Detector network, trained only to detect seeds, to help label the dataset used to train an Analyzer network, capable of both seed detection and classification. We present results, over three experiments that involve 25 contaminant species, including Primary and Secondary Noxious Weed Seeds (as per the Canadian Weed Seeds Order), to validate our incremental approach. We also compare the proposed system to competing ones in a literature review.
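The labeling shortcut described above can be pictured as: the Detector supplies the bounding boxes, so a human annotator only has to assign a species label to each crop before the Analyzer is trained. A minimal sketch of that hand-off, with hypothetical function names (the paper's actual pipeline is not reproduced here):

```python
import numpy as np

def crop_detections(image, boxes):
    """Crop each detected seed from an HxWxC image so that annotators
    only assign a class label, not a bounding box. Boxes are integer
    (x1, y1, x2, y2) tuples produced by the Detector network."""
    return [image[y1:y2, x1:x2] for x1, y1, x2, y2 in boxes]

def build_analyzer_labels(boxes, class_ids):
    """Merge the human class decisions back onto the Detector's boxes,
    yielding (box, class) pairs to train the Analyzer network."""
    return list(zip(boxes, class_ids))
```

Because localization is already solved by the Detector, annotation effort scales with a single click per seed rather than a full box-drawing pass.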
Instance segmentation for characterization of satellites on additive manufacturing feedstock powders [PRESENTATION-ONLY], Ryan Cohn and Elizabeth Holm, Carnegie Mellon University (United States) [view abstract]
Satellites are formed when fine particles fuse to the surface of larger particles during powder production. This influences the flow and spreading of powders during powder bed fusion additive manufacturing, affecting the quality and consistency of 3D-printed parts. Despite this, current experimental methods of powder characterization are unable to detect satelliting in powder samples. We propose instance segmentation for directly measuring satelliting in powder samples. In this study, Mask R-CNN was trained to segment individual particles and satellites in scanning electron microscope images of powder samples. Transfer learning was applied to train each model on a very small dataset of labeled images. Overlaying the predicted particle and satellite masks yielded the first quantitative, repeatable measurements of powder satellites. The results demonstrate the potential for computer vision to supplement current methods of powder characterization, leading to process improvements in both powder production and additive manufacturing. The dataset and code for this project are available on GitHub at https://github.com/rccohn/AMPIS.
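The mask-overlay measurement can be sketched in a few lines: given boolean instance masks from Mask R-CNN, a satellite is attributed to any particle whose mask it intersects. This is an illustrative sketch under that assumption, not code from the AMPIS repository:

```python
import numpy as np

def satellites_per_particle(particle_masks, satellite_masks):
    """Count, for each particle, the satellites whose predicted mask
    overlaps that particle's mask. All masks are boolean HxW arrays
    from the instance segmentation model."""
    return [sum(1 for s in satellite_masks if np.any(p & s))
            for p in particle_masks]
```

Summing the per-particle counts (or dividing by the number of particles) then gives the repeatable satellite statistics the abstract refers to.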
Active Learning -- Multi-target Tracking -- Model Learning Capabilities
Kurt Niel, University of Applied Sciences Upper Austria (Austria)
08:30 – 09:30
Quantitative analysis of deep learning based multi-target tracking algorithms, Sanam Nisar Mangi, Mohib Ullah, and Faouzi Alaya Cheikh, Norwegian University of Science and Technology (Norway) [view abstract]
Multi-object tracking is an active computer vision problem that has gained sustained interest due to its wide range of applications in areas such as surveillance, autonomous driving, entertainment, and gaming. In the age of deep learning, many computer vision tasks have benefited from convolutional neural networks and have advanced rapidly, whereas multi-target tracking remains a challenging task. Different kinds of approaches have been proposed to tackle this problem, and a variety of models have exploited the representational power of deep learning to address it. In this paper, we inspect three CNN-based models that have achieved state-of-the-art performance on this problem. The three models follow different paradigms and provide key insight into the development of the field. We examine the models and conduct experiments on a benchmark dataset. The quantitative results from the state-of-the-art models are reported in standard metrics and provide a basis for future research in the field.
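A common building block shared by tracking-by-detection paradigms is data association: matching each existing track to at most one new detection. As background for the comparison above (not the association logic of any of the three reviewed models), here is a minimal greedy IoU matcher:

```python
def iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def associate(tracks, detections, iou_thresh=0.3):
    """Greedy one-to-one matching of track boxes to detection boxes,
    taking the highest-IoU pairs first. Returns (track, detection)
    index pairs; unmatched indices start new tracks or end old ones."""
    pairs = sorted(((iou(t, d), ti, di)
                    for ti, t in enumerate(tracks)
                    for di, d in enumerate(detections)), reverse=True)
    matches, used_t, used_d = [], set(), set()
    for score, ti, di in pairs:
        if score >= iou_thresh and ti not in used_t and di not in used_d:
            matches.append((ti, di))
            used_t.add(ti)
            used_d.add(di)
    return matches
```

Production trackers typically replace the greedy loop with Hungarian assignment and mix appearance features into the cost, but the interface is the same.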
Leveraging gradient weighted class activation mapping to improve classification effectiveness: Case study in transportation infrastructure characterization, Thomas P. Karnowski, Deniz Aykac, Regina K. Ferrell, Christy Gambrell, Zach Langford, and Lauren Torkelson, Oak Ridge National Laboratory (United States) [view abstract]
Roadway “corners” are commonly used by pedestrians, whether designated with markings or not. Different types of markings have been deployed, ranging from simple parallel lines to more complex designs, and understanding the impact of different crosswalk types is important for public safety. In this work we explore methods to improve the logging of marked crosswalk types. Using the Roadway Information Database from the Second Strategic Highway Research Program, we applied active learning with transfer learning to identify crosswalk types (marked or unmarked). We found our classifiers were unable to perform above roughly 90% correct classification. To improve their efficacy, we separated the crosswalks into their “fine-grained” types and used Gradient-weighted Class Activation Mapping to isolate and study the features that classified the crosswalks. We compared this with sampled manually marked crosswalks and present our findings. We believe this use case represents a process for improving active learning in some visual machine learning applications.
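Grad-CAM itself is compact: the class-score gradients flowing into the last convolutional layer are spatially averaged into per-channel weights, and the weighted, ReLU-ed sum of the activation maps is the heatmap. A NumPy sketch of the standard formulation (the activations and gradients would come from whatever backbone the classifier uses):

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heatmap from a conv layer's activations A (K, H, W) and
    the gradients of the class score w.r.t. A (same shape).
    alpha_k is the spatial mean of each gradient map."""
    alphas = gradients.mean(axis=(1, 2))             # (K,) channel weights
    cam = np.tensordot(alphas, activations, axes=1)  # (H, W) weighted sum
    cam = np.maximum(cam, 0)                         # ReLU: keep positive evidence
    if cam.max() > 0:
        cam /= cam.max()                             # normalize to [0, 1]
    return cam
```

Upsampling the heatmap to the input resolution and overlaying it on the image reveals which parts of the crosswalk (e.g., the painted bars) drove the classification.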
Deep learning-based multiple animal pose estimation, Brage Arnkværn, Sigurd Schoeler, Mohib Ullah, and Faouzi Alaya Cheikh, Norwegian University of Science and Technology (Norway) [view abstract]
We propose a deep learning-based approach for pig keypoint detection. In a nutshell, we explore transfer learning to adapt a human pose estimation model to pigs. In total, we tested three different models and ultimately trained OpenPose on the pig data, with the training data annotated in COCO format. Additionally, to highlight the model's learning capabilities, we visualized the pixel-level response of the network's part affinity fields (PAFs) on the test frames. The trained model shows promising results and opens a new door for further research.
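OpenPose-style networks emit one confidence heatmap per joint (plus the PAFs used to group joints across animals). For a single animal, decoding reduces to taking the peak of each heatmap; this simplified sketch illustrates that step only, not the multi-animal PAF grouping:

```python
import numpy as np

def decode_keypoints(heatmaps, thresh=0.1):
    """Decode per-joint confidence maps (J, H, W) into (x, y, score)
    tuples; joints whose peak falls below thresh are reported as None
    (occluded or outside the frame)."""
    keypoints = []
    for hm in heatmaps:
        y, x = np.unravel_index(np.argmax(hm), hm.shape)
        score = float(hm[y, x])
        keypoints.append((int(x), int(y), score) if score >= thresh else None)
    return keypoints
```

The PAF visualization mentioned above serves the complementary role: it shows which limb-direction fields the network learned, which is what allows keypoints to be assembled into per-pig skeletons.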
Henry Ngan, ENPS Hong Kong (Hong Kong)
16:15 – 16:55
Efficient landslide detection by UAV-based multi-temporal visual analysis, Yosuke Yamaguchi1, Kai Matsui1, Jun Ohya1, Katsuya Hasegawa2, and Hiroshi Nagahashi3; 1Waseda University, 2Japan Aerospace Exploration Agency, Institute of Space and Astronautical Science, and 3Tokyo Institute of Technology (Japan) [view abstract]
Unmanned aerial vehicles (UAVs) equipped with optical cameras are powerful tools for automatic landslide detection in large remote and undeveloped areas, as they enable high-resolution, low-cost, and flexible visual analysis. Structure from Motion (SfM) with Multi-View Stereo (MVS) is commonly used for processing UAV imagery to accurately measure the ground surface, and recent SfM-MVS-based studies have detected landslides by monitoring elevation change over time. However, the heavy processing load of SfM-MVS has been an obstacle to quick analysis of large areas. This paper proposes an efficient landslide detection method based on Visual Simultaneous Localization and Mapping (Visual-SLAM) and a convolutional neural network (CNN). Visual-SLAM enables real-time measurement of the ground surface; although the acquired data are less accurate than those of SfM-MVS, the CNN-based detection model compensates for this drawback. In a preliminary experiment, our method ran 5.5 times as fast as SfM-MVS while achieving an F1 score of 0.83. The results show the efficiency of the SLAM-CNN-based method for landslide detection in large areas.
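The elevation-change monitoring underlying both pipelines can be reduced to differencing two co-registered digital elevation models (DEMs) and flagging tiles where enough cells moved; the CNN then classifies the flagged candidates. A toy sketch of the differencing step, with illustrative thresholds not taken from the paper:

```python
import numpy as np

def elevation_change_mask(dem_before, dem_after, thresh=0.5):
    """Boolean mask of cells whose elevation changed by more than
    thresh metres between two co-registered DEMs."""
    return np.abs(dem_after - dem_before) > thresh

def landslide_suspected(dem_before, dem_after, thresh=0.5, min_fraction=0.01):
    """Flag a tile as a landslide candidate when the fraction of changed
    cells exceeds min_fraction; a detector (e.g. a CNN) would then
    classify the candidate region."""
    changed = elevation_change_mask(dem_before, dem_after, thresh)
    return bool(changed.mean() >= min_fraction)
```

The trade-off the abstract describes is precisely that Visual-SLAM yields noisier DEMs than SfM-MVS, which pushes the burden of rejecting false candidates onto the learned detector.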
Detecting falling rocks by estimating excavation points using single color markers, Rei Kobayashi1, Yoshihiro Sato1, Masaya Miura2, Yuto Osada1, and Yue Bao1; 1Tokyo City University and 2Tokyu Construction Co., Ltd. (Japan) [view abstract]
Tunnels have been constructed in many places for transportation and lifelines. During tunnel construction, industrial accidents have occurred due to rocks falling from the tunnel face, and a large amount of falling rock is a known precursor of tunnel collapse. It is therefore necessary to detect falling rocks both to prevent industrial accidents and to grasp the situation at the tunnel face. Conventional methods based on inter-frame differencing or laser measurement have been proposed, but they have difficulty monitoring the entire tunnel face and also detect moving objects other than falling rocks. In this paper, we propose a falling-rock detection method that combines moving-object detection on the tunnel face with estimation of excavation points using single-color markers. In an excavation experiment, we confirmed that only falling rocks were detected.
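The moving-object stage is classic inter-frame differencing; the marker-estimated excavation point then restricts which motion counts as a falling rock. This sketch shows one plausible version of that combination (the gating rule and thresholds are illustrative assumptions, not the paper's method):

```python
import numpy as np

def moving_object_mask(prev_frame, curr_frame, thresh=25):
    """Inter-frame difference on grayscale uint8 frames: pixels whose
    intensity changed by more than thresh are flagged as moving."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > thresh

def is_falling_rock(mask, y_marker, min_area=5):
    """Gate the motion mask by the estimated excavation point: only
    motion at or below the marker row y_marker (rocks fall downward
    from the face) with sufficient area is treated as a falling rock."""
    region = mask[y_marker:, :]
    return int(region.sum()) >= min_area
```

The gating is what lets the system ignore machinery and workers moving elsewhere in the frame, which is the failure mode of plain frame differencing noted above.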