Monday 17 January 2022
IS&T Welcome & PLENARY: Quanta Image Sensors: Counting Photons Is the New Game in Town
07:00 – 08:10
The Quanta Image Sensor (QIS) was conceived as a different kind of image sensor—one that counts photoelectrons one at a time using millions or billions of specialized pixels read out at high frame rate, with computational imaging used to create grayscale images. QIS devices have been implemented in a baseline room-temperature CMOS image sensor (CIS) technology without using avalanche multiplication, and also with SPAD arrays. This plenary details the QIS concept, how it has been implemented in CIS and in SPADs, and what the major differences are. Applications that can be disrupted or enabled by this technology are also discussed, including smartphones, where CIS-QIS technology could be deployed in just a few years.
Eric R. Fossum, Dartmouth College (United States)
Eric R. Fossum is best known for the invention of the CMOS image sensor “camera-on-a-chip” used in billions of cameras. He is a solid-state image sensor device physicist and engineer, and his career has included academic and government research, and entrepreneurial leadership. At Dartmouth he is a professor of engineering and vice provost for entrepreneurship and technology transfer. Fossum, along with three co-recipients, received the 2017 Queen Elizabeth Prize for Engineering, considered by many the Nobel Prize of engineering, from HRH Prince Charles “for the creation of digital imaging sensors.” He was inducted into the National Inventors Hall of Fame and elected to the National Academy of Engineering, among other honors including a recent Emmy Award. He has published more than 300 technical papers and holds more than 175 US patents. He co-founded several startups and co-founded the International Image Sensor Society (IISS), serving as its first president. He is a Fellow of IEEE and OSA.
08:10 – 08:40 EI 2022 Welcome Reception
Wednesday 19 January 2022
IS&T Awards & PLENARY: In situ Mobility for Planetary Exploration: Progress and Challenges
07:00 – 08:15
This year saw exciting milestones in planetary exploration with the successful landing of the Perseverance Mars rover, followed by its operation and the successful technology demonstration of the Ingenuity helicopter, the first heavier-than-air aircraft ever to fly on another planetary body. This plenary highlights new technologies used in this mission, including precision landing for Perseverance, a vision coprocessor, new algorithms for faster rover traverse, and the ingredients of the helicopter. It concludes with a survey of challenges for future planetary mobility systems, particularly for Mars, Earth’s moon, and Saturn’s moon, Titan.
Larry Matthies, Jet Propulsion Laboratory (United States)
Larry Matthies received his PhD in computer science from Carnegie Mellon University (1989) before joining JPL, where he has supervised the Computer Vision Group for 21 years and has spent the past two coordinating internal technology investments in the Mars office. His research interests include 3-D perception, state estimation, terrain classification, and dynamic scene analysis for autonomous navigation of unmanned vehicles on Earth and in space. He has been a principal investigator in many programs involving robot vision and has initiated new technology developments that impacted every US Mars surface mission since 1997, including visual navigation algorithms for rovers, map matching algorithms for precision landers, and autonomous navigation hardware and software architectures for rotorcraft. He is a Fellow of the IEEE and was a joint winner in 2008 of the IEEE’s Robotics and Automation Award for his contributions to robotic space exploration.
EI 2022 Interactive Poster Session
08:20 – 09:20
EI Symposium
Interactive poster session for all conference authors and attendees.
Virtual Reality
Session Chairs:
Jan Allebach, Purdue University (United States); Raja Bala, Amazon (United States); and Qian Lin, HP Labs, HP Inc. (United States)
16:15 – 17:15
Red Room
16:15 IMAGE-254
Cognitive load inference within a multitasking paradigm in virtual reality (Invited), Jishang Wei, HP Labs (United States)
Virtual reality (VR) has become an increasingly popular medium for learning and training. Assessing the amount of mental effort, or cognitive load, required to perform a task is essential to creating adaptive VR training experiences. In this work, we conducted a large-scale study (N=738) to collect behavioral and physiological measures under different cognitive load conditions in a VR environment, and developed a novel machine learning solution to predict cognitive load in real time. Our model predicts cognitive load as a continuous value in the range from 0 to 1, where 0 and 1 correspond to the lowest and highest reported cognitive loads across all participants. On top of the point estimate, our model quantifies prediction uncertainty using a prediction interval. On this regression problem, we achieved a mean absolute error of 0.11. The result indicates that, with a combination of behavioral and physiological indicators, we can reliably predict cognitive load in real time, without calibration. To augment this paper, we are releasing our test dataset from 100 participants for use by researchers and developers interested in machine learning, virtual reality, learning and memory, cognition, or psychophysiology. The dataset includes recordings from multiple sensors (including pupillometry, eye tracking, and pulse plethysmography), self-reported cognitive effort, behavioral task performance, and demographic information on the sample.
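The two evaluation quantities reported in this abstract (mean absolute error of the point estimates and empirical coverage of the prediction intervals) can be sketched in plain Python; all values below are made up for illustration and are not from the released dataset:

```python
# Hypothetical model outputs for illustration: point estimates and
# prediction intervals for normalized cognitive load in [0, 1].
y_true = [0.20, 0.55, 0.80, 0.35, 0.60]
y_pred = [0.25, 0.50, 0.70, 0.40, 0.55]
intervals = [(0.10, 0.40), (0.35, 0.65), (0.55, 0.85), (0.25, 0.55), (0.40, 0.70)]

# Mean absolute error of the point estimates.
mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Empirical coverage: fraction of true values falling inside their interval.
coverage = sum(lo <= t <= hi for t, (lo, hi) in zip(y_true, intervals)) / len(y_true)

print(f"MAE = {mae:.3f}, coverage = {coverage:.2f}")
```

A well-calibrated 90% prediction interval should yield empirical coverage near 0.90 on held-out data; on this toy sample all five true values fall inside their intervals.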
16:55 IMAGE-255
VR facial expression tracking via action unit intensity regression model, Xiaoyu Ji1, Jishang Wei2, Yvonne Huang3, Qian Lin3, Jan P. Allebach1, and Fengqing Zhu1; 1Purdue University, 2HP Labs, and 3HP Inc. (United States)
Prior to the conference, the sponsor of this work may want to submit a patent application protecting IP associated with it. For that reason, we cannot reveal further details about the work at this time.
Thursday 20 January 2022
KEYNOTE: Photography
Session Chairs: Jan Allebach, Purdue University (United States); Raja Bala, Amazon (United States); and Qian Lin, HP Labs, HP Inc. (United States)
07:00 – 08:05
Red Room
07:00 IMAGE-262
KEYNOTE: Analogue – Digital – Mobile – Social: How photography has changed in the last 25 years, Reiner Fageth, CEWE Stiftung & Co. KGAA (Germany)
This paper describes how digital photography evolved from a niche product for digital experts and IT enthusiasts to total domination of the mass market and the disruption of analogue photography. This rapid, industry-altering process will be put in relation to conferences and presentations given at Electronic Imaging over the last three decades. Developments and influences on quality were driven by imaging sensors and printing technologies. A review of the battle to attain the high quality of silver-halide prints from negatives will be presented, and the development of image enhancement and printing technologies will be analyzed. The classical one-hour in-store photo order process in North America and Asia, based on minilabs, and the logistics systems in Europe (collecting film from points of sale in the evening and returning prints, processed by huge photofinishing plants, the next day) were replaced by digital kiosk systems in stores and by software applications in the browser or as downloads. The development of the technologies involved will also be presented, as well as the efforts to support the selection of the most suitable images. New digital printers based on liquid ink and toner enabled new products; personalized photobooks and calendars allowed for storytelling and emotional gifting via tangible photo products, and raised the value of every printed image. A review of the improvements there, as well as of classical silver-halide based printing systems, will be provided. The introduction of smartphones disrupted the new digital imaging ecosystems once more: the camera was now a constant companion in nearly everybody’s pocket. The resulting increase in the number of images taken complicated the selection process (convenience photos, images “only” for social communication, ...), and the image quality discussion was once more raised and addressed in these conferences.
All of these challenges are addressed by today’s imaging ecosystems. They offer ordering across all devices (classical digital cameras and smartphones) and retail locations, as well as home delivery options. Selling these products became more of a marketing challenge than a technological one. These systems utilize AI-based solutions (on device and using edge computing) combined with experience and heuristics gathered over the last 25 years. Some very good approaches will be presented at the end.
Reiner Fageth is Head of Technology, Research & Development at CEWE Stiftung & Co. KGaA and Chairman of the Supervisory Board of CEWE Color (a subsidiary of CEWE Stiftung & Co. KGaA). Dr. Fageth also serves on the board of CeWe Color, Inc. and is a member of the management board of Neumüller CEWE COLOR Stiftung. He studied electronic engineering at the Fachhochschule Heilbronn, Germany, and received a PhD from the University of Northumbria at Newcastle, UK, in 1994, in split research with Telefunken Microelectronic and the Steinbeis Transferzentrum Image Processing. His major research topics there, and in the following years, were industrial image processing systems based on classification using fuzzy logic and neural networks. In 1998 he joined CeWe Color, charged with driving the analogue photo business into digital; he was first responsible for R&D and for the production of consumers’ digital files on silver-halide paper. CeWe Color is Europe’s largest wholesale photofinisher, producing more than 3 billion prints a year. He is a member of the German DIN Normenausschuss Bild und Film NA 049-00-04 AA and has published over 30 technical papers.
07:40 IMAGE-263
Efficient real-time portrait video segmentation with temporal guidance, Weichen Xu1, Yezhi Shen1, Qian Lin2, Jan P. Allebach1, and Fengqing Zhu1; 1Purdue University (United States) and 2HP Inc. (United States)
Prior to the conference, the sponsor of this work may want to submit a patent application protecting IP associated with it. For that reason, we cannot reveal further details about the work at this time.
Machine / Deep Learning
Session Chairs:
Jan Allebach, Purdue University (United States); Raja Bala, Amazon (United States); and Qian Lin, HP Labs, HP Inc. (United States)
08:30 – 09:30
Red Room
08:30 IMAGE-272
Generating high-resolution atmospheric gas concentration imagery with multiple remote sensing data using a geography-informed machine learning approach (Invited), Kalai Ramea, Palo Alto Research Center (United States)
As more policies for air quality and climate change are enacted, there is an increasing need to monitor atmospheric gases. Traditional monitoring methods include ground sensors and on-demand in-situ measurements from cars, drones, or aircraft equipped with gas sensors. While these measurements reliably quantify surface-level concentrations, developing a fine-scale spatial distribution of gas concentration requires dense sensor networks, which are hard to scale over large regions or with on-demand measurements. Recently, satellite remote sensing has emerged as a promising alternative for atmospheric measurements. Although satellites can capture larger areas in one swath, the best available spatial resolutions for atmospheric gases are still coarse. To understand whether implemented policies impact the environment, or to regulate bad actors, we need a method to rapidly generate near real-time, high-spatial-resolution imagery of these gases. This talk provides a comprehensive overview of remote sensing data and related applications for various types of atmospheric gases, and presents a novel geography-informed machine learning algorithm that fuses data from multiple remote sensing sources. While we show results for atmospheric gases, the approach could apply generically to other multi-modal observations characterized by spatial autocorrelation.
09:10 IMAGE-273
Cultural assets identification using transfer learning, Huajian Liu, Simon Bugert, Waldemar Berchtold, and Martin Steinebach, Fraunhofer Institute for Secure Information Technology (Germany)
Identifying cultural assets is a challenging task that requires specific expertise. In this paper, a deep learning based solution to identify archaeological objects is proposed. Unlike general object recognition, identifying archaeological objects poses new challenges. To meet the special requirements of classifying antiques, a hybrid network architecture, comprising a classification network and a regression network, learns the characteristics of objects using transfer learning. With the help of the regression network, the age of objects can be predicted, which improves overall performance compared to manually classifying the age of objects. The proposed scheme is evaluated on a public database of cultural assets, and the experimental results demonstrate its strong performance in identifying antique objects.
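The hybrid architecture described here (a shared feature extractor feeding both a classification head and an age-regression head) can be illustrated with a toy forward pass. This is a hypothetical plain-Python sketch, not the authors' network; all weights and dimensions are made up, and the shared layer stands in for a pretrained backbone used in transfer learning:

```python
import math

def forward(x, W_shared, W_cls, W_reg):
    """Toy hybrid network: one shared layer feeding two heads."""
    # Shared feature extractor with ReLU (stands in for a pretrained backbone).
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W_shared]
    # Classification head: softmax over object classes.
    logits = [sum(w * hi for w, hi in zip(row, h)) for row in W_cls]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    probs = [e / sum(exps) for e in exps]
    # Regression head: scalar age prediction from the same features.
    age = sum(w * hi for w, hi in zip(W_reg, h))
    return probs, age

# Made-up weights: 3 input features, 2 shared units, 3 classes.
W_shared = [[0.2, 0.1, 0.0], [0.0, 0.3, 0.4]]
W_cls = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
W_reg = [0.8, 1.2]

probs, age = forward([1.0, 2.0, 3.0], W_shared, W_cls, W_reg)
print(probs, age)
```

In training, the classification and regression heads would each contribute a loss term, so the shared features are shaped by both tasks, which is the intuition behind the paper's claim that age regression improves overall identification.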
Image / Video Analysis
Session Chairs:
Jan Allebach, Purdue University (United States); Raja Bala, Amazon (United States); and Qian Lin, HP Labs, HP Inc. (United States)
10:00 – 11:00
Red Room
10:00 IMAGE-287
Correspondences for image and video reconstruction (Invited), Xiaoyu Xiang, Facebook Inc. (United States)
In this talk, we will discuss the function of correspondences in image and video reconstruction. Specifically, I will cover several methods to find the correspondences, including optical flow, deformable convolution, etc., and their applications to image and video restoration tasks such as super-resolution, denoising, and view synthesis.
10:40 IMAGE-301
Towards the creation of a nutrition and food group based image database, Zeman Shao, Jiangpeng He, Ya-Yuan Yu, Luotao Lin, Alexandra E. Cowan, Heather A. Eicher-Miller, and Fengqing Zhu, Purdue University (United States)
Prior to the conference, the sponsor of this work may want to submit a patent application protecting IP associated with it. For that reason, we cannot reveal further details about the work at this time.
Applications
Session Chairs:
Jan Allebach, Purdue University (United States); Raja Bala, Amazon (United States); and Qian Lin, HP Labs, HP Inc. (United States)
15:00 – 16:00
Red Room
15:00 IMAGE-300
Automatic facial skin feature detection for everyone, Qian Zheng1, Ankur Purwar2, Heng Zhao1, Guang Liang Lim1, Ling Li1, Debasish Behera2, Qian Wang1, Min Tan1, Rizhao Cai1, Jennifer Werner2, Dennis Sng1, Maurice van Steensel1, Weisi Lin1, and Alex C. Kot1; 1Nanyang Technological University and 2Procter & Gamble (Singapore)
The automatic assessment and understanding of facial skin condition has several research and consumer applications, including the early detection of underlying health problems; suggested lifestyle and dietary changes, such as less sun exposure or more hydration; and the identification of recommended skin-care products that can improve overall facial skin health. We present an automatic skin quality assessment method that works across a variety of skin tones and age groups for selfies in the wild. Such selfies are an excellent data source for democratising skin quality assessment, but they pose several image-capture challenges, e.g., varied lighting environments and poses. We leverage deep learning to tackle these challenges: we annotate the locations of acne, pigmentation, and wrinkles in selfie images spanning different skin tones, severity levels, and lighting conditions. The annotation is conducted in a two-phase scheme with the help of a dermatologist. Our neural network uses the UNet++ architecture. This project shows that the two-phase annotation scheme can robustly produce accurate locations of acne, pigmentation, and wrinkles in selfie images across ethnicities, skin tones, severity levels, age groups, and lighting conditions.
15:20 IMAGE-288
Mix-loss trained bias-removed blind image denoising network, Yi Yang1, Chih-Hsien Chou2, and Jan P. Allebach1; 1Purdue University and 2Futurewei Technologies, Inc (United States)
We study modern deep convolutional neural networks for image denoising, in which RGB input images are transformed into RGB output images via feed-forward convolutional neural networks trained with a loss defined in the RGB color space. Considering the difference between human visual perception and objective evaluation metrics such as PSNR or SSIM, we propose a data augmentation technique and demonstrate that it is equivalent to defining a perceptual loss function. We train a network on this basis and obtain visually pleasing denoised results. We also combine an unsupervised design with a bias-free network to address the overfitting caused by the absence of clean images, and to improve performance when the noise level exceeds the training range.
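The bias-free design mentioned in this abstract has a well-known consequence: a feed-forward network whose layers have no additive bias terms is positively homogeneous, f(αx) = αf(x) for α ≥ 0, since linear maps and ReLU both commute with positive scaling. This is one reason such networks generalize across noise levels. A toy illustration with hypothetical weights (a two-layer ReLU "network", not the authors' model):

```python
def bias_free_net(x, W1, W2):
    """Two-layer ReLU network with no additive bias terms."""
    # Hidden layer: linear map followed by ReLU.
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]
    # Output layer: pure linear map.
    return [sum(w * hi for w, hi in zip(row, h)) for row in W2]

# Made-up weights for illustration.
W1 = [[0.5, -0.2], [0.3, 0.8]]
W2 = [[1.0, -0.5]]

x = [2.0, 3.0]
y = bias_free_net(x, W1, W2)
y2 = bias_free_net([2 * xi for xi in x], W1, W2)
# Doubling the input doubles the output, because without bias terms
# every layer is positively homogeneous.
print(y, y2)
```

Adding any nonzero bias term breaks this property, which is why bias-free networks behave more predictably when test-time noise levels fall outside the training range.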
15:40 IMAGE-302
Billion-scale pretrained unified visual embedding and its applications in Pinterest, Josh Beal1, Rex Wu1, Seth Park1,2, Kofi Boakye1, Bin Shen1, Andrew Zhai1, and Chuck Rosenberg1; 1Pinterest and 2University of California, Berkeley (United States)
The talk will cover how Pinterest trains a unified embedding from billions of images and how the embedding is used across Pinterest applications.