Image Quality and System Performance XX
Monday 16 January 2023
20th Anniversary: A Tour of Quality Assessment and System Performance (M1)
Session Chair: Mohamed Chaker Larabi, Université de Poitiers (France)
8:45 – 10:20 AM
Cyril Magnin III
8:45
Conference Welcome, Chaker Larabi
8:50
Twenty years in twenty minutes, Peter Burns
9:10 IQSP-450
Subjective image quality: Beauty and the Beast in human vision (Invited), Göte S. Nyman, University of Helsinki (Finland)
For the "20th Anniversary: A Tour of Quality Assessment and System Performance" session. The classic problem of objective vs. subjective image quality is getting a new boost as AI meets high-quality human vision and visual experience. With ever better displays, cameras, image processing tools, generators, and algorithms, the criteria for good and excellent image quality are pushed further from traditional quality metrics. We face the question of how to measure and model high subjective image quality and visual experience. So far, there is no standard approach to this in the field of imaging. Subjective image quality metrics are expected to provide relevant data for R&D and computational purposes, and especially for evolving, automated, ML-based assessment. Ranking, rating, and preference data are not enough. In the talk, the historical, methodological, and somewhat philosophical background of qualitative methods in image quality assessment and profiling is described and their future possibilities are considered. This is based on our twenty years of experience in applying qualitative methods in different imaging contexts, from print and publishing to mobile phone camera development.
9:45 IQSP-451
Displays and lighting: What do they have in common? (Invited), Ingrid Heynderickx, Eindhoven University of Technology (the Netherlands)
For the "20th Anniversary: A Tour of Quality Assessment and System Performance" session. Experiments have illustrated that users mainly focus on artefacts when assessing the quality of images or light created with still-developing technologies. Once the technologies have matured, quality is considered a broader concept more related to the overall viewing experience. Not only is this trend similar across the two largely separate fields of display quality and lighting quality; several of the underlying visual principles also share the same origin in the visual system, yet are investigated separately for displays and lighting. Examples that will be discussed are color break-up, brightness perception, and light adaptation. What can the display community learn from the lighting community and vice versa?
10:20 – 10:40 AM Coffee Break
20th Anniversary: A Tour of Quality Assessment and System Performance (M2)
Session Chairs: Mohamed Chaker Larabi, Université de Poitiers (France) and Jonathan Phillips, Imatest, LLC (United States)
10:40 AM – 12:30 PM
Cyril Magnin III
10:40 IQSP-452
The revolutionary advancement of camera phone image quality (Invited), Jonathan Phillips, Imatest, LLC (United States)
For the "20th Anniversary: A Tour of Quality Assessment and System Performance" session. The global impact of camera phones is multi-faceted, influencing technological advances, user interface design, cloud storage, and image sharing methodologies. The sheer volume of camera phone ownership has dwarfed the existing number of digital still cameras as the camera phone market segment grew from tens of millions in the early acceptance years in Japan to annual global sales volumes of over 1 billion for nearly 10 years and counting. This has enabled and pushed forward revolutionary image quality advancement of the cameras incorporated in these multifunctional devices, progressing from 0.11 MP image sensors with 2-inch displays in 1999 to current maximums of 200 MP sensors and 8-inch foldable displays. This overview will provide example images and image quality metrics showing the progression over the past twenty years. Content will also highlight significant technological advancements impacting image quality attributes such as resolution, low light performance, dynamic range, zoom, and bokeh.
11:10 IQSP-453
23 years of ISO 12233 resolution measurement (Invited), Dietmar Wueller, Image Engineering GmbH & Co. KG (Germany)
For the "20th Anniversary: A Tour of Quality Assessment and System Performance" session. Thirty years ago, ISO/TC42 WG18, a newly created ISO working group on digital photography, began developing a standard to measure the spatial resolution of digital cameras. After years of proposals, testing, and analysis, consensus was reached on a test chart with tilted edge features for measuring spatial frequency response (SFR) and hyperbolic wedges for measuring visual and limiting resolution. The group ensured that the test chart and analysis software would be available internationally. First published in 2000, ISO 12233 is now used to measure cameras in a wide range of applications. It was revised in 2014 to define three new charts: a sine wave modulated target in polar format, a low contrast e-SFR target, and the CIPA chart with software that computes a "human equivalent visual resolution" value. Edition 4 of ISO 12233 will soon be published. It expands the e-SFR measurement with a polynomial fit function for cases of high distortion, adds the analysis of sagittal and tangential edge orientation, and specifies both a way to determine acutance from the measured SFRs and a compensation for non-uniform illumination.
11:20 IQSP-456
Limits of MTF in practice (Invited), Alexander Braun, Düsseldorf University of Applied Sciences (Germany)
For the "20th Anniversary: A Tour of Quality Assessment and System Performance" session. The modulation transfer function (MTF) is one of the most established metrics to gauge the 'sharpness' of imaging systems. Solidly based on linear system theory, it was standardized decades ago, and the main standard, ISO 12233, is constantly being evolved and improved, as demonstrated by the many talks and discussions at EI 2023. In automotive mass production, though, the MTF is difficult to implement in a stable and reproducible way. Also, it is currently not traceable to fundamental SI units or, more practically speaking, to national metrology institutes such as NIST (Boulder, CO) or the PTB (Germany). This is important, as the MTF is used end-of-line in mass production to validate the correct operation of the produced camera system. Finally, our research results indicate that in some circumstances the MTF as a metric does not correlate well with the performance of ML/AI-based algorithms, which are a pillar of modern computer vision.
11:30 IQSP-454
Measuring camera information capacity with slanted-edges (Invited), Norman Koren, Imatest LLC (United States)
For the "20th Anniversary: A Tour of Quality Assessment and System Performance" session. We describe a new calculation of camera information capacity, C, derived from standard 4:1 contrast ratio slanted edges, that takes advantage of an overlooked capability of the slanted edge that allows the variance and hence the noise of the edge to be calculated in addition to the mean. The average signal and noise power derived from the edge can be entered into the Shannon-Hartley equation to calculate the information capacity of the 4:1 edge signal, C[4]. Since C[4] is highly sensitive to exposure, we have developed a more consistent metric, C[max], derived from the maximum allowed signal in the file, making it an excellent approximation of the camera’s maximum information capacity. Information capacities C[4] and C[max] are excellent figures of merit for system performance because they combine the effects of MTF and noise. They have great potential for predicting the performance of Machine Vision and Artificial Intelligence systems. They are easy to calculate, requiring no extra effort beyond the standard slanted-edge MTF calculation.
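As a rough illustration of the Shannon-Hartley step mentioned in the abstract above, the following minimal sketch computes a capacity from an average signal power and noise power. It is not the speaker's procedure: how signal and noise power are derived from the slanted edge is not described here, and the function name, the unit bandwidth default, and the sample values are all assumptions made for illustration.

```python
import math


def shannon_hartley_capacity(signal_power, noise_power, bandwidth=1.0):
    """Return C = W * log2(1 + S/N).

    With the default bandwidth of 1.0 (an assumption), the result is in
    bits per unit of bandwidth. All names here are hypothetical.
    """
    if signal_power < 0 or noise_power <= 0 or bandwidth <= 0:
        raise ValueError("signal_power >= 0, noise_power > 0, bandwidth > 0 required")
    return bandwidth * math.log2(1.0 + signal_power / noise_power)


# Made-up values: a signal power three times the noise power.
capacity = shannon_hartley_capacity(signal_power=3.0, noise_power=1.0)
print(capacity)
```

Because the capacity depends on the ratio of signal to noise power, it falls when either the edge contrast (signal) drops or the noise rises, which is consistent with the abstract's point that the figure combines the effects of MTF and noise.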
11:40 IQSP-455
From BxU to integrated information capacity, a brief history of MTF based KPIs at DXOMARK (Invited), Laurent Chanas, DxOMark Image Labs (France)
For the "20th Anniversary: A Tour of Quality Assessment and System Performance" session. The MTF curve needs an expert to understand its meaning. DXOMARK has tried different ways to convert the MTF into a scalar that is meaningful to less expert users. We first proposed a new metric named BxU, then used the classical acutance. Finally, we proposed a way to compute the Perceptual-Mpix, which is linked to the information capacity of a camera. This concept encompasses both the sensor noise level and the MTF to provide the maximum information that can be captured in the image provided by the sensor. We have recently proposed a new computation of the information capacity of a camera that uses the full lens and sensor characterization and a radial model of the lens defects.
11:50
Panel Discussion
12:30 – 2:00 PM Lunch
Monday 16 January PLENARY: Neural Operators for Solving PDEs
Session Chair: Robin Jenkin, NVIDIA Corporation (United States)
2:00 PM – 3:00 PM
Cyril Magnin I/II/III
Deep learning surrogate models have shown promise in modeling complex physical phenomena such as fluid flows, molecular dynamics, and material properties. However, standard neural networks assume finite-dimensional inputs and outputs, and hence cannot withstand a change in resolution or discretization between training and testing. We introduce Fourier neural operators that can learn operators, which are mappings between infinite-dimensional spaces. They are independent of the resolution or grid of the training data and allow for zero-shot generalization to higher-resolution evaluations. When applied to weather forecasting, neural operators capture fine-scale phenomena and show skill similar to that of gold-standard numerical weather models for predictions up to a week or longer, while being 4 to 5 orders of magnitude faster.
Anima Anandkumar, Bren Professor, California Institute of Technology, and Senior Director of AI Research, NVIDIA Corporation (United States)
Anima Anandkumar is a Bren Professor at Caltech and Senior Director of AI Research at NVIDIA. She is passionate about designing principled AI algorithms and applying them to interdisciplinary domains. She has received several honors such as the IEEE Fellowship, Alfred P. Sloan Fellowship, NSF Career Award, and Faculty Fellowships from Microsoft, Google, Facebook, and Adobe. She is part of the World Economic Forum's Expert Network. Anandkumar received her BTech from the Indian Institute of Technology Madras and her PhD from Cornell University, did her postdoctoral research at MIT, and held an assistant professorship at the University of California, Irvine.
3:00 – 3:30 PM Coffee Break
Subjective Quality Assessment (M3)
Session Chair: Sophie Triantaphillidou, University of Westminster (United Kingdom)
3:30 – 4:50 PM
Cyril Magnin III
3:30 IQSP-301
Image demosaicing: Subjective analysis and evaluation of image quality metrics, Tawsin Uddin Ahmed, Seyed Ali Amirshahi, and Marius Pedersen, Norwegian University of Science and Technology (Norway)
Most cameras use a single-sensor arrangement with a Color Filter Array (CFA). Color interpolation techniques performed during image demosaicing are normally the reason behind visual artifacts generated in a captured image. While the severity of the artifacts depends on the demosaicing methods used, the artifacts themselves are mainly zipper artifacts (block artifacts across the edges) and false-color distortions. In this study, to evaluate the performance of demosaicing methods, a subjective pair-comparison experiment with 15 observers was performed on six different methods (namely Nearest Neighbours, Bilinear interpolation, Laplacian, Adaptive Laplacian, Smooth hue transition, and Gradient-Based image interpolation) and nine different scenes. The subjective scores and scene images are then collected as a dataset and used to evaluate a set of no-reference image quality metrics. Assessment of the performance of these image quality metrics in terms of correlation with the subjective scores shows that many of the evaluated no-reference metrics cannot predict perceived image quality.
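The correlation analysis described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which correlation coefficient was used, so Spearman rank correlation is swapped in here as one common choice, and the function names and the score values are hypothetical, not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(metric_scores, subjective_scores):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(metric_scores) != len(subjective_scores) or len(metric_scores) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(metric_scores), average_ranks(subjective_scores))


# Made-up scores for six demosaicing methods on one scene.
metric = [0.41, 0.55, 0.62, 0.70, 0.58, 0.66]
subjective = [-1.2, -0.3, 0.4, 1.1, 0.1, 0.2]
print(spearman(metric, subjective))
```

A coefficient near 1 would mean the metric orders the methods the way observers do; values near 0 correspond to the abstract's finding that a metric cannot predict perceived quality.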
3:50 IQSP-302
Age-specific perceptual image quality assessment, Yinan Wang¹, Andrei Chubarau¹, Tara Akhavan², Hyunjin Yoo², and James Clark¹; ¹McGill University and ²Forvia (Canada)
With the development of image-based applications, assessing the quality of images has become increasingly important. Although our perception of image quality changes as we age, most existing image quality assessment (IQA) metrics make simplifying assumptions about the age of observers, thus limiting their use for age-specific applications. In this work, we propose a personalized IQA metric to assess the perceived image quality of observers from different age groups. Firstly, we apply an age simulation algorithm to compute how an observer with a particular age would perceive a given image. More specifically, we process the input image according to an age-specific contrast sensitivity function (CSF), which predicts the reduction of contrast visibility associated with the aging eye. We combine age simulation with existing IQA metrics to calculate the age-specific perceived image quality score. To validate the effectiveness of our combined model, we conducted a psychophysical experiment in a controlled laboratory environment with young (18-31 y.o.), middle-aged (32-52 y.o.), and older (53+ y.o.) adults, measuring their image quality preferences for 84 test images. Our analysis shows that the predictions by our age-specific IQA metric are well correlated with the collected subjective IQA results from our psychophysical experiment.
4:10 IQSP-303
A method for evaluating camera auto-focusing performance using a transparent display device, Seungwan Jeon, Kichul Park, Sung-Su Kim, and Yitae Kim, Samsung Electronics (Republic of Korea)
With the development of various autofocusing (AF) technologies, sensor manufacturers are required to evaluate their performance accurately. The basic method of evaluating AF performance is to measure the time taken to obtain the refocused image, and the sharpness of that image, while repeatedly inducing the refocusing process. Traditionally, this process was conducted manually by covering and uncovering an object or sensor repeatedly, which can lead to unreliable results due to human error and the light-blocking method. To deal with this problem, we propose a new device and solutions using a transparent display. Our method can provide more reliable results than the existing method by modulating the opacity, pattern, and repetition cycle of the target on the transparent display.
EI 2023 Highlights Session
Session Chair: Robin Jenkin, NVIDIA Corporation (United States)
3:30 – 5:00 PM
Cyril Magnin II
Join us for a session that celebrates the breadth of what EI has to offer with short papers selected from EI conferences.
NOTE: The EI-wide "EI 2023 Highlights" session is concurrent with Monday afternoon COIMG, COLOR, IMAGE, and IQSP conference sessions.
IQSP-309
Evaluation of image quality metrics designed for DRI tasks with automotive cameras, Valentine Klein, Yiqi LI, Claudio Greco, Laurent Chanas, and Frédéric Guichard, DXOMARK (France)
Driving assistance is increasingly used in new car models. Most driving assistance systems are based on automotive cameras and computer vision. Computer Vision, regardless of the underlying algorithms and technology, requires the images to have good image quality, defined according to the task. This notion of good image quality is still to be defined in the case of computer vision as it has very different criteria than human vision: humans have a better contrast detection ability than image chains. The aim of this article is to compare three different metrics designed for detection of objects with computer vision: the Contrast Detection Probability (CDP) [1, 2, 3, 4], the Contrast Signal to Noise Ratio (CSNR) [5] and the Frequency of Correct Resolution (FCR) [6]. For this purpose, the computer vision task of reading the characters on a license plate will be used as a benchmark. The objective is to check the correlation between the objective metric and the ability of a neural network to perform this task. Thus, a protocol to test these metrics and compare them to the output of the neural network has been designed and the pros and cons of each of these three metrics have been noted.
SD&A-224
Human performance using stereo 3D in a helmet mounted display and association with individual stereo acuity, Bonnie Posselt, RAF Centre of Aviation Medicine (United Kingdom)
Binocular Helmet Mounted Displays (HMDs) are a critical part of the aircraft system, allowing information to be presented to the aviator with stereoscopic 3D (S3D) depth, potentially enhancing situational awareness and improving performance. The utility of S3D in an HMD may be linked to an individual’s ability to perceive changes in binocular disparity (stereo acuity). Though minimum stereo acuity standards exist for most military aviators, current test methods may be unable to characterise this relationship. This presentation will investigate the effect of S3D on performance when used in a warning alert displayed in an HMD. Furthermore, any effect on performance, ocular symptoms, and cognitive workload shall be evaluated in regard to individual stereo acuity measured with a variety of paper-based and digital stereo tests.
IMAGE-281
Smartphone-enabled point-of-care blood hemoglobin testing with color accuracy-assisted spectral learning, Sang Mok Park¹, Yuhyun Ji¹, Semin Kwon¹, Andrew R. O’Brien², Ying Wang², and Young L. Kim¹; ¹Purdue University and ²Indiana University School of Medicine (United States)
We develop an mHealth technology for noninvasively measuring blood Hgb levels in patients with sickle cell anemia, using the photos of peripheral tissue acquired by the built-in camera of a smartphone. As an easily accessible sensing site, the inner eyelid (i.e., palpebral conjunctiva) is used because of the relatively uniform microvasculature and the absence of skin pigments. Color correction (color reproduction) and spectral learning (spectral super-resolution spectroscopy) algorithms are integrated for accurate and precise mHealth blood Hgb testing. First, color correction using a color reference chart with multiple color patches extracts absolute color information of the inner eyelid, compensating for smartphone models, ambient light conditions, and data formats during photo acquisition. Second, spectral learning virtually transforms the smartphone camera into a hyperspectral imaging system, mathematically reconstructing high-resolution spectra from color-corrected eyelid images. Third, color correction and spectral learning algorithms are combined with a spectroscopic model for blood Hgb quantification among sickle cell patients. Importantly, single-shot photo acquisition of the inner eyelid using the color reference chart allows straightforward, real-time, and instantaneous reading of blood Hgb levels. Overall, our mHealth blood Hgb tests could potentially be scalable, robust, and sustainable in resource-limited and homecare settings.
AVM-118
Designing scenes to quantify the performance of automotive perception systems, Zhenyi Liu¹, Devesh Shah², Alireza Rahimpour², Joyce Farrell¹, and Brian Wandell¹; ¹Stanford University and ²Ford Motor Company (United States)
We implemented an end-to-end simulation for perception systems, based on cameras, that are used in automotive applications. The open-source software creates complex driving scenes and simulates cameras that acquire images of these scenes. The camera images are then used by a neural network in the perception system to identify the locations of scene objects, providing the results as input to the decision system. In this paper, we design collections of test scenes that can be used to quantify the perception system’s performance under a range of (a) environmental conditions (object distance, occlusion ratio, lighting levels), and (b) camera parameters (pixel size, lens type, color filter array). We are designing scene collections to analyze performance for detecting vehicles, traffic signs and vulnerable road users in a range of environmental conditions and for a range of camera parameters. With experience, such scene collections may serve a role similar to that of standardized test targets that are used to quantify camera image quality (e.g., acuity, color).
VDA-403
Visualizing and monitoring the process of injection molding, Christian A. Steinparz¹, Thomas Mitterlehner², Bernhard Praher², Klaus Straka¹,², Holger Stitz¹,³, and Marc Streit¹,³; ¹Johannes Kepler University, ²Moldsonics GmbH, and ³datavisyn GmbH (Austria)
In injection molding machines the molds are rarely equipped with sensor systems. The availability of non-invasive ultrasound-based in-mold sensors provides better means for guiding operators of injection molding machines throughout the production process. However, existing visualizations are mostly limited to plots of temperature and pressure over time. In this work, we present the result of a design study created in collaboration with domain experts. The resulting prototypical application uses real-world data taken from live ultrasound sensor measurements for injection molding cavities captured over multiple cycles during the injection process. Our contribution includes a definition of tasks for setting up and monitoring the machines during the process, and the corresponding web-based visual analysis tool addressing these tasks. The interface consists of a multi-view display with various levels of data aggregation that is updated live for newly streamed data of ongoing injection cycles.
COIMG-155
Commissioning the James Webb Space Telescope, Joseph M. Howard, NASA Goddard Space Flight Center (United States)
Astronomy is arguably in a golden age, where current and future NASA space telescopes are expected to contribute to this rapid growth in understanding of our universe. The most recent addition to our space-based telescopes dedicated to astronomy and astrophysics is the James Webb Space Telescope (JWST), which launched on 25 December 2021. This talk will discuss the first six months in space for JWST, which were spent commissioning the observatory with many deployments, alignments, and system and instrumentation checks. These engineering activities help verify the proper working of the telescope prior to commencing full science operations. For the session: Computational Imaging using Fourier Ptychography and Phase Retrieval.
HVEI-223
Critical flicker frequency (CFF) at high luminance levels, Alexandre Chapiro¹, Nathan Matsuda¹, Maliha Ashraf², and Rafal Mantiuk³; ¹Meta (United States), ²University of Liverpool (United Kingdom), and ³University of Cambridge (United Kingdom)
The critical flicker fusion (CFF) frequency is the rate of change at which a temporally periodic light begins to appear completely steady to an observer. This value is affected by several visual factors, such as the luminance of the stimulus or its location on the retina. With new high dynamic range (HDR) displays, operating at higher luminance levels, and virtual reality (VR) displays, presenting at wide fields-of-view, the effective CFF may change significantly from values expected for traditional presentation. In this work we use a prototype HDR VR display capable of luminances up to 20,000 cd/m² to gather a novel set of CFF measurements for never-before-examined levels of luminance, eccentricity, and size. Our data is useful for studying the temporal behavior of the visual system at high luminance levels, as well as for setting useful thresholds for display engineering.
HPCI-228
Physics guided machine learning for image-based material decomposition of tissues from simulated breast models with calcifications, Muralikrishnan Gopalakrishnan Meena¹, Amir K. Ziabari¹, Singanallur Venkatakrishnan¹, Isaac R. Lyngaas¹, Matthew R. Norman¹, Balint Joo¹, Thomas L. Beck¹, Charles A. Bouman², Anuj Kapadia¹, and Xiao Wang¹; ¹Oak Ridge National Laboratory and ²Purdue University (United States)
Material decomposition of Computed Tomography (CT) scans using projection-based approaches, while highly accurate, poses a challenge for medical imaging researchers and clinicians due to limited or no access to projection data. We introduce a deep learning image-based material decomposition method guided by physics and requiring no access to projection data. The method is demonstrated to decompose tissues from simulated dual-energy X-ray CT scans of virtual human phantoms containing four materials: adipose, fibroglandular, calcification, and air. The method uses a hybrid unsupervised and supervised learning technique to tackle the material decomposition problem. We take advantage of the unique X-ray absorption rate of calcium compared to body tissues to perform a preliminary segmentation of calcification from the images using unsupervised learning. We then perform supervised material decomposition using a deep-learned U-Net model, which is trained using GPUs in the high-performance systems at the Oak Ridge Leadership Computing Facility. The method is demonstrated on simulated breast models to decompose calcification, adipose, fibroglandular, and air.
3DIA-104
Layered view synthesis for general images, Loïc Dehan, Wiebe Van Ranst, and Patrick Vandewalle, Katholieke University Leuven (Belgium)
We describe a novel method for monocular view synthesis. The goal of our work is to create a visually pleasing set of horizontally spaced views based on a single image. This can be applied in view synthesis for virtual reality and glasses-free 3D displays. Previous methods produce realistic results on images that show a clear distinction between a foreground object and the background. We aim to create novel views in more general, crowded scenes in which there is no clear distinction. Our main contribution is a computationally efficient method for realistic occlusion inpainting and blending, especially in complex scenes. Our method can be effectively applied to any image, which is shown both qualitatively and quantitatively on a large dataset of stereo images. Our method performs natural disocclusion inpainting and maintains the shape and edge quality of foreground objects.
ISS-329
A self-powered asynchronous image sensor with independent in-pixel harvesting and sensing operations, Ruben Gomez-Merchan, Juan Antonio Leñero-Bardallo, and Ángel Rodríguez-Vázquez, University of Seville (Spain)
A new self-powered asynchronous sensor with a novel pixel architecture is presented. Pixels are autonomous and can harvest or sense energy independently. During image acquisition, pixels toggle to a harvesting operation mode once they have sensed their local illumination level. With the proposed pixel architecture, the most illuminated pixels provide an early contribution to powering the sensor, while the least illuminated ones spend more time sensing their local illumination. Thus, the equivalent frame rate is higher than that offered by conventional self-powered sensors that harvest and sense illumination in independent phases. The proposed sensor uses a Time-to-First-Spike readout that allows trading image quality against data and bandwidth consumption. The sensor has HDR operation with a dynamic range of 80 dB. Pixel power consumption is only 70 pW. In the article, we describe the sensor's and pixel's architectures in detail. Experimental results are provided and discussed. Sensor specifications are benchmarked against the state of the art.
COLOR-184
Color blindness and modern board games, Alessandro Rizzi¹ and Matteo Sassi²; ¹Università degli Studi di Milano and ²consultant (Italy)
The board game industry is experiencing strong renewed interest. In the last few years, about 4000 new board games have been designed and distributed each year. The gender balance among board game players is approaching equality, but male players are currently still a slight majority. This means that (at least) around 10% of board game players are color blind. How does the board game industry deal with this? Awareness in board game design has recently begun to rise, but so far there is a big gap compared with, for example, the computer game industry. This paper presents some data about the current situation, discussing exemplary cases of successful board games.
5:00 – 6:15 PM EI 2023 All-Conference Welcome Reception (in the Cyril Magnin Foyer)
Tuesday 17 January 2023
KEYNOTE: Perceptual Video Quality 1 (T1)
Session Chairs: Lukáš Krasula, Netflix, Inc. (United States) and Mohamed Chaker Larabi, Université de Poitiers (France)
9:05 – 10:10 AM
Cyril Magnin III
This session is jointly sponsored by: Human Vision and Electronic Imaging 2023, and Image Quality and System Performance XX.
Joint Conference Welcome
HVEI-258
KEYNOTE: Bringing joy to Netflix members through perceptual encoding optimization, Anne Aaron, Netflix, Inc. (United States)
As Director of Encoding Technologies, Anne Aaron leads the team responsible for media processing and encoding at Netflix. Her team works on video, audio, images and timed-text, from analysis to processing, encoding, packaging and DRM. On the streaming side, they strive to deliver a compelling viewing experience for millions of Netflix members worldwide, no matter where, how and what they watch. For the Netflix studio, they build media technologies that can improve content production. In her previous role at Netflix, Aaron led the Video Algorithms team. As a team, they researched and deployed innovation in the video encoding space (per-title encoding, video quality assessment and perceptual metrics, shot-based encoding, HDR, next-generation codecs) that benefited Netflix members and impacted the rest of the industry. Recent recognitions include the SMPTE 2019 Workflow Systems Medal, Forbes' 2018 America's Top Women in Tech, and Business Insider's 2017 Most Powerful Female Engineers in US Tech.
Audio and video compression are immensely important to Netflix, as well as to internet service providers (ISPs). It has been estimated that our codec optimization efforts, together with the Open Connect program, saved ISPs over 1 billion dollars in 2021 alone. The keynote will talk about the importance of perceptual models and optimization for delivering hits such as Stranger Things, Squid Game, or Red Notice in the highest quality while being mindful of internet traffic. It will cover recent advances in audio and video encoding, innovations in the subjective and objective assessment of quality, as well as immediate and future challenges in this area.
10:00 AM – 7:30 PM Industry Exhibition - Tuesday (in the Cyril Magnin Foyer)
10:20 – 10:50 AM Coffee Break
Perceptual Video Quality 2 (T2)
Session Chairs: Lukáš Krasula, Netflix, Inc. (United States) and Mohamed Chaker Larabi, Université de Poitiers (France)
10:50 AM – 12:30 PM
Cyril Magnin III
This session is jointly sponsored by: Human Vision and Electronic Imaging 2023, and Image Quality and System Performance XX.
10:50 HVEI-259
Video quality of video professionals for Video Assisted Referee (VAR), Kjell Brunnström¹,², Anders Djupsjöbacka¹, Johsan Billingham³, Katharina Wistel³, Börje Andrén¹, Oskars Ozolins¹,⁴, and Nicolas Evans³; ¹RISE Research Institutes of Sweden AB (Sweden), ²Mid Sweden University (Sweden), ³Fédération Internationale de Football Association (FIFA) (Switzerland), and ⁴KTH (Royal Institute of Technology) (Sweden)
Changes in the footballing world’s approach to technology and innovation contributed to the decision by the International Football Association Board (IFAB) to introduce Video Assistant Referees (VAR). The change meant that under strict protocols referees could use video replays to review decisions in the event of a “clear and obvious error” or a “serious missed incident”. This created a need for the Fédération Internationale de Football Association (FIFA) to develop methods for quality control of VAR systems, which was done in collaboration with RISE Research Institutes of Sweden AB. One of the important aspects is the video quality. The novelty of this study is that it performed a user study specifically targeting video experts, i.e., it measured the quality perceived by video professionals who work with video production as their main occupation. An experiment was performed involving 25 video experts. In addition, six video quality models have been benchmarked against the user data and evaluated to show which of the models could provide the best predictions of perceived quality for this application. Video Quality Metric for variable frame delay (VQM_VFD) had the best performance for both formats, followed by Video Multimethod Assessment Fusion (VMAF) and the VQM General model.
11:10 HVEI-260
User perception for dynamic video resolution change using VVC, Sachin G. Deshpande and Philip Cowan, Sharp (United States)
We define experiments that measure user perception when video resolution changes dynamically. The Versatile Video Coding (VVC) standard was recently finalized and includes a reference picture resampling (RPR) tool. VVC RPR supports changing spatial resolution in a coded video sequence on a per-picture basis and defines the downsampling and upsampling filters to be used when changing resolution. This paper provides results from a subjective evaluation in which VVC RPR is used for part of the video sequence to dynamically change resolution. The experiments use different QP values (or bitrates), different RPR scale factors, and different highest original spatial resolutions. The results compare how users perceive video coded using VVC RPR for some pictures against an anchor that does not use RPR. In addition to the subjective results, we describe the performance of various metrics, including PSNR, VMAF, and MS-SSIM. Our results can help choose the highest RPR scale factor that can be used to achieve or maintain a certain perceived quality when using RPR (for example, for bitrate reduction). The study also confirms that MS-SSIM and VMAF match subjective test results more closely than PSNR does.
11:30 IQSP-261
Proposing more ecologically-valid experiment protocol using YouTube platform, Gabriela Wielgus, Lucjan Janowski, Kamil Koniuch, Mikolaj Leszczuk, and Rafal Figlus, AGH University of Science and Technology (Poland)
Video streaming is becoming increasingly popular, and with platforms like YouTube, users do not watch the video passively but seek, pause, and read the comments. The popularity of video services is possible due to the development of compression and quality prediction algorithms. However, those algorithms are developed based on classic experiments, which are not ecologically valid: they do not mimic real user interaction. Further development of the quality and compression algorithms depends on results from ecologically valid experiments, so we aim to propose such experiments. However, proposing a new experimental protocol is difficult, especially when there is no limitation on content selection or control of the video. This freedom makes data analysis more challenging. In this paper, we present an ecologically valid experimental protocol in which subjects assessed quality while freely using YouTube. To achieve this goal, we developed a Chrome extension that collects objective data and allows network manipulation. Our in-depth data analysis shows a correlation between MOS and objectively measured results such as resolution, which indicates that the ecologically valid test works. Moreover, we show significant differences between subjects, allowing for a more detailed understanding of how quality influences interaction with the service.
11:50 IQSP-262
Evaluation of motion blur image quality in video frame interpolation, Hai Dinh, Fangwen Tu, Qinyi Wang, Brett Frymire, and Bo Mu, Omnivision Technology (United States)
While slow motion has become a standard feature in mainstream cell phones, no fast approach exists for assessing slow-motion video quality without relying on specific training datasets. Conventionally, researchers evaluate their algorithms with peak signal-to-noise ratio (PSNR) or structural similarity index measure (SSIM) between ground-truth and reconstructed frames. However, both are global evaluation indices and are more sensitive to noise or distortion introduced by the interpolation. For video interpolation, especially with fast-moving objects, motion blur and ghosting matter more to the audience’s subjective judgment. How to properly evaluate the Video Frame Interpolation (VFI) task is still a problem that is not well addressed.
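As a reminder of why PSNR is called a global index here, the following is a minimal sketch of the standard PSNR definition, written in plain Python for illustration only. The function name, the nested-list frame representation, and the default peak of 255 are assumptions made for this sketch and are not taken from the paper.

```python
import math


def psnr(reference, reconstructed, peak=255.0):
    """Return PSNR in dB between two equally sized grayscale frames.

    Frames are nested lists of pixel values (an assumption of this sketch).
    The error is averaged over the whole frame, so a small ghosted region
    around a fast-moving object changes the score only slightly.
    """
    ref = [p for row in reference for p in row]
    rec = [p for row in reconstructed for p in row]
    if not ref or len(ref) != len(rec):
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over every pixel in the frame.
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because the squared error is pooled over all pixels, two frames with very different local artifacts can receive similar scores, which is the limitation the abstract points to.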
12:10 IQSP-263
Subjective video quality for 4K HDR-WCG content using a browser-based approach for “at-home” testing, Lukáš Krasula1, Anustup Choudhury2, Scott Daly2, Zhi Li1, Robin Atkins2, Ludovic Malfait2, and Aditya Mavlankar1; 1Netflix, Inc. and 2Dolby Laboratories, Inc. (United States)
A subjective quality study of 4K HDR-WCG (3840 x 2160, High Dynamic Range, Wide Color Gamut) video content was performed in an at-home scenario. There are no available datasets on such content, yet they are crucial for the development and testing of objective quality metrics. While at-home testing generally implies a lack of calibration, we sought to maximize calibration by limiting the displays to a specific model of TV that we have calibrated in our lab, for which we found unit-to-unit deviations to be small. Moreover, we performed the experiment in the Dolby Vision mode (where the various enhancements of the TV are turned OFF by default). In addition, we asked subjects to go through procedures to ensure a standard viewing distance of 1.6 picture heights and to eliminate ambient lighting effects on display contrast by viewing in dark or dim conditions. A browser-based approach was used that took control of the TV and ensured the content was viewed at the native resolution of the TV (e.g., dot-on-dot mode). Particular care was given to content selection to probe specific challenge cases of the display behavior as well as human vision (e.g., complex motion effects on eye tracking). Further, several clips were selected that represent the highest quality possible with 2021 technology. We found that the subject response variability was similar to that of lab-based experiments, suggesting that the noise in the results due to display variability and lack of unit-to-unit calibration was less than the within-subject variability due to personal physiology or preferences. Several statistical models and subject-rejection strategies will be compared, and the usefulness of the data for objective metrics will be presented.
12:30 – 2:00 PM Lunch
Tuesday 17 January PLENARY: Embedded Gain Maps for Adaptive Display of High Dynamic Range Images
Session Chair: Robin Jenkin, NVIDIA Corporation (United States)
2:00 PM – 3:00 PM
Cyril Magnin I/II/III
Images optimized for High Dynamic Range (HDR) displays have brighter highlights and more detailed shadows, resulting in an increased sense of realism and greater impact. However, a major issue with HDR content is the lack of consistency in appearance across different devices and viewing environments. There are several reasons, including varying capabilities of HDR displays and the different tone mapping methods implemented across software and platforms. Consequently, HDR content authors can neither control nor predict how their images will appear in other apps.
We present a flexible system that provides consistent and adaptive display of HDR images. Conceptually, the method combines both SDR and HDR renditions within a single image and interpolates between the two dynamically at display time. We compute a Gain Map that represents the difference between the two renditions. In the file, we store a Base rendition (either SDR or HDR), the Gain Map, and some associated metadata. At display time, we combine the Base image with a scaled version of the Gain Map, where the scale factor depends on the image metadata, the HDR capacity of the display, and the viewing environment.
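The display-time combination described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the presenters' implementation: it assumes an SDR Base rendition in linear light, a Gain Map stored in log2 units (stops), and a hypothetical weight that is interpolated in log2 space between two metadata headroom limits. The names `gain_weight`, `apply_gain_map`, `min_headroom`, and `max_headroom` are invented for this sketch, and the viewing-environment term mentioned in the abstract is left out.

```python
import math


def gain_weight(display_headroom, min_headroom, max_headroom):
    """Hypothetical scale factor in [0, 1] for the Gain Map.

    Headrooms are linear ratios of peak HDR white to SDR white.
    The weight is interpolated linearly in log2 space and clamped.
    """
    if not (0 < min_headroom < max_headroom) or display_headroom <= 0:
        raise ValueError("expected 0 < min_headroom < max_headroom "
                         "and a positive display headroom")
    lo, hi = math.log2(min_headroom), math.log2(max_headroom)
    t = (math.log2(display_headroom) - lo) / (hi - lo)
    return min(1.0, max(0.0, t))


def apply_gain_map(base, gain_log2, weight):
    """Multiply each linear Base pixel by 2 ** (weight * gain)."""
    return [
        [b * 2.0 ** (weight * g) for b, g in zip(base_row, gain_row)]
        for base_row, gain_row in zip(base, gain_log2)
    ]
```

With a weight of 0 the output equals the Base rendition, and with a weight of 1 the full Gain Map is applied; intermediate displays get a rendition between the two.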
Eric Chan, Fellow, Adobe Inc. (United States)
Eric Chan is a Fellow at Adobe, where he develops software for editing photographs. Current projects include Photoshop, Lightroom, Camera Raw, and Digital Negative (DNG). When not writing software, Chan enjoys spending time at his other keyboard, the piano. He is an enthusiastic nature photographer and often combines his photo activities with travel and hiking.
Paul M. Hubel, director of Image Quality in Software Engineering, Apple Inc. (United States)
Paul M. Hubel is director of Image Quality in Software Engineering at Apple. He has worked on computational photography and image quality of photographic systems for many years on all aspects of the imaging chain, particularly for iPhone. He trained in optical engineering at University of Rochester, Oxford University, and MIT, and has more than 50 patents on color imaging and camera technology. Hubel is active on the ISO-TC42 committee Digital Photography, where this work is under discussion, and is currently a VP on the IS&T Board. Outside work he enjoys photography, travel, cycling, coffee roasting, and plays trumpet in several bay area ensembles.
3:00 – 3:30 PM Coffee Break
Objective Quality Assessment (T3)
Session Chair:
Peter Burns, Rochester Institute of Tech. (United States)
3:30 – 5:30 PM
Cyril Magnin III
3:30 IQSP-305
Another look at SSIM image quality metric, Yuriy Reznik, Brightcove, Inc. (United States)
We review the design of the SSIM quality metric and offer an alternative model of SSIM computation, utilizing subband decomposition and identical distance measures in each subband. We show that this model performs very close to the original and offers many advantages from a methodological standpoint. It immediately brings several possible explanations of why SSIM is effective. It also suggests a simple strategy for band noise allocation optimizing SSIM scores. This strategy may aid the design of encoders or pre-processing filters for video coding. Finally, this model leads to more straightforward mathematical connections between SSIM, MSE, and SNR metrics, improving previously known results.
3:50 IQSP-306
What are we looking at? An investigation on the use of deep learning models for image quality assessment, Ha Thu Nguyen and Seyed Ali Amirshahi, Norwegian University of Science and Technology (Norway)
In recent years, several Image Quality Metrics (IQMs) have been introduced that compare the feature maps extracted from different pre-trained deep learning models [1-3]. While such objective IQMs have shown a high correlation with subjective scores, little attention has been paid to how they could be used to better understand the Human Visual System (HVS) and how observers evaluate the quality of images. In this study, using different pre-trained Convolutional Neural Network (CNN) models, we identify the most relevant features in Image Quality Assessment (IQA). By visualizing these feature maps, we try to better understand which features play a dominant role when evaluating the quality of images. Experimental results on four benchmark datasets show that the most important feature maps represent repeated textures such as stripes or checkers, and that feature maps linked to the colors blue or orange also play a crucial role. Additionally, when calculating the quality of an image based on a comparison of feature maps, higher accuracy can be reached when only the most relevant feature maps are used instead of all the feature maps extracted from a CNN model.
[1] Amirshahi, Seyed Ali, Marius Pedersen, and Stella X. Yu. "Image quality assessment by comparing CNN features between images." Journal of Imaging Science and Technology 60.6 (2016): 60410-1.
[2] Amirshahi, Seyed Ali, Marius Pedersen, and Azeddine Beghdadi. "Reviving traditional image quality metrics using CNNs." Color and Imaging Conference. Vol. 2018. No. 1. Society for Imaging Science and Technology, 2018.
[3] Gao, Fei, et al. "Deepsim: Deep similarity for image quality assessment." Neurocomputing 257 (2017): 104-114.
4:10 IQSP-307
A framework for the metrification of input image quality in deep networks, Alexandra Psarrou and Sophie Triantaphillidou, University of Westminster (United Kingdom)
Deep Neural Networks (DNNs) are critical for real-time imaging applications including autonomous vehicles. DNNs are often trained and validated with images that originate from a limited number of cameras, each of which has its own hardware and image signal processing (ISP) characteristics. However, in most real-time embedded systems, the input images come from a variety of cameras with different ISP pipelines, and often include perturbations due to a variety of scene conditions. Data augmentation methods are commonly exploited to enhance the robustness of such systems. Alternatively, methods are employed to detect input images that are unfamiliar to the trained networks, including out-of-distribution detection. Despite these efforts, DNNs largely remain systems with operational boundaries that cannot be easily defined. One reason is that, while training and benchmark image datasets include samples with a variety of perturbations, there is a lack of research on the metrification of input image quality suitable for DNNs and on a universal method to relate quality to DNN performance using meaningful quality metrics. This paper addresses this lack of metrification specific to DNN systems and introduces a framework that uses systematic modification of image quality attributes and relates input image quality to DNN performance.
4:30 IQSP-308
Investigating pretrained self-supervised vision transformers for reference-based quality assessment, Kanjar De, Lulea University of Technology (Sweden)
Reference-based image quality assessment techniques use information from an undistorted reference image of the same scene to estimate the quality of a distorted target image. The main challenge in designing algorithms for quality assessment is to incorporate the behavior of the human visual system into the algorithms. The advent of deep learning (DL) techniques has garnered considerable interest among researchers in the field of image quality assessment. A common limitation of applying deep learning to image quality assessment is its dependence on a large amount of subjective training data. Recent advances in patch-based self-supervised vision transformers have achieved remarkable results on downstream computer vision tasks such as object segmentation and copy detection. In this paper, we study how the distance between pretrained self-supervised vision transformer features computed on pristine and distorted images is related to the human visual system. Experiments carried out on three publicly available image quality databases (namely KADID-10K, TID2013, and MDID2016) have yielded promising results that can be further exploited to design perceptual reference-based image quality assessment methods.
4:50 IQSP-309
Evaluation of image quality metrics designed for DRI tasks with automotive cameras, Valentine Klein, Theophanis Eleftheriou, Yiqi LI, Emilie Baudin, Claudio Greco, Laurent Chanas, and Frédéric Guichard, DXOMARK (France)
Driving assistance is increasingly used in new car models. Most driving assistance systems are based on automotive cameras and computer vision. Computer Vision, regardless of the underlying algorithms and technology, requires the images to have good image quality, defined according to the task. This notion of good image quality is still to be defined in the case of computer vision as it has very different criteria than human vision: humans have a better contrast detection ability than image chains. The aim of this article is to compare three different metrics designed for detection of objects with computer vision: the Contrast Detection Probability (CDP) [1, 2, 3, 4], the Contrast Signal to Noise Ratio (CSNR) [5] and the Frequency of Correct Resolution (FCR) [6]. For this purpose, the computer vision task of reading the characters on a license plate will be used as a benchmark. The objective is to check the correlation between the objective metric and the ability of a neural network to perform this task. Thus, a protocol to test these metrics and compare them to the output of the neural network has been designed and the pros and cons of each of these three metrics have been noted.
5:10 IQSP-310
Towards image-computable visual text quality metric with deep neural network, Ling-Qi Zhang1,2, Minjung Kim1, James Hillis1, and Trisha Lian1; 1Meta Reality Labs and 2University of Pennsylvania (United States)
Image quality metrics have become invaluable tools for image processing and display system development. These metrics are typically developed for and tested on images and videos of natural content. Text, on the other hand, has unique features and supports a distinct visual function: reading. It is therefore not clear if these image quality metrics are efficient or optimal as measures of text quality. Here, we developed a domain-specific image quality metric for text and compared its performance against quality metrics developed for natural images. To develop our metric, we first trained a deep neural network to perform text classification on a data set of distorted letter images. We then compute the responses of internal layers of the network to uncorrupted and corrupted images of text, respectively. We used the cosine dissimilarity between the responses as a measure of text quality. Preliminary results indicate that both our model and more established quality metrics (e.g., SSIM) are able to predict general trends in participants’ text quality ratings. In some cases, our model is able to outperform SSIM. We further developed our model to predict response data in a two-alternative forced choice experiment, on which only our model achieved very high accuracy.
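The cosine dissimilarity between two layer responses, as used above for the text quality measure, can be sketched in a few lines. This is an illustrative helper only: the function name and the flat-list representation of a layer's activations are assumptions, and the network that produces the responses is not shown.

```python
import math


def cosine_dissimilarity(clean_response, corrupted_response):
    """Return 1 - cosine similarity of two equally long activation vectors.

    0 means the responses point in the same direction, 1 means they are
    orthogonal, and 2 means they point in opposite directions.
    """
    if not clean_response or len(clean_response) != len(corrupted_response):
        raise ValueError("responses must be non-empty and the same length")
    dot = sum(x * y for x, y in zip(clean_response, corrupted_response))
    norm_clean = math.sqrt(sum(x * x for x in clean_response))
    norm_corrupted = math.sqrt(sum(y * y for y in corrupted_response))
    if norm_clean == 0 or norm_corrupted == 0:
        raise ValueError("cosine dissimilarity is undefined for a zero vector")
    return 1.0 - dot / (norm_clean * norm_corrupted)
```

In this reading, a larger value for a given corrupted image would correspond to lower predicted text quality.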
5:30 – 7:00 PM EI 2023 Symposium Demonstration Session (in the Cyril Magnin Foyer)
Wednesday 18 January 2023
System Performance (W1)
Session Chair:
Jonathan Phillips, Imatest, LLC (United States)
8:50 – 10:10 AM
Cyril Magnin III
8:50 IQSP-311
A tool for deriving camera spatial frequency response from natural scenes (NS-SFR), Oliver van Zwanenberg1, Sophie Triantaphillidou1, and Robin B. Jenkin1,2; 1University of Westminster (United Kingdom) and 2NVIDIA Corporation (United States)
Recent research on digital camera performance evaluation introduced the Natural Scene Spatial Frequency Response (NS-SFR) framework, shown to provide a comparable measure to the ISO12233 edge SFR (e-SFR) but derived outside laboratory conditions. The framework extracts step-edges captured from pictorial natural scenes to evaluate the camera SFR. It has two parts: the first utilizes the ISO12233 slanted-edge algorithm to produce an ‘envelope’ of NS-SFRs, and the second estimates the system e-SFR from this NS-SFR data. One drawback of this proposed methodology has been the computation time. The process was not optimized, as it first derived NS-SFRs from all suitable step-edges and then further validated and statistically treated the results to estimate the e-SFR. This paper presents changes to the framework processes, aiming to optimize the computation time so that it is practical for real-world implementation. The developments include an improved framework structure, a pixel-stretching filter alternative, and the capability to utilize Graphics Processing Unit (GPU) acceleration. In addition, the methodology was updated to utilize the latest e-SFR algorithm implementation. The resulting code has been incorporated into a self-executable user interface prototype, available on GitHub. Future goals include making it an open-access, cloud-based solution to be used by scientists, camera evaluation labs and the general public.
9:10 IQSP-312
Influence of the light source on the image sensor characterization according to EMVA 1288, Ganesh D. Kubina, Max Gäde, and Uwe Artmann, Image Engineering GmbH & Co KG (Germany)
Due to the increasing demand for machine vision applications in a variety of scenarios, it is necessary to know the capability of the hardware before implementing it. The 1288 Standard by the European Machine Vision Association aims to provide a basis to compare the performance of cameras based on a characterization of the image sensor, using a monochrome light source. This paper aims to investigate the influence the light source has on the measurement results. Which parameters are dependent on it, and which are not? Are there any benefits to using a broadband light source? To answer these questions, a series of measurement runs using six different illuminants was performed with the same camera. The illuminants included monochromatic blue, green and red light as well as three different white spectra (CIE E, CIE D65 and white LED). The results show that the influence of the light source on the metrics is limited to the measured quantum efficiency of the camera and related parameters. As a consequence, using a non-monochromatic light source for the measurements might be an option, as it can provide better insight into use-case specific performance and improve comparability.
9:30 IQSP-313
Managing deviant data in spatial frequency response (SFR) measurement by outlier rejection, Peter Burns1 and Don Williams2; 1Burns Digital Imaging and 2Image Science Associates (United States)
The edge-based Spatial Frequency Response (e-SFR) method was first developed for evaluating camera image resolution and image sharpness. The method was described in the first version of the ISO 12233 standard. Since then, the method has been applied in a wide range of applications, including medical, security, archiving, and document processing. However, with this broad application, several of the assumptions of the method are no longer closely followed. This has led to several improvements aimed at broadening its application, for example for lenses with spatial distortion. We can think of the evaluation of image quality parameters as an estimation problem, based on the gathered data, often from digital images. In this paper, we address the mitigation of measurement error that is introduced when the analysis is applied to low-exposure (and therefore, noisy) applications and those with small analysis regions. We consider the origins of both bias and variation in the resulting SFR measurement and present practical ways to reduce them. We describe the screening of outlier edge-location values as a method for improved edge detection. This, in turn, is related to a reduction in negative bias in the resulting SFR.
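One way to picture the screening of outlier edge-location values is the sketch below: fit a straight line to the per-row edge locations, drop points whose residual is far from the median residual, and refit. The rejection rule (median absolute deviation with a threshold `k`) and all function names are assumptions chosen for illustration; the abstract does not state which rule the authors use.

```python
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def fit_edge_with_rejection(rows, locations, k=3.0):
    """Fit the edge line, reject outlying edge locations, then refit.

    Returns (slope, intercept, kept_indices). The MAD-based threshold is
    a hypothetical choice, not the rule from the paper.
    """
    slope, intercept = fit_line(rows, locations)
    residuals = [y - (slope * x + intercept) for x, y in zip(rows, locations)]
    center = statistics.median(residuals)
    mad = statistics.median([abs(r - center) for r in residuals])
    scale = 1.4826 * mad  # MAD scaled to match sigma for Gaussian noise
    if scale == 0:
        return slope, intercept, list(range(len(rows)))
    kept = [i for i, r in enumerate(residuals) if abs(r - center) <= k * scale]
    slope, intercept = fit_line([rows[i] for i in kept],
                                [locations[i] for i in kept])
    return slope, intercept, kept
```

For example, with ten rows lying on a slanted edge and one row whose detected location is three pixels off, the refit recovers the line defined by the remaining nine rows.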
9:50 IQSP-314
Optimization of ISP parameters for low light conditions using a non-linear reference based approach, Shubham Ravindra Alai1, Radhesh Bhat1, and Ajay Basarur2; 1PathPartner Technology - Member of KPIT Group (India) and 2presenter only (United States)
An image signal processor (ISP) transforms a sensor's raw image into an RGB image for use in computer or human vision applications. An ISP is composed of various functional blocks, and each block contributes uniquely to making the image best suited to the target application. Each block has several hyperparameters, and each hyperparameter needs to be tuned (usually manually by experts in an iterative manner) to achieve the target image quality. Tuning becomes challenging and increasingly iterative, especially in low to very low light conditions, where the amount of detail preserved by the sensor is limited and ISP parameters have to be tuned to balance recovered detail, noise, sharpness, contrast, etc. To extract maximum information from the image, it is usually necessary to increase the ISO gain, which in turn affects noise and color accuracy. Also, the number of ISP parameters that need to be tuned is huge, and it becomes impractical to consider all of them in such low light conditions to arrive at the best possible settings. To tackle the challenges of manual tuning, especially for low light conditions, we have implemented an automatic hyperparameter optimization model that can tune low-lux images so that they are perceptually equivalent to high-lux images. The experiments for IQ validation are carried out under challenging low light conditions and scenarios using Qualcomm’s Spectra ISP simulator with a 13MP OV sensor, and the performance of automatically tuned IQ is compared with manually tuned IQ for human vision use cases. Our experimental results show that, with the help of evolutionary algorithms and local optimization, it is possible to optimize the ISP parameters without using any KPI metrics so that a low-lux image, or an image captured with a different ISP (the test image), is perceptually improved to be equivalent to a high-lux or well-tuned (reference) image.
10:00 AM – 3:30 PM Industry Exhibition - Wednesday (in the Cyril Magnin Foyer)
10:20 – 10:50 AM Coffee Break
Mobile and Camera Quality Assessment (W2)
Session Chair:
Elaine Jin, Rivian Automotive, Inc. (United States)
10:50 AM – 12:30 PM
Cyril Magnin III
10:50 IQSP-315
Image quality performance of CMOS image sensor equipped with Nano Prism, Sungho Cha, Samsung Electronics (Republic of Korea)
Smartphones with 100-million-pixel sensors are on the market, and higher-resolution mobile camera modules with 200 million pixels or more are expected to follow. To develop high-resolution sensor products by fitting more pixels into a limited space, it is necessary to reduce the pixel size. Sensors with 0.64 µm pixels are currently on the market, and sensors with even smaller pixels are expected in the future. In terms of image quality, the smaller the pixel, the less light it receives, so image quality deteriorates in terms of noise and crosstalk. To overcome this limitation, various high-sensitivity sensors are being developed, and mounting a Nano Prism is advantageous in developing such sensors. In this paper, we present the image quality performance of a CMOS image sensor equipped with Nano Prism.
11:10 IQSP-316
Noise quality estimation on portraits in realistic controlled scenarios, Nicolas Chahine1,2, Samuel S. Santos3, Sofiene Lahouar1, Ana-Stefania Calarasanu1, Sira Ferradans1, Benoit Pochon1, and Frédéric Guichard1; 1DXOMARK Image Labs, 2INRIA, and 3Parrot (France)
The wide use of cameras by the public has raised interest in image quality evaluation and ranking. Current cameras embed complex processing pipelines that adapt strongly to the scene content by implementing, for instance, advanced noise reduction or local adjustment on faces. However, current methods of image quality assessment are based on static geometric charts, which are not representative of common camera usage, which mostly targets portraits. Moreover, on non-synthetic content, the most relevant features, such as detail preservation or noisiness, are often intractable. To overcome this situation, we propose to mix classical measurements and machine learning (ML) based methods: we reproduce realistic content that triggers these complex processing pipelines in controlled lab conditions, which allows for rigorous quality assessment. ML-based methods can then reproduce previously annotated perceptual quality. In this paper, we focus on noise quality evaluation and test on two different setups: close-up and distant portraits. These setups provide flexibility in scene capture conditions and, most of all, allow the evaluation of the full range of camera quality, from high-quality DSLRs to poor-quality video conference cameras. Our numerical results show the relevance of our solution compared to geometric charts and the importance of adapting to realistic content.
11:30 IQSP-317
VCX – Version 2023 – The latest transparent and objective mobile phone test scheme, Uwe Artmann1 and Anthony L. Orchard2; 1Image Engineering GmbH & Co KG (Germany) and 2Intel Corporation (United States)
VCX, or Valued Camera eXperience, is a nonprofit organization dedicated to the objective and transparent evaluation of mobile phone cameras. The members continuously work on the development of a test scheme that can provide an objective score for camera performance. Every device is tested for a variety of image quality factors, which are typically based on existing standards. This paper presents the latest development, the newly released version 2023, and the process behind it. New metrics include extended tests on video dynamics, AE, and AWB; dedicated tests on ultra-wide modules; and adjustments to the metric system based on a large-scale subjective study.
11:50 IQSP-318
VCX – A transparent and objective test scheme for webcams, Uwe Artmann1 and Anthony L. Orchard2; 1Image Engineering GmbH & Co KG (Germany) and 2Intel Corporation (United States)
VCX, or Valued Camera eXperience, is a nonprofit organization dedicated to the objective and transparent evaluation of consumer camera devices such as mobile phones and webcams. The members continuously work on the development of a test scheme that can provide an objective score for camera performance. We present the newly developed test scheme for webcams used in video conference systems. The test scheme covers many different aspects of camera performance, including global image quality factors such as AE, AWB, and color, and local image quality factors such as resolution, texture, and sharpening. The test procedure considers state-of-the-art algorithms and covers more challenging situations and scenes than existing test schemes.
12:10 IQSP-319
Improvement of the flare evaluation and applications in NIR, Elodie Souksava, Emilie Baudin, Claudio Greco, Hoang-Phi Nguyen, Laurent Chanas, and Frédéric Guichard, DxOMark Image Labs (France)
Near-infrared (NIR) light sources have become increasingly present in our daily lives, which has led to growth in the number of cameras designed for viewing in the NIR spectrum (sometimes in addition to the visible) in the automotive, mobile, and surveillance sectors. However, camera evaluation metrics are still mainly focused on sensors operating in visible light. The goal of this article is to extend our existing flare setup and objective flare metric to quantify NIR flare for different cameras and to evaluate the performance of several NIR filters. We also compare the results in both visible and NIR lighting for different types of devices. Moreover, we propose a new method to measure the ISO speed rating in the visible light spectrum (originally defined in the ISO standard 12232) and an equivalent ISO for the NIR spectrum with our flare setup.
12:30 – 2:00 PM Lunch
Wednesday 18 January PLENARY: Bringing Vision Science to Electronic Imaging: The Pyramid of Visibility
Session Chair: Andreas Savakis, Rochester Institute of Technology (United States)
2:00 PM – 3:00 PM
Cyril Magnin I/II/III
Electronic imaging depends fundamentally on the capabilities and limitations of human vision. The challenge for the vision scientist is to describe these limitations to the engineer in a comprehensive, computable, and elegant formulation. Primary among these limitations are visibility of variations in light intensity over space and time, of variations in color over space and time, and of all of these patterns with position in the visual field. Lastly, we must describe how all these sensitivities vary with adapting light level. We have recently developed a structural description of human visual sensitivity that we call the Pyramid of Visibility, that accomplishes this synthesis. This talk shows how this structure accommodates all the dimensions described above, and how it can be used to solve a wide variety of problems in display engineering.
Andrew B. Watson, chief vision scientist, Apple Inc. (United States)
Andrew Watson is Chief Vision Scientist at Apple, where he leads the application of vision science to technologies, applications, and displays. His research focuses on computational models of early vision. He is the author of more than 100 scientific papers and 8 patents. He has 21,180 citations and an h-index of 63. Watson founded the Journal of Vision, and served as editor-in-chief 2001-2013 and 2018-2022. Watson has received numerous awards including the Presidential Rank Award from the President of the United States.
3:00 – 3:30 PM Coffee Break
5:30 – 7:00 PM EI 2023 Symposium Interactive (Poster) Paper Session (in the Cyril Magnin Foyer)
5:30 – 7:00 PM EI 2023 Meet the Future: A Showcase of Student and Young Professionals Research (in the Cyril Magnin Foyer)