Mobile Devices and Multimedia: Enabling Technologies, Algorithms, and Applications 2023
Monday 16 January 2023
10:20 – 10:50 AM Coffee Break
12:30 – 2:00 PM Lunch
Monday 16 January PLENARY: Neural Operators for Solving PDEs
Session Chair: Robin Jenkin, NVIDIA Corporation (United States)
2:00 PM – 3:00 PM
Cyril Magnin I/II/III
Deep learning surrogate models have shown promise in modeling complex physical phenomena such as fluid flows, molecular dynamics, and material properties. However, standard neural networks assume finite-dimensional inputs and outputs, and hence cannot withstand a change in resolution or discretization between training and testing. We introduce Fourier neural operators that can learn operators, which are mappings between infinite-dimensional spaces. They are independent of the resolution or grid of the training data and allow for zero-shot generalization to higher-resolution evaluations. When applied to weather forecasting, neural operators capture fine-scale phenomena and show skill similar to that of gold-standard numerical weather models for predictions up to a week or longer, while being 4-5 orders of magnitude faster.
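As a rough illustration of the core idea (not the speaker's implementation), a single Fourier layer transforms the input field to frequency space, applies learned weights to a fixed number of low-frequency modes, and transforms back. Because the weights act on frequencies rather than grid points, the same layer can be evaluated on any discretization. All names below are illustrative:

```python
import numpy as np

def fourier_layer(u, weights, modes):
    """One illustrative spectral-convolution step: go to Fourier space,
    keep only the lowest `modes` frequencies, multiply them by learned
    complex weights, and transform back. The weights act on frequencies,
    not grid points, so the layer is resolution-independent."""
    u_hat = np.fft.rfft(u, norm="forward")        # to Fourier space
    out_hat = np.zeros_like(u_hat)
    out_hat[:modes] = weights * u_hat[:modes]      # learned mixing of low modes
    return np.fft.irfft(out_hat, n=len(u), norm="forward")  # back to grid

# The same weights apply to two different discretizations of one signal,
# which is the zero-shot super-resolution property in miniature:
rng = np.random.default_rng(0)
w = rng.normal(size=8) + 1j * rng.normal(size=8)
coarse = np.sin(2 * np.pi * np.linspace(0, 1, 64, endpoint=False))
fine = np.sin(2 * np.pi * np.linspace(0, 1, 256, endpoint=False))
y_coarse = fourier_layer(coarse, w, modes=8)
y_fine = fourier_layer(fine, w, modes=8)
```

On the shared grid points the two outputs agree, even though the layer was never told the resolution.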
Anima Anandkumar, Bren professor, California Institute of Technology, and senior director of AI Research, NVIDIA Corporation (United States)
Anima Anandkumar is a Bren Professor at Caltech and Senior Director of AI Research at NVIDIA. She is passionate about designing principled AI algorithms and applying them to interdisciplinary domains. She has received several honors such as the IEEE Fellowship, the Alfred P. Sloan Fellowship, the NSF CAREER Award, and faculty fellowships from Microsoft, Google, Facebook, and Adobe. She is part of the World Economic Forum's Expert Network. Anandkumar received her BTech from the Indian Institute of Technology Madras and her PhD from Cornell University, did her postdoctoral research at MIT, and held an assistant professorship at the University of California, Irvine.
3:00 – 3:30 PM Coffee Break
EI 2023 Highlights Session
Session Chair: Robin Jenkin, NVIDIA Corporation (United States)
3:30 – 5:00 PM
Cyril Magnin II
Join us for a session that celebrates the breadth of what EI has to offer with short papers selected from EI conferences.
NOTE: The EI-wide "EI 2023 Highlights" session is concurrent with Monday afternoon COIMG, COLOR, IMAGE, and IQSP conference sessions.
IQSP-309
Evaluation of image quality metrics designed for DRI tasks with automotive cameras, Valentine Klein, Yiqi LI, Claudio Greco, Laurent Chanas, and Frédéric Guichard, DXOMARK (France) [view abstract]
Driving assistance is increasingly used in new car models. Most driving assistance systems are based on automotive cameras and computer vision. Computer vision, regardless of the underlying algorithms and technology, requires the images to have good image quality, defined according to the task. This notion of good image quality is still to be defined for computer vision, as its criteria are very different from those of human vision: humans have a better contrast detection ability than image chains. The aim of this article is to compare three different metrics designed for detection of objects with computer vision: the Contrast Detection Probability (CDP) [1, 2, 3, 4], the Contrast Signal to Noise Ratio (CSNR) [5], and the Frequency of Correct Resolution (FCR) [6]. For this purpose, the computer vision task of reading the characters on a license plate is used as a benchmark. The objective is to check the correlation between each objective metric and the ability of a neural network to perform this task. A protocol to test these metrics and compare them to the output of the neural network has been designed, and the pros and cons of each of the three metrics are noted.
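To give a feel for what a contrast-versus-noise metric measures (this is a simplified sketch, not the cited CDP/CSNR/FCR definitions), one can compare the mean-level difference between a character patch and its background against the noise in each patch:

```python
import numpy as np

def michelson_contrast(patch_a, patch_b):
    """Michelson contrast between the mean signal levels of two patches."""
    ma, mb = np.mean(patch_a), np.mean(patch_b)
    return abs(ma - mb) / (ma + mb)

def contrast_snr(patch_a, patch_b):
    """Simplified contrast signal-to-noise ratio: the mean-level difference
    between two patches divided by the average per-patch noise standard
    deviation. High values suggest a detector can resolve the edge."""
    signal = abs(np.mean(patch_a) - np.mean(patch_b))
    noise = 0.5 * (np.std(patch_a) + np.std(patch_b))
    return signal / noise

# Hypothetical license-plate patches: dark character on light background.
rng = np.random.default_rng(1)
dark = rng.normal(40.0, 2.0, size=(32, 32))    # character strokes
light = rng.normal(120.0, 2.0, size=(32, 32))  # plate background
snr = contrast_snr(dark, light)  # large value -> character easily read
```

The benchmarking question in the paper is then whether such a number correlates with the character-reading accuracy of an actual neural network.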
SD&A-224
Human performance using stereo 3D in a helmet mounted display and association with individual stereo acuity, Bonnie Posselt, RAF Centre of Aviation Medicine (United Kingdom) [view abstract]
Binocular Helmet Mounted Displays (HMDs) are a critical part of the aircraft system, allowing information to be presented to the aviator with stereoscopic 3D (S3D) depth, potentially enhancing situational awareness and improving performance. The utility of S3D in an HMD may be linked to an individual’s ability to perceive changes in binocular disparity (stereo acuity). Though minimum stereo acuity standards exist for most military aviators, current test methods may be unable to characterise this relationship. This presentation will investigate the effect of S3D on performance when used in a warning alert displayed in an HMD. Furthermore, any effect on performance, ocular symptoms, and cognitive workload shall be evaluated in regard to individual stereo acuity measured with a variety of paper-based and digital stereo tests.
IMAGE-281
Smartphone-enabled point-of-care blood hemoglobin testing with color accuracy-assisted spectral learning, Sang Mok Park1, Yuhyun Ji1, Semin Kwon1, Andrew R. O’Brien2, Ying Wang2, and Young L. Kim1; 1Purdue University and 2Indiana University School of Medicine (United States) [view abstract]
We develop an mHealth technology for noninvasively measuring blood Hgb levels in patients with sickle cell anemia, using the photos of peripheral tissue acquired by the built-in camera of a smartphone. As an easily accessible sensing site, the inner eyelid (i.e., palpebral conjunctiva) is used because of the relatively uniform microvasculature and the absence of skin pigments. Color correction (color reproduction) and spectral learning (spectral super-resolution spectroscopy) algorithms are integrated for accurate and precise mHealth blood Hgb testing. First, color correction using a color reference chart with multiple color patches extracts absolute color information of the inner eyelid, compensating for smartphone models, ambient light conditions, and data formats during photo acquisition. Second, spectral learning virtually transforms the smartphone camera into a hyperspectral imaging system, mathematically reconstructing high-resolution spectra from color-corrected eyelid images. Third, color correction and spectral learning algorithms are combined with a spectroscopic model for blood Hgb quantification among sickle cell patients. Importantly, single-shot photo acquisition of the inner eyelid using the color reference chart allows straightforward, real-time, and instantaneous reading of blood Hgb levels. Overall, our mHealth blood Hgb tests could potentially be scalable, robust, and sustainable in resource-limited and homecare settings.
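The color-correction step described above can be sketched as a least-squares fit of a correction matrix from chart patches; this is a minimal illustration under assumed data, not the authors' pipeline:

```python
import numpy as np

def fit_color_correction(measured, reference):
    """Fit a 3x3 color-correction matrix M (least squares) mapping RGB
    values measured from a color chart to their known reference values,
    so that reference ~= measured @ M. This compensates for the smartphone
    model, ambient light, and data format at capture time."""
    M, *_ = np.linalg.lstsq(measured, reference, rcond=None)
    return M

# Hypothetical chart: six patches observed through an unknown device matrix.
reference = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                      [1, 1, 0], [0, 1, 1], [.5, .5, .5]], float)
cast = np.array([[1.2, 0.0, 0.1],
                 [0.0, 0.9, 0.0],
                 [0.05, 0.0, 1.1]])  # stand-in for device + illuminant
measured = reference @ cast
M = fit_color_correction(measured, reference)
corrected = measured @ M   # absolute colors recovered from the cast image
```

With the chart-derived matrix applied, the eyelid photo carries absolute color information that the downstream spectral-learning stage can use.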
AVM-118
Designing scenes to quantify the performance of automotive perception systems, Zhenyi Liu1, Devesh Shah2, Alireza Rahimpour2, Joyce Farrell1, and Brian Wandell1; 1Stanford University and 2Ford Motor Company (United States) [view abstract]
We implemented an end-to-end simulation for camera-based perception systems used in automotive applications. The open-source software creates complex driving scenes and simulates cameras that acquire images of these scenes. The camera images are then used by a neural network in the perception system to identify the locations of scene objects, providing the results as input to the decision system. In this paper, we design collections of test scenes that can be used to quantify the perception system's performance under a range of (a) environmental conditions (object distance, occlusion ratio, lighting levels), and (b) camera parameters (pixel size, lens type, color filter array). We are designing scene collections to analyze performance for detecting vehicles, traffic signs, and vulnerable road users in a range of environmental conditions and for a range of camera parameters. With experience, such scene collections may serve a role similar to that of standardized test targets used to quantify camera image quality (e.g., acuity, color).
VDA-403
Visualizing and monitoring the process of injection molding, Christian A. Steinparz1, Thomas Mitterlehner2, Bernhard Praher2, Klaus Straka1,2, Holger Stitz1,3, and Marc Streit1,3; 1Johannes Kepler University, 2Moldsonics GmbH, and 3datavisyn GmbH (Austria) [view abstract]
In injection molding machines the molds are rarely equipped with sensor systems. The availability of non-invasive ultrasound-based in-mold sensors provides better means for guiding operators of injection molding machines throughout the production process. However, existing visualizations are mostly limited to plots of temperature and pressure over time. In this work, we present the result of a design study created in collaboration with domain experts. The resulting prototypical application uses real-world data taken from live ultrasound sensor measurements for injection molding cavities captured over multiple cycles during the injection process. Our contribution includes a definition of tasks for setting up and monitoring the machines during the process, and the corresponding web-based visual analysis tool addressing these tasks. The interface consists of a multi-view display with various levels of data aggregation that is updated live for newly streamed data of ongoing injection cycles.
COIMG-155
Commissioning the James Webb Space Telescope, Joseph M. Howard, NASA Goddard Space Flight Center (United States) [view abstract]
Astronomy is arguably in a golden age, where current and future NASA space telescopes are expected to contribute to this rapid growth in understanding of our universe. The most recent addition to our space-based telescopes dedicated to astronomy and astrophysics is the James Webb Space Telescope (JWST), which launched on 25 December 2021. This talk will discuss the first six months in space for JWST, which were spent commissioning the observatory with many deployments, alignments, and system and instrumentation checks. These engineering activities help verify the proper working of the telescope prior to commencing full science operations. For the session: Computational Imaging using Fourier Ptychography and Phase Retrieval.
HVEI-223
Critical flicker frequency (CFF) at high luminance levels, Alexandre Chapiro1, Nathan Matsuda1, Maliha Ashraf2, and Rafal Mantiuk3; 1Meta (United States), 2University of Liverpool (United Kingdom), and 3University of Cambridge (United Kingdom) [view abstract]
The critical flicker fusion (CFF) is the frequency of changes at which a temporally periodic light will begin to appear completely steady to an observer. This value is affected by several visual factors, such as the luminance of the stimulus or its location on the retina. With new high dynamic range (HDR) displays, operating at higher luminance levels, and virtual reality (VR) displays, presenting at wide fields of view, the effective CFF may change significantly from values expected for traditional presentation. In this work we use a prototype HDR VR display capable of luminances up to 20,000 cd/m^2 to gather a novel set of CFF measurements at previously unexamined levels of luminance, eccentricity, and size. Our data are useful for studying the temporal behavior of the visual system at high luminance levels, as well as for setting useful thresholds for display engineering.
HPCI-228
Physics guided machine learning for image-based material decomposition of tissues from simulated breast models with calcifications, Muralikrishnan Gopalakrishnan Meena1, Amir K. Ziabari1, Singanallur Venkatakrishnan1, Isaac R. Lyngaas1, Matthew R. Norman1, Balint Joo1, Thomas L. Beck1, Charles A. Bouman2, Anuj Kapadia1, and Xiao Wang1; 1Oak Ridge National Laboratory and 2Purdue University (United States) [view abstract]
Material decomposition of Computed Tomography (CT) scans using projection-based approaches, while highly accurate, poses a challenge for medical imaging researchers and clinicians due to limited or no access to projection data. We introduce a deep learning image-based material decomposition method guided by physics and requiring no access to projection data. The method is demonstrated to decompose tissues from simulated dual-energy X-ray CT scans of virtual human phantoms containing four materials: adipose, fibroglandular, calcification, and air. The method uses a hybrid unsupervised and supervised learning technique to tackle the material decomposition problem. We take advantage of the unique X-ray absorption rate of calcium compared to body tissues to perform a preliminary segmentation of calcification from the images using unsupervised learning. We then perform supervised material decomposition using a deep-learned U-Net model which is trained using GPUs on the high-performance systems at the Oak Ridge Leadership Computing Facility. The method is demonstrated on simulated breast models to decompose calcification, adipose, fibroglandular, and air.
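The unsupervised preliminary step exploits the fact that calcium absorbs X-rays far more strongly than soft tissue, so calcifications appear as bright intensity outliers. A toy sketch of such outlier-based segmentation (purely illustrative, not the authors' method) might threshold on a robust estimate of the background distribution:

```python
import numpy as np

def segment_calcification(image, k=3.0):
    """Flag pixels more than k robust standard deviations above the median
    intensity. Because calcium's absorption dwarfs that of adipose and
    fibroglandular tissue, calcification pixels sit far above the bulk of
    the distribution and need no labeled training data to find."""
    med = np.median(image)
    sigma = np.median(np.abs(image - med)) * 1.4826  # MAD-based robust sigma
    return image > med + k * sigma

# Toy phantom: soft-tissue background with a small bright calcification.
rng = np.random.default_rng(2)
phantom = rng.normal(0.2, 0.01, size=(64, 64))
phantom[30:33, 30:33] = 0.9   # 3x3 calcified region
mask = segment_calcification(phantom)
```

In the paper, such a mask would hand the remaining (harder) soft-tissue separation to the supervised U-Net stage.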
3DIA-104
Layered view synthesis for general images, Loïc Dehan, Wiebe Van Ranst, and Patrick Vandewalle, Katholieke Universiteit Leuven (Belgium) [view abstract]
We describe a novel method for monocular view synthesis. The goal of our work is to create a visually pleasing set of horizontally spaced views based on a single image. This can be applied in view synthesis for virtual reality and glasses-free 3D displays. Previous methods produce realistic results on images that show a clear distinction between a foreground object and the background. We aim to create novel views in more general, crowded scenes in which there is no clear distinction. Our main contributions are a computationally efficient method for realistic occlusion inpainting and blending, especially in complex scenes. Our method can be effectively applied to any image, which is shown both qualitatively and quantitatively on a large dataset of stereo images. Our method performs natural disocclusion inpainting and maintains the shape and edge quality of foreground objects.
ISS-329
A self-powered asynchronous image sensor with independent in-pixel harvesting and sensing operations, Ruben Gomez-Merchan, Juan Antonio Leñero-Bardallo, and Ángel Rodríguez-Vázquez, University of Seville (Spain) [view abstract]
A new self-powered asynchronous sensor with a novel pixel architecture is presented. Pixels are autonomous and can harvest or sense energy independently. During image acquisition, pixels toggle to a harvesting operation mode once they have sensed their local illumination level. With the proposed pixel architecture, the most illuminated pixels provide an early contribution to powering the sensor, while weakly illuminated ones spend more time sensing their local illumination. Thus, the equivalent frame rate is higher than that offered by conventional self-powered sensors that harvest and sense illumination in independent phases. The proposed sensor uses a Time-to-First-Spike readout that allows trading off image quality against data and bandwidth consumption. The sensor has HDR operation with a dynamic range of 80 dB. Pixel power consumption is only 70 pW. In the article, we describe the sensor and pixel architectures in detail. Experimental results are provided and discussed. Sensor specifications are benchmarked against the state of the art.
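The Time-to-First-Spike idea can be sketched in a few lines (an illustrative model, not the presented circuit): each pixel integrates its photocurrent until a threshold, so brighter pixels fire earlier, and truncating the acquisition window trades image quality for readout time and bandwidth:

```python
def ttfs_readout(photocurrents, v_threshold=1.0, t_max=1.0):
    """Illustrative time-to-first-spike readout. A pixel integrating
    photocurrent I on a unit capacitor crosses the threshold at
    t = Vth / I, so bright pixels spike early. Cutting off at t_max drops
    the darkest pixels but bounds acquisition time and event count."""
    spikes = []
    for i, current in enumerate(photocurrents):
        t = v_threshold / current if current > 0 else float("inf")
        if t <= t_max:
            spikes.append((t, i))  # (spike time, pixel index)
    spikes.sort()  # events arrive asynchronously, brightest first
    return spikes

# Four pixels with different illumination; pixel 1 is too dark to spike
# within the window, and intensity is recoverable as I = Vth / t.
events = ttfs_readout([4.0, 0.5, 10.0, 2.0], t_max=1.0)
```

This also shows why the brightest pixels can contribute power early: they finish sensing almost immediately and are free to harvest for the rest of the frame.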
COLOR-184
Color blindness and modern board games, Alessandro Rizzi1 and Matteo Sassi2; 1Università degli Studi di Milano and 2consultant (Italy) [view abstract]
The board game industry is experiencing a strong renewed interest. In the last few years, about 4000 new board games have been designed and distributed each year. The gender balance among board game players is approaching parity, although males are still a slight majority. This means that (at least) around 10% of board game players are color blind. How does the board game industry deal with this? Recently, awareness has begun to rise in board game design, but so far there is a big gap compared with, e.g., the computer game industry. This paper presents some data about the current situation, discussing exemplary cases of successful board games.
5:00 – 6:15 PM EI 2023 All-Conference Welcome Reception (in the Cyril Magnin Foyer)
Tuesday 17 January 2023
10:00 AM – 7:30 PM Industry Exhibition - Tuesday (in the Cyril Magnin Foyer)
10:20 – 10:50 AM Coffee Break
12:30 – 2:00 PM Lunch
Tuesday 17 January PLENARY: Embedded Gain Maps for Adaptive Display of High Dynamic Range Images
Session Chair: Robin Jenkin, NVIDIA Corporation (United States)
2:00 PM – 3:00 PM
Cyril Magnin I/II/III
Images optimized for High Dynamic Range (HDR) displays have brighter highlights and more detailed shadows, resulting in an increased sense of realism and greater impact. However, a major issue with HDR content is the lack of consistency in appearance across different devices and viewing environments. There are several reasons, including varying capabilities of HDR displays and the different tone mapping methods implemented across software and platforms. Consequently, HDR content authors can neither control nor predict how their images will appear in other apps.
We present a flexible system that provides consistent and adaptive display of HDR images. Conceptually, the method combines both SDR and HDR renditions within a single image and interpolates between the two dynamically at display time. We compute a Gain Map that represents the difference between the two renditions. In the file, we store a Base rendition (either SDR or HDR), the Gain Map, and some associated metadata. At display time, we combine the Base image with a scaled version of the Gain Map, where the scale factor depends on the image metadata, the HDR capacity of the display, and the viewing environment.
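A minimal sketch of the display-time combination step (illustrative only; the exact weighting the speakers use is driven by their metadata and may differ): store the per-pixel log2 ratio between the HDR and SDR renditions, then interpolate in log space according to how much headroom the display has:

```python
import math

def apply_gain_map(base, gain_map, display_headroom, hdr_headroom):
    """Combine a base (SDR) rendition with a scaled gain map. gain_map
    holds, per pixel, log2(HDR / SDR). The weight w interpolates in log
    space: w = 0 reproduces the SDR rendition, w = 1 the full HDR
    rendition, and displays in between get a proportional blend, clamped
    to the display's available headroom. The w formula here is an
    illustrative choice."""
    w = 0.0
    if hdr_headroom > 1.0:
        ratio = math.log2(display_headroom) / math.log2(hdr_headroom)
        w = min(max(ratio, 0.0), 1.0)
    return [pixel * 2.0 ** (w * g) for pixel, g in zip(base, gain_map)]

# Hypothetical 1-D image: three pixels whose HDR highlights are up to
# 4x (2 stops) brighter than the SDR rendition.
base = [0.2, 0.5, 1.0]
gain = [0.0, 1.0, 2.0]  # log2(HDR/SDR) per pixel
sdr_out = apply_gain_map(base, gain, display_headroom=1.0, hdr_headroom=4.0)
hdr_out = apply_gain_map(base, gain, display_headroom=4.0, hdr_headroom=4.0)
```

An SDR display (no headroom) reproduces the base exactly, a fully capable HDR display recovers the HDR rendition, and everything in between degrades gracefully rather than relying on the viewer app's tone mapper.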
Eric Chan, Fellow, Adobe Inc. (United States)
Eric Chan is a Fellow at Adobe, where he develops software for editing photographs. Current projects include Photoshop, Lightroom, Camera Raw, and Digital Negative (DNG). When not writing software, Chan enjoys spending time at his other keyboard, the piano. He is an enthusiastic nature photographer and often combines his photo activities with travel and hiking.
Paul M. Hubel, director of Image Quality in Software Engineering, Apple Inc. (United States)
Paul M. Hubel is director of Image Quality in Software Engineering at Apple. He has worked on computational photography and image quality of photographic systems for many years on all aspects of the imaging chain, particularly for iPhone. He trained in optical engineering at University of Rochester, Oxford University, and MIT, and has more than 50 patents on color imaging and camera technology. Hubel is active on the ISO-TC42 committee Digital Photography, where this work is under discussion, and is currently a VP on the IS&T Board. Outside work he enjoys photography, travel, cycling, and coffee roasting, and plays trumpet in several Bay Area ensembles.
3:00 – 3:30 PM Coffee Break
5:30 – 7:00 PM EI 2023 Symposium Demonstration Session (in the Cyril Magnin Foyer)
Wednesday 18 January 2023
Imaging, Detection, Systems (W1)
Session Chair:
Reiner Creutzburg, Technische Hochschule Brandenburg (Germany)
8:45 – 10:10 AM
Balboa
8:45
Conference Welcome
8:50 MOBMU-349
Comparative study of various object detection sensors for an autonomous valet parking system with line tracking, Harshi Ghai1, Rahul Nethilath Vinod1, Saurabh Kothale1, Klaus Schwarz1, Michael Hartmann1, and Reiner Creutzburg1,2; 1SRH Berlin University of Applied Sciences and 2Technische Hochschule Brandenburg (Germany) [view abstract]
The urban population is increasing every day. As a result, the density of private cars in urban hotspots is increasing. The biggest challenge here is finding a parking space. The introduction of multi-level parking garages has been a slight relief to this massive problem. However, the time an individual spends finding a free parking space is considerably high and will continue to increase as the population grows. The introduction of autonomous driving paves a new path to a solution. This paper aims to find an ideal solution by guiding the car to the nearest available parking space using line-following technology. The process combines different sensing methods (lidar, radar, and ultrasonic) for collision avoidance to develop an automatic parking system.
9:10 MOBMU-350
iPhone12 imagery in scene-referred computer graphics pipelines, Eberhard Hasche1, Oliver Karaschewski1, and Reiner Creutzburg1,2; 1Technische Hochschule Brandenburg and 2SRH Berlin University of Applied Sciences (Germany) [view abstract]
With the release of the Apple iPhone 12 Pro in 2020, various features were integrated that make it attractive as a recording device for scene-referred computer graphics pipelines. The captured Apple RAW images have a much higher dynamic range than standard 8-bit images. Since a scene-referred workflow naturally has an extended dynamic range (HDR), the Apple RAW recordings can be integrated well. Another feature is the Dolby Vision HDR recordings, which are primarily adapted to the respective display of the source device. However, these recordings can also be used in the CG workflow since at least the basic HLG transfer function is integrated. The iPhone 12 Pro's two laser scanners can produce complex 3D models and textures for the CG pipeline. On the one hand, there is a scanner on the back that is primarily intended for capturing the surroundings for AR purposes. On the other hand, there is another scanner on the front for facial recognition. In addition, external software can read out the scanning data for integration in 3D applications. To correctly integrate the iPhone 12 Pro Apple RAW data into a scene-referred workflow, two command-line-based software solutions can be used, among others: dcraw and rawtoaces. dcraw offers the possibility to export RAW images directly to ACES2065-1. Unfortunately, the modifiers for the four RAW color channels to address the different white points are unavailable. Experimental test series are performed under controlled studio conditions to retrieve these modifier values. Subsequently, these RAW-derived images are imported into computer graphics pipelines of various CG software applications (SideFX Houdini, The Foundry Nuke, Autodesk Maya) with the help of OpenColorIO (OCIO) and ACES. Finally, it is determined whether they can improve the overall color quality. Dolby Vision content can be captured using the native Camera app on an iPhone 12.
It captures HDR video using Dolby Vision Profile 8.4, which contains a cross-compatible HLG Rec.2020 base layer and Dolby Vision dynamic metadata. Only the HLG base layer is passed on when exporting the Dolby Vision iPhone video, without the corresponding metadata. It is investigated whether the iPhone 12 videos transferred this way can increase the quality of the computer graphics pipeline. The 3D Scanner App software controls the two integrated laser scanners. In addition, the software provides a large number of export formats. Therefore, integrating the OBJ 3D data into industry-standard software like Maya and Houdini is unproblematic. Unfortunately, the models and the corresponding UV maps are more or less machine-readable only, so manually improving the 3D geometry (filling holes, refining the geometry, setting up new topology) is cumbersome and time-consuming. It is investigated whether standard techniques like using the ZRemesher in ZBrush, applying texture and UV projection in Maya, and VEX snippets in Houdini can prepare these models and textures for manual editing.
9:30 MOBMU-351
Improving the performance of web-streaming by super-resolution upscaling techniques, Yuriy Reznik1 and Nabajeet Barman2,3; 1Brightcove, Inc. (United States), 2Brightcove UK Ltd (United Kingdom), and 3Kingston University (United Kingdom) [view abstract]
In recent years, we have seen significant progress in advanced image upscaling techniques, sometimes called super-resolution, ML-based, or AI-based upscaling. Such algorithms are now available not only in the form of specialized software but also in drivers and SDKs supplied with modern graphics cards; the upscaling functions in the NVIDIA Maxine SDK are one recent example. However, to take advantage of this functionality in video streaming applications, one needs to (a) quantify the impact of super-resolution techniques on perceived visual quality, (b) implement video rendering incorporating super-resolution upscaling techniques, and (c) implement new bitrate+resolution adaptation algorithms in streaming players, enabling such players to deliver better quality of experience, better efficiency (e.g., reduced bandwidth usage), or both. Towards this end, in this paper we propose several techniques that may be helpful to the implementation community. First, we offer a model quantifying the impact of super-resolution upscaling on perceived quality. Our model is based on the Westerink-Roufs model connecting the true resolution of images/videos to perceived quality, with several additional parameters added, allowing its tuning to specific implementations of super-resolution techniques. We verify this model using several recent datasets, including MOS scores measured for several conventional upscaling and super-resolution algorithms. Then, we propose an improved adaptation logic for video streaming players, considering video bitrates, encoded video resolutions, player size, and the upscaling method. This improved logic relies on our modified Westerink-Roufs model to predict perceived quality and suggests choices of renditions that would deliver the best quality for the given display and upscaling method characteristics.
Finally, we study the impacts of the proposed techniques and show that they can deliver practically appreciable results in terms of the expected QoE improvements and bandwidth savings.
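The shape of such an adaptation logic can be sketched as follows. The quality model below is a hypothetical placeholder standing in for the paper's modified Westerink-Roufs model; the selection structure (filter by throughput, score by predicted quality given the upscaler, prefer cheaper renditions at equal quality) is the point being illustrated:

```python
import math

def select_rendition(renditions, throughput_kbps, player_height, upscaler_gain):
    """Pick the (bitrate, encoded height) pair that maximizes a predicted
    perceived quality under the bandwidth constraint. Placeholder model:
    quality grows with the effective resolution reaching the display,
    where a super-resolution upscaler multiplies the effective resolution
    of sub-native renditions by `upscaler_gain` (capped at player size).
    At equal quality, the cheaper rendition wins (bandwidth savings)."""
    best, best_q = None, float("-inf")
    for bitrate, height in renditions:
        if bitrate > throughput_kbps:
            continue  # not sustainable at the current throughput
        effective = min(height * upscaler_gain, player_height)
        q = math.log(effective)  # placeholder monotone quality proxy
        if q > best_q or (q == best_q and bitrate < best[0]):
            best, best_q = (bitrate, height), q
    return best

ladder = [(800, 360), (1600, 540), (3200, 720), (6000, 1080)]
plain = select_rendition(ladder, throughput_kbps=4000,
                         player_height=1080, upscaler_gain=1.0)
withsr = select_rendition(ladder, throughput_kbps=4000,
                          player_height=1080, upscaler_gain=2.0)
```

With a capable upscaler the player can reach full display quality from a lower-bitrate rendition, which is exactly the bandwidth-saving behavior the paper quantifies.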
9:50 MOBMU-352
Integrity and authenticity verification of printed documents by smartphones, Simon Bugert, Julian Heeger, and Waldemar Berchtold, Fraunhofer Institute for Secure Information Technology (Germany) [view abstract]
To this day, most important documents are still issued on paper. Their security is based on the fact that the cost of creating a counterfeit must be unattractive to counterfeiters relative to the expected profit, which typically means using expensive printing equipment and substrates. This work introduces an approach that evaluates paper documents using any internet-enabled device with a camera and a web browser, such as smartphones and tablets. Optical character recognition (OCR) is used to make the text machine-readable after the document is recognized and rectified. Digital signatures are then used to verify the authenticity and integrity of the data. Beyond that, the requirements of privacy, robustness, and usability are satisfied. By using JAB Code, a high-capacity matrix code, the data to be verified can be stored directly on the document without having to use a database. This brings key advantages over database-bound systems in terms of security and privacy, and the use of OCR achieves high usability.
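The verification flow can be sketched as: OCR the rectified document, canonicalize the text, and check it against a payload embedded in the printed matrix code. For brevity this sketch checks integrity with a hash; the actual system embeds digital signatures, so authenticity is tied to the issuer's key rather than to the payload alone. The normalization rule here is an illustrative assumption:

```python
import hashlib

def normalize(text):
    """Canonicalize OCR output so benign recognition variations
    (whitespace, line breaks, case) do not break verification."""
    return " ".join(text.upper().split())

def make_payload(document_text):
    """What the issuer embeds in the printed matrix code (e.g., a JAB
    Code): here just a hash of the canonical text. The real system embeds
    a digital signature, so authenticity can be checked offline too."""
    return hashlib.sha256(normalize(document_text).encode()).hexdigest()

def verify(ocr_text, payload):
    """Compare the hash of the OCR'd text against the embedded payload."""
    return hashlib.sha256(normalize(ocr_text).encode()).hexdigest() == payload

issued = "Name: Jane Doe\nDate of birth: 01.02.1990"
payload = make_payload(issued)
ok = verify("name: jane  doe date of birth: 01.02.1990", payload)  # OCR noise
forged = verify("Name: John Doe\nDate of birth: 01.02.1990", payload)
```

Because the payload travels on the paper itself, no database lookup is needed at verification time, which is the privacy advantage the abstract mentions.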
10:00 AM – 3:30 PM Industry Exhibition - Wednesday (in the Cyril Magnin Foyer)
10:20 – 10:50 AM Coffee Break
Open Source Intelligence: Social Media (W2)
Session Chair:
Mohammad Nadim, The University of Texas at San Antonio (United States)
10:50 AM – 12:30 PM
Balboa
10:50 MOBMU-353
Evaluation and test of various tools for OSINT-based email investigation, Samrudha Mhatre1, Franziska Schwarz2, Klaus Schwarz1, and Reiner Creutzburg1,2; 1SRH Berlin University of Applied Sciences and 2Technische Hochschule Brandenburg (Germany) [view abstract]
Open-source intelligence (OSINT) and social media intelligence (SOCMINT) are becoming increasingly popular with investigative and government agencies, intelligence services, media companies, and corporations - but also with cybercriminals, in the field of email phishing to name just one example. The amount of public and private data available is rising rapidly. OSINT and SOCMINT technologies use sophisticated techniques and special tools to efficiently analyze the continually growing sources of information. This work aims to find descriptive information by using the OSINT tools available on the internet. This is achieved with the help of dummy accounts, which help in understanding the tools before further tools are evaluated. We also identify which tools are commonly used and what improvements can be made to make them more descriptive for analysts.
11:10 MOBMU-354
Importance of OSINT/SOCMINT for modern disaster management evaluation - Australia, Haiti, Japan, Nazneen Mansoor1, Klaus Schwarz1, Daniel Arias Aranda2, and Reiner Creutzburg1,3; 1SRH Berlin University of Applied Sciences (Germany), 2University of Granada (Spain), and 3Technische Hochschule Brandenburg (Germany) [view abstract]
Open-source intelligence (OSINT) and social media intelligence (SOCMINT) are becoming increasingly popular with investigative and government agencies, intelligence services, media companies, and corporations. These OSINT and SOCMINT technologies use sophisticated techniques and special tools to efficiently analyze the continually growing sources of information. There is a great need for training and further education in the OSINT field worldwide. This report describes the importance of open-source and social media intelligence for evaluating disaster management. It also gives an overview of government work in Australia, Haiti, and Japan on disaster management using various OSINT tools and platforms. Thus, decision support for using OSINT and SOCMINT tools is given, and the necessary training for investigators can be better estimated.
11:30 MOBMU-355
Practical OSINT investigation in Twitter utilizing AI-based aggressiveness analysis, Artem Sklyar1, Klaus Schwarz1, Daniel Arias Aranda2, and Reiner Creutzburg1,3; 1SRH Berlin University of Applied Sciences (Germany), 2University of Granada (Spain), and 3Technische Hochschule Brandenburg (Germany) [view abstract]
Open-source intelligence is gaining popularity due to the rapid development of social networks. There is more and more information in the public domain. One of the most popular social networks is Twitter. It was chosen for analyzing how the number of likes, reposts, quotes, and retweets depends on the aggressiveness of a post's text for an individual profile, as this information can be important not only for the owner of the channel in the social network, but also for other studies that in some way influence user accounts and their behavior in the social network. Furthermore, this work includes a detailed analysis and evaluation of the capabilities of the Tweety library and the situations in which it can be effectively applied. Lastly, this work includes the creation and description of a neural network whose purpose is to predict changes in the number of likes, reposts, quotes, and retweets from the aggressiveness of the post text for an individual profile.
11:50 MOBMU-356
Practical OSINT investigation - Similarity calculation using Reddit user profile data, Valeria Vishnevskaya1, Klaus Schwarz1, Daniel Arias Aranda2, and Reiner Creutzburg1,3; 1SRH Berlin University of Applied Sciences (Germany), 2University of Granada (Spain), and 3Technische Hochschule Brandenburg (Germany) [view abstract]
This paper presents a practical open-source intelligence (OSINT) use case for user similarity measurement using open profile data from the Reddit social network. This proof-of-concept work combines the open data from Reddit with part of a state-of-the-art BERT model. Using the PRAW Python library, the project fetches the comments and posts of users. These texts are then converted into a feature vector representing all of a user's posts and comments. The main idea is to create a comparable pairwise similarity score for users based on their comments and posts. For example, if we fix one user and calculate scores for all pairs with other users, we produce a total order on the set of pairs involving that user, which can be read as a degree of written similarity to the chosen user. A set of "similar" users for one particular user can be used to recommend potentially interesting people to that user. The similarity score also has a transitive property: if user 1 is similar to user 2 and user 2 is similar to user 3, then the inner properties of our model guarantee that user 1 and user 3 are fairly similar too. In this way, the score can be used to cluster a set of users into sets of similar users, which could feed recommendation algorithms or tune existing algorithms to consider a cluster's peculiarities. We can also extend our model and calculate feature vectors for subreddits; in that way, we can find subreddits similar to the user's and recommend them.
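The scoring idea can be sketched with mean-pooled embeddings and cosine similarity (an illustrative stand-in: the random vectors below play the role of per-post BERT embeddings, which the paper obtains from real Reddit text):

```python
import numpy as np

def user_vector(text_embeddings):
    """Aggregate per-post embedding vectors (e.g., from a BERT-style
    encoder) into one profile vector by mean pooling, then L2-normalize
    so users with many posts aren't scored as 'bigger'."""
    v = np.mean(text_embeddings, axis=0)
    return v / np.linalg.norm(v)

def similarity(user_a, user_b):
    """Cosine similarity of profile vectors; with unit vectors this is a
    dot product in [-1, 1]."""
    return float(np.dot(user_a, user_b))

# Two users posting around the same topic, one posting about other things.
rng = np.random.default_rng(3)
topic = rng.normal(size=16)
u1 = user_vector(topic + 0.1 * rng.normal(size=(5, 16)))
u2 = user_vector(topic + 0.1 * rng.normal(size=(5, 16)))
u3 = user_vector(rng.normal(size=(5, 16)))
```

Fixing u1 and ranking all other users by this score gives exactly the total order described in the abstract, and the same pooling applied to a subreddit's posts yields subreddit vectors for recommendation.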
12:10 MOBMU-357
Open-source Intelligence (OSINT) investigation in Facebook, Pranesh Kumar Narasimhan1, Klaus Schwarz1, Hasan Dag2, and Reiner Creutzburg1,3; 1SRH Berlin University of Applied Sciences (Germany), 2Kadir Has University (Turkey), and 3Technische Hochschule Brandenburg (Germany) [view abstract]
Open Source Intelligence (OSINT) has come a long way, yet it is still developing, and many investigations are yet to happen in the near future. The essential requirement for any OSINT investigation is valuable data from a good source. This paper discusses various tools and methodologies for Facebook data collection and analyzes part of the collected data. By the end of the paper, the reader will have a clear and detailed insight into the techniques and tools available for scraping data from the Facebook platform, and into the types of investigations and analyses that the gathered data supports.
12:30 – 2:00 PM Lunch
Wednesday 18 January PLENARY: Bringing Vision Science to Electronic Imaging: The Pyramid of Visibility
Session Chair: Andreas Savakis, Rochester Institute of Technology (United States)
2:00 PM – 3:00 PM
Cyril Magnin I/II/III
Electronic imaging depends fundamentally on the capabilities and limitations of human vision. The challenge for the vision scientist is to describe these limitations to the engineer in a comprehensive, computable, and elegant formulation. Primary among these limitations are visibility of variations in light intensity over space and time, of variations in color over space and time, and of all of these patterns with position in the visual field. Lastly, we must describe how all these sensitivities vary with adapting light level. We have recently developed a structural description of human visual sensitivity that we call the Pyramid of Visibility, which accomplishes this synthesis. This talk shows how this structure accommodates all the dimensions described above, and how it can be used to solve a wide variety of problems in display engineering.
Andrew B. Watson, chief vision scientist, Apple Inc. (United States)
Andrew Watson is Chief Vision Scientist at Apple, where he leads the application of vision science to technologies, applications, and displays. His research focuses on computational models of early vision. He is the author of more than 100 scientific papers and 8 patents. He has 21,180 citations and an h-index of 63. Watson founded the Journal of Vision, and served as editor-in-chief 2001-2013 and 2018-2022. Watson has received numerous awards including the Presidential Rank Award from the President of the United States.
3:00 – 3:30 PM Coffee Break
Mobile Applications (W3)
Session Chair:
Reiner Creutzburg, Technische Hochschule Brandenburg (Germany)
3:30 – 4:10 PM
Balboa
3:30 MOBMU-358
Mobile incident commanding dashboard (MIC-D), Yang Cai, Carnegie Mellon University (United States) [view abstract]
Incident Command Dashboards (ICDs) play an essential role in Emergency Support Functions (ESF). They are centralized and handle a massive amount of live data. In this project, we explore a decentralized mobile incident commanding dashboard (MIC-D) with an improved mobile augmented reality (AR) user interface (UI) that can access and display multimodal live IoT data streams on phones, tablets, and inexpensive HUDs on first responders' helmets. The new platform is designed to work in the field and to share live data streams among team members. It also enables users to view on-location 3D LiDAR scan data, live thermal video, and vital-sign data on the 3D map. We have built a virtual medical helicopter communication center and tested launchpad-fire and remote fire-extinguishing scenarios. We have also tested the wildfire-prevention scenario "Cold Trailing" in an outdoor environment.
3:50 MOBMU-359
Performance evaluation of keyword detection for the chatbot model, Ganesh Reddy Gunnam, Devasena Inupakutika, Rahul Mundlamuri, Sahak Kaghyan, and David Akopian, The University of Texas at San Antonio (United States) [view abstract]
A chatbot is designed to respond to users with automated responses based on the content they provide. A problem arises when the user enters free text that is only a synonym of the required content, or only part of it, and the chatbot must still understand it. Most chatbot models handle this with huge libraries containing a large number of samples, which require more computation time and storage. Keyword detection methods trained on huge amounts of data suit many applications, but chatbots are often designed for specific tasks (for example, ordering food or customer support for a particular application), and such closed-domain chatbots do not need huge training data. In this paper, we conduct a performance evaluation of different sets and sizes of samples based on keywords specific to a closed-domain chatbot. We use the MovieLens 20M dataset, which provides tag assignments between movies and unique tags, and apply deep learning methods in the keyword extraction model.
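The closed-domain keyword matching idea can be sketched as a small intent table whose entries list keywords and synonyms; a free-text message is routed to the intent with the largest term overlap. The intents and terms below are hypothetical, not the paper's evaluation setup:

```python
# Hypothetical closed-domain intents, each with keywords and synonyms.
INTENTS = {
    "order_food": {"order", "food", "pizza", "meal", "hungry"},
    "support": {"help", "support", "issue", "problem", "broken"},
}

def detect_intent(message):
    """Return the intent whose keyword set overlaps the message most,
    or None when no keyword matches at all."""
    tokens = set(message.lower().split())
    best, best_hits = None, 0
    for intent, terms in INTENTS.items():
        hits = len(tokens & terms)
        if hits > best_hits:
            best, best_hits = intent, hits
    return best
```

Because the keyword sets are tiny and task-specific, this kind of matcher needs far less storage and compute than a model trained on a huge general-purpose corpus, which is the trade-off the paper evaluates.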
Mobile Devices and Multimedia: Enabling Technologies, Algorithms, and Applications 2023 Interactive (Poster) Paper Session
5:30 – 7:00 PM
Cyril Magnin Foyer
The following works will be presented at the EI 2023 Symposium Interactive (Poster) Paper Session.
MOBMU-360
Evaluation and test of various tools for OSINT Reddit investigation - Scenarios and use cases, Shubham Pandya1, Klaus Schwarz1, and Reiner Creutzburg1,2; 1SRH Berlin University of Applied Sciences and 2Technische Hochschule Brandenburg (Germany) [view abstract]
This paper aims to understand and evaluate the tools available online that can help carry out a Reddit-based open-source investigation. As part of this study, the various tools were evaluated based on their features, and several comparative analyses and use cases were also conducted. The objective was to provide an in-depth understanding of all the available tools and to assess which are best suited to an open-source investigation on Reddit. Based on the findings and information gathered, the paper suggests a set of good Reddit-based open-source investigation tools that can effectively analyze content in these scenarios.
MOBMU-361
Comparison of OSINT-based marketing tools for Pinterest, Amith Rajolkar1, Klaus Schwarz1, and Reiner Creutzburg1,2; 1SRH Berlin University of Applied Sciences and 2Technische Hochschule Brandenburg (Germany) [view abstract]
This work aims to provide an overview of the numerous OSINT tools on the topic of Pinterest. The different tools were evaluated in this paper based on their features, comparative analysis, and use cases. In the process, an attempt is made to find practical marketing tools that can help conduct social media-based research. Several suitable Pinterest-based open-source intelligence tools could be provided, which can effectively and analytically investigate the material on Pinterest accounts based on the insights and information obtained. Finally, this work discusses what the field of social media analytics could look like in the future and what actions could be taken to improve the analytical capabilities of these tools and create a better all-around tool.
MOBMU-362
Evaluation and test of various tools for OSINT-based Snapchat investigation, Shashank Markapuram Ramesh1, Klaus Schwarz1, and Reiner Creutzburg1,2; 1SRH Berlin University of Applied Sciences and 2Technische Hochschule Brandenburg (Germany) [view abstract]
This paper investigates the Snapchat application and the open-source tools available for gathering information or data about its users. The application provides a platform for users to enjoy entertainment while posting their media, including pictures and videos. Open-source OSINT tools expose information that allows access to other users' data, which can be exploited for different purposes. The paper analyzes Snapchat-related tools, such as Snapmap and Snap Lion, to investigate whether the application is safe and secure for its users. The analysis shows that Snapchat is not fully secure, as user data can be accessed for both legal and illegal purposes. Better management and attention from the application developers are therefore needed to ensure safe use of the application.
MOBMU-363
Improvement of vehicle accident detection using object tracking with U-Net, Kirsnaragavan Arudpiragasam1, Kannuri Taraka Rama Krishna Kanth1, Klaus Schwarz1, Michael Hartmann1, and Reiner Creutzburg1,2; 1SRH Berlin University of Applied Sciences and 2Technische Hochschule Brandenburg (Germany) [view abstract]
Over the past decade, researchers have suggested many methods for finding anomalies. However, no study has yet applied frame reconstruction with Object Tracking (OT) to detect anomalies. Therefore, this study focuses on road accident detection using a combination of OT and U-Net with variants such as skip, residual, and attention connections. The U-Net algorithm is developed to reconstruct frames using the UCF-Crime dataset. Furthermore, YOLOv4 and DeepSort are used for object detection and tracking within the frames. Finally, the Mahalanobis distance and the reconstruction error (RCE) are determined using a Kalman filter and the U-Net model.
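The anomaly score above relies on the Mahalanobis distance between a tracked object's predicted and observed state. A minimal sketch, assuming a 2D state and a known covariance matrix (the function and data are illustrative, not the paper's implementation):

```python
import math

def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of a 2D point x from a distribution with
    the given mean and 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]  # inverse of 2x2 covariance
    dx = [x[0] - mean[0], x[1] - mean[1]]
    # squared distance: dx^T * inv(cov) * dx
    d2 = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
          + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.sqrt(d2)
```

With an identity covariance this reduces to the Euclidean distance; with the track covariance from a Kalman filter, a large value flags an observation that deviates from the predicted motion, which is the cue for a candidate accident.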
MOBMU-364
Generative adversarial network (GAN) and object tracking for vehicle accident detection, Kirsnaragavan Arudpiragasam1, Kannuri Taraka Rama Krishna Kanth1, Klaus Schwarz1, Michael Hartmann1, and Reiner Creutzburg1,2; 1SRH Berlin University of Applied Sciences and 2Technische Hochschule Brandenburg (Germany) [view abstract]
Accident detection is one of the biggest challenges, as various anomalies, occlusions, and objects appear in the image at different times. Therefore, this paper focuses on detecting traffic accidents through a combination of Object Tracking (OT) and image generation using a GAN with variants such as skip, residual, and attention connections. Background removal techniques are applied to reduce background variation in the frame. YOLO-R is then used to detect objects, followed by DeepSort tracking of objects across frames. Finally, the distance error metric and the adversarial error, determined using the Kalman filter and the GAN respectively, help decide whether a video contains an accident.
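The Kalman-filter distance error mentioned above can be illustrated with a single scalar update step; the innovation (the gap between prediction and measurement) is what grows when a tracked vehicle suddenly deviates. This is a generic textbook update, not the paper's implementation:

```python
def kalman_update(x_pred, p_pred, z, r):
    """One scalar Kalman update: blend the prediction x_pred (variance
    p_pred) with the measurement z (noise variance r)."""
    k = p_pred / (p_pred + r)          # Kalman gain
    innovation = z - x_pred            # large values hint at anomalies
    x_new = x_pred + k * innovation    # corrected estimate
    p_new = (1.0 - k) * p_pred         # reduced uncertainty
    return x_new, p_new, innovation
```

In an accident-detection pipeline, a persistently large innovation for a tracked object, combined with a high adversarial (reconstruction) error, is the kind of joint signal the paper uses to flag an accident.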
MOBMU-365
Multimodal approach for classifying road accident severity, Kirsnaragavan Arudpiragasam1, Sanskruti Sawant1, Kannuri Taraka Rama Krishna Kanth1, Klaus Schwarz1, Michael Hartmann1, and Reiner Creutzburg1,2; 1SRH Berlin University of Applied Sciences and 2Technische Hochschule Brandenburg (Germany) [view abstract]
The severity of road accidents is a significant issue worldwide and results in many human fatalities. This paper aims to classify the severity of road accidents using a combination of machine learning and deep learning with different clustering approaches, XGBoost, and LightGBM. Moreover, various statistical methods are applied to deal with class imbalance and missing values in the dataset. Finally, the accuracy and precision values of all variants are used as metrics to select the best model among them.
MOBMU-366
An RF modulation recognition method using machine learning, Rahul Mundlamuri, Devasena Inupakutika, Ganesh Reddy Gunnam, Thinh Ngo, and David Akopian, The University of Texas at San Antonio (United States) [view abstract]
In recent years, deep learning has been successfully applied in many fields to optimize decision making, including self-driving cars, health care, machine translation, and image recognition. In wireless communication, deep learning has been used in channel estimation, signal classification, massive MIMO, heterogeneous networks, energy harvesting, device-to-device (D2D) communications, and so on. In this paper, we apply machine learning (ML) and deep learning (DL) neural networks to RF signal recognition. Specifically, we built, trained, and tested two ML models, SVM and XGBoost, and two DL models, ConvNet and ResNet, using the online dataset at radioml.com. Our goal is to learn how to scientifically apply ML/DL in terms of dataset processing, deep neural network construction, training, testing, fine-tuning, and analyzing and reporting results.
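Before deep models, modulation recognition commonly used hand-crafted features over I/Q samples; one classic example is the variance of the instantaneous envelope, which separates constant-envelope modulations (FM, PSK) from amplitude-modulated ones. The synthetic signals below only illustrate the kind of feature such a pipeline consumes, not the paper's models or dataset:

```python
import math

def envelope_variance(iq):
    """Variance of the instantaneous amplitude of complex I/Q samples:
    near zero for constant-envelope modulations, larger under AM."""
    amps = [abs(s) for s in iq]
    mean = sum(amps) / len(amps)
    return sum((a - mean) ** 2 for a in amps) / len(amps)

# Synthetic samples: a constant-envelope tone vs. an amplitude-modulated tone.
psk = [complex(math.cos(0.1 * n), math.sin(0.1 * n)) for n in range(100)]
am = [(1 + 0.5 * math.cos(0.05 * n)) * complex(math.cos(0.1 * n), math.sin(0.1 * n))
      for n in range(100)]
```

ML models such as SVM or XGBoost classify vectors of features like this one, while the DL models (ConvNet, ResNet) learn their own features directly from the raw I/Q samples.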
MOBMU-367
Flood prediction with deep learning, Ganesh Reddy Gunnam, Devasena Inupakutika, Rahul Mundlamuri, Sahak Kaghyan, and David Akopian, The University of Texas at San Antonio (United States) [view abstract]
Deep neural networks have lately proven effective at time-series prediction problems, given their ability to capture the temporal characteristics of a series. In this study, a deep learning-based flood occurrence prediction method is presented for interpreting weather events and meteorological data with higher accuracy. The proposed model is evaluated on the United States National Climatic Data Center (NCDC) storm events dataset. Correlation analysis was performed on the meteorological and weather-phenomenon parameters to choose appropriate inputs. The experimental results show that the model achieves 87.8% accuracy in predicting floods in the United States from 2013 to 2019.
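The correlation-based parameter selection mentioned above can be sketched with a plain Pearson coefficient; the feature names and cutoff below are hypothetical, not those of the study:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def select_parameters(features, target, cutoff=0.3):
    """Keep only parameters whose |correlation| with the flood label
    meets the cutoff (hypothetical threshold)."""
    return [name for name, series in features.items()
            if abs(pearson(series, target)) >= cutoff]
```

Parameters that survive this filter would then form the input vector for the deep time-series model.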
MOBMU-368
A qualitative study of LiDAR technologies and their application areas, Daniel Jaster1, Eberhard Hasche1, and Reiner Creutzburg1,2; 1Technische Hochschule Brandenburg and 2SRH Berlin University of Applied Sciences (Germany) [view abstract]
This work investigated the most relevant 3D LiDAR technologies in 2022 and their applications. For this purpose, applications of LiDAR systems were classified into the typical application areas "3D modeling," "smart city," "robotics," "intelligent motor vehicles," and "consumer goods." The investigation showed that neither "mechanical" LiDAR technologies, nor so-called solid-state LiDAR technologies, nor "hybrid" LiDAR technologies can yet be considered optimal across the typical application areas: none of them could meet all of the elaborated requirements. The "hybrid" LiDAR technologies, such as sequential MEMS LiDAR and sequential Flash LiDAR, proved best suited to most typical applications, although other technologies have proven significantly more suitable for individual applications. It was also found that, at present, several of the investigated LiDAR technologies are equally suitable for some specific application areas. To evaluate suitability, concrete LiDAR systems of different technologies and characteristics were compared with the specific requirements of exemplary applications in each application area. The investigation results provide orientation as to which LiDAR technology is promising for which area of application. Furthermore, it is established that there can in principle be one optimal LiDAR technology: so-called Flash LiDAR based on VCSEL emitters and CMOS detectors, if its performance characteristics are further optimized.
MOBMU-369
Survey into predictive maintenance analysis of photovoltaic systems, Reiner Creutzburg1,2 and Saiful Islam2; 1Technische Hochschule Brandenburg and 2SRH Berlin University of Applied Sciences (Germany) [view abstract]
Renewable Energy (RE) sources are increasingly used to overcome grid instability. Because PV (photovoltaic) data are highly stochastic due to fluctuations in irradiance and temperature, artificial intelligence is used to predict future energy production, detect faults, and manage excess energy. The main problem of renewable energy sources is uncertainty. The predictive model considered here combines renewable sources with conventional components such as photovoltaics, the utility grid, and the inverter system. In this research, such artificial intelligence tools are implemented for sustainable management. Renewable source data vary with location, which affects energy production, so geographical and resource information is extracted and prepared. Data acquisition and analysis can support current technologies such as smart grids, microgrids, and their control systems, and electrical service providers already utilize smart systems and information technologies. In a modern sustainable framework, large-scale data management is required for the administration of microgrids and renewable power supplies. The objective of this survey is to review predictive frameworks for managing large volumes of data through big-data tools (monitoring systems for renewable installations) to support the management of renewable power. The main distinction between conventional and renewable power systems is the variability of the sources: conventional sources include the utility grid and diesel generators, while renewable sources include photovoltaics (PV), wind, etc. The paper also reviews various surveys on predictive analytics for renewable applications to identify best practices in this field.
MOBMU-370
Evaluation and test of various tools for OSINT investigation in social media networks: Facebook, Twitter, Instagram, and Telegram, Klaus Schwarz1 and Reiner Creutzburg1,2; 1SRH Berlin University of Applied Sciences and 2Technische Hochschule Brandenburg (Germany) [view abstract]
This paper aims to understand and evaluate the tools available online that can help carry out OSINT-based social media investigations. As part of this study, the various tools were evaluated based on their features, and several comparative analyses and use cases were also conducted. The objective was to provide an in-depth understanding of all the available tools and to assess which are best suited to an open-source investigation on Facebook, Twitter, Instagram, and Telegram. Based on the findings and information gathered, the paper suggests a set of good open-source investigation tools that can effectively analyze content on these social media networks.
MOBMU-371
Performance of keyword extraction tools, Mohammad Nadim, Adolfo Matamoros, and David Akopian, The University of Texas at San Antonio (United States) [view abstract]
Finding research professionals and collaborators to address community problems continues to be a significant barrier for many local government agencies. Research collaboration between researchers from universities, industry, and local government agencies can be tremendously useful to all organizations. The San Antonio Research Partnership Portal is a collaborative initiative to bring researchers and local government agencies together in one place to solve community concerns. In this paper, we investigate the performance of popular keyword extraction tools by measuring how effectively they identify keywords from research opportunities. The extracted keywords are used in an automated process within the San Antonio Research Partnership Portal to match academic researchers with corresponding research opportunities.
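A minimal frequency-based keyword extractor illustrates the kind of tool being benchmarked; the stopword list and scoring below are simplistic stand-ins, not any of the evaluated tools:

```python
import re
from collections import Counter

# Hypothetical minimal stopword list.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "for", "with", "on", "is", "are"}

def extract_keywords(text, top_n=3):
    """Naive keyword extraction: tokenize, drop stopwords and very short
    tokens, return the most frequent remaining terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return [word for word, _ in counts.most_common(top_n)]
```

Matching researchers to opportunities then reduces to comparing the keyword sets extracted from a researcher's profile and from each opportunity description.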
5:30 – 7:00 PM EI 2023 Symposium Interactive (Poster) Paper Session (in the Cyril Magnin Foyer)
5:30 – 7:00 PM EI 2023 Meet the Future: A Showcase of Student and Young Professionals Research (in the Cyril Magnin Foyer)