Multimedia analysis in CALLISTO
Over the last few decades, research interest in multimedia content analysis has accelerated, with applications in many aspects of daily life. The objective of multimedia analysis is to develop automatic methods for high-level description and annotation from low-level multimedia elements, such as images, text, and video.
The CALLISTO project aims, among other things, to create value from Earth Observation (EO) data beyond the space domain, in both the public and the private sector, in areas such as policymaking, water management, security and journalism. To this end, complementary data sources need to be linked to EO data, and useful information needs to be generated through the analysis and fusion of heterogeneous data. The project involves satellite images and associated metadata, video recordings from Unmanned Aerial Vehicles (UAVs) and crowdsourced information from social media platforms.
Satellite data comprise information about planets gathered by man-made satellites and are commonly used in Earth Observation. EO delivers information about surface and weather changes of the Earth by means of remote sensing technologies. Within CALLISTO, satellite data from the EU’s Copernicus programme are used. Copernicus provides accurate, continuous, timely, easily accessible and free-of-charge information to its users. To facilitate access to the data, five cloud-based platforms called DIAS (Data and Information Access Services) were created; CALLISTO uses ONDA DIAS, which provides full availability of the data and allows users to build their applications on it. In particular, satellite data are used in the project for water quality analysis and land border change detection.
Regarding water quality estimation, remote sensing is used to monitor changes in chemical, biological and physical parameters. Optical sensors on satellites measure changes in the light back-scattered from the water surface. Specialized methods then relate these changes to variations in water constituents (inorganic, organic, particulate and dissolved materials) and their concentration within the water column. Specifically, CALLISTO uses optical data from Copernicus Sentinel-2 L1C products, which are transformed into L2 water quality products by applying several techniques, including atmospheric correction. The processed satellite data, together with in-situ data, are fed to a Deep Neural Network that predicts algal blooms at pixel level. The outcome is a 2D mask accompanied by a metadata file. This module can help water operators and water utilities monitor harmful algal blooms, which affect water quality and, consequently, consumers’ health.
Water quality in CALLISTO
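The output format described for the water quality module (a 2D mask accompanied by a metadata file) can be illustrated with a minimal sketch. It assumes the network has already produced a per-pixel bloom probability; the function names, the 0.5 threshold, the metadata fields and the sample values are all hypothetical and are not taken from the CALLISTO implementation.

```python
import json

def probabilities_to_mask(probabilities, threshold=0.5):
    """Turn per-pixel bloom probabilities into a binary 2D mask (1 = bloom)."""
    return [[1 if p >= threshold else 0 for p in row] for row in probabilities]

def describe_mask(mask, threshold):
    """Build a small metadata record explaining what the mask encodes.

    The field names are illustrative assumptions, not a CALLISTO schema.
    """
    total = sum(len(row) for row in mask)
    flagged = sum(sum(row) for row in mask)
    return {
        "height": len(mask),
        "width": len(mask[0]) if mask else 0,
        "threshold": threshold,
        "legend": {"0": "no bloom predicted", "1": "bloom predicted"},
        "flagged_pixels": flagged,
        "flagged_fraction": flagged / total if total else 0.0,
    }

# Hypothetical network output for a 3 x 4 tile of water pixels.
probabilities = [
    [0.10, 0.20, 0.70, 0.90],
    [0.05, 0.55, 0.80, 0.95],
    [0.00, 0.10, 0.30, 0.60],
]

mask = probabilities_to_mask(probabilities, threshold=0.5)
metadata = describe_mask(mask, threshold=0.5)

# The metadata would be written next to the mask, e.g. as a JSON file.
print(json.dumps(metadata, indent=2))
```

Keeping the legend and threshold in the metadata means a downstream user can interpret the mask without knowing how it was produced.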
As far as land border change detection is concerned, the respective module aims to identify changes in infrastructure, such as new constructions, new buildings and new road networks, that are observed close to national borders. Although the module focuses on the land border area, the methodology itself is not specific to that setting. The changes are identified with a Temporal Convolutional Network, a neural network architecture that is not “oblivious” to the time dimension: it extracts useful information from a time series while taking into account the temporal order of the events under consideration. The module receives a sequence of optical Sentinel-2 images as input and outputs a 2D mask capturing the identified changes, together with a metadata file that describes the meaning of the mask. This module can support the virtual presence of EU policymakers in the border area, as the surveillance of the EU’s external borders is vital for internal security and the protection of citizens.
Land border change detection in CALLISTO. Source: Golovanov, 2018
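To make the idea of a network that respects temporal order more concrete, the sketch below implements the basic building block of a Temporal Convolutional Network, a causal dilated 1D convolution, and applies it to the time series of each pixel in a short image sequence. This is only an illustration under strong assumptions: the filter weights are set by hand (a trained network would learn them), the threshold and sample reflectance values are hypothetical, and a single filter stands in for a full multi-layer model.

```python
def causal_conv1d(series, kernel, dilation=1):
    """Causal dilated convolution: output[t] depends only on series[t],
    series[t - dilation], series[t - 2 * dilation], ... and never on the future."""
    out = []
    for t in range(len(series)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # implicit zero padding on the left
                acc += weight * series[idx]
        out.append(acc)
    return out

def change_mask(frames, threshold):
    """frames: list of T images, each an H x W grid. Returns an H x W 0/1 mask."""
    height, width = len(frames[0]), len(frames[0][0])
    mask = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            series = [frame[r][c] for frame in frames]
            # Hand-set difference filter; a trained TCN would learn its weights.
            response = causal_conv1d(series, [1.0, -1.0])[1:]  # drop the padded step
            if max(abs(v) for v in response) >= threshold:
                mask[r][c] = 1
    return mask

# Hypothetical 2 x 2 reflectance values over four acquisition dates.
frames = [
    [[0.20, 0.21], [0.30, 0.30]],
    [[0.21, 0.20], [0.31, 0.30]],
    [[0.20, 0.60], [0.30, 0.31]],  # pixel (0, 1) brightens sharply
    [[0.21, 0.61], [0.30, 0.30]],
]

print(change_mask(frames, threshold=0.2))
```

Because each output step only looks backwards in time, shuffling the order of the input images would change the result, which is the property the text refers to as not being “oblivious” to the time dimension.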
Information Retrieval is the field of study concerned with searching, browsing and retrieving unimodal (e.g. image) or multimodal (e.g. image and text) data from a database. Searches can be based on full-text indexing (e.g. sentences) or on other content-based indexing (e.g. concepts). Owing to the growth of the Web, personal multimedia collections and massive social media streams, Information Retrieval is needed to design and develop customized search engines that help users find, organize and filter the information they require. For example, Information Retrieval is at work when a user enters a text query in a search engine or filters a search by some criterion in a recommendation system (e.g. IMDB). Similarly, a user can submit an image as a query (e.g. a picture of a building) and retrieve images or video shots that depict a building.
More specifically, in the CALLISTO project the user submits a video shot, with or without spatiotemporal information, and receives the most similar video shots captured by UAVs, ranked by a retrieval methodology. The availability of massive, heterogeneous digital data makes the need for more accurate and faster search engines pressing.
The Information Retrieval flow in CALLISTO
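The query-by-example flow described above can be sketched as a similarity ranking over feature vectors. The text does not specify the retrieval methodology used in CALLISTO, so this example assumes that every video shot has already been mapped to a fixed-length feature vector and that cosine similarity is the ranking criterion; the shot identifiers and vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_shots(query_vector, index, top_k=3):
    """Return the top_k (shot_id, score) pairs, most similar first."""
    scored = [
        (shot_id, cosine_similarity(query_vector, vector))
        for shot_id, vector in index.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Hypothetical index of UAV video shots and their feature vectors.
index = {
    "uav_shot_001": [0.9, 0.1, 0.0],
    "uav_shot_002": [0.1, 0.9, 0.2],
    "uav_shot_003": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]  # feature vector of the submitted video shot

for shot_id, score in rank_shots(query, index, top_k=2):
    print(shot_id, round(score, 3))
```

When spatiotemporal information is supplied with the query, one option would be to filter the index by location and time before ranking, so that only shots from the relevant area and period are scored.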
With respect to crowdsourced data sources, CALLISTO involves the collection of citizen observations from social media platforms and other online content. Popular platforms, such as Twitter, can be monitored by crawling posts in real time through specific queries composed of keywords and user accounts related to the examined use cases. The ultimate objective is to link the collected posts to relevant location-annotated data, such as satellite images, to make their joint exploitation possible and to assist in tasks such as land border change detection, the monitoring of rivers that feed water treatment facilities, and environmental journalism.
To this end, we focus on named entities (or simply “proper names”) in social media posts that refer to locations: we retrieve them, annotate them accordingly and augment them with geo-referencing information (latitude and longitude coordinates). The process of recognizing the correct entities and assigning them a proper label (organization, location, person) is called Named Entity Recognition and has been an active machine-learning-based Natural Language Processing task since the late ’80s. Recent advances in Deep Learning have reignited interest in the field, with state-of-the-art approaches being able to achieve human-level performance.
The EOPEN Social Media Platform as a basis in CALLISTO
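The annotation step for location mentions (find the entity, label it, attach latitude and longitude) can be sketched as follows. Note that this example deliberately replaces the learned Named Entity Recognition model with a plain gazetteer lookup, which is far simpler than the deep learning approaches mentioned in the text; the gazetteer contents, the approximate city coordinates, the field names and the sample post are all illustrative assumptions.

```python
import re

# Tiny hypothetical gazetteer: lower-cased place name -> approximate (lat, lon).
GAZETTEER = {
    "thessaloniki": (40.64, 22.94),
    "athens": (37.98, 23.73),
}

def tag_locations(post):
    """Return location mentions found in a post, labelled and geo-referenced."""
    entities = []
    for match in re.finditer(r"[A-Za-z]+", post):
        key = match.group().lower()
        if key in GAZETTEER:
            lat, lon = GAZETTEER[key]
            entities.append({
                "text": match.group(),
                "label": "LOCATION",
                "start": match.start(),  # character offset in the post
                "lat": lat,
                "lon": lon,
            })
    return entities

post = "Water looks green near the Thessaloniki waterfront today"
print(tag_locations(post))
```

A dictionary lookup cannot resolve ambiguous names or recognize unseen places, which is one reason learned models are preferred in practice; the sketch only shows the shape of the annotated output that makes posts linkable to location-annotated data.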
Finally, non-EO data (i.e. social media posts) and EO data (i.e. satellite images) can be combined to detect events, which are defined as happenings in the real world that are expressed by unusual behavior in particular data streams. Apart from discovering events in each data stream separately, the aim is also to investigate the correlation between these different data types and thereby enhance the overall detection performance. But how can we effectively combine the large amount of Copernicus data and the massive streams of citizen in-situ observations? CALLISTO offers a multimodal fusion mechanism to monitor environmental issues in urban areas, such as the quality of drinking water and its natural resources.
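The notion of an event as unusual behavior in a data stream, and the benefit of combining streams, can be illustrated with a small sketch. It is not the CALLISTO fusion mechanism: it assumes a simple z-score test against a trailing window for each stream and a late-fusion rule that keeps only the time steps flagged in both streams. The window size, threshold and sample series are hypothetical.

```python
from statistics import mean, pstdev

def detect_bursts(values, window=5, z_threshold=3.0):
    """Flag time steps whose value is far above the preceding window's mean."""
    events = []
    for t in range(window, len(values)):
        history = values[t - window:t]
        mu, sigma = mean(history), pstdev(history)
        if sigma == 0:
            sigma = 1.0  # assumption: avoid division by zero on a flat history
        if (values[t] - mu) / sigma >= z_threshold:
            events.append(t)
    return events

# Hypothetical aligned daily series for one area of interest.
daily_posts = [4, 5, 3, 4, 5, 4, 21, 6, 5]  # posts matching the crawl query
daily_bloom_fraction = [0.01, 0.02, 0.01, 0.02, 0.01, 0.02, 0.30, 0.28, 0.02]

social_events = detect_bursts(daily_posts)
eo_events = detect_bursts(daily_bloom_fraction)

# Late fusion by agreement: keep days that look unusual in both streams.
fused_events = sorted(set(social_events) & set(eo_events))
print(social_events, eo_events, fused_events)
```

Requiring agreement between streams trades recall for precision; a softer alternative would be to combine the two z-scores into a single weighted score before thresholding.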
The Multimodal Data Fusion and Analytics Group (M4D) of CERTH-ITI’s MKLab, which participates in CALLISTO, has significant experience in artificial intelligence and in fields such as multimedia analysis, multimodal retrieval, and the discovery and mining of heterogeneous multimedia Web resources, as well as in the processing and analysis of the multimodal data extracted from them. In recent years, the team has coordinated and participated in more than 150 European and National research projects in the areas of multimedia processing, information extraction, and social media monitoring and analysis.
- Date: May 31, 2021
- Writer: Scientific & Technical management team, CERTH