Project Description
Earth Observation data
The incredible amount of data produced by Earth Observation (EO) satellites is a unique source of information. The variety of sensors employed, with different spatial, temporal and spectral characteristics, offers a wealth of knowledge about our planet, contributing to smarter and more targeted decision-making.
In addition, the policy of free and open access to data, coupled with progress in algorithms and data processing, has helped enable the widespread use of this information, also beyond the specialized scientific community.
The ONDA DIAS
Indeed, the DIAS (Data and Information Access Services), funded by the European Commission, were conceived to facilitate and standardize access to the free Copernicus data, increase the exploitation of these data, and boost business, research and innovation.
With this purpose, the ONDA DIAS, led by Serco Italia, provides centralized access to the full set of Copernicus data and information, processing tools, and dedicated APIs for data indexing, which allow further extraction of value from Copernicus data and information as well as from Third-Party missions.
Data fusion
It is CALLISTO’s ambition to extend the concept of the current DIAS infrastructures with a new model of exploitation: employing the DIAS to fuse the already indexed Copernicus data with other heterogeneous, unstructured data sources, using High Performance Computing infrastructures for enhanced scalability.
Data fusion – the synergetic exploitation of data from different sources – combines measurements from several sensors in order to provide enriched information of greater quality.
In CALLISTO, data fusion is provided on-demand, based on the end user requirements, and aims to demonstrate the increased amount and accuracy of information.
Data collection and indexing
With this goal in mind, in the context of CALLISTO, Serco Italia is in charge of performing the data indexing, documentation and cataloguing on the ONDA DIAS infrastructure, which is one of the technological objectives of the project.
ONDA is therefore being extended to provide an advanced data model that combines EO and non-EO data with dedicated indexing functions. More specifically, EO data such as Copernicus Sentinel-2 optical satellite images are fused with other distributed sources, such as GNSS data collected through the Galileo-enabled mobile app, non-EO data retrieved from social media platforms, and video recordings captured by UAVs.
So far, a number of datasets have been ingested into ONDA, selected according to the CALLISTO pilot use cases and their areas of interest:
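To make the idea of a combined data model more concrete, the following is a minimal sketch of how EO and non-EO items could share one record shape and be queried by area and time. It is not the ONDA data model, which this article does not describe: the `CatalogueRecord` type, its fields and the `search` function are all hypothetical, and each item is reduced to a single representative point rather than a full footprint.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Tuple

# (min_lon, min_lat, max_lon, max_lat) in decimal degrees
BBox = Tuple[float, float, float, float]


@dataclass(frozen=True)
class CatalogueRecord:
    """Hypothetical unified record for EO and non-EO items (illustration only)."""
    record_id: str
    source: str         # e.g. "sentinel-2", "hypstar", "twitter", "uav-video"
    is_eo: bool         # True for satellite products, False otherwise
    lon: float          # representative longitude (assumption: a point, not a footprint)
    lat: float          # representative latitude
    acquired: datetime  # acquisition or posting time


def search(records: Iterable[CatalogueRecord], bbox: BBox,
           start: datetime, end: datetime) -> List[CatalogueRecord]:
    """Return the records that fall inside the bounding box and the time window."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return [
        r for r in records
        if min_lon <= r.lon <= max_lon
        and min_lat <= r.lat <= max_lat
        and start <= r.acquired <= end
    ]
```

Because every source is described by the same fields, one query can return a satellite product and a social media post for the same area and period, which is the precondition for fusing them later.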
- Copernicus Sentinel-2 for water quality monitoring
Figure 1: Sentinel-2 data for water quality available in the extended model of the ONDA catalogue. These datasets can be searched, listed and downloaded like any other data available in the catalogue.
- HYPSTAR, an in-situ hyperspectral radiometer system that has the purpose of validating the satellite products used for water quality (read more on hyperspectral sensors here)
Figure 2: HYPSTAR water quality data available in the extended model of the ONDA catalogue. These datasets can be searched, listed and downloaded like any other data available in the catalogue.
- Twitter posts: these crowdsourced, user-generated data are incorporated on the basis of a set of keywords, locations, and user accounts defined by end users. In particular, the selected keywords refer to the areas of interest for water quality, air quality and migration. Based on these keywords, a Twitter crawler was built and has started collecting new matching tweets in real time, already resulting in a collection of approximately 500,000 tweets. These datasets are geo-located using semantic and deep learning technologies.
Figure 3: Twitter data available in the extended model of the ONDA catalogue. These datasets can be searched, listed and downloaded like any other data available in the catalogue.
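The selection of tweets by keywords, locations and user accounts can be illustrated with a small filter. This is only a sketch and not the CALLISTO crawler: the `Tweet` and `Criteria` types and the `matches` function are hypothetical, and the sketch assumes that a tweet is kept when it satisfies any one of the three criteria, which the project may combine differently.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Tweet:
    """Hypothetical, simplified tweet (illustration only)."""
    author: str
    text: str
    coords: Optional[Tuple[float, float]] = None  # (lon, lat), if the tweet is geo-tagged


@dataclass(frozen=True)
class Criteria:
    """Hypothetical end-user selection criteria."""
    keywords: FrozenSet[str]
    accounts: FrozenSet[str]
    # (min_lon, min_lat, max_lon, max_lat), or None for no location filter
    bbox: Optional[Tuple[float, float, float, float]] = None


def matches(tweet: Tweet, criteria: Criteria) -> bool:
    """Assumption: a tweet is kept if ANY of keyword, account or location matches."""
    text = tweet.text.lower()
    if any(keyword.lower() in text for keyword in criteria.keywords):
        return True
    if tweet.author.lower() in {account.lower() for account in criteria.accounts}:
        return True
    if tweet.coords is not None and criteria.bbox is not None:
        lon, lat = tweet.coords
        min_lon, min_lat, max_lon, max_lat = criteria.bbox
        return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    return False
```

Keyword matching here is a case-insensitive substring test, so a keyword such as "water quality" also matches "Water Quality alert". A production crawler would more likely rely on the platform's own filtering rules.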
At this stage of the project, the next steps involve data collection and processing from UAVs, for example for border surveillance purposes. The main UAV body frame and sensors have already been designed and configured, with a high-resolution camera and a GNSS sensor to collect video recordings, which are annotated with precise Galileo geographical coordinates (read more on UAVs here).
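One simple way to attach coordinates to video frames is to interpolate between GNSS fixes at each frame's timestamp. The sketch below shows that idea under stated assumptions and is not the method used on the CALLISTO UAV, which the article does not detail: the fix format `(time_s, lat, lon)`, the function names, and the assumptions that fixes are sorted by time and that the video starts at time 0 are all illustrative.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

Fix = Tuple[float, float, float]  # (time in seconds, latitude, longitude)


def interpolate_position(fixes: Sequence[Fix], t: float) -> Tuple[float, float]:
    """Linearly interpolate (lat, lon) at time t from fixes sorted by time."""
    if not fixes:
        raise ValueError("at least one GNSS fix is required")
    times = [f[0] for f in fixes]
    # Outside the recorded range, clamp to the nearest fix.
    if t <= times[0]:
        return fixes[0][1], fixes[0][2]
    if t >= times[-1]:
        return fixes[-1][1], fixes[-1][2]
    i = bisect_right(times, t)  # index of the first fix strictly after t
    t0, lat0, lon0 = fixes[i - 1]
    t1, lat1, lon1 = fixes[i]
    w = (t - t0) / (t1 - t0)
    return lat0 + w * (lat1 - lat0), lon0 + w * (lon1 - lon0)


def geotag_frames(fixes: Sequence[Fix], frame_count: int,
                  fps: float) -> List[Tuple[float, float]]:
    """Return one (lat, lon) per frame, assuming frame n is captured at n / fps seconds."""
    return [interpolate_position(fixes, n / fps) for n in range(frame_count)]
```

Linear interpolation of latitude and longitude is a reasonable approximation over the short intervals between consecutive fixes. Over long gaps or fast manoeuvres, a motion model would be more appropriate.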
In addition, further unstructured data from other social media platforms, such as Instagram, will be collected and ingested into the ONDA catalogue.
Each of the distributed data sources in CALLISTO will serve as input to machine learning models, 3D representation models, semantic data fusion modules and, when needed, HPC infrastructures, in order to provide the CALLISTO interoperable Big Data platform, a complete and integrated solution with Mixed Reality tools.