Earth-prints repository

Please use this identifier to cite or link to this item: http://hdl.handle.net/2122/10352

Authors: Monna, S.*
Marcucci, N.M.*
Marinaro, G.*
Rossi, M.*
Fiore, S.*
Antonacci, M.*
Beranzoli, L.*
Favali, P.*
Title: INDIGO-DATA CLOUD EC project: A study case applied to one of the EMSO Research Infrastructure Deep sea Observatories
Issue Date: 28-Sep-2016
Keywords: data
indigo
observatory
analysis
case study
Abstract: Our case study is a pilot experience that describes some of the activities performed by INGV within the framework of the European Research Infrastructure EMSO (European Multidisciplinary Seafloor and water column Observatory). EMSO comprises several deep-seafloor and water-column observatories deployed at key sites in European waters, forming a widely distributed pan-European infrastructure. We consider data collected by the NEMO-SN1 observatory, one of the EMSO nodes used for geohazard monitoring, located in the Western Ionian Sea near the Etna volcano. In this poster we focus on the Researcher and Data Manager user types.
The INGV EMSO community uses MOIST (Multidisciplinary Oceanic Information System) to store and visualize the data and metadata produced by the NEMO-SN1 observatory. Data quality control and analysis often require several steps involving different scripts and software developed in-house, commercial tools (Matlab, RStudio, ...), and proprietary tools provided by sensor manufacturers. Some operations in this chain might require considerable computing power. Data are retrieved from MOIST through a remote mount (via Samba or sshfs). Analysis might also be performed on datasets produced by other partners, so remote access to and sharing of these data are needed. At present, in the majority of cases, software is run on individual researchers’ PCs.
The first test of our use case within INDIGO-DataCloud consisted of running an R script in a cloud environment and exploiting the data sharing capabilities. The input to this script comes from the analysis of Short Duration seismic Events (SDE) automatically detected on the continuous time series of the seismometer. The script calculates the cumulative energy over the measurement period (8 months) and compares this cumulative curve to N random cumulative curves obtained by shuffling the energy values among the fixed observed times at which SDEs are detected.
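The comparison performed by the script can be illustrated with a minimal sketch. It is written in Python rather than R and is not the authors' script: the function names, the toy input values, the seed, and the use of a pointwise minimum/maximum envelope to summarize the random curves are all assumptions made for illustration. The abstract states only that the observed cumulative curve is compared to N random ones.

```python
import random
from itertools import accumulate


def cumulative_energy(energies):
    """Running sum of event energies, taken in detection-time order."""
    return list(accumulate(energies))


def shuffled_cumulatives(energies, n_random, seed=0):
    """Build n_random cumulative curves by shuffling the energy values
    among the fixed detection times (the times themselves never move)."""
    rng = random.Random(seed)  # fixed seed (assumption) for repeatable runs
    curves = []
    for _ in range(n_random):
        shuffled = list(energies)
        rng.shuffle(shuffled)
        curves.append(cumulative_energy(shuffled))
    return curves


def envelope(curves):
    """Pointwise minimum and maximum across the random curves.
    Using an envelope as the comparison is an assumption of this sketch."""
    lows = [min(values) for values in zip(*curves)]
    highs = [max(values) for values in zip(*curves)]
    return lows, highs


# Hypothetical toy input: five detected events and their energies.
# detection_times only labels the x-axis; shuffling leaves it untouched.
detection_times = [1.0, 2.5, 4.0, 7.5, 9.0]
energies = [3.0, 1.0, 4.0, 1.0, 5.0]

observed = cumulative_energy(energies)
random_curves = shuffled_cumulatives(energies, n_random=100)
lows, highs = envelope(random_curves)

# Flag the detection times where the observed curve leaves the envelope.
outside = [t for t, o, lo, hi in zip(detection_times, observed, lows, highs)
           if o < lo or o > hi]
```

Every shuffled curve ends at the same total energy as the observed one, so any difference lies in how the energy accumulates over time, not in its final value.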
Within the INDIGO project (WP2) we defined the user requirements and identified some useful INDIGO solutions. In particular, we are testing Ophidia, a software stack for big data analytics (Fiore et al., 2013), and the execution of R jobs in Docker containers described by TOSCA templates through the INDIGO Orchestrator and Apache Mesos. This poster illustrates our case study, the users’ requirements, and the INDIGO solutions that we have tested so far and would like to test in the near future.
Appears in Collections: 05.01.01. Data processing
Conference materials

Files in This Item:

File: Poster-INGV-case-study-indigodatacloud-for-DI4R-Krakow.pdf
Size: 3.31 MB
Format: Adobe PDF

This item is licensed under a Creative Commons License


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

