Jorge Carretero - CosmoHub on Hadoop: a web portal to analyze and distribute massive cosmological data

Seminar (IFAE)



Jelena Aleksic (IFAE), Joern Lange (IFAE Barcelona), John E Ward (IFAE)
We present CosmoHub, a web platform for interactive analysis of massive cosmological data that requires no SQL knowledge. CosmoHub is built on top of Apache Hive, an Apache Hadoop ecosystem component that facilitates reading, writing, and managing large datasets. CosmoHub is hosted at the Port d'Informació Científica (PIC) and currently supports several international cosmology projects, such as the ESA Euclid space mission, the Dark Energy Survey (DES), the Physics of the Accelerating Universe (PAU) and the MareNostrum Institut de Ciències de l'Espai Simulations (MICE). More than two billion objects, drawn from public and private data and from both observations and simulations, are available across all projects. In the last three and a half years, more than 400 users have produced about 1500 custom catalogs occupying 2 TB in compressed format. CosmoHub allows users to access value-added data, to load and explore pre-built datasets, and to create their own custom datasets through a guided process. All of these datasets can be explored interactively using an integrated visualization tool that includes 1D histogram and 2D heatmap plots. In our current implementation, online analysis of datasets of a billion objects can be done in less than 25 seconds. Finally, all of these datasets can be downloaded in three different formats: CSV.BZ2, FITS and ASDF.
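As a rough illustration of the kind of binned aggregation behind a 1D histogram plot, the sketch below counts values per equal-width bin, much as a GROUP BY query over a computed bin index would. It is not CosmoHub's implementation: the function name `binned_histogram`, its parameters, and the sample magnitudes are all hypothetical.

```python
import math
from collections import Counter


def binned_histogram(values, lo, hi, n_bins):
    """Count values per equal-width bin over the half-open range [lo, hi).

    Hypothetical helper for illustration only; values outside the range
    are ignored.
    """
    width = (hi - lo) / n_bins
    counts = Counter()
    for v in values:
        if lo <= v < hi:
            # Bin index, clamped to guard against floating-point rounding
            # at the upper edge.
            idx = min(math.floor((v - lo) / width), n_bins - 1)
            counts[idx] += 1
    # Return a dense list so that empty bins appear as zeros.
    return [counts.get(i, 0) for i in range(n_bins)]


# Made-up sample magnitudes, not real catalog data.
mags = [18.2, 19.7, 20.1, 20.4, 21.9, 22.3, 22.8, 23.5]
hist = binned_histogram(mags, lo=18.0, hi=24.0, n_bins=3)
print(hist)
```

Because each object contributes to exactly one bin, the counts can be accumulated independently on separate chunks of data and summed afterwards, which is one reason this style of aggregation suits distributed engines.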
  • Alícia Labián
  • Antonio Pineda
  • Bruno Bourguille
  • Chihaya Anzai
  • Daniel Guberman
  • Daniel Moreno
  • Delgado Jordi
  • Dirk Hornung
  • Elena Planas
  • Enrique Fernandez
  • Federico Sanchez
  • Francesc Torradeflot
  • Gloria de la Rosa
  • Isaac Esparbé
  • Jelena Aleksić
  • Joaquim Palacio Navarro
  • Jordi Casals
  • José R. Espinosa
  • Juli Mundet
  • Laia Cardiel
  • Leyre Nogués
  • Lluïsa-Maria Mir
  • Machiel Kolstein
  • Manel Martinez
  • Manuel Delfino
  • Mari Carmen Porto
  • Mario Martinez
  • Matthias Jamin
  • Nadia Tonello
  • Oscar Blanch Bigas
  • Oscar Martinez
  • Otger Ballester
  • Paolo Cumani
  • Pilar Casado
  • Rafel Escribano
  • Raimon Casanova
  • Ramon Miquel
  • Sara Strauch
  • Vanessa Acín
  • Xabier Llobregat