
A project for scientific data preservation in France


PREDON as a MADICS action


The group is a member of the GdR "MADICS" (Masses de Données, Informations et Connaissances en Sciences, Big Data - Data Science).

PREDONx Workshops

Extended workshops on Data Preservation (PREDONx) are organised on a yearly basis. Stay tuned for more information and contact Cristinel Diaconu diaconu( at)cppm.in2p3.fr if interested. Former editions here:

PREDON at AMU Days October 2015

Journée Big Data PREDON/AMU: Accès, Préservation, Reproductibilité (PREDON/AMU Big Data Day: Access, Preservation, Reproducibility)

IndexMed Workshop, October 14, 2014, Aix-Marseille University, Campus Saint-Charles, "Methods and tools for the mining of multi-sources and heterogeneous data in ecology".

Big Data PR2I AMU Day, October 15, 2015:

Predon Poster

PREDON at the International Workshop on High Performance Data Intensive Computing (HPDIC'2015)

More details about HPDIC2015 here.

"This year, we open new directions related to the preservation of data in cooperation with the PREDON group. The preservation of scientific data remains nevertheless a challenge due to the complexity of the data structure, the fragility of the custom-made software environments as well as the lack of rigorous approaches in workflows and algorithms."

PREDON Workshop, APC Paris, 4/5 November 2014

The PREDONx 2014 workshop took place in Paris at APC. The website and the agenda can be found here:

https://indico.cern.ch/event/338461/

Registration is open; please send your proposals for contributions (there is no participation fee, and limited support for travel within France is available).

Scientific Data Preservation 2014

Report cover

January 21st, 2014: The PREDON document "Scientific Data Preservation", a fact-finding white paper produced following the 2012/2013 workshops, is available here.

"Data observatories, based on open access policies and coupled with multi-disciplinary techniques for indexing and mining may lead to truly new paradigms in science. It is therefore of outmost importance to pursue a coherent and vigorous approach to preserve the scientific data at long term. The preservation remains nevertheless a challenge due to the complexity of the data structure, the fragility of the custom-made software environments as well as the lack of rigorous approaches in workflows and algorithms. [...]"

"The present document includes contributions form the participants to the PREDON Study Group, as well as invited papers, related to the scientific case, methodology and technology. This document should be read as a “facts finding” resource pointing to a concrete and significant scientific interest for long term research data preservation, as well as to cutting edge methods and technologies to achieve this goal. A sustained and coherent and long term action in the area of scientific data preservation would be highly beneficial."

Challenges

Scientific data collected with modern sensors or dedicated detectors very often exceed the perimeter of the initial scientific design. These data are increasingly obtained at the cost of large material and human efforts. A large class of scientific experiments are in fact unique because of their large scale, with very small chances of being repeated or superseded by new experiments in the same domain: for instance, high energy physics and astrophysics experiments involve multi-annual and even multi-decade developments that are unlikely to be repeated. Other scientific experiments are unique by nature (earth science, medical sciences, etc.), since the collected data are “time-stamped” and thereby non-reproducible by new experiments or observations. The new knowledge obtained using these data (“data observatories”) should be preserved in the long term so that access and re-use remain possible and enhance the return on the initial investment. It is therefore of utmost importance to pursue a coherent and vigorous approach to preserving scientific data in the long term. Preservation nevertheless remains a challenge due to the complexity of the data structure, the fragility of the custom-made software environments, and the lack of rigorous approaches in workflows and algorithms.

Mission

One of the main missions of this project is to reinforce the efforts to preserve scientific data in France. The proposed research program, as well as the associated events and workshops, should be seen as ingredients towards the creation, at the national level and in strong connection with the international organisation, of scientific data infrastructures for long-term preservation and access.

To address the challenges listed above, the PREDON consortium proposes a research program aimed at solving the most urgent problems, which presently lead to a large number of orphaned (and therefore lost) scientific data sets:
• Specific technological and methodological issues for long-term data preservation
• Data mining algorithms and workflows for large/big scientific data sets
• Experimental interfaces and formats for long-term scientific data collection, analysis and exploitation.

The approach proposed by the present project is based on a number of principles which we believe are essential for widely accepted, robust and sustainable long-term scientific data preservation:

• Multi-disciplinarity and unification
• Open access
• International connection

Structure and contact

The members of this project are scientists from IN2P3, INSU, INS2I, IRD, CINES (more details soon).

contact: Cristinel Diaconu diaconu( at)cppm.in2p3.fr

The PREDON project is supported by the Inter-disciplinary Mission of CNRS.

PREDON is structured in 4 working groups:

| Working Package | Objectives | Participants (*coordinator) |
| WP1 Technologies and Methodologies | Explore methodologies and technologies suitable for a coherent and robust scientific data preservation in a multi-disciplinary context and on a multi-platform computing centre | CINES*, APC |
| WP2 Algorithms and Workflows | Investigate generic and mathematically robust workflows and algorithms for data mining suited for data and workflow preservation; data- and process-based workflows and mining techniques to be used in a multi-disciplinary environment towards long term data preservation | LAM, LIRMM, LIPADE*, LIPN |
| WP3 Data formats and interfaces | A parallel approach for data collection, storage, processing, analysis and preservation with the aim to achieve common standards for scientific data treatment | APC, CPPM, LAM*, LPSC |
| WP4 General coordination | Program coordination, dissemination, communication and cooperation | CPPM* |

Workshops and future events

  • 2012: The first PREDON workshop was held in December 2012.
  • 2013: An extended workshop, PREDONx, will take place on 14/15 November 2013 in Marseille. Web page, agenda and registration here.
  • 2014: "DPHEP Full Costs of Curation" workshop: https://indico.cern.ch/conferenceDisplay.py?confId=276820
  • 2014: LOPS@ICDE Workshop on LOng term Preservation for big Scientific data. LOPS will be held in conjunction with the 30th IEEE International Conference on Data Engineering, Chicago, IL, USA, March 31-April 4, 2014.
  • iPRES 2014: The call for contributions for iPRES 2014, to be held in Melbourne in October, is now open: http://ipres2014.org/call-contributions The iPRES 2014 Coordinating Committee invites contributions of papers, posters, demonstrations, tutorials and workshops related to the increasingly broad world of digital preservation.
  • EGI Community Forum: Call for participation: http://cf2014.egi.eu/programme/cfp.html Submission of abstracts at: http://go.egi.eu/CF2014-CfP
    • FYI: there is a track on data and knowledge preservation:
      Data and knowledge preservation and curation (Track Leaders: J. Shiers, A. Fresa)
      This track focuses on applications in data and knowledge preservation and curation and discusses best practices, lessons learnt, shared solutions and common challenges, covering all fields of research. The track will also address the technical and non-technical aspects of using e-infrastructures for data preservation and curation. The convenors are looking for submissions concerning, for example, workflow management, skills improvement, global services, solutions with multidisciplinary applications, business cases, amongst others. Contributors are encouraged to present their experiences, also in terms of concrete stories to be shared with other participants. Demonstrations are particularly welcome.
  • H2020 calls: more details on the topic that explicitly mentions "preservation" can be found at: http://ec.europa.eu/research/participants/portal/desktop/en/opportunities/h2020/topics/2137-einfra-1-2014.html The deadline for submission is September 2014.

  • RDA Plenary March 2014: https://rd-alliance.org/rda-third-plenary-meeting.html
  • IDCC14: "Commodity, catalyst or change-agent? Data-driven transformations in research, education, business & society", 24-27 February 2014, Omni San Francisco Hotel, California Street, San Francisco, USA: http://www.dcc.ac.uk/events/idcc14/

Recent talks

  • Talk at the MASTODONS colloquium, January 23/24, 2014, Institut de Physique du Globe, Paris (see reference below).
  • CHEP2013, October 13-18, 2013, International Conference on "Computing in High Energy Physics":
    • C. Diaconu, "PREDON: a project for scientific data preservation in France" (presented in a parallel session: agenda)
  • FRéDoc: October 7-10, 2013. The Renatis network, the CNRS national network of scientific information professionals, will hold its next meeting, FRéDoc 2013, from 7 to 10 October in Aussois on the theme "Management and valorisation of research data". The meeting will bring together scientific and technical information (IST) professionals, information system administrators, quality managers, researchers and other members of the scientific community from CNRS and other research and higher education institutions, as well as speakers (...)

Reference documents

  • MADICS Logo:
    Screen_Shot_2016-04-06_at_08.28.40.png
Topic attachments
| Attachment | Size | Date | Who | Comment |
| DIACONU_ICDE2014.pptx | 15216.8 K | 2014-04-09 - 14:58 | CristinelDiaconu | PREDON TALK AT ICDE2014 |
| PREDON-VECTO-BD.pdf | 77067.2 K | 2014-02-11 - 09:22 | CristinelDiaconu | Scientific Data Preservation 2014 - PREDON Document |
| PREDON_MASTODONS_24JAN2014-V04.pdf | 7762.2 K | 2014-02-04 - 11:04 | CristinelDiaconu | Talk at MASTODONS Colloque 23/24 January 2014, IPG Paris |
| PREDON_WGs_LOGO.png | 17.1 K | 2013-07-01 - 10:34 | CristinelDiaconu | logo |
| Predon_Poster.pdf | 16267.8 K | 2015-10-13 - 23:27 | CristinelDiaconu | PREDON Poster |
| Predon_Poster.pptx | 12114.1 K | 2015-10-13 - 23:19 | CristinelDiaconu | PREDON Poster |
| Screen_Shot_2013-10-15_at_12.11.11_AM.png | 32.4 K | 2013-10-15 - 00:14 | CristinelDiaconu | PREDON LOGO |
| Screen_Shot_2014-01-21_at_3.53.17_PM.png | 137.9 K | 2014-01-21 - 15:55 | CristinelDiaconu | |
| Screen_Shot_2014-01-22_at_12.43.50_PM.png | 109.5 K | 2014-01-22 - 12:44 | CristinelDiaconu | |
| Screen_Shot_2016-04-06_at_08.28.40.png | 99.2 K | 2016-04-06 - 08:31 | CristinelDiaconu | MADICS Logo |
| predon-WEB_BD.pdf | 2806.0 K | 2014-03-03 - 15:36 | CristinelDiaconu | Scientific Data Preservation 2014 - PREDON Document |
Topic revision: r38 - 2016-04-06 - CristinelDiaconu
 
This site is powered by the TWiki collaboration platform. Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.