Best Practice Reference Architecture for Data Curation

  • Published on
    25-Jul-2015

Transcript

1. Best practice reference architecture for data standardization and curation
  • Dr. Michael Engels, OSTHUS
  • BioIT World, Boston, April 21st 2015

2. Slide 2: Agenda
  • OSTHUS: Who we are
  • Painpoints
  • Reference architecture
  • Use cases
  • Benefits

3. Slide 3: Who we are

4. Slide 4: Who we are
  • Cutting edge in R&D
  • Global partner
  • Independent Digital Lab Informatics Innovation
  • Active network
  • Open collaboration
  • Customer orientation
  • Trust

5. Slide 5: Who we are
  • Focus on value
  • Concepts and methodology
  • Approach & commitment

6. Slide 6: Agenda (repeated; next section: Painpoints)

7. Slide 7: Life science data
  Scientific data are:
  • Valuable assets to NGOs, academia, and industry
  • Domain/context specific
  • Only interpreted by experts
  Scientific data are subject to continuous change:
  • Growth
  • Formats, standards, and technology
  • Concept extensions
  • Context changes

8. Slide 8: Change of concepts
  • Phenomenological based concept
  • Gene-based concept
  • Pharmacology example: Ion channels taxonomy

9. Slide 9: Painpoints
  Data standardization, data curation, master data management, data migration, ...
  • Are complex endeavors
  • Are labor- and alignment-intensive
  • Need expert input (technical and scientific)
  • Are highly iterative
  • Are difficult to frame in timelines or costs
  How to address this challenge?

10. Slide 10: Agenda (repeated; next section: Reference architecture)

11. Slide 11: Reference architecture
  [Architecture diagram. Labels recovered from the slide, in extraction order:]
  • Data migration; Manage Curation runs; Manage Results; Analysis; I, II, III, IV, ...; Manage Dictionary
  • Data Source; Sources; Copy; Copy of target; Working area; Transformation
  • Glossary and Vocabulary; Property Mapping; Extraction & Loading; Data Concept; Target; Data Source
  • Glossary; Vocabulary; Annotation Rules; Mapping Rules; Transformation Rules
  • Run Configuration; Data partitioning; Data Processing; Filtering; Monitoring & Audit
  • Logs & Observ.; Exceptions; Comments; Dashboard
  • Calculate Properties; Data Comparison; Visual Analytics; Tag Data; List Management
  • CDC; SQL to Load; Audit Trails

12. Slide 12: Agenda (repeated; next section: Use cases)

13. Slide 13: Use case 1
  • Chemical cartridge/structure migration
  • [Diagram labels: Accord; Mol2000; #1: racemic; #1]
  • Big Bang

14. Slide 14: Use case 2
  • Data integration
  • DWH
  • Continuous Growth

15. Slide 15: Agenda (repeated; next section: Benefits)

16. Slide 16: Benefits
  Benefits are:
  • Modular set up
  • All functions available within one integrated framework
  • Separate components for technical and scientific experts alike
  • Data curation is part of a process, not of individual data editing
  • Easy-to-use
  • Configurable toolbox tailored to any program
  • Integrated visual / comparative analysis between source and target data
  • Reduction of technical issues
  • Error propagation contained, rollbacks possible
  • Focus on data, not on technology

17. Slide 17: Questions?
  • For more information: Visit us at Booth # 451 or at Poster # 47
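
The reference architecture on slide 11 names a dictionary (vocabulary, property mapping, rules), iterative curation runs on a working copy, exception handling, audit trails, and a source/target comparison. The slides do not show how OSTHUS implements any of these, so the following is only a minimal Python sketch of how such pieces might fit together. Every name in it (VOCABULARY, PROPERTY_MAPPING, CurationRun, compare, the field names, and the sample records) is hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary: normalized source term -> curated target term.
VOCABULARY = {"rac": "racemic", "racemate": "racemic", "racemic": "racemic"}

# Hypothetical property mapping: source field name -> target field name.
PROPERTY_MAPPING = {"cmpd_id": "compound_id", "stereo": "stereo_annotation"}


@dataclass
class CurationRun:
    """One iteration (run I, II, III, ...) over a working copy of the source."""
    name: str
    audit_trail: list = field(default_factory=list)
    exceptions: list = field(default_factory=list)

    def transform(self, record):
        """Apply the mapping and vocabulary rules; return None if the record is rejected."""
        target = {}
        for source_field, value in record.items():
            target_field = PROPERTY_MAPPING.get(source_field)
            if target_field is None:
                self.exceptions.append((record, f"unmapped field: {source_field}"))
                return None
            target[target_field] = value

        raw = target.get("stereo_annotation", "")
        term = VOCABULARY.get(raw.strip().lower())
        if term is None:
            # Unknown terms go to an exception list for expert review.
            self.exceptions.append((record, f"unknown term: {raw!r}"))
            return None
        if term != raw:
            # Record every value change so it can be reviewed or rolled back.
            self.audit_trail.append((self.name, "stereo_annotation", raw, term))
        target["stereo_annotation"] = term
        return target

    def execute(self, source_records):
        """Process a copy of the source so the original data stays untouched."""
        working_copy = [dict(r) for r in source_records]
        results = []
        for record in working_copy:
            curated = self.transform(record)
            if curated is not None:
                results.append(curated)
        return results


def compare(source_records, target_records, run):
    """Very small source/target comparison: record counts only."""
    return {
        "source": len(source_records),
        "target": len(target_records),
        "rejected": len(run.exceptions),
    }


source = [
    {"cmpd_id": "C-1", "stereo": "rac"},
    {"cmpd_id": "C-2", "stereo": "Racemic"},
    {"cmpd_id": "C-3", "stereo": "unknown"},
]
run = CurationRun("run I")
results = run.execute(source)
summary = compare(source, results, run)
```

In this sketch the rejected record stays in run.exceptions rather than stopping the run, which mirrors the "error propagation contained" benefit listed on slide 16; a later run with an extended vocabulary could pick it up again.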
