11 October 2017: Scheduled talks for 'Scaling Cloud to Big Data' session
Dirk Duellmann, CERN: Leads the analytics and design section of CERN IT's storage group.
Yannick Legré, EGI: Managing director.
Title: Building Scale-Out Storage Infrastructures with RADOS and Ceph
Ceph and its core RADOS object store present a compelling building block for architecting very large storage clusters out of commodity hardware. However, several early design decisions can have lasting implications for a Ceph-based infrastructure. For example, which of Ceph's APIs should I build upon - RADOS, S3, block, POSIX? Should we build a single multi-tenant cluster, or are targeted smaller deployments somehow better? Erasure-coding looks inexpensive on paper, but are there performance implications? When and where does it pay off to add flash? This talk will attempt to answer these and other questions about Ceph, based on our 4 years' experience operating multi-petabyte Ceph clusters for OpenStack, scientific data, and HPC.
Daniel van der Ster, CERN IT storage group.
Dan is a Storage Engineer at CERN and sits on the Ceph Advisory Board as its academic chair. In 2008, he earned a PhD in Computer Engineering from the University of Victoria in Canada.
Title: Data, Software, and Knowledge Preservation of Scientific Results
Simply preserving the data from a scientific experiment is rarely sufficient to enable the re-use or re-analysis of the data. Instead, a more complete set of knowledge describing how the results were obtained, including analysis software and workflows, computation environments, and other documentation, may be required. This talk explores the challenges in preserving the various knowledge products and how emerging technologies such as Linux containers may provide solutions that simultaneously enable preservation and computational portability. In particular, the work of the Data and Software Preservation for Open Science (DASPOS) project and the building of the CERN Analysis Preservation Portal will be highlighted.
Mike Hildreth, Associate Dean for Research and Professor of Physics at the University of Notre Dame
Title: Cloud Native Databases
The cloud native movement has changed how applications are developed and managed in recent years. The main drivers are the adoption of public cloud technologies, containers and microservices. But storage options have been left behind in comparison: while conventional SQL databases have improved clustering options, they still lack features required to be considered “cloud native”. With the increased adoption of technologies like Kubernetes, many companies are confronted with the question of how to manage state in a cloud native environment.
The talk will give an overview of what cloud native means, why I should care, and how databases need to be designed to be considered cloud native. We will then show some available options for cloud native databases and how to run them on Kubernetes.
Michael Mueller is CTO at Container Solutions Switzerland. Before joining CS, Michael headed the IT and Cloud Innovation team at Swisscom Innovation. His team worked with container infrastructure and orchestration systems such as Kubernetes, Mesos, and Docker Swarm, and adopted DevOps principles. Prior to that, he worked in roles ranging from Operations Management to Network Security at companies in the tourism, ISP, broadcasting and defense sectors. Michael is a Cloud Native Computing Foundation ambassador specializing in Kubernetes.