Data science & engineering workflows - from Notebooks to Production
Part 1 - From Notebooks to Production - our experiences
Scaling up data science can be difficult. Data set management, cluster management, distributed workloads and model development suddenly all need to fit into the routines and conventions of classical software development. The current MLOps landscape offers several potential solutions, but the lack of a mature standard indicates an underlying issue: none of them is perfect. In this talk, we discuss the challenges of production-grade data science workflows and what we have learned from dealing with them on behalf of our customers.
Part 2 - From Notebooks to Production - our proposal
Taking into account everything we learned in the first talk, what would we want the process to look like? We propose a set of tools to quickly and easily get you started with cluster computing, batch job workflows, scalable on-demand notebooks and distributed model training in a cloud-agnostic, testable and version-controlled manner. We guide you through our thought process in designing a solution that makes the development experience simple, fun and ergonomic. We invite you to collaborate with us in the creation of an open source toolkit for data science & engineering.
- Co-founder and ML engineer, Backtick Technologies. M.Sc. Computer Science & Engineering, Lund University
Michal has a background in big data validation at an American tech giant. In 2018 he pivoted towards data science and ML engineering, and he now focuses mainly on helping companies design, implement and improve their data science pipelines and ML infrastructure.
- CTO, Backtick Technologies. M.Sc. Computer Science & Engineering, Lund University
Johan has been writing software for the greater part of his life, specialising in services and distributed software. For the past few years he has been applying these skills to solve problems in the data engineering space.
17:30 – 17:45 – Meet & Greet
17:45 – 18:30 – Presentation - Part 1
18:30 – 19:00 – Meet & eat
19:00 – 19:45 – Presentation - Part 2
19:45 – 20:00 – Q&A