Kubernetes for Big Data: Building a Big Data stack using Kubernetes

Title: Building a Big Data Stack on Kubernetes
Description: Sean Suchter of Pepperdata shares some considerations and enhancements for deploying Apache Spark within a Kubernetes environment.
What: February 2018 BayLISA Meetup
When: Recorded on 2018-02-15
Where: PayPal Inc. corporate offices “Town Hall”, San Jose, CA

Abstract:

There is growing interest in running Apache Spark natively on Kubernetes (see https://github.com/apache-spark-on-k8s/spark). In this meetup, we will discuss how to build a big data stack on Kubernetes. Specifically, you will learn how the Apache Spark scheduler can still provide HDFS data locality on Kubernetes by discovering how Kubernetes containers map to physical nodes and how those nodes map to HDFS datanode daemons. You’ll also learn how to give Spark a highly available HDFS namenode, a critical service, when HDFS itself runs in Kubernetes.
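
The container-to-node-to-datanode lookup mentioned in the abstract can be pictured as a two-step mapping. The following is only a minimal sketch of that idea, not the Spark scheduler's actual code: the function name `locality_level`, the sample pod IPs, and the node names are hypothetical, and it assumes the pod-to-node mapping has already been collected (for example, from the Kubernetes API). The return labels borrow the names of Spark's locality levels.

```python
from typing import Dict, Set


def locality_level(executor_pod_ip: str,
                   pod_to_node: Dict[str, str],
                   block_datanode_hosts: Set[str]) -> str:
    """Classify how close an executor is to an HDFS block.

    executor_pod_ip      -- IP of the pod running the Spark executor (hypothetical input)
    pod_to_node          -- assumed mapping of pod IP -> physical node name
    block_datanode_hosts -- physical nodes whose datanode holds a replica of the block
    """
    # Step 1: resolve the container (pod) to the physical node it runs on.
    node = pod_to_node.get(executor_pod_ip)
    # Step 2: check whether a datanode with the block lives on that same node.
    if node is not None and node in block_datanode_hosts:
        return "NODE_LOCAL"
    # Unknown pod, or no replica on the executor's node: no locality.
    return "ANY"


# Illustrative data only; these IPs and node names are made up.
pod_to_node = {"10.1.0.5": "node-a", "10.1.1.7": "node-b"}
block_hosts = {"node-a"}

print(locality_level("10.1.0.5", pod_to_node, block_hosts))
print(locality_level("10.1.1.7", pod_to_node, block_hosts))
```

Without the first step, the scheduler would compare a pod IP against datanode host names and never find a match, which is why the mapping matters for locality.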

Presenter:

Pepperdata Founder and CTO, Sean Suchter

Sean is the co-founder and CTO of Pepperdata. Previously, Sean was the founding GM of Microsoft’s Silicon Valley Search Technology Center, where he led the integration of Facebook and Twitter content into Bing search. Prior to Microsoft, Sean managed the Yahoo Search Technology team, the first production user of Hadoop. Sean joined Yahoo through the acquisition of Inktomi, and holds a B.S. in Engineering and Applied Science from Caltech.

Interested in learning more about future BayLISA meetups? Visit the BayLISA meetup page at https://www.meetup.com/BayLISA/

Enjoyed viewing this event? See more BayLISA events on more.opentechtv.com!