This is Day 3 of the VMware Experts Program: Big Data, Scientific & Engineering Workloads, at VMware Corporate Headquarters, Palo Alto, CA.

Compute Accelerators for HPC, ML, and Big Data on vSphere


Ziv Kalmanovich, VMware

The problem with a paravirtual device is performance: you have another layer to go through

GPU Compute on vSphere with DirectPath IO

Ability to share a single GPU across Virtual Machines


Containers and Big Data

Ben Corrie

Did a research project many years ago to see if we could make VMs "look and smell" like containers

People are still figuring out how to get the benefits of containers for their existing applications

Turning Policy into plumbing

Containers and VMs are complementary

When you put multiple containers in a VM, they share a single failure domain

What is a container?

  • An Executable process
  • Designed to be one process
  • Resource constraints/Private Namespace
  • Binary dependencies: Application runtime, OS
  • A shared Linux kernel for running the executable
  • Ephemeral and persistent storage layer

Difference between a hypervisor and a container host: a hypervisor is running everything

Containers force you to think about state management in a way that is really helpful

A transactional container only consumes resources when it’s running

Pods: Ability to tie multiple containers into a single "unit of scale"
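A minimal sketch of the pod idea, assuming a pod is simply a named group of containers that is always replicated as a whole. The class names, the `scale` helper, and the resource figures are hypothetical and do not correspond to any real orchestrator's API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Container:
    name: str
    cpu_millicores: int  # hypothetical resource request
    memory_mib: int      # hypothetical resource request


@dataclass(frozen=True)
class Pod:
    name: str
    containers: Tuple[Container, ...]  # containers that always run together

    def cpu_millicores(self) -> int:
        return sum(c.cpu_millicores for c in self.containers)


def scale(pod: Pod, replicas: int) -> List[Pod]:
    """Scaling copies the whole pod, never an individual container."""
    return [Pod(f"{pod.name}-{i}", pod.containers) for i in range(replicas)]


# Illustrative pod: an application container plus a helper container.
web = Pod("web", (Container("app", 500, 256), Container("log-shipper", 100, 64)))
replicas = scale(web, 3)
total_cpu = sum(p.cpu_millicores() for p in replicas)
```

Scaling to three replicas yields three copies of both containers, which is the sense in which the pod, not the container, is the unit of scale.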

DAWN: Infrastructure for Usable Machine Learning

Matei Zaharia, Stanford

It's the golden age of data

Hidden Technical Debt in Machine Learning Systems from Google

Training Data is the Key to AI

Image search, Speech, Games: Labeled Training Data is (Relatively) easy to obtain

How are we handling performance? End-to-end compilers: Weld, Delite

The main way developers are productive is by composing existing libraries

For data-intensive apps, data movement cost dominates on modern hardware
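One way to picture why data movement matters when composing libraries: each library call typically makes its own pass over the data and writes an intermediate result, while an end-to-end compiler can fuse the calls into a single pass. The sketch below is plain Python with hypothetical function names; it illustrates the general idea only and is not the API of Weld or Delite.

```python
def total_unfused(values):
    # Each step stands in for a separate library call that reads its whole
    # input and materializes a full intermediate list.
    shifted = [v - 1.0 for v in values]   # pass 1, intermediate list
    scaled = [v * 2.0 for v in shifted]   # pass 2, intermediate list
    return sum(scaled)                    # pass 3


def total_fused(values):
    # Same arithmetic in a single pass with no intermediate lists,
    # which is the kind of result loop fusion aims for.
    total = 0.0
    for v in values:
        total += (v - 1.0) * 2.0
    return total


data = [float(i) for i in range(1000)]
same_result = total_unfused(data) == total_fused(data)
```

Both functions compute the same value; the fused version touches each element once instead of three times.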

Machine learning systems can do much more to support user applications end-to-end

Successful systems need to span whole software stack from infrastructure to data to algorithms


Reference Architectures for High-Performance Computing

Mohan Potheri

Since HPC jobs are long-running, it's important to leverage vSphere to mitigate the impact of infrastructure failures

You can provide networking at a VM level with Software-defined networking.

For an HPC environment you are able to combine all your resources

Secure multi-tenancy with NSX

Nearly every IT component contributes to application performance

A lack of visibility can lead to alert storms that drain the productivity of HPC systems

Workload Management for vSphere

Jared Rosoff

Workloads are getting more complicated

A Workload is not just a VM

These workloads are not tied to a host or a cluster; this adds a new global view

Tagging would be great if everything had the right tag on it.

Workload access would be role-based

Machine Learning Using Virtualized GPUs in vSphere

Uday Kurkure


Two solutions to access GPUs in vSphere: VMware DirectPath I/O or NVIDIA GRID vGPU

GPU overhead relative to native is only 4% in both solutions
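As a rough worked example of what a 4% overhead means, assuming the figure is a relative increase in run time over bare metal (the notes do not say how it was measured). The 100-second native run time is a hypothetical number, not a result from the talk.

```python
def virtualized_seconds(native_seconds, overhead_fraction=0.04):
    # Assumption: overhead is the fractional increase in wall-clock time.
    return native_seconds * (1.0 + overhead_fraction)


# Hypothetical job that takes 100 s on bare metal.
estimate = virtualized_seconds(100.0)
```

Under that assumption, a job that takes 100 seconds natively would take about 104 seconds in a VM.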

GPUs: More silicon is devoted to increasing the number of ALUs

GPUs outperform CPUs on ML Workloads

GPUs sit on the PCIe bus

One VM with one vGPU can be scaled to four VMs with one vGPU each

You really can run multiple virtual machines on a P40

Virtualized GPUs deliver near bare-metal performance for ML workloads in VMware vSphere
















