In this post we present a quick tutorial on how to set up the latest version of Kubeflow, the open source package for managing machine learning on Kubernetes. Follow these instructions to get Kubeflow 1.5 started:
- Launch a cluster on EKS (we recommend 4 worker nodes of at least t3.2xlarge size to give training enough juice).
- We recommend using Spectro Cloud Palette for ease of management :) Here’s a quick link on how to launch an EKS cluster on our platform. But you may do this any way you like.
- Install “kustomize” version 3.2.0 if you have not done so already (download link).
- Have “kubectl” handy (link to install instructions).
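Before moving on, it can help to confirm that the command-line tools are actually on your PATH. The snippet below is a minimal sketch rather than part of the official instructions: `check_tool` is a hypothetical helper name, and `git` is included on the assumption that you will use it to clone the manifests.

```shell
# check_tool is a hypothetical helper: it reports whether a CLI is on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
    return 1
  fi
}

# Check every tool this tutorial relies on (git is an assumption, used for cloning).
missing=0
for tool in git kubectl kustomize; do
  check_tool "$tool" || missing=1
done

# Print the versions only when everything is installed.
if [ "$missing" -eq 0 ]; then
  kustomize version          # this tutorial expects 3.2.0
  kubectl version --client   # client only, no cluster connection needed
fi
```

If any tool is reported as missing, install it before continuing.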
Setting it up from the command line:
Let’s download from the source.
1. Clone the manifests from the official repo and switch to the v1.5 branch.
2. Download the kubeconfig from Palette and make sure kubectl knows where to find it, for example by setting the KUBECONFIG environment variable to its path.
If you need a quick way to switch between KUBECONFIG contexts, we recommend the “kctx” trick mentioned in our blog post.
3. Now it is time to apply the required resource files, and there are quite a few! Be aware that this may take up to 20 minutes because of the number of resources involved.
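The three steps above can be sketched as the shell helpers below. The repository URL, the branch name `v1.5-branch`, and the `example` kustomization directory are assumptions based on the upstream Kubeflow manifests repository, so check them against its README before running anything. The function names are hypothetical, and the kubeconfig path is whatever location you saved the file to.

```shell
# Assumed upstream location and branch of the Kubeflow manifests.
MANIFESTS_REPO="https://github.com/kubeflow/manifests.git"
MANIFESTS_BRANCH="v1.5-branch"

# Step 1: clone the manifests with the 1.5 branch checked out.
clone_manifests() {
  git clone --branch "$MANIFESTS_BRANCH" "$MANIFESTS_REPO" manifests
}

# Step 2: point kubectl at the kubeconfig downloaded from Palette.
# $1 is the path to that file.
use_kubeconfig() {
  export KUBECONFIG="$1"
  kubectl get nodes   # sanity check: the worker nodes should be listed
}

# Step 3: build all resources with kustomize and apply them.
# The first passes can fail while custom resource definitions are still
# registering, so the apply is retried until it succeeds.
install_kubeflow() {
  (
    cd manifests || exit 1
    while ! kustomize build example | kubectl apply -f -; do
      echo "Retrying to apply resources"
      sleep 10
    done
  )
}
```

A typical session would call them in order, for example `clone_manifests && use_kubeconfig "$HOME/Downloads/kubeconfig.yaml" && install_kubeflow`, where the kubeconfig path is only a placeholder.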
When this is done, let’s set up port forwarding so that the dashboard is reachable from your machine.
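A minimal sketch of the port-forwarding step, assuming the default Kubeflow install exposes the dashboard through the `istio-ingressgateway` service in the `istio-system` namespace. The local port 8080 and the function name `forward_dashboard` are arbitrary choices for illustration.

```shell
# Local port to expose the dashboard on (arbitrary choice).
LOCAL_PORT=8080
DASHBOARD_URL="http://localhost:${LOCAL_PORT}"

# Forward the local port to port 80 of the Istio ingress gateway.
# Assumes the service name and namespace of a default Kubeflow install.
forward_dashboard() {
  echo "Dashboard will be available at ${DASHBOARD_URL}"
  kubectl port-forward svc/istio-ingressgateway -n istio-system "${LOCAL_PORT}:80"
}
```

Call `forward_dashboard` and leave it running in its own terminal; the command blocks until you stop it with Ctrl+C.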
Then open the Kubeflow dashboard in your browser and log in with the default credentials:
> username : firstname.lastname@example.org
> password: 12341234
Voilà! You now have a Kubeflow cluster at your disposal for your ML projects.
Let’s try an example:
Text Classification with Movie Reviews
Using the same notebook server specifications as above, we can run an example NLP notebook.
This notebook classifies movie reviews as positive or negative using the text of the review. This is an example of binary classification, an important and widely applicable kind of machine learning problem.
We’ll use the IMDB dataset that contains the text of 50,000 movie reviews from the Internet Movie Database. These are split into 25,000 reviews for training and 25,000 reviews for testing.
We download the dataset and build a classifier on top of a pre-trained text embedding model from TensorFlow Hub called google/nnlm-en-dim50/2.
The next step is model training. Depending on your server specifications, this part might take some time.
Graph of accuracy and loss over time:
As always, if you have any questions or want to discuss anything described in this tutorial, ping us at email@example.com.