We are venturing into the domain of Unsupervised Learning here.

With unsupervised learning, the class labels are not known, and the data are plotted to see whether they cluster naturally. Cluster analysis divides the data into groups that are, ideally, meaningful and useful to us later. These clusters may or may not match the human perception of similarity.

Clusters Representation.

In Agglomerative Hierarchical Clustering, we initially treat every data point as its own cluster. We then repeatedly merge the two nearest clusters into a new cluster. …
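
The merge loop described here can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it uses single linkage (the distance between two clusters is the distance between their closest members), Euclidean distance, and a hypothetical stopping rule `n_clusters`. The function name `agglomerative` and the sample points are made up for this example, and other linkage choices are equally valid.

```python
from math import dist


def agglomerative(points, n_clusters):
    """Naive single-linkage agglomerative clustering (illustrative sketch).

    Returns a list of clusters, each a list of indices into `points`.
    """
    # Start with every point in its own cluster.
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Single linkage (an assumption): distance between the closest members.
        return min(dist(points[i], points[j]) for i in a for j in b)

    # Keep merging until the requested number of clusters remains.
    while len(clusters) > max(n_clusters, 1):
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        # Merge the nearest pair into one new cluster.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Made-up sample data: three points near the origin, two near (5, 5).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
clusters = agglomerative(points, n_clusters=2)
print(clusters)
```

This version recomputes every pairwise distance on each pass, so it is only suitable for small datasets. Library implementations cache distances and record the merge order so that a dendrogram can be drawn.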

PCA, Principal Component Analysis, is one of the basic techniques for reducing high-dimensional data to a much smaller set of dimensions. The idea is to shrink the dataset while preserving as much information as possible.
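
To make the idea concrete, here is a minimal sketch that reduces 2D points to a single number each and then reconstructs them. It is an assumption-laden illustration, not the method used later in this blog: it finds only the first principal component, uses power iteration on the covariance matrix instead of a library routine, and the function name `first_principal_component` and the sample rows are invented for the example.

```python
def first_principal_component(rows, iterations=200):
    """Return (mean, unit direction) of the top principal component."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeated multiplication by the covariance matrix
    # converges to the eigenvector with the largest eigenvalue, assuming
    # the starting vector is not orthogonal to it.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v


# Made-up sample data lying close to the line y = 2x.
rows = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
mean, direction = first_principal_component(rows)

# Compress: one score per point instead of two coordinates.
scores = [sum((r[j] - mean[j]) * direction[j] for j in range(2)) for r in rows]

# Decompress: rebuild approximate 2D points from the scores.
reconstructed = [[mean[j] + s * direction[j] for j in range(2)] for s in scores]
print(direction)
print(reconstructed)
```

Each point is now stored as one score plus the shared mean and direction, and the reconstruction stays close to the original because most of the variance lies along that single direction. In practice a library routine that returns several components at once would normally be used instead.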

In this blog, we will use PCA to compress an image while keeping just enough detail to still recognize the person in the picture.

This blog is a tutorial that you can use as a reference while learning PCA.

Importing Libraries and Loading the image:

Converting image to 2D Array:

The array has 904 rows and 603 columns of pixels…

This is a tutorial on deploying Kubeflow on a local Kubernetes cluster from scratch. While trying to deploy it myself using the official documentation, I ran into a lot of errors, mostly compatibility issues, so I am documenting the process here for reference.

We will create an Ubuntu VM first. I have created an instance on GCP.

VM Requirements:

  1. VM OS — Ubuntu 20.04 LTS
  2. 16 GB RAM
  3. 4 vCPUs
  4. The official site recommends a machine with 250 GB storage…
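
Before going further, it can help to compare a planned VM against the numbers in the list above. The snippet below is a small sketch of that comparison. The dictionary keys, the function name `unmet_requirements`, and the sample `vm` values are hypothetical; only the thresholds come from the list.

```python
# Minimums taken from the requirements list above; the key names are made up.
REQUIRED = {"ram_gb": 16, "vcpus": 4, "disk_gb": 250}


def unmet_requirements(spec, required=REQUIRED):
    """Return the names of requirements that `spec` falls short of."""
    # A missing key counts as 0, so it is reported as unmet.
    return [name for name, minimum in required.items()
            if spec.get(name, 0) < minimum]


# Hypothetical VM with enough RAM and vCPUs but a smaller disk.
vm = {"ram_gb": 16, "vcpus": 4, "disk_gb": 100}
print(unmet_requirements(vm))
```

An empty list means the spec meets every listed minimum. Note that the storage figure is a recommendation rather than a hard requirement.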

