
Feature Transformations: A Tutorial on PCA and LDA


Reducing the dimension of a dataset using methods such as PCA

Towards Data Science
Photo by Nicole Cagnina on Unsplash


When dealing with high-dimensional data, it is common to use methods such as Principal Component Analysis (PCA) to reduce the dimension of the data. This converts the data to a different (lower-dimensional) set of features. This contrasts with feature subset selection, which selects a subset of the original features (see [1] for a tutorial on feature selection).

PCA is a linear transformation of the data to a lower-dimensional space. In this article we start by explaining what a linear transformation is. Then we show with Python examples how PCA works. The article concludes with an outline of Linear Discriminant Analysis (LDA), a supervised linear transformation method. Python code for the methods presented in that paper is available on GitHub.

Linear Transformations

Imagine that after a vacation Bill owes Mary £5 and $15, to be paid in euro (€). The exchange rates are £1 = €1.15 and $1 = €0.93. So the debt in € is:

(5 × 1.15) + (15 × 0.93) = 5.75 + 13.95 = €19.70

Here we are converting a debt in two dimensions (£, $) to one dimension (€). Three examples of this are illustrated in Figure 1: the original (£5, $15) debt and two other debts of (£15, $20) and (£20, $35). The green dots are the original debts and the red dots are the debts projected onto a single dimension. The red line is this new dimension.

Figure 1. An illustration of how converting £,$ debts to € is a linear transformation. Image by author.

On the left of the figure we can see how this can be represented as matrix multiplication. The original dataset is a 3 by 2 matrix (3 samples, 2 features), the exchange rates form a 1D matrix of two components, and the output is a 1D matrix of three components. The exchange rate matrix is the transformation; if the exchange rates change, then the transformation changes.

We can perform this matrix multiplication in Python using the code below. The matrices are represented as numpy arrays; the final line calls the dot method on the cur matrix to perform matrix multiplication (dot product). This…
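
A minimal sketch of this multiplication, using the three debts and two exchange rates given above. The array name cur comes from the text; the names rates and euro are assumptions made for illustration, and the layout of the original listing may differ.

```python
import numpy as np

# Debts as a 3 x 2 matrix: one row per debt, columns are (£, $).
cur = np.array([[5, 15],
                [15, 20],
                [20, 35]])

# Exchange rates to euro for (£, $). The name `rates` is assumed here.
rates = np.array([1.15, 0.93])

# Matrix multiplication: each 2D debt is mapped to a single € amount.
euro = cur.dot(rates)
print(euro)
```

The result is a 1D array of three euro amounts, approximately 19.70, 35.85 and 55.55. Changing the values in rates changes the transformation without touching the data.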

