Vol. 4, Issue 11, April 2018

Introduction to Machine Learning Algorithms

In the simplest terms, Machine Learning can be defined as a set of algorithms or computer programs that infer knowledge from past information and experience and use it to improve future decision making. Having emerged from Artificial Intelligence, Machine Learning has evolved into a separate field of study within Computer Science & Engineering.

Tom M. Mitchell, an American computer scientist, defines the algorithms studied in Machine Learning as follows: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E."

Image Source [1]

Machine Learning algorithms are proving extremely useful in a wide range of applications, especially critical ones such as natural language processing, medical image processing, financial trading, energy production, and weather prediction and forecasting. The following block diagram explains a Machine Learning process with the help of a simple example. Suppose we wish to build a Machine Learning algorithm that identifies male and female persons from their images. In the training phase, the algorithm is trained on thousands of images, and the output/label (the known male or female classification of each image) is fed to the algorithm along with each image. In the prediction phase, the trained algorithm assigns a label to an image it has not seen before.
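The two phases described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method behind the diagram: the two-number feature vectors are hypothetical stand-ins for real images, and the nearest-centroid rule and the function names `train` and `predict` are assumptions chosen for brevity.

```python
# Toy illustration of the training and prediction phases.
# The two-number feature vectors are hypothetical stand-ins for real images.

def train(samples, labels):
    """Training phase: compute the mean feature vector (centroid) per label."""
    sums, counts = {}, {}
    for features, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict(model, features):
    """Prediction phase: return the label whose centroid is closest."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(model, key=lambda label: squared_distance(model[label]))

# Labelled training data (made-up numbers, for illustration only).
training_samples = [[1.0, 1.2], [0.8, 1.0], [3.0, 3.1], [3.2, 2.9]]
training_labels = ["female", "female", "male", "male"]

model = train(training_samples, training_labels)   # training phase
prediction = predict(model, [2.9, 3.0])            # prediction phase
print(prediction)
```

A real image classifier would first turn each image into a much longer feature vector, but the split between a training step that sees labels and a prediction step that does not stays the same.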

Figure 2 - Machine Learning Process (Source [2])

Two types of Machine Learning algorithms are briefly discussed in this article.

Supervised Machine Learning Algorithms
Supervised learning algorithms are applied to a labelled dataset (a dataset in which each data point has some sort of tag or class). The dataset contains independent variables (or input variables) and a dependent variable (or output variable). The algorithm learns a mapping function that predicts the dependent variable from the independent variables. It is called supervised because learning takes place on a training dataset where the output is known: if the predicted output deviates from the known output, the algorithm is corrected. The process is repeated until an acceptable level of performance is reached. Supervised learning algorithms can be further categorized into Classification and Regression, depending on the type of output variable. If the output variable is a category such as True or False, Long or Short, or Yes or No, the task is called Classification. If the output variable is a real number such as cost, sales, revenue, height or length, the task is called Regression. Naïve Bayes, NB Tree, Decision Tree, Random Forest, Logistic Regression and Linear Regression are commonly used supervised algorithms.
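The correct-and-repeat loop described above can be sketched for the regression case as follows. This is one possible illustration rather than a definitive implementation: it fits a straight line by gradient descent, and the data, the learning rate, the stopping threshold and all variable names are assumptions made for the example.

```python
# Sketch of supervised learning as repeated error correction (regression case).
# The data, learning rate and stopping threshold are illustrative assumptions.

inputs = [1.0, 2.0, 3.0, 4.0]    # independent variable
outputs = [3.0, 5.0, 7.0, 9.0]   # dependent variable (known outputs), here y = 2x + 1

weight, bias = 0.0, 0.0          # parameters of the mapping function y = weight * x + bias
learning_rate = 0.05
acceptable_error = 1e-6

def mean_squared_error(w, b):
    """Average squared deviation of predicted outputs from known outputs."""
    return sum((w * x + b - y) ** 2 for x, y in zip(inputs, outputs)) / len(inputs)

epochs = 0
while mean_squared_error(weight, bias) > acceptable_error and epochs < 100_000:
    # Gradients of the mean squared error with respect to weight and bias.
    grad_w = sum(2 * (weight * x + bias - y) * x for x, y in zip(inputs, outputs)) / len(inputs)
    grad_b = sum(2 * (weight * x + bias - y) for x, y in zip(inputs, outputs)) / len(inputs)
    # Correction step: move the parameters against the gradient.
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b
    epochs += 1

predicted = weight * 5.0 + bias  # prediction for an input not in the training data
print(round(weight, 2), round(bias, 2), round(predicted, 2))
```

For a classification task the structure of the loop is the same; only the output type and the performance measure change.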

Unsupervised Machine Learning Algorithms
Unsupervised learning is an attempt to find hidden patterns in datasets that are not labelled. In this case, unlike supervised learning, a sequence of input values has no corresponding labelled output value. The algorithms are used to group data by common attributes or to find rules that explain the data well. Unsupervised algorithms can be categorised into clustering and association. K-means clustering and the Apriori algorithm are commonly used unsupervised algorithms.
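As a rough illustration of clustering, the sketch below groups unlabelled points with a bare-bones K-means loop. The data points are made up, and taking the first k points as the initial centroids is a simplifying assumption; practical implementations usually choose starting centroids more carefully.

```python
# Minimal K-means sketch on unlabelled 2-D points (made-up data).

def k_means(points, k, iterations=10):
    # Simple deterministic initialisation (an assumption): first k points.
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            assignments[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments

points = [[1.0, 1.0], [5.0, 5.0], [1.2, 0.8], [5.2, 4.9], [0.9, 1.1], [4.8, 5.1]]
centroids, assignments = k_means(points, k=2)
print(assignments)
```

No labels are supplied anywhere; the grouping comes only from how close the points are to one another.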

The following are some of the software packages and tools used for building machine learning algorithms: R (the caret and e1071 packages), Python (the NumPy, SciPy and Matplotlib packages), Weka and MATLAB.

By: Dr. Vanita Jaitly - Associate Professor (CSE), Chitkara University, Himachal Pradesh

References

  1. https://www.autodesk.com/redshift/machine-learning/
  2. http://adilmoujahid.com/posts/2016/06/introduction-deep-learning-python-caffe/

Disclaimer: The content of this newsletter is contributed by Chitkara University faculty & taken from resources that are believed to be reliable. The content is verified by the editorial team to the best of its ability, but the editorial team denies any ownership pertaining to validation of the source & accuracy of the content. The objective of the newsletter is limited to spreading awareness among faculty & students about technology, not to impose on or influence the decisions of individuals.