The Naive Bayes algorithm is one of the best-known supervised machine learning algorithms for multi-class classification problems.

Some of its applications

  • Text classification, to determine which class a text document belongs to (e.g. politics, comics, etc.).
  • Spam email prediction
  • Sentiment analysis, such as the daily rating of tweets, reviews and comments on social media to determine whether they are violent, sarcastic, and so on.
  • Medical Diagnosis

Introduction to Probability

Probability Theorem

We have three main probability types:

  1. Marginal Probability
  2. Joint Probability
  3. Conditional Probability
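As a quick reference, the three types can be written as follows for two discrete random variables $X$ and $Y$. The symbols are chosen here for illustration and are not taken from the original example.

```latex
\begin{align}
\text{Marginal:}\quad    & P(X = x) = \sum_{y} P(X = x,\, Y = y) \\
\text{Joint:}\quad       & P(X = x,\, Y = y) \\
\text{Conditional:}\quad & P(X = x \mid Y = y) = \frac{P(X = x,\, Y = y)}{P(Y = y)}, \qquad P(Y = y) > 0
\end{align}
```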

Let’s check the following example to get the idea behind each type.


Suppose that we have the following table that…

First step in computer vision using the Sequential API


TensorFlow, as described on its official website, is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state of the art in ML and lets developers easily build and deploy ML-powered applications.

So, what is Keras?

Keras is a high-level API for deep learning that can be used to build, train and evaluate any sort of deep neural network (DNN). It was developed as a research project by François Chollet. …

A common technique in machine learning systems, used to handle a sequence of data processing components or cases where many transformations have to be applied to the data.

Pipeline in Machine Learning

In Fig.1 you can see that the output of the price prediction model is stored and then used as an input to another model, an investment analysis model, which combines it with other types of data to predict whether it’s OK to invest in this product. So, the pipeline here consists of two consecutive processing units. This represents the first case of using pipelines.

Fig.1 Pipeline of sequential data processing units.
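The two-unit pipeline described above can be sketched roughly as follows. This is only a minimal illustration: the function names, the feature names and all numbers are hypothetical, and the two "models" are fixed rules standing in for trained models.

```python
# Hypothetical two-stage pipeline: every name and number below is illustrative.

def predict_price(features):
    """Unit 1: a stand-in price prediction model (a fixed linear rule)."""
    return 50.0 + 2.0 * features["demand"] - 1.5 * features["supply"]


def analyze_investment(predicted_price, other_data):
    """Unit 2: a stand-in investment analysis model.

    Combines the stored output of unit 1 with another type of data
    (here, the current cost) and returns True if investing looks OK.
    """
    expected_margin = predicted_price - other_data["current_cost"]
    return expected_margin > other_data["min_margin"]


def run_pipeline(features, other_data):
    """Run the two processing units one after the other."""
    stored_price = predict_price(features)  # output of unit 1 is stored
    decision = analyze_investment(stored_price, other_data)  # then fed to unit 2
    return stored_price, decision


price, ok_to_invest = run_pipeline(
    {"demand": 10.0, "supply": 4.0},
    {"current_cost": 55.0, "min_margin": 5.0},
)
print(price, ok_to_invest)
```

The point of the sketch is only the data flow: the second unit never sees the raw features, just the first unit's stored output plus its own extra data.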

The second…

What Does Deep Learning Mean?

Deep learning is a subset of machine learning, which in turn is a subset of artificial intelligence, as shown in Fig.1. Deep learning (DL) differs from classical machine learning (ML), also called shallow learning, in that it offers better performance on many problems by using neural networks with many layers, huge datasets, and the ability to accelerate computation on GPUs. Many of today’s AI advancements, such as detecting spam in emails, forecasting stock prices, recognizing objects in images, diagnosing illnesses and self-driving cars, are due to the great progress and power of deep learning.

Fig.1 Relation between AI, ML and DL

Is there another difference?

Yes, another major…


Tip: You should have a good understanding of what supervised machine learning is, and of what training data and testing data are. (I also recommend reading the first part of this topic.)

  • As in any supervised machine learning problem, we have inputs and actual outputs (labeled data).
  • We start with preliminary random weight values.
  • So we need to find the weight values that produce the actual desired outputs, or the nearest values to them with the least error, and this is the role of backpropagation.

Fig.1 FFNN and backpropagation
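The idea in the bullets above (start from a random weight, then adjust it to reduce the error) can be sketched with a single weight. This is a simplified illustration rather than full backpropagation: with one weight and no hidden layers, the chain rule reduces to a single derivative. The data, the target rule y = 3x and the learning rate are all assumptions made for the example.

```python
import random

# Toy labeled data for the assumed target rule y = 3 * x (illustrative only).
inputs = [1.0, 2.0, 3.0, 4.0]
targets = [3.0, 6.0, 9.0, 12.0]

random.seed(0)
weight = random.uniform(-1.0, 1.0)  # preliminary random weight
learning_rate = 0.01                # assumed step size


def mean_squared_error(w):
    """Average squared difference between predictions w * x and the targets."""
    return sum((w * x - y) ** 2 for x, y in zip(inputs, targets)) / len(inputs)


initial_error = mean_squared_error(weight)

for _ in range(200):
    # Gradient of the mean squared error with respect to the weight.
    gradient = sum(
        2 * (weight * x - y) * x for x, y in zip(inputs, targets)
    ) / len(inputs)
    weight -= learning_rate * gradient  # move the weight against the gradient

final_error = mean_squared_error(weight)
print(round(weight, 3), final_error < initial_error)
```

In a real network the same update is applied to every weight, with the gradients passed backwards layer by layer.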

Feedforward Neural Network

This was discussed in the previous tutorial.

  • 1- Inputs are received.
  • 2- Inputs are modified by weights. The…

Feed-Forward Neural Networks (FFNN)


In its most general form, a neural network is a machine that is designed to model the way in which the brain performs a particular task or function of interest; the network is usually implemented by using electronic components or is simulated in software on a digital computer.

To achieve good performance, neural networks employ a massive interconnection of simple computing cells referred to as “neurons”, “perceptrons” or “processing units”.

We may thus offer the following definition of a neural network:

A neural network is a massively parallel distributed processor made up of simple processing units…

Random Forest & Gradient Boosting Algorithms

Tip: It’s recommended that you read the articles “Decision Tree: Regression Trees” and “Decision Trees Classifier” before going deep into ensemble methods.


The basic idea behind ensemble methods is to build a group, or ensemble, of different predictive models (weak models), each independent of the others, and then combine their outputs in some way to get a more precise final prediction.
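Two common ways of combining the outputs are averaging (for regression) and majority voting (for classification). The sketch below shows only this combination step, under the assumption that the weak models have already produced their predictions. The prediction values and helper names are hypothetical.

```python
from collections import Counter
from statistics import mean


def combine_by_average(predictions):
    """Combine numeric outputs (regression) by averaging them."""
    return mean(predictions)


def combine_by_vote(predictions):
    """Combine class labels (classification) by majority vote."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical outputs of five weak models for one sample.
regression_outputs = [10.0, 12.0, 11.0, 9.0, 13.0]
class_outputs = ["spam", "ham", "spam", "spam", "ham"]

print(combine_by_average(regression_outputs))
print(combine_by_vote(class_outputs))
```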

Because using different algorithms to train different (independent) models and combining their results is very difficult, ensemble methods use only one algorithm to train all of these models in a…

Both Regression Trees and Classification Trees are part of the CART (Classification And Regression Tree) algorithm. As we mentioned in the Regression Trees article, a tree is composed of three major parts: the root node, decision nodes and terminal (leaf) nodes.

The criterion used here for node splitting differs from the one used in Regression Trees. As before, we will run our example and then learn how the model is trained.

Three measures are commonly used for attribute selection. The Gini impurity measure is the one used by the CART classifier. For more information on these, see Wikipedia.
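As a rough sketch, Gini impurity for a node can be computed as one minus the sum of the squared class proportions, and a candidate split can be scored by the size-weighted average of its two child nodes. The helper names below are hypothetical and this is not the article's own implementation.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def weighted_gini(left_labels, right_labels):
    """Impurity of a candidate split: size-weighted average of both sides."""
    total = len(left_labels) + len(right_labels)
    return (
        len(left_labels) / total * gini_impurity(left_labels)
        + len(right_labels) / total * gini_impurity(right_labels)
    )


mixed_node = ["setosa", "setosa", "versicolor", "versicolor"]
print(gini_impurity(mixed_node))                      # evenly mixed node
print(weighted_gini(mixed_node[:2], mixed_node[2:]))  # split into two pure nodes
```

A lower weighted value means a better split, so the second call (two pure children) scores better than leaving the node mixed.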

The data set used is the iris data set.

import numpy as np
import pandas as pd

SVM non-linear models and kernel-tricks

The first part of this tutorial, on the linear SVM model, which I strongly recommend reading first, mentioned that SVM is used for solving both regression and classification problems, but mostly for classification, as it has a great ability to classify using either linear or non-linear modeling.

Can we classify non-linearly separable data?

Suppose we have a system like the one shown below, and we would like to use SVMs to classify it. We have seen that this is not possible because the data is not linearly separable. However, this last assumption is not correct. …

Linear Model


In machine learning, support vector machines (SVMs; also called support vector networks) are supervised learning models with associated learning algorithms that analyze data and recognize patterns. They are used for classification and regression analysis, but mostly for classification. An SVM works by establishing a hyperplane between different classes of data. If the data can’t be linearly separated, the SVM tries to project the data into a higher dimension in a way that makes it separable, using what is known as the kernel trick.
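The projection idea can be illustrated with a toy example. Note the assumptions: the data is made up, the mapping x → (x, x²) is applied explicitly (a real kernel computes the equivalent inner products without building the projected points), and the threshold is picked by hand rather than learned by an SVM.

```python
# Toy 1-D data (illustrative): class +1 lies at the extremes and class -1 in
# the middle, so no single threshold on x separates the two classes.
points = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [1, 1, -1, -1, -1, 1, 1]


def feature_map(x):
    """Project a 1-D point into 2-D as (x, x^2)."""
    return (x, x * x)


def classify(x, threshold=2.5):
    """Linear rule in the projected space: the sign of (x^2 - threshold)."""
    _, squared = feature_map(x)
    return 1 if squared - threshold > 0 else -1


predictions = [classify(x) for x in points]
print(predictions == labels)
```

In the projected space the horizontal line x² = 2.5 plays the role of the separating hyperplane.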

SVM Algorithm (Linear Model)

Some supervised machine learning algorithms require transforming labels into numbers, and SVM is one of them. This algorithm requires that the positive…

Ahmed Imam

Machine Learning Engineer & Python/Machine Learning Senior Instructor
