The normal distribution is the most important probability distribution in statistics because it fits many natural phenomena.

In this article we will cover some distributions that I have found useful while analysing data. I have split them based on whether they are for a continuous or a discrete random variable. For each one, I first give a short theoretical introduction to the distribution and its probability density function, and then show how to use Python to represent it graphically.

Continuous Distributions:

  • Normal Distribution, also known as Gaussian distribution
  • Standard Normal Distribution — case of normal distribution where loc or mean = 0 and…
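As a small illustration of the two distributions listed above, here is a minimal sketch using `statistics.NormalDist` from Python's standard library. This is an assumption for illustration only; the article itself may use a different library. The mean of 170 and standard deviation of 10 are made-up values.

```python
from statistics import NormalDist

# Hypothetical normal distribution: mean (loc) 170, standard deviation (scale) 10.
heights = NormalDist(mu=170, sigma=10)

# The standard normal distribution is the special case with
# mean 0 and standard deviation 1.
standard = NormalDist(mu=0, sigma=1)

# Density at the mean, and the probability of falling within
# one standard deviation of the mean.
peak_density = standard.pdf(0)
within_one_sd = standard.cdf(1) - standard.cdf(-1)

# A z-score maps a value from any normal distribution onto the standard normal.
z = (180 - heights.mean) / heights.stdev

print(round(peak_density, 4))   # 0.3989
print(round(within_one_sd, 4))  # 0.6827
print(z)                        # 1.0
```

Because `z` is 1.0 here, `heights.cdf(180)` and `standard.cdf(z)` describe the same probability, which is why the standard normal is often used as a common reference scale.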


Scrum methodology to implement Agile philosophy for Analytics teams.

Photo by Kelly Sikkema on Unsplash

Waterfall Methodology

Traditionally, analytics teams have followed a waterfall approach to project management. This involves dividing the project into sequential steps: requirement gathering, development, testing, delivery, and maintenance. The benefit of this approach is that budgeting is easy; on the flip side, there is little to no contact with the customer after the requirement-gathering stage until delivery.

Agile Philosophy

Agile is an alternative philosophy that strives to make the production process more efficient and manageable. Agile breaks the project into consecutive, iterative sprints. There are many different project…


Photo by Bench Accounting on Unsplash

This is the third of a series of articles that I will write to give a gentle introduction to statistics. In this article we will cover how we can visualize data using various charts and how to read them. I will show how to create these charts using SAS and will include code snippets as well. For a full version of the code visit my GitHub repository.

SAS has a built-in procedure called sgplot that allows you to create several kinds of plots. Also available is proc univariate, which allows you to create histograms and normal probability plots, also known…


Photo by Isaac Smith on Unsplash

This is the second of a series of articles that I will write to give a gentle introduction to statistics. In this article we will cover how we can visualize data using various charts and how to read them. I will show how to create these charts using Python and will include code snippets as well. For a full version of the code visit my GitHub repository.

Python has many libraries for creating visually appealing charts. In this article we will work with the built-in tips dataset and then plot it using the following libraries:

import seaborn as sns
tips =…


A three part article series on version control using Git and GitHub.

Photo by Yancy Min on Unsplash

This is the first article in the series in which I will give a very brief introduction to Git. This will allow most readers to understand enough to utilize it for version control during development.

Let's start with some definitions. Git is source control software that allows you to take snapshots of your work and distribute your creations and modifications over time.

Before we learn about Git, let's quickly look at a few basic terminal commands, as we will be using Git from the terminal.

  • Terminal (on Unix or Mac) or Command Prompt (on Windows) allows us to type Git commands and manage project repositories. …


Learn how to automate repetitive or complex tasks using the power of Excel VBA.

Photo by Alex Knight on Unsplash

Most modern programming languages have a set of similar building blocks, for example

  1. Ability to store values in variables (usually of different kinds such as integers, floating-point numbers, or characters)
  2. A string of characters where you can store names, addresses, or any other kind of text
  3. Some advanced data types such as arrays, which can store a series of regular variables (such as a series of integers)
  4. Ability to loop your code in the sense…

Co-author: Richard Vogg

Photo by Micheile Henderson on Unsplash

Acquiring a new customer in the financial services sector can be anywhere from five to 25 times more expensive than retaining an existing one. Therefore, preventing customer churn is of paramount importance for the business. Advances in Machine Learning, the availability of large amounts of customer data, and more sophisticated methods for predicting churn can help devise data-backed strategies to prevent customers from churning.


Photo by Estée Janssens on Unsplash

What is markdown?

Markdown is a lightweight markup language used to create rich text using a plain text editor. It is often used for formatting readme files and for creating static webpages using Jekyll. Some of its popular cousins from the markup family are HTML and XML.

The following table provides a quick overview of frequently used Markdown syntax elements. It does not cover every case, so if you need more information about any of these elements, refer to the reference guides for basic syntax and extended syntax.


C++ is between 10 and 100 times faster than Python when doing any serious number crunching.

Photo by Dawid Zawiła on Unsplash

Most modern programming languages have a set of similar building blocks, for example

  1. Ability to store values in variables (usually of different kinds such as integers, floating-point numbers, or characters)
  2. A string of characters where you can store names, addresses, or any other kind of text
  3. Some advanced data types such as arrays, which can store a series of regular variables (such as a series of integers)
  4. Ability to loop your code in the sense that you want to receive 10 names from a user, you will write the…
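The four building blocks above can be sketched in a few lines. The example below uses Python for brevity rather than the language the article goes on to teach, and every name and value in it (`collect_names`, the sample names, the scores) is hypothetical. To keep it runnable without user interaction, the names come from a list instead of keyboard input.

```python
# 1. Variables of different kinds
count = 3        # integer
price = 9.99     # floating-point number
initial = "V"    # single character (Python has no separate char type)

# 2. A string of characters
name = "Ada Lovelace"

# 3. An array-like collection holding a series of integers
scores = [70, 85, 90]
average = sum(scores) / len(scores)


# 4. A loop: repeat one step instead of writing it out once per item
def collect_names(source, how_many):
    """Take up to `how_many` names from `source`, any iterable of strings."""
    names = []
    for _, item in zip(range(how_many), source):
        names.append(item.strip())
    return names


names = collect_names(["Ann ", "Bob", "Cy", "Dee"], 3)
print(names)  # ['Ann', 'Bob', 'Cy']
```

Swapping the list for repeated calls to `input()` would turn this into the interactive version, with the loop body unchanged.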

Photo by Pritesh Sudra on Unsplash

This is the first of a series of articles that I will write to give a gentle introduction to statistics. In this article we will introduce some basic statistical concepts and learn how to use basic statistics to help you describe your data. We will cover the following topics in this article:

  • The difference between Descriptive and Inferential statistics
  • Different types of variables
  • Types of descriptive statistics
  • Normal or Gaussian distribution
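To make the idea of descriptive statistics concrete, here is a minimal sketch using Python's standard `statistics` module. The data values are made up for illustration, and the choice of measures is an assumption about what such an introduction typically covers.

```python
import statistics

# Made-up sample of daily sales counts, for illustration only.
data = [12, 15, 15, 18, 20, 22, 45]

# Measures of central tendency
mean = statistics.mean(data)      # 21, pulled upward by the outlier 45
median = statistics.median(data)  # 18, robust to the outlier
mode = statistics.mode(data)      # 15, the most frequent value

# Measures of spread
value_range = max(data) - min(data)  # 33
spread = statistics.stdev(data)      # sample standard deviation (n - 1 denominator)

print(mean, median, mode, value_range, round(spread, 2))
```

The gap between the mean and the median is a quick hint that the data are skewed, which is one reason to report more than one summary statistic.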

The difference between a population and a sample:

Vivek Parashar

Experienced in synthesizing data to identify trends, deliver insights and recommendations. Focused on customer lifecycle, cross-sell, and employee performance.
