dplyr package in R

What is “dplyr”? The “dplyr” package, widely used in R, provides a grammar of data manipulation. It is written and maintained by Hadley Wickham.

Why “dplyr”? The package helps transform and summarize data frames (i.e., data recorded in tabular form with rows and columns). …

Customer Lifetime Value

Customer Lifetime Value (CLV), also known as Lifetime Value (LTV), is the present value of the future cash flows from a customer over his or her entire relationship with the company. In other words, the dollar value of a customer relationship which is based on the present …
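As a rough illustration of the present-value idea, the sketch below discounts a stream of expected cash flows at a fixed rate. The function name `customer_lifetime_value`, the cash-flow figures, and the 10% discount rate are assumptions made up for this example; they are not taken from the article.

```python
def customer_lifetime_value(cash_flows, discount_rate):
    """Present value of the expected future cash flows from one customer.

    Assumption: cash_flows[0] arrives one period from now, cash_flows[1]
    two periods from now, and so on.
    """
    return sum(
        cf / (1 + discount_rate) ** t
        for t, cf in enumerate(cash_flows, start=1)
    )


# Hypothetical customer: 100 per year for three years, discounted at 10%
clv = customer_lifetime_value([100, 100, 100], 0.10)
print(round(clv, 2))
```

A higher discount rate or a shorter relationship lowers the result, which is why the present value is smaller than the plain sum of the cash flows.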

The analysis of variance (ANOVA) technique was first introduced by R. A. Fisher. Although the name suggests splitting the total variance into components, it actually splits the total sum of squares of a response variable in a dataset into separate sums of squares, one for each source of variation. …
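The decomposition can be checked numerically. The sketch below assumes the simplest case, a one-way layout with a single grouping factor, and uses made-up response values. It shows that the total sum of squares equals the between-group plus the within-group sum of squares.

```python
# Hypothetical response values for three groups (invented for illustration)
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [7.0, 8.0, 9.0],
    "C": [1.0, 2.0, 3.0],
}

all_values = [v for vals in groups.values() for v in vals]
grand_mean = sum(all_values) / len(all_values)

# Total sum of squares: every observation against the grand mean
ss_total = sum((v - grand_mean) ** 2 for v in all_values)

# Between-group sum of squares: each group mean against the grand mean,
# weighted by the group size
ss_between = sum(
    len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
    for vals in groups.values()
)

# Within-group sum of squares: each observation against its own group mean
ss_within = sum(
    (v - sum(vals) / len(vals)) ** 2
    for vals in groups.values()
    for v in vals
)

print(ss_total, ss_between, ss_within)
```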

In the previous article of this series, we learned how to calculate the values of the coefficients and how to run hypothesis tests on the slope coefficients.

You have probably heard of regression models many times, but you may not have heard of the techniques for solving or building a regression model stepwise.

Merging Dataframes

pandas provides various facilities for easily combining Series, DataFrame, and Panel objects, with various kinds of set logic for the indexes and relational algebra functionality for join / merge-type operations.

```python
import pandas as pd

df = pd.DataFrame([{'Name': 'Chris', 'Item Purchased': 'Sponge', 'Cost': 22.50},
                   {'Name': 'Kevyn', 'Item Purchased': 'Kitty Litter', 'Cost': 2.50},
                   {'Name': 'Filip', 'Item Purchased': 'Spoon', 'Cost': 5.00}],
                  index=['Store 1', 'Store 1', 'Store 2'])
df
```

```
         Cost Item Purchased   Name
Store 1  22.5         Sponge  Chris
Store 1   2.5 …
```
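The set logic behind join / merge-type operations can be sketched with `pd.merge`. The two small tables below (`staff` and `students`) and their contents are hypothetical, invented only to show how an inner join and an outer join differ.

```python
import pandas as pd

# Hypothetical tables, invented for this example
staff = pd.DataFrame({"Name": ["Kelly", "Sally", "James"],
                      "Role": ["Director", "Course liaison", "Grader"]})
students = pd.DataFrame({"Name": ["James", "Mike", "Sally"],
                         "School": ["Business", "Law", "Engineering"]})

# Inner join: keeps only the names found in both tables (set intersection)
inner = pd.merge(staff, students, how="inner", on="Name")

# Outer join: keeps every name from either table (set union);
# cells with no matching row are filled with NaN
outer = pd.merge(staff, students, how="outer", on="Name")

print(inner)
print(outer)
```

Passing `how="left"` or `how="right"` instead keeps all rows from one side only.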

In the last part of the Statistical Cluster Analysis series, we discussed hierarchical cluster analysis (HCA), which is the first of the exploratory data analysis techniques.

Statistical cluster analysis is an exploratory data analysis technique which groups heterogeneous objects (M.D.) into …

The Series Data Structure

We will start with a quick, non-comprehensive overview of the fundamental data structures in pandas. The fundamental behavior of data types, indexing, and axis labeling / alignment applies across all of the objects. To get started, import numpy and load pandas into your namespace:

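```python
import pandas as pd

# The trailing ? is IPython/Jupyter syntax that displays the docstring;
# in a plain Python session, use help(pd.Series) instead
pd.Series?
```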
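To make the points about indexing and label alignment concrete, here is a minimal sketch. The data (`sports`, `a`, `b`) is hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data: building a Series from a dict turns the keys into index labels
sports = pd.Series({"Archery": "Bhutan", "Golf": "Scotland", "Sumo": "Japan"})

# Label-based access with .loc, position-based access with .iloc
by_label = sports.loc["Golf"]
by_position = sports.iloc[0]

# Arithmetic aligns on index labels, not on position
a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["y", "z", "w"])
total = a + b  # labels present in only one Series produce NaN

print(by_label, by_position)
print(total)
```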

We’ll create …

The Python Programming Language: Functions

```python
x = 1
y = 2
x + y
```

add_numbers is a function that takes two numbers and adds them together.

```python
def add_numbers(x, y):
    return x + y

add_numbers(1, 2)
```

add_numbers updated to take an optional third parameter. Using print lets a single cell display the results of several expressions.

```python
def add_numbers(x, y, z=None):
    # z is optional; `is None` is the idiomatic way to test for a missing value
    if z is None:
        return x + y
    else:
        return x + y + z

print(add_numbers(1, 2))
print(add_numbers(1, 2, 3))
```

add_numbers updated to take an optional flag parameter.

```python
def add_numbers(x, y, z=None, flag=False):
    if flag:
        print('Flag is true!')
    if z is None:
        return x + y
    else:
        return x + y + z

print(add_numbers(1, 2, flag=True))
```
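Because `z` and `flag` both have defaults, a caller can skip either one and pass the other by name. The short sketch below repeats the function so it runs on its own. The extra calls are illustrative and not from the original post.

```python
def add_numbers(x, y, z=None, flag=False):
    if flag:
        print('Flag is true!')
    if z is None:
        return x + y
    return x + y + z


# Positional arguments fill the parameters from left to right
print(add_numbers(1, 2, 3))

# A keyword argument lets the caller skip z and set only flag
print(add_numbers(1, 2, flag=True))

# When every argument is passed by keyword, the order does not matter
print(add_numbers(y=2, x=1, flag=False, z=4))
```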

Assign function add_numbers to variable a

```python
def add_numbers(x, y):
    return x + y

a = add_numbers
a(1, 2)
```
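Assigning a function to a variable works because functions are ordinary objects in Python, so they can also be passed to other functions. The sketch below illustrates this. The helper `apply_twice` is a hypothetical name introduced only for the example.

```python
def add_numbers(x, y):
    return x + y


def apply_twice(func, x, y):
    # Hypothetical helper: calls whichever function it receives, twice
    return func(func(x, y), y)


a = add_numbers  # no parentheses: this binds the function object itself

print(a is add_numbers)      # both names refer to the same function object
print(apply_twice(a, 1, 2))  # (1 + 2) + 2
```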

The Python Programming …