# Plot Lorenz Curve in Python

import numpy as np

from matplotlib import pyplot as plt
%matplotlib inline


The Gini coefficient, along with the Lorenz curve, is a great way to show inequality in a series of values. However, there is no straightforward wrapper function for plotting it. I’ve been using these a lot lately and would like to write down how I do it with NumPy and Matplotlib.

Let’s first come up with a small sample set. To make a skewed dataset, I append two Poisson random samples together.

X = np.append(np.random.poisson(lam=10, size=40),
              np.random.poisson(lam=100, size=10))
X

array([ 11,  11,  13,  10,  17,  14,  14,   9,   8,  11,   4,  13,  12,
         9,  14,  11,  10,  11,  13,  17,  14,   6,  10,   8,  13,  14,
        13,   5,  22,   8,  13,  10,  11,  14,   6,   6,  11,  10,   6,
         7, 103,  86,  99,  94,  99,  95,  91,  95,  88,  79])


#### Calculation of Gini

The formula comes from the Gini coefficient’s Wikipedia page:

$$\text{Gini} = \frac{2\sum_{i=1}^n i\times y_i}{n\sum_{i=1}^n y_i} - \frac{n+1}{n}$$

Note that the $y_i$ are sorted in ascending order, so that $y_i \leq y_{i+1}$, and the index $i$ runs from $1$ to $n$.

Now, we can write a simple wrapper function for this.

def gini(arr):
    ## first sort a copy in ascending order, leaving the input untouched
    sorted_arr = arr.copy()
    sorted_arr.sort()
    n = arr.size
    coef_ = 2. / n
    const_ = (n + 1.) / n
    ## sum of rank * value, with ranks starting at 1
    weighted_sum = sum([(i+1)*yi for i, yi in enumerate(sorted_arr)])
    return coef_*weighted_sum/(sorted_arr.sum()) - const_

gini(X)

0.5305263157894735
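
As a sanity check on the formula, here is a vectorized sketch of the same calculation. The name `gini_vectorized` is my own and not part of any library, and the sketch assumes a non-empty 1-D input of non-negative values with a positive sum. It is tried on two cases that are easy to verify by hand.

```python
import numpy as np

def gini_vectorized(values):
    """Gini coefficient via the rank-weighted formula (hypothetical helper)."""
    # assumption: non-empty, non-negative values with a positive sum
    y = np.sort(np.asarray(values, dtype=float))
    n = y.size
    ranks = np.arange(1, n + 1)  # i = 1, ..., n
    return 2.0 * np.sum(ranks * y) / (n * y.sum()) - (n + 1.0) / n

equal = gini_vectorized([5, 5, 5, 5])        # everyone holds the same amount
one_holder = gini_vectorized([0, 0, 0, 1])   # one of four holds everything
```

The first case comes out as $0$, and the second as $(n-1)/n = 0.75$, which is the largest value this formula can give for $n = 4$.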


#### Lorenz Curve

There are two elements we need: the line of perfect equality, which has a slope of $1$, and the Lorenz curve itself.

To quote the Lorenz curve’s Wikipedia page:

“The curve is a graph showing the proportion of overall income or wealth assumed by the bottom $x$% of the people”

In fact, the famous 80-20 rule is one good example: the bottom 80% holds 20% of the overall wealth. Under perfect equality, on the other hand, the bottom $x$% holds exactly $x$% of the overall wealth.
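
To make these two situations concrete, here is a small sketch that computes the share of the total held by the bottom fraction of a dataset. The helper `bottom_share` and both datasets are made up for illustration, and the sketch assumes the fraction corresponds to a whole number of entries.

```python
import numpy as np

def bottom_share(values, fraction):
    """Share of the total held by the bottom `fraction` of entries (hypothetical helper)."""
    # assumption: fraction * n is a whole number of entries
    y = np.sort(np.asarray(values, dtype=float))
    k = int(round(fraction * y.size))
    return y[:k].sum() / y.sum()

# made-up data: eight people hold 2.5 each, two people hold 40 each
pareto_like = np.array([2.5] * 8 + [40.0] * 2)
share_80_20 = bottom_share(pareto_like, 0.8)

# perfect equality: ten people hold 10 each
equal = np.full(10, 10.0)
share_equal = bottom_share(equal, 0.8)
```

In the first dataset the bottom 80% holds 20% of the total, while under perfect equality the bottom 80% holds 80%.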

As with the Gini coefficient, we first sort the data in ascending order. Then we convert the values to cumulative proportions of the total. We also prepend the origin $(0,0)$ so that the curve starts there.

X_lorenz = np.sort(X).cumsum() / X.sum()
X_lorenz = np.insert(X_lorenz, 0, 0)
X_lorenz[0], X_lorenz[-1]

(0.0, 1.0)

fig, ax = plt.subplots(figsize=[6,6])
## scatter plot of Lorenz curve
ax.scatter(np.arange(X_lorenz.size)/(X_lorenz.size-1), X_lorenz,
           marker='x', color='darkgreen', s=100)
## line plot of equality
ax.plot([0,1], [0,1], color='k')


This gives us a very simple Lorenz curve. From here, any customization is up to our own needs.
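
The curve and the coefficient are closely linked: the Gini coefficient equals twice the area between the line of equality and the Lorenz curve, that is, $1 - 2A$ where $A$ is the area under the Lorenz curve. The sketch below checks this numerically, assuming the Lorenz points are joined by straight lines (a trapezoid rule). The name `gini_from_lorenz` is hypothetical.

```python
import numpy as np

def gini_from_lorenz(values):
    """Gini as 1 - 2 * (trapezoid area under the Lorenz curve); hypothetical helper."""
    # assumption: non-empty, non-negative values with a positive sum
    y = np.sort(np.asarray(values, dtype=float))
    # cumulative proportions with the origin prepended
    lorenz = np.insert(y.cumsum() / y.sum(), 0, 0.0)
    width = 1.0 / y.size  # equal spacing of the points on the x-axis
    area = np.sum((lorenz[:-1] + lorenz[1:]) / 2.0) * width
    return 1.0 - 2.0 * area

area_based = gini_from_lorenz([0, 0, 0, 1])
```

For this input the area under the curve is $0.125$, so the result is $0.75$, the same value the rank-weighted formula used in `gini` gives for these four numbers.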

#### A wrapper function

def lorenz_curve(X):
    ## sort, then convert to cumulative proportions
    X_lorenz = np.sort(X).cumsum() / X.sum()
    ## prepend the origin (0, 0)
    X_lorenz = np.insert(X_lorenz, 0, 0)
    fig, ax = plt.subplots(figsize=[6,6])
    ## scatter plot of Lorenz curve
    ax.scatter(np.arange(X_lorenz.size)/(X_lorenz.size-1), X_lorenz,
               marker='x', color='darkgreen', s=100)
    ## line plot of equality
    ax.plot([0,1], [0,1], color='k')

X = np.append(np.random.poisson(lam=10, size=40),
              np.random.poisson(lam=100, size=10))

gini(X)

0.5620936639118459

lorenz_curve(X)