Introduction to Linear Programming with Python – Part 5

Using PuLP with pandas and binary constraints to solve a scheduling problem

In this example, we’ll be solving a scheduling problem. We have 2 offshore production plants in 2 locations and an estimated demand for our products.

We want to produce a schedule of production from both plants that meets our demand with the lowest cost.

A factory can be in 2 states:

Off – Producing zero units
On – Producing between its minimum and maximum production capacities

Both factories have fixed costs, that are incurred as long as the factory is on, and variable costs, a cost per unit of production. These vary month by month.

We also know that factory B is down for maintenance in month 5.

We’ll start by importing our data.

In [1]:

import pandas as pd
import pulp

In [2]:

factories = pd.DataFrame.from_csv('csv/factory_variables.csv', index_col=['Month', 'Factory'])
factories

Out[2]:

		Max_Capacity	Min_Capacity	Variable_Costs	Fixed_Costs
Month	Factory
1	A	100000	20000	10	500
1	B	50000	20000	5	600
2	A	110000	20000	11	500
2	B	55000	20000	4	600
3	A	120000	20000	12	500
3	B	60000	20000	3	600
4	A	145000	20000	9	500
4	B	100000	20000	5	600
5	A	160000	20000	8	500
5	B	0	0	0	0
6	A	140000	20000	8	500
6	B	70000	20000	6	600
7	A	155000	20000	5	500
7	B	60000	20000	4	600
8	A	200000	20000	7	500
8	B	100000	20000	6	600
9	A	210000	20000	9	500
9	B	100000	20000	8	600
10	A	197000	20000	10	500
10	B	100000	20000	11	600
11	A	80000	20000	8	500
11	B	120000	20000	10	600
12	A	150000	20000	8	500
12	B	150000	20000	12	600

We’ll also import our demand data

In [3]:

demand = pd.DataFrame.from_csv('csv/monthly_demand.csv', index_col=['Month'])

In [4]:

demand

Out[4]:

	Demand
Month
1	120000
2	100000
3	130000
4	130000
5	140000
6	130000
7	150000
8	170000
9	200000
10	190000
11	140000
12	100000

As we have fixed costs and variable costs, we’ll need to model both production and the status of the factory i.e. whether it is on or off.

Production is modelled as an integer variable.

We have a value for production for each month for each factory, this is given by the tuples of our multi-index pandas DataFrame index.

In [5]:

production = pulp.LpVariable.dicts("production",
                                     ((month, factory) for month, factory in factories.index),
                                     lowBound=0,
                                     cat='Integer')

Factory status is modelled as a binary variable. It will have a value of 1 if the factory is on and a value of 0 when the factory is off.

Binary variables are the same as integer variables but constrained to be >= 0 and <=1

Again this has a value for each month for each factory, again given by the index of our DataFrame

In [6]:

factory_status = pulp.LpVariable.dicts("factory_status",
                                     ((month, factory) for month, factory in factories.index),
                                     cat='Binary')

We instantiate our model and use LpMinimize as the aim is to minimise costs.

In [7]:

model = pulp.LpProblem("Cost minimising scheduling problem", pulp.LpMinimize)

In our objective function we include our 2 costs:

Our variable costs is the product of the variable costs per unit and production
Our fixed costs is the factory status – 1 (on) or 0 (off) – multiplied by the fixed cost of production

In [8]:

model += pulp.lpSum(
    [production[month, factory] * factories.loc[(month, factory), 'Variable_Costs'] for month, factory in factories.index]
    + [factory_status[month, factory] * factories.loc[(month, factory), 'Fixed_Costs'] for month, factory in factories.index]
)

We build up our constraints

In [9]:

# Production in any month must be equal to demand
months = demand.index
for month in months:
    model += production[(month, 'A')] + production[(month, 'B')] == demand.loc[month, 'Demand']

An issue we run into here is that in linear programming we can’t use conditional constraints.

For example we can’t add to our model that if the factory is off factory status must be 0, and if it is on factory status must be 1. Before we’ve solved our model though, we don’t know if the factory will be on or off in a given month.

In this case construct constraints that have minimum and maximum capacities that are constant variables, which we multiply by the factory status.

Now, either factory status is 0 and:

$ \text{min_production} \geq 0$
$ \text{max_production} \leq 0$

Or factory status is 1 and:

$ \text{min_production} \geq \text{min_capacity}$
$ \text{max_production} \leq \text{max_capacity}$

(In some cases we can use linear constraints to model conditional statements, we’ll explore this in part 6)

In [10]:

# Production in any month must be between minimum and maximum capacity, or zero.
for month, factory in factories.index:
    min_production = factories.loc[(month, factory), 'Min_Capacity']
    max_production = factories.loc[(month, factory), 'Max_Capacity']
    model += production[(month, factory)] >= min_production * factory_status[month, factory]
    model += production[(month, factory)] <= max_production * factory_status[month, factory]

In [11]:

# Factory B is off in May
model += factory_status[5, 'B'] == 0
model += production[5, 'B'] == 0

We then solve the model

In [12]:

model.solve()
pulp.LpStatus[model.status]

Out[12]:

'Optimal'

Let’s take a look at the optimal production schedule output for each month from each factory. For ease of viewing we’ll output the data to a pandas DataFrame.

In [13]:

output = []
for month, factory in production:
    var_output = {
        'Month': month,
        'Factory': factory,
        'Production': production[(month, factory)].varValue,
        'Factory Status': factory_status[(month, factory)].varValue
    }
    output.append(var_output)
output_df = pd.DataFrame.from_records(output).sort_values(['Month', 'Factory'])
output_df.set_index(['Month', 'Factory'], inplace=True)
output_df

Out[13]:

		Factory Status	Production
Month	Factory
1	A	1	70000
1	B	1	50000
2	A	1	45000
2	B	1	55000
3	A	1	70000
3	B	1	60000
4	A	1	30000
4	B	1	100000
5	A	1	140000
5	B	0	0
6	A	1	60000
6	B	1	70000
7	A	1	90000
7	B	1	60000
8	A	1	70000
8	B	1	100000
9	A	1	100000
9	B	1	100000
10	A	1	190000
10	B	0	0
11	A	1	80000
11	B	1	60000
12	A	1	100000
12	B	0	0

Notice above that the factory status is 0 when not producing and 1 when it is producing

In [14]:

# Print our objective function value (Total Costs)
print pulp.value(model.objective)

12906400.0

Introduction
Part 1 – Introduction to Linear Programming
Part 2 – Introduction to PuLP
Part 3 – Real world examples – Resourcing Problem
Part 4 – Real world examples – Blending Problem
Part 5 – Using PuLP with pandas and binary constraints to solve a scheduling problem
Part 6 – Mocking conditional statements using binary constraints