Introduction to Linear Programming with Python – Part 5

Using PuLP with pandas and binary constraints to solve a scheduling problem

In this example, we’ll be solving a scheduling problem. We have 2 offshore production plants in 2 locations and an estimated demand for our products.

We want to produce a schedule of production from both plants that meets our demand with the lowest cost.

A factory can be in 2 states:

  • Off – Producing zero units
  • On – Producing between its minimum and maximum production capacities

Both factories have fixed costs, that are incurred as long as the factory is on, and variable costs, a cost per unit of production. These vary month by month.

We also know that factory B is down for maintenance in month 5.

We’ll start by importing our data.

In [1]:
import pandas as pd
import pulp
In [2]:
factories = pd.DataFrame.from_csv('csv/factory_variables.csv', index_col=['Month', 'Factory'])
factories
Out[2]:
Max_Capacity Min_Capacity Variable_Costs Fixed_Costs
Month Factory
1 A 100000 20000 10 500
B 50000 20000 5 600
2 A 110000 20000 11 500
B 55000 20000 4 600
3 A 120000 20000 12 500
B 60000 20000 3 600
4 A 145000 20000 9 500
B 100000 20000 5 600
5 A 160000 20000 8 500
B 0 0 0 0
6 A 140000 20000 8 500
B 70000 20000 6 600
7 A 155000 20000 5 500
B 60000 20000 4 600
8 A 200000 20000 7 500
B 100000 20000 6 600
9 A 210000 20000 9 500
B 100000 20000 8 600
10 A 197000 20000 10 500
B 100000 20000 11 600
11 A 80000 20000 8 500
B 120000 20000 10 600
12 A 150000 20000 8 500
B 150000 20000 12 600

We’ll also import our demand data

In [3]:
demand = pd.DataFrame.from_csv('csv/monthly_demand.csv', index_col=['Month'])
In [4]:
demand
Out[4]:
Demand
Month
1 120000
2 100000
3 130000
4 130000
5 140000
6 130000
7 150000
8 170000
9 200000
10 190000
11 140000
12 100000

As we have fixed costs and variable costs, we’ll need to model both production and the status of the factory i.e. whether it is on or off.

Production is modelled as an integer variable.

We have a value for production for each month for each factory, this is given by the tuples of our multi-index pandas DataFrame index.

In [5]:
production = pulp.LpVariable.dicts("production",
                                     ((month, factory) for month, factory in factories.index),
                                     lowBound=0,
                                     cat='Integer')

Factory status is modelled as a binary variable. It will have a value of 1 if the factory is on and a value of 0 when the factory is off.

Binary variables are the same as integer variables but constrained to be >= 0 and <=1

Again this has a value for each month for each factory, again given by the index of our DataFrame

In [6]:
factory_status = pulp.LpVariable.dicts("factory_status",
                                     ((month, factory) for month, factory in factories.index),
                                     cat='Binary')

We instantiate our model and use LpMinimize as the aim is to minimise costs.

In [7]:
model = pulp.LpProblem("Cost minimising scheduling problem", pulp.LpMinimize)

In our objective function we include our 2 costs:

  • Our variable costs is the product of the variable costs per unit and production
  • Our fixed costs is the factory status – 1 (on) or 0 (off) – multiplied by the fixed cost of production
In [8]:
model += pulp.lpSum(
    [production[month, factory] * factories.loc[(month, factory), 'Variable_Costs'] for month, factory in factories.index]
    + [factory_status[month, factory] * factories.loc[(month, factory), 'Fixed_Costs'] for month, factory in factories.index]
)

We build up our constraints

In [9]:
# Production in any month must be equal to demand
months = demand.index
for month in months:
    model += production[(month, 'A')] + production[(month, 'B')] == demand.loc[month, 'Demand']

An issue we run into here is that in linear programming we can’t use conditional constraints.

For example we can’t add to our model that if the factory is off factory status must be 0, and if it is on factory status must be 1. Before we’ve solved our model though, we don’t know if the factory will be on or off in a given month.

In this case construct constraints that have minimum and maximum capacities that are constant variables, which we multiply by the factory status.

Now, either factory status is 0 and:

  • $ \text{min_production} \geq 0$
  • $ \text{max_production} \leq 0$

Or factory status is 1 and:

  • $ \text{min_production} \leq \text{min_capacity}$
  • $ \text{max_production} \leq \text{max_capacity}$

(In some cases we can use linear constraints to model conditional statements, we’ll explore this in part 6)

In [10]:
# Production in any month must be between minimum and maximum capacity, or zero.
for month, factory in factories.index:
    min_production = factories.loc[(month, factory), 'Min_Capacity']
    max_production = factories.loc[(month, factory), 'Max_Capacity']
    model += production[(month, factory)] >= min_production * factory_status[month, factory]
    model += production[(month, factory)] <= max_production * factory_status[month, factory]
In [11]:
# Factory B is off in May
model += factory_status[5, 'B'] == 0
model += production[5, 'B'] == 0

We then solve the model

In [12]:
model.solve()
pulp.LpStatus[model.status]
Out[12]:
'Optimal'

Let’s take a look at the optimal production schedule output for each month from each factory. For ease of viewing we’ll output the data to a pandas DataFrame.

In [13]:
output = []
for month, factory in production:
    var_output = {
        'Month': month,
        'Factory': factory,
        'Production': production[(month, factory)].varValue,
        'Factory Status': factory_status[(month, factory)].varValue
    }
    output.append(var_output)
output_df = pd.DataFrame.from_records(output).sort_values(['Month', 'Factory'])
output_df.set_index(['Month', 'Factory'], inplace=True)
output_df
Out[13]:
Factory Status Production
Month Factory
1 A 1 70000
B 1 50000
2 A 1 45000
B 1 55000
3 A 1 70000
B 1 60000
4 A 1 30000
B 1 100000
5 A 1 140000
B 0 0
6 A 1 60000
B 1 70000
7 A 1 90000
B 1 60000
8 A 1 70000
B 1 100000
9 A 1 100000
B 1 100000
10 A 1 190000
B 0 0
11 A 1 80000
B 1 60000
12 A 1 100000
B 0 0

Notice above that the factory status is 0 when not producing and 1 when it is producing

In [14]:
# Print our objective function value (Total Costs)
print pulp.value(model.objective)
12906400.0

Introduction
Part 1 – Introduction to Linear Programming
Part 2 – Introduction to PuLP
Part 3 – Real world examples – Resourcing Problem
Part 4 – Real world examples – Blending Problem
Part 5 – Using PuLP with pandas and binary constraints to solve a scheduling problem
Part 6 – Mocking conditional statements using binary constraints