Introduction to Linear Programming with Python – Part 5
Using PuLP with pandas and binary constraints to solve a scheduling problem
In this example, we’ll be solving a scheduling problem. We have 2 offshore production plants in 2 locations and an estimated demand for our products.
We want to produce a schedule of production from both plants that meets our demand with the lowest cost.
A factory can be in 2 states:
- Off – Producing zero units
- On – Producing between its minimum and maximum production capacities
Both factories have fixed costs, that are incurred as long as the factory is on, and variable costs, a cost per unit of production. These vary month by month.
We also know that factory B is down for maintenance in month 5.
We’ll start by importing our data.
import pandas as pd
import pulp
factories = pd.DataFrame.from_csv('csv/factory_variables.csv', index_col=['Month', 'Factory'])
factories
We’ll also import our demand data
demand = pd.DataFrame.from_csv('csv/monthly_demand.csv', index_col=['Month'])
demand
As we have fixed costs and variable costs, we’ll need to model both production and the status of the factory i.e. whether it is on or off.
Production is modelled as an integer variable.
We have a value for production for each month for each factory, this is given by the tuples of our multi-index pandas DataFrame index.
production = pulp.LpVariable.dicts("production",
((month, factory) for month, factory in factories.index),
lowBound=0,
cat='Integer')
Factory status is modelled as a binary variable. It will have a value of 1 if the factory is on and a value of 0 when the factory is off.
Binary variables are the same as integer variables but constrained to be >= 0 and <=1
Again this has a value for each month for each factory, again given by the index of our DataFrame
factory_status = pulp.LpVariable.dicts("factory_status",
((month, factory) for month, factory in factories.index),
cat='Binary')
We instantiate our model and use LpMinimize as the aim is to minimise costs.
model = pulp.LpProblem("Cost minimising scheduling problem", pulp.LpMinimize)
In our objective function we include our 2 costs:
- Our variable costs is the product of the variable costs per unit and production
- Our fixed costs is the factory status – 1 (on) or 0 (off) – multiplied by the fixed cost of production
model += pulp.lpSum(
[production[month, factory] * factories.loc[(month, factory), 'Variable_Costs'] for month, factory in factories.index]
+ [factory_status[month, factory] * factories.loc[(month, factory), 'Fixed_Costs'] for month, factory in factories.index]
)
We build up our constraints
# Production in any month must be equal to demand
months = demand.index
for month in months:
model += production[(month, 'A')] + production[(month, 'B')] == demand.loc[month, 'Demand']
An issue we run into here is that in linear programming we can’t use conditional constraints.
For example we can’t add to our model that if the factory is off factory status must be 0, and if it is on factory status must be 1. Before we’ve solved our model though, we don’t know if the factory will be on or off in a given month.
In this case construct constraints that have minimum and maximum capacities that are constant variables, which we multiply by the factory status.
Now, either factory status is 0 and:
- $ \text{min_production} \geq 0$
- $ \text{max_production} \leq 0$
Or factory status is 1 and:
- $ \text{min_production} \geq \text{min_capacity}$
- $ \text{max_production} \leq \text{max_capacity}$
(In some cases we can use linear constraints to model conditional statements, we’ll explore this in part 6)
# Production in any month must be between minimum and maximum capacity, or zero.
for month, factory in factories.index:
min_production = factories.loc[(month, factory), 'Min_Capacity']
max_production = factories.loc[(month, factory), 'Max_Capacity']
model += production[(month, factory)] >= min_production * factory_status[month, factory]
model += production[(month, factory)] <= max_production * factory_status[month, factory]
# Factory B is off in May
model += factory_status[5, 'B'] == 0
model += production[5, 'B'] == 0
We then solve the model
model.solve()
pulp.LpStatus[model.status]
Let’s take a look at the optimal production schedule output for each month from each factory. For ease of viewing we’ll output the data to a pandas DataFrame.
output = []
for month, factory in production:
var_output = {
'Month': month,
'Factory': factory,
'Production': production[(month, factory)].varValue,
'Factory Status': factory_status[(month, factory)].varValue
}
output.append(var_output)
output_df = pd.DataFrame.from_records(output).sort_values(['Month', 'Factory'])
output_df.set_index(['Month', 'Factory'], inplace=True)
output_df
Notice above that the factory status is 0 when not producing and 1 when it is producing
# Print our objective function value (Total Costs)
print pulp.value(model.objective)
Introduction
Part 1 – Introduction to Linear Programming
Part 2 – Introduction to PuLP
Part 3 – Real world examples – Resourcing Problem
Part 4 – Real world examples – Blending Problem
Part 5 – Using PuLP with pandas and binary constraints to solve a scheduling problem
Part 6 – Mocking conditional statements using binary constraints