Pricing Shared Rides

This archive is distributed in association with the journal Operations Research under the MIT License.

The software and data in this repository are a snapshot of the software and data that were used in the research reported in the paper Pricing Shared Rides by Chiwei Yan, Julia Yan, and Yifan Shen.

Cite

To cite the contents of this repository, please cite both the paper and this repo, using their respective DOIs.

https://doi.org/10.1287/opre.2023.0513

https://doi.org/10.1287/opre.2023.0513.cd

Below is the BibTeX entry for citing this snapshot of the repository.

@misc{yan2024pricing,
  author =        {Yan, Chiwei and Yan, Julia and Shen, Yifan},
  publisher =     {Operations Research},
  title =         {{Pricing Shared Rides}},
  year =          {2025},
  doi =           {10.1287/opre.2023.0513.cd},
  note =          {Available for download at https://github.com/ORJournal/2023.0513},
}

Description

The goal of this repository is to replicate the computational experiments described in Section 4 of the paper Pricing Shared Rides by Chiwei Yan, Julia Yan, and Yifan Shen.

Dependencies

This project was developed and tested using Python 3.9.16. The following Python packages are required:

numpy==1.24.3
pandas==1.5.3
gurobipy==11.0.2
networkx==3.1
osmnx==1.3.1
matplotlib==3.7.1
graph-tool==2.55
tqdm
jupyter
ipython
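
To confirm that an environment matches the pins above, a small check along the following lines may help. This is a minimal sketch and not part of the repository: the helper name check_pins is hypothetical, and graph-tool is left out on the assumption that a Conda-installed copy may not expose pip-style metadata.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the dependency list above (graph-tool omitted, see the note).
PINNED = {
    "numpy": "1.24.3",
    "pandas": "1.5.3",
    "gurobipy": "11.0.2",
    "networkx": "3.1",
    "osmnx": "1.3.1",
    "matplotlib": "3.7.1",
}


def check_pins(pinned):
    """Return {package: (expected, found)} for each mismatch.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


for name, (expected, found) in check_pins(PINNED).items():
    print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty report means every listed package is installed at the pinned version.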

Installation

It is recommended to use Conda to manage the Python environment and install the dependencies. Run the following commands in your terminal to create and activate the Conda environment:

conda env create -f environment.yml
conda activate shared_pricing_env

Important:

- This project relies on the graph-tool package. On Linux and macOS, the Conda command above installs graph-tool, so no separate installation is needed. On Windows, refer to the graph-tool installation instructions for guidance on using Docker.
- This project uses Gurobi as the optimization solver. You must activate a valid license (see Gurobi).
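
Before starting a long run, it can be useful to check that gurobipy can find a license. The sketch below is an illustration, not part of the repository; the function name gurobi_license_status is hypothetical. It relies on Gurobi looking for a license when the first model is created. Note that a pip-installed gurobipy ships with a size-limited license, so a successful check does not by itself guarantee that the full instance can be solved.

```python
def gurobi_license_status():
    """Return a short status string describing whether Gurobi can start."""
    try:
        import gurobipy as gp
    except ImportError:
        return "gurobipy is not installed"
    try:
        # Creating a model starts the default environment, which is the
        # point at which Gurobi looks for a license.
        model = gp.Model("license_check")
        model.dispose()
    except gp.GurobiError as err:
        return f"license problem: {err}"
    return "license OK"


print(gurobi_license_status())
```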

Hardware Requirements

The experiments were conducted on a Linux-based virtual machine with 8 CPU cores and 128 GB of RAM.
It is recommended to replicate the results on a machine with at least 4 CPU cores and at least 25 GB of RAM.
Running the script for a single representative parameter setting usually took 7-10 hours (to finish the complete training and evaluation process) on the above machine, with a peak memory usage around 20 GB RAM.

Replicating results

Scripts for running experiments and visualizing results are provided in the scripts folder. The source code for the algorithms is located in the src folder, and the input data is in the data folder. The plots (corresponding to Figures 9, 10, and 11 in the paper) are saved in the results folder.

To run a quick example interactively, use:
```
scripts/computational_experiments.ipynb
```
This notebook allows you to run a quick demo with a small instance with only 10 rider types (compared to 244 in the full instance), which runs in about 10 minutes and requires approximately 7 GB of RAM. To run a larger or full instance, you can adjust the parameter n_quick in the notebook.
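To run all experiments under all parameter settings and generate the full results (ts, from the moreutils package, prefixes each output line with a timestamp and can be omitted):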
```
python scripts/computational_experiments.py | ts
```
The results include the data for Table 2 (test set performance metrics) and Table EC2 (training set performance metrics) of the paper, and the plots for Figures 9–11 (map visualizations for one representative setting, c=0.9 USD/mile and sojourn_time=300 seconds). To generate results for a subset of parameter settings, change c_values and sojourn_time_values at the end of the script.
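
As an illustration of how a subset of settings might be enumerated, the sketch below pairs every c with every sojourn time. It is not taken from the script: the values shown are only the two settings named in this README, the helper parameter_settings is hypothetical, and iterating over the full Cartesian product is an assumption about how the script combines the two lists.

```python
from itertools import product

# Illustrative values only: the settings mentioned in this README.
# The actual lists are defined at the end of scripts/computational_experiments.py.
c_values = [0.7, 0.9]            # USD/mile
sojourn_time_values = [30, 300]  # seconds


def parameter_settings(c_values, sojourn_time_values):
    """Return every (c, sojourn_time) pair, assuming a full grid is run."""
    return list(product(c_values, sojourn_time_values))


for c, sojourn_time in parameter_settings(c_values, sojourn_time_values):
    print(f"c={c}, sojourn_time={sojourn_time}")
```

Shortening either list shrinks the grid accordingly, which matters given the multi-hour runtime of a single setting.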

Codebase Structure

`scripts/`

This directory contains scripts for running computational experiments and generating results from the paper.

computational_experiments.py runs the full set of experiments across all parameter settings and saves results to the results folder.
computational_experiments.ipynb is an interactive notebook version for a specific parameter setting. You can run a quick demo with fewer rider types or run a full instance by changing the parameters in the notebook.

`src/`

This directory contains the source files implementing the algorithms:

instance_data.py contains the class that stores input data and parameters for each instance.
network.py contains the class for modeling the network and computing distances.
policies.py contains the class to optimize the pricing and matching policies.
simulation.py contains the class to simulate shared ride operations under given policies.
evaluator.py contains the class to evaluate the performance metrics of the policies based on simulation results.
utils.py contains utility functions.

`data/`

This directory contains the input data used in the experiments.

Chicago_network.pickle contains the Chicago road network data processed based on OpenStreetMap.
Chicago_zone.pickle contains the zone data for Chicago, which includes:
- shapes_gdf is the data for the original 76 community areas (excluding O’Hare International Airport) in Chicago, from CARTO.
- clusters_gdf is the 42 zones aggregated via k-means clustering.
Chicago_rider_types.csv contains the aggregated rider types data for Chicago. Each rider type corresponds to a pick-up node, a drop-off node, and an arrival rate (# per second).
- The rider types are aggregated based on the real Chicago shared ride requests data over an eight-week horizon in October and November 2019 during Monday morning peak hours (7:30-8:30 a.m., from 2019-10-07 to 2019-11-25). The original ride-sharing data is available at the Chicago Data Portal.
- A two-step clustering method is used to construct the rider types (244 in the full instance):
  - The 76 original community areas of Chicago are grouped into 42 aggregated zones using k-means clustering (as in Chicago_zone.pickle).
  - Within each pick-up zone, riders are further grouped by their drop-off locations using another round of k-means clustering.
- Each resulting rider type has a minimum of 5 trip records in the training dataset.
Chicago_demand.pickle contains the arrival data for the training set (weeks 1-7) and the test set (week 8), which includes:
- arrival_types: maps dataset (training or test) and date to a rider type list.
- arrival_times: maps dataset (training or test) and date to an arrival time (0 to 3600 seconds) list.
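
To make the relationship between the arrival lists and a per-second arrival rate concrete, the sketch below computes empirical rates from toy data. It is an illustration under stated assumptions, not the repository's code: the nested layout {dataset: {date: list}}, the integer rider type labels, and the function name empirical_arrival_rates are all hypothetical, and the rates in Chicago_rider_types.csv may have been computed differently.

```python
from collections import Counter

HORIZON_SECONDS = 3600  # one-hour window, matching arrival times of 0 to 3600 seconds


def empirical_arrival_rates(arrival_types_by_date):
    """Average arrivals per second for each rider type.

    arrival_types_by_date: {date: [rider_type, ...]} for one dataset
    (assumed layout).
    """
    counts = Counter()
    for types in arrival_types_by_date.values():
        counts.update(types)
    total_seconds = HORIZON_SECONDS * len(arrival_types_by_date)
    return {rider_type: n / total_seconds for rider_type, n in counts.items()}


# Toy data in the assumed layout, not real records.
arrival_types = {
    "training": {
        "2019-10-07": [0, 1, 1, 2],
        "2019-10-14": [1, 2],
    },
    "test": {"2019-11-25": [0, 1]},
}

rates = empirical_arrival_rates(arrival_types["training"])
for rider_type, rate in sorted(rates.items()):
    print(f"rider type {rider_type}: {rate:.6f} arrivals per second")
```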

`results/`

This directory stores the results of the computational experiments, including:

tables/ contains the data for Tables 2 and EC2 in the paper. Each CSV file corresponds to a specific parameter setting (c and sojourn_time) under the training or test dataset.
figures/ contains the map visualizations for Figures 9, 10, and 11 in the paper. Note that Figure 11(a) only shows the demand density distribution and is not part of the computational experiment results, so it is not included.
log_c=0.7_sojourn_time=30.txt is a sample log file generated during the execution of computational_experiments.py under the parameter setting c = 0.7 USD/mile and sojourn_time = 30 seconds.
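
When collecting outputs from several runs, the parameter setting can be recovered from a log file name. The sketch below assumes that all log files follow the pattern of the sample name above; the helper parse_log_name is hypothetical and not part of the repository.

```python
import re

# Pattern inferred from the sample file name log_c=0.7_sojourn_time=30.txt.
LOG_NAME = re.compile(
    r"^log_c=(?P<c>[0-9]+(?:\.[0-9]+)?)_sojourn_time=(?P<sojourn_time>[0-9]+)\.txt$"
)


def parse_log_name(filename):
    """Return (c, sojourn_time) parsed from a log file name, or None if it does not match."""
    match = LOG_NAME.match(filename)
    if match is None:
        return None
    return float(match["c"]), int(match["sojourn_time"])


print(parse_log_name("log_c=0.7_sojourn_time=30.txt"))
```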
