This repository was archived by the owner on Sep 24, 2025. It is now read-only.

Implement gatherling.com scraper #15

@Badaro

This has been suggested by @Aliquanto3 as a way to get Premodern data into our dataset, and bakert commented on Discord that we could also use this for Penny Dreadful.

There are two paths to go here.

Plan A: Database dump. According to bakert they generate a database dump (with some things redacted) every 24h here:
https://pennydreadfulmagic.com/static/dev-db.sql.gz

This is a MariaDB data dump, so the process should be fairly simple:

  • Automate the creation of a Docker container running MariaDB that imports this dump
  • Extract data from the DB dump
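A minimal sketch of the first step, assuming the official `mariadb` Docker image is used and relying on its behavior of importing any `.sql.gz` file mounted under `/docker-entrypoint-initdb.d` on first start. The image tag, container name, password, database name and helper names below are placeholders for illustration, not something decided in this issue. It is also untested whether the dump imports cleanly into an empty database.

```python
import urllib.request
from pathlib import Path

# URL of the daily dump, as given above.
DUMP_URL = "https://pennydreadfulmagic.com/static/dev-db.sql.gz"


def download_dump(dest_dir: Path, url: str = DUMP_URL) -> Path:
    """Download the compressed dump into dest_dir and return its path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / "dev-db.sql.gz"
    urllib.request.urlretrieve(url, target)
    return target


def docker_run_command(
    dump_dir: Path,
    name: str = "gatherling-dump",   # hypothetical container name
    password: str = "changeme",      # placeholder root password
    database: str = "gatherling",    # assumed database name
) -> list[str]:
    """Build the `docker run` arguments for a throwaway MariaDB container.

    The dump directory is mounted read-only as the image's init directory,
    so the server imports the dump the first time it starts.
    """
    return [
        "docker", "run", "--detach", "--rm",
        "--name", name,
        "-e", f"MARIADB_ROOT_PASSWORD={password}",
        "-e", f"MARIADB_DATABASE={database}",
        "-v", f"{dump_dir.resolve()}:/docker-entrypoint-initdb.d:ro",
        "-p", "3306:3306",
        "mariadb:11",                # assumed image tag
    ]
```

To try it, call `download_dump(Path("dumps"))` and pass the result of `docker_run_command(Path("dumps"))` to `subprocess.run(..., check=True)`. Extraction would then be ordinary SQL queries against port 3306 once the import finishes.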

We could also explore using embedded MariaDB, which would simplify a few things, but it doesn't look like they provide Windows builds for that.

Plan B: Scraping. There's an eventinfo route that seems to contain all the info we need for an individual tournament.
https://gatherling.com/api.php?action=eventinfo&event=Pre-Modern%20Monthly%20League%2011.05
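A rough sketch of how the scraper could call this route. It assumes the response body is JSON, which should be checked against the `api.php` source. The function names are hypothetical and the structure of the returned data is not modeled here.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://gatherling.com/api.php"


def event_info_url(event_name: str) -> str:
    """Build the eventinfo URL for a tournament name.

    quote_via=quote encodes spaces as %20, matching the example URL above
    (the default would encode them as '+').
    """
    query = urllib.parse.urlencode(
        {"action": "eventinfo", "event": event_name},
        quote_via=urllib.parse.quote,
    )
    return f"{API_BASE}?{query}"


def fetch_event_info(event_name: str, timeout: float = 30.0):
    """Fetch one event and parse it, assuming the route returns JSON."""
    with urllib.request.urlopen(event_info_url(event_name), timeout=timeout) as resp:
        return json.load(resp)
```

For example, `event_info_url("Pre-Modern Monthly League 11.05")` reproduces the URL above, and `fetch_event_info(...)` with the same name would return the parsed payload for that tournament.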

The only thing missing is a way to list older events. There's an event list page here, but it is a poor fit for the way the scraper works, since there's no way to navigate by date.
https://gatherling.com/eventreport.php

There's no documentation for the API, but the code is available on GitHub, so we can also check whether there are other routes that could help:
https://github.com/PennyDreadfulMTG/gatherling/blob/dev/gatherling/api.php
