Clean, structured datasets from Sri Lankan government sources
5 Years of Data | 4 Key Ministries | Multiple Departments
- Foreign Affairs & Relations
- Immigration & Emigration
- Foreign Employment
- Tourism Development
- 🏛️ Foreign Affairs: Diplomatic missions, communications, organizational data
- 🛂 Immigration: Asylum seekers, visas, passports, refugee statistics
- 💼 Employment: Worker complaints, remittances, registration data, legal performance
- 🏖️ Tourism: Arrivals, accommodations, occupancy rates, revenue statistics
Note
🚨 Action Required: View the Missing Datasets Report to see which datasets need to be populated.
| Dataset Name | Years Available | Collection Status | Verification Status |
|---|---|---|---|
| Media Releases from Ministry of Foreign Affairs | 2020, 2021, 2022, 2023 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023) |
| Cadre Management of Ministry of Foreign Relations | 2020, 2022 | ✅ Collected | ✅ Verified (2020, 2022) |
| Dataset Name | Years Available | Collection Status | Verification Status |
|---|---|---|---|
| Accommodations by District | 2020, 2021, 2022, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023, 2024) |
| Accommodations by Province | 2020, 2021, 2022, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023, 2024) |
| Annual Tourism Receipts | 2020, 2021, 2022, 2023 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023) |
| Arrivals by Age | 2020, 2021, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2023, 2024) |
| Arrivals by Carrier | 2020, 2021, 2022, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023, 2024) |
| Arrivals by Country | 2020, 2021, 2022 | ✅ Collected | ✅ Verified (2020, 2021, 2022) |
| Arrivals by Month | 2020, 2021, 2022, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023, 2024) |
| Arrivals by Port | 2020, 2021, 2022, 2023 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023) |
| Arrivals by Purpose | 2020, 2021, 2022, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023, 2024) |
| Arrivals by Sex | 2020, 2021, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2023, 2024) |
| Arrivals by Month vs Country | 2020, 2021, 2022, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023, 2024) |
| Location vs Revenue vs Visitors Count | 2020, 2021, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2023, 2024) |
| Occupancy Rate by District | 2020, 2021 | ✅ Collected | ✅ Verified (2020, 2021) |
| Occupancy Rate by Month | 2020, 2021 | ✅ Collected | ✅ Verified (2020, 2021) |
| Top 10 source markets | 2020, 2021, 2022, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023, 2024) |
| Dataset Name | Years Available | Collection Status | Verification Status |
|---|---|---|---|
| Number of complaints received | 2020, 2021, 2022, 2023 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023) |
| Number of complaints resolved | 2020, 2021, 2022, 2023 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023) |
| Legal division performance | 2020, 2021, 2022, 2023 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023) |
| Local arrivals | 2020, 2021, 2022, 2023 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023) |
| Local departures | 2020, 2021, 2022, 2023 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023) |
| Monthly foreign exchange earnings | 2020, 2021, 2022, 2023 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023) |
| Number of raids conducted | 2022 | ✅ Collected | ✅ Verified (2022) |
| Private Remittances (Region-wise) | 2020, 2021 | ✅ Collected | ✅ Verified (2020, 2021) |
| SLBFE Registration by Age & Manpower Level | 2020, 2021, 2022, 2023 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023) |
| SLBFE Registration by Age | 2020, 2021, 2022, 2023 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023) |
| SLBFE registration by country vs manpower level | 2020, 2021, 2022, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023, 2024) |
| SLBFE registration by country | 2020, 2021, 2022 | ✅ Collected | ✅ Verified (2020, 2021, 2022) |
| SLBFE Registration by District, Manpower Level & Gender - 2020 | 2020 | ✅ Collected | ✅ Verified (2020) |
| SLBFE Registration by District, Manpower Level & Gender - 2023 | 2023 | ✅ Collected | ✅ Verified (2023) |
| SLBFE Registration by District, Manpower Level & Gender | 2021, 2022 | ✅ Collected | ✅ Verified (2021, 2022) |
| SLBFE Registration by district | 2024 | ✅ Collected | ✅ Verified (2024) |
| SLBFE registration by gender | 2020, 2021, 2022, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2024) |
| SLBFE Registration by Manpower Level & Gender | 2020, 2021, 2022, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023, 2024) |
| SLBFE registration by manpower level | 2020, 2021, 2022, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023, 2024) |
| SLBFE Registration through Private Sources by Country | 2020, 2021 | ✅ Collected | ✅ Verified (2020, 2021) |
| SLBFE Registration all Sources by Country | 2022 | ✅ Collected | ✅ Verified (2022) |
| Workers Remittances | 2020, 2021, 2022, 2023 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023) |
State Minister of Internal Security / Minister of Investment Planning / Minister of Investment Promotion / Minister of Public Security and Parliamentary Affairs
| Dataset Name | Years Available | Collection Status | Verification Status |
|---|---|---|---|
| asylum_seekers_by_nationality | 2020, 2021, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2023) |
| deportations_by_nationality | 2020, 2021, 2022, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023, 2024) |
| refugees_by_nationality | 2020, 2021, 2022, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023, 2024) |
| refused_entry_by_nationality | 2020, 2021, 2022, 2023, 2024 | ✅ Collected | ✅ Verified (2020, 2021, 2022, 2023) |
| fake_passport_detection_by_nationality | 2023 | ✅ Collected | ✅ Verified (2023) |
| fraudulent_visa_detection_by_nationality | 2023 | ✅ Collected | ✅ Verified (2023) |
- 2019 ❗❗ This data has not been verified yet
- 2020-2021
- 2022-2023
- 2024
📖 Browse all data interactively →
🌐 View online at GitHub Pages →
All datasets are in clean JSON format with metadata.
This repository contains cleaned and organized datasets from various Sri Lankan government public sources, compiled by the Lanka Data Foundation. The data spans from 2019 to 2024 and covers multiple ministries and departments.
To run the data ingestion and utility scripts, you'll need to set up the Python environment. We recommend using Mamba (or Conda).
- Create the environment:

      mamba env create -f environment.yml

  If using Conda:

      conda env create -f environment.yml

- Activate the environment:

      mamba activate datasets_env

- Run the scripts:

      # Run the optimized ingestion script
      python insert.py

      # Run the attribute writer (optional year filter)
      python write_attributes.py --year 2023
- Total Years: 6 (2019-2024)
- Total Datasets: 175+ JSON files
- Ministries Covered: 4 main categories
- Data Sources: Public government sources
datasets/
├── data/ # Main data directory
│ ├── 2019/ # Year-based organization
│ ├── 2020/
│ ├── 2021/
│ ├── 2022/
│ └── 2023/
├── generate_static_html.py # HTML generator script
├── index.html # Generated static HTML
├── styles.css # CSS stylesheet
└── README.md # This file
Data is organized hierarchically:
- Year → Government → President → Ministry → Department → Data Files
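As a rough illustration of this hierarchy, the sketch below walks a `data/` directory and reports, for each `data.json` it finds, the year and the folder levels beneath it. The function name `iter_datasets` is hypothetical and not part of the repository's scripts; it only assumes the year folder is the first level under `data/`.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_datasets(data_dir: Path) -> Iterator[Tuple[str, Tuple[str, ...], Path]]:
    """Yield (year, folder levels below the year, dataset folder) per data.json."""
    for data_file in sorted(data_dir.rglob("data.json")):
        relative = data_file.parent.relative_to(data_dir)
        if not relative.parts:
            continue  # a data.json directly under data/ has no year folder
        year, *levels = relative.parts
        yield year, tuple(levels), data_file.parent
```

Calling `list(iter_datasets(Path("data")))` from the repository root would give one tuple per dataset, with the government, president, ministry, and department folder names in `levels`.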
Each dataset contains:
- data.json - The main dataset
- metadata.json - Metadata about the dataset (optional)
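A minimal sketch of reading one dataset folder, assuming only the two file names listed above. `load_dataset` is a hypothetical helper, not a function from the repository; it treats `metadata.json` as optional and makes no assumption about the JSON structure inside either file.

```python
import json
from pathlib import Path
from typing import Any, Optional, Tuple


def load_dataset(folder: Path) -> Tuple[Any, Optional[Any]]:
    """Return (data, metadata) for one dataset folder; metadata is None if absent."""
    with (folder / "data.json").open(encoding="utf-8") as f:
        data = json.load(f)

    metadata = None
    metadata_path = folder / "metadata.json"
    if metadata_path.is_file():  # metadata.json is optional
        with metadata_path.open(encoding="utf-8") as f:
            metadata = json.load(f)
    return data, metadata
```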
- Create a new folder under data/ (e.g., data/2024/)
- Follow the existing folder structure:

      data/2024/
      └── Government of Sri Lanka(government)/
          └── [President Name](citizen)/
              └── [Ministry Name](minister)/
                  └── [Department Name](department)/
                      ├── [category]/
                      │   ├── data.json
                      │   └── metadata.json (optional)
- Navigate to the appropriate year folder in data/
- Follow the existing hierarchy to find the correct ministry/department
- Add your data.json and optional metadata.json files
- data.json: Must contain valid JSON data
- metadata.json: Optional, should contain dataset metadata (description, source, etc.)
- Files must be placed in appropriately named folders with category indicators
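Before committing new files, a quick local check along these lines can catch the most common problems. This is a sketch under the requirements listed above, not the repository's own validation; `check_dataset_folder` is a hypothetical name, and it only checks that the files exist and parse as JSON.

```python
import json
from pathlib import Path
from typing import List


def check_dataset_folder(folder: Path) -> List[str]:
    """Return a list of problems found in one dataset folder (empty if none)."""
    problems = []
    if not (folder / "data.json").is_file():
        problems.append(f"{folder / 'data.json'}: missing")

    for name in ("data.json", "metadata.json"):
        path = folder / name
        if not path.is_file():
            continue  # metadata.json is optional; a missing data.json is reported above
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError) as exc:
            problems.append(f"{path}: invalid JSON ({exc})")
    return problems
```

An empty list means the folder passed these basic checks; it does not say anything about whether the folder name follows the category conventions.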
The API documentation website is built with Jekyll on GitHub Pages. The data listing is auto-generated and injected into docs/index.md.
To update the data listing:
- Run the update script:

      python3 update_dataset_index.py

- This will:
  - Scan the data/ directory.
  - Generate ZIP files for each year.
  - Inject the file listing into docs/index.md.
- Commit and push changes to main branch.
- Automatically created for each year folder
- Contains all JSON files from that year
- Named as [YEAR]_Data.zip (e.g., 2019_Data.zip)
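The ZIP behaviour described above could be approximated with the standard library as follows. This is an illustrative sketch, not the code in `update_dataset_index.py`; the function name `zip_year` and its parameters are assumptions, and only the `[YEAR]_Data.zip` naming and the "all JSON files from that year" rule come from this README.

```python
import zipfile
from pathlib import Path


def zip_year(data_dir: Path, year: str, out_dir: Path) -> Path:
    """Bundle every .json file under data_dir/year into out_dir/[YEAR]_Data.zip."""
    year_dir = data_dir / year
    zip_path = out_dir / f"{year}_Data.zip"
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for json_file in sorted(year_dir.rglob("*.json")):
            # Store paths relative to the year folder so the hierarchy is preserved.
            archive.write(json_file, arcname=json_file.relative_to(year_dir).as_posix())
    return zip_path
```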
- Interactive collapsible sections
- Download buttons for yearly ZIP files
- In-browser JSON viewer with copy/download functionality
- Responsive design with CSS styling
- Use (government), (citizen), (minister), (department) suffixes for proper categorization
- Use (AS_CATEGORY) for sub-categories
- Underscores in folder names will be converted to spaces in display
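To make the naming conventions concrete, here is one way a folder name could be split into a display name and its suffix. The helper `parse_folder_name` and its regular expression are assumptions for illustration; the actual parsing in `generate_static_html.py` may differ.

```python
import re
from typing import Optional, Tuple

# Matches a trailing "(suffix)" such as "(minister)" or "(AS_CATEGORY)".
_SUFFIX_RE = re.compile(r"^(?P<name>.*?)\s*\((?P<kind>[A-Za-z_]+)\)$")


def parse_folder_name(folder_name: str) -> Tuple[str, Optional[str]]:
    """Split a folder name into (display_name, kind); kind is None without a suffix."""
    match = _SUFFIX_RE.match(folder_name)
    if match:
        name, kind = match.group("name"), match.group("kind")
    else:
        name, kind = folder_name, None
    # Underscores are shown as spaces in the display name.
    return name.replace("_", " ").strip(), kind
```

For example, `parse_folder_name("Government of Sri Lanka(government)")` gives the display name `Government of Sri Lanka` with kind `government`, while a name without a suffix keeps `None` as its kind.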
Edit the get_emoji_for_type() function in generate_static_html.py:

    emoji_map = {
        'your_category': '🎯',
        # ... existing mappings
    }

Edit styles.css to customize the appearance:
- Colors, fonts, spacing
- Responsive breakpoints
- Modal styling for JSON viewer
The script automatically counts datasets, but you can manually update the description in the main() function.
The generated index.html is ready for deployment on:
- GitHub Pages
- Any static hosting service
- Local web servers
For any enquiries please contact: [email protected]
Codebase at: https://github.com/LDFLK/datasets
See LICENSE file for details.