89 changes: 89 additions & 0 deletions .github/ISSUE_TEMPLATE/add-tool.yml
@@ -0,0 +1,89 @@
name: Add New Tool
description: Suggest a new data engineering tool to be added to the list
title: "[ADD] "
labels: ["enhancement", "new-tool"]
body:
  - type: markdown
    attributes:
      value: |
        Thanks for suggesting a new tool! Please provide the following information to help us evaluate your suggestion.

  - type: input
    id: tool-name
    attributes:
      label: Tool Name
      description: What is the name of the tool?
      placeholder: ex. Apache Spark
    validations:
      required: true

  - type: input
    id: tool-url
    attributes:
      label: Tool URL
      description: Official website or repository URL
      placeholder: https://spark.apache.org/
    validations:
      required: true

  - type: dropdown
    id: category
    attributes:
      label: Category
      description: Which category does this tool belong to?
      options:
        - Data Ingestion
        - Data Storage
        - Data Transformation
        - Orchestration & Workflow
        - Stream Processing
        - Batch Processing
        - Data Quality & Observability
        - Data Discovery & Governance
        - Reverse ETL
        - Analytics & Visualization
        - AI/ML & LLM Infrastructure
        - Infrastructure & Deployment
        - Other (specify in description)
    validations:
      required: true

  - type: textarea
    id: description
    attributes:
      label: Tool Description
      description: Provide a one-sentence description (under 100 characters)
      placeholder: "Unified engine for large-scale data processing with SQL, streaming, ML, and graph support."
    validations:
      required: true

  - type: textarea
    id: rationale
    attributes:
      label: Why should this tool be included?
      description: |
        Explain why this tool deserves to be on the list:
        - Is it production-ready and actively maintained?
        - Is it used by known companies at scale?
        - Does it solve a problem not addressed by existing entries?
      placeholder: |
        - Used by 80% of Fortune 500 companies
        - Active development with releases every 2 months
        - Provides unique capabilities for X use case
    validations:
      required: true

  - type: checkboxes
    id: checklist
    attributes:
      label: Submission Checklist
      description: Please confirm the following
      options:
        - label: The tool is production-ready and actively maintained
          required: true
        - label: I have checked that this tool is not already listed
          required: true
        - label: The tool has real-world production usage
          required: true
        - label: I have read the [contribution guidelines](../contributing.md)
          required: true
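A minimal sketch of how a maintainer script might read a submission made through this form. It assumes GitHub renders each answer under a `### <label>` heading in the issue body; `parseIssueForm` and `descriptionFits` are hypothetical helpers, not part of this PR.

```javascript
// Hypothetical helper: split a rendered issue-form body into { label: answer } pairs.
// Assumes each answer sits under a "### <label>" heading.
function parseIssueForm(body) {
  const fields = {};
  // The first chunk (text before the first heading) is dropped.
  const sections = body.split(/^### /m).slice(1);
  for (const section of sections) {
    const newline = section.indexOf('\n');
    const label = (newline === -1 ? section : section.slice(0, newline)).trim();
    const answer = newline === -1 ? '' : section.slice(newline + 1).trim();
    fields[label] = answer;
  }
  return fields;
}

// Hypothetical check mirroring the "under 100 characters" rule stated in the form.
function descriptionFits(fields, limit = 100) {
  const text = fields['Tool Description'] || '';
  return text.length > 0 && text.length < limit;
}

// Sample body built from the form's own placeholders.
const sample = [
  '### Tool Name', '', 'Apache Spark', '',
  '### Tool URL', '', 'https://spark.apache.org/', '',
  '### Tool Description', '', 'Unified analytics engine for large-scale data processing.',
].join('\n');

const fields = parseIssueForm(sample);
console.log(fields['Tool Name'], descriptionFits(fields));
```

The form itself cannot enforce the length limit, so a check like this would have to run separately, for example in a triage workflow.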
53 changes: 53 additions & 0 deletions .github/ISSUE_TEMPLATE/broken-link.yml
@@ -0,0 +1,53 @@
name: Report Broken Link
description: Report a broken or outdated link
title: "[BROKEN LINK] "
labels: ["broken-link", "bug"]
body:
  - type: markdown
    attributes:
      value: |
        Thanks for helping us maintain quality links!

  - type: input
    id: broken-url
    attributes:
      label: Broken URL
      description: What is the broken link?
      placeholder: https://example.com/tool
    validations:
      required: true

  - type: input
    id: tool-name
    attributes:
      label: Tool/Resource Name
      description: What tool or resource does this link belong to?
      placeholder: Apache Kafka
    validations:
      required: true

  - type: dropdown
    id: issue-type
    attributes:
      label: Issue Type
      options:
        - 404 - Page not found
        - 500 - Server error
        - Timeout - Site not responding
        - Moved - URL has changed
        - Other
    validations:
      required: true

  - type: input
    id: suggested-url
    attributes:
      label: Suggested Replacement URL (if known)
      description: If you know the new URL, please provide it
      placeholder: https://new-url.com

  - type: textarea
    id: additional-context
    attributes:
      label: Additional Context
      description: Any other information that might help
57 changes: 57 additions & 0 deletions .github/ISSUE_TEMPLATE/update-tool.yml
@@ -0,0 +1,57 @@
name: Update Existing Tool
description: Suggest updates to an existing tool entry
title: "[UPDATE] "
labels: ["update"]
body:
  - type: input
    id: tool-name
    attributes:
      label: Tool Name
      description: Which tool needs updating?
      placeholder: ex. dbt
    validations:
      required: true

  - type: dropdown
    id: update-type
    attributes:
      label: Update Type
      description: What kind of update is needed?
      options:
        - Description update
        - URL change
        - Category change
        - Add sub-tool/related project
        - Remove deprecated tool
        - Other
    validations:
      required: true

  - type: textarea
    id: current-entry
    attributes:
      label: Current Entry
      description: Copy the current entry from README
      placeholder: |
        * [Tool](url) - Current description
    validations:
      required: true

  - type: textarea
    id: proposed-change
    attributes:
      label: Proposed Change
      description: What should it be changed to?
      placeholder: |
        * [Tool](new-url) - Updated description
    validations:
      required: true

  - type: textarea
    id: rationale
    attributes:
      label: Rationale
      description: Why is this update necessary?
      placeholder: The tool was acquired, URL changed, new features added, etc.
    validations:
      required: true
58 changes: 58 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,58 @@
# Pull Request

## Description

<!-- Provide a clear description of what this PR does -->

## Type of Change

<!-- Check all that apply -->

- [ ] Add new tool(s)
- [ ] Update existing tool(s)
- [ ] Fix broken link(s)
- [ ] Update documentation
- [ ] Fix typo/formatting
- [ ] Remove deprecated tool(s)

## Checklist

<!-- Make sure you've completed all items before submitting -->

### General

- [ ] I have read the [contribution guidelines](../contributing.md)
- [ ] My PR follows the project's formatting standards
- [ ] I have checked that no similar PR exists

### For New Tools

- [ ] Tool is actively maintained (commits within last 6 months)
- [ ] Tool is production-ready and used at scale
- [ ] Tool solves a problem not addressed by existing entries
- [ ] Description is clear, concise, and actionable (< 100 characters)
- [ ] Link uses HTTPS where available
- [ ] Link points to official source (not third-party)
- [ ] Tool is placed in the correct category and subcategory
- [ ] Tool is added to the **bottom** of the appropriate subcategory

### For Updates

- [ ] Changes accurately reflect current state of the tool
- [ ] Links have been verified and are working
- [ ] Description improvements are factually correct

### For Link Fixes

- [ ] New URL has been verified and tested
- [ ] New URL points to official source

## Additional Context

<!-- Add any additional information, screenshots, or context here -->

## Related Issues

<!-- Reference any related issues using #issue_number -->

Closes #

---

**Thank you for contributing to Awesome Data Engineering!** 🚀
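Several "For New Tools" checklist items are mechanical enough to check in code. The sketch below assumes entries follow the `[Tool](url) - Description` shape shown in the issue template placeholders; `checkEntry` and its messages are hypothetical, and the HTTPS finding is best read as a warning, since the checklist only asks for HTTPS where available.

```javascript
// Hypothetical pattern for one list entry: "- [Name](url) - Description".
// Both "-" and "*" bullets are accepted because the repository's README style is not shown here.
const ENTRY = /^[-*] \[([^\]]+)\]\(([^)\s]+)\) - (.+)$/;

// Returns a list of problems; an empty list means the entry passed these checks.
function checkEntry(line, limit = 100) {
  const match = ENTRY.exec(line.trim());
  if (!match) return ['entry does not match "[Name](url) - Description"'];
  const [, , url, description] = match;
  const problems = [];
  if (!url.startsWith('https://')) problems.push('link does not use HTTPS');
  if (description.length >= limit) {
    problems.push(`description is ${description.length} characters (limit ${limit})`);
  }
  return problems;
}

console.log(checkEntry('- [Apache Spark](https://spark.apache.org/) - Unified analytics engine for large-scale data processing.'));
console.log(checkEntry('- [Example](http://example.com) - Short description.'));
```

Checks that need judgment, such as whether a tool is production-ready or placed in the right category, still have to be reviewed by hand.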
62 changes: 62 additions & 0 deletions .github/workflows/link-check.yml
@@ -0,0 +1,62 @@
name: Check Links

on:
  schedule:
    # Run weekly on Mondays at 9am UTC
    - cron: '0 9 * * 1'
  workflow_dispatch:
  pull_request:
    paths:
      - 'README.md'
      - 'contributing.md'

jobs:
  link-check:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Check links in README
        uses: lycheeverse/lychee-action@v1
        with:
          # lychee-action has no `files` input; the files to check are passed as trailing arguments
          args: '--verbose --no-progress --exclude-mail --exclude twitter.com --exclude linkedin.com --max-retries 3 --timeout 20 README.md contributing.md'
          fail: true
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

      - name: Create issue if links are broken
        if: failure()
        uses: actions/github-script@v7
        with:
          script: |
            const title = '🔗 Broken Links Detected';
            const body = `Automated link checking has detected broken links in the repository.

            Please review the [workflow run](${context.payload.repository.html_url}/actions/runs/${context.runId}) for details.

            Common causes:
            - Website temporarily down
            - URL changed/deprecated
            - Typo in URL

            Action required: Update or remove broken links.`;

            // Check if issue already exists (the API takes labels as a comma-separated string)
            const issues = await github.rest.issues.listForRepo({
              owner: context.repo.owner,
              repo: context.repo.repo,
              state: 'open',
              labels: 'broken-links'
            });

            if (issues.data.length === 0) {
              await github.rest.issues.create({
                owner: context.repo.owner,
                repo: context.repo.repo,
                title: title,
                body: body,
                labels: ['broken-links', 'automated']
              });
            }
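The issue-creation step follows a create-if-absent pattern: look for an open issue with the tracking label and only file a new one when none exists. The sketch below runs that logic outside of Actions. `makeFakeGithub` is a hypothetical in-memory stand-in for the `github` client that `actions/github-script` injects, and the owner and repo names are placeholders.

```javascript
// Hypothetical stand-in for the injected `github` client; it only models the two calls used here.
function makeFakeGithub(openIssues) {
  const created = [];
  return {
    created,
    rest: {
      issues: {
        // Mirrors the response shape { data: [...] }; filters by a single label name.
        listForRepo: async ({ labels }) => ({
          data: openIssues.filter((issue) => issue.labels.includes(labels)),
        }),
        create: async (params) => {
          created.push(params);
          return { data: params };
        },
      },
    },
  };
}

// Files a report only when no open issue already carries the tracking label.
async function reportBrokenLinks(github, repo) {
  const existing = await github.rest.issues.listForRepo({
    ...repo,
    state: 'open',
    labels: 'broken-links', // comma-separated string form
  });
  if (existing.data.length > 0) return false; // an open report already exists
  await github.rest.issues.create({
    ...repo,
    title: 'Broken Links Detected',
    body: 'See the workflow run for details.',
    labels: ['broken-links', 'automated'],
  });
  return true;
}

async function demo() {
  const repo = { owner: 'example-owner', repo: 'example-repo' };
  const first = await reportBrokenLinks(makeFakeGithub([]), repo);
  const second = await reportBrokenLinks(makeFakeGithub([{ labels: ['broken-links'] }]), repo);
  return [first, second];
}

demo().then((result) => console.log(result));
```

Because the lookup and the create are separate calls, two runs that fail at the same moment could still both file an issue; for a weekly schedule that is probably an acceptable trade-off.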
28 changes: 28 additions & 0 deletions .github/workflows/markdown-lint.yml
@@ -0,0 +1,28 @@
name: Markdown Lint

on:
  pull_request:
    paths:
      - '**.md'
  push:
    branches:
      - main
      - master
    paths:
      - '**.md'

jobs:
  markdown-lint:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Lint markdown files
        uses: DavidAnson/markdownlint-cli2-action@v15
        with:
          globs: '**/*.md'
          config: '.markdownlint.json'

      - name: Check awesome list
        uses: max/awesome-lint@v2
11 changes: 11 additions & 0 deletions .markdownlint.json
@@ -0,0 +1,11 @@
{
  "default": true,
  "MD001": true,
  "MD003": { "style": "atx" },
  "MD004": { "style": "dash" },
  "MD007": { "indent": 2 },
  "MD013": false,
  "MD024": { "siblings_only": true },
  "MD033": false,
  "MD041": false
}