Note
Deployment: https://simplequranse-zeynthedev.streamlit.app/
A simple, interactive web-based search engine for the Quran, built using Python and Streamlit. This project utilizes the Vector Space Model (VSM) in Information Retrieval to find the most relevant Ayahs based on user text queries.
Unlike exact-match search engines (Boolean Retrieval), this app uses Natural Language Processing (NLP) techniques to rank Ayahs by how closely their wording matches the query:
- TF-IDF (Term Frequency-Inverse Document Frequency): Converts the entire English translation of the Quran into a mathematical matrix, giving higher weight to unique/important keywords and ignoring common stop words.
- Cosine Similarity: Computes the cosine of the angle between the user's query vector and each document vector to produce a relevance score. For TF-IDF vectors the score ranges from 0 to 1; the closer it is to 1.0, the more relevant the Ayah is.
- Top-N Results: Users can specify how many top results they want to see.
- Smart Ranking: Results are sorted dynamically from the highest similarity score to the lowest.
- Direct Integration: Includes clickable links to Quran.com for each retrieved Ayah for further reading and context.
- Language: Python 3
- Frontend/Framework: Streamlit
- Data Manipulation: Pandas
- Machine Learning/NLP: Scikit-learn
- Dataset: English and Bahasa Indonesia translations of the Quran (Yusuf Ali / Kemenag RI), fetched dynamically from Tanzil.net.
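
The TF-IDF weighting, cosine similarity scoring, and Top-N ranking described above can be illustrated with a small from-scratch sketch. This is not the app's actual code: the app relies on Scikit-learn, while the sketch below uses only the standard library to make the arithmetic visible. The `DOCS` corpus is placeholder text (not real translation data), and `STOP_WORDS`, `tokenize`, `build_idf`, `tfidf_vector`, `cosine`, and `search` are hypothetical names chosen for this example.

```python
import math
from collections import Counter

# Placeholder documents standing in for translated Ayahs (NOT real dataset text).
DOCS = {
    "A": "guidance and mercy for those who believe",
    "B": "seek help through patience and prayer",
    "C": "after hardship there comes ease",
}

# Tiny illustrative stop-word list; a real list would be much longer.
STOP_WORDS = {"and", "for", "those", "who", "there", "the", "through", "after"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def build_idf(token_lists):
    """Smoothed inverse document frequency: rarer terms get higher weight."""
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return {t: math.log((1 + n) / (1 + d)) + 1 for t, d in df.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector as a dict; terms unseen in the corpus are ignored."""
    return {t: count * idf[t] for t, count in Counter(tokens).items() if t in idf}


def cosine(a, b):
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, top_n=2):
    """Return the top_n (doc_id, score) pairs, highest score first."""
    ids = list(DOCS)
    token_lists = [tokenize(DOCS[i]) for i in ids]
    idf = build_idf(token_lists)
    query_vec = tfidf_vector(tokenize(query), idf)
    scored = [(i, cosine(query_vec, tfidf_vector(t, idf)))
              for i, t in zip(ids, token_lists)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


for doc_id, score in search("patience and prayer", top_n=2):
    print(doc_id, round(score, 3))
```

Documents that share no terms with the query score 0.0, which is why this kind of ranking depends on word overlap rather than meaning.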
If you want to run this application on your local machine (or WSL), follow these steps:
1. Clone the repository
    git clone https://github.com/ZeynTheDev/SimpleQuranSE.git
    cd SimpleQuranSE
2. Create a virtual environment (Recommended)
    python3 -m venv venv
    source venv/bin/activate # On Windows use: venv\Scripts\activate
3. Install dependencies
    pip install -r requirements.txt
4. Run the app
    streamlit run app.py # Replace 'app.py' with your actual Python file name

This project is open-source and available under the MIT License.
Bahasa Indonesia support was released!
*Note: Sorry, I'm not in the mood to chitchat as usual today. Something bad happened in my real life.