geekydevu/boolean-retrieval-model

A simple information retrieval engine, written in Python, that answers boolean queries. A boolean query combines search terms with the operators AND, OR, and NOT.

Requirements:
[1.] IPython notebook installed
[2.] Corpus for building the boolean IR system

Notes on how to provide input:
->  On running the IPython notebook, the user is asked to enter the number of documents in the corpus. The documents must be named following the pattern "doc"+str(doc_id)+".txt". For example, if there are 10 documents, their names should be "doc1.txt","doc2.txt","doc3.txt","doc4.txt","doc5.txt","doc6.txt","doc7.txt","doc8.txt","doc9.txt","doc10.txt".
->  The documents must be placed in a folder named 'local_corpus'. The folder 'local_corpus' and the IPython notebook file must be in the same directory.
->  The queries supported in my basic model are of the forms AND, OR, and NEGATION
    e.g. 1.) machine AND learning
         2.) politics AND football
         3.) world AND wide AND web AND internet
         4.) politics OR football
         5.) information OR retrieval OR search OR engines
         6.) NEGATION(supervised)
         7.) NEGATION(regression)
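
The notes above can be illustrated with a minimal sketch of how such a system might work. This is not the notebook's actual code: the function names (`load_corpus`, `build_index`, `evaluate`), the tokenization rule (lowercased runs of letters and digits), and the restriction to one operator type per query are all assumptions made for illustration.

```python
import re
from pathlib import Path


def load_corpus(corpus_dir, num_docs):
    """Read doc1.txt .. doc<num_docs>.txt from corpus_dir into {doc_id: text}."""
    return {
        doc_id: (Path(corpus_dir) / f"doc{doc_id}.txt").read_text(encoding="utf-8")
        for doc_id in range(1, num_docs + 1)
    }


def build_index(docs):
    """Build an inverted index: lowercased term -> set of doc ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        # Assumed tokenization: runs of letters/digits, case-insensitive.
        for term in set(re.findall(r"[a-z0-9]+", text.lower())):
            index.setdefault(term, set()).add(doc_id)
    return index


def evaluate(query, index, all_ids):
    """Evaluate 'a AND b ...', 'a OR b ...', a single term, or 'NEGATION(a)'.

    Mixing AND and OR in one query is not handled in this sketch.
    """
    query = query.strip()
    negation = re.fullmatch(r"NEGATION\(\s*(\w+)\s*\)", query)
    if negation:
        # NEGATION(term): every document that does not contain the term.
        return set(all_ids) - index.get(negation.group(1).lower(), set())

    tokens = query.split()
    if len(tokens) % 2 == 0:
        raise ValueError(f"malformed query: {query!r}")
    postings = [index.get(t.lower(), set()) for t in tokens[0::2]]
    operators = set(tokens[1::2])

    if not operators:
        return set(postings[0])              # single-term query
    if operators == {"AND"}:
        return set.intersection(*postings)   # documents containing every term
    if operators == {"OR"}:
        return set.union(*postings)          # documents containing any term
    raise ValueError(f"unsupported operators in query: {query!r}")
```

With the folder layout described above, a session might look like `docs = load_corpus("local_corpus", 10)`, then `index = build_index(docs)`, then `evaluate("machine AND learning", index, set(docs))`, which returns the set of matching document ids.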
