SiteCrawl is a metasearch engine that was developed as part of a software engineering project for the Master's Degree in Computer Science at University College Dublin during the summer of 2013.


General Features

SiteCrawl is a search engine that takes your query and returns a list of documents intended to satisfy your information need. It is a metasearch engine, which means that it queries three underlying search engines and builds a single list of the best results from all three. The idea is that the combined results should be better than those returned by any single search engine on its own. SiteCrawl is intended to have the following general features:

  1. Be useful to the end user
  2. Be intuitive and easy to use
  3. Be fast and efficient
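
To illustrate the metasearch idea described above, here is a minimal sketch of how ranked results from three engines might be merged into one list. It uses a simple Borda count, which is an assumption made for illustration and not necessarily the method SiteCrawl uses. It is also written in Python for brevity, whereas SiteCrawl itself is written in PHP. The function name `fuse_rankings` and the example URLs are hypothetical.

```python
from collections import defaultdict


def fuse_rankings(rankings, top_k=10):
    """Merge several ranked URL lists into one list using a Borda count.

    Assumption: each engine returns URLs ordered best-first. A URL earns
    more points the higher it appears in a list, and points are summed
    across all engines.
    """
    scores = defaultdict(int)
    for ranked_urls in rankings:
        list_length = len(ranked_urls)
        for position, url in enumerate(ranked_urls):
            # First place earns list_length points, last place earns 1.
            scores[url] += list_length - position
    # Highest score first; ties are broken alphabetically so the
    # output is deterministic.
    merged = sorted(scores, key=lambda url: (-scores[url], url))
    return merged[:top_k]


# Hypothetical results from three underlying engines.
engine_a = ["a.example/1", "b.example/2", "c.example/3"]
engine_b = ["b.example/2", "a.example/1", "d.example/4"]
engine_c = ["b.example/2", "c.example/3", "a.example/1"]

print(fuse_rankings([engine_a, engine_b, engine_c]))
```

A document that several engines rank highly rises to the top of the merged list, while a document returned by only one engine sinks towards the bottom.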

Search Features

To use SiteCrawl, just enter your query and hit the search button.
Advanced users can choose to view the results as:

  1. Aggregated
  2. Non-Aggregated
  3. Clustered

For a detailed description of these display options and other advanced search features, see the help section.

Technical Features

SiteCrawl has a number of features of interest to software developers:

  1. Responsive Web Design for the User Interface
  2. Object-oriented code
  3. Languages: PHP, HTML5 & CSS3
  4. Clustering feature: Term frequency and a new experimental technique called Binaclustering
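
As a rough illustration of term-frequency clustering, the sketch below groups result snippets by their most frequent non-stopword term. This is an assumption about how term frequency could be used for clustering, not a description of SiteCrawl's actual code, and it does not attempt to show Binaclustering. It is written in Python for brevity; the stopword list, function names and example snippets are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# A tiny illustrative stopword list; a real system would use a larger one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "is", "on"}


def top_term(text):
    """Return the most frequent non-stopword term in the text."""
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if w not in STOPWORDS]
    if not words:
        return ""
    counts = Counter(words)
    # Most frequent term first; ties are broken alphabetically.
    return min(counts, key=lambda w: (-counts[w], w))


def cluster_by_top_term(snippets):
    """Group snippets under the label of their most frequent term."""
    clusters = defaultdict(list)
    for snippet in snippets:
        clusters[top_term(snippet)].append(snippet)
    return dict(clusters)


# Hypothetical result snippets for an ambiguous query.
snippets = [
    "Jaguar cars: the new jaguar models",
    "Python tutorial for python beginners",
    "Jaguar habitat and jaguar diet",
]

for label, members in cluster_by_top_term(snippets).items():
    print(label, "->", members)
```

Labelling each cluster with its dominant term gives the user a quick way to narrow an ambiguous query, although this simple approach cannot tell apart different senses of the same word.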