A Web crawler is a computer program that browses the World Wide Web in a methodical, automated manner or in an orderly fashion. Other terms for Web crawlers are ants, automatic indexers, bots, Web spiders, Web robots, or—especially in the FOAF community—Web scutters. [Wiki Reference]

Crawlers range in scale from small ones that fetch only a couple of hundred URLs to web-scale beasts like Googlebot, which indexes a sizable portion of the public Web.
Building crawlers is our passion and expertise. We have crawled and extracted data from websites ranging from the mundane to the insanely difficult, including sites with JavaScript navigation. Crawlers can be Python-based and run autonomously, or PHP-based and run on a manual user trigger. We can also implement Drupal-based crawlers that run as a module triggered from cron. Other programming languages can be used where there is an explicit requirement.
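As a rough illustration of what a small, autonomous Python-based crawler involves, here is a minimal sketch using only the standard library. It is not our production code: the names `LinkParser`, `fetch_html`, and `crawl`, the same-host restriction, and the 200-page default are all assumptions chosen for the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Download a page over HTTP (hypothetical default fetcher)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_html, max_pages=200):
    """Breadth-first crawl restricted to the start URL's host.

    Returns the URLs that were fetched successfully, in visit order.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited = []

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        visited.append(url)

        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)

    return visited
```

The `fetch` argument is injectable so the crawl logic can be exercised without network access. A real crawl would also need to honor robots.txt, throttle its requests, and render JavaScript-driven navigation, none of which this sketch attempts.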
  • Outsourced model

In the outsourced model, we develop and run the crawls, then extract the data and present it to you as a deliverable in a previously agreed format.

Pros: No infrastructure costs on your end, and no crawling or programming expertise required. You need only MS Excel to read CSV/TSV files, or a MySQL database (local or remote) to read the data. Lower costs, as you pay only for the deliverable.
Cons: No access to the source code.


  • Consulting model

In the consulting model, we function as a consultant to your company and you own the intellectual property we create in the form of the crawler. We develop the crawler and install it on your own servers.
Pros: You own the source code and have total control over the crawls.
Cons: More costly. Some technical and/or programming expertise may be required to run crawls.

