Crawl solutions

As an Apache Nutch committer, Openindex is an expert in web crawling. We can assist in setting up a Nutch web crawler service that fits your needs, or we can do the crawling for you and deliver the data you need. In either case we can set up a single machine or a Hadoop cluster, depending on the scale that needs to be crawled.

Feed your search engine

A web crawler is often used to deliver data to your search engine. We can help you set up a crawler that feeds your search engine and handles the difficulties that come with it: content extraction, crawler traps, duplicates, etc.
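One of those difficulties, duplicate pages, is often tackled by fingerprinting page content so that the same article reached via different URLs is indexed only once. A minimal sketch of that idea (normalize, hash, keep the first URL per fingerprint; the normalization and sample URLs here are illustrative assumptions, not a specific Nutch component):

```python
import hashlib

def content_fingerprint(text: str) -> str:
    """Normalize whitespace and case, then hash, so near-identical
    pages collapse to the same key."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def deduplicate(pages: dict) -> dict:
    """Map each distinct content fingerprint to the first URL seen."""
    seen = {}
    for url, text in pages.items():
        seen.setdefault(content_fingerprint(text), url)
    return seen

# Hypothetical crawl results: /a and /b carry the same content.
pages = {
    "http://example.com/a": "Hello   World",
    "http://example.com/b": "hello world",
    "http://example.com/c": "Something else",
}
unique = deduplicate(pages)  # two entries survive, not three
```

Production crawlers typically use near-duplicate detection (e.g. shingling or SimHash) rather than exact hashes, but the indexing flow is the same.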

Using Apache Solr/Lucene as a search engine, we can offer a crawler-based, custom open source search solution for your website, intranet or Document Management System.

Collect data

We can provide you with a crawler that collects data from the internet. For example, we can provide you with a list of domains that use a certain CMS, contain certain words or content, or include a certain widget. Such data sets can be very useful for, e.g., research or sales leads.
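Detecting which CMS a site runs on often comes down to inspecting each crawled page for telltale markers, such as the `<meta name="generator">` tag that WordPress, Drupal and others emit by default. A minimal sketch using only Python's standard-library HTML parser (the sample page is made up; real detection combines several signals, not just this one tag):

```python
from html.parser import HTMLParser

class GeneratorSniffer(HTMLParser):
    """Record the content of any <meta name="generator"> tag."""
    def __init__(self):
        super().__init__()
        self.generator = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            d = dict(attrs)
            if (d.get("name") or "").lower() == "generator":
                self.generator = d.get("content")

def detect_generator(html):
    sniffer = GeneratorSniffer()
    sniffer.feed(html)
    return sniffer.generator

# Hypothetical crawled page:
sample = '<html><head><meta name="generator" content="WordPress 6.4"></head></html>'
cms = detect_generator(sample)  # -> "WordPress 6.4"
```

Run over a large crawl, the same pass can just as easily match keywords or widget markup instead of the generator tag.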

Scrape websites

We can provide you with a scraper that collects specific data from specific websites. This is a great solution if you would like to obtain, for example, all product descriptions from a certain (set of) webshop(s) on a regular basis.
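At its core such a scraper fetches pages and extracts the fragments you care about with site-specific selectors. A minimal sketch with Python's standard-library parser, working on inline sample markup (the `product-description` class name is an assumption for illustration; each real shop needs its own selectors):

```python
from html.parser import HTMLParser

class ProductDescriptionParser(HTMLParser):
    """Collect the text of elements with class="product-description".
    The class name is hypothetical; adapt it per target site."""
    def __init__(self):
        super().__init__()
        self.in_desc = False
        self.descriptions = []

    def handle_starttag(self, tag, attrs):
        if ("class", "product-description") in attrs:
            self.in_desc = True
            self.descriptions.append("")

    def handle_endtag(self, tag):
        self.in_desc = False

    def handle_data(self, data):
        if self.in_desc:
            self.descriptions[-1] += data.strip()

# Made-up webshop snippet standing in for a fetched page:
html = """
<div class="product-description">A sturdy oak table.</div>
<div class="price">EUR 99</div>
<div class="product-description">A matching chair.</div>
"""
parser = ProductDescriptionParser()
parser.feed(html)
# parser.descriptions now holds both product descriptions
```

Scheduled runs of such a scraper are what turn "on a regular basis" into fresh data deliveries.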

Data as a service

Openindex is happy to do the crawling or scraping for you. In this case, we provide you with the data you need, in the format you like, on a regular basis or as a one-off.

Contact us
Please feel free to contact us now! Call 0(031) 50 85 36 620, send us an e-mail, or visit our contact page.