Entity Search

Entity Search is the Named Entity Recognition (NER) tool by Openindex. It extracts relevant information from any text, such as the names of people, companies, organizations and locations. Because its models are trained on large available data sets, Entity Search keeps recognizing more entities in arbitrary text. Input can be plain text, a PDF, a website or another format. For tests and small quantities the input can be entered via a web form, but we also offer an API service, which allows the technology to be incorporated into automated processes.

Openindex inside
Built with advanced search engine technology by Openindex.
Privacy guaranteed
Openindex is all about respecting privacy. All data collected is stored anonymously. No data is sold or provided to third parties.
Extract numerous entities from your text
Automatically finds people, brands, locations, time expressions and much more.
Works on any website and text source
It doesn’t matter what technology is behind the websites or data sources, as long as the output is readable by our parser. The input can be any accessible website, but also text files, PDFs, etc.
Support for multiple languages
Entity Search currently supports forty-eight languages, and additional language models are being built. It recognizes about forty different entity types and integrates seamlessly with our parser.
API available
A web API is available through which Entity Search can be queried. This allows the technology to be integrated into automated processes.
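
As a rough idea of what calling such a web API from an automated process could look like, here is a minimal Python sketch. The endpoint URL, the JSON request body with a "text" field and the JSON response are all assumptions made for illustration; this page does not specify them, so the real Entity Search API may look different.

```python
import json
import urllib.request

# Hypothetical placeholder URL: the real Entity Search endpoint is not given here.
API_URL = "https://api.example.invalid/entity-search"


def build_request(text, api_url=API_URL):
    """Build a POST request that carries the text to analyse as JSON.

    The {"text": ...} payload shape is an assumption, not a documented format.
    """
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_entities(text, api_url=API_URL):
    """Send the request and decode the reply, assuming the service answers with JSON."""
    request = build_request(text, api_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the sketch easy to inspect without network access. In a real integration, replace API_URL and the payload shape with the values from the actual API documentation.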

Try the online demo below

Enter a URL or free text in the text box below and see what information is extracted.



Entity Search uses the following technologies and techniques

Apache OpenNLP
Apache OpenNLP is a machine learning based toolkit for the processing of natural language text.
Apache Solr
Solr is the popular, blazing-fast, open source enterprise search platform built on Apache Lucene™. Openindex has its own highly customized and optimized Solr instance that serves as the base of our platform.
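Perceptron
A perceptron is a simple artificial neural network used for classification. Its first layer consists of input neurons, to which the input signals are applied; each signal is multiplied by a learned weight, and the weighted sum determines the output. A multilayer perceptron connects such neurons in several successive layers.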
Maximum Entropy
The principle of maximum entropy states that the probability distribution that best represents the current state of knowledge is the one with the greatest entropy, given precisely stated prior data.
PoS tagging
In corpus linguistics, part-of-speech tagging, also known as grammatical tagging, is the process of marking each word in a text as corresponding to a particular part of speech, based on both its definition and its context.
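
To make the perceptron from the list above more concrete, here is a minimal Python sketch of a single-layer perceptron trained with the classic perceptron learning rule. It is an illustration only, not the implementation used by Entity Search or Apache OpenNLP. The two token features (starts with a capital letter, follows a title such as "Mr.") and the toy training data are invented for this example.

```python
def predict(weights, bias, features):
    """Return 1 if the weighted sum of the inputs plus the bias is positive, else 0."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Train on (features, label) pairs with labels 0 or 1.

    Weights only change when a sample is misclassified; training stops
    early once a full pass over the data produces no errors.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for features, label in samples:
            delta = label - predict(weights, bias, features)
            if delta != 0:
                errors += 1
                weights = [w + learning_rate * delta * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * delta
        if errors == 0:
            break
    return weights, bias


# Invented toy data: [starts_with_capital, follows_title] -> is a person name.
TRAINING_DATA = [
    ([1, 1], 1),  # e.g. "Smith" in "Mr. Smith"
    ([1, 0], 0),  # capitalized word at the start of a sentence
    ([0, 1], 0),  # lowercase word after a title
    ([0, 0], 0),  # ordinary lowercase word
]

weights, bias = train_perceptron(TRAINING_DATA)
predictions = [predict(weights, bias, features) for features, _ in TRAINING_DATA]
```

Because the toy data is linearly separable, the learning rule settles on weights that classify all four samples correctly. Real entity recognition uses far more features and far more training data than this sketch.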