Buying Intent Prediction Pipeline in More Detail

The NinaData buying intent platform is a real-time contextual data platform for driving in-moment online results for brands.
Using purpose-built AI for semantic analysis of text, video and images, NinaData builds sustainable data value and insights for brands and content owners.

Powerful Machine Learning Techniques

Statistics on the co-occurrence of normalized forms of words and phrases underlie much of NLP. But capturing meaning in a way that matches the language-independent thinking of a domain expert requires more than these descriptions of how surface forms are distributed.

To achieve this, we treat a set of keywords or a piece of content as a guide for navigating a semantic model of a particular domain. A query, a web page, or any other content is mapped to a specific neighbourhood in a semantic space that represents expert knowledge of that domain. This mapping is based on Deep Learning and other Machine Learning techniques.
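
The idea of mapping content to a neighbourhood in a semantic space can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three-dimensional space, the hand-written vectors in `CONCEPT_VECTORS` and `WORD_VECTORS`, and the averaging in `embed` stand in for the learned models the platform uses, which are not described in this document.

```python
import math

# Hypothetical 3-dimensional "semantic space" for a toy travel domain.
# In a real system these vectors would come from a trained model.
CONCEPT_VECTORS = {
    "budget flights": [0.9, 0.1, 0.0],
    "luxury hotels": [0.1, 0.9, 0.1],
    "hiking gear": [0.0, 0.2, 0.9],
}

# Hypothetical word vectors living in the same space.
WORD_VECTORS = {
    "cheap": [0.8, 0.0, 0.1],
    "flight": [0.9, 0.1, 0.0],
    "spa": [0.0, 0.9, 0.0],
    "boots": [0.0, 0.1, 0.9],
}


def embed(text):
    """Average the vectors of known words; unknown words are ignored."""
    vectors = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def neighbourhood(text, k=2):
    """Return the k concepts closest to the text, or [] if no word is known."""
    query = embed(text)
    if not any(query):
        return []
    ranked = sorted(
        CONCEPT_VECTORS,
        key=lambda concept: cosine(query, CONCEPT_VECTORS[concept]),
        reverse=True,
    )
    return ranked[:k]
```

Calling `neighbourhood("cheap flight to Rome")` ranks the toy concepts by cosine similarity to the averaged word vectors, so the query lands nearest the concept whose vector points in the most similar direction.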

The pipeline consists of four internal stages:

Crawling of online content and extraction of the main content and metadata.

Generation of contextual keywords: the features of the content that enable the most precise prediction in the next stage. In essence, they form a weighted set of possible sub-contexts, selected automatically to maximize that precision.

Buying Intent Prediction. In this stage, the system classifies the content over a set of contextual variables, taking the output of the contextual keyword generation stage as its input.

Relevance Matching, where the analyzed URLs closest to the desired target (in the contextual sense) are fetched from a database and returned as the result.
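
The four stages above can be sketched as a chain of small functions. This is only an illustration under stated assumptions: fetching pages is left out, keyword weights are plain normalized term frequencies, the `INTENT_CUES` word list replaces a trained classifier, and the in-memory `index` dictionary replaces the database. All function and variable names are hypothetical.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def extract_main_content(html):
    """Stage 1 (extraction only): return visible text with collapsed whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


STOPWORDS = {"the", "a", "an", "and", "for", "of", "to", "in", "on", "our"}


def generate_keywords(text, top_n=5):
    """Stage 2: weighted keywords, here simply normalized term frequency."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    counts = Counter(words)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.most_common(top_n)}


# Hypothetical cue words; a real classifier would be learned from data.
INTENT_CUES = {"buy", "price", "deal", "discount", "order"}


def predict_buying_intent(keywords):
    """Stage 3: toy score in [0, 1], the keyword weight that falls on cue words."""
    return sum(weight for word, weight in keywords.items() if word in INTENT_CUES)


def relevance_match(target, index, k=1):
    """Stage 4: rank stored URLs by weight overlap with the target keywords."""
    def overlap(keywords):
        return sum(min(w, keywords.get(word, 0.0)) for word, w in target.items())

    return sorted(index, key=lambda url: overlap(index[url]), reverse=True)[:k]
```

Each stage consumes only the output of the previous one, so `generate_keywords(extract_main_content(page))` can be passed straight to `predict_buying_intent`, and the same keyword dictionary can serve as the `target` for `relevance_match`.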

The Architecture

Our approach is based on an architecture of scalable microservices, built with modern frameworks and hosted in the cloud. These are the technical choices we have made:

Platforms: AWS, GitHub
Languages: Python, JavaScript, HTML, CSS
Databases: MongoDB, Redis, HBase
Message Queues: SQS, Apache Kafka
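
As a rough illustration of how microservices can be decoupled through message queues, the sketch below passes messages between a producer and a worker. It is an assumption-laden stand-in, not the platform's actual wiring: Python's in-process `queue.Queue` takes the place of a broker such as SQS or Kafka, and the message fields and the `keyword_service` name are hypothetical.

```python
import queue
import threading

# In-process stand-ins for two broker queues (hypothetical names).
extracted_q = queue.Queue()
keywords_q = queue.Queue()
STOP = object()  # sentinel that tells a worker to shut down


def keyword_service():
    """Consumes extracted-text messages and publishes keyword messages."""
    while True:
        msg = extracted_q.get()
        if msg is STOP:
            keywords_q.put(STOP)  # propagate shutdown downstream
            break
        words = sorted(set(msg["text"].lower().split()))
        keywords_q.put({"url": msg["url"], "keywords": words})


def run_demo(pages):
    """Publish (url, text) pairs and collect what the worker emits."""
    worker = threading.Thread(target=keyword_service)
    worker.start()
    for url, text in pages:
        extracted_q.put({"url": url, "text": text})
    extracted_q.put(STOP)

    results = []
    while True:
        msg = keywords_q.get()
        if msg is STOP:
            break
        results.append(msg)
    worker.join()
    return results
```

The producer never calls the worker directly; it only publishes messages. That is the property that lets each stage be scaled or replaced independently when a real broker sits between the services.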