Graph AI is powerful for patent searching, but it also lends itself to other use cases. Classification of patents according to a custom technology taxonomy is one of them. This article summarizes the steps needed to get started with AI classification.
Note: AI classifier is a module of IPRally that is sold separately.
Contact your Account Manager or sales@iprally.com to get access.
Introduction: Tags and Classifiers
Technology tags
The purpose of technology classification is to give documents labels that match both the customer's internal class system and the content of the document. The labels are called Technology tags, or just Tags, in IPRally. Tags can be managed via the Settings panel.
Tags can be given a descriptive name and color.
Tags can be created, modified and deleted in the Settings panel, or created on-the-fly in search result lists and collections, or through XLS imports.
Tags are company-wide, i.e. each user of the organization can see the same set of tags.
Classifiers
AI Classifiers are a kind of robot trained on human-labeled data to predict the technology area of a new publication, or of a whole set of publications at once.
Trained and active Classifiers can be seen and managed in the Settings panel.
Each trained classifier box shows when it was created, last trained and how big the training data set was. It also shows which Tags the classifier is trained to predict.
The sensitivity of the classifier can also be adjusted in the classifier box. After creation, the sensitivity defaults to "AI Optimum", i.e., the value that best matches the training set.
Getting started
Note: Using Tags and Classifiers requires that the AI Classifier
module is enabled for your organization and that respective rights
are granted to you by the Admin user of the organization.
Step 1: Create a training Collection for a classifier
1A. Create an empty Collection
A collection can be created on the main page of IPRally:
1B. Add tagged documents to the collection
If you have historical tagged patent documents available, the most convenient way is to import them as an Excel spreadsheet through the "Add documents or combine searches" dialogue.
Under the File tab, you can add an Excel file. The spreadsheet should have one column with publication numbers and another column with the tags. Each publication may have multiple tags, either on multiple lines or in a single cell separated by a comma (,) or semicolon (;).
In the next stage, you choose which column is which. After selecting the relevant columns, you will see the publication numbers and the tags to be associated with them.
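If your historical tags live in another system, it can help to normalize them before building the spreadsheet. The following is a minimal Python sketch of the layout described above, not an IPRally tool: the function names split_tags and to_rows and the publication numbers are made up for illustration. It splits a tag cell on commas and semicolons and expands the data into one row per publication number and tag. Writing the rows to an actual Excel file would need a spreadsheet library, which is left out here.

```python
import re


def split_tags(cell: str) -> list[str]:
    """Split one tag cell on commas or semicolons, dropping blanks and duplicates."""
    tags: list[str] = []
    for part in re.split(r"[,;]", cell):
        name = part.strip()
        if name and name not in tags:
            tags.append(name)
    return tags


def to_rows(tagged: dict[str, str]) -> list[tuple[str, str]]:
    """Expand {publication number: tag cell} into one (number, tag) row per tag."""
    return [
        (number, tag)
        for number, cell in tagged.items()
        for tag in split_tags(cell)
    ]


# Hypothetical publication numbers and tag names, for illustration only.
rows = to_rows({
    "EP1234567A1": "Batteries; Charging",
    "US7654321B2": "Sensors",
})
for number, tag in rows:
    print(number, tag)
```

Both layouts mentioned above (several lines per publication, or one cell with separators) carry the same information, so either can be used for the import.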
When you complete the import dialogue, you'll see the documents with the Tags added to the collection. In this case, there are three Tags, each assigned to 50 patent documents/families. Each document/family can have multiple Tags.
In case you do not have historical data available, you can also easily label the data inside IPRally via the "+ Add tag" pop-up menu:
...or the Classify menu:
Note: You need a sufficiently representative training data set to
train a classifier that works well in practice. That means that for
each Tag, you need to have at least a few (preferably 10+) positive
samples (having the Tag attached) and a few (preferably 10+) negative
samples (not having the Tag attached). The amount of data needed
depends on the granularity of the class system used and is best found
out by testing!
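The rule of thumb in the note can be checked before training. The sketch below is an illustration only and assumes you have the training data available outside IPRally as a mapping from publication number to its set of tags. The function name coverage_gaps and the default of 10 are assumptions taken from the "preferably 10+" guidance above. It reports every Tag that has too few positive or negative samples.

```python
def coverage_gaps(
    doc_tags: dict[str, set[str]], minimum: int = 10
) -> dict[str, tuple[int, int]]:
    """Return {tag: (positives, negatives)} for tags below the minimum on either side."""
    all_tags = set().union(*doc_tags.values())
    total = len(doc_tags)
    gaps: dict[str, tuple[int, int]] = {}
    for tag in sorted(all_tags):
        # Positive samples carry the tag; every other document counts as negative.
        positives = sum(1 for tags in doc_tags.values() if tag in tags)
        negatives = total - positives
        if positives < minimum or negatives < minimum:
            gaps[tag] = (positives, negatives)
    return gaps


# Hypothetical data: 12 documents tagged "A" and 3 documents tagged "B".
docs = {f"DOC{i}": {"A"} for i in range(12)}
docs.update({f"DOC{i}": {"B"} for i in range(12, 15)})
print(coverage_gaps(docs))
```

In this made-up example both Tags are flagged: "A" has enough positive samples but only three negative ones, and "B" has the opposite problem.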
Step 2: Train a classifier
When you have a training data set prepared, you can easily train the classifier inside the training collection (or via the Settings -> AI classifiers panel). Just press the CREATE CLASSIFIER button to get started.
Just give the classifier a descriptive and unique name...
... and voilà, the data set is fed to the Graph AI system for training purposes, yielding a ready-to-use classifier.
If you want a more "generous" classifier that predicts more Tags with higher uncertainty, you can move the sensitivity slider to the right. In the opposite case, move it to the left. No re-training is needed after sensitivity adjustment.
Now we are ready to use the classifier.
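One way to picture why no re-training is needed: a sensitivity setting is commonly implemented as a cut-off applied to the scores a trained model already produces. The sketch below is a generic illustration under that assumption. It does not describe how IPRally implements the slider, and the function name predict_tags, the scores and the thresholds are all invented.

```python
def predict_tags(scores: dict[str, float], threshold: float) -> list[str]:
    """Return the tags whose score reaches the threshold, highest score first."""
    kept = [tag for tag, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda tag: scores[tag], reverse=True)


# Invented per-tag scores for a single document.
scores = {"Batteries": 0.91, "Charging": 0.55, "Sensors": 0.20}

strict = predict_tags(scores, threshold=0.8)    # lower sensitivity
generous = predict_tags(scores, threshold=0.5)  # higher sensitivity
print(strict, generous)
```

Under this assumption the model and its scores stay the same. Only the cut-off moves, so a more generous setting adds the less certain Tags on top of the certain ones.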
Note: you can create as many classifiers as you need for different
purposes. For example, one for continuous competitor monitoring in one
technology area and set of competitors, one for general technology
monitoring on a broad range, and one for a one-off landscaping
project.
Step 3: Use the classifier
A trained classifier can be applied to any data set inside IPRally: a result list of a Search case, the latest monitoring results, or a Collection.
3A. Predict the Tags for fresh data
The Classify menu contains a Classify with... submenu, where you can choose the classifier and apply it to the whole data set or a subset of it.
As a result, each document that the AI thinks should have a Tag gets the predicted label, marked with a question mark (?).
Note: The predicted labels will be contained in the XLS and PDF
exports.
3B. Confirm or reject the predictions
Clicking the question mark opens a dialogue through which you can either Confirm or Reject the prediction. A confirmed Tag becomes an ordinary Tag and part of the general metadata stored inside IPRally for your organization.
Confirmed tags are marked with a green checkmark:
Batch confirmation and rejection is also available via the ACTIONS menu:
It might be a good idea to add the confirmed tags back to the training collection and re-train the classifier, so that the AI keeps improving.
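Conceptually, feeding reviewed predictions back into the training data is a merge. The sketch below shows one way to think about it. It is not an IPRally feature or API: the function name merge_confirmed, the data layout and the publication numbers are hypothetical. Confirmed predictions add the Tag to the document. As an assumption of this sketch, rejected predictions keep the document in the set without that Tag, which would make it a negative sample in the sense of the note in Step 1.

```python
def merge_confirmed(
    training: dict[str, set[str]],
    reviewed: list[tuple[str, str, str]],
) -> dict[str, set[str]]:
    """Merge (number, tag, decision) review results into a copy of the training data."""
    merged = {number: set(tags) for number, tags in training.items()}
    for number, tag, decision in reviewed:
        if decision == "confirmed":
            merged.setdefault(number, set()).add(tag)
        elif decision == "rejected":
            # Assumption: keep the document without the tag, as a negative sample.
            merged.setdefault(number, set())
    return merged


# Hypothetical training data and review decisions.
training = {"EP1": {"A"}}
reviewed = [
    ("EP1", "B", "confirmed"),
    ("US2", "A", "rejected"),
    ("US3", "A", "confirmed"),
]
merged = merge_confirmed(training, reviewed)
print(merged)
```

The original training data is left untouched, so the merged set can be inspected before anything is re-imported or re-trained.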
Note: automatic feedback and re-training of confirmations and
rejections to the AI system is coming! Our team is working on numerous
other improvements and add-ons for Classifiers.
Step 4: Manage & Retrain Classifiers
Within Settings, simply select AI Classifiers under the Classification section.
4A. Re-train
Here you can manually 'Re-Train' a Classifier or set it to 'Automatically Re-Train'.
4B. Share, delete, create
You are also able to 'Share' or 'Delete' Classifiers:
Happy classification!