- 1. Set Watson credentials
- 2. Set the minimum confidence level you require
- 3. Train the model
- 4. Classify unknown documents
What is Watson Visual Recognition?
The name pretty much says it all: Watson Visual Recognition is an IBM Cloud service that inspects images and suggests appropriate classification tags so that you can find them later when needed. It offers out-of-the-box recognition capabilities and, more importantly, the ability to train the tool to classify your own images.
Industry Use Cases of Watson Visual Recognition
You can use your own images to train the Visual Recognition engine and create your own model. If you’ve used the IBM Content Classification Engine, the concept is similar. Using the Visual Recognition web tooling, or Datacap, you can create a new classifier, define classes (or categories) and train it with sample data.
For example, in the medical field, you may have four classes: X-Ray, EKG, MRI, and Stress Test.
In the insurance field, your classes might be vehicle damage, house damage, license plate, and driver’s license.
I created an Animal Classifier with the three classes seen in the image. Here’s what the output looked like:
The resulting number is the confidence level. When calling the classifier using the APIs, or Datacap, you can configure what confidence to accept.
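To make the idea of an accepted confidence level concrete, here is a minimal Python sketch that picks the highest-scoring class from a classification result and rejects it if it falls below your minimum. The helper name `best_class`, the sample class names, and the scores are made up for illustration, and the dictionary layout (`images`, `classifiers`, `classes`, `class`, `score`) is an assumption about the shape of the service's JSON response; adjust it to what your service instance actually returns.

```python
from typing import Optional, Tuple


def best_class(response: dict, min_confidence: float) -> Optional[Tuple[str, float]]:
    """Return (class_name, score) for the top class, or None if it is below the minimum."""
    # Flatten every class from every classifier of every image in the response.
    candidates = [
        (cls["class"], cls["score"])
        for image in response.get("images", [])
        for classifier in image.get("classifiers", [])
        for cls in classifier.get("classes", [])
    ]
    if not candidates:
        return None
    name, score = max(candidates, key=lambda pair: pair[1])
    # Accept the classification only when it meets the configured confidence.
    return (name, score) if score >= min_confidence else None


# Hypothetical response for illustration only; not real service output.
sample_response = {
    "images": [{
        "image": "photo.jpg",
        "classifiers": [{
            "name": "AnimalClassifier",
            "classes": [
                {"class": "cat", "score": 0.91},
                {"class": "dog", "score": 0.12},
            ],
        }],
    }]
}

print(best_class(sample_response, 0.80))  # ('cat', 0.91)
print(best_class(sample_response, 0.95))  # None
```

A result of `None` means no class met the minimum, so the image would be left unclassified for manual review rather than tagged with a weak guess.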
This is where the power of Visual Recognition comes from. You can define custom classifiers that are applicable to your industry. The sky is the limit; you simply need the images with which to train.
Watson Visual Recognition helps you build custom image classifiers that are applicable to your industry.
Using IBM Datacap
How to use the Visual Recognition ruleset in IBM Datacap Studio
Training a Visual Recognition Model
Training involves creating the classes within your model, then feeding it as many example documents as you can. This can be done in one of several ways:
- Use Watson Studio to manually create classes and manually add your example docs one at a time.
- Upload a zip file to each class that contains the documents for that class.
- Use the Watson APIs to programmatically train your model.
The Visual Recognition service requires training documents to be in JPG or PNG format, and it expects these files to be organized into zip files. The VisualRecogTrain action expects the set of test pages for each class to be in a single document, one document per class. All of this can be accomplished in your application rulesets.
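If you prepare training data outside of Datacap, the one-zip-per-class layout can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `zip_training_classes`, the folder layout (one subfolder per class), and the default list of file suffixes are hypothetical, so adjust the suffixes to the image formats your service accepts.

```python
import zipfile
from pathlib import Path


def zip_training_classes(source_dir, output_dir, suffixes=(".jpg", ".jpeg")):
    """Create one zip per class subfolder; return {class_name: zip_path}."""
    source, output = Path(source_dir), Path(output_dir)
    output.mkdir(parents=True, exist_ok=True)
    archives = {}
    # Each immediate subfolder of source_dir is treated as one class.
    for class_dir in sorted(p for p in source.iterdir() if p.is_dir()):
        images = sorted(
            f for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in suffixes
        )
        if not images:
            continue  # Skip classes that have no usable training files.
        archive_path = output / f"{class_dir.name}.zip"
        with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
            for image in images:
                # Store files flat inside the zip, without the folder prefix.
                archive.write(image, arcname=image.name)
        archives[class_dir.name] = archive_path
    return archives
```

For example, calling `zip_training_classes("training", "zips")` on a `training` folder with `xray` and `mri` subfolders would write `zips/xray.zip` and `zips/mri.zip`, which you could then upload to the matching classes.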
Using a Visual Recognition Model
Once you have created a model in Watson Studio and it is in Ready status, you can use it to classify unknown images. You can do this through the Watson web interface or from Datacap. Think of Visual Recognition in the same way as any other classification: it runs during PageID, the page is sent to the service, and the service sends back the classification and a confidence level. You can set a minimum confidence level to accept. The classification ruleset would look like this: