Creating custom skills for Azure Cognitive Search using Azure ML
This blog post is accompanied by another post entitled Searching document text at scale using Azure Cognitive Search. This is the first of the two blog posts and details the deployment of a custom skill for use with Azure Cognitive Search using Azure Machine Learning.
Azure cognitive search is a Lucene-based search PaaS service available from Microsoft Azure.
Custom skills are required for custom enrichment our document index in Azure Cognitive Search with additional data.
Quick disclaimer: At the time of writing, I am currently a Microsoft Employee
Example Use Case
The use case that is explored in this and the accompanying post is for searching through articles from Nucleic Acids Research (NAR).
As such we’ll be creating an Azure Cognitive Search Custom Skill to extract genetic codes from the journal articles.
There are 4 nucleotides that comprise DNA – adenine (A), cytosine (C), guanine (G) and thymine (T). In RNA thymine is replaced with uracil (U).
We’ll be using regular expressions to extract out genetic codes.
Using the Azure CLI, create a resource group (if not already created), I’ve named mine
az group create --name azure-search-nar-demo --location westeurope
We’ll be using the following two external python packages in this post, so I’d recommend pip installing them into a python virtual environment:
pip install azureml-sdk
pip install requests
Creating Azure Custom Skill
Azure Custom Skills are just a REST API endpoint call, in which text or image data is passed to an API and insights from this text are extracted and returned from the API.
We’ll be deploying an API using Azure Machine Learning and, while not training any machine learning models in this post, this process will be directly transferable to another workflow using custom trained models.
Create Azure ML Workspace
We’ll need an Azure ML Workspace to deploy our API endpoint, run the cell below to create this.
If you want to use Azure CLI for authentication, you’ll need to install the associated package to your virtual environment with
pip install azure-cli-core.