Best way to share Taxonomy files?

Hi all,

I have several processes that use IntelligentOCR for classification and data extraction. All of them use the same "Keywordlearning.json" and "taxonomy.json" files.

What is the best way for all projects to share / use the same files?

Is there a way to store taxonomies in Orchestrator? It would be great with an activity like “Get Asset” - could be “Get Taxonomy” :slight_smile:

Thoughts / input?

Cheers,
Michael Fray

@MichaelFray One way may be to use serialization and deserialization. You can serialize the Taxonomy to a String and store it in an Asset or in a Queue. You can then retrieve it with Get Transaction Item or Get Asset and deserialize the text back to a Taxonomy type. I have not tried the storing and retrieval part, but I did try converting it by serializing, and that works. It might work for your case as well.

That could work :slight_smile: Thanks for the suggestion!

Any chance you can share your sample? Thanks :star_struck:

@MichaelFray Check this post:

I think the conversion there is from a Dictionary to a String. You can use the Taxonomy instead of the Dictionary, and change the type in the Deserialize Json activity to the Taxonomy type.

Although the example does not use Set Asset or Add Queue Item, you can add that activity, then get the item from Orchestrator and use the Deserialize Json activity on it.

Let me know if you face any issues.

Hello @supermanPunch and @MichaelFray,

You are right, this is the way to do it :slight_smile:

To serialize, just use objTaxonomy.Serialize (or simply grab the content of the taxonomy file). To get an objTaxonomy back from the string, use the class method DocumentTaxonomy.Deserialize(strTaxo).

Same applies to DOM and extractionResults.
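The round trip described above can be sketched outside of UiPath as follows. This is only an illustration of the pattern in Python, not the UiPath API: `shared_store`, `publish_taxonomy`, `load_taxonomy` and the sample taxonomy content are hypothetical names and data, and the plain dict stands in for whatever shared location you pick (asset, queue item, storage bucket). Inside a workflow, the equivalent of the last step would be the DocumentTaxonomy.Deserialize(strTaxo) call mentioned above.

```python
import json

# Hypothetical stand-in for a shared store such as an Orchestrator asset,
# a queue item, or a storage bucket. A plain dict keeps the sketch runnable.
shared_store = {}


def publish_taxonomy(store, name, taxonomy):
    """Serialize the taxonomy to a JSON string and save it under `name`."""
    store[name] = json.dumps(taxonomy)


def load_taxonomy(store, name):
    """Fetch the stored string and deserialize it back into an object."""
    return json.loads(store[name])


# Illustrative content only; this is not the real taxonomy.json schema.
sample = {"DocumentTypes": [{"Name": "Invoice", "Fields": ["Total", "Date"]}]}

publish_taxonomy(shared_store, "SharedTaxonomy", sample)
restored = load_taxonomy(shared_store, "SharedTaxonomy")
```

The point of the design is that only a string crosses the project boundary, so any store that can hold text of that size can serve every project from one copy.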

For Keyword Learning content: grab the content of the learning file and store it as a string wherever it suits you (database, queue, asset, bucket storage, etc.). Then retrieve the content and pass it through the LearningData variable instead of using LearningFilePath.
Even when training the keyword / intelligent keyword classifiers, LearningData is In/Out: it holds the learning data as it was before training, and when training finishes, the same variable contains the new, modified learning data.
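The In/Out behaviour suggests a read, train, write-back cycle against the shared location. Below is a minimal Python sketch of that cycle under stated assumptions: `train_with_shared_learning_data` and `fake_train` are hypothetical helpers, the dict again stands in for the shared store, and the learning data format shown is invented for the example rather than taken from a real learning file.

```python
import json


def train_with_shared_learning_data(store, key, train):
    """Read the learning data string, train on it, and write the result back."""
    learning_data = store.get(key, "{}")   # state before training
    updated = train(learning_data)         # state after training
    store[key] = updated                   # share the update with other projects
    return updated


def fake_train(learning_data):
    """Hypothetical trainer: adds one keyword to an invented data layout."""
    data = json.loads(learning_data)
    data.setdefault("Invoice", []).append("invoice number")
    return json.dumps(data)


# A plain dict stands in for the shared store (asset, queue, bucket, ...).
learning_store = {"KeywordLearning": json.dumps({"Invoice": ["total due"]})}
result = train_with_shared_learning_data(learning_store, "KeywordLearning", fake_train)
```

If several processes train at the same time, the last write wins in this sketch, so some form of locking or merging would be needed in that case.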

Hope this helps!

Ioana

Thanks again!

Many thanks @Ioana_Gligan - very useful!
