I have several processes that use IntelligentOCR for classification and data extraction. I use the same "Keywordlearning.json" and "taxonomy.json" for all projects.
What is the best way for all projects to share / use the same files?
Is there a way to store taxonomies in Orchestrator? It would be great to have an activity like "Get Asset", perhaps "Get Taxonomy".
@MichaelFray One way may be to use serialization and deserialization. You can serialize the Taxonomy to a String and store it in an Asset, or maybe in a Queue. You can then retrieve it using Get Transaction Item or Get Asset and deserialize the text back to the Taxonomy type. I have not tried storing and retrieving it, but I did try converting it by serializing, and that works. It might work for your case as well.
I think the conversion is from a Dictionary to a String. You can use the Taxonomy instead of the Dictionary, and change the type in the Deserialize Json activity to the Taxonomy type.
Although no Set Asset or Add Queue Item activity is used, you can add one, then get the item from Orchestrator and use the Deserialize Json activity on that item.
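To make the round trip concrete, here is a minimal sketch in Python rather than in a UiPath workflow. It only illustrates the pattern described above (serialize to a string, store it centrally, retrieve it, deserialize it). The asset_store dictionary and the set_asset / get_asset helpers are hypothetical stand-ins for Orchestrator, and the taxonomy content is made up; it is not the real taxonomy.json schema.

```python
import json

# Hypothetical stand-in for an Orchestrator asset store (asset name -> text value).
asset_store: dict[str, str] = {}

def set_asset(name: str, value: str) -> None:
    """Store a text value under an asset name (illustrative helper)."""
    asset_store[name] = value

def get_asset(name: str) -> str:
    """Return the text value stored under an asset name (illustrative helper)."""
    return asset_store[name]

# Made-up taxonomy content; the real taxonomy.json structure is not reproduced here.
taxonomy = {"documentTypes": [{"name": "Invoice", "fields": ["Total", "Date"]}]}

# Serialize once to a plain string and store it in the shared location.
set_asset("SharedTaxonomy", json.dumps(taxonomy))

# In any project: retrieve the string and deserialize it back into an object.
restored = json.loads(get_asset("SharedTaxonomy"))
print(restored == taxonomy)
```

In a workflow, the two helper calls would correspond to storing the string and to Get Asset or Get Transaction Item, and json.loads to the deserialization step.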
To serialize, just use objTaxonomy.Serialize (or just grab the content of the taxonomy file), and you can use the class method DocumentTaxonomy.Deserialize(strTaxo) to obtain an objTaxonomy.
The same applies to DOM and extractionResults.
For Keyword Learning content: you can grab the content of the learning file, store it as a string wherever it suits you (database, queue, asset, storage bucket, etc.), then retrieve that content and pass it in through the LearningData variable instead of using the LearningFilePath.
Even for training the keyword / intelligent keyword classifiers, LearningData is In/Out: it holds the learning data as it was before training, and when training finishes, the same variable contains the new, modified learning data.
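As a rough illustration of that In/Out behaviour, here is a small Python sketch, not UiPath code. The train function, the shared_store dictionary and the JSON layout of the learning data are all assumptions made up for the example. The point is only that the same string variable goes in with the old state and comes out with the new one, which can then be written back to the shared location.

```python
import json

def train(learning_data: str, keyword: str, doc_type: str) -> str:
    """Take the learning data before training and return the updated learning data.

    Hypothetical trainer: it just records a keyword for a document type.
    """
    data = json.loads(learning_data) if learning_data else {}
    keywords = data.setdefault(doc_type, [])
    if keyword not in keywords:
        keywords.append(keyword)
    return json.dumps(data)

# Hypothetical shared location (database, queue, asset, storage bucket, ...).
shared_store = {"KeywordLearning": json.dumps({"Invoice": ["invoice number"]})}

# Retrieve the content as a string instead of pointing at a learning file path.
learning_data = shared_store["KeywordLearning"]

# After training, the same variable holds the new, modified learning data.
learning_data = train(learning_data, "total due", "Invoice")

# Write the updated string back so the shared copy reflects the training.
shared_store["KeywordLearning"] = learning_data
```

Writing the string back after each training run is a design choice of this sketch; without it, only the local variable would carry the updated state.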