I am experimenting with the Keyword Based Classifier and have added hundreds of keywords to it so far. I would like an easy way to extract all of them, grouped by document type. I know the keywords are saved in the classifier JSON file, but that file also contains other information, so it is not easy to pull the keywords out of it. I want to extract them so I can see which words I have for each document type, analyse them, and compare the categories.
Do you have any idea how to extract them easily?
Thank you in advance,
Adrian Rokicki
You’d have to read that JSON file into a variable as text, then parse the JSON and write it out to Excel or whatever you want for easier readability.
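To make that a bit more concrete, here is a rough sketch in plain Python (outside of a UiPath workflow). I don't know the exact layout of your classifier file, so the field names `DocumentTypeId` and `Keywords` are only placeholders: open your JSON, check what the keys are really called, and change the two constants. The script walks the whole JSON tree, so it should not matter how deeply the entries are nested, and it writes a CSV file that Excel can open.

```python
import csv
import json

# ASSUMPTION: these key names are hypothetical placeholders.
# Inspect your own classifier JSON and adjust them to match.
DOC_TYPE_KEY = "DocumentTypeId"
KEYWORDS_KEY = "Keywords"


def find_keyword_entries(node):
    """Recursively yield (document_type, keyword) pairs from parsed JSON."""
    if isinstance(node, dict):
        keywords = node.get(KEYWORDS_KEY)
        if DOC_TYPE_KEY in node and isinstance(keywords, list):
            for kw in keywords:
                # Keep plain strings as-is; serialise anything else so nothing is lost.
                text = kw if isinstance(kw, str) else json.dumps(kw)
                yield (str(node[DOC_TYPE_KEY]), text)
        # Keep descending: entries may be nested at any depth.
        for value in node.values():
            yield from find_keyword_entries(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_keyword_entries(item)


def export_keywords(json_path, csv_path):
    """Read the classifier JSON and write one row per (document type, keyword)."""
    # utf-8-sig also copes with files that start with a byte order mark.
    with open(json_path, encoding="utf-8-sig") as f:
        data = json.load(f)
    rows = sorted(set(find_keyword_entries(data)))
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["document_type", "keyword"])
        writer.writerows(rows)
    return rows
```

You would call it with something like `export_keywords("classifier.json", "keywords.csv")` (both file names are just examples). Once the pairs are in a sheet, you can filter or pivot by document type to compare the categories.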
As you are in the early stages of classifying with the keyword classifier, I would suggest you try building a separate classifier JSON file per document type and checking the classification results. It's a kind of trial-and-error method: iterate through the files and try to build a table with the file extension, the classification result, and the keywords used per classifier.
It is not easy to extract the actual keywords from the classifier JSON file. As the number of keywords grows, or in case you start using a re-trainable classifier, the effort of retrieving these keywords increases further.