Resources for Sample Data Set for AI model training

Hi All,

I am working on data extraction from various document types using AI Center and Document Understanding. Are there any open-source resources where I can find a large number of sample documents from any sector, which I could use to train models and build end-to-end extraction processes?

For example:

  1. Insurance-related resource sites.
  2. Real-estate-related resource sites.
  3. Any finance-sector-related resource sites.

Also, I have already tried invoice-related processes, so invoice-related resources would not be of much use.

Thank You

Hi @Mona_Kumari, thank you for your question. You can use our Use Cases Repository for inspiration, and I’m wondering if @zell12 or @Nithinkrishna can assist here with some ideas. Thank you.

Thanks for tagging, @loredana_ifrim.
@Mona_Kumari, you can use Google Dataset Search for exactly this purpose.
Dataset Search (google.com)

Below is an example result from a search for financial invoice documents:
chainyo/rvl-cdip-invoice · Datasets at Hugging Face
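
If it helps, here is a minimal Python sketch of the same idea for the sectors mentioned in the question. It only builds Dataset Search links to open in a browser. The base URL and the `query` parameter are assumptions based on how the Dataset Search page appears to form its links, and the helper names are hypothetical. The optional loader assumes the Hugging Face `datasets` package is installed and that the dataset above can be downloaded as-is.

```python
from urllib.parse import urlencode

# Assumption: Dataset Search lives at this address and takes a `query` parameter.
DATASET_SEARCH_BASE = "https://datasetsearch.research.google.com/search"


def build_dataset_search_url(query: str) -> str:
    """Return a Dataset Search link for a free-text query (hypothetical helper)."""
    return f"{DATASET_SEARCH_BASE}?{urlencode({'query': query})}"


def load_example_dataset(name: str = "chainyo/rvl-cdip-invoice"):
    """Download a dataset from the Hugging Face Hub (needs network access)."""
    # Imported lazily so the URL helper works without the package installed.
    from datasets import load_dataset  # pip install datasets

    return load_dataset(name)


if __name__ == "__main__":
    # Sectors taken from the original question.
    for sector in ("insurance", "real estate", "finance"):
        print(build_dataset_search_url(f"{sector} documents"))
```

Whatever a search returns, check the license of each dataset before using it for model training.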

Hi @Mona_Kumari

Please check your DM.

Thanks
#nK