Create custom models using OCI Document Understanding and Label Studio

About This Workshop

Label Studio is open source software that provides an interface for annotating images used to train custom models. Document Understanding (DU) is an Oracle Cloud Infrastructure (OCI) service that uses AI-powered tools to help businesses manage and process documents, and extract insights from them, at scale. DU provides pre-trained AI models for document classification, key-value extraction, table extraction, and the extraction of other key elements for document processing. It also lets you create custom models to meet specific customer needs. These custom models can be trained to recognize unique document types, industry-specific terms, or specific data fields, so that the solution adapts to the company’s processes and requirements.

This workshop guides you through configuring and using Label Studio to train your own key-value extraction model in OCI Document Understanding. The running example is extracting key values from invoices that the pre-trained DU model does not extract. The workshop covers the full process, from setting up the Label Studio environment to creating, training, and testing the custom model. Synthetic examples are provided to help you follow the process end to end.

Workshop Info

2 hours
  • Lab 0: Download and install Label Studio and dependencies
  • Lab 1: Prepare files for a Label Studio dataset
  • Lab 2: Set up OCI OCR integration for pre-annotation
  • Lab 3: Set up Label Studio
  • Lab 4: Label your dataset
  • Lab 5: Export the dataset from Label Studio to OCI Object Storage
  • Lab 6: Use the dataset to train a custom Document Understanding model

Prerequisites

  • Basic understanding of Python and terminal commands
