Lab 4: Create async jobs and document translation

Introduction

In this lab, you will learn how to create async jobs for analysing text and translating documents.

Estimated Lab Time: 45 minutes

Objectives

In this lab, you will:

  • Learn how to create async jobs for analysing text and translating documents.

Prerequisites

  • A Free Tier or paid tenancy account in OCI (Oracle Cloud Infrastructure).
  • Completion of Lab 2, in which you create a custom Named Entity Recognition model.
  • Familiarity with uploading data to OCI Object Storage.

Policy Setup

Follow these steps to configure required policies.

  • 1. Navigate to Dynamic Groups

    Log into the OCI Cloud Console. Open the hamburger menu in the top-left corner, click Identity & Security, and then select Dynamic Groups under Identity.

    OCI Hamburger menu

  • 2. Create Dynamic Group

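    Click Create Dynamic Group.

    OCI Create policy

    Name the dynamic group language-service-dynamic-group-for-async-jobs (the policy in step 5 refers to it by this name) and enter the following matching rule:

    all {resource.type='ailanguagejob'}
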
  • 3. Navigate to Policies

    Log into the OCI Cloud Console. Open the hamburger menu in the top-left corner, click Identity & Security, and then select Policies under Identity.

    OCI Hamburger menu

  • 4. Create Policy

    Click Create Policy.

    OCI Create policy

  • 5. Create a new policy with the following statements

    To allow the dynamic group created above to access Object Storage in your tenancy, create a policy with the following statement:

    Allow dynamic-group language-service-dynamic-group-for-async-jobs to manage objects in tenancy

    OCI Create policy screen

    For more details on the policies required for the Language Service, refer to the Language Service documentation.
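The matching rule and the policy statement above are plain strings, so they are easy to generate when the same setup is repeated across environments. The following is a minimal sketch, not part of the lab's own code: the helper names `dynamic_group_rule` and `object_access_policy` are hypothetical, and the sketch only formats the two strings shown in steps 2 and 5. It does not call any OCI API.

```python
def dynamic_group_rule(resource_type: str) -> str:
    """Build a matching rule that selects every resource of the given type."""
    return f"all {{resource.type='{resource_type}'}}"


def object_access_policy(dynamic_group: str, scope: str = "tenancy") -> str:
    """Build a policy statement that lets the dynamic group manage objects."""
    return f"Allow dynamic-group {dynamic_group} to manage objects in {scope}"


# Values taken from steps 2 and 5 of the Policy Setup section.
rule = dynamic_group_rule("ailanguagejob")
policy = object_access_policy("language-service-dynamic-group-for-async-jobs")

print(rule)
print(policy)
```

Paste the printed strings into the Create Dynamic Group and Create Policy forms exactly as the steps above describe.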

Task 1: Create an async job for analysing text using a pre-trained model

Follow the steps below to create a job.

  1. Upload the input data to Object Storage:

    1. Download and extract the hotel dataset from this link.

    2. Upload the extracted data to Object Storage:

      • Log into the OCI Console. Open the hamburger menu in the top-left corner, click Storage, and then select Buckets under Object Storage and Archive Storage.

        OCI Hamburger menu

      • Create a bucket and upload the extracted data.

        Upload Objects

      For more details on uploading data to Object Storage, refer to Putting Data into Object Storage.

  2. Log into the OCI Cloud Console. Open the hamburger menu in the top-left corner, click Analytics and AI, and then select Language Service under AI services.

    OCI Language Screen

  3. Select Jobs on the left-hand side of the console.

    Job List

  4. Click Create Job to open a form where you specify the details of the async job.

    Create Job Panel

  5. Specify job properties: Enter the job name, compartment, and job description.

  6. Specify model type to run the job: Select Pretrained sentiment analysis as the feature type, and choose a source language and configuration.

    Pretrained language detection

  7. Specify job input data to run the job: Choose the data type and the bucket to which hotel.csv was uploaded in step 1.

    job-input-data

  8. Specify job output data to run the job: Select the option to store the job result and click Next.

    job-output-data

  9. Create job: Click Create Job to start the job. Wait until the job finishes and is in the SUCCEEDED state.

    job-output-data

  10. Access the job output: After the job has completed successfully, click the Output file location link to open the output folder, and then open the folder named after the job OCID to access the output files.

    job-output-result
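Step 9 above asks you to wait until the job reaches the SUCCEEDED state. When that wait is scripted rather than watched in the Console, it usually becomes a polling loop. The sketch below is a generic illustration under stated assumptions: `get_state` is a hypothetical callable that returns the job's current state as a string (how you obtain it is up to you), and only SUCCEEDED comes from this lab. The other state names are assumed placeholders.

```python
import time
from typing import Callable

# Only SUCCEEDED is named in the lab; FAILED and CANCELED are assumptions.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELED"}


def wait_for_job(get_state: Callable[[], str],
                 interval_seconds: float = 30.0,
                 max_checks: int = 120,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_state() until it returns a terminal state, then return it."""
    for attempt in range(max_checks):
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if attempt < max_checks - 1:
            sleep(interval_seconds)  # pause before the next check
    raise TimeoutError(f"job not finished after {max_checks} checks")


# Demo with simulated states (placeholder names) and no real sleeping.
simulated = iter(["ACCEPTED", "IN_PROGRESS", "SUCCEEDED"])
final_state = wait_for_job(lambda: next(simulated), sleep=lambda _seconds: None)
print(final_state)  # prints SUCCEEDED
```

Passing `sleep` as a parameter keeps the loop testable without real delays. In actual use, leave the default and supply a `get_state` that reads the job status from wherever you track it.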

Task 2: Create an async job for analysing text using a custom model

  1. Upload the input data to Object Storage:

    1. Download the Custom NER offerletter dataset from this link.

      • Extract the zip file contents into a directory.
    2. Upload the dataset files to Object Storage:

      • Log into the OCI Cloud Console. Open the hamburger menu in the top-left corner, click Storage, and then select Buckets under Object Storage and Archive Storage.

        OCI Hamburger menu

      • Create a bucket and upload the Custom NER offerletter data extracted above.

        Upload Objects

      For more details on uploading data to Object Storage, refer to Putting Data into Object Storage.

  2. Log into the OCI Cloud Console. Open the hamburger menu in the top-left corner, click Analytics and AI, and then select Language Service under AI services.

    OCI Language Screen

  3. Select Jobs on the left-hand side of the console.

    Job List

  4. Click Create Job to open a form where you specify the details of the async job.

    Create Job Panel

  5. Specify job properties: Enter the job name, compartment, and job description.

  6. Specify model type to run the job: Select the custom Named Entity Recognition model you created in Lab 2.

    async-job-custom-model

  7. Specify custom model details: Select the model you created in Lab 2. Optionally, specify an existing model endpoint, and then click Next.

    async-job-custom-model-details

  8. Specify job input data to run the job: Select the data type and choose the bucket created in step 1, from which the job will read its input.

    job-input-data

  9. Specify job output data to run the job: Select the option to store job result and click Next.

    job-output-data

  10. Create job: Click Create Job to start the job. Wait until the job finishes and is in the SUCCEEDED state.

    job-output-data

  11. Access the job output: After the job has completed successfully, click the Output file location link to open the output folder, and then open the folder named after the job OCID to access the output files.

    job-output-result
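The last step above locates the results in a folder named after the job OCID. Object Storage object names are flat strings that use "/" as a folder separator, so the same lookup can be expressed as a prefix filter over a list of object names. The sketch below is illustrative only: it assumes the job folder sits directly under the output prefix you chose, and the OCID and file names in the demo are made-up placeholders.

```python
from typing import Iterable, List


def job_output_prefix(output_prefix: str, job_ocid: str) -> str:
    """Return the folder prefix that holds one job's output files."""
    base = output_prefix.strip("/")
    return f"{base}/{job_ocid}/" if base else f"{job_ocid}/"


def job_output_files(object_names: Iterable[str],
                     output_prefix: str,
                     job_ocid: str) -> List[str]:
    """Keep only the object names that live inside the job's output folder."""
    prefix = job_output_prefix(output_prefix, job_ocid)
    return [name for name in object_names
            if name.startswith(prefix) and name != prefix]


# Placeholder values for illustration; real OCIDs and file names will differ.
example_ocid = "ocid1.ailanguagejob.oc1..exampleuniqueid"
bucket_listing = [
    "offerletters/letter1.txt",
    f"results/{example_ocid}/letter1.txt.json",
    "results/another-job/letter1.txt.json",
]
print(job_output_files(bucket_listing, "results", example_ocid))
```

Here only the second entry is printed, because it is the only name that starts with the job's folder prefix.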

Task 3: Create an async job for translating documents

  1. Upload the documents to Object Storage:

    1. Download the translation dataset from this link.

      • Extract the zip file contents into a directory.
    2. Upload the files to Object Storage:

      • Log into the OCI Cloud Console. Open the hamburger menu in the top-left corner, click Storage, and then select Buckets under Object Storage and Archive Storage.

        OCI Hamburger menu

      • Create a bucket and upload the extracted files OCW AI Presentation.pptx and Oracle Cloud Infrastructure Overview.docx.

        Upload Objects

      For more details on uploading data to Object Storage, refer to Putting Data into Object Storage.

  2. Log into the OCI Cloud Console. Open the hamburger menu in the top-left corner, click Analytics and AI, and then select Language Service under AI services.

    OCI Language Screen

  3. Select Jobs on the left-hand side of the console.

    Job List

  4. Click Create Job to open a form where you specify the details of the async job.

    Create Job Panel

  5. Specify job properties: Enter the job name, compartment, and job description.

  6. Specify model type to run the job: Select Pretrained language translation as the feature type and specify the source and target languages.

    async-job-document-translation

  7. Specify job input data to run the job: Select the bucket created in step 1 and the file that you uploaded.

    job-input-data

  8. Specify job output data to run the job: Select the option to store job result and click Next.

    job-output-data

  9. Create job: Click Create Job to start the job. Wait until the job finishes and is in the SUCCEEDED state.

    job-output-data

  10. Access the job output: After the job has completed successfully, click the Output file location link to open the output folder, and then open the folder named after the job OCID to access the output files.

    job-output-result

Task 4: Using Python SDK to create async job and document translation

Download the code file and save it to your working directory.

To learn more about the Python SDK, see Python OCI-Language.
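The downloadable code is not reproduced here. As a separate, small illustration of the Python SDK, the sketch below scripts the upload that step 1 of each task performs in the Console. It is a hedged sketch, not the lab's code: `upload_files` and `main` are hypothetical helper names, `my-lab-bucket` is a placeholder bucket name, and `main` assumes the `oci` package is installed and a default configuration file exists. The SDK calls used are `oci.config.from_file`, `ObjectStorageClient.get_namespace`, and `ObjectStorageClient.put_object`.

```python
from pathlib import Path
from typing import Iterable, List


def upload_files(client, namespace: str, bucket: str,
                 paths: Iterable[str], prefix: str = "") -> List[str]:
    """Upload each local file with client.put_object; return the object names."""
    uploaded = []
    for path in map(Path, paths):
        object_name = f"{prefix}{path.name}"
        with path.open("rb") as handle:
            # put_object(namespace_name, bucket_name, object_name, put_object_body)
            client.put_object(namespace, bucket, object_name, handle)
        uploaded.append(object_name)
    return uploaded


def main() -> None:
    """Upload the Task 3 documents (requires `oci` and a configured profile)."""
    import oci  # imported here so the helper above works without the SDK

    config = oci.config.from_file()  # reads the default OCI config file
    storage = oci.object_storage.ObjectStorageClient(config)
    namespace = storage.get_namespace().data
    names = upload_files(
        storage, namespace, "my-lab-bucket",  # placeholder bucket name
        ["OCW AI Presentation.pptx", "Oracle Cloud Infrastructure Overview.docx"],
    )
    print(names)
```

Call `main()` from the directory that holds the extracted files. Because `upload_files` accepts any object with a `put_object` method, you can try it with a stub client before pointing it at a real bucket.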

Summary

Congratulations!
In this lab, you have learnt how to create async jobs for analysing text and translating documents using the OCI Console and the Python SDK.

Acknowledgements

Authors

  • Raja Pratap Kondamari - Product Manager, OCI Language Service
  • Nitish Kumar Rai - Oracle AI OCI Language Services

Last Updated By/Date

  • Nitish Kumar Rai - Oracle AI OCI Language Services, March 2024
