Sitemap

Sunday, April 5, 2015

UCM: Migration of contents into UCM using SQL* Loader and UTL_FILE

SCENARIO: The client have provided a CSV file which contains the metadata fields of the files that needs to be migrated into UCM. For this, you need to create a batch loader file which will be used by Batch Loader of UCM.

Creation of batch loader file involves 2 steps:
1. Moving data from external  files into a temporary table(here PNC_DATA) using SQL* Loader
2. Creation of UCM batch load file using UTL_FILE package

SQL* Loader:
It accepts the 2 source files:
1. Data CSV file, which will be provided by the client
2. CTL control file. It describes the task that the SQL*Loader is to carry out. The SQL*Loader control file contains information that describes how the data will be loaded. It contains the table name, column data types, field delimiters, etc.

1. Use the below SQL script to create the PNC_DATA table.
CREATE TABLE PNC_DATA
(
  ID             NUMBER,
  DOCTITLE       VARCHAR2(100 BYTE),
  DOCAUTHOR      VARCHAR2(100 BYTE),
  INVOICENUMBER  VARCHAR2(100 BYTE),
  INVOICEDATE    DATE,
  INVOICEAMOUNT  NUMBER,
  VENDORNAME     VARCHAR2(100 BYTE),
  SHIPTONAME     VARCHAR2(100 BYTE),
  ADDRESS        VARCHAR2(100 BYTE),
  ZIP            VARCHAR2(100 BYTE),
  PRIMARYFILE    VARCHAR2(100 BYTE),
  STATUS         VARCHAR2(100 BYTE),
  ERROR_MESSAGE  VARCHAR2(4000 BYTE)
)


ALTER TABLE PNC_DATA ADD (
  CONSTRAINT PNC_DATA_PK PRIMARY KEY (ID));


CREATE SEQUENCE PNC_DATA_SEQ
  MINVALUE 1
  START WITH 1
  INCREMENT BY 1
  CACHE 20;
  
  
CREATE OR REPLACE TRIGGER PNC_DATA_BIR 
BEFORE INSERT ON PNC_DATA 
FOR EACH ROW

BEGIN
  SELECT PNC_DATA_SEQ.NEXTVAL
  INTO   :NEW.ID
  FROM   DUAL;
END;
/

2. Create a control (CTL) file for SQL* Loader.
OPTIONS (SKIP=1)
LOAD DATA 
INFILE 'C:\PNC\mydata.csv'
APPEND
INTO TABLE PNC_DATA
FIELDS TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
 (DOCTITLE, DOCAUTHOR, INVOICENUMBER, INVOICEDATE DATE "MM/DD/YYYY", INVOICEAMOUNT, VENDORNAME, SHIPTONAME, ADDRESS, ZIP, PRIMARYFILE)

3. Get the CSV file from the client. A sample CSV file is attached with this document.
DocTitle|DocAuthor|InvoiceNumber|InvoiceDate|InvoiceAmount|vendorName|ShipToName|Address|Zip|PrimaryFile,,
0000024F|sysadmin|196328182|12/26/2002|2038.66|DELL|JIM MACDONALD|PO 20026306, IRVINE, CA|92618|G:/Projects/PNC/Sample invoice/0000024F.TIF
0000025A|sysadmin|292487395|3/18/2003|11124.15|DELL||16245 Laguna cannyon RD, IRVINE, CA|92618|G:/Projects/PNC/Sample invoice/0000025A.TIF
0000025B|sysadmin|237596557|2/14/2003|1095.22|DELL|JIM MACDONALD|16245  Laguna cannyon RD, IRVINE, CA|92618|G:/Projects/PNC/Sample invoice/0000025B.TIF
0000025C|sysadmin|224533134|1/27/2003|1084|DELL|RECEIVING|16245  Laguna cannyon RD, IRVINE, CA|92618|G:/Projects/PNC/Sample invoice/0000025C.TIF
0000025D|sysadmin|226712637|1/28/2003|20644.92|DELL|RECEIVING|16246  Laguna cannyon RD, IRVINE, CA|92619|G:/Projects/PNC/Sample invoice/0000025D.TIF
0000025E|sysadmin|223872293|1/29/2003|1166.62|DELL|JIM MACDONALD|16247  Laguna cannyon RD, IRVINE, CA|92620|G:/Projects/PNC/Sample invoice/0000025E.TIF

4. Put the CSV data and the CTL file in one directory, and run the sqlldr command from the command prompt (in Windows), under the user which owns the table created above.
sqlldr hr/hr control=C:\import.ctl

Log files will be created, import.log and import.bad. Verify these to make sure any rows were skipped. Otherwise the data should be added into the PNC_DATA table.


UTL_FILE package
1. First we must define a directory object which points to an existing filesystem directory on the server. We must grant the necessary access privilege on the directory object to the user who will perform the extract.
CONNECT / AS SYSDBA
CREATE OR REPLACE DIRECTORY EXTRACT_DIR AS 'C:\Extract';
GRANT READ, WRITE ON DIRECTORY EXTRACT_DIR TO HR;
GRANT EXECUTE ON UTL_FILE TO HR;

2. Compile the PL/SQL specification and then the body of PNC_KA_POC_PKG package. The sample is attached with this document.
CREATE OR REPLACE PACKAGE PNC_KA_POC_PKG
AS
   PROCEDURE MIGRATION_TEXTFILE;
END;
/

CREATE OR REPLACE PACKAGE BODY PNC_KA_POC_PKG
AS
   PROCEDURE PNC_VALIDATE_DATA
   AS
   BEGIN
      UPDATE PNC_DATA
         SET STATUS = 'E',
             ERROR_MESSAGE = ERROR_MESSAGE || ' - SHIPTONAME cannot be NULL'
       WHERE SHIPTONAME IS NULL;

      UPDATE PNC_DATA X
         SET STATUS = 'E',
             ERROR_MESSAGE =
                ERROR_MESSAGE || ' - INVOICENUMBER should be Unique'
       WHERE EXISTS
                (  SELECT COUNT (INVOICENUMBER)
                     FROM PNC_DATA
                 GROUP BY INVOICENUMBER
                   HAVING     INVOICENUMBER LIKE X.INVOICENUMBER
                          AND COUNT (INVOICENUMBER) > 1);

      UPDATE PNC_DATA
         SET STATUS = 'E',
             ERROR_MESSAGE =
                ERROR_MESSAGE || ' - SHIPTONAME format is not correct'
       WHERE INSTR (INVOICENUMBER, ' ') != 0;

      UPDATE PNC_DATA
         SET STATUS = 'E',
             ERROR_MESSAGE = ERROR_MESSAGE || ' - ZIP should be NUMBER'
       WHERE LENGTH (TRANSLATE (ZIP, '0123456789', '')) IS NOT NULL;

      COMMIT;
   END;

   PROCEDURE MIGRATION_TEXTFILE
   AS
      v_file     UTL_FILE.FILE_TYPE;
      cur_date   VARCHAR (30);

      CURSOR CSR_STAGING
      IS
         SELECT *
           FROM PNC_DATA
          WHERE STATUS IS NULL;
   BEGIN
      PNC_VALIDATE_DATA ();

      v_file :=
         UTL_FILE.FOPEN (location       => 'EXTRACT_DIR',
                         filename       => 'pncdata.txt',
                         open_mode      => 'w',
                         max_linesize   => 32767);

      cur_date := TO_CHAR (SYSDATE, 'MM/DD/YY HH12:MI AM');

      FOR cur_rec IN CSR_STAGING
      LOOP
         UTL_FILE.PUT_LINE (v_file, 'dDocTitle=' || cur_rec.DOCTITLE);
         UTL_FILE.PUT_LINE (v_file, 'dDocAuthor=' || cur_rec.DOCAUTHOR);
         UTL_FILE.PUT_LINE (v_file, 'dSecurityGroup=' || 'Public');
         UTL_FILE.PUT_LINE (v_file, 'dDocType=' || 'Invoice');
         UTL_FILE.PUT_LINE (v_file, 'xInvoiceNumber=' || cur_rec.INVOICENUMBER);
         UTL_FILE.PUT_LINE (v_file, 'xInvoiceDate=' || TO_CHAR (cur_rec.INVOICEDATE, 'MM/DD/YY'));
         UTL_FILE.PUT_LINE (v_file, 'xInvoiceAmount=' || cur_rec.INVOICEAMOUNT);
         UTL_FILE.PUT_LINE (v_file, 'xvendorName=' || cur_rec.VENDORNAME);
         UTL_FILE.PUT_LINE (v_file, 'xShipToName=' || cur_rec.SHIPTONAME);
         UTL_FILE.PUT_LINE (v_file, 'xAddress=' || cur_rec.ADDRESS);
         UTL_FILE.PUT_LINE (v_file, 'xZip=' || cur_rec.ZIP);
         UTL_FILE.PUT_LINE (v_file, 'dInDate=' || cur_date);
         UTL_FILE.PUT_LINE (v_file, 'primaryFile=' || cur_rec.PRIMARYFILE);
         UTL_FILE.PUT_LINE (v_file, '<>');
      END LOOP;

      UTL_FILE.FCLOSE (v_file);
   EXCEPTION
      WHEN OTHERS
      THEN
         UTL_FILE.FCLOSE (v_file);
         RAISE;
   END;
END PNC_KA_POC_PKG;
/

The purpose of writing PNC_VALIDATE_DATA method is to validate the data that is present in the table. Sometimes a particular column should contain a unique value, like INVOICENUMBER, while some data might not be in a proper format. For such rows, the status is marked as E (stands for error). Only the validated rows are processed further and are written in the batch loadfile.

3. Execute the PL/SQL Procedure using the below syntax to create the final UCM batch load file (pncdata.txt) in C:\Extract directory.
BEGIN
 PNC_KA_POC_PKG.MIGRATION_TEXTFILE;
END;

No comments:

Post a Comment