Wednesday, December 14, 2011

Sqlldr Interview questions


What is SQL*Loader and what is it used for?

SQL*Loader is a bulk loader utility used for moving data from external files into the Oracle database. Its syntax is similar to that of the DB2 Load utility, but comes with more options. SQL*Loader supports various load formats, selective loading, and multi-table loads.

How does one use the SQL*Loader utility?

One can load data into an Oracle database by using the sqlldr (sqlload on some platforms) utility. Invoke the utility without arguments to get a list of available parameters. Look at the following example:
        sqlldr scott/tiger control=loader.ctl
This sample control file (loader.ctl) will load an external data file containing delimited data:
        load data
          infile 'c:\data\mydata.csv'
                               into table emp
          fields terminated by "," optionally enclosed by '"'                          
          ( empno, empname, sal, deptno )
The mydata.csv file may look like this:
              10001,"Scott Tiger", 1000, 40
              10002,"Frank Naude", 500, 20
Another Sample control file with in-line data formatted as fix length records. The trick is to specify "*" as the name of the data file, and use BEGINDATA to start the data section in the control file.
        load data
          infile *
          replace
          into table departments
          (  dept     position (02:05) char(4),
             deptname position (08:27) char(20)
          )
        begindata
          COSC  COMPUTER SCIENCE
          ENGL  ENGLISH LITERATURE
          MATH  MATHEMATICS
          POLY  POLITICAL SCIENCE

Is there a SQL*Unloader to download data to a flat file?

Oracle does not supply any data unload utilities. However, you can use SQL*Plus to select and format your data and then spool it to a file:
        set echo off newpage 0 space 0 pagesize 0 feed off head off trimspool on
        spool oradata.txt
        select col1 || ',' || col2 || ',' || col3
        from   tab1
        where  col2 = 'XYZ';
        spool off
Alternatively use the UTL_FILE PL/SQL package:
        rem Remember to update initSID.ora, utl_file_dir='c:\oradata' parameter
        declare
           fp utl_file.file_type;
        begin
           fp := utl_file.fopen('c:\oradata','tab1.txt','w');
           utl_file.putf(fp, '%s, %s\n', 'TextField', 55);
           utl_file.fclose(fp);
        end;
        /
You might also want to investigate third party tools like SQLWays from Ispirer Systems, TOAD from Quest, or ManageIT Fast Unloader from CA to help you unload data from Oracle.

Can one load variable and fix length data records?

Yes, look at the following control file examples. In the first we will load delimited data (variable length):
   LOAD DATA
   INFILE *
   INTO TABLE load_delimited_data
   FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
   TRAILING NULLCOLS
   (  data1,
      data2
   )
   BEGINDATA
   11111,AAAAAAAAAA
   22222,"A,B,C,D,"
If you need to load positional data (fixed length), look at the following control file example:
   LOAD DATA
   INFILE *
   INTO TABLE load_positional_data
   (  data1 POSITION(1:5),
      data2 POSITION(6:15)
   )
   BEGINDATA
   11111AAAAAAAAAA
   22222BBBBBBBBBB

Can one skip header records load while loading?

Use the "SKIP n" keyword, where n = number of logical rows to skip. Look at this example:
   LOAD DATA
   INFILE *
   INTO TABLE load_positional_data
   SKIP 5
   (  data1 POSITION(1:5),
      data2 POSITION(6:15)
   )
   BEGINDATA
   11111AAAAAAAAAA
   22222BBBBBBBBBB

Can one modify data as it loads into the database?

Data can be modified as it loads into the Oracle Database. Note that this only applies for the conventional load path and not for direct path loads.
   LOAD DATA
   INFILE *
   INTO TABLE modified_data
   (  rec_no                      "my_db_sequence.nextval",
      region                       CONSTANT '31',
      time_loaded             "to_char(SYSDATE, 'HH24:MI')",
      data1        POSITION(1:5)  ":data1/100",
      data2        POSITION(6:15) "upper(:data2)",
      data3        POSITION(16:22)"to_date(:data3, 'YYMMDD')"
   )
   BEGINDATA
   11111AAAAAAAAAA991201
   22222BBBBBBBBBB990112
 
   LOAD DATA
   INFILE 'mail_orders.txt'
   BADFILE 'bad_orders.txt'
   APPEND
   INTO TABLE mailing_list
   FIELDS TERMINATED BY ","
   (  addr,
      city,
      state,
      zipcode,
      mailing_addr   "decode(:mailing_addr, null, :addr, :mailing_addr)",
      mailing_city   "decode(:mailing_city, null, :city, :mailing_city)",
      mailing_state
   )

Can one load data into multiple tables at once?

Look at the following control file:
   LOAD DATA
   INFILE *
   REPLACE
   INTO TABLE emp
        WHEN empno != ' '
   ( empno  POSITION(1:4)   INTEGER EXTERNAL,
     ename  POSITION(6:15)  CHAR,
     deptno POSITION(17:18) CHAR,
     mgr    POSITION(20:23) INTEGER EXTERNAL
   )
   INTO TABLE proj
        WHEN projno != ' '
   (  projno POSITION(25:27) INTEGER EXTERNAL,
      empno  POSITION(1:4)   INTEGER EXTERNAL
   )

Can one selectively load only the records that one need?

Look at this example, (01) is the first character, (30:37) are characters 30 to 37:
   LOAD DATA
   INFILE  'mydata.dat' BADFILE  'mydata.bad' DISCARDFILE 'mydata.dis'
   APPEND
   INTO TABLE my_selective_table
   WHEN (01) <> 'H' and (01) <> 'T' and (30:37) = '19991217'
   (
      region              CONSTANT '31',
      service_key         POSITION(01:11)   INTEGER EXTERNAL,
      call_b_no           POSITION(12:29)   CHAR
   )


Can one skip certain columns while loading data?

One cannot use POSTION(x:y) with delimited data. Luckily, from Oracle 8i one can specify FILLER columns. FILLER columns are used to skip columns/fields in the load file, ignoring fields that one does not want. Look at this example:
        LOAD DATA
        TRUNCATE INTO TABLE T1
        FIELDS TERMINATED BY ','
        ( field1,
          field2 FILLER,
          field3
        )

How does one load multi-line records?

One can create one logical record from multiple physical records using one of the following two clauses:
  • CONCATENATE: - use when SQL*Loader should combine the same number of physical records together to form one logical record.
  • CONTINUEIF - use if a condition indicates that multiple records should be treated as one. Eg. by having a '#' character in column 1.

How can get SQL*Loader to COMMIT only at the end of the load file?

One cannot, but by setting the ROWS= parameter to a large value, committing can be reduced. Make sure you have big rollback segments ready when you use a high value for ROWS=.

Can one improve the performance of SQL*Loader?

  1. A very simple but easily overlooked hint is not to have any indexes and/or constraints (primary key) on your load tables during the load process. This will significantly slow down load times even with ROWS= set to a high value.
  2. Add the following option in the command line: DIRECT=TRUE. This will effectively bypass most of the RDBMS processing. However, there are cases when you can't use direct load. Refer to chapter 8 on Oracle server Utilities manual.
  3. Turn off database logging by specifying the UNRECOVERABLE option. This option can only be used with direct data loads.
  4. Run multiple load jobs concurrently.

What is the difference between the conventional and direct path loader?

The conventional path loader essentially loads the data by using standard INSERT statements. The direct path loader (DIRECT=TRUE) bypasses much of the logic involved with that, and loads directly into the Oracle data files. More information about the restrictions of direct path loading can be obtained from the Utilities Users Guide. s

7 comments:

  1. Hey Man..Thats some awesome stuff presented with so ease.. Quite useful..

    ReplyDelete
  2. very useful...useful than many videos on youtube . Many Thanks !!

    ReplyDelete
  3. thanks..for sharing
    awesome stuff

    ReplyDelete
  4. Its really awesome for the freshers & experienced candidates.

    ReplyDelete