Splitting text lines and populating views

create table incoming_data (
  d varchar2(50)
);

insert into incoming_data values ('   13002000'                 );
insert into incoming_data values ('00000300   04001000'         );
insert into incoming_data values ('00000300 04001000   12001700');
insert into incoming_data values (''                            );
insert into incoming_data values ('        '                    );
insert into incoming_data values ('      07001500'              );
insert into incoming_data values ('    06001800'                );
insert into incoming_data values ('             21000600'       );

Each record consists of zero or more eight-digit blocks. The task is to split a line into more records of which each contains exactly one eight-digit block. Then, after the split, a view must be populated.

The functionality that uses the regular expressions will be programmed in a function (named split_records). This function returns a table of char(8) which is a user defined type (whose name is table_char_8). It must first be created:

create type table_char_8 as table of char(8);
/

create or replace function split_records
  (in_line in incoming_data.d%type)
  return table_char_8
as
  ret table_char_8;
  s   varchar2(50);
  p   varchar2(50);
  a   char(4);
  b   char(4);
  i   number:=1;
begin
  ret :=table_char_8();

  p := '[[:digit:]]{8}';
  s := regexp_substr(in_line, p);

  while s is not null loop
    ret.extend;
    ret(ret.count) := s;

    i := regexp_instr(in_line,p,i);
    i := i+8;

    s := regexp_substr(in_line, p,i);
  end loop;
  
  return ret;
  
end split_records;
/

The variable p defines the search pattern. In this case, it says, it must be eight digits in a row ([[:digit:]]{8}). This pattern is then repeatedly applied on regexp_substr. regexp_substr returns the string that matches p. In this case, it returns the next eight digit block.

The variable i is used to remember the position of the last match. We do not want to match the same eight digit block twice, so we have to tell regexp_substr at what position it has to start with the match. regexp_instr returns the position of a match which will be assigned to i. i is then incremented by 8 (length of digit block). regexp_substr returns null when it does not match anything. If it returns null, the while loop will exit.

select column_value from incoming_data, table (split_records(d));

create view v_incoming_data as
  select column_value piece
  from incoming_data, table (split_records(d));

		Splitting text lines and populating views
		Someone has asked an interesting question on comp.databases.oracle.server. He has the following data: create table incoming_data ( d varchar2(50) ); insert into incoming_data values (' 13002000' ); insert into incoming_data values ('00000300 04001000' ); insert into incoming_data values ('00000300 04001000 12001700'); insert into incoming_data values ('' ); insert into incoming_data values (' ' ); insert into incoming_data values (' 07001500' ); insert into incoming_data values (' 06001800' ); insert into incoming_data values (' 21000600' ); Each record consists of zero or more eight-digit blocks. The task is to split a line into more records of which each contains exactly one eight-digit block. Then, after the split, a view must be populated. On Oracle 10g the split can be done with regular expressions. The functionality that uses the regular expressions will be programmed in a function (named split_records). This function returns a table of char(8) which is a user defined type (whose name is table_char_8). It must first be created: create type table_char_8 as table of char(8); / create or replace function split_records (in_line in incoming_data.d%type) return table_char_8 as ret table_char_8; s varchar2(50); p varchar2(50); a char(4); b char(4); i number:=1; begin ret :=table_char_8(); p := '[[:digit:]]{8}'; s := regexp_substr(in_line, p); while s is not null loop ret.extend; ret(ret.count) := s; i := regexp_instr(in_line,p,i); i := i+8; s := regexp_substr(in_line, p,i); end loop; return ret; end split_records; / The variable p defines the search pattern. In this case, it says, it must be eight digits in a row ([[:digit:]]{8}). This pattern is then repeatedly applied on regexp_substr. regexp_substr returns the string that matches p. In this case, it returns the next eight digit block. The variable i is used to remember the position of the last match. We do not want to match the same eight digit block twice, so we have to tell regexp_substr at what position it has to start with the match. regexp_instr returns the position of a match which will be assigned to i. i is then incremented by 8 (length of digit block). regexp_substr returns null when it does not match anything. If it returns null, the while loop will exit. This function can now be applied on the table: select column_value from incoming_data, table (split_records(d)); COLUMN_V -------- 13002000 00000300 04001000 00000300 04001000 12001700 07001500 06001800 21000600 column_value is, if you want, an alias for the unnamed column that is returned by split_records. Now, it is easily possible to create the view. create view v_incoming_data as select column_value piece from incoming_data, table (split_records(d));


	René Nyffenegger's collection of things on the web
	René Nyffenegger on Oracle - Most wanted - Feedback - Follow @renenyffenegger