René Nyffenegger's collection of things on the web | |
René Nyffenegger on Oracle - Most wanted - Feedback
- Follow @renenyffenegger
|
Splitting text lines and populating views | ||
Someone has asked an interesting question
on comp.databases.oracle.server. He has the following data:
create table incoming_data ( d varchar2(50) ); insert into incoming_data values (' 13002000' ); insert into incoming_data values ('00000300 04001000' ); insert into incoming_data values ('00000300 04001000 12001700'); insert into incoming_data values ('' ); insert into incoming_data values (' ' ); insert into incoming_data values (' 07001500' ); insert into incoming_data values (' 06001800' ); insert into incoming_data values (' 21000600' );
Each record consists of zero or more eight-digit blocks. The task is to split a line into more records of which each contains exactly one eight-digit block.
Then, after the split, a view must be populated.
On Oracle 10g the split can be done with regular expressions.
The functionality that uses the regular expressions will be programmed in a function (named split_records). This function returns a
table of char(8) which is a user defined type (whose name is table_char_8). It must first be created:
create type table_char_8 as table of char(8); / create or replace function split_records (in_line in incoming_data.d%type) return table_char_8 as ret table_char_8; s varchar2(50); p varchar2(50); a char(4); b char(4); i number:=1; begin ret :=table_char_8(); p := '[[:digit:]]{8}'; s := regexp_substr(in_line, p); while s is not null loop ret.extend; ret(ret.count) := s; i := regexp_instr(in_line,p,i); i := i+8; s := regexp_substr(in_line, p,i); end loop; return ret; end split_records; /
The variable p defines the search pattern. In this case, it says, it must be eight digits in a row ([[:digit:]]{8}). This
pattern is then repeatedly applied on regexp_substr. regexp_substr returns the string
that matches p. In this case, it returns the next eight digit block.
The variable i is used to remember the position of the last match. We do not want to match the same eight digit block twice, so we have
to tell regexp_substr at what position it has to start with the match. regexp_instr returns
the position of a match which will be assigned to i. i is then incremented by 8 (length of digit block). regexp_substr returns
null when it does not match anything. If it returns null, the while loop will exit.
This function can now be applied on the table:
select column_value from incoming_data, table (split_records(d)); COLUMN_V -------- 13002000 00000300 04001000 00000300 04001000 12001700 07001500 06001800 21000600
column_value is, if you want, an alias for the unnamed column that is returned by split_records.
Now, it is easily possible to create the view.
create view v_incoming_data as select column_value piece from incoming_data, table (split_records(d)); |