René Nyffenegger's collection of things on the web | |
René Nyffenegger on Oracle - Most wanted - Feedback
- Follow @renenyffenegger
|
October 18, 2005: On storing hierarchical data | ||
The canonical approach to store hierachical data is to create a
self referencing table...
create table hierarchy_tbl ( id number primary key, parent_id references hierarchy_tbl, data varchar2(30) );
...and then to use Oracle's start with ... connnect by idiom to select from it.
Unfortunatly, this approach causes a problem if a subtree can be assigned to more than one node. For example, let's assume you sell
three products: prod_1, prod_2 and prod_3. In order to produce these products, you buy things (I'll call these things items) from
other vendors. These items are then assembled into compositions. These compositions can then be assembled into further compositions.
Such a composition might not only consist of other compositions but also of other items. Finally, the compositions (possibly again along with items)
are then assembled into a product.
As can be seen, the traditional approach is not ideal for this because data would have to be stored redundantly. As always, an example makes this clearer.
Here's the item table which stores the items bought from a different vendor. An item is idenfied by its id and has a txt attribute
for human readability:
create table item ( id number primary key, txt varchar2(25) not null );
Each composition is stored in the composition table. This table, like item, has an id and a txt column.
create table composition ( id number primary key, txt varchar2(25) not null );
Lastly, the part_to_composition table. This table allows to assign parts to a composition. A part is either an item
(identified by id_item) or a composition (identified by id_non_item). In order to make sure that this rule is adhered to, a
check constraint is put in place on the table.
id_composition determines which composition the item or composition is assigned to.
cnt specifies how many parts are assigned to the compositon.
create table part_to_composition ( id_composition references composition not null, id_item references item, id_non_item references composition, cnt number(5) not null check (cnt > 0), check (id_item is null and id_non_item is not null or id_item is not null and id_non_item is null) );
With these tables, we're ready to insert some items:
insert into item values (1, 'item 1'); insert into item values (2, 'item 2'); insert into item values (3, 'item 3'); insert into item values (4, 'item 4'); insert into item values (5, 'item 5'); insert into item values (6, 'item 6');
Before we can assign parts to compositiones, the compositions must be recorded:
insert into composition values (1, 'comp 1'); insert into composition values (2, 'comp 2'); insert into composition values (3, 'comp 3'); insert into composition values (4, 'comp 4'); insert into composition values (5, 'comp 5'); insert into composition values (6, 'prod 1'); insert into composition values (7, 'prod 2'); insert into composition values (8, 'prod 3');
Now, parts (remember: a part is either an item or a composition) can be assigned to other compositions. Because a product is conceptually
just another part, the final products are defined with the following insert statements:
insert into part_to_composition values (1, 4, null, 2); insert into part_to_composition values (1, 6, null, 3); insert into part_to_composition values (2, 1, null, 5); insert into part_to_composition values (2, 4, null, 1); insert into part_to_composition values (2, 5, null, 3); insert into part_to_composition values (3, 2, null, 2); insert into part_to_composition values (3, null, 1, 2); insert into part_to_composition values (4, null, 2, 1); insert into part_to_composition values (4, 3, null, 4); insert into part_to_composition values (4, 6, null, 1); insert into part_to_composition values (5, null, 3, 1); insert into part_to_composition values (5, 2, null, 4); insert into part_to_composition values (6, null, 5, 1); insert into part_to_composition values (6, null, 4, 1); insert into part_to_composition values (6, 1, null, 1); insert into part_to_composition values (7, null, 5, 1); insert into part_to_composition values (7, null, 3, 2); insert into part_to_composition values (8, null, 5, 1); insert into part_to_composition values (8, 3, null, 2);
With the data inserted, the following query can be used to show the items that belong to prod_2
(
start with id_composition = 7 ):
select substr(lpad(' ', 2*level-1) || nvl(composition.txt, item.txt), 1, 40) txt, ' [' || cnt || 'x' || '] ' cnt from part_to_composition left join item on part_to_composition.id_item = item.id left join composition on part_to_composition.id_non_item = composition.id start with id_composition = 7 connect by id_composition = prior id_non_item;
Here's the query's output. comp 3 appears twice: once as a component to comp 5 and once as a component to the product itself.
Also, the subtrees of comp 3 are in both cases identical:
TXT CNT ---------------------------------------- --------- comp 3 [2x] comp 1 [2x] item 6 [3x] item 4 [2x] item 2 [2x] comp 5 [1x] comp 3 [1x] comp 1 [2x] item 6 [3x] item 4 [2x] item 2 [2x] item 2 [4x] Update October 27, 2005
This article has a sequel:
On using triggers to keep summary data in the nodes of hierachical data in sync.
More on OracleThis is an on Oracle article. The most current articles of this series can be found here.
|