XFOCUS Databases

by Noreen Redden

Lately, questions on XFOCUS databases (SUFFIX=XFOC) seem to be pouring in from customers, so I felt this was a good time to discuss them. 

First, let’s address storage. 

Like a FOCUS file, an XFOC database consists of multiple pages, where each page contains instances of a single SEGMENT, instances of a specific internal index, or values of text fields. In addition to those pages, the first page contains only the FDT (road map to pages for segments, indexes, etc.).  Since we no longer limit the size of the SEGNAME or INDEX fieldname, the FDT no longer can co-exist with the instances for the root segment.

In a FOCUS database, each page is 4K, with a maximum of 512 pages or 2Gig. An XFOCUS database, on the other hand, consists of a maximum of 1,000 16K pages, to a maximum of 16Gig for a single dataset. (See partitioning for storing up to 4 terabytes of data).

Segment instances are now divided into three parts:

1) Pointer to start of instance (located at the beginning of the page)
4bytes

2) Pointer from instance to instance                                   
4 bytes/pointer

            Parent to Child  (includes static cross-references)
Child to Parent
Chain Pointer     (even for Unique segments)

            Pointer to TX information                                       
8 bytes

3) Data storage
Alpha field      (An)                                                 n bytes
Integer                                                                     4 bytes
Packed                                                                    8 bytes or 16 bytes
Floating Point (single Precision)                              4 bytes
Floating Point (double Precision)                             8 bytes
TX                                                          actual length on separate page

As with FOCUS files, use EX CALCFILE to get an estimate of file size based on segment population. 

Partitioning
There are two types of partitioning -- storing the data in an FOC database or XFOC database in multiple datasets. Consider a database that contains the following information:

DATE    ACCOUNT#  ACCOUNTNAME  REPID  REPNAME  SALES

We can partition this data horizontally and/or vertically.

Horizontal partitioning is better known to the FOCUS community as LOCATION. In this case, a SEGMENT and all of its instances are stored on a separate dataset, by specifying LOCATION= in the Master File Description. 

Additionally, vertical partitioning allows us to partition the data based on a value. In my example, we could store each year in its own file. So, if DATE is from 20070101 to current, it is stored in SALES, from 20060101 to 200671231, in SALES2006, etc. 

Typically, these separate datasets (each one of which can now be 16Gig, are referenced via USE

USE c:\ibi\apps\sales\sales.xfoc AS SALES
C:\ibi\apps\sales\sales2006.xfoc AS SALES

END

More useful perhaps is the ACCESS file, which will perform the same concatenation for reporting, without a USE having to be issued:

Sales.mas:     FILE=SALES,SUFFIX=XFOC,ACCESS=SALES
Sales.acx:      MASTER=SALES,$
DATA=\ibi\apps\sales\sales.xfoc
WHERE=DATE FROM 20070101,$
DATA=\ibi\apps\sales\sales2006.XFC,
WHERE=DATE FROM 20060101 TO 20061231,$

The addition of WHERE tests is “intelligent partitioning.” A report request that contains a WHERE DATE EQ 20070513 will not even open the second partition.

Remember, however, that just as with the USE concatenation, a MODIFY or MAINTAIN will use only the first of the partitions, and if there is a need to maintain data in all partitions, separate MFDs must be established for prior years.

Indexes

Finally, we come to indexes.

Internal indexes, noted by INDEX=I in the MFD, are modified/changed as the original database is changed. They are always up to date, but large updates may be slow in order to maintain those indexes. Additionally, the internal index does take pages in the database, and therefore must be accounted for in the 16gig limitation. Another limitation of internal indexes is JOIN. You cannot JOIN to a FOC or XFOC database that consists of multiple vertical partitions using internal indexes because they are not concatenated.

Finally, a relatively new feature for FOC and XFOC files is the Multi-Dimensional Index, which will index multiple dimensions in one index, giving rapid retrieval for a WHERE test that requires multiple dimensions. For instance, in the example above, we might have a dimension of DATE, REPID and ACCOUNT. Any report request that performs a WHERE DATE EQ 20070501 AND REPID EQ 1234567 will be able to utilize the MDI. Without the MDI, the WHERE test would use the INDEX on date, for instance, and then have to scroll through all of the REPIDs associated with that date. The MDI, which can be (and should be) stored in multiple datasets is indicated in the ACCESS file:

MDI=HISMDI1, TARGET_OF = SEGSALES,$
     DIM=DATE,$
     DIM=ACCOUNT,$
     DIM=REPID,$
     DIM=BRANCH,$

The MDI is maintained separately, and a separate REBUILD must be run to create/update the MDI. This is not useful, therefore in a Sink machine application.

Accessing XFOC databases is basically the same as reporting against any data source, especially where dataset names, MDI information, etc., are stored in the metadata.  However, there are two sets that I would recommend, and two that are required.

Required:

SET MASTER = NEW
SET XFC = ON  

Recommended:

SET JOINOPT = NEW           ß- 
Especially in cross-reference situations, this sets a correct retrieval call from the host file to cross-reference files (both dynamic and static cross-referencing)

SET XFCBINS = 256 (or higher). An XFOC bin is 4K, so obviously 4 must be linked together in order to support the 16K page. The default (based on release of FOCUS) is either 16 or 64, which means there will be a lot of thrashing if more than 4 or 16 pages in the XFOC database must be accessed and re-accessed. Of course, the XFCBINS do come from memory, so start with 256, which will give 64 pages in memory at any one time.

previous next