|
Relational Efficiencies and Tuning
"This report from the DB2 or Oracle table(s) just runs too long!" What's "too long" depends, of course, on the environment. In WebFOCUS, it could be a minute or until the server times out. If a mainframe FOCUS request runs in batch, the question is usually how large a time window is required, in order for subsequent jobs to run. But online, it's a question of how long the user is willing to wait. In this article, I will try to help you pinpoint the nature of the problems, and suggest possible solutions.
To investigate where the problem(s) might be, let's step back and review the process for a TABLE request.
- Analysis phase: This is where FOCUS reads the MFD and the request and determines:
- What types of files are used?
- What JOINs are in effect?
- What DEFINEs must be evaluated, and WHEN?
- What is the level of aggregation, etc?
- Retrieval Phase: SELECT statements are issued to retrieve records for the report.
- Logical structure What JOINs are in effect and used for the request?
- Are all of the logical database components of the same type (i.e., DB2)?
- Can we pass the JOIN to the relational engine, so there will only be one SELECT? If not, there will be SELECT statements for each table.
- Selection criteria WHERE tests and Filters
- Is the selection based on a "key" or unique index?
- Can the selection criteria be passed to the engine?
- Temporary fields DEFINEs are evaluated.
- If the DEFINE can be passed to the relational engine it will be a more efficient process, especially if the temporary field is part of the selection process (IF or WHERE tests on a DEFINEd field)
- If the DEFINE cannot be passed, then all records are returned to FOCUS, where a temporary field is created. This affects not only selection, but may also affect sorting.
- Sorting and aggregation
- Relational engines have built in sort and aggregation facilities, so if aggregation can be performed by the relational engine, then the resulting answer set should be significantly smaller, and the operation more efficient.
- Output phase
- What styling options are used? Must FOCUS style each line, or are Cascading Style Sheets applied?
- Must the user wait until the entire report is complete before seeing output?
- How many records are actually displayed?
Let's assume that OPTIMIZATION is ON, (the default), which means that the interface attempts to fully optimize all requests. Only if DBMS optimization cannot be used will FOCUS perform the remaining actions, which generally means that more data is returned, and more work then done locally.
Here we want to identify report-request components that cannot be passed to the underlying relational engine.
DEFINE statements be on the lookout for:
 |
User-written subroutines, or FOCUS functions |
 |
Self-referencing calculations (COUNTER = COUNTER + 1) |
 |
Strong concatenation |
 |
DECODE operations |
DEFINE optimization hint where possible, change to COMPUTE and pass the components.
JOIN operations be on the lookout for:
 |
Different file suffixes |
 |
KEYS=n in the Access File Description that are not specified correctly |
 |
DEFINE-based JOINs based on a DEFINE that could not be optimized |
JOIN optimization hints:
 |
If both files are large, create a HOLD FORMAT DB2 file for nonrelational JOIN components |
 |
If JOIN is done for selection (JOIN EID IN flat file to ALL EID in DB2), consider WHERE IN FILE instead |
SORTs/Aggregation be on the lookout for:
 |
FRL FOR clauses, or BY field ROWS value OVER value
|
 |
IN-GROUPS-OF |
 |
Aggregation of Alpha field |
 |
BY/ACROSS defined field where the DEFINE could not be optimized |
 |
Aggregation cannot be passed with multi-verb requests |
 |
COUNT with MISSING = ON |
SORT/Aggregation optimization hints:
 |
Consider COMPUTE instead of DEFINE Sort by "real field" and then COMPUTE |
 |
If a multi-verb request is required, consider using a single verb, creating a HOLD file, and doing the TABLE request against the HOLD file |
To assist you in optimizing reports, there are several useful debugging and tracing tools. Prior to release 7.2, simply allocating FSTRACE3 and/or FSTRACE4 turned on specific trace facilities.
 |
FSTRACE3 SQL optimization messages and warnings |
 |
FSTRACE4 SQL statements generated by the Analysis Phase |
Beginning with release 7.2, the tracing facilities were combined for all of the interfaces/data adapters, and can now be turned on/off with SET commands:
SET TRACEON=component/adaptor/destination |
where destination can be CLIENT or FSTRACE. So, to see the equivalent of FSTRACE3 (SQL optimization messages):
DYNAM ALLOC F FSTRACE DA …
SET TRACEOFF = ALL
SET TRACEON = SQLAGGR/FSTRACE |
To turn on tracing for a single data adapter (for example, DB2):
SET TRACEON=SQLAGGR/DB2/FSTRACE |
For additional information about trace options, see:
http://techsupport.informationbuilders.com/tech/ibm/ibm_tmo_tm7989.html
Up to this point in our optimization review, we have been analyzing the request itself, and not the actual retrieval, which you can turn off:
SET XRETRIEVAL = OFF and then EX fex |
Sometimes, even with everything optimized a report still runs too long. In that case, you might want to turn on STMTRACE, which shows the actual SQL code generated. Again, you can do this with XRETRIEVAL OFF as it's the SQL code and not the actual data that's critical here.
Remember too, that new releases of FOCUS/WebFOCUS always bring new features that may improve your efficiency. OUTJOIN may not be passed specifically to the interfaces, and that might be critical for your application.
Good luck, and good efficiency.
|