You might achieve the best
possible performance when using threaded Reads by specifying the
DBSLICE=
data set option for DB2 in your SAS
operation. This is especially true if your DB2 data is evenly distributed
across multiple partitions in a DB2 Enterprise Extended Edition (EEE)
database system. When you create a DB2 table under the DB2 EEE model,
you can specify the partitioning key that you want to use by appending
the clause
PARTITIONING KEY(column-name)
to
your CREATE TABLE statement. Here is how you can accomplish this by
using the LIBNAME option, DBCREATE_TABLE_OPTS=, within the SAS environment.
/*points to a triple node server*/
libname trlib2 db2 user=db2user pw="db2pwd" datasrc=sample3c
DBCREATE_TABLE_OPTS='PARTITIONING KEY(EMPNUM);
proc datasets library=trlib;
delete MYEMPS1;run;
data trlib.myemps(drop=morf whatstate
DBTYPE=(HIREDATE="date" SALARY="numeric(8,2)"
NUMCLASS="smallint" GENDER="char(1)" ISTENURE="numeric(1)" STATE="char(2)"
EMPNUM="int NOT NULL Primary Key"));
format HIREDATE mmddyy10.;
do EMPNUM=1 to 100;
morf=mod(EMPNUM,2)+1;
if(morf eq 1) then
GENDER='F';
else
GENDER='M';
SALARY=(ranuni(0)*5000);
HIREDATE=int(ranuni(13131)*3650);
whatstate=int(EMPNUM/5);
if(whatstate eq 1) then
STATE='FL';
if(whatstate eq 2) then
STATE='GA';
if(whatstate eq 3) then
STATE='SC';
if(whatstate eq 4) then
STATE='VA';
else
state='NC';
ISTENURE=mod(EMPNUM,2);
NUMCLASS=int(EMPNUM/5)+2;
output;
end;
run;
After the table MYEMPS
is created on this three-node database, a third of the rows reside
on each of the three nodes.
Optimization of the
threaded Read against this partitioned table depends on the location
of the DB2 partitions. If the DB2 partitions are on the same machine,
you can use
DBSLICE= with the DB2 NODENUMBER function in the WHERE clause:
proc print data=trlib2.MYEMPS(DBSLICE=("NODENUMBER(EMPNO)=0"
"NODENUMBER(EMPNO)=1" "NODENUMBER(EMPNO)=2"));
run;
If the DB2 partitions
reside on different physical machines, you can usually obtain the
best results by using the DBSLICE= option with the SERVER= syntax
in addition to the DB2 NODENUMBER function in the WHERE clause.
In the next example,
DBSLICE= contains specific partitioning information for DB2. Also,
Sample3a, Sample3b, and Sample3c are DB2 database aliases that point
to individual DB2 EEE database nodes that exist on separate physical
machines. For more information about the configuration of these nodes,
see
Configuring DB2 EEE Nodes on Physically Partitioned Databases.
proc print data=trlib2.MYEMPS(DBSLICE=(sample3a="NODENUMBER(EMPNO)=0"
samble3b="NODENUMBER(EMPNO)=1" sample3c="NODENUMBER(EMPNO)=2"));
run;
NODENUMBER is not required
to use threaded Reads for
SAS/ACCESS Interface to DB2 under UNIX
and PC Hosts. The methods and examples described in
DBSLICE= work well in cases where the table that you want to read
is not stored in multiple partitions to DB2. These methods also give
you full control over which column is used to execute the threaded
Read. For example, if the STATE column in your employee table contains
only a few distinct values, you can modify your DBSLICE= clause accordingly:
data work.locemp;
set trlib2.MYEMPS (DBSLICE=("STATE='GA'"
"STATE='SC'" "STATE='VA'" "STATE='NC'"));
where EMPNUM<=30 and ISTENURE=0 and SALARY<=35000 and NUMCLASS>2;
run;