Glossary
- aggregation
-
a summary of detail data that is stored with or
referred to by a cube.
- aggregation table
-
a table that contains pre-calculated totals. Aggregation
tables can be referred to by cubes, reducing the amount of time that
is required for building the cubes.
- Application Response Measurement
-
the name of an application programming interface
that was developed by an industry partnership and that is used to
monitor the availability and performance of software applications.
ARM monitors the application tasks that are important to a particular
business. Short form: ARM.
- ARM
-
See Application Response Measurement.
- calculated member
-
in a dimension, a member whose value is derived
from the values of other members.
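A minimal Python sketch of the idea, assuming two stored members named Sales and Cost and a derived member named Profit. All three names and the values are hypothetical and serve only as illustration.

```python
# Hypothetical stored members and their values.
members = {"Sales": 500.0, "Cost": 320.0}


def profit(values):
    """Derive a value from other members instead of storing it."""
    return values["Sales"] - values["Cost"]


# The calculated member appears alongside the stored members.
calculated = dict(members, Profit=profit(members))
```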
- caption table
-
a table of translated captions for multilingual
OLAP cubes.
- child
-
in a hierarchical database, a segment or node
that has one or more superordinate segments, or parents. The branching
of parents and children forms a tree structure in which each level
obtains identifying and qualifying features from the parent level
above it.
- connection profile
-
a client-side definition of where a metadata server
is located. The definition includes a computer name and a port number.
The connection profile can also contain user connection information.
- cube
-
See OLAP cube.
- data cleansing
-
the process of eliminating inaccuracies, irregularities,
and discrepancies from data.
- data scrubbing
-
another term for data cleansing.
- data sparsity
-
a characteristic of a multidimensional data source
in which there is a relatively high proportion of empty cells (which
indicate missing data values) to filled cells.
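As a rough illustration, the following Python sketch measures sparsity in a small, made-up grid. The grid contents and the `sparsity_ratio` helper are hypothetical, `None` stands in for a missing value, and the result is expressed as the fraction of all cells that are empty, which is only one possible way to quantify sparsity.

```python
def sparsity_ratio(cells):
    """Return the fraction of cells that are empty (None)."""
    flat = [value for row in cells for value in row]
    if not flat:
        return 0.0
    empty = sum(1 for value in flat if value is None)
    return empty / len(flat)


# Hypothetical 3 x 4 grid: rows are products, columns are quarters.
grid = [
    [120, None, None, None],
    [None, None, 75, None],
    [None, 40, None, None],
]
ratio = sparsity_ratio(grid)
```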
- data warehouse
-
a collection of data that is extracted from one
or more sources for the purpose of query, reporting, and analysis.
Data warehouses are generally used for storing large amounts of data
that originates in other corporate applications or that is extracted
from external data sources.
- descendant
-
a member that resides at a lower level in relation
to other members in the hierarchy. A member is a
descendant of its ancestors.
- detail data
-
nonsummarized (or partially summarized) factual
information that pertains to a single area of interest, such as sales
figures, inventory data, or human-resource data.
- dimension
-
a data element that categorizes values in a data
set into non-overlapping categories that can be used to group, filter,
and label the data in meaningful ways. Hierarchies within a dimension
typically represent different groupings of information that pertains
to a single concept. For example, a Time dimension might consist of
two hierarchies: (1) Year, Month, and Date, and (2) Year, Week, and
Day.
- dimension table
-
in a star schema or snowflake schema, a table
that contains data about a particular dimension. A primary key connects
a dimension table to a related fact table. For example, if a dimension
table named Customers has a primary key column named Customer ID,
then a fact table named Customer Sales might specify the Customer
ID column as a foreign key.
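The Customers and Customer Sales example above can be sketched with Python's built-in `sqlite3` module. This is an illustration under stated assumptions, not a SAS procedure: the lowercase table and column names, the `name` and `amount` columns, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension table: one row per customer, identified by its primary key.
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Fact table: customer_id is a foreign key into the dimension table.
    CREATE TABLE customer_sales (
        sale_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Avery'), (2, 'Blake');
    INSERT INTO customer_sales VALUES (10, 1, 25.0), (11, 1, 15.0), (12, 2, 40.0);
""")

# Join each fact to its dimension row through the key, then total by customer.
rows = conn.execute("""
    SELECT c.name, SUM(s.amount)
    FROM customer_sales AS s
    JOIN customers AS c ON c.customer_id = s.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
conn.close()
```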
- drill down
-
to explore data and access information by moving
from summary information to more detailed data from which the summary
is derived. For example, you could click folders in a hierarchy from
the top downwards to find a specific file. Drilling down provides
a method of exploring multidimensional data by moving from one level
of detail to the next.
- drill up
-
in a view of a data table, multidimensional database
(MDDB), or cube, to click on detail data in order to view higher-level,
summarized information. For example, if you are looking at sales totals
for a sales district, you might drill up to view sales totals for
the entire country or sales region that the sales district is part
of.
- drill-through table
-
a view, data set, or other data file that contains
data that is used to define a cube. Drill-through tables can be used
by client applications to provide a view from processed data into
the underlying data source.
- fact
-
a single piece of factual information in a data
table. For example, a fact can be an employee name, a customer's phone
number, or a sales amount. It can also be a derived value such as
the percentage by which total revenues increased or decreased from
one year to the next.
- fact table
-
the central table in a star schema or snowflake
schema. The fact table contains the individual facts that are being
stored in the database as well as the keys that connect each fact
to the appropriate value in each dimension.
- foreign key
-
a column or combination of columns in one table
that references the corresponding primary key in another table. A
foreign key must have the same attributes as the primary key that
it references.
- format
-
See SAS format.
- granularity
-
the relative level of detail that a data item
represents. From the top of a dimension to the bottom, granularity
increases. For example, in a Time dimension that consists of a Year-Month-Day
hierarchy, Month is more granular than Year, and Day is more granular
than Month.
- hierarchy
-
an arrangement of related objects into levels
that are based on parent-child relationships. Members of a hierarchy
are arranged from more general to more specific.
- HOLAP
-
See hybrid online analytical processing.
- hybrid online analytical processing
-
a type of OLAP in which relational OLAP (ROLAP)
and multidimensional OLAP (MOLAP) are combined. In HOLAP, the source
data is usually stored using a ROLAP strategy, and aggregations are
stored using a MOLAP strategy. This combination usually results in
the smallest amount of storage space. In HOLAP, aggregates can be
pre-calculated and can be linked into a hybrid storage model. Short
form: HOLAP.
- informat
-
See SAS informat.
- leaf member
-
the lowest-level member of a hierarchy. Leaf members
do not have any child members.
- level
-
an element of a dimension hierarchy. Levels describe
the dimension from the highest (most summarized) level to the lowest
(most detailed) level. For example, possible levels for a Geography
dimension are Country, Region, State or Province, and City.
- linguistic sorting
-
a method of sorting data that applies different
collation rules in place of the default dictionary collation rules.
- locale
-
a setting that reflects the language, local conventions,
and culture for a geographic region. Local conventions can include
specific formatting rules for paper sizes, dates, times, and numbers,
and a currency symbol for the country or region. Some examples of
locale values are French_Canada, Portuguese_Brazil, and Chinese_Singapore.
- logical server
-
the second-level object in the metadata for SAS
servers. A logical server specifies one or more of a particular type
of server component, such as one or more SAS Workspace Servers.
- MDDB
-
See multidimensional database.
- MDX language
-
See multidimensional expressions language.
- Measures dimension
-
a special dimension that contains summarized numeric
data values (measures) that are analyzed. Total Sales and Average
Revenue are examples of measures. For example, you might drill down
within the Clothing hierarchy of the Product dimension to see the
value of the Total Sales measure for the Shirts member.
- member
-
an element of a dimension. For example, for a
dimension that contains time periods, each time period is a member
of the dimension.
- metadata repository
-
a collection of related metadata objects, such
as the metadata for a set of tables and columns that are maintained
by an application. A SAS Metadata Repository is an example.
- metadata server
-
a server that provides metadata management services
to one or more client applications. A SAS Metadata Server is an example.
- MOLAP
-
See multidimensional online analytical processing.
- multidimensional database
-
a specialized data storage structure in which
data is presummarized and cross-tabulated and then stored as individual
cells in a matrix format, rather than in the row-and-column format
of relational database tables. The source data can come either from
a data warehouse or from other data sources. MDDBs can give users
quick, unlimited views of multiple relationships in large quantities
of summarized data.
- multidimensional expressions language
-
a standardized, high-level language that is used
to query multidimensional data sources. The MDX language is the multidimensional
equivalent of SQL (Structured Query Language). Short form: MDX language.
- multidimensional online analytical processing
-
a type of OLAP that stores aggregates in multidimensional
database structures. Short form: MOLAP.
- multilingual cube
-
an OLAP cube that generates different query results
and captions according to the locale of the querying application.
- navigate
-
to purposefully move from one view of the data
in a table (or in some other data structure, such as a cube) to another.
Drilling down and drilling up are two examples of navigation.
- NWAY aggregation
-
the aggregation that has the minimum set of dimension
levels that is required for answering any business question. The NWAY
aggregation is the aggregation that has the finest granularity.
- ODBO
-
See OLE DB for OLAP.
- OLAP
-
See online analytical processing.
- OLAP cube
-
a logical set of data that is organized and structured
in a hierarchical, multidimensional arrangement to enable quick analysis
of data. A cube includes measures, and it can have numerous dimensions
and levels of data.
- OLAP schema
-
a container for OLAP cubes. A cube is assigned
to an OLAP schema when it is created, and an OLAP schema is assigned
to a SAS OLAP Server when the server is defined in the metadata. A
SAS OLAP Server can access only the cubes that are in its assigned
OLAP schema.
- OLE DB for OLAP
-
an extension to OLE DB that enables users to access
multidimensional databases in addition to relational databases. Short
form: ODBO.
- online analytical processing
-
a software technology that enables users to dynamically
analyze data that is stored in multidimensional database tables (cubes).
- parallel I/O
-
a method of input and output that takes advantage
of multiple CPUs and multiple controllers, with multiple disks per
controller, to read or write data in independent threads.
- parallel processing
-
a method of processing that divides a large job
into several smaller jobs that can be executed in parallel on multiple
CPUs.
- parent
-
in a hierarchical database, a segment or node
that has one or more subordinate segments, or children. The branching
of parents and children forms a tree structure in which each level
obtains identifying and qualifying features from the parent level
above it.
- primary key
-
a column or combination of columns that uniquely
identifies a row in a table.
- reach-through
-
the act of retrieving and displaying to a user
the (unsummarized) detail data from which the summarized data in a
multidimensional database is derived, when that detail data is stored
in a separate data repository.
- relational online analytical processing
-
a type of OLAP in which the multidimensional data
is stored in a relational database. Short form: ROLAP.
- ROLAP
-
See relational online analytical processing.
- roll up
-
to summarize (or apply some other type of calculation
or formula to) data values at one level of a dimension hierarchy in
order to derive values for a parent level. For example, sales figures
for January can be rolled up to Quarter1, and employee data for one
department can be rolled up to the division level.
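The January-to-Quarter1 example can be sketched in Python as follows. The month-to-quarter mapping, the `roll_up` helper, and the sales figures are hypothetical, and summation is only one of the calculations that a roll-up might apply.

```python
from collections import defaultdict

# Hypothetical mapping from the Month level to its parent Quarter level.
MONTH_TO_QUARTER = {
    "January": "Quarter1", "February": "Quarter1", "March": "Quarter1",
    "April": "Quarter2", "May": "Quarter2", "June": "Quarter2",
}


def roll_up(monthly_sales):
    """Sum month-level values to derive values for the parent quarters."""
    totals = defaultdict(float)
    for month, amount in monthly_sales.items():
        totals[MONTH_TO_QUARTER[month]] += amount
    return dict(totals)


quarterly = roll_up({"January": 100.0, "February": 150.0, "April": 80.0})
```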
- SAS Application Server
-
a logical entity that represents the SAS server
tier, which in turn comprises servers that execute code for particular
tasks and metadata objects.
- SAS ARM interface
-
an interface that can be used to monitor the performance
of SAS applications. In the SAS ARM interface, the ARM API is implemented
as an ARM agent. In addition, SAS supplies ARM macros, which generate
calls to the ARM API functions, and ARM system options, which
enable you to manage the ARM environment and to log internal SAS processing
transactions.
- SAS format
-
a type of SAS language element that applies a
pattern to or executes instructions for a data value to be displayed
or written as output. Types of formats correspond to the data's type:
numeric, character, date, time, or timestamp. The ability to create
user-defined formats is also supported. Examples of SAS formats are
BINARY and DATE. Short form: format.
- SAS informat
-
a type of SAS language element that applies a
pattern to or executes instructions for a data value to be read as
input. Types of informats correspond to the data's type: numeric,
character, date, time, or timestamp. The ability to create user-defined
informats is also supported. Examples of SAS informats are BINARY
and DATE. Short form: informat.
- SAS Management Console
-
a Java application that provides a single user
interface for performing SAS administrative tasks.
- SAS Metadata Repository
-
a container for metadata that is managed by the
SAS Metadata Server.
- SAS Metadata Server
-
a multi-user server that enables users to read
metadata from or write metadata to one or more SAS Metadata Repositories.
- SAS name
-
a name that is assigned to items such as SAS variables
and SAS data sets. For most SAS names, the first character must be
a letter or an underscore. Subsequent characters can be letters, numbers,
or underscores. Blanks and special characters (except the underscore)
are not allowed. However, the VALIDVARNAME= system option determines
what rules apply to SAS variable names. The maximum length of a SAS
name depends on the language element that it is assigned to.
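The basic rule stated above (a leading letter or underscore, followed by letters, numbers, or underscores) can be checked with a short Python sketch. The helper name is hypothetical, and the check is deliberately incomplete: it ignores the VALIDVARNAME= system option and any maximum length, both of which depend on the context.

```python
import re

# Basic rule only: first character is a letter or underscore; the rest are
# letters, digits, or underscores. Length limits and VALIDVARNAME= are ignored.
_BASIC_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def looks_like_basic_sas_name(name):
    """Return True if the whole string follows the basic naming rule."""
    return _BASIC_NAME.fullmatch(name) is not None
```

For example, `looks_like_basic_sas_name("_total")` passes the check, while a name that starts with a digit or contains a blank does not.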
- SAS OLAP Cube Studio
-
a Java interface for defining and building OLAP
cubes in SAS System 9 or later. Its main feature is the Cube Designer
wizard, which guides you through the process of registering and creating
cubes.
- SAS OLAP Server
-
a SAS server that provides access to multidimensional
data. The data is queried using the multidimensional expressions (MDX)
language.
- SAS Open Metadata Architecture
-
a general-purpose metadata management facility
that provides metadata services to SAS applications. The SAS Open
Metadata Architecture enables applications to exchange metadata, which
makes it easier for these applications to work together.
- Scalable Performance Data Engine
-
a SAS engine that is able to deliver data to applications
rapidly because it organizes the data into a streamlined file format.
Short form: SPD Engine.
- schema
-
a map or model of the overall data structure
of a database. A schema consists of schema records that are organized
in a hierarchical tree structure. Schema records contain schema items.
- scrubbing
-
another term for data cleansing.
- shared dimension
-
a dimension that is used by more than one cube.
- slice
-
a subset of data from a cube, where the data in
the slice pertains to one or more members of one or more dimensions.
For example, from a cube that contains data about customer feedback,
one slice might pertain to feedback on one particular product (one
member of the Product dimension). Another slice might pertain to feedback
on that product from customers residing in particular geographic areas
who submitted their feedback during a certain time period (one member
of the Product dimension, multiple members of the Geography dimension,
one or more members of the Time dimension).
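The customer feedback example can be sketched in Python by treating each cell as a record and keeping only the cells whose members fall within a selection. The `slice_cells` helper, the dimension keys, and the sample records are hypothetical.

```python
def slice_cells(cells, **selected):
    """Keep cells whose member is in the selection for every named dimension."""
    return [
        cell for cell in cells
        if all(cell[dim] in members for dim, members in selected.items())
    ]


# Hypothetical feedback cells with Product, Geography, and Time members.
feedback = [
    {"product": "Widget", "geography": "East", "time": "2003", "score": 4},
    {"product": "Widget", "geography": "West", "time": "2003", "score": 5},
    {"product": "Widget", "geography": "North", "time": "2004", "score": 2},
    {"product": "Gadget", "geography": "East", "time": "2003", "score": 3},
]

# One member of the Product dimension.
one_product = slice_cells(feedback, product={"Widget"})

# One product, several geographic areas, one time period.
narrower = slice_cells(
    feedback, product={"Widget"}, geography={"East", "West"}, time={"2003"}
)
```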
- SMP
-
See symmetric multiprocessing.
- sparsity
-
See data sparsity.
- SPD Engine
-
See Scalable Performance Data Engine.
- star schema
-
an arrangement of tables in a database in which a single fact table
is connected to multiple dimension tables. When diagrammed, the tables
form a star pattern. SAS OLAP cubes can be created from a star schema.
- stored statistics
-
statistics that are stored in a cube. Stored statistics
can be used to derive higher-level statistics. Examples include sum,
minimum, and maximum.
- symmetric multiprocessing
-
a hardware and software architecture that can
improve the speed of I/O and processing. An SMP machine has multiple
CPUs and a thread-enabled operating system. An SMP machine is usually
configured with multiple controllers and with multiple disk drives
per controller. Short form: SMP.
- thread
-
a single path of execution of a process that runs
on a core on a CPU.
- thread-enabled operating system
-
an operating system that can coordinate symmetric
access by multiple CPUs to a shared main memory space. This coordinated
access enables threads from the same process to share data very efficiently.
- threading
-
a high-performance technology for either data
processing or data I/O in which a task is divided into threads that
are executed concurrently on multiple cores on one or more CPUs.
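A minimal Python sketch of dividing a task into threads, using the standard `concurrent.futures` module. The `threaded_total` helper and the chunking scheme are hypothetical. Note that in standard CPython, threads do not generally run CPU-bound work on several cores at once, so this shows the structure of the technique rather than a guaranteed speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_total(chunk):
    """Work performed by one thread: total a single chunk."""
    return sum(chunk)


def threaded_total(values, workers=4):
    """Split the values into chunks, total each in a thread, then combine."""
    size = max(1, len(values) // workers)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(chunk_total, chunks))


total = threaded_total(list(range(1, 101)))
```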
- Time dimension
-
a dimension that divides time into levels such
as Year, Quarter, Month, and Day.
- tuple
-
a data object that contains two or more components.
In OLAP, a tuple is a slice of data from a cube. It is a selection
of members (or cells) across dimensions in a cube. It can also be
viewed as a cross-section of member data in a cube. For example,
([time].[all time].[2003], [geography].[all geography].[u.s.a.], [measures].[actualsum])
is a tuple that contains data from the Time, Geography, and Measures
dimensions.
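One way to picture this in Python is a dictionary whose keys are tuples of members, one member per dimension. The cube contents and values below are hypothetical, and the member names only loosely mirror the MDX example above.

```python
# Hypothetical cube: each key holds one member from the Time, Geography,
# and Measures dimensions, and each value is the cell at that coordinate.
cube = {
    ("2003", "U.S.A.", "ActualSum"): 1200.0,
    ("2003", "Canada", "ActualSum"): 300.0,
    ("2004", "U.S.A.", "ActualSum"): 1500.0,
}

# A tuple selects one member per dimension and identifies a single cell.
coordinate = ("2003", "U.S.A.", "ActualSum")
value = cube[coordinate]
```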
- warehouse
-
See data warehouse.
- wizard
-
an interactive utility program that consists of
a series of dialog boxes, windows, or pages. Users supply information
in each dialog box, window, or page, and the wizard uses that information
to perform a task.
Copyright © SAS Institute Inc. All rights reserved.