Chapter Eleven

Introducing the SAS Viya platform

In 2016, SAS introduced its new SAS Viya platform. SAS Viya is a completely re-imagined approach to delivering powerful SAS analytics using the latest technology in services deployment. The promise of cloud computing is being realized today, and SAS has designed SAS Viya to capitalize on the flexibility of elastic infrastructure.

SAS Visual Analytics 8.1 is based on the SAS Viya platform.

Overview of the SAS Viya platform

The Viya platform is made up of many different pieces, but they can be summarized in just a few categories:

•   CAS In-Memory Analytics Server: The CAS server is the next generation of in-memory analytics. Built on the lessons learned from previous in-memory analytics such as the SAS LASR Analytic Server, CAS extends its operational goals beyond just speed. While SAS software is a natural client for the CAS server, it also offers open programming APIs for clients other than SAS, such as Java, Python, and the R programming language.

•   Microservices: Each microservice has a singular goal and is focused on delivering just that. Consider the SAS Metadata Server, which does not exist on the SAS Viya platform directly. Instead, its many roles have been given to individual microservices. One of the many benefits of this approach is that SAS microservices no longer require a rigid, structured start-up and shut-down order. They are designed to act intelligently in dealing with up- and down-stream dependencies.

•   Stateful services: A handful of SAS software services must be continuously present and deliberate in their interactions. These stateful services are critical infrastructure for the SAS Viya platform since the other services rely on them to understand the environment as well as to act upon incoming directives. One example is Consul, which acts as a service registry. All of the microservices contact Consul to register their availability as well as to find out where the other microservices, which they require, are listening.
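The service-registry idea described above can be sketched with a toy in-memory model. This is only an illustration of the register-then-look-up pattern: the `ServiceRegistry` class, the service names, the host name, and the ports are all hypothetical, and none of it reflects the actual Consul API or the real SAS Viya microservices.

```python
from collections import defaultdict


class ServiceRegistry:
    """Toy in-memory registry: a stand-in for the role Consul plays, not its API."""

    def __init__(self):
        # Maps a service name to the list of (host, port) addresses it listens on.
        self._instances = defaultdict(list)

    def register(self, name, host, port):
        """A service announces where it is listening."""
        address = (host, port)
        if address not in self._instances[name]:
            self._instances[name].append(address)

    def deregister(self, name, host, port):
        """A service withdraws itself, for example during shutdown."""
        if (host, port) in self._instances.get(name, []):
            self._instances[name].remove((host, port))

    def lookup(self, name):
        """Return every known address for a service (empty list if none is up)."""
        return list(self._instances.get(name, []))


registry = ServiceRegistry()

# Start-up order does not matter: each service registers whenever it is ready.
registry.register("reports", "viya-host", 8081)   # hypothetical service
registry.register("folders", "viya-host", 8082)   # hypothetical service

# A service discovers its dependency at call time instead of at start-up.
folders_addresses = registry.lookup("folders")
print(folders_addresses)
```

Because a lookup returns an empty list when a dependency is not yet registered, a caller in this sketch can simply retry later, which is one way to picture why a rigid start-up order is no longer needed.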

Altogether, the pieces of the SAS Viya platform can be visualized in the following illustration:

Figure 11.1 SAS Viya Platform

image

As is typical of SAS software offerings, this view is just one deployment option out of many that are possible. Here, we see that the stateful services and the microservices all reside together on a single host machine and that the CAS In-Memory Analytics Server is deployed in a massively parallel processing (MPP) configuration across a minimum of four host machines: one acting as the Controller and the others as Workers.

Alternatively, the simplest deployment of the SAS Viya platform would consist of all software running on a single host machine where the CAS server would run as a single instance in single machine, symmetric multiprocessing (SMP) mode. Or you could choose to deploy in support of a large enterprise and run on even more machines than illustrated here. The usual advice follows at this point: When sizing an environment for your site, contact the SAS Enterprise Excellence Center for assistance to ensure that all considerations are weighed so that sufficient compute resources will be available for your needs.

Understanding the CAS In-Memory Analytics Server

To fully realize the new capabilities of the CAS server, it’s helpful to understand the journey that SAS has taken to develop in-memory analytics over the last several years.

Introducing massively parallel analytics

The first iteration of SAS in-memory analytics using an MPP approach was the SAS High-Performance Analytics Server offering. Still available today for the SAS 9.4 platform, the SAS High-Performance Analytics Server showed just how incredibly powerful SAS analytics can be when performed completely in memory, without using disk during interim calculations. The SAS High-Performance Analytics Server runs only on demand. This means that each invocation of the SAS High-Performance Analytics Server requires loading data into RAM. Once the requested in-memory analysis task completes, the SAS High-Performance Analytics Server releases the memory and shuts down. Performing a follow-up calculation on the same data requires repeating the entire process. Reloading the data with each request can be tedious, especially if you are not using a fast and efficient parallel loading mechanism.

Adding persistence

The second major iteration of SAS in-memory analytics is the SAS LASR Analytic Server. Where the SAS High-Performance Analytics Server is on-demand only, the SAS LASR Analytic Server offers a persistent service that loads data into RAM and keeps it there until directed to release it. For multiple-step analysis operations on the same table, this is a huge performance improvement since data needs to be loaded only once. But with that capability came more responsibility. Persisting large tables in memory means that someone must actively monitor and administer the environment to ensure that the correct tables are loaded when needed. Multiple LASR Analytic Servers can be started to provide flexibility in data availability and access control, but this compounds the issue of overall monitoring and administration. Furthermore, as the LASR Analytic Server took on an increasingly important role in the enterprise, IT organizations began demanding more automation of data management, failover support, and more.
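The difference between the on-demand and persistent models can be illustrated by counting how often a table is loaded during a multiple-step analysis. The sketch below is a toy model under stated assumptions: the `OnDemandServer` and `PersistentServer` classes and the sample data are hypothetical and do not represent how either SAS server is implemented.

```python
class OnDemandServer:
    """Loads the table for every request and releases it afterwards."""

    def __init__(self, source):
        self.source = source
        self.load_count = 0

    def run(self, analysis):
        data = list(self.source)  # load the table into memory for this request
        self.load_count += 1
        return analysis(data)     # the copy is discarded once the request ends


class PersistentServer:
    """Loads the table once and keeps it until told to release it."""

    def __init__(self, source):
        self.source = source
        self.load_count = 0
        self._data = None

    def run(self, analysis):
        if self._data is None:    # load only if the table is not already resident
            self._data = list(self.source)
            self.load_count += 1
        return analysis(self._data)

    def release(self):
        """Drop the in-memory copy; the next request reloads it."""
        self._data = None


sales = [120, 80, 200, 150]       # hypothetical table contents
steps = [sum, max, len]           # a three-step analysis on the same table

on_demand = OnDemandServer(sales)
persistent = PersistentServer(sales)
on_demand_results = [on_demand.run(step) for step in steps]
persistent_results = [persistent.run(step) for step in steps]

print(on_demand.load_count, persistent.load_count)  # prints: 3 1
```

Both servers return the same results; only the number of loads differs. The `release` method hints at the administrative cost of persistence: someone has to decide when a table stays in memory and when it goes.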

Providing more flexibility

The SAS CAS In-Memory Analytics Server is the next major leap forward. Speed is still one of the foremost goals of CAS, and its internal structure has been optimized to surpass LASR in key areas. But speed is no longer the only primary objective. CAS offers a slew of improvements to how data is managed internally. For example, CAS can now understand the active data source for a specific table. When that table is needed in-memory (for example, when a SAS Visual Analytics user opens a report for the first time), then CAS can load that data automatically – without direct involvement of the administrator. In order to work automatically with data in this way, CAS uses a new memory management model that can actively cache data to disk and/or memory map at the source. This model enables CAS to seamlessly work with more data at once than LASR could in an equivalent environment. Failover is also a major goal, and the first release of the CAS server shipped with support for CAS Worker failover. If a CAS Worker goes down, then the other CAS Workers can pick up the lost node’s data and complete its analysis tasks.
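The worker failover behavior described above can be pictured as redistributing a lost node's share of the data among the surviving nodes. The following is a minimal sketch, not the CAS algorithm: the function names, worker names, and block labels are hypothetical, and it simply assumes that the survivors can still obtain the failed worker's blocks (for example, from a copy kept elsewhere).

```python
def distribute(blocks, workers):
    """Assign data blocks to workers in round-robin order."""
    assignment = {worker: [] for worker in workers}
    for index, block in enumerate(blocks):
        assignment[workers[index % len(workers)]].append(block)
    return assignment


def fail_over(assignment, failed_worker):
    """Hand the failed worker's blocks to the survivors, least loaded first."""
    survivors = {w: list(b) for w, b in assignment.items() if w != failed_worker}
    if not survivors:
        raise RuntimeError("no surviving workers")
    for block in assignment[failed_worker]:
        # Pick the survivor that currently holds the fewest blocks.
        target = min(survivors, key=lambda w: len(survivors[w]))
        survivors[target].append(block)
    return survivors


blocks = [f"block-{i}" for i in range(6)]
workers = ["worker-1", "worker-2", "worker-3"]

before = distribute(blocks, workers)
after = fail_over(before, "worker-2")

for worker, held in after.items():
    print(worker, held)
```

After the simulated failure, every block is still assigned to some worker, so an analysis over the full table can complete with two workers instead of three.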

The CAS server also offers new levels of openness and integration with how SAS users and their IT organizations prefer to operate. SAS currently offers programming APIs for directing CAS actions, which enable users to work in programming environments other than SAS that they are already familiar with. So besides SAS languages, coders accustomed to working in Python and Java can use those languages to perform analysis in CAS. SAS will soon offer programming APIs for the R language and others as well.

Furthermore, the CAS server supports an all-new approach to loading data over parallel channels: DNFS. DNFS, which is an acronym for distributed network file system, refers to the ability to connect your CAS server hosts to a high-performance storage solution where data can be stored in CSV (text-delimited) format, in standard SAS data sets, and in the SASHDAT file format.

Figure 11.2 CAS accessing SASHDAT data using DNFS

image

So in addition to parallel loading of SASHDAT files stored in HDFS or using the SAS In-Database Embedded Process, users of the SAS Viya platform with the CAS server can now parallel load from standard high-performance storage offerings that your IT organization might be familiar with or otherwise prefer.

Notice also in this figure that SMP CAS servers can access the exact same SASHDAT data as MPP CAS. This is yet another way in which CAS offers new flexibility and capabilities for your analytics efforts.

SAS Viya and SAS 9.4 together

SAS Viya is an all-new platform that has been developed from the ground up. However, SAS Viya is still relatively new in its lifecycle. As such, it does not offer the full breadth and depth of capabilities that are offered by the software of the mature SAS 9.4 platform.

SAS Viya can operate independently and offer a suite of capabilities that are sufficient for delivering analytics. For a wider range of options, deploy the SAS 9.4 platform software as well. SAS 9.4 is integrated to work with the SAS Viya platform through the use of SAS/CONNECT software. For example, the CAS server provides Data Connectors, which can connect directly to third-party data providers. However, the range of Data Connectors is not yet as extensive as those provided by SAS/ACCESS products. So in a situation where your data resides somewhere available to SAS/ACCESS but not yet to the CAS Data Connectors, use SAS 9.4 products to get that data and then deliver it to SAS Viya using SAS/CONNECT. In this way, SAS 9.4 helps extend the reach of SAS Viya beyond its currently built-in capabilities. Currently, development of SAS 9.4 solutions and SAS Viya solutions is performed along parallel tracks.

Managing the SAS Viya environment

SAS Environment Manager has become the heart of system administration. Using the SAS Environment Manager from SAS 9.4 M3 as the basis, SAS rebuilt this application. The goal was to have a unified way to administer the system and reduce complexity. To that end, the new application combines SAS Management Console, Visual Analytics Administrator, and Deployment Manager with the capabilities of the SAS Environment Manager.

Opening the application

One way to open the application is from the SAS Home navigation panel, by selecting the SAS Environment Manager link. SAS Environment Manager allows you to access different application features from the shelf menu (or navigation pane) on the left. When you open SAS Environment Manager, the top-level dashboard appears, similar to the following figure.

Figure 11.3 SAS Environment Manager

image

This Dashboard page (shown with a 2 in the preceding figure) provides an overview of the environments connected to your SAS Visual Analytics application. You can quickly understand your deployments, custom groups, services, and servers. There are also metrics to help you understand the system health, mobile devices, and the PostgreSQL database.

From the shelf menu shown with a 1 in the preceding figure, you can access the following application areas:

•   Dashboard: Provides a top-level overview of the system.

•   Users: Allows you to manage users and groups.

•   Data: Allows you to manage the CAS content.

•   Content: Allows you to manage the metadata associated with the environment.

•   Resources: Allows you to manage system resources and configuration.

•   Security: Allows you to control the capabilities, domains, and mobile devices.

Managing users and groups

You can manage users and groups from the Users page. A big improvement was changing the way users and groups are administered in the system. SAS Environment Manager links to a corporate directory service using LDAP. During the deployment, you can determine which groups are displayed in the tool. Note that none of this data is stored in SAS Environment Manager.

You can create custom groups for the application. Custom groups are ones that do not exist in the corporate directory. You might want to have custom groups to limit features or to better control the application security. This information is stored and managed by the application. There are some default custom groups available with the application.

By selecting Custom Groups from the top-left menu, you can review all of the custom groups. When you click a group name, the right pane shows information about that group. The SAS Administrators group, which is similar to the Unrestricted user in SAS Management Console, has the most abilities in the system, so it typically has the fewest members. You can use the Edit icon in the corners to make changes to the group or its members.

image

Managing data

One of the most important items to manage in the system is the data. From the Data page, you can use the drop-down menu to manage the Loaded Tables, Libraries, and Servers.

image

Each of these pages provides a different view of the data:

•   Loaded tables: See a detailed view of the tables loaded into the system and their state.

•   Libraries: See all libraries and load data.

•   Servers: See CAS servers.

Viewing data tables

The Loaded tables page lists all data tables in each server and library. For each table you can review the state, location, and source table. You can also see the row and column count for the table to better understand the size. You can search and sort the columns.

A table has multiple states. If the table is loaded into the CAS server, then it has a green icon. If the table is not loaded, then the icon is red. You can click on the table name to review the table properties or to control and change the authorization.

Figure 11.4 Viewing tables

image

Viewing libraries

Libraries provide logical groupings to store data. Libraries are associated with servers. You can have multiple libraries for each server. The following figure shows the libraries.

image

Managing content

You can use the Content tab to create directory structures for the reports and data. When storing reports and data tables in SAS Visual Analytics, you need a logical structure. You want users and consumers to locate content quickly. You may also want to isolate contents from certain users. For instance, not everyone needs access to personnel data that might contain salaries or other sensitive information.

If you are familiar with the SAS 9.x folder structure, then you will find a similar methodology here. Each user has their own private folder area called My Folder, where they can store individual reports.

image

From the Content area, you can navigate the folder structure. When you click a folder name, the property information appears in the right pane. You can use the icons along the top to add folders, remove folders, and control folder access.

