• +1-617-874-1011 (US)
  • +44-117-230-1145 (UK)
Live Chat
Follow Us:

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items

ITECH1103- Big Data and Analytics 
Lab 3 – Working with 
Data Items 
Federation University Australia   

Objectives

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 1

* Creating, testing, and comparing models can be accomplished with SAS Visual Statistics andSAS Visual Data Mining and Machine Learning.

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 2 ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 3

SAS Visual Data Studio uses a CAS table as input and creates a CAS table as output.SAS Visual Analytics uses a CAS table as input and creates a report that can be viewed in the Report Viewer or Mobile BI app. Any changes to data made in Visual Analytics apply to the report only and do not affect the CAS table.Beginning with Visual Analytics 8.3, report data views can be created to save and apply settings for a data source. A data view acts as a template for any settings that are modified, including data property changes, data source filters, hierarchies, geography data items, calculated items, and more. A data view does not update the CAS table. If the view is updated, your reports are not automatically updated with the new settings.Data views are saved separately from your reports. If you create a data view in one report, you can apply it to other reports that use the same data source.Data views can be shared so other users can also apply them to the data source.A data source can have a default view as set by an administrator. You can also set the default view for yourself. A default data view is automatically applied anytime you add the data source to a report.For more information about data views, see “Working with Data Views in Reports” in the SAS® Visual Analytics 8.3: Working with Report Data documentation.

Data Item Properties

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 4

In the Data pane, an icon next to each data item indicates the type of data item. The following types of data items are available:

Category: A data item whose distinct values are used to group and aggregate measures.

Date and Time: A category data item whose distinct values are used to group and aggregate measures. There are three types of date categories: date, datetime, and time.

Custom Category: A data item that can be created based on either a category or numeric data item. A custom category is always a category data item with alphanumeric values.

Calculated (category): A data item that is calculated from existing data items using an expression and returns an alphanumeric value.

Calculated (datetime): A data item that is calculated from existing data items using an expression and returns a datetime value. Calculated dates and times are treated as categories with distinct values being governed by the chosen date or time format.

Geography: A category data item whose values are mapped to geographical locations or regions. These data items can be used to show data on a geographic map.

Hierarchy: A data item with a predefined arrangement of category data items, typically whose values are arranged with more general information at the top and more specific information at the bottom. The first level of the hierarchy is known as the root level.

Geographic Hierarchy: A hierarchy whose members are all geographic data items.

Interaction Effect : A user-created data role that can be used when there is a non-additiverelationship between two variables (the effect of one variable on a model changes as another variable changes). SAS Visual Statistics must be licensed for you to create and use an interaction effect.

Measure: A data item whose values can be used in computations. These values are numeric. By default, almost all measures have a default aggregation of Sum, but the aggregation can be modified.

Calculated (measure): A data item that is calculated from existing data items using an expression and returns a numeric value. Numeric data items are treated as measures (with an aggregation type of Sum), or they can be changed to category data items.

Frequency: A measure data item whose value represents the number of observations in the selected data source. This data item is automatically added to the Data pane under the Measure group. You cannot change the classification for this data item. This data item is automatically assigned to some report objects when no measure is assigned.

Frequency Percent: A measure data item whose value represents the percentage of observations in the selected data source. This data item is automatically added to the Data pane under the Aggregated Measure group. You cannot change the classification for this data item.

Aggregated Measure or Time Period Calculation: A data item that represents special predefined operations, like distinct count, percentage of totals, percentage of subtotals, or frequency percent. Users can also create their own aggregated measure calculations. Aggregated measures cannot be used in all report objects, filters, controls, spark lines, or time series graphs. Some aggregated measures cannot be used in a detail rank. Percentage of subtotal items can be used only in a crosstab.

Working with Data Items

This demonstration illustrates how to modify data item properties (name, format, aggregation) in Visual Analytics.

1.From the browser window, select SAS Drive from the bookmarks bar.

2. If necessary, clickand select New Report in the top left corner of SAS Drive.

3.Click Data.

4.In the Open Data Source window, select Data Sources.

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 5

5. Double-click cas-shared-default.

6. Double-click Public PATH.

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 6

7. SelectCUSTOMERS_CLEAN.Note: If CUSTOMERS_CLEAN does not exist, then right-click CUSTOMERS_CLEAN.sashdatand select Load.

8.ClickOK.The Data pane is displayed, and it contains a list of data items from the CUSTOMERS_CLEANtable.

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 7

9. If necessary, click the Data icon in the left pane.

10.Verify that Customer ID and Order ID appear in the Category group, because the data type waschanged to character in SAS Data Studio.

Note: Character and datetime data items appear as categories in Visual Analytics.

11.Verify that the new columns created in SAS Data Studio (Customer_FirstName,Customer_LastName, and Title) appear in the Category group.

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 8

12.Verify that the new columns created in SAS Data Studio (Days to Delivery and Profit) appear inthe Measure group. Note: Numeric (double) data items appear as measures in Visual Analytics

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 9

Note: Cost and Retail Price were renamed in SAS Data Studio to Unit Cost and Total Revenue, respectively. Those new names are not reflected because Visual Analytics displays labels, not data source names.

13.Modify properties for a data item, Date Order was Delivered.

  1. In the Category group, right-click Date Order was Delivered.
  2. Select Format c:: More formats.
  3. SelectMMMYYYY.
  4. ClickOK.
  5. Click(Edit properties) next to Date Order was Delivered.
  6. Enter Delivery Date in the Name field and press Enter.

14. Modify properties for a data item,Profit.

  1. In the Measure group, click(Edit properties) next toProfit.
  2. Click(Edit) for the Formatfield.
  3. If needed, change Width to 12.
  4. If needed, change Decimals to 2.
  5. Click OK.

15.Modify properties for a data item, Discount in percent of Normal Total Retail Price.

  1. In the Measure group, click(Edit properties) next to Discount in percent of Normal Total Retail Price.
  2. Select Average for the Aggregation field.
  3. Enter Discount in the Name field and click(Edit properties) to collapse the data item.

16.Modify the aggregation for a data item, Days to Delivery.

  1. In the Measure group, click(Edit properties) next to Days to Delivery.
  2. Select Average for the Aggregation field.
  3. Enter Average Days to Delivery in the Name field and click(Edit properties) to collapse the data item

17.Rename data items.

  1. In the Category group, click(Edit properties) next to Date Order was placed by Customer.
  2. Enter Order Date in the Name field and click(Edit properties) to collapse the data item.
  3. In the Measure group, click(Edit properties) next toCost.
  4. Enter Unit Cost in the Name field and click(Edit properties) to collapse the data item.
  5. In the Measure group, click(Edit properties) next to Quantity Ordered. f.Enter Quantity in the Name field and click(Edit properties) to collapse the data item.
  6. In the Measure group, click(Edit properties) next to Retail Price.
  7. Enter Total Revenue in the Name field and click(Edit properties) to collapse the data item.

18.Create a data view.

  1. Click(Actions) and selectSave data view.
  2. Verify that CUSTOMER_CLEAN_View_1 appears in the Namefield.
  3. Enter Modified data item properties (renamed, changed formats, changed aggregations) in the Description field.
  4. Click Save.
ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 10

19.Save the report.

  1. In the upper right corner, click(Menu) and select Save as.
  2. Navigate to the Content>>Courses>>YVA183>>Basics>>Demosfolder.
  3. EnterVA1- Demo3.1 in the Name field.
  4. Click Save.

Business Scenario: Employees

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 11

Exercise

1.Working with Data Items

  1. Open the browser and sign in to Visual Analytics using Eric’s credentials.
  2. Open the VA1- Exercise3.1 report from the Content>>Courses>>YVA183>>Basics>>Exercises (HR) folder.
  3. View the data items in the Data pane and answer the following questions:What is the classification of Employee ID? Manager at 1. level?Answer:What does the Frequency data item represent?Answer:
  4. Change the classification for Manager at 1. level to Category.
  5. Change the format for Annual Salary to Dollar13.2.
  6. Rename the following data items: Old

Old name

New name

Employee ID

ID

Employee Name

Name

Manager at 1. level

Manager ID

Frequency

Number of Employees

Exploring Data with Charts and Graphs

Objectives

Discuss when to use descriptive graphs (histogram, box plot, bar chart) in Visual Analytics.

Maximize graphs objects to view details. • Modify roles and options for graph objects.

Co p yrigh t © SAS In sti tu te In c. All righ ts reserved .

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 12

Objects: Graphs (Descriptive)

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 13

Histogram

Box plot

The histogram contains a series of bars that represent the number of observations (or percentage of all observations) for a measure that fit in a specified value range (or bin). The shape of the distribution can be affected by the number of bins specified for the histogram.

Note: If you use the default number of bins, then the minimum and maximum values on the histogram might not match your actual data values. However, if you specify the number of histogram bins, then the minimum and maximum values on the histogram match your actual data values exactly.

The size and location of the box indicate the range of values between the 25th and 75th percentile (or the interquartile range). The diamond marker inside the box indicates the mean value, and the line inside the box indicates the median value. You can modify options to display outliers in the plot. Outliers are data points whose distance from the interquartile range are more than 1.5 times the size of the interquartile range. The whiskers (lines protruding from the box) can indicate either minimum and maximum values of the plot or the range of values outside of the interquartile range but close enough not to be considered outliers. If there are a large number of outliers, the range of outlier values is represented by a bar colored to represent the number of values inside the outlier range (as seen above).

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 14 ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 15

Bar chart

A bar chart displays data aggregated by the distinct values of a category. By default, the bars are sorted by descending order of the value of the first measure. For ranked bars, the data is sorted based on the values of the rank. Stacked bar charts enable you to compare totals for each category, as well as totals for all categories. However, comparing segments is difficult, and when there are many segments in the chart, it is difficult to read. To see relative differences (parts of a whole) in a bar chart, select Normalize groups to 100% for the Group scale option.

Note: Nominal values are categories whose data has no particular order.

Exploring Data: Part 1

This demonstration illustrates how to use the automatic chart to explore data and modify roles and options for charts and graphs in Visual Analytics.

  1. From the browser window, select SAS Drive from the bookmarks bar.
  2. Click the New Report button in the top left corner of SAS Drive.

The Welcome to SAS Visual Analytics window appears.

  1. Click Open.
    1. Navigate to the Content>>Courses>>YVA183>>Basics>>Demos (Marketing)
    2. Double-click VA1- Demo3.2a to open the report.
  2. Turn off automatic graph titles.
    1. In the upper right corner, select Eric c:: Settings.
    2. Select General under SAS Visual Analytics on the left side of the window.
    3. Scroll down to When adding a new object to a report, use the following default object title settings.
    4. For Graphs, change Automatic title to No title.
      ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 16
    5. Click Close.
  3. Create an automatic chart.
    1. In the left pane, click the Data
    2. Drag Profit from the Data pane to the canvas.

The automatic chart functionality determines the best way to display the selected data.

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 17
  • histogram is used to display the distribution of profits.
  1. If necessary, click the Roles icon in the right pane. ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 18
    • histogram accepts two roles, Measure and Frequency.
  2. For the Frequency role, select Frequency c:> Frequency Percent. The histogram is updated to use frequency percent for the Y axis.
    ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 19
  3. In the right pane, click the Options 1) Expand the Object group.

2) Enter Distribution of Profit in the Name field.

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 20

Note: The Automatic title setting was turned off for Graph objects in an earlier demo. You can turn it on for this graph by selecting Automatic title, or you can create a custom title by selecting Custom title.

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 21 ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 22
  1. Click the highest bar in the graph.
  2. Scroll through the table to find the highlighted row.
ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 23

A majority of the products ordered are low profit items, in the $0 to $25 range. Also notice that a little more than 20% of items result in a loss. Why is this problem occurring? Are these products ordered from a similar product area, geographical area, or order type? Could the costs be too high in these areas? What can we do to reduce costs?

  1. In the upper right corner, click (Restore).
  2. Create a crosstab.
    1. In the left pane, click the Objects
    2. Drag the Crosstab object, from the Tables group, to the bottom of the canvas.
    3. In the right pane, click the Roles
    4. For the Rows role, select Add c:> Order Type and click OK.
    5. For the Measures role, select Frequency c:>

Profti. The Roles pane should resemble the following:

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 24

Note: The Measures role is required for the crosstab object. The crosstab should resemble the following:

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 25

Profits are much lower in the internet and catalog channels. A company-wide policy mandates that we need to try to improve profits for orders through these channels.

  1. On the Roles tab, for the Columns role, select Add c:> Continent Name and click OK. The updated crosstab should resemble the following:
    ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 26
  2. In the right pane, click the Options
  3. Expand the Totals and Subtotals
  4. Select the Totals check box.

By default, totals are added to rows and columns.

The updated crosstab should resemble the following:

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 27

The updated crosstab should resemble the following:

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 28

Profits are much lower in North America than in Europe. Because our corporate office is located in North America, we would expect higher profits. Also notice the loss in Africa for internet sales. Why is this loss occurring? Is this due to start-up operations (for example, building distribution facilities in Africa)? Are the losses consistent over time or has this changed over time?

  1. Change the crosstab to a bar chart.
    1. In the upper right corner of the crosstab, click (More) and select Change Crosstab to c:: Bar Chart.

The bar chart should resemble the following:

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 29
  1. In the right pane, click the Roles
ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 30

The bar chart has many more roles available.

  • Category data items can be added to the Group role to show additional bars for each category, or to the Lattice columns and Lattice rows roles to add additional bar charts for each distinct category.
  • Category and Measure data items can be added to the Data tip values role to show additional information when a bar is selected.
  • Datetime data items can be added to the Animation role to animate the bar chart.
  • Category or date data items can be added to the Hidden role for mapping data sources, adding color-mapped display rules, or adding external links.
  1. Drag Order Type, from the Lattice columns role, to the Group role. The bar chart should resemble the following:
    ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 31
  2. In the right pane, click the Options
  3. Expand the Object
  4. Enter Profit by Continent and Order Type in the Name
    ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 32
  5. In the Bar group, for the Grouping style field, click (Stacked).
  6. Select Data labels.
  7. Select 9 for the Text style

The Options pane should resemble the following:

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 33

The updated bar chart should resemble the following:

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 34

Profits in North America are less than half of total profits in Europe. We need to understand why this discrepancy exists and try to improve profits in non-European countries.

  1. In the left pane, click the Outline icon.
ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 35

The Outline pane displays a list of all pages and objects in the report.

  1. In the upper right corner, click(Menu) and select Save.
  2. Select Your Email:: Sign Out in the upper right corner to sign out.
ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 36

Exercise

  1. Exploring Data: Part 1
  2. Open the browser and sign in to Visual Analytics using Eric’s credentials.
  3. Open the VA1- Exercise3.2a report from the Content>>Courses>>YVA183>>Basics>>Exercises(HR) folder.
  4. Create an automatic chart using the following data items:

Annual Salary

Frequency Percent

  1. Modify the following options for the automatic chart:

Name

Distribution of Salary

Bin range

Measure values

Set a fixed bin count

<selected>

Bin count

4

   

The automatic chart should resemble the following:

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 37

e. Maximize the histogram to answer the following question: Into which range do a majority of salaries fall?

Hint: After answering the question, click (Restore) in the upper right corner.

f. Add a bar chart on the right of the automatic chart by assigning the following data items to the specified roles:

Category

Job Title

Measure

Annual Salary

Group

Department

g. Specify Total Salary by Job and Department as the name of the bar chart. The bar chart should resemble the following:

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 38

h. Answer the following questions: In which department are a majority of our salary costs spent? For which job title?
Answer:
Why do you think salary costs are so much higher for this group?
Answer:
The final report should resemble the following:

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 39

i. Save the report.

j. Sign out of Visual Analytics.

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 40

Exploring Data: Part 2

This demonstration illustrates how to use box plots to explore data in Visual Analytics.

  1. From the browser window, select SAS Drive from the bookmarks bar.
  2. Click the New Report button in the top left corner of SAS Drive.

The Welcome to SAS Visual Analytics window appears.

  1. Click Open.
    1. Navigate to the Shared Data/Basics/Demos (Marketing)
    2. Double-click VA1- Demo3.2b to open the report.
  2. In the upper left corner of the report, click the Page 2
  3. Create a box plot.
    1. In the left pane, click the Objects
    2. Drag the Box Plot object, from the Graphs group, to the left side of the canvas.
ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 41

c. In the right pane, click the Roles icon.
d. For the Category role, select Add ¢ Order Type.
e. For the Measures role, select Add ¢ Profit and click OK.
The Roles pane should resemble the following:

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 42

The box plot should resemble the following:

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 43

f. In the right pane, click the Options icon.
g. If necessary, expand the Object section.
h. Enter Profit by Order Type in the Name field.
i. In the Box Plot group, select Ignore Outliers for the Outliers field.
j. Select the check box for Averages.

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 44

The box plot should resemble the following:

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 45

k.In the upper right corner of the box plot, click(Maximize) to view additional details. The table of data values displays descriptive statistics for Profit for each order type.

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 46

Even though total profits are highest for the retail sales channel, averages across all channels are very similar, but are a bit higher for catalog and internet sales. This reinforces our company-wide policy to try to increase profits in these channels. Total profits might be higher in retail because there are more customers or more orders for that channel.

l. In the upper right corner, click (Restore).

m. In the upper right corner of the Profit by Continent box plot, click (Maximize) to view additional details. The table of data values displays descriptive statistics for Profit for each continent.

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 47

Even though total profits are highest for Europe, averages are higher in North America and Asia. Because our corporate office is located in North America, we will start by focusing on increasing profits in North America. Total profits might be higher in Europe because there are more customers or more orders for that continent. Also, note the negative average profits in Africa. Why is this occurring? What can we do to increase profits for that continent?

n. In the upper right corner, click (Restore).

9. In the upper right corner, click (Menu) and select Save.
10. Select Email¢ Sign Out in the upper right corner to sign out.

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 48

3. Exploring Data: Part 2

a. Open the browser and sign in to Visual Analytics using Eric’s credentials.
b. Open the VA1- Exercise3.2b report from the Shared Data/Basics/Exercises (HR)folder.
c. On Page 2, create a box plot by assigning the following data items to the specified roles:

Category

Measures

Job Title

Annual Salary

d. Modify the following options for the box plot:

Name

Salary Analysis by Job Title

Outliers

Show Outliers

Show averages

<selected>

The box plot should resemble the following:

ITECH1103- Big Data and Analytics – Lab 3 – Working with Data Items Image 49

e. Maximize the box plot to answer the following questions:

Which job title has the highest average salary? The lowest?
Answer:
Orion Star has had a great sales year and would like to promote some employees. With which job title would you recommend starting the promotion analysis? Why?
Answer:
Hint: After answering the question, click (Restore) in the upper right corner. f. Save the report.

g. Sign out of Visual Analytics.

Assignment Help Australia Assignment Help Australia

Australia University Assignments