The term data warehouse comes up constantly in the news, blogs, social media, and online magazines. But what does it actually mean? The concept has taken on a life of its own. Big data is the new force in storage technology because it frees data storage from rigid structures. Big data and the data warehouse are both integral parts of a modern data management strategy. The data warehouse is the central location where data is stored and managed for all of the organization's applications. The volume of data that companies have to store is so large that they will eventually run out of space. A data warehouse is centralized and dynamic, and it involves the process of collecting and saving data, but it is not one massive database, which means the company does not have to invest in expensive, high-end equipment and technologies. We also have to move away from a one-size-fits-all mentality and think outside the box: every company is different, so its needs will be different as well.
One of the benefits of a modern data warehouse is that it is not just another database; instead, it is a component that you open and close as necessary to save and retrieve data. This is valuable because you can mix brands and components to suit your needs. A traditional data warehouse integrated everything into a single source that lived in one place across multiple storage volumes, whereas modern systems are equipped with interfaces to the DW systems. There are specialized packages that have their own back-end databases in storage for each specific use and purpose.
The goal of a data warehouse is to give users quick and reliable access to a view of the organization's data and to support forecasting for better-informed decision making, down to the lowest level. The information needs to be delivered in a consistent manner, and it needs to be a single version that is consistent with what has been saved. The data is read-only, but business rules change over time, and the data warehouse needs to be resilient to those changes.
What type of data resides in a data warehouse?
A data warehouse is a collection of data that's:
Separate from operational systems
Accessible and available for queries
Subject-oriented by business
Integrated and consistently named and defined
Associated with defined periods of time
Static (non-volatile); meaning that updates aren't made
(Walters, 2016)
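The characteristics listed above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module only as a stand-in for a real warehouse engine; the table name fact_supply_request, its columns, and the sample data are hypothetical and are not part of the GCSS-A design. Each row carries a time period, and a trigger rejects updates so that loaded rows stay static.

```python
import sqlite3


def build_warehouse():
    """Create an in-memory table that is time-stamped and rejects updates."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE fact_supply_request (
            request_id   INTEGER PRIMARY KEY,
            unit_name    TEXT NOT NULL,      -- subject: who asked
            item_name    TEXT NOT NULL,      -- subject: what was asked for
            quantity     INTEGER NOT NULL,
            period_month TEXT NOT NULL       -- defined period of time
        );
        -- Non-volatile: any UPDATE on loaded rows is aborted.
        CREATE TRIGGER fact_supply_request_no_update
        BEFORE UPDATE ON fact_supply_request
        BEGIN
            SELECT RAISE(ABORT, 'warehouse rows are read-only');
        END;
        """
    )
    return conn


def load_rows(conn, rows):
    """Append rows; loading is the only way data enters the table."""
    conn.executemany(
        "INSERT INTO fact_supply_request "
        "(unit_name, item_name, quantity, period_month) VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()


def try_update(conn):
    """Return True if an UPDATE succeeded, False if the trigger blocked it."""
    try:
        conn.execute("UPDATE fact_supply_request SET quantity = 0")
        return True
    except sqlite3.DatabaseError:
        return False


def monthly_totals(conn):
    """Query the data by period, as an analyst would."""
    return conn.execute(
        "SELECT period_month, SUM(quantity) FROM fact_supply_request "
        "GROUP BY period_month ORDER BY period_month"
    ).fetchall()
```

A production warehouse would normally enforce read-only access through permissions rather than a trigger; the trigger is used here only to keep the sketch self-contained.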
To have a successful data warehouse you need the right software. The first thing to consider is always the budget, so with this in mind we will narrow the research until we find the right product for the business need. To make a selection we have to consider:
In terms of software, there are several products that we will consider:
To succeed with the data warehouse we need to synchronize its elements at each step of the project; automation will help the business and the technology integrate effectively. The data warehouse has to offer value for money. It needs to help reduce development time and increase the value delivered. We need to set the project up for success by defining a timeframe and an environment that support it. The product should help implement best practices by standardizing the approach to the development process; data lineage and automation are a must. Making the data warehouse easy to maintain and modify keeps development seamless, and that benefit extends far beyond the initial period and avoids wasted time. Finally, manual effort should be eliminated from the design, build, and administration processes.
For this project I will develop a data warehouse for the Global Combat Support System-Army (GCSS-A). I will use SAP because it is state-of-the-art software that provides oversight of logistics and finance systems as an automated combat enabler for Soldiers, embedded in the DoD financial system. This will help provide highly accurate cost management and material support handling. I will use the web-based system to make it accessible from virtually any computer by any personnel who possess a Common Access Card (CAC).
Sharing information has been linked with databases for as long as there has been systems development. Today, information sharing needs to be immediate, efficient, and secure, but retrieving data effectively from all the databases within the enterprise requires a combined and coordinated effort between the systems. There is a need for one location for storing and sharing data instead of trying to link the multiple databases that exist today, and this is where the data warehouse comes in.
For business analysts, data warehousing is a dream come true: a place where all the information and activities are joined together and you only need a set of analytical tools. But to accomplish this dream you need to plan a successful data warehouse system. The purpose of the data warehouse system is to help provide accurate and timely information to determine the best path to take in the decision-making process. There are seven basic steps to plan, design, and set up a successful data warehouse.
This project is now moving to the next step, which is implementation. After evaluating the options, we determined that SAP data warehouse software is the one we will use. SAP provides the tasks and concepts for managing task chains and configuring scheduling profiles. In the implementation we will design task chains that capture the dependencies in the data load processes. The schedule will be simple and flexible, and the executions will be monitored. Access to the program requires user authorization and authentication, which will be implemented with the user account and authentication service. This is a Department of Defense project and, like any other, it needs data privacy and protection, which is governed by legal requirements and privacy acts that are specific to the government. SAP provides features and functions that comply with federal requirements. After each transaction the personal data is deleted to avoid identity theft. Every time the task is activated, the “responsible” database entry is cleared. Here are the routes and query parameters involved:
DELETE '/instance/:instanceId' clears all information regarding a specific instanceId, for example, when the corresponding service is deleted.
POST '/instance/:instanceId/anonymize' anonymizes all known user relevant information for the whole instanceId.
Query parameters for route DELETE '/taskChain/:namespace/:taskChainId/' (same applies to /task/:namespace/:taskId):
?clear=true
The change log code is:
xs set-env dwf-toe AUDIT_OLD_VALUE "true"
(SAP, 2017)
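As a rough illustration of how the routes above could be addressed from a script, the following sketch only builds the HTTP requests and does not send them. The base URL, the helper names, and the sample identifiers are assumptions made for illustration; authentication and the real service address are outside its scope.

```python
from urllib.parse import quote
from urllib.request import Request


def delete_instance_request(base_url, instance_id):
    """Build DELETE /instance/:instanceId (clears data for one instance)."""
    url = f"{base_url}/instance/{quote(instance_id, safe='')}"
    return Request(url, method="DELETE")


def anonymize_instance_request(base_url, instance_id):
    """Build POST /instance/:instanceId/anonymize."""
    url = f"{base_url}/instance/{quote(instance_id, safe='')}/anonymize"
    return Request(url, method="POST")


def clear_task_chain_request(base_url, namespace, task_chain_id):
    """Build DELETE /taskChain/:namespace/:taskChainId/ with ?clear=true."""
    url = (
        f"{base_url}/taskChain/{quote(namespace, safe='')}/"
        f"{quote(task_chain_id, safe='')}/?clear=true"
    )
    return Request(url, method="DELETE")
```

Sending one of these requests would be a matter of passing it to urllib.request.urlopen together with whatever credentials the service requires.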
We will use DataStore and name the project GCSS-A:
The database will run in development mode. To create a task chain, right-click the task chain folder and choose New, then Task Chain; add a new task, set its properties, and save.
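The idea of a task chain with dependencies between data load steps can be sketched independently of SAP. The example below uses Python's standard graphlib module to derive a valid execution order; the task names are hypothetical and this is not how the SAP scheduler is implemented.

```python
from graphlib import TopologicalSorter


def run_order(dependencies):
    """Return the tasks in an order that respects every dependency.

    `dependencies` maps each task to the set of tasks that must finish
    before it can start. A circular dependency raises graphlib.CycleError.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical load chain: two extracts feed staging, staging feeds the store.
GCSSA_CHAIN = {
    "load_staging": {"extract_flat_files", "extract_virtual_tables"},
    "load_datastore": {"load_staging"},
}
```

Calling run_order(GCSSA_CHAIN) places both extract tasks before load_staging, and load_staging before load_datastore.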
We need to extract sources with synonyms, virtual tables and flat files.
Flowgraph in Web IDE (SAP, 2017)

    -- All columns, then selected name columns
    SELECT * FROM SalesLT.Customer;
    SELECT Title, FirstName, MiddleName, LastName, Suffix FROM SalesLT.Customer;

    -- Concatenation and type conversion
    SELECT SalesPerson, Title + ' ' + LastName AS CustomerName, Phone FROM SalesLT.Customer;
    SELECT CAST(CustomerID AS VARCHAR) + ': ' + CompanyName AS CustomerCompany FROM SalesLT.Customer;
    SELECT SalesOrderNumber + '(' + STR(RevisionNumber, 1) + ')' AS OrderRevision,
           CONVERT(nvarchar(30), OrderDate, 102) AS OrderDate
    FROM SalesLT.SalesOrderHeader;

    -- NULL handling
    SELECT FirstName + ' ' + ISNULL(MiddleName + ' ', '') + LastName AS CustomerName FROM SalesLT.Customer;
    SELECT CustomerID, COALESCE(EmailAddress, Phone) AS PrimaryContact FROM SalesLT.Customer;
    SELECT SalesOrderID, OrderDate,
           CASE WHEN ShipDate IS NULL THEN 'Awaiting Shipment' ELSE 'Shipped' END AS ShippingStatus
    FROM SalesLT.SalesOrderHeader;

    -- Distinct values
    SELECT DISTINCT City, StateProvince FROM SalesLT.Address;

    -- Customers and products that do not appear on any order
    SELECT c.CustomerID, p.ProductID
    FROM SalesLT.Customer AS c
    FULL JOIN SalesLT.SalesOrderHeader AS oh ON c.CustomerID = oh.CustomerID
    FULL JOIN SalesLT.SalesOrderDetail AS od ON od.SalesOrderID = oh.SalesOrderID
    FULL JOIN SalesLT.Product AS p ON p.ProductID = od.ProductID
    WHERE oh.SalesOrderID IS NULL
    ORDER BY ProductID, CustomerID;
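To see what the COALESCE and CASE queries return, the sketch below runs them against a few made-up rows in an in-memory SQLite database. SQLite is only a convenient stand-in for the SQL Server sample schema: the schema prefix SalesLT is dropped, the tables are reduced to the columns involved, and the sample rows are invented.

```python
import sqlite3


def primary_contacts(customers):
    """Return (CustomerID, PrimaryContact), preferring email over phone."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Customer (CustomerID INTEGER, EmailAddress TEXT, Phone TEXT)"
    )
    conn.executemany("INSERT INTO Customer VALUES (?, ?, ?)", customers)
    rows = conn.execute(
        "SELECT CustomerID, COALESCE(EmailAddress, Phone) AS PrimaryContact "
        "FROM Customer ORDER BY CustomerID"
    ).fetchall()
    conn.close()
    return rows


def shipping_status(orders):
    """Return (SalesOrderID, ShippingStatus) based on whether ShipDate is NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE SalesOrderHeader (SalesOrderID INTEGER, ShipDate TEXT)"
    )
    conn.executemany("INSERT INTO SalesOrderHeader VALUES (?, ?)", orders)
    rows = conn.execute(
        "SELECT SalesOrderID, "
        "CASE WHEN ShipDate IS NULL THEN 'Awaiting Shipment' ELSE 'Shipped' END "
        "AS ShippingStatus FROM SalesOrderHeader ORDER BY SalesOrderID"
    ).fetchall()
    conn.close()
    return rows
```

Python's None is stored as SQL NULL, so a customer without an email address falls back to the phone number, and an order without a ship date is reported as awaiting shipment.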
A data warehouse is challenging in security terms. A large system that serves many user communities needs enough flexibility to keep the data available to users as needed, while recording their activities and resisting attacks. Because a data warehouse contains data from many sources, it is a lucrative target for hackers. A strong security structure will improve the effectiveness of the data warehouse. In our case we need warehouse security because the system is used by many divisions within the Army. The infrastructure must ensure that every employee can see only the data relevant to them and nothing else.
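The requirement that each user sees only relevant data, while activity is recorded, can be sketched as a simple row filter. The User and AccessLog types, the division field, and the sample values are hypothetical; a real deployment would rely on the authorization features of the warehouse platform rather than application code like this.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    divisions: frozenset  # divisions this user is cleared to see


@dataclass
class AccessLog:
    entries: list = field(default_factory=list)


def visible_rows(user, rows, log):
    """Return only the rows from the user's divisions and record the access."""
    allowed = [row for row in rows if row["division"] in user.divisions]
    # Record who asked and how many rows were released.
    log.entries.append((user.name, len(allowed)))
    return allowed
```

For example, a user cleared only for a "logistics" division would receive the logistics rows and nothing from finance, and the log would show one entry with that user's name and row count.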
For GCSS-A we will use SAP Security. SAP maintains a dedicated security response and is committed to identifying and addressing issues in SAP products and the cloud to keep them secure. IT security requires close attention because it involves and affects processes, people, and technology, and security issues need to be avoided from the beginning. SAP is configured as a whole, with the secure configuration applied to the related systems: authorization, encryption, and logging, with access control checks by computer, IP, and company. The mission-critical data involved in the process is protected from attacks whether it is on-site or in the cloud. This system safeguards the data and provides robust IT security.
Data migration is the process of moving data from one storage system to another. For GCSS-A it needs to be done because there is an issue with system compatibility. We have embarked on a data migration project to replace the servers and storage equipment; it will help with the maintenance of the infrastructure and with the migration of applications for data center relocations.
To carry out the data migration we need a plan. The plan covers the impact on the business of delays or hiccups in the migration so that downtime in the system can be prevented. Some questions we need to ask are how long the migration will take and how much downtime is required, and we should set aside some additional time for data compatibility. There are three categories of data moves:
When we start the GCSS-A migration we will make sure that the information we are moving is the most up to date, in the right format, and in the right order, and that the old data is saved before it is moved. After that there will be a validation period in which the migrated information is compared with the information saved from the old system.
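The validation step, comparing migrated records with the copy saved from the old system, can be sketched as a keyed comparison of row fingerprints. The function names, the choice of SHA-256, and the sample record layout are assumptions for illustration and do not describe the tooling actually used for GCSS-A.

```python
import hashlib


def row_fingerprint(row):
    """Hash a record's fields in a fixed order so equal rows hash equally."""
    canonical = "|".join(f"{name}={row[name]}" for name in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def validate_migration(old_rows, new_rows, key):
    """Compare saved legacy rows with migrated rows, matched on `key`.

    Returns the keys that are missing after migration, keys that appear
    only in the new system, and keys whose contents changed.
    """
    old = {row[key]: row_fingerprint(row) for row in old_rows}
    new = {row[key]: row_fingerprint(row) for row in new_rows}
    return {
        "missing": sorted(set(old) - set(new)),
        "unexpected": sorted(set(new) - set(old)),
        "changed": sorted(k for k in set(old) & set(new) if old[k] != new[k]),
    }
```

An empty result in all three lists would indicate that every saved record arrived unchanged; anything else points to the specific keys that need attention before the validation period closes.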
GCSS-A is an ERP system that will be in charge of the entire logistics and funding enterprise. It replaces the legacy systems PBUSE, SAMS-E, SSF-MW, and FCM. The development of this centralized system has opened new parameters and indexes for all the fields involved in the transition. The single database will help provide accurate, real-time logistics and financial information to every component. The deployment strategy has two phases: the first is test and evaluation, which will inform future fielding, and the second is general fielding. The first wave performs the full system migration, but only for a small group, so that it does not affect the larger group; in this wave the data migration impacts only 14,000 users. The second wave affects 140,000 users, ten times the first wave. For this wave the blackout will happen while the Soldiers are taking the training, so that the data migration can be validated by the end of the training. Data migration and implementation are not just another training event; they are also a cultural change for fields that were separate before and are now coming together to give stakeholders visibility across all fields in real time.
For .csv files there are two variations of String.Split:

    String.Split(char[]) in C# or String.Split(Char()) in VB.NET
    String.Split(char[], int) in C# or String.Split(Char(), Integer) in VB.NET

This is done with the comma-separated value:

    string values = "TechRepublic.com, CNET.com, News.com, Builder.com, GameSpot.com";
    string[] sites = values.Split(',');
    foreach (string s in sites)
    {
        Console.WriteLine(s);
    }

This will be generated:

    TechRepublic.com
    CNET.com
    News.com
    Builder.com
    GameSpot.com

The equivalent VB.NET code:

    Dim values As String
    values = "TechRepublic.com, CNET.com, News.com, Builder.com, GameSpot.com"
    Dim sites As String() = Nothing
    sites = values.Split(",")
    Dim s As String
    For Each s In sites
        Console.WriteLine(s)
    Next s

The second variation takes an array of separators and a maximum number of substrings:

    char[] sep = new char[3];
    sep[0] = ',';
    sep[1] = ':';
    sep[2] = ';';
    string values = "TechRepublic.com: CNET.com, News.com, Builder.com; GameSpot.com";
    string[] sites = values.Split(sep, 4);
    foreach (string s in sites)
    {
        Console.WriteLine(s);
    }

Classes for reading and writing CSV files:

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Text;

    namespace ReadWriteCsv
    {
        /// <summary>
        /// Class to store one CSV row
        /// </summary>
        public class CsvRow : List<string>
        {
            public string LineText { get; set; }
        }

        /// <summary>
        /// Class to write data to a CSV file
        /// </summary>
        public class CsvFileWriter : StreamWriter
        {
            public CsvFileWriter(Stream stream) : base(stream) { }
            public CsvFileWriter(string filename) : base(filename) { }

            /// <summary>
            /// Writes a single row to a CSV file.
            /// </summary>
            /// <param name="row">The row to be written</param>
            public void WriteRow(CsvRow row)
            {
                StringBuilder builder = new StringBuilder();
                bool firstColumn = true;
                foreach (string value in row)
                {
                    // Add separator if this isn't the first value
                    if (!firstColumn)
                        builder.Append(',');
                    // Implement special handling for values that contain comma or quote
                    // Enclose in quotes and double up any double quotes
                    if (value.IndexOfAny(new char[] { '"', ',' }) != -1)
                        builder.AppendFormat("\"{0}\"", value.Replace("\"", "\"\""));
                    else
                        builder.Append(value);
                    firstColumn = false;
                }
                row.LineText = builder.ToString();
                WriteLine(row.LineText);
            }
        }

        /// <summary>
        /// Class to read data from a CSV file
        /// </summary>
        public class CsvFileReader : StreamReader
        {
            public CsvFileReader(Stream stream) : base(stream) { }
            public CsvFileReader(string filename) : base(filename) { }

            /// <summary>
            /// Reads a row of data from a CSV file
            /// </summary>
            public bool ReadRow(CsvRow row)
            {
                row.LineText = ReadLine();
                if (String.IsNullOrEmpty(row.LineText))
                    return false;

                int pos = 0;
                int rows = 0;

                while (pos < row.LineText.Length)
                {
                    string value;

                    // Special handling for quoted field
                    if (row.LineText[pos] == '"')
                    {
                        // Skip initial quote
                        pos++;

                        // Parse quoted value
                        int start = pos;
                        while (pos < row.LineText.Length)
                        {
                            // Test for quote character
                            if (row.LineText[pos] == '"')
                            {
                                // Found one
                                pos++;

                                // If two quotes together, keep one
                                // Otherwise, indicates end of value
                                if (pos >= row.LineText.Length || row.LineText[pos] != '"')
                                {
                                    pos--;
                                    break;
                                }
                            }
                            pos++;
                        }
                        value = row.LineText.Substring(start, pos - start);
                        value = value.Replace("\"\"", "\"");
                    }
                    else
                    {
                        // Parse unquoted value
                        int start = pos;
                        while (pos < row.LineText.Length && row.LineText[pos] != ',')
                            pos++;
                        value = row.LineText.Substring(start, pos - start);
                    }

                    // Add field to list
                    if (rows < row.Count)
                        row[rows] = value;
                    else
                        row.Add(value);
                    rows++;

                    // Eat up to and including next comma
                    while (pos < row.LineText.Length && row.LineText[pos] != ',')
                        pos++;
                    if (pos < row.LineText.Length)
                        pos++;
                }

                // Delete any unused items
                while (row.Count > rows)
                    row.RemoveAt(rows);

                // Return true if any columns read
                return (row.Count > 0);
            }
        }
    }

Here is the code broken down by sections:

    using System;
    using System.IO;
    using LumenWorks.Framework.IO.Csv;

    void ReadCsv()
    {
        // open the file "data.csv" which is a CSV file with headers
        using (CsvReader csv = new CsvReader(new StreamReader("data.csv"), true))
        {
            int fieldCount = csv.FieldCount;
            string[] headers = csv.GetFieldHeaders();

            while (csv.ReadNextRecord())
            {
                for (int i = 0; i < fieldCount; i++)
                    Console.Write(string.Format("{0} = {1};", headers[i], csv[i]));
                Console.WriteLine();
            }
        }
    }

    using System.IO;
    using LumenWorks.Framework.IO.Csv;

    void ReadCsv()
    {
        // open the file "data.csv" which is a CSV file with headers
        using (CsvReader csv = new CsvReader(new StreamReader("data.csv"), true))
        {
            // bind the reader to a data-bound control
            myDataRepeater.DataSource = csv;
            myDataRepeater.DataBind();
        }
    }
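The quoting rule used by the writer and reader above (enclose values that contain a comma or a quote in double quotes, and double any embedded quotes) is the same one Python's standard csv module applies by default. The following sketch round-trips a row through that module for comparison; the helper names and sample values are illustrative only.

```python
import csv
import io


def write_rows(rows):
    """Serialize rows to CSV text, quoting only where needed."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def read_rows(text):
    """Parse CSV text back into a list of rows."""
    return [row for row in csv.reader(io.StringIO(text, newline=""))]


if __name__ == "__main__":
    sample = [["TechRepublic.com", 'He said "hi"', "a,b"]]
    text = write_rows(sample)
    print(text, end="")
    print(read_rows(text) == sample)
```

The plain value is written as-is, while the values containing a quote or a comma are wrapped in double quotes, and reading the text back yields the original row.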
Data warehouse reference architectures. (2014, March 26). Retrieved from https://www.lynda.com/SQL-Server-tutorials/Data-warehouse-reference-architectures/156150/167724-4.html
Database / Hardware Tool Selection in Data Warehousing. (n.d.). Retrieved from https://www.1keydata.com/datawarehousing/tooldatabase.html
Goals of a Data Warehouse - Rensselaer Data Warehouse Project. (n.d.). Retrieved from http://www.rpi.edu/datawarehouse/dw-goals.html
IBM Analytics - Analytic Solutions for Business. (n.d.). Retrieved from https://www.ibm.com/analytics
SQL Server 2016 | Microsoft. (n.d.). Retrieved from https://www.microsoft.com/en-us/sql-server/sql-server-2016
Good one to know! (n.d.). Retrieved from http://www.businessdictionary.com/definition/SAP.html
Fulton, S. M. (2013, September 25). What Is Data Warehousing Today? - Understanding Data Warehousing. Retrieved from http://www.tomsitpro.com/articles/data_governance-big_data-business_analytics-shadow_it-hadoop,2-549-2.html
Walls, D., & Scott, M. D. (1999, December 20). 7 Steps to Data Warehousing. Retrieved from http://www.itprotoday.com/microsoft-sql-server/7-steps-data-warehousing
SAP Help Portal. (n.d.). Retrieved from https://help.sap.com/viewer/ff18034f08af4d7bb33894c2047c3b71/7.5.9/en-US/b2e50138fede083de10000009b38f8cf.html
Develop your agile DW with SAP Web IDE - SAP HANA SQL Data Warehouse - SAP HANA. (2017, December 8). Retrieved from https://blogs.saphana.com/2017/12/08/web-ide-sap-hana-sql-data-warehouse/
Securing a Data Warehouse. (n.d.). Retrieved from https://docs.oracle.com/cd/B28359_01/server.111/b28314/tdpdw_security.htm#TDPDW0121
Why SAP for Security | SAP Security Overview. (n.d.). Retrieved from https://www.sap.com/corporate/en/company/security.html
McDonough, J. (2018, January 23). The United States Army | GCSS-Army. Retrieved from https://gcss.army.mil/Library/TopStories/WaveOneEnds.aspx
Patton, T. (2006, January 24). Easily parse string values with .NET. Retrieved from https://www.techrepublic.com/article/easily-parse-string-values-with-net/
Rouse, M. (2017, April). What is data migration? - Definition from WhatIs.com. Retrieved from http://searchstorage.techtarget.com/definition/data-migration
Wood, J. (2012, July 4). Reading and Writing CSV Files in C# - CodeProject. Retrieved from https://www.codeproject.com/Articles/415732/Reading-and-Writing-CSV-Files-in-Csharp