14 Dec
The Dirty Dozen: 12 Principles to Implement When Prepping Your Data
A data analytics course is a fantastic way for data analysts and data scientists to learn about data and data prep. This is because data has become such an important aspect of our lives that we need people who understand it.
One of the most important things you can do with your data is clean it up so you can make sense of what’s going on inside your business.
In this blog post, we will discuss 12 principles for data prepping that anyone in the field should know!
However, we won’t go into extreme detail about the data cleaning process or data analysis in this article, although data analysts will benefit from this information. You don’t need any extensive data skills or a deep understanding of big data analytics to appreciate data cleansing.
Any budding data analyst should be able to clean data and prepare data sets. Dirty data is a common part of data science, and cleaning it should be considered a core skill in data analysis.
If you want further insight and secrets on wrangling data while maintaining high data quality, you’ll need to check out Incus Services.
At Incus, you will learn hands-on data analysis skills and come to understand that, as much as we create data every day, it is only once you visualize that data that you can develop sound business strategies. Don’t let anyone tell you that you need a Google Data Analytics Certificate or any Google Career Certificates to get data scrubbing or data wrangling right!
Furthermore, the certificate program supersedes expensive data cleaning and data analytics courses that do not provide job-ready skills or work through presentation tools such as Tableau. Facilitators will answer key questions, which will support more informed decision-making.
Preparing data for analysis can be laborious and time-consuming. However, data cleaning goes hand in hand with data analysis.
Filling in missing values, selecting the correct columns, formatting dates, reformatting text: there are several tedious tasks that need to be completed before you can analyze your data properly.
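The tasks above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a Tableau Prep Builder workflow; the field names, sample values, accepted date formats, and the default score of 0 are all assumptions made up for the example.

```python
from datetime import datetime

# Hypothetical raw records; names and values are invented for illustration.
raw_rows = [
    {"name": "  alice SMITH ", "joined": "14/12/2021", "score": "82", "notes": "x"},
    {"name": "Bob  jones", "joined": "2021-12-15", "score": "", "notes": "y"},
]

# Assumed input formats: day/month/year and ISO year-month-day.
DATE_FORMATS = ("%d/%m/%Y", "%Y-%m-%d")


def parse_date(text):
    """Try each assumed format and return an ISO 8601 date string."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None  # an unparseable date is left as missing


def clean_row(row, default_score=0):
    """Return a cleaned copy of one record."""
    # Selecting the correct columns: only the three fields below are kept,
    # so the "notes" column is dropped.
    return {
        # Reformatting text: collapse extra whitespace and use title case.
        "name": " ".join(row["name"].split()).title(),
        # Formatting dates: normalise every date to YYYY-MM-DD.
        "joined": parse_date(row["joined"]),
        # Filling in missing values: an empty score falls back to a default.
        "score": int(row["score"]) if row["score"].strip() else default_score,
    }


cleaned = [clean_row(row) for row in raw_rows]
```

Whether a missing score should become 0, an average, or stay empty depends on the analysis; the default here is only a placeholder.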
The Dirty Dozen
1. Union – Union is a method for combining data by appending the rows of one table onto another table.
2. Splitting Fields – A field (for example, one named Training) may contain multiple pieces of information that would be easier to analyze if the values were split into separate fields. The two split options are Custom Split and Automatic Split.
3. Cleaning Strings – This allows you to perform relatively quick and easy cleaning, using some simple functions to make your string values look the way you want.
4. Group and Replace – Grouping resolves duplication by applying a particular logic to string fields to recognize a pattern. One of the challenges with string data is making values consistent so that you can count how many of the same entities you have or create relationships between the entities.
5. Filter – A filter in data preparation is how you choose whether to keep or remove data from a data set. In Prep Builder, these options are Keep Only and Exclude, respectively.
6. Join – Joins are designed to add columns to the original data set from an external source. Depending on the Join Type and Join Conditions set, the resulting data set can differ in the number of fields as well as the number of rows. For a join to succeed, the data sets must contain a common field to join on.
7. Aggregate – If you need to adjust the granularity of your data, a great option is to create a step that aggregates or groups the data. Fields are distributed between the Grouped Fields and Aggregated Fields columns based on their data type.
8. Pivot – The Pivot step is the most important step when changing the data structure. There are two types of pivot:
   - Columns to Rows – A columns to rows pivot changes the structure of your data from “wide” to “tall” by turning columns into rows. The ‘Columns to Rows’ pivot is shown with an orange icon.
   - Rows to Columns – A rows to columns pivot changes the structure of your data from “tall” to “wide” by turning rows into columns. The ‘Rows to Columns’ pivot is shown with a dark purple icon.
9. Remove unwanted characters/nulls – Before we attempt to join any sheet to the rest of the data, we should remove the null values in the data. What are null values? A null indicates the absence of a value in a field within the data set. The absence of data is very different from a zero, a new row, or a space.
10. Change data type
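Several of the steps in the list above can be sketched in plain Python to show what each one does to the shape of the data. This is a rough illustration only, not how Tableau Prep Builder implements these steps; the tables, field names, and values are hypothetical.

```python
from collections import defaultdict

# Hypothetical tables, each a list of dicts (one dict per row).
sales_jan = [{"store": "A", "units": 5}, {"store": "B", "units": 3}]
sales_feb = [{"store": "A", "units": 7}, {"store": "C", "units": None}]
# Hypothetical lookup table keyed on the common field "store".
regions = {"A": "North", "B": "South"}

# Union: append the rows of one table onto another.
unioned = sales_jan + sales_feb

# Filter / remove nulls: keep only rows that have a units value.
kept = [row for row in unioned if row["units"] is not None]

# Join: add a column from another source using the common field.
# This behaves like a left join; a store missing from `regions` gets None.
joined = [{**row, "region": regions.get(row["store"])} for row in kept]

# Aggregate: reduce the granularity to one total per store.
totals = defaultdict(int)
for row in joined:
    totals[row["store"]] += row["units"]
totals = dict(totals)

# Pivot (columns to rows): turn one "wide" row into several "tall" rows.
wide = [{"store": "A", "jan": 5, "feb": 7}]
tall = [
    {"store": row["store"], "month": month, "units": row[month]}
    for row in wide
    for month in ("jan", "feb")
]
```

Note how the row count changes at each stage: the union has four rows, the filter leaves three, the join keeps three but adds a field, and the aggregate ends with one entry per store.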
Data prep does not always have to be a tedious task. Tableau Prep Builder is an amazing tool optimized to help you get to your analysis faster by letting you combine, reshape, and confidently clean your data swiftly.
This, along with the Dirty Dozen, provides you and your organization with a wealth of knowledge to prep data effectively and automate the process.
Of course, this is not an exhaustive list or a detailed run-down of the data cleaning process, nor does it segue into the data analysis process. Consider it a form of data manipulation to point you in the right direction… towards the Incus Services Data Analytics Workshop.
HERE AT INCUS SERVICES WE KNOW THE VALUE OF DATA LITERACY AND ARE HERE TO HELP YOU ATTAIN IT!
A huge part of data literacy is understanding the ins and outs of data analytics, and there is nothing more important to Incus than providing a strong foundation and understanding of data analytics and moving you from confusion to clarity. Whatever your objective may be, Incus Services can help.
IF YOU’RE A DATA NOVICE OR JUST LOOKING TO GET THE MOST OUT OF YOUR EXISTING DATA MANAGEMENT, GET IN CONTACT WITH INCUS SERVICES ABOUT THEIR WORKSHOP OR SPECIFIC SERVICES THAT ARE TAILOR-MADE FOR YOUR ORGANIZATION.
But the workshop is just the beginning. Consulting with Incus Services as part of your data improvement drive can make all the difference between being a leading organization and falling behind the competition.
Incus Services can work closely with your organization to help your data talk to you and offer key insights. It is our objective to provide businesses with the machine learning and artificial intelligence strategies that they need to succeed.
Aren’t you ready to take your business to the next level? Why wait another moment to lead the finance sector through technology and digital transformation?
YOU’VE GOT THE DATA AND INCUS SERVICES HAS THE EXPERTISE TO HELP YOU REMAIN LONG-TERM LEADERS IN YOUR FIELD