Data Transformation Overview

The estimated time to complete this lab is 40 minutes.

In this lab, we will first cover what is under the hood. You will then connect to your data and transform it into a usable data model.

Creating reports in Power BI is driven by a need to understand some raw data, to solve a problem, or both. Sometimes it isn’t evident that you have a problem until you see the data. At other times you know you have a problem but not how to solve it, and the answer lies in the data.

The data used in this Power BI training is based on the locations where crocodiles are captured in the Northern Territory, as defined by broad geographic areas referred to as Crocodile Capture Zones. The coordinate locations of traps are not provided to the public.

The zones may be viewed using NR Maps. See folder: Parks and Culture > Wildlife. https://nrmaps.nt.gov.au/nrmaps.html

Zones were generally developed as follows:

  • Borroloola: 100 km buffer from Borroloola township.
  • Katherine Zone 1: 5 km buffer from the Katherine River.
  • Litchfield: an extraction from reserves in GEODATA.
  • Management Zone: originated in 2009 with minor modifications over the years.
  • Nhulunbuy: 50 km buffer from Nhulunbuy township.
  • Shoal Bay, Upper Harbour, Lower Harbour: straight-line boundaries were hand-drawn.
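
To make the idea of a township buffer concrete, here is a minimal Python sketch that tests whether a point lies within a given distance of a centre point, using the haversine great-circle formula. It is an illustration only: the real zones are polygons in the shapefile, the Katherine zone is buffered from a river (a line, not a point), and the centre coordinates below are approximate values assumed for the example rather than taken from the dataset. The names haversine_km, within_buffer and BORROLOOLA_APPROX are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (an approximation)

# Approximate township centre (latitude, longitude in degrees).
# Assumed for illustration only; not taken from the capture dataset.
BORROLOOLA_APPROX = (-16.07, 136.31)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_buffer(point, centre, radius_km):
    """Return True when point lies within radius_km of centre."""
    return haversine_km(point[0], point[1], centre[0], centre[1]) <= radius_km


if __name__ == "__main__":
    # A point half a degree of latitude south of the assumed centre.
    print(within_buffer((-16.57, 136.31), BORROLOOLA_APPROX, 100))
```

The same function would cover the 50 km Nhulunbuy buffer by changing the centre and radius. In practice, a GIS tool or the supplied polygons would be used rather than a distance test.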

The spatial polygons are provided in ESRI shapefile format in geographic coordinates (datum GDA94). The capture table is provided in Microsoft Excel format.

For more information about crocodile captures in the Northern Territory, visit nt.gov.au/emergency/community-safety/crocodile-capture-and-management/map-of-crocodile-captures

This data describes the daily captures within the Northern Territory Crocodile Capture Zones.

Custodian: Parks and Wildlife Commission of the Northern Territory

Agency: Department of Environment, Parks and Water Security, Northern Territory Government

Metadata: http://www.ntlis.nt.gov.au/metadata/export_data?type=html&metadata_id=C67DB1A044D43F

View: https://nrmaps.nt.gov.au/nrmaps.html

This dataset includes date, region and other crocodile capture attributes, which makes it well-suited real-world data for training materials.

Data description:

ORDER  FIELD              DATA FORMAT  DESCRIPTION
1      OBJECTID                        Unique identifier for each record
2      DATE_CAPTURED      date         Date of capture
3      SCIENTIFIC_NAME    text x 50    Scientific name
4      COMMON_NAME        text x 50    Common name
5      CAPTURE_METHOD     text x 50    Method of capture
6      TOTAL_LENGTH       float        Length (cm) between the tip of snout and the tip of tail
7      HEAD_LENGTH        float        Length (cm) between the tip of snout and the end of dorsal cranial platform
8      SEX                text x 1     Female or male
9      SNOUT_VENT_LENGTH  float        Length (cm) between the tip of snout and the beginning of vent (cloaca)
10     TAIL_COMPLETE      text x 1     Whether the tail is complete or not
12     REGION             text x 50    The regions for management zones are Katherine, Nhulunbuy, Darwin, Borroloola and Other/Unknown
13     ZONE_NAME          text x 50    The capture zones are divided into 9 zones: Upper Harbour, Lower Harbour, Shoal Bay, Management Zone, Katherine, Borroloola, Litchfield, Nhulunbuy and Outside Management Zone
14     ZONE_CODE          number       The code for capture zones
15     GROUP_NAME         text x 50    The location group of traps inside the capture zones
16     LOCATION           text x 50    The location area of traps inside the location group
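
As a way of picturing the data description before loading it into Power BI, the following Python sketch models one capture record as a typed structure. It is a sketch under stated assumptions, not part of the lab steps: it assumes the three length fields are floats, SEX and TAIL_COMPLETE are one-character text codes, ZONE_CODE is a whole number, dates arrive as ISO strings (YYYY-MM-DD), and blank cells mean "not recorded". The class, helper names and sample row are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CaptureRecord:
    """One row of the capture table, using the field names from the data description."""
    object_id: int
    date_captured: date
    scientific_name: str
    common_name: str
    capture_method: str
    total_length_cm: Optional[float]
    head_length_cm: Optional[float]
    sex: Optional[str]
    snout_vent_length_cm: Optional[float]
    tail_complete: Optional[str]
    region: str
    zone_name: str
    zone_code: int
    group_name: str
    location: str


def _opt_float(value):
    """Convert a cell to float, treating None or blank text as missing."""
    return None if value is None or str(value).strip() == "" else float(value)


def _opt_text(value):
    """Return stripped text, treating None or blank text as missing."""
    return None if value is None or str(value).strip() == "" else str(value).strip()


def parse_row(row: dict) -> CaptureRecord:
    """Build a CaptureRecord from a dict keyed by the source column names."""
    return CaptureRecord(
        object_id=int(row["OBJECTID"]),
        date_captured=date.fromisoformat(row["DATE_CAPTURED"]),  # assumes ISO dates
        scientific_name=row["SCIENTIFIC_NAME"],
        common_name=row["COMMON_NAME"],
        capture_method=row["CAPTURE_METHOD"],
        total_length_cm=_opt_float(row.get("TOTAL_LENGTH")),
        head_length_cm=_opt_float(row.get("HEAD_LENGTH")),
        sex=_opt_text(row.get("SEX")),
        snout_vent_length_cm=_opt_float(row.get("SNOUT_VENT_LENGTH")),
        tail_complete=_opt_text(row.get("TAIL_COMPLETE")),
        region=row["REGION"],
        zone_name=row["ZONE_NAME"],
        zone_code=int(row["ZONE_CODE"]),
        group_name=row["GROUP_NAME"],
        location=row["LOCATION"],
    )


# Hypothetical sample row, invented for illustration; not real capture data.
SAMPLE_ROW = {
    "OBJECTID": "1", "DATE_CAPTURED": "2020-01-15",
    "SCIENTIFIC_NAME": "Crocodylus porosus", "COMMON_NAME": "Saltwater Crocodile",
    "CAPTURE_METHOD": "Trap", "TOTAL_LENGTH": "312", "HEAD_LENGTH": "",
    "SEX": "M", "SNOUT_VENT_LENGTH": "160.5", "TAIL_COMPLETE": "Y",
    "REGION": "Darwin", "ZONE_NAME": "Shoal Bay", "ZONE_CODE": "3",
    "GROUP_NAME": "Example group", "LOCATION": "Example location",
}

if __name__ == "__main__":
    print(parse_row(SAMPLE_ROW))
```

The same typing decisions (text, decimal number, whole number, date) are the ones you will make for each column in the Power Query Editor.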

Prep 1: Fuel Cap Release & Other Options

Power BI Desktop offers a comprehensive range of options and settings that allow users to customise their experience and optimise various aspects of their data analysis and reporting. Let’s explore some of these sections in more detail:

About Power BI Desktop:

File Menu: Go to the File menu and select About. Note the version that is specified, the edition (32-bit or 64-bit) and the month and year. You will need this information when something unexpected happens and either your corporate support team or Microsoft support asks for it.

Preview Features:

Preview Features: This section allows you to enable or disable experimental or beta features that are still under development. It is only offered with the Cloud edition of Power BI Desktop, not the On Premises or Report Server edition. It provides early access to new functionality and enhancements, but be cautious, as preview features may be less stable than released features.

Data Load:

Data Load: Here, you can configure settings related to data loading behaviour. It includes options such as parallel loading of tables to speed up data retrieval, automatic data type detection, and data cache management to optimise performance.

Power Query Editor:

Global: This section provides options to control the behaviour of the Power Query Editor, the tool used for data transformation and cleansing. Users can adjust settings such as whether the formula bar is displayed, how data previews are rendered, and whether M IntelliSense is enabled.

Privacy: Users can define privacy levels for each data source, specifying whether data from different sources can be combined or accessed by other queries.

Diagnostics: This setting allows users to enable diagnostic logging for query performance analysis and troubleshooting.

Privacy:

Data Privacy: This section provides options for managing data privacy settings. You can specify the level of data privacy for each data source, control privacy levels for combining data from multiple sources, and enable Enhanced Data Privacy (EDP) mode for increased data security.

Security:

Data Security: You can configure settings related to data security, such as defining privacy levels, encrypting connections, and enabling or disabling Fast Combine, which allows Power BI Desktop to optimise data loading by bypassing certain privacy checks.

File Security: This option allows users to set a password to protect their Power BI Desktop files, restricting unauthorised access.

Save and Recover:

AutoRecover: You can enable and adjust automatic recovery to save your Power BI Desktop file at regular intervals, helping to prevent data loss in case of unexpected application crashes or system failures.

File Path: This setting allows you to specify the default file path for saving your Power BI Desktop files.

Report Settings:

Current File: This section contains settings specific to the current report. Users can customise the interaction behaviour of visuals, such as enabling cross-highlighting or enabling preview features like Smart Narratives or Smart Guides.

These options and settings provide users with fine-grained control over their Power BI Desktop environment, allowing them to optimise data loading, ensure data privacy and security, customise the behaviour of the Power Query Editor, and configure report-specific settings. By exploring and utilising these options effectively, users can tailor their Power BI Desktop experience to their specific needs and enhance their data analysis and reporting workflows.

Power BI Source Control:

Save as PBIP: If you’re working on a new project or you’ve opened an existing Power BI Desktop file (.pbix), you can save your work as a Power BI project file (.pbip). Let’s take a closer look at what you see in your project’s root folder:

<project name>.Dataset

A collection of files and folders that represent a Power BI dataset. It contains some of the most important files you’re likely to work on, like model.bim. To learn more about the files and subfolders in here, see Project Dataset folder.

<project name>.Report

A collection of files and folders that represent a Power BI report. To learn more about the files and subfolders in here, see Project Report folder.

.gitignore

Specifies intentionally untracked files Git should ignore. Power BI Desktop creates the .gitignore file in the root folder when saving if it doesn’t already exist.

The Dataset and Report subfolders each have default Git-ignored files specified in .gitignore:

  • Dataset
      .pbi\localSettings.json
      .pbi\cache.abf

  • Report
      .pbi\localSettings.json

<project name>.pbip

The PBIP file contains a pointer to a report folder. Opening a PBIP file opens the targeted report and model for authoring.
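
The root folder layout described above can be checked with a short script, which may be handy before committing a project to source control. This is a minimal Python sketch, assuming the four top-level items named in this section (the .Dataset folder, the .Report folder, .gitignore and the .pbip file). The function names are hypothetical and are not part of Power BI, and newer versions of Power BI Desktop may use different folder names.

```python
from pathlib import Path
from typing import List


def expected_items(project_name: str) -> List[str]:
    """Top-level items this section describes for a saved Power BI project."""
    return [
        f"{project_name}.Dataset",   # dataset (model) folder
        f"{project_name}.Report",    # report folder
        ".gitignore",                # created by Power BI Desktop if missing
        f"{project_name}.pbip",      # pointer file that opens the report
    ]


def missing_items(root: Path, project_name: str) -> List[str]:
    """Return the expected items that do not exist under root, in order."""
    return [name for name in expected_items(project_name)
            if not (root / name).exists()]


if __name__ == "__main__":
    # "CrocCaptures" is a placeholder project name for illustration.
    print(missing_items(Path("."), "CrocCaptures"))
```

An empty list means every expected item is present. Anything listed is worth investigating before you share the folder with colleagues.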