Power BI is often regarded as simple tool and many folks think it is just a drag and drop tool and your work is done. Many of my colleagues have the impression that PowerBI is just another PowerPoint with Excel capability. One has to connect the data and that's it, the work is done. Because they do lot of data massaging already in excel which is manual and repetitive task and of course not dynamic :). What people often don't see is the effort behind making a good data model, measures or DAX, efficiency and various other factors
In order to make data appear as per your wish is paramount task. With this in mind, I thought to introduce you to simple understanding on DAX
There will be more blogs on DAX and today is just a glimpse of it!
DAX are like custom calculations and most of you are familiar with the name. What people don't know is how it is getting calculated, evaluated behind the scene and thus commit mistakes
Scenario 1
I have a table called as dept where I have records of various departments and employee counts in each departments. You can notice HR is repeating twice and a duplicate row is created. However the data is accurate and rather than editing the count, data entry operator has created a new row item and inserted HR department with another set of 20 employees
I have created a sum measure to find total employee count
total count = SUM(Dept[Count])
Then as step 2, I am using an iterator function SUMX to create another count of employees
Measure 1 = SUMX(Dept,Dept[Count])
Measure 2 = SUMX(Dept,[total count])
What is your opinion? Will both options give you same result or not?
Solution and explanation
Measure 1 will result in 100
Measure 2 will result in 140
Why was that?
The answer is SumX is an iterator function and behaves in row context. Here the rows were duplicating (Dept and count both were same) and hence it treats them as duplicates and evaluates twice (Measure 2 = SUMX(Dept,[total count]))
So what was your observation from above example!
In reality if we don't want to commit mistake then we must familiarize with DAX concepts rather than making hints
In true world, we don't see such row example and if duplicates are there then we must treat duplicates first and then our DAX. That's the first data modelling concept which I was referring in the beginning :). If your data model is perfect, there is high chances that whatever DAX you write, it is not going to impact your result
How to remove duplicates if it is coming from source?
I have an idea! Do you know how can we treat duplicates to avoid above issue :)
Thanks for your patience in reading this blog
Idea, suggestions are welcome
Signing off,
Neeraj
Comments