Pragmatic Works Nerd News

Best Practice To Trim Before Removing Duplicates or Merging In Power Query Editor

Written by Matt Peterson | Feb 09, 2021

In last week’s blog, I wrote and did a video about how to remove duplicate records and keep the most recent entry as long as a date column was part of the data source.  I came across the scenario while giving training on Power BI with my company Pragmatic Works.  See the video below:

This week, while doing another two-day training I came across a different scenario from a follow-up conversation from day 1.  I had explained how to remove duplicate records and one of the students started working on a Power BI project she has for her company.  On day 2 the student informed me that her remove duplicates step was not working.  I said that is odd and I asked to see the data.  In one of her table visuals, I could see that it appeared that a few of the records had duplicates based on the name column.  After further investigation though, we figured out the culprit. 


She had done all the steps correctly, but it was a data integrity issue.  In her data source, the person in charge of entering the data had mistakenly typed a space for a few of the records before entering the name value.  Thus, when Power BI looked for a duplicate in the name column it did its job correctly because “Jamie” is unique compared to “ Jamie”.  That leading white space was the culprit.  Not only did it affect her Remove Duplicates step, but it would also have affected a merging of two queries if she had included that as part of her report. 


For this week’s video, I want to show how to use the Trim function in the Power Query Editor as a best practice to avoid these unwanted outcomes for any future merges or removing of duplicates.