The Dirty Dozen


But what value can supposedly forward-looking decisions have if they rest on questionable or contaminated sources? Whether outdated, duplicated or simply incomplete, the reasons for poor-quality information are many and varied. The problem of dirty data particularly affects companies that collect large amounts of data. "If the mess already stings, spring cleaning is more than overdue," says Andreas Köninger, digitization expert and member of the board at SinkaCom. "After all, the richer the stores, the harder it gets to clean them up." Fortunately, measures taken at the structural level can improve data hygiene immensely and help ensure reliable reporting.
Contrary to popular belief, more is not always better when it comes to data management. Quite the opposite: collecting data indiscriminately tends to be counterproductive. "In practice, it is regularly seen in this regard that, in addition to complexity, effort and costs also increase exponentially, while data quality falls to an even greater extent," emphasizes Andreas Köninger. The first step towards improving the situation is therefore to scrutinize one's own processes. What is the goal of collecting the information? Does it help optimize processes? Can key figures be derived from it? If data has no relevance, it is worth deleting it to thin out the thicket. Doing so also complies with the GDPR principle of data minimization.
"To enable effective analysis in the next step, there needs to be some form of compatibility between all the data formats used," says Andreas Köninger. "It starts with a common corporate language and ends with fully automated transformation of information." Only if all departments use the same designation for a business process or customer, for example, can impurities such as duplicate entries be detected at all. Different designations such as "Managing Director" and "Chairman of the Management Board", for instance, can suggest to a machine that there are several positions, even though they refer to the same job.
Where necessary, standalone applications help transfer the information into the desired format and automatically clean up inaccuracies. Especially when data from external partners arrives regularly, a specialized program or a well-designed interface reduces the potential for dirty data. Once all relevant information is standardized across departments and thus made machine-readable, a big step has been taken. Lean structures work wonders against dirty data.