1. What does the data actually mean?
Don’t underestimate the impact of individuals or teams working in silos. What is logical to you and your team may not be obvious to everyone. Clearly defining what each piece of data means is key to keeping everyone on a project aligned. It also saves developers from having to dig for clarification later on. What’s the format? Is the data compared to another value? Are there any thresholds that trigger special formatting or an alert? What’s the business logic? How is it calculated?
Here is an example:
Average lead time:
[Definition] The average lead time is the sum of the time between the start and completion of the production…
[Logic]…from department X, Y, Z.
[Format] It is accounted for in days.
[Comparison] We compare it to the target and the previous year to date (YTD).
[Threshold] For the production of X, the target lead time is 12 days. A delay counts as minor from 12+ days and as major from 16+ days.
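To show how a definition like this can be handed to developers, here is a minimal Python sketch of the threshold logic. The names (`LeadTimeThresholds`, `average_lead_time`, `classify`) are hypothetical, and it assumes that exactly 12 days still counts as on target while 16 days or more counts as a major delay. The definition above leaves that boundary open, so it is worth confirming with the business.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class LeadTimeThresholds:
    # Values taken from the example definition above.
    target_days: float = 12.0
    major_delay_days: float = 16.0


def average_lead_time(lead_times_days):
    """Average of individual lead times, expressed in days."""
    return mean(lead_times_days)


def classify(avg_days, thresholds=LeadTimeThresholds()):
    """Map an average lead time to a status label.

    Assumption: more than 12 days is a minor delay,
    16 days or more is a major delay.
    """
    if avg_days >= thresholds.major_delay_days:
        return "major delay"
    if avg_days > thresholds.target_days:
        return "minor delay"
    return "on target"


if __name__ == "__main__":
    avg = average_lead_time([10, 14, 18])
    print(avg, classify(avg))
```

Writing the thresholds down in one place like this makes it easier for designers and developers to agree on which value triggers which visual state.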
2. Where does the data come from?
If you’re pulling data from multiple sources, some will likely be more complicated to retrieve than others and they won’t always be a priority on your roadmap. It’s important to identify from the start where the data comes from: this helps designers understand what their designs can display (now and in the future) and helps developers find the source to pull the data from.
3. How is the data calculated today?
While some data can be easily extracted from the system, other data might require you to aggregate different metrics to get the relevant information. This calculation often draws on multiple data sources and can take someone long hours or even days of manual work to produce a report and analysis. It is important to understand the current pains and opportunities, along with the effort and intelligence needed to generate the data automatically. This way you can decide whether or not it can be addressed on the roadmap.
4. How frequently is the data updated?
People talk a lot about live data, but what does it actually mean? New data every second? Every minute? Hour? More? Various parameters can influence the speed at which your data gets refreshed, and different sources can have different refresh rates. It is important to know the refresh rate of each source so that you can assess the quality and accuracy of the data you’re working with. It’s also important to keep this in mind when designing inactive states, notifications that new data is available, or simply a time indicator showing the latest update.
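As an illustration of the time indicator mentioned above, here is a small Python sketch that turns a source’s last update time and its expected refresh rate into a label. The function name `freshness_label`, the wording of the label, and the rule that data is "stale" once it is older than one refresh interval are all assumptions made for the example, not a prescribed approach.

```python
from datetime import datetime, timedelta, timezone


def freshness_label(last_updated, expected_refresh, now=None):
    """Describe how old the data is relative to its expected refresh rate.

    last_updated: timezone-aware datetime of the latest update.
    expected_refresh: timedelta between two refreshes of this source.
    now: optional override of the current time (useful for testing).
    """
    now = now or datetime.now(timezone.utc)
    age = now - last_updated
    minutes = int(age.total_seconds() // 60)
    # Assumption: data older than one refresh interval is flagged as stale.
    status = "up to date" if age <= expected_refresh else "stale"
    return f"Updated {minutes} min ago ({status})"


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    last = now - timedelta(minutes=45)
    print(freshness_label(last, timedelta(minutes=15), now=now))
```

Each source can be given its own `expected_refresh`, which reflects the point that different sources refresh at different rates.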
5. Who’s the audience and what objectives will the data support?
When considering which format would be most useful to display the data, you need to understand who will be consuming the data in the first place. Are they novices? Experts? Where and when do they look at the data? What actions do they take as a result? And if you have different groups of users, do they have the same requirements? The answers to those questions will impact the degree of complexity in the data display.
6. Is the user already using data visualisation software?
It’s important to know if there is already a solution powering the visualisation so that you understand the potential constraints. You might not want to force a design into a tool that’s not fit for the job, as doing so could also hurt readability and load, both of which are key when designing with data.
Even with a rigorous discovery, it’s unlikely that you will become 100% comfortable with the data and what it means, especially when dealing with scientific practice. It is therefore essential to talk and interact with the users of the data and let them steer you. Throughout this process, thoroughly document all the information you gather so that you can process it and be ready for later reviews and handover.
Working on a data-driven project or just want to discuss designing with data? Get in touch with us. &us is an eclectic mix of product people, designers, and coaches who make things and coach others to make.