DESIGNING UX for MACHINE-LEARNING DEVELOPMENT

Project Brief

I worked as the sole usability researcher on a company’s flagship product, a machine-learning platform. The software was originally developed for and used internally, and currently supports the company’s secondary products. At the time, the company was primarily implementing machine learning for anti-money-laundering and transaction-monitoring tools, with its main clients being banks and financial institutions.

We were looking to build the second iteration of this machine-learning platform for eventual external deployment, and thus sought to understand how the context of use might change. This research would then form the basis for our product roadmap for this second build.

Role

User Researcher

Tools

UXPin
Miro

Research Process

We conducted a series of user interviews to understand the existing landscape both internally and externally.

The first was with data scientists currently at the company, including a mix of those who had worked with the platform for years and those who were relatively new to the company, and therefore to the platform. With this group, we were primarily interested in usability problems in the product’s current iteration, which was frequently cited as difficult to use, with a steep learning curve.

The second was with data scientists whom we identified as potential users and recruited through a third-party recruitment firm. With them, we wanted to understand the context in which our potential user group currently operated, their experience with such software, and the pain points they encountered in their workflow.

We also conducted competitive benchmarking against existing products in the space to understand their features, user flows, and limitations.

Problem

Among the first group of users, veterans of the software praised the product as revolutionary. For them, the problem lay in its steep learning curve and the friction of adjusting to doing machine-learning development in a platform rather than in code. Similarly, new users found the software difficult to use and complained that “it was faster and easier to do things with code.” We identified that many of these new users would typically look for workarounds to avoid working within the platform, or would try to persuade project managers that, for the sake of efficiency, temporary hacks ought to be accepted.

With the second group, we found that the issues they faced were specific to doing data science in a banking environment. Banks required extremely tight security and often operated closed intranet environments, which meant that most data scientists working in banks did not have access to open-source platforms and languages, including R and Python. They typically employed all manner of workarounds, including weekly data requests, but general inefficiency and limitations persisted in their workflow.

A second problem these users faced was a lack of clear structure in their process, including poor version control - after all, the most popular version control tool, Git, is also open source. In one instance, a user described how code was reviewed by sharing snippets via email, so what was approved was not necessarily what was deployed. This left critical gaps in the audit trail that the department head had to review manually.

We also asked these users about their experience with existing third-party data science and machine-learning solutions, such as Dataiku and H2O. The problems they encountered with these products overlapped with some of the problems new users of our own product faced. On the whole, users of third-party data science products did not enjoy using them, citing limitations in building pipelines on the platform or excessively complicated workflows, especially when compared to simply writing code, which most of them were more comfortable with. One user said that such software was probably better suited to junior data scientists or business analysts who were weaker programmers and needed a crutch for building and testing pipelines. Often, though, these problems did not matter much, because the closed nature of the banks’ intranet environments meant such tools could not support their day-to-day workflow anyway.

Solution For Consideration

When we probed existing users on why the software remained central to the company’s development process, they cited two merits. First, when used as a standalone, the software was typically deployed in a closed environment and was generally stable, so it did not depend on frequent updates. Second, senior developers continued to insist on its use, even at the cost of additional working time, because it supported detailed versioning and archived every instance of each pipeline.

We found this overlap interesting and wanted to explore it. We hypothesized that a more useful product for external deployment would focus less on features for building pipelines and instead develop its version-control capacity in more detail.

Unfortunately, progress on this project stalled as the team was pulled off to work on other products, and we were unable to pursue it further.