How and why I started to collect data of a development team’s well-being

”Hello, how are you today?”

In March 2017 I joined a team developing an in-house CMS. As Product Owner of the CMS in primary use at the time, my role was to represent two digital news titles within Bonnier News, the biggest news corporation in Sweden. The main objective was (and still is) to fully switch from the old system to the new.

Needless to say, a large operation such as this — gradually migrating hundreds of users from an archaic, monolithic system to a modern, microservice-based product — requires quite a lot of time and effort. It’s therefore imperative to keep not only the technological parts up-to-date, but also the development processes within the team. Because of this, improving our processes has been a fundamental component of the team’s retrospectives.

In September 2017, I was assigned this action point in a retrospective:

Measure the sense of stability and well-being in the team on a weekly basis.

Prior to the retrospective resulting in the action point, an agile coach had measured a set of key values within the team:

  1. Exploration
  2. Prestigelessness
  3. Passion
  4. Simplicity
  5. Quality
  6. Respect

Each week, we’d give a score to each topic through a survey. Have we been working on stuff we’re passionate about? Then give passion a nice, fat score. Hey, did that stuff take a toll on the quality of our product? Okay, give quality a lower score then. And so on.

The purpose of the measurement was to get a hint of the direction of the product we’re developing, from the team members’ perspective.

Measuring key values — and how the development is currently aligning with them — is a great way to establish a sense of stability.

But unfortunately, the team at that time wasn’t in the right frame of mind for this particular sort of measurement and had trouble understanding the purpose of it. When the agile coach left the team in April 2017 it therefore fell into oblivion instead of being refined and iterated.

During the months leading up to the above-mentioned retrospective, the focus quickly shifted from team dynamics to the opposite side of the spectrum of product development: architecture/infrastructure, code smell, feature implementations, debugging, and so on. Meanwhile, the ”softer” aspects — the interaction between team members, the processes within the team, matters related to empathy, and how the development aligns with the team members’ individual values — regressed to a far less prominent state, despite having a few strong advocates.

In September, the softer values had made their way back into the spotlight, reaching higher ground once more. It was time to reintroduce a measurement of said values.

But where, and how, would we start? 🤔

Measure the team instead of its product

First of all, the key values of the past were all focused on the product and the vision defining our development strategy. This time, we wanted to measure the team itself. How are we doing on a personal level? Are we feeling good about ourselves and our work? Are we satisfied with what we’ve accomplished together?

We decided to try the first thing that came to mind and then iterate. The following week, a Google Form was created and distributed within the team, with an answer rate of roughly 80 percent. The questions in that very first form, all being answered anonymously, were these:

  1. How have you been doing this week, generally speaking?
  2. How fun has work been during this week?
  3. How stressful has work been compared to last week?
  4. How satisfying has work been during this week?
  5. How would you rate the level of affinity within the team during this week?
  6. Has this week been better, worse, or the same as last week?
  7. Is there anything in particular that has made you feel better or worse during this week? (Free text field.)

All of the questions except the last two were answered on a scale of 1 to 7 — something we quickly changed after feedback from one of the team’s UX designers, who strongly dislikes score systems that let participants ”play safe” by picking the neutral option. Since that very first form, we’ve used a scale of 1 to 6 on all score-based questions. Thanks, Niclas Ramström. 💚

Besides changing the scale of the scores, the survey has been tweaked continuously. After consulting data scientist Max Berggren — who warmly and very respectfully deemed the entire survey ”scientifically insignificant” 😂 due to a number of statistical shortcomings — we ditched the ”compared to last week”-queries, instead opting for more general questions:

  1. How stressful has work been this week? (With the options being too stressful, neutral, and too relaxed.)
  2. How would you describe the week in general? (With the options being good, kind of good, kind of bad, and bad.)

When we hit our 10-week mark before Christmas, we put together all of the stats and created a spreadsheet. With the spreadsheet — and the charts generated from it — we now have a way to monitor the well-being of our team on a weekly basis (if not with statistical rigor, then at least effectively) and to address issues that the data might reveal.

A chart depicting the weekly average of the four score based questions between September 2017 and May 2018.
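For illustration, the weekly averaging behind a chart like that can be sketched in a few lines of Python. The export format, week labels, and question names below are hypothetical, not the team’s actual spreadsheet:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical export of the weekly survey: one row per answer,
# as (week, question, score), with scores on the 1-6 scale.
responses = [
    ("2017-W38", "general", 5), ("2017-W38", "general", 4),
    ("2017-W38", "fun", 4),     ("2017-W38", "fun", 5),
    ("2017-W39", "general", 3), ("2017-W39", "general", 4),
    ("2017-W39", "fun", 2),     ("2017-W39", "fun", 3),
]

def weekly_averages(rows):
    """Average each score-based question per week, ready for charting."""
    buckets = defaultdict(list)
    for week, question, score in rows:
        buckets[(week, question)].append(score)
    return {key: round(mean(scores), 2) for key, scores in buckets.items()}

averages = weekly_averages(responses)
# averages[("2017-W38", "general")] -> 4.5
```

Feeding those per-week averages into any charting tool (or the spreadsheet itself) produces the kind of trend lines shown above.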

It’s not anywhere near perfect, and the scientific significance probably hasn’t increased since its inception. The survey wasn’t constructed to withstand scientific scrutiny, however — the purpose is simply to get an indication of the team members’ well-being on a personal level. It’s my firm belief that the well-being of a product is deeply connected with the well-being of the team developing it.

And it has already helped us take action in certain areas of our work.

When the team’s satisfaction — after peaking at an average of 4.4 (out of 6) — started to decrease pretty drastically, Anton Niklasson (one of ten developers in our team) arranged a retrospective zeroing in on how we define satisfaction individually, and how we as a team can raise our concerted level of satisfaction.

We quickly rebounded to an average of 4.36. (It has since dropped again though. Bummer. Perhaps it’s time for a follow-up retro, Anton?)

And when we noticed that the ”stress level” was consistently dwelling below our base value, indicating that some team members felt they had too little to do, we started discussing ways to address this potential problem. We have yet to figure it out, but we might never have identified it in the first place had it not been for the survey.

The team’s ”stress level.” The question has three options: too stressful, neutral, and too relaxed. The options are interpreted and inserted into a scale of 1 to 5, with 1 being too relaxed and 5 being too stressful. Our base value — neutral — is interpreted as a 3. It’s a bit bulky, but it was the least bad way to translate the early 1–7 score to this new metric, and keep the data from the first weeks intact.
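That interpretation can be sketched as a simple lookup. This is a minimal illustration of the mapping described above, assuming the option labels as written; the function name is mine, not something from our actual spreadsheet:

```python
# Map the three stress options onto the 1-5 scale described above,
# with the base value (neutral) sitting at 3.
STRESS_SCALE = {"too relaxed": 1, "neutral": 3, "too stressful": 5}

def stress_average(answers):
    """Average a week's stress answers for comparison with the base value of 3."""
    scores = [STRESS_SCALE[answer] for answer in answers]
    return sum(scores) / len(scores)

week = ["neutral", "too relaxed", "neutral", "too relaxed"]
avg = stress_average(week)
# avg -> 2.0, below the base value of 3: a hint that some
# team members may have too little on their plates.
```

An average persistently under 3 signals a too-relaxed week; persistently over 3, a too-stressful one.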

And these are just a few examples.

A few weeks down the road, we dropped the last query, which was a free text field entry (”Is there anything in particular that has made you feel better or worse during this week?”) in favor of a new, unique question every week. The concept of ”custom questions” has not only kept up the attentiveness of the survey, but also helped us identify other issues and points of interest.

Naturally, there are fluctuations in the data. Since we’re a small team (15 members, all in all), tiny changes can have a pretty big impact on the outcome. Some stats do seem to correlate with each other though, which guides us into new thought processes and helps us approach things from a different perspective.

The data output shouldn’t be blown out of proportion — it’s a quick health check and a simple way of measuring the well-being of our team in a general sense, helping us improve our workflow and raise awareness of potential pitfalls.

As such, it has already proven invaluable.

