#159 - 11 Mar 2023

Use surveys sparingly; Iterative strategy; Stuff costs money; The lone developer problem; Using incidents to improve reliability; Backblaze SSD stats

How many surveys have you skipped over or deleted in emails or web pages in the last month?

Whenever I suggest we talk to people more often, the question of surveys always comes up.

Surveys are a limited but useful tool for collecting data from our researcher clients, but work best in a very specific use cases. They should definitely be one tool in our toolbox, but they get overused, get used lazily, and get used for purposes where other methods would be a better fit.

As in so many areas of our work, when we’re considering gathering data from our client community, the key question is to ask ourseleves is: “what problem am I trying to solve?”

Surveys Get Used A Lot Because They’re Easy (For Us)

The trap of surveys is that they’re very little work for us. After all, it feels like a nice clean distillation all of this complex human people interaction stuff into something quantitative, which we know how to deal with. We end up with a few numbers that we can put in a slide and then move on, feeling like we’ve accomplished something, by outsourcing the work to our respondents.

There are absolutely particular use cases where surveys are good and helpful. But it all comes down to: what are we trying to accomplish?

Surveys are Limited Measurement Tools

Surveys work well when:

  • We expect a large number of responses, because we have a large audience and/or expect a high response rate
  • We have relatively easily answered questions you need an answer to (which tend to mean superficial responses)
  • There’s a time window that other approaches would struggle with
  • Anonymity is a benefit

In general we do not have large audiences, and we definitely don’t expect high response rates. Many of the things we need to know are fairly complex and subtle, and benefit from a lot of back-and-forth discussion. And generally the things that matter most are ongoing efforts, rather than needing point-in-time snapshots.

We’ve Always Done It This Way

A lot of surveys go out because they’ve always gone out. The annual report has always had a little bar chart with satisfaction numbers, say, and it would be weird to stop having it. Besides, we want to see if the number has gone up or down!

But in our contexts, we are rarely if ever going to get enough data to clearly see trends — noise and or response bias dominates. Some teams ask a lot of questions and hope they’ll end up with a nice dashboard to “show what’s going on.” For most of us, that just isn’t going to work.

Besides, what do you think when some other group shows survey responses, demonstrating that users of their services who are engaged enough to respond to surveys are happy? If a picture of a plane with bullet holes in it doesn’t come to mind, you’re a more charitable person than I am.

Many of these “because we’ve always done it” surveys can be much more effectively replaced by some pull quotes from the user (and non-user) interviews described last issue (#158). What would be more compelling to you - a table of some survey results, or a few well-chosen pull quotes:

This year we had great success with [Initiative X]:
  • “I’ve already accomplished three projects this year with the new X service; it’s changed what kinds of work our lab can do” - Mechanical Engineering Faculty Member
  • “My trainees are extremely enthusiastic about X; one has been bringing it up as a selling point when we’re interviewing graduate students” - Physics Faculty Member
On the other hand. we continue to have trouble meeting researcher expectations with our Y services:
  • “While other efforts are good, we’ve found we just can’t rely on Y and are looking for other options” - Chemistry Faculty Member
and that will be a priority in the coming year.

One of those two approaches gives much higher confidence that the team presenting them knows and cares how their research clients are experiencing their services.

Surveys Are Good For Asking Questions about Specific Events People Participated In

One area where surveys can be very useful for us is where we’d expect response rates to be high because the recipients have participated in something that they cared about.

One common example is training courses, or other special events. People expect the “how did we do” sort of survey, and because it just happened, they were there, and they likely have opinions, they’re likely to fill it out. Here though, the use case is funny - you have the people right there. Are you sure a survey (rather than just talking to people) is the right way to go, as opposed to scheduling discussion at the end? Maybe it is - there are advantages (anonymity, less time investment), but disadvantages (inability to ask follow up questions), too.

For training in particular, immediate reaction to the training is just one rung on the Kirkpatrick model of evaluation. We probably also want to know how much learning occurred (e.g. with pre and post tests), and follow up months later to find out whether they’re applying the training and what the results have been. A survey can be a great way to begin that followup.

Another use case is followup after a project. We helped a research group get some code written/do an analysis/run some big computation. A post-project satisfaction survey can make sense! On the other hand, having a retrospective with them will provide much richer information. The survey approach can be a great way to augment retrospectives if it’s sent to someone else (say, the PI while we were working most closely with the trainees, so included them in the retrospective), or if it’s done in a way that gives you an advantage like anonymity (sending them out for all engagements in a given quarter, say).

If we are going to use surveys to collect data on situations like this, where people are very engaged, or in some other context, the next question is - to what end?

We Measure Things To Inform Decisions And Actions - What Actions Will This Inform?

Surveys are ways of measuring something, and as managers or leads we make measurements to inform decisions about what action to take. So the key question is: what actions will you take based on the survey results?

Any question on a survey where we don’t know ahead of time what we’re going to do based on the answer is a waste of our and the respondents time. And every extra question drops response rates! So it’s best to be very clear up front what we want to accomplish once the answers are in.

Sometimes the thing we might want is to use answers for is advocacy. “Look, 90% of our respondents say they could do more and better science if we were better funded!” That’s a good result, but is a survey they best way to compellingly make this point? Wouldn’t some very specific pull quotes from some very influential researchers be more effective? Maybe a combination of the two is even stronger; great! We just want to make sure we begin with the end in mind.

Other times, the survey questions can ask something very specific we can actually do something about. Did people like the new venue, or prefer the old one? Are people happy with the current prioritization of work between two competing services, or should we prioritize one over the other? Repeatedly asking for opinions on things we can’t or won’t change is a great way to have people stop responding over time.

Sometimes, Follow-up Is The Purpose

One of my favourite survey styles is just a quick check in, whose main purpose is to identify people to follow up with because things are going particularly well (and you want to ask for testimonials/help with advocacy) or particularly poorly (and you want to talk to them before it becomes a big deal).

This is a simple two-question survey:

  • An NPS-style “Would you recommend our services to a new researcher in your group? ” (1-10)
  • “May we follow up with questions? If so, please provide contact information (email or phone number)”

Then we follow-up with (ideally) everyone who gave us contact information and who responds 9-10, 6(say) or lower, and a random sample of people in between. This is a great way of using survey information to target our user and non-user interviews from last issue.

You can break the single NPS-style question into several areas of the service provision if you want to follow up on more specific areas (this is useful as your team and researcher base starts expanding). Some possible examples:

When discussing with a new researcher, would you say that we:

  • Are helpful in providing new insights to your problems (1-10)
  • Meet our deadlines (1-10)
  • Keep you informed of progress (1-10)
  • Listen well to what we need (1-10)
  • Provide high-quality results (1-10)
  • Are reliable (1-10)
  • Keep us informed of recent changes that are relevant to us (1-10)
  • Are a resource you would recommend to others (1-10)

Or some similar list. As you can see from (the maybe exaggerated case) above it’s important not to index too highly on the super-technical stuff, and to include people-interaction topics.

If You’re Going To Create A Survey, Consult The Experts - Social Scientists

Many many social science grad students take courses that include material on good survey design. Tap into that expertise! They know what they’re doing. They’ll likely know about other alternatives, such as watching people as they work, or group discussions. And right away, they’ll ask the key question - what is the research question you’re looking to answer.

Surveys are just one tool in the toolbox for understanding what our researcher clients think or need. When we’re thinking about what information we need, certainly consider surveys as one possible instrument for collecting that information - but seriously consider others, too. They’re more time consuming for us, but because of that they demonstrate much more concern about people’s needs and concerns, and give much richer information. They’re also something that can be delegated to team members who want to take on wider responsibilities and start engaging with clients and stakeholders.

Am I too negative about surveys here? Is there some other use case you’ve been happy using them for? Just reply or email me at [email protected].

And now, on to the roundup!

Managing Teams

6 Steps to Make Your Stratic Plan Really Strategic - Graham Kenny

If there’s one thing I’ve radically changed my opinion about over the past five or ten years, it’s the value of strategic planning for our teams.

Don’t get me wrong, I still think having a strategy is very important. And I do believe (and have seen!) that a clear strategic plan can be transformative, even in our context.

But more often, we don’t get trained how to do these plans well, don’t have the case studies to learn how they can work, and we often don’t have the air cover from higher up in our organizations to do it in any meaningful way at any rate. But it’s still demanded of us. In those cases, the “strategic plans” end up being, basically, useless compliance exercises where you submit a document because you have to submit a document, the boss or stakeholder receives the document because they have to receive one, and then no one ever thinks about it again.

Again, I think that’s a shame, but that’s the situation most teams are in. If you find yourself in that situation, don’t worry! Go through the compliance exercise, putting the minimal amount of plausible effort in that will make those who want the document from you happy enough.

Then actually start doing some strategy, in a focussed, incremental, and iterative way that will actually work.

The approach that Kenny describes here, mostly with professional services examples, is consistent with the samizdat approach I see working relatively well:

  • Understand the boundary conditions - who the stakeholders all are (which is less obvious than it sounds), what you want of them, and what they want of you
  • Understand what the team is already doing, and get really good at doing it - if the team is just kinda muddling along, strategy or its lack isn’t the issue
  • Figure out where you can have the biggest impact
  • Start figuring out ways to do more work with the biggest impact
  • Repeatedly revisit and incrementally improve

In short: execution, then strategy.


In last week’s Manager, Ph.D. there were articles on:

  • “Closing” calls, the last stage of a hiring process;
  • Recognizing that your team might have a problem with expressing disagreement; and
  • “How do I get X to start/stop doing Y?” - you don’t, you can’t.

Technical And Project Leadership

Also over at Manager, Ph.D. I give the first piece of advice I always give about project management tools - don’t worry about project management tools. Get good at the work and the process, first; then choose a tool.


Product Management and Working with Research Communities

Lack of sustainability plans for preprint services risks their potential to improve science - Naomi Penfold
Funding woes force 500 Women Scientists to scale back operations - Catherine Offord, Science

Our community can’t have nice things, because we collectively won’t pay for them.

We want open access journals, but don’t want to pay author charges. We want good software, and funding for software maintenance, but won’t use any software or online service that isn’t free. We want robust and flexible and cutting edge computer hardware - as long as it’s the lowest bid.

Stuff costs money. If we won’t spend money, we get less and worse stuff. It’s all well and good to say that someone else should pay for it, but there is no one else. It’s not like grant committees and scientific advisory boards and tenure panels are staffed by dentists or construction workers. The community is us.

Anyway, preprint servers are having trouble staying up and running without being bought by companies like Elsevier, and the organization that wrote the workbook on inclusive scientific meetings shared in #155 has had to layoff staff. Maybe someone else will fix it.


Research Software Development

The lone developer problem - Evan Hahn

Hahn points out one very real problem in some research software:

I’ve observed that code written by a single developer is usually hard for others to work with. This code must’ve made sense to the author, who I think is very smart, but it doesn’t make any sense to me!

Hahn suggests that one reason is that the lone developer hasn’t had to explain the code to coworkers or collaborators. Under those conditions, it is pretty easy for the code base to evolve to become very idiosyncratic (and poorly documented). I’m not sure how best we can address this issue, other than continuing to build communities where those writing research software can bounce ideas off of each other and get helpful and constructive feedback in a way that builds, rather than erodes, confidence and capability.


Research Data Management and Analysis

So I really enjoy the Annual Reviews series of journals - good review articles are fantastic tools to help get up to speed in a new area, and these journals have consistently high quality reviews. The latest Annual Reviews of Statistics and its Applications is out. If you’re comfortable describing things in terms of statistical models, there are several reviews that may be of timely interest:

amongst others.


Research Computing Systems

Move past incident response to reliability - Will Larson

When I started in this business, most systems teams approach to incident response was pretty haphazard (or, erm, “best effort”). That was a luxury afforded to our teams at that time because we had a very small number of pretty friendly and technically savvy users whose work could continue despite a few outages here and there; outages which were normally pretty short because our systems were small and less complicated.

None of that’s the case now. Our much larger and more complex systems are now core pieces of research support infrastructure for wildly diverse teams, many of whom can’t proceed with out us. I’m very pleased to see more and more teams stepping up to have real, mature, incident response expectations and playbooks. That doesn’t necessarily mean 24/7 on call - our teams and context are different - but it does mean those teams aren’t just playing it by ear every time they have an outage.

Larson reminds us that just responding professionally to service incidents is only half the job. The other half is learning from those incidents to have more reliable systems.

Larson suggest three steps:

The answer is extending your incident response program to also include incident analysis. Get started with three steps: (a) Continue responding to incidents like you were before, including mitigating incidents’ impact as they occur. (b) Record metadata about incidents in a centralized store (this can be a queryable wiki, a spreadsheet, or something more sophisticated), with a focus on incident impact and contributing causes. (c) Introduce a new kind of incident review meeting that, instead of reviewing individual incidents, focuses on reviewing batches of related incidents, where batches share contributing causes, such as “all incidents caused when a new host becomes the primary Redis node.” This meeting should propose remediations that would prevent the entire category of incidents from reoccurring. In the previous example, that might be standardizing on a Redis client that recovers gracefully when a new Redis primary is selected.

The common underlying problems will be different for us, but the idea is the same - once we have good incident responses, including writing up (and disseminating!) good incident reports, start mining all that fantastic data and making changes that reduce or eliminate entire classes of failures.

Once done, we can use our scientific training and come up with experiments and test them to make sure they work, a la Slack’s “disasterpiece theatre” (#5).


The SSD Edition: 2022 Drive Stats Review - Andy Klein, Backblaze
Backblaze: SSDs fail slightly less than HDDs - Chris Mellor, Blocks and Files

Backblaze continues to generously provide the very valuable community service of publishing their disk drive reliability results; Klein’s article gives the full results.

This year they have enough data to start making some convincing data about SSD reliability, and Mellor’s article highlights maybe the most surprising result:

Cloud storage provider Backblaze has found its SSD annual failure rate was 0.98 percent compared to 1.64 percent for disk drives, a difference of 0.66 percentage points.

I have to admit, I would have expected the difference to be larger. It’s something of a tribute to hard disk manufactures that they can make tiers of metal spinning at top linear speeds of tens of meters per second with read/write heads hovering above them at distances with tolerances of micro-to-nanometers that have reliability numbers similar to solid state devices. It reflects what a mature technology hard drives.

And maybe the fact that SSDs are still so compartively new that Klein and team see such inconsistency in SSD SMART data:

Here at Backblaze, we’ve been wrestling with SSD SMART stats for several months now, and one thing we have found is there is not much consistency on the attributes, or even the naming, SSD manufacturers use to record their various SMART data. For example, terms like wear leveling, endurance, lifetime used, life used, LBAs written, LBAs read, and so on are used inconsistently between manufacturers, often using different SMART attributes, and sometimes they are not recorded at all.


This week I learned about Princeton’s JobStats package, which lets uses look up the CPU/GPU/memory/scheduled runtime utilization of their slurm jobs, and gives recommendations.


Julia Evans writes about her first experiences using nix - stand up an old version of hugo for her blog, including creating a package. I’m somewhat reassured to see how easily you can use nix piecemeal like this without completely going all-in on it.


Emerging Technologies and Practices

Inside NCSA’s Nightingale Cluster, Designed for Sensitive Data - Oliver Peckham, HPC Wire

NCSA’s Nightingale system has been online for two years now, and Peckham speaks with the team about the work it supports.

These kinds of sensitive data clusters are going to proliferate in the coming years, and as the standards we’re held to for sensitive data continue to rise, they’re going to end up looking noticeably different from traditional HPC systems. HPC systems are optimized for performance partly by optimizing away from having strong inter-tenant security (e.g. encrypted traffic) I’m curious to see how things evolve.


Random

Lots of computerized machines built in the 80s or 90s - some airplanes, computerized looms and sewing machines, and even the Chuck E. Cheese Animatronics - require 3.5” floppies for updates or routine maintenance, and with new floppies being essentially non existent now, these machines are in danger of becoming obsolete e-trash even though they otherwise work perfectly.

I continue to be impressed by the growing set of code-analysis tools available - topiary is a library for writing code formatters, letting you focus on the ASTs for your language and rules you want without writing parsers, lexers, etc.

Learn and play with Trusted Program Modules for e.g. confidential computing in your browser - tpm-js.

Cross-platform flying toasters - After Dark screensavers in pure css.

Finally, computers that you can just saute and eat if they’re misbehaving - Inside the lab that’s growing mushroom computers.

Visual Studio plugin for debugging CMake scripts, it’s obviously completely normal for a build system to require an IDE and debugger to figure out what the heck is going on.

By all accounts, Google Groups continues to deteriorate. Does anyone have any favourite mailing list hosts for communities? I’ve heard good things about groups.io, which seems great but is a bit more full-featured than I normally am looking for.

A battery-free gameboy powered by solar or the user’s own button-mashing.

A github cli-based script to merge dependabot updates.

Running (the smallest) LLaMA model on a laptop.

Cryptography is hard and even schemes approved by experts can have vulnerabilities; don’t roll your own.


That’s it…

And that’s it for another week. Let me know what you thought, or if you have anything you’d like to share about the newsletter or management. Just email me or reply to this newsletter if you get it in your inbox.

Have a great weekend, and good luck in the coming week with your research computing team,

Jonathan

About This Newsletter

Research computing - the intertwined streams of software development, systems, data management and analysis - is much more than technology. It’s teams, it’s communities, it’s product management - it’s people. It’s also one of the most important ways we can be supporting science, scholarship, and R&D today.

So research computing teams are too important to research to be managed poorly. But no one teaches us how to be effective managers and leaders in academia. We have an advantage, though - working in research collaborations have taught us the advanced management skills, but not the basics.

This newsletter focusses on providing new and experienced research computing and data managers the tools they need to be good managers without the stress, and to help their teams achieve great results and grow their careers.


Jobs Leading Research Computing Teams

This week’s new-listing highlights are below in the email edition; the full listing of 181 jobs is, as ever, available on the job board.