In this third issue in the “measure what matters” series, I want to continue on the topic of planning and evaluating programs based on impact, not activity.
Issue #162 covered the idea with the Kirkpatrick evaluation model for training. Last week (#163) we talked about logic models, which are simple but powerful tools that help organize our thoughts, discussions, planning, and evaluation in terms of outcomes and how we get there, not just the activities. They help clarify assumptions, align resources and actions with desired results, and (like the Kirkpatrick model) give a framework in which to track and demonstrate your progress and achievements.
Maybe most importantly, they can be used to usefully redirect discussions with those stakeholders or decision makers who otherwise have an unhelpful “underpants gnomes” mental model of our work (for example, Phase 1: Run a computer! Phase 2: ??? Phase 3: Science!)
Logic models are fine illustrations for describing individual services or products. But where they really help clarify thinking is for programmes of services. If the inputs-to-outcomes chains are spelled out in enough detail, they provide a lot of visibility into possible opportunities for combining some work, and (especially if we routinely talk about it with researchers) for identifying whether there are gaps.
In particular, the programmes of services depicted don’t all have to be ours, and it doesn’t need to be us who fills in those gaps.
That’s why nonprofits like us often use tools like logic models to think about possible collaborations for maximizing collective impact.
A quick aside on this, since it isn’t universally appreciated.
One of the reasons leading research software, data, or systems teams is so challenging is that we don’t have any single role model we can follow. Unless we’re still trapped in the “utility” model, we operate like professional services firms (#127); due to the nature of research we also sometimes have the product problems of startups, where we’re trying to build a solution while we’re figuring out the problem.
But our business model is that of a nonprofit. (If we have a significant amount of our income coming from recharge/cost recovery/whatever your institution calls it, then there’s a bit of social enterprise thrown in, too). That means a few things. For instance, we have Drucker’s two customers - the primary clients (the researchers), but also the supporting customers (funders, institutional decision makers) which both have to be kept happy. We also sometimes have boards setting direction, and frequently badly underpay our staff.
The thing that makes us most like a lot of nonprofits, though, is that we have a goal — advance research and scholarship — that’s vastly, almost comically, larger than what any single team or organization can accomplish. And that means we have to collaborate.
Other research support teams in our institutions, elsewhere, or even in the private sector, are not competition (#142). What would that even mean? We have an impossibly large mandate, and it’s absurd to think we can do it all ourselves.
We can face bigger challenges and take advantage of larger opportunities by collaborating with (or even just availing ourselves of the work of) other research support teams that provide complementary services. Working together we can tackle more complex research problems, serve existing researcher clients better, reach more researchers, improve our effectiveness, and increase sustainability. With differing capabilities and expertise, working together we can accomplish more than we can on our own.
With a logic model (or the moral equivalent - as mentioned last week there are other tools for this) we can start looking through the diagrams for gaps - or things that maybe would ideally be gaps in our service provision.
This gives ideas about areas to rely on others. It doesn’t necessarily have to be an explicit collaboration — we can direct some work to another service provider without necessarily working things out with the provider first. But discussing collaboration (even with private sector providers) provides opportunities to make the workflow smoother, find other ways to work together, and even jointly advocate for additional resources.
And using something like a logic model to plan the collaboration helps ensure that there’s a clear and shared understanding of what we want to achieve together, how we will achieve it together, and how we will know if we have achieved it together.
Just like the Kirkpatrick model helps spell out evaluation at different stages of an implied logic model, an expanded logic model that includes external activities helps organize ideas about where and how to monitor success. Such monitoring may involve the other partners, so some thought then goes into how to monitor activities, outputs, outcomes, and impacts together. What are the indicators and data sources that we’ll use to track progress and achievements? How will we collect, analyze, and report our data?
And as with surveys (#159) there’s no point in measuring anything that won’t be used to make decisions - so what will we do with the results once we get them? How do we know if they’re good or not, and if not, how do we fix things?
By using logic models to evaluate and communicate the collaboration, we can ensure that we have a clear and evidence-based account of what we have done together, why it matters, and how it can be improved or replicated. We can also increase our credibility and reputation as a collaborative partner, and attract more support and recognition for our work.
And, just as a note - funding decision makers typically love to see “leverage”, and two or more groups they fund separately coming together to make the funding worth even more.
Next week I’ll close out this series by talking about our nonprofit nature, and why it’s so important to not just measure the right things (even if qualitatively), but to take those measurements seriously and hold ourselves to high standards.
CaRC Student Workforce Development Program Presentation: Northwestern University Research Computing Services [Video] - Scott Coughlin, Alper Kinaci, and Colby Witherup Wood, Northwestern
I very rarely link to video but this is a great overview of a very well thought out student workforce program that went from 4 students in 2016 to 21 today, on three tracks - data science support, computing support, and infrastructure. (The link is to the student-facing page, which is a very nice example of describing what’s in it for the students, including testimonials).
The Northwestern University RCS program recruits students for a year or longer, but only part time (0-30 hours a week) and often with quite flexible schedules. That offers benefits, particularly for the students (who can participate in longer-term projects and see a more varied range of projects), but makes tracking work and managing the students significantly harder.
The three have created a very coherent and thoughtful approach for the whole student workforce program, from an internal hiring handbook to a student onboarding handbook and a process for experienced students mentoring new students. Because of the long timescales, they’re even thinking about career ladders of a sort for their long-term students, with explicitly expanding responsibilities. This would codify increasing expectations, while recognizing good work and giving the students a promotion to show on their résumé, thus incentivizing staying: it sounds like a terrific plan. It’s also something I’ve not seen before anywhere that I can recall (there are hints of it in the also-excellent paper by Feng and LeBlanc we discussed in #124, but that was in the context of a single research group, which is a little different).
The hiring handbook has job descriptions, compensation, lists of advertisement venues and contacts, email and letter templates, and a clear interview process (screening criteria, questions, and evaluation rubric). Similarly there’s an onboarding checklist, a learning roadmap for getting up to speed, and a knowledge base.
The program extends to tracking impact of the student work, both for the team and for the students, qualitative and quantitative where possible (number of tickets, projects, publications).
Amongst day-to-day work of programming data analysis workflows, handling tickets, and doing assigned systems tasks, the students (especially the computing support students) generate content for trainings, workshops, and how-tos.
I found this presentation really heartening. It’s terrific to see a younger generation of newer managers and leads doing such a fantastic job of building strong programs for both getting work done and developing team members. I suspect this means they themselves have leadership who encourages, values, and supports such work. I get discouraged sometimes at the state of management in our line of work, and the disconnect between current practice and what’s possible, so it was great to watch this (and great to watch the appreciative reaction from the audience on the call).
Across the way over at Manager, Ph.D. I discussed finding everyday opportunities to work on the team. In the roundup was:
Fast-forwarding decision making - James Stanier
Working in the research world, we become used to a certain… unhurried … pace. That can make sense in research. Research projects have timescales measured in units of years, with lots of uncertainty and irreducible deep work that has to be done. So it doesn’t necessarily make a lot of sense to drive a sense of urgency in scheduling meetings or decision making.
But we’re not in Research any more. As I wrote above, we are nonprofits, serving the urgent unmet need of supporting research. Our habits learned in the world of research ought not govern us any more.
Stanier gives three examples of ways that even fairly simple work will get bogged down by default:
Stanier also proposes simple fixes. They all involve taking back a little bit of control, asserting ourselves a little bit, or exposing as-yet unfinished work for early feedback. That can be uncomfortable. We could seem wrong or bossy, we worry.
But making rapid progress is pleasingly contagious, and getting things done and off our desks feels good, for us and the others involved. Just moving one notch faster than the default can make a big difference for getting something done.
It’s Time to Declare Backlog Bankruptcy - Dan Wyks
I’m super bad at this, whether it’s actual issues tracked in a project management system or to do list items or articles sitting in my “to read” pile:
We aren’t being honest with ourselves, our team, our stakeholders … the vast majority of the backlog will never be done. Some of them might even be already done or overcome by events (OBE), and we’re still carrying them around!
And of course digital tools just make this hoarding worse, and enable the behaviour of vacillating on the fate of these old items:
[…] what would your backlog look like if you were still using analog tools, i.e., sticky notes and markers? Would you have hundreds of them on a wall somewhere?
Wyks points out some textual support from no less than the Agile Alliance and Scrum Alliance that large backlogs are bad and arguably un-agile. To say nothing of the fact that they sit there like a weight on the shoulders and imagination of everyone who opens the ticket tracker.
The cognitive cost of reconstructing a six-month-old issue from scratch if you do need it is probably less than updating the backlog item given all the changes that have happened in the meantime. And it’s certainly less than also dragging along all the dozens of issues that you won’t need for every one you do. Export them all into a separate document entitled “Ideas for later” if you like, if you’re worried about losing something. But get rid of ‘em.
Unit Testing Analytics Code - Matt Kaye
Just like other kinds of research software, data analysis code has a hard time working its way up the technological readiness ladder (#91) to the point that it stops being research input and starts being a production research output (#119). The code starts in an exploratory phase (“will this even work?”), and it takes significant and seldom-rewarded work to move it up the ladder to something people can use with confidence.
If anything, data analysis code often has a particular challenge - the tools we use for exploratory work (like Jupyter notebooks) have a workflow which adds friction to getting the code into version control and test suites (RStudio is an honourable exception here).
Kaye walks us through what a unit test even means in this context (and calls out data wrangling code as one of three examples: data wrangling is basically a rolled-up ball of edge cases and it’s so easy for things to go wrong here).
Then there’s a case study of feature engineering and what can go wrong there; simple testing (here in R and using testthat), getting the tests to pass, and then confidently refactoring and making some changes, knowing that the tests are there and can identify the issues we’re testing for.
Having even a small set of tests is so much better than having none that it’s worth adding them; the other commonly-cited advantages are that now there’s a form of documentation and examples of use of the routines, and that breaking up a pipeline into chunks that are individually testable is an excellent way to start architecting or rearchitecting larger chunks of code. (Everyone here who’s had a researcher hand them a 500-line script that was all one function, raise your hands… yes, that’s what I thought).
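Kaye’s examples are in R with testthat. As a rough analogue only, here is a minimal Python sketch of the same idea: one small feature-engineering function, plus a few tests pinning down the edge cases that tend to bite in data wrangling. The function name, the feature itself, and the choice to return None for undefined values are all hypothetical illustrations, not taken from the article.

```python
import math


def clicks_per_visit(clicks, visits):
    """Hypothetical feature: clicks / visits, or None when it is undefined."""
    for value in (clicks, visits):
        # Treat both None and NaN as "missing".
        if value is None or (isinstance(value, float) and math.isnan(value)):
            return None
    if visits == 0:
        # Avoid ZeroDivisionError; an undefined ratio is reported as missing.
        return None
    return clicks / visits


# Test functions in the naming style that pytest discovers (test_*).
# They use bare asserts, so they can also be called directly.
def test_ordinary_row():
    assert clicks_per_visit(6, 3) == 2.0


def test_zero_visits_is_missing():
    assert clicks_per_visit(5, 0) is None


def test_missing_inputs_propagate():
    assert clicks_per_visit(None, 3) is None
    assert clicks_per_visit(4, float("nan")) is None
```

With tests like these in place, the body of the function can be refactored (say, vectorized over a whole column) and the same three checks will flag it if the zero or missing-value handling quietly changes.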
Interestingly, Oracle Database now has a free developer release which you can spin up in a container. Oracle has some interesting-seeming multimodal, clustering, partitioning and authorization features, but to my knowledge hardly anyone in our space has played with it because it’s always been incredibly expensive and lacked a free version. Curious to see what happens here. Anyone here played with it, maybe in a past career?
Charting a Path in a Shifting Technical and Geopolitical Landscape - Yelick et al., Committee on Post-Exascale Computing For the National Nuclear Security Administration
For better or worse (I have my opinions), the DOE and similar nuclear weapons agencies have a hugely outsized impact on the broader HPC and Research Computing communities - relative to their fraction of total compute resources, and certainly relative to scientific impact. So it’s worth seeing which way those winds are blowing.
Encouragingly, our long international nightmare of being trapped in a community primarily driven by Top-500 HPL benchmarks seems to be nearly over:
This report uses “post-exascale era” as the 20-year period starting with the installation of the first DOE exascale system in 2022 and “post-exascale systems” as the leading-edge HPC systems that will follow the current exascale procurements. The committee has chosen not to describe these future systems as zettascale systems, because the focus is not on a particular floating-point rate but on time-to-solution […]
There are recommendations for more of an application focus, software support, applied math research, and adoption of emerging technologies like AI and quantum where appropriate; rising recognition that memory bandwidth and making use of commercially-driven technologies are important; a coming to terms with the rewriting of legacy codes; and, best of all, recognition that a substantial workforce development push is essential.
All in all this is another very heartening item to be able to share with you this week. If even the DOE is coming to terms with the fact that “but that’s the way we’ve always done things” is no longer a good enough argument, then there’s hope for us all…
(My apologies, dear reader, for the amount of snark in this item. I try to be relentlessly positive, by my standards, in this newsletter. But I assure you, going through this document and recent related discussions, it took heroic personal effort and a number of rewrites to limit the snark to the comparatively tiny amount you see above.)
Oh interesting - is this new? There’s now a Podman Desktop.
A project that clustered 21 million English-language biomedical science papers by their abstracts - and the code and data they used, including the vector embeddings they used to do the clustering, so you can do your own exploration.
I’m still kind of amazed at what can be done with auto differentiation these days, both on the applications side (like generalizing what seems clearly to be a discrete problem into a continuous one and using these tools on it) and on the tools themselves. Here is some simple work on differentiable finite state machines(!?). And here’s a tool for automated bounding of the Taylor remainder series of a function which, if I were still in the PDEs business, I’d be figuring out how to use.
The age of North American computer magazines is now completely over. Ugh. Anyone else remember typing in program listings from magazines, or drooling over (e.g.) Computer Shopper?
Pueue - queue long running shell commands and manage the queue.
Take screenshots or videos of macOS windows from the command line - macosrec.
Almost 3000 words on what happens when you hit Ctrl-D.
Watch the progress of GitHub actions with watchgha.
In praise of powershell. To each their own, I say.
Or maybe you’d be into code golf and golflangs after this introduction. Hey, the heart wants what the heart wants.
And that’s it for another week. Let me know what you thought, or if you have anything you’d like to share about the newsletter or management. Just email me or reply to this newsletter if you get it in your inbox.
Have a great weekend, and good luck in the coming week with your research computing team,
Research computing - the intertwined streams of software development, systems, data management and analysis - is much more than technology. It’s teams, it’s communities, it’s product management - it’s people. It’s also one of the most important ways we can be supporting science, scholarship, and R&D today.
So research computing teams are too important to research to be managed poorly. But no one teaches us how to be effective managers and leaders in academia. We have an advantage, though - working in research collaborations has taught us the advanced management skills, but not the basics.
This newsletter focusses on providing new and experienced research computing and data managers the tools they need to be good managers without the stress, and to help their teams achieve great results and grow their careers.