I heard a number of answers back about what sorts of service request processes people were using:
One data science team had a longer answer that I want to quote here:
For a relatively small team of a few staff and about 10 students taking about 250 consultation requests a year (one on one help on data science topics, not help tickets for a service), we use […] – basically a form + spreadsheet (google forms/sheet would work too). We have columns to track the status (incoming request, in progress, completed, etc.) and who has responded/been assigned, along with the info about the request. We don’t put other significant updates on the consults in this sheet – just for tracking status. We make sure our team brings the status of everything up to date at team meetings. Requests ping a slack channel when they come in, and we determine who will respond. So we can make sure that someone took each consult, even if the sheet doesn’t get updated immediately (it should be, but it doesn’t always happen).
For longer projects, we have only 6-12 or so per year – those go on a separate sheet for recordkeeping/reporting. All updates are communicated via team meetings or 1:1 meetings.
That same team is also working with other allied teams to figure out some kind of way to have a shared integrated history of interactions with researchers.
I found all your answers really useful! I think it’s an important topic, too - it ties in a bit to the last newsletter topic.
We’re in the expertise business, and there’s a spectrum of ways to bundle that expertise into services and products and make it available to researchers.
For researcher efforts that require cutting-edge expertise for an open problem, it makes sense to expose that via longer, open-ended service engagements; breadth of experience makes sense to expose to the researcher via a consultation model with productized services (services, but a bit more structured, with a well-defined process internally and well-defined scope externally); and efficient procedural work makes sense to bundle as at least semi-automated, cookie-cutter products.
All else being equal, ideally a team will have a portfolio of ways to engage researchers along that spectrum. There’s a bunch of reasons:
I mentioned “exposing” the expertise to researcher clients a few ways; my mental model here is of an API. And that’s why I’m so interested in how teams handle incoming requests for work from researchers. These different services lend themselves to different “APIs”, to different kinds of work request models.
The sort of traditional request/ticket trackers work really well for very transactional requests, like for IT service requests or for utility or product interactions. “Please reset my password”, “login5 is down”, “bug report: when I enter zero in the form I get the wrong answer”, “please update to the latest version of this dataset”. There’s not a lot of scoping or diagnosis or collaboration here. There’s an issue or a service request, a couple of back-and-forths, and it’s fixed. Metrics like how quickly tickets get addressed make sense - faster is better. These systems are time-tested and efficient approaches for these sorts of brief and somewhat shallow interactions.
But they’re kind of crummy for longer-term, more collaborative engagements! If in my personal life I want to have a conversation with an expert to help me with something - a lawyer or a doctor or a contractor to help work out a remodelling effort - I don’t file a ticket. I certainly don’t have metrics about ticket closure time - some problems are bigger than others and may kick off months-long engagements, and that’s not obviously good or bad. The bigger engagement certainly produces documentation about the effort and the work product, but it’s not merely a series of back-and-forths on a web form or email - it’s much more distilled than that. Maybe if I don’t realize the question I thought was a simple request is actually a Big Deal, I start the conversation in some sort of transactional system, but it eventually gets migrated out.
There’s no one-size-fits-all here. As with the team using Slack that migrates to email, and the team using two different systems for two different engagement types, different work is just different and needs to be treated differently. I wish there were better tools for us to use! But if there are any simple answers out there, I haven’t seen them yet.
Thanks to everyone who responded! Now on to the roundup.
The time’s long passed when we could simply post the generic HR job req for our openings on the institution’s career website and wait for the applications to roll in. I used to say “That may work for postdocs and faculty positions…”, but it increasingly doesn’t even work for postdoc openings.
These two articles from new newsletter community member Lowe lay out the work that needs to be done to get good candidates into the application process these days:
Yes, that’s a lot of work, but that’s what it takes:
I recently learned that in the space of 14 months, our management found only a single candidate who they felt was qualified enough to interview. After I did a recruiting blitz, they got 6 in the space of 5 days.
In part I, Lowe gives a worked example of looking for a junior HPC systems engineer. She targeted two groups of IT professionals - those who have recently gone back to school for training, and those looking to switch career paths. Then she specifically reached out to career services and/or departments at nearby institutions for the back to school crowd, and IT affiliation groups for the career change market.
In Part II, Lowe shares the case of a more specialized mid- to senior-level role, where now you have to work a little harder:
How Do Individual Contributors Get Stuck? A Primer - Camille Fournier
Fournier lists some common places for individual contributors to get stuck - such as when they need to:
And that they can sidetrack themselves by:
The problem with Fighting Fires - Ed Batista
It’s worth hearing this simple message from time to time. Fighting fires feels great and important - it’s very satisfying! But as managers and leaders our job is to coordinate firefighting and - more importantly - fire prevention efforts. Constant firefighting is, too often, a symptom of valuing activity over effectiveness.
Project management, like people management, is something we’re typically thrust into without any training. If we came from research, we have some experience of managing research projects, which is good - but there the timescale is longer and no one is depending on our results month-to-month. RCD projects are different. Linking to this post from elsewhere, Rubick writes something that squares with my experience with RCD teams:
I see most engineering managers gravitate to two extremes with project management: either they don’t do much at all, or they go all-in on traditional project management approaches.
He outlines a pragmatic, aggressively simple approach to project management of the kind we’re usually involved in; the context is software development, but it applies more broadly:
He argues that this should only take a couple of hours a week for the kinds of projects we’re most likely leading. (Obviously we might be involved in more elaborate projects - construction of a new facility - but we’re probably not running those!) Once that’s done there will be other things to do to coordinate the actual work that the plan’s for. But higher-level planning, Rubick argues, doesn’t have to involve huge tools and long documents. Project planning is there to support the work and the people doing or relying on it; it’s not an end in itself.
A Conversation with Mathematical Consultant John D. Cook - Krešimir Josić, SIAM News
Going solo is always an option for people with highly specialized skills. Josić interviews the famously prolific Cook, whose blog posts or tweets you’ve almost certainly read, about his experience starting his own consultancy.
Huge congratulations to Harvard for going big - 15 open positions as part of the creation of a University Research Computing Office, growing FASRC (the faculty of arts & sciences HPC centre) while adding data management and software development as part of a University-wide portfolio.
Pair Programming - James Cross
Pair programming is one of those things where it’s not necessarily hard, but it’s easy to do wrong. Like everything in our role, having clear shared expectations beforehand makes the difference.
The short list of recommendations here is to have two people, each with one defined role (driver or navigator), set time limits at the beginning, switch periodically, and still do code review at the end - the argument is that the synchronous nature of pair programming can still lead to groupthink, and so could use external review.
On the other hand, we’re told not to worry too much about who to match with whom - sure, there are advantages to having a more experienced developer mentor a junior, but there are advantages to pairing peers, too, and having the juniors navigate is a useful experience.
Thinking of starting a research software development group in your institution? Know of some others that already exist in the same kind of environment? This Google Doc is an outline of some questions to ask in an informational interview, to find out how they’re organized, what’s worked, what hasn’t, how funding works, and more.
Data Ethics Club: Creating a collaborative space to discuss data ethics - Di Cara et al, Cell Patterns, 100537
Data ethics is a very topical and interdisciplinary area, and a way to engage a group of potential collaborators and researchers around data issues. The authors here describe the success of a topical data science discussion group at their institution:
Data Ethics Club is a fortnightly reading and discussion group held virtually that is currently hosted by University of Bristol staff and students. The hour-long lunchtime meeting is free to attend and open to everyone.
Discussions are organized, structured, and recorded via a GitHub repository (which has a lot of great material in it already).
More accuracy with less precision - Lang et al, Quarterly Journal of the Royal Meteorological Society, 146:4358-4370
This paper was from October, and the announcement was actually made last May but somehow I missed it. It thus won’t be a surprise to people who follow weather and climate simulations more than I do, but maybe others will be as startled as I was:
Reducing the numerical precision of the forecast model of the Integrated Forecasting System (IFS) of the European Centre for Medium-Range Weather Forecasts (ECMWF) from double to single precision results in significant computational savings without negatively affecting forecast accuracy […] ECMWF’s ensemble and deterministic forecasts will run operationally at single precision from IFS model cycle 47R2 onwards.
The imbalance between memory-bandwidth and compute-power available in research computing systems has been growing worse and worse over time. More and more groups are trying to figure out how they can use less memory, or at least bandwidth, in their computations. Reducing the precision of variables is one way to do this. We’ve seen reduced precision in AI, of course, and there’s growing interest in mixed-precision methods (here’s a recent review for linear algebra and another for other methods).
But I hadn’t realized that the fairly staid ECMWF simulator, used for research but also for production weather forecasts, was running in production in single precision. What’s more, they used the computational and memory savings of switching from FP64 to FP32 to increase the vertical resolution, which led to an increase in accuracy of the model (and makes comparison to medium- and extended-range forecasts easier).
Unless there are dramatic advances in memory bandwidth - either with memory technology or putting more computation closer to the memory - we’re going to see more of this. It’ll require a lot of interesting algorithmic work and code changes! But the benefits are pretty clear.
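As a rough illustration of the storage side of this trade-off (not ECMWF’s code or method), the sketch below stores the same made-up values in double and single precision using only the Python standard library. The data and all variable names are assumptions for illustration.

```python
import math
from array import array

# Made-up data: 100,000 values between 1 and 3 (an assumption for illustration).
n = 100_000
values = [math.sin(0.001 * i) + 2.0 for i in range(n)]

fp64 = array("d", values)  # C double, 8 bytes per value
fp32 = array("f", values)  # C float, 4 bytes per value (each value is rounded)

bytes_fp64 = fp64.itemsize * len(fp64)
bytes_fp32 = fp32.itemsize * len(fp32)

# Largest relative error introduced by rounding each value to single precision.
max_rel_err = max(abs(s - d) / abs(d) for s, d in zip(fp32, fp64))

print(f"FP64: {bytes_fp64} bytes, FP32: {bytes_fp32} bytes")
print(f"max relative rounding error: {max_rel_err:.2e}")
```

Each value takes half the bytes in single precision, so half as much data has to move through memory, and the per-value relative rounding error stays below 2^-24 (about 6e-8). Rounding errors can accumulate over a long computation, so this says nothing by itself about the accuracy of a full simulation.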
Wouldn’t it be awesome if we didn’t have to recompile code every time the next generation of processor had incrementally better or bigger vector operations? Many new and upcoming generation Arm chips (yes, NVIDIA will have one) will have “scalable vector extension” (SVE) instructions, inspired by old vector processors. As Lemire says,
What is unique about SVE is that you work with vectors of values, but without knowing specifically how long the vectors are. This is in contrast with conventional SIMD instructions (ARM NEON, x64 SSE, AVX) where the size of the vector is hardcoded.
RISC-V has something similar and I couldn’t be happier. I love the “back to the future” aspect of it, and hope that it eventually makes maintaining code across architectures easier as more processors adopt something related.
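To make the idea concrete, here is a pure-Python simulation of a vector-length-agnostic loop. It is not real SVE code: `vla_add` and its `vector_len` parameter are hypothetical names, with `vector_len` standing in for the hardware vector length that real SVE code discovers at run time.

```python
def vla_add(a, b, vector_len):
    """Add two equal-length lists, `vector_len` lanes at a time.

    The loop never assumes a particular vector length; it only uses the
    value it is given, the way SVE code uses whatever the hardware reports.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    if vector_len < 1:
        raise ValueError("vector_len must be at least 1")
    n = len(a)
    out = [0.0] * n
    i = 0
    while i < n:
        # Predicate mask: which lanes of this "vector" hold valid elements.
        # On the last iteration only some lanes are active, so no separate
        # scalar clean-up loop is needed for the tail.
        active = [i + lane < n for lane in range(vector_len)]
        for lane, on in enumerate(active):
            if on:
                out[i + lane] = a[i + lane] + b[i + lane]
        i += vector_len
    return out


a = [float(x) for x in range(10)]
b = [10.0 * x for x in range(10)]

# The same code gives the same answer for every simulated vector length.
results = [vla_add(a, b, vl) for vl in (2, 4, 8, 16)]
```

With fixed-width SIMD, the loop stride and the tail handling are tied to one vector size chosen when the code was written or compiled; here nothing in the loop body depends on a particular width.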
A look inside our sixth generation of server hardware - Eric Shobe and Jared Mednick, Dropbox
A quick overview of what Dropbox’s next generation of storage systems looks like - 20 PB/rack, with 100 drives per chassis with a single (!) 100 Gb NIC.
How to Adopt an SRE Practice (When You’re not Google) - Jemiah Sius On
On tells us you don’t have to be Google-sized to adopt some Site Reliability Engineering (SRE) practices on your team. The main things are to have clear service level expectations, an understanding of where the risks of not meeting them come from, and someone whose responsibility it is to guide the team towards meeting those expectations.
Sketching methods are becoming widely used in large scientific data science, led in no small part by bioinformatics. Here’s a good overview of doing a min-hash similarity join - the code is in Scala for Spark, but the explanations are very clear and apply more broadly.
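For readers who haven’t met min-hash before, here is a minimal standard-library Python sketch of the idea. It is not the linked article’s Scala code; the function names, the toy record sets, and the choice of salted BLAKE2 as the hash family are all assumptions for illustration.

```python
import hashlib
from itertools import combinations


def hash_item(seed, item):
    """One member of a family of 64-bit hash functions, selected by `seed`."""
    digest = hashlib.blake2b(
        item.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(items, num_hashes=256):
    """Keep only the smallest hash value of the set under each hash function."""
    if not items:
        raise ValueError("cannot sketch an empty set")
    return [min(hash_item(seed, item) for item in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots estimates Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def similarity_join(sets_by_name, threshold, num_hashes=256):
    """Naive all-pairs join on signatures; fine only for small inputs."""
    sigs = {name: minhash_signature(s, num_hashes) for name, s in sets_by_name.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(sigs), 2)
        if estimate_jaccard(sigs[a], sigs[b]) >= threshold
    ]


# Toy data (hypothetical): "a" and "c" are identical, "a" and "b" overlap by half.
records = {
    "a": {str(i) for i in range(0, 100)},
    "b": {str(i) for i in range(50, 150)},
    "c": {str(i) for i in range(0, 100)},
    "d": {str(i) for i in range(1000, 1100)},
}

pairs = similarity_join(records, threshold=0.9)
estimate_ab = estimate_jaccard(
    minhash_signature(records["a"]), minhash_signature(records["b"])
)
```

The point of the sketch is that each set is compared through a fixed-size signature rather than its full contents. At scale you would typically avoid the all-pairs loop as well, for example by bucketing signatures with locality-sensitive hashing.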
Join and Index (!?) with jq.
Dicts considered harmful? For data exchange, anyway.
I didn’t realize there was a freely available, petabyte-scale web crawl dataset - Common Crawl.
The challenges of running and steering Atari.
Yet another hopeful SQL replacement that’s more of a proper programming language: PRQL. I’d love to see one of these succeed someday, and yet here we are in 2022 with GitHub littered with the corpses of SQL replacements.
Rsync is a sadly under-appreciated tool. Here’s how it works.
An opinionated python testing style guide.
Why not just use an ad-hoc dewey-decimal type system to organize all your stuff? Johnny Decimal.
I love that embedded databases are now so commonly used and increasingly respected that there are articles like “SQLite or PostgreSQL? It’s Complicated!”
And that’s it for another week. Let me know what you thought, or if you have anything you’d like to share about the newsletter or management. Just email me or reply to this newsletter if you get it in your inbox.
Have a great weekend, and good luck in the coming week with your research computing team,
Research computing - the intertwined streams of software development, systems, data management and analysis - is much more than technology. It’s teams, it’s communities, it’s product management - it’s people. It’s also one of the most important ways we can be supporting science, scholarship, and R&D today.
So research computing teams are too important to research to be managed poorly. But no one teaches us how to be effective managers and leaders in academia. We have an advantage, though - working in research collaborations has taught us the advanced management skills, but not the basics.
This newsletter focusses on providing new and experienced research computing and data managers the tools they need to be good managers without the stress, and to help their teams achieve great results and grow their careers.