#52 - 27 Nov 2020

Focus and communiations; saying no; blogging research; emerging boring strategy; I/O APIs and performance

Hi!

I wrote a few weeks ago that, post-pandemic, we in research computing teams are going to have to work to make our value clear to administrations, funders, researchers, and our team members.

A lot of the link roundup items I’ve pointed to over the past year have focussed on our team members, which is crucial - we can’t support research without an excellent, motivated, aligned team doing the work. But there’s a lot less material out coming out on communicating research computing teams’ value to funders and administrators - what are we for, why should this team in particular continue to be here, why are we the right team to take on this new, strategically important, project?

These issues have been on my mind a lot lately. I need to do better on this for the work of our own team - our area is getting competitive for the first time in ages, and we need to better communicate our successes! But I’ve also been working with a decision maker for a different organization with a sizeable research computing team that just went through a leadership change. The decision maker wants some pretty clear medium-term goals and indicators to help the team get back on track.

Listing the right things to do is easy, but actually doing them is hard. We all pretty much know the answer. Successful research support teams have to

  • Focus, andk
  • Communicate.

And yet generally we don’t.

Focus is the by far the hardest of the two, because it means making difficult choices. Having a focus, a specialty, a particular expertise, makes starting every conversation with funders, administrators, and researchers easier. And it means that your team is the first that comes to mind when a new opportunity pops up.

When you’re “the geospatial software team” rather than “research software support”, you have a strong clientele of committed users, a very clear story to tell administrators, and a very compelling reason why your team should be the one to get the grant to build the mapping software project. When you’re the “accelerated computing system experts” instead of “a team that runs a cluster”, you’re the one the researchers talk to about new GPU needs. When you’re the “long term data preservation team” rather than just a team that also has a lot of tape storage, you’re the one that they talk to about research data plans, and the ones they write into their grants.

The downside? Everyone wants to have a focus, but no one wants to stop doing things.

Research teaches us to be, well, entrepreneurial when it comes to ferreting out new research opportunities and new projects. Those of us who trained in teams cobbled precariously together by a patchwork of grants tend to develop a sense that every funding opportunity should be chased down.

Successful faculty members know better. They quickly develop a good sense of where their efforts are the most valued, and pursue opportunities to make contributions there with laser focus, making occasional, considered, detours or pivots as the field evolves.

But in research support services we often don’t quite develop that knack. There’s a lot of different kinds of research out there, and we genuinely want to help them all. But we can’t. We can’t have five top priorities, and single teams can’t have five specialties. We have to chose, and that means not doing things. Passing up opportunities - pointing potential research clients to other teams - is scary, and it’s the only way we can develop a focus on the areas we’re best at and most needed.

Communication is a lot easier and more successful with a tight focus. For researchers, it’s much easier say exactly what you can do for them when your team writes geospatial software than when you write “research software”. For administrators, it’s easier for them to get a sense from a particular community how valuable your team is then when you’ve helped a number of different groups with quite different tasks.

And the communication you perform is vastly more effective. If you’re regularly posting geospatial software success stories - or contributions or even just helpful tips - on your communication channels, that looks vastly more compelling to researchers who need geospatial work done, or funders evaluating grants, than the same number of stories scattered over three or four specialties. And that’s assuming you could have the same number of success stories with three or four “focuses” as with one, which probably isn’t true.

There is some really good material out there on finding focus, and tightly defining and managing the services you provide to researchers, in the literature around core facilities - I’m going through this article by Turpen et al now - and there’s a lot we could learn from those groups.

I’ll write more about that shortly; for now, on to a focus- and communications-themed link roundup!

Managing Teams and your Own Career

How to say “No” right now - Lara Hogan, Wherewithall

The year end is approaching, and with it deadlines and final pushes. But other stuff comes up.

The solution to dealing with too much work isn’t “time management” - managing time isn’t a power granted to us. There’s only task management, and the number one task management skill is declining them.

This is a good time of year to practice saying no or deferring a yes - everyone’s in the same boat and so understands. In her latest newsletter issue, Hogan walks through three steps to say no for the next month:

  1. Identify the #1 thing you are optimizing for through the end of the year.
  2. Run any decision past that #1 thing.
  3. Say “yes” or “no” in a way that works for you.

Product Management and Working with Research Communities

Blogging your research: Tips for getting started - Alice Fleerackers, ScholCommBlog

Fleerackers’ post is aimed at researchers but works equally well for us in research computing. It starts off with an important point - not every blog post needs to be a 1500 word feature. It could be a summary of a paper or conference session, a Q&A/interview post; anything that makes the audience more informed about your groups work than it was before.

The decision for where to blog is normally easier for our teams - we generally have a website (but getting posts put elsewhere is good too - like the US-RSE blog aggregator or guest posts on relevant sites), and then set some kind of reasonable schedule. The most important thing about the schedule (IMHO) is that it’s realistic - routine posts, even spaced apart by a couple months, is better than a flurry of over-ambitious posting followed by exhaustion and abandonment.


Cool Research Computing Projects

Emulation as a Service for Heritage Institutions (PDF) - Dutch Digital Heritage Network

We tend to think of computing systems and software as tools for powering research and data preservation - but sometimes they’re the objects of research, or preservation, themselves. This is a status report on using emulation of older computing systems to make sure that items of digital heritage - software or content - aren’t lost as the hardware becomes extinct.


Research Software Development

Write Five, Then Synthesize: Good Engineering Strategy Is Boring - Will Larson

Focus enables strategy - not only what you’ll be doing, but how you’ll be doing it.

Developing a software development strategy for a team allows you to focus on the important parts of each project rather than bikeshedding the same decisions again and again. You can’t develop such a strategy for executing projects if each project is completely different.

Larson’s article is an argument in favour of grounding such a strategy in the pragmatic (e.g. boring) and avoiding the siren call of “innovation” by doing it from the ground up - writing multiple decision documents for individual projects or components, and discovering the underlying strategy by synthesizing them.


Research Computing Systems

SC20 Panel – OK, You Hate Storage Tiering. What’s Next Then? - John Russell, HPCWire

Continuing with storage - tiering is hard to manage, and works poorly at modest scales. This summary of an SC20 panel discusses a number of vendors’ upcoming models for how to handle the diverse range of file sizes and access patterns needed in large computing centres. There’s several presentations summarized here, but to my mind the more interesting ones are:

  • Reducing the number of tiers and tying them to particular use modes as withNERSCs Perlmutter
  • Tiering based on file size not “temperature” (how often it’s used), as with Panasas’ solution
  • All-fast storage with NVMe as with VAST’s solution

Emerging Data & Infrastructure Tools

Modern storage is plenty fast. It is the APIs that are bad. - Glabuer Costa
How io_uring and eBPF Will Revolutionize Programming in Linux - Glauber Costa, Scylla

Our current userspace and kernel level APIs for file I/O are based on storage hardware from decades ago, and are now frequently the bottleneck for high-performance persistent media like NVMe or even SSDs. Costa’s two articles - the first more recent and higher level and the second older and focussed on the two recent technologies that will enable future filesystems (or applications that take control over their I/O, such as databases) to make full use of this hardware.

(Worth noting another recent link, too. There are also people who argue that the division between filesystems and and databases is a historical artificact which we should move past - such as the Boomla project is trying to do.)


Random

Efficient single-node graph mining with peregrine.

You don’t always need machine learning. (See also: Figure 1 of https://www.biorxiv.org/content/10.1101/2020.01.10.897116v1 where least squares does very, very well)

Pretty important for CPU-intensive research computing - between cgroups and containers, your scripts should use nproc not /proc/cpuinfo for determining number of cores to use

As you know, I think embedded databases are scandalously underused in research computing - here’s an article on SQLite as a document database.

“Reproducible”, “Replicatable”, or “Repeatable” mean a lot of different things in research computing communities. Konrad Hinson has a blog post laying out reproducibility as a spectrum. I’m not sure I like the terminology he’s chosen, but thinking of it as a spectrum seems a much more useful approach than using yes/no criteria.