#40 - Link Roundup, 4 Sept 2020

Hi!

Our last AMA (Ask Managers Anything) question was:

For non-embedded teams, what do you do to keep researcher clients / stakeholders up to date on progress of work?

We received one answer:

Our communication with stakeholders - leadership, projects we were supporting, research community members - was always a lot more structured than our internal team communications. So everyone working from home wasn’t as big a deal. For our team we’ve had to be a lot more deliberate in creating communications channels to replace the loss of “water cooler” interactions. But we have always maintained pretty scheduled meetings and emails with stakeholders. Walk-ins were pretty rare. So we haven’t lost much. We have a couple of projects that had dashboards of different kinds and those have certainly taken on new importance for having other parties feel like they are “in the loop”.

I think the AMAs have been pretty successful! We got a big initial burst of questions which we’ve gone through. I think I’ll give it a few months and try this again.

Are there any other recurring features you’d like to see in the newsletter?

For now, on to the link roundup.

Managing Teams

There’s a big difference between a team and a working group - Tim Leslie

The distinguishing feature of a real team is mutual accountability.

This 10-word quote nicely summarizes the article. A team isn’t just a collection of individuals with related tasks; it’s a group of people who feel they can rely on each other’s contributions and hold each other accountable to that. When that is set up and working well, the team is an entity in and of itself.

My own managerial style tends to be a bit more reductionist, and I tend to interject myself into peer interactions more often than is helpful. When I remember to let go a little bit and encourage team members to rely on each other (and give them the space to work up to relying on each other for increasingly big tasks) the team-formation process can begin. (This article by Warren Lynch is a good introduction to the Tuckman “forming, storming, norming, and performing” model of team formation.)

Simple Burnout Triage - Ben McCormick

McCormick suggests one simple question for your team members to make sure they’re not edging towards burnout:

If you take the pace & quality of the last 2 months of your life and repeated it again and again, how long would you be able to sustain it?

If you get an answer ranging from “I could make this work, but..” to “I can’t go on like this”, then that raises increasingly serious red flags. The only non-worrying answer to this question is something along the lines of a genuine “oh no, this is good, I can do this indefinitely”.

Managing Your Own Career

Emotional Resilience In Leadership 2020 Report - Jonny Miller & Jan Chipchase

It’s been a long six months or so, and even if their teams are doing well, a lot of managers are feeling exhausted. Leadership is lonely and tiring at the best of times, but trying to manage a newly distributed team while keeping things on track and juggling the new challenges in our own lives makes it even more so. And if we’re not careful, that can lead to burnout. It is a lot harder and more time consuming to recover from burnout than it is to avoid it.

This is a long read, but if you’re feeling more and more exhausted and stressed it’s worth it; and even if you’re not, just the first section (a couple of pages of a Google Doc) is worth spending some time with. Some of the key points are:

We tend to recognize the importance of big and sudden external stressors (“this new project just got dumped on me”), but the low-grade ongoing stressors, external or internal, will get to you just as much.
Like “sleep debt” - if you’re not sleeping enough you’ll be overtired and it takes more than a couple of normal good nights’ sleep to catch up - emotional/stress debt piles up too and has to be paid off before you get back to “normal”.
Being stressed out ripples outwards and can bounce back (think of being in a bad mood and so getting into an argument with a colleague that then makes work tense). This can cause avoidable spirals of stress.

When we know we’re stressed and tired we know what to do, but low-grade ongoing stressors can sneak by our defences. Just being aware of them and knowing we can take action to short-circuit spiralling consequences of stress, in ourselves, our team members, or our close ones, can help a lot.

Scaling yourself as an engineering manager - Sally Lait

Speaking of new projects being dumped on you…

When our responsibilities grow, we need to grow too. That means focussing on the truly important, not doing the things that simply don’t make the cut of the priority list, and getting the help you need. Not discussed in this article, though it’s at least as important, is delegating tasks and efforts you know how to do well and were doing previously to your team members, helping them grow as well.

This article also gives some time to two items that don’t get discussed enough. First is that the processes you’ve built that were serving you well - your own processes or processes with your teams (how you were doing staff meetings, etc.) - may need to change; these should always be up for reconsideration.

The second is that you’ll need to communicate what’s changing and why to your entire team. You may be less available, temporarily or not, and a team member who could previously chat with you easily is going to assume the worst if suddenly you seem aloof and less communicative. Any changes you make should be communicated clearly and probably repeatedly to your team (and any other affected stakeholders).

What’s it like as a Senior Engineer? - Zain Rizvi

Rizvi, who has been a senior+ developer at Google, Microsoft, and Stripe, talks about what being a senior technical contributor is like in tech.

This is relevant to us because I don’t think we appreciate how those of us who work with digital research infrastructure (software, data, systems) have had to develop skills that are fairly high up the career ladder in industry. Being quite self-directed, working on open-ended ill-defined problems, balancing risk and reward, and building consensus around solutions are pretty much table stakes in the world of research computing. I think this leads us to undervalue our own skills.

In terms of new hires it also means we underestimate the amount of coaching that quite technically talented new team members from industry will require in these areas.

Product Management and Working with Research Communities

Ten simple rules to increase computational skills among biologists with Code Clubs - Ada K. Hagan et al.

Bootcamp-style training can be very useful for getting research trainees “over the hump” and starting to be effective with developing software for their own use. But it’s pretty well understood that retention of that material fades quickly unless it’s in regular use. For the majority of attendees who don’t regularly use what they’ve learned afterwards, the benefits of the bootcamp can quickly fade away.

In this article, the authors describe their approach to “Code Clubs” (think journal clubs) to give research trainees ongoing practice with writing personal research software. Sessions can be “BYOC sessions”, where attendees rotate bringing their own code or problem and presenting it; the facilitator breaks the attendees into sub-teams with a very specific goal (refactoring the code to make it more generalizable and more DRY is an example). They can also be more tutorial-style sessions, where again there is a hands-on component but it follows a presentation on a new package or technique.

These ongoing sessions are known to be more effective at building longer term skills, and can follow a bootcamp. The authors give ten rules for facilitators thinking of running such sessions.

Research Software Development

Brittleness and Bureaucracy: Software as a Material for Science - Matt Spencer, Perspectives on Science

This is a paper from 2015 but was recently mentioned on the Society of Research Software Engineering Slack. It’s an interesting view of a major software transition for the fluid simulation code Fluidity; the author watched and interviewed the team over a year and a half, during which there was a rewrite due to the original software becoming brittle. This required not just a rewrite but a change in how the team operated:

Fluidity’s robustness was increased by the re-write. But it would be a mistake to think about manipulability [including maintainability/extensibility - LJD] solely as a property of the software itself. Everything depends on working practices. There is no straightforward way to isolate the technology from the wider ecosystem of techniques through which it is brought into use.

There’s lots of great stuff in here, including familiar issues such as different members of the community having different ideas of what the long-term goal of the software effort was, and the buildup of technical issues which finally results in wholesale change. It’s a short and clear read from an informed outsider about the process.

Implementing Shape-Up - Nolan Phillip

I’ve written before about Shape Up, the development process out of Basecamp that has longer cycles (6 weeks) than typical agile, and focusses on pitching competing efforts for the next 6-week cycle. As you can likely tell I’m interested in this approach for research software development, as an attempt to balance the medium-term planning cycles needed when you’re genuinely in somewhat uncharted territory with short bursts of execution. (My own default is to focus on the longer term, and I sometimes need to be dragged kicking and screaming back down to the day-to-day and week-to-week focus of execution.)

This is a description of how Shape Up was implemented at one company. In this case it was added entirely on top of a weekly sprint cycle. The first week focussed on planning and shaping the goals for the upcoming 6-week effort, weeks 2-7 focussed on execution, and week 8 was a cool-down week and preparation to begin again.

Introducing GitHub Container Registry - Kayla Ngan

You’ve no doubt already heard about this suspiciously well-timed announcement from GitHub, following as it did on the heels of Docker Hub’s announcement that they would no longer host and serve container images indefinitely for free.

GitHub Actions, GitHub Packages, and now GitHub Container Registry make for an increasingly compelling solution for testing, building, and making software available for deployment. Have you integrated GitHub Actions into your team’s workflow yet, or have you played with GitHub Container Registry’s public beta already?

Emerging Data & Infrastructure Tools

We Replaced an SSD with Storage Class Memory. Here is What We Learned - Sasha Fedorova

The MongoDB team has been playing with Optane technologies for a bit - we wrote about an earlier experiment with Optane as SSDs. Here they compare using Optane as an SSD vs using an Optane Persistent Memory (PM) module. The underlying non-volatile memory hardware is basically identical, so the difference here is between a device which is sitting on the memory bus vs one which is PCI-attached and accessed through a file system interface.

The takeaway here is that writes are still pretty low-bandwidth (compared to memory). Reads are quite high bandwidth - but DRAM caching can mask that difference quite effectively. The big win is in the latency of new reads:

Latency, and not bandwidth, is where SCM [Storage Class Memory] can shine. In contrast to bandwidth, the latency of reading a block of data from an Optane PM is two orders of magnitude shorter than reading it from an Optane SSD: 1 microsecond vs 100-200 microseconds.

So random read-heavy workloads are where this could make a big difference.
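As a rough back-of-the-envelope illustration of why that matters, here is a small sketch that applies the latencies from the quote to a purely hypothetical workload of one million block reads. It assumes the reads are issued strictly one after another, with no caching, batching, or parallelism, so it is an upper bound on the difference rather than a benchmark.

```python
# Latencies per block read, taken from the quote above (in microseconds).
PM_LATENCY_US = 1.0
SSD_LATENCY_US = (100.0, 200.0)  # low and high ends of the quoted range


def total_read_seconds(num_reads: int, latency_us: float) -> float:
    """Time spent waiting on num_reads serial reads at the given latency."""
    return num_reads * latency_us / 1_000_000


# Hypothetical workload size, chosen only to make the numbers easy to read.
num_reads = 1_000_000

pm_seconds = total_read_seconds(num_reads, PM_LATENCY_US)
ssd_seconds = tuple(total_read_seconds(num_reads, lat) for lat in SSD_LATENCY_US)

print(f"Optane PM:  {pm_seconds:.0f} s waiting on reads")
print(f"Optane SSD: {ssd_seconds[0]:.0f}-{ssd_seconds[1]:.0f} s waiting on reads")
```

Under those assumptions the same reads take about a second on persistent memory and a couple of minutes on the SSD; in practice DRAM caching and concurrent requests will narrow the gap.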

Lightweight Kubernetes - Rancher

K3s, Rancher’s new lightweight Kubernetes, has made a bit of a splash since it recently got certified as a Kubernetes distribution. It’s a highly stripped-down Kubernetes and it bills itself as:

K3s is a highly available, certified Kubernetes distribution designed for production workloads in unattended, resource-constrained, remote locations or inside IoT appliances.

While it’s pitched at edge/IoT applications, its compact nature and aim towards unattended running could potentially make it useful for deploying researcher applications that need something more than a VM plus an Ansible script or Docker Compose, but for which a full Kubernetes would be overkill and too much management overhead.

I’ll be watching this to see how it plays out in research computing.

Flume - The Flume Project
Crepe - The Crepe Project

Alternative programming models like data flow and declarative programming are becoming more and more accessible, and each can play roles in different areas of research computing. Flume is a project that lets your team readily create easy-to-use DAG data-flow diagrams such as workflows (“Let users code with type safety in your own visual programming language”), while Crepe provides relatively easy access to Datalog-style programming for declarative calculations in Rust.
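For readers who haven’t met Datalog before, here is a minimal sketch of the style, written in plain Python rather than with Crepe (this is not Crepe’s API, and the graph is made-up example data). You state rules about what facts follow from other facts, and an engine applies them until nothing new can be derived. The naive fixpoint loop below stands in for that engine.

```python
# Datalog-style rules being evaluated:
#   reachable(x, y) :- edge(x, y).
#   reachable(x, z) :- reachable(x, y), edge(y, z).

# Hypothetical example data: steps in a small analysis pipeline.
edges = {("fetch", "clean"), ("clean", "analyse"), ("analyse", "plot")}


def reachable_pairs(edges: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Apply the two rules repeatedly until no new facts are derived."""
    reachable = set(edges)  # first rule: every edge is a reachable pair
    while True:
        # Second rule: extend each known reachable pair by one more edge.
        derived = {
            (x, z)
            for (x, y) in reachable
            for (y2, z) in edges
            if y == y2
        }
        if derived <= reachable:  # fixpoint: nothing new was derived
            return reachable
        reachable |= derived


facts = reachable_pairs(edges)
print(sorted(facts))
```

The point of a Datalog engine is that you only write the two rules at the top; the iteration strategy, indexing, and termination check are handled for you.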

Events

Executable Research Article (ERA): Enrich a research paper with code and data - Dr. Emmy Tsang, eLife - September 9 14:00 – 15:00 UTC

Next up in the SORSE programme, Dr. Tsang presents her work with eLife on supporting executable research articles:

We published our first demo ERA in February 2019. Over the past year, we have been working closely with our collaborator Stencila to build an open tool stack that would enable our authors and production team to easily publish ERAs at scale. In this talk, we hope to showcase the potential of ERAs with examples and walk through how authors can enrich their traditional eLife paper using Stencila Hub.

Calls for Proposals

Call for Posters: Minisymposterium on Software Productivity and Sustainability for Computational Science and Engineering - Poster Proposals due Sept 14 for event Mar 1-4 2021

Colleagues at Better Scientific Software are advertising a call for 1500-word abstracts of poster submissions for a minisymposium on software productivity and sustainability. The minisymposium is part of SIAM Computational Science and Engineering 2021, which will be held in Fort Worth, TX, COVID-19 permitting, but remote participation will be available as well.

Random

HTTP status codes came from a protocol for submitting batch programs to computers in the early 70s, by way of FTP.

Finally - I can work with spreadsheets in the terminal and pretend I’m still a developer. sc-im is an ncurses-based spreadsheet program.

The thermodynamics of Turing machines as a fundamental connection between computation and physics.

Not new, but becoming increasingly mainstream - cgroups2 is slowly replacing the original cgroups as Linux OSs get updated, which will improve the usability of “rootless containers” and container fleet management.

SDSC’s CloudBank is now operational. This commercial-cloud brokering play is an interesting experiment for research computing and I’m curious to see how it turns out.

Here’s a really slick-looking VS Code debugging visualizer for complex data structures that works with several languages of interest to us here (JavaScript, Go, Python, Java, C++, Rust).

Automate Postgres audit logging using triggers.

Mozilla’s financial woes and cutbacks remind us that research isn’t the only part of important digital infrastructure that has no sustainable funding model.

RCT Newsletter