#29 - Link Roundup, 19 June 2020

Hi, everyone!

It’s been a weird couple of weeks here at RCT Global Headquarters. Our entire team is moving institutions, following our boss, and so there’s been a flurry of paperwork, both helping coordinating offers and then arranging for the resignations once those were out.

Coordinating a mass resignation wasn’t quite as fun as it sounds. And it’s funny to see how invested we are in our teams; I actually had to suppress a momentary gasp of dismay when I received the first email with Subject: “Resignation Letter”, even though I (a) asked for the resignation letters and (b) provided the template, both less than an hour before.

The work won’t change at all, but we’re moving to a much larger institution, with more peer technical teams to interact with. So I’m looking forward to “being at” our new workplace, even virtually. Now there’s just the barrage of new hire paperwork and elearning orientation modules to go through. (And if you’re curious - even our boss, who’s moving into a C-level position, had to diligently work his way through those. HR processes will simply not be sidestepped.)

In even weirder news, this week I implemented a (completely trivial) front end to a web service for a demo. In React. It was… actually kind of fun? It continues to amaze me how much more advanced the tooling is to quickly get a web services project off the ground then a more traditional research-computing type application like a simulation code or data analysis project. Part of this is due to the web development problem being more constrained, but I think a larger part is just the numbers of it — there’s way more people developing web services than those kinds of research software products.

Anyway, on to full-stack research computing team management!

Managing Teams

The Role Canvas - Four Questions to help you re-think how you hire — Kate Leto

You can’t hire well without a clear idea of what you’re looking for. In research computing we too often copy and paste job descriptions from HR or lists of qualifications from what other groups are doing without really sitting down and figuring out what a really good hire would look like. Once that’s done you can start pitching your add and your interview questions to find that person.

I like this article. The role canvas reminds me a lot of the “Manger-Tools Quick and Dirty Job Description”, based on 5 questions:

  • The reason the company created this job was …
  • The most important ways a person doing this job should spend their time are ….
  • The 2-3 most important duties of this job are …
  • What this job takes to be successful is ….
  • The simplest, easiest way to see if this job is being done well is ….

Sitting down, ideally with the team members a new person would be working with, and really hashing out the answers to either sets of questions is a great way to figure out how to spot someone who would be great at your job. And having those questions clear in your mind when interviewing significantly helps cut through the biases we have. An extra year of C++ experience or 3 extra papers in [domain expertise] probably aren’t the qualifications that you actually need.

How can you become an Anti-Racist in tech and support Black technologists - Anjuan Simmons

A twitter thread of four things ICs can do to support Black workers in tech. For us managers, he has a really strong talk (45 min video), Lending Privilege.

The HPC Certification Forum: Toward a Globally Acknowledged HPC Certification - Julian Kunkel, Weronika Filinger, Christian Meesters, and Anja Gerbes

I hadn’t been aware of this effort - putting together a list of HPC core competencies and figuring out what a certification in those would look like. You, reader, will know that I think research computing is much (much much) broader than HPC now, and was likely ever so. Still, for those looking to hire in HPC or train people in HPC, the work described in references here might form a principled basis for a set of core competencies to look for/train for.

Managing Your Own Career

Email Etiquette: How to Ask People for Things and Actually Get a Response - Jocelyn Glei

As you move up in research computing (or anywhere really) you start communicating more, especially upwards, with people whose attention is torn between more and more things. That means for your emails to work you have to make them increasingly self-contained but also concise. There’s 9 points here but four of them are key tools in my kit:

  • Lead with the ask - You’re sending this email to achieve some purpose: either to get a go-ahead for something, inform someone, ask for a decision, etc. Lead with that.

  • Write your subject lines like headlines - They should have a pretty good idea of the purpose of the email and what is asked of them by reading the scanning

  • Give them a deadline - This, plus a default, is a secret weapon of mine. “Let me know by Tuesday, and if I haven’t heard otherwise I’ll assume it’s ok to submit the abstract”. It might take you some getting used to doing this, but honestly, give it a try. For high stakes/high time sensitivity things you wouldn’t be using email anyway. Give them a deadline and a default action you’ll take!

  • Preview all messages on your phone - Another secret weapon. There’s a 33% chance your boss is reading this email on a phone in a meeting. You want it to be clear in that context.

Product Management and Working with Research Communities

Static Hosting Benchmark 2020 - Pier Bover

If a project you’re supporting plans on hosting some static web pages somewhere for a project page (I feel pretty strongly that research computing teams have better things to do than support these kinds of use cases), here’s a rundown of a simple, time-to-first-byte benchmark of a simple static webpage on 10(!) options.

Netlify and Cloudflare do really well. I wish pricing had been in here though - would probably tilt things pretty decisively towards GitHub for those with small enough pages and for researchers who are comfortable with it (although GitHub’s web interface or desktop app might be ok even then).

Simple rules for concise scientific writing - Scott Hotaling

Every moment spent clarifying our writing and aiming for clearer communication is a moment well spent. Here’s a short article on improving scientific writing (which, let’s face it, is mostly awful).

Fixing chains of trust in health research - Iain Marshall

Reproducibile (in the CS sense) science for research computing is a huge productivity win for science - open repos, unit testing, etc. But it’s often suggested as a way of reducing scientific misconduct or improving the validity of science research, which it isn’t (or only very indirectly). This suggests a more sophisticated model of “chains of trust” for thinking about this issues.

Cool Research Computing Projects

Digital infrastructure for research into the social network of the full Dutch population - Universiteit Leiden
Platform Digitale Infrastructuur Social Science and Humanities - Project website

A really remarkable project, the Population Scale Network Analysis for Social Sciences and Humanities (POPNET-SSH) to describe the in-real-life social network of the full, 17-million-strong Dutch population. Researchers will use this to study segregation, income disparities, and any of a number of sociological or economic questions.

Research computing staff are building the infrastructure to analyze and query this network without deep computing expertise, to make it as useful as possible to scholars in the social sciences and humanities.

An extremely interdisciplinary project. Would love to read more about the technical and privacy considerations.

Research Software Development

Twitch stream, book status, and other Fortran news - Milan Curcic

A new Fortran book is coming out, which isn’t remarkable, but the other projects listed kind of are. Twitch streams of Fortran coding! An official Fortran twitter account! A stdlib and a package manger! Language change proposals on GitHub! I have a soft spot for Fortran for number-crunching kinds of work and it’s nice to see it move super belatedly into a more modern community software development era, at least in these ways.

Off to file an issue for a feature request to make “IMPLICIT NONE” the default. Be right back…

Today was a Good Day: The Daily Life of Software Developers - André N. Meyer, Earl T. Barr, Christian Bird, and Thomas Zimmermann

Interesting study of how 5,971 software developers spend their day in general, and how they spend it on days they feel were good days and typical days; the idea is that this could be used to help managers have their developers make more good days.

It’s an interesting and short read. I walked away with two big points, but there’s others in there two. First was about expectations - developers expecting one type of day and getting a different type of day generally doesn’t work well. Second wasn’t necessarily surprising on an individual level but it was surprising to see how widely it held - the developers didn’t hate meetings. They hated bad meetings, which we can do something about, and they hated meetings when they were in the middle of deep coding work. The developers polled seemed happy in either meeting-filled, requirements defining/design/collaboration mode, or heads-down coding mode, but having a mix of the two wasn’t great.

COVID Shield

Not research software development, but public sector, and of timely interest. Despite a number of countries having had disappointing results with contact tracing apps, Canada and Ontario are rolling out one based on the Apple/Google API - and both the server and mobile client are on Github. Kubernetes, go, protobuf on the server side, Typescript for the client, and Ruby on Rails for a researcher-visible portal. Much more sophisticated implementations than I would have guessed.

(FWIW - I have serious doubts about how effective this will turn out to be - other countries’ experiences haven’t been great - but if you were going to do it, this API is the least privacy-shredding approach I’m aware of. If you have questions on issues with how big companies technical decisions start becoming or shaping public policy, despite dubious effectiveness - and you should! - following @biancawlyle on twitter is a great start).

Research Computing Systems

OpenFOAM on Amazon EC2 C6g Arm-based Graviton2 Instances, up to 37% better price/performance - Emma White, AWS Blog
AWS Graviton2-Powered EC2 Instances Now Available - Oliver Peckham, HPCWire

The rest of the AWS Graviton2 ARM line is now in GA. Some traditional research computing software, like OpenFOAM, has a notably better price/performance than on Xeons. Will be interesting to see how other kinds of research software perform.

Emerging Data & Infrastructure Tools

Benchmarking the Oracle bare metal cloud for DiRAC HPC workloads - Andy Turner, EPCC

Oracle and IBM are distant also-rans in the commercial cloud market, but that means they’re especially eager to find ways to distinguish themselves and niches to inhabit. Oracle in particular allows bare-metal access, something AWS has only started to do, and from this article it sounds like they’re trying to gain some attention of the HPC community.

The EPCC team of course does an excellent job of benchmarking, and if the topic interests you you should read the blog post.

The tl;dr is

  • I/O performance was very strong
  • Compute performance was good up to 16 or 32 nodes, but
  • The RoCE (as vs Infiniband) hurt some latency-sensitive codes before that
  • Getting more than 16 nodes at a time often took a while.

There’s also some interesting comments about how the Oracle cloud is to use - it takes a lot more work for a user than say Azure’ or AWS’s HPC configs but third party software packages are starting to help.

Calls for Proposals

Chan Zuckerberg Initative: Essential Open Source Software for Science, Cycle 3 - Due 4 Aug

The third instalment of this extremely useful funding mechanism. Open source software which underpins research computing - especially but not exclusively in the life sciences - is eligible. Our community needs more programs like this, and ideally we wouldn’t have to depend on private foundations for them.


Tsunami is a general purpose network security scanner with an extensible plugin system for detecting high severity vulnerabilities with high confidence. From Google.

A summary of the features and design space chosen by 60(!!) computational notebooks (think Jupyter) out there. Includes many I had never heard of.

Woah. Gilbert Strang has a new textbook introducing linear algebra, this time with machine-learning type application as vs PDEs. A sign of the times, of course, but Gilbert Strang has the clearest explanations (to my ears) of linear algebra I’ve ever seen anywhere. Kind of tempted to buy it for reading in my copious spare time.

ProxyJump, for coordinating multi-step ssh proxy chains, is apparently a feature in openssh? This would have been very handy to know about two jobs ago.

Learning to love systemd - the author makes a serious and worthwhile argument here coming from a SystemV init script background, but they are of course horribly, wretchedly, wrong. As is well known, no one of good conscience learn to love systemd.