#3 - Link Roundup, 24 Jan 2020

Hi; I hope you’ve had a good week!

A reader chimes in on the link roundup comments on one-on-ones, from the perspective of a director of research computing embedded in a larger unit. I’m sharing it with their permission:

Absolutely right on the one-on-ones (or O3s, as we call them). Our whole [unit] picked up the habit from manager-tools.com, and we had the founders come onsite years ago a couple of times to deliver their Effective Manager and Effective Communicator training. Having O3s with my Research Computing staff weekly and a weekly staff meeting allows us to move forward with work and not spend time running up and down hallways. They know they have 30 minutes with me and no distractions.

This is great to hear! I had a similar experience with saving time - and learning more about my team members and the work going on.

As I get ready to start the newsletter in earnest in mid-February, I wonder if I could ask for some feedback from you, too. Are there topics you’d especially like to hear about - or not hear about - in the future? We seem to be converging on a few broad areas; do please respond to this email (it’ll just go to me) and let me know which of the below you would and wouldn’t be most interested in hearing about.

Managing Research Computing Teams (One-on-ones, feedback, meetings..)
Research Computing Credit and Funding (Citation, grants…)
Research Software Development (Tools, testing, agile vs…)
Data and infrastructure tools (Cloud, Kubernetes, databases…)
Working with research communities (validating needs, advocacy..)

Teams

How to become a Good Lab Manager, Elizabeth Sandquist, ASBMB Today

This is an older post but it’s one of the most read on the ASBMB blog, and it was posted on twitter again because the upcoming ASBMB meeting will have a workshop on the topic.

The post is well worth reading, and extremely relevant to us. Its top tips, which Elizabeth expands on extremely well, are:

You can learn management skills.
Have a five-year plan for your lab.
Set clear standards and expectations.
Optimize your management style for each lab member.
Listen to your lab members.
Walk around the lab daily.
Learn when to say no.
Be prepared when small amounts of free time become available.
Get to know the people at your institution who can help you.
Celebrate successes with your lab.

Number 1, that management is a set of skills you can learn and start doing passably well rather than being a personality trait or an inclination that you had or didn’t, was a crucial thing for me to understand, and it makes everything else so much easier. Other tips I’ve seen not followed as often as I’d like in research computing or software environments are 3., the clear standards and expectations, 4., optimize your management style for each lab member (which is easy if you’re having one-on-ones!) 7., saying no, and 9, connecting effectively cross-institutions. In our environment I’d probably add a points about giving feedback and hiring.

Google Spent Two Years Researching What Makes a Great Remote Team, It Came Up With These Three Things - Justin Bariso, Inc

In research computing, we spend a lot of time working with remote teams - not for the reason it is coming up in tech firms more and more (an increase in working from home), but because we’re often involved in multi-institutional collaborations.

The three things - get to know your people, set clear boundaries/expectations, and forge connections - drive home something I’ve been trying to get across for a while. There is nothing fundamentally different about distributed teams; it only makes certain failure modes easier. There is no team in the world where you won’t get better results by putting effort into getting to know your people, setting clear expectations, and building connections. It’s just that doing those things takes more deliberate work in distributed teams. Even timezones are just a special case of some teammates having different work schedules.

On the other hand, multi-institution efforts do have one significant hurdle - multiple reporting lines and bosses. But even that is a problem that single matrixed organizations have, and it’s especially common when doing project management.

The Breadth of Research Computing

https://twitter.com/astrocrash/status/1218994542881517568

It’s interesting to read this thread alongside a post by Ben Northrop on The Myth of Architect as a Chess Master from earlier in the month. My own background, like James Guillochon’s, was in astrophysics simulation; he describes the enormous differences between long-running large simulations and real- or near-real time needs of robotics. The interesting thing to me is that both are resource constrained but for very different reasons. Lack of logging on the simulation side is called out and I think that’s absolutely true across a lot of disciplines. Maybe at some point I should write something similar on computing in astrophysics and bioinformatics.

Ben Northrop’s piece isn’t about research computing specifically, it’s more about technical architecture in general, but the theme of diversity of problems is the same. I really like his analogy - we have this image of a chess master wandering past a casual chess game and in a split second blurting out the winning move; they can do that because they’ve played thousands of iterations of the same game before. Whereas technology, and especially in research computing:

Any given software system, on the other hand, may look the same as countless others you’ve worked on in the past - I mean, hey, it’s all ifs and elses, classes and objects, REST and SOAP calls, etc. - but just a little bit below the surface we realize the pieces and the rules are very different. In my system the rook can jump the first 2 squares but in yours the queen has 3 lives. Each system we visit has different intricacies and rules we must learn anew.

We can be deep subject matter experts in research computing in one area and utterly out of our depths in another. Coming from the culture of physics departments, it took me longer than it should have to learn that one (yes, I’ve been shown the obligatory XKCD comic).

Tools & Development

Anybody can write good bash (with a little effort), William Woodruff

Most of us are stuck with having some small key parts of infrastructure written in bash. We also all know at some level enforcing modularity, readability, commenting and the like is important; but we’re somehow willing to let lawless anarchy reign when we’re writing something that starts with #!/bin/bash. There’s no reason at all that bash scripts have to be as bad as they are - there are even bash linters now! Stop the madness! This link is getting saved along with Kfir Lavi’s Defensive Bash Programming from several years ago for my own use and to share with others.

I have two things to admit about coding; one is that I regretfully do little coding these days, something many of us share (true story: at a recent discipline-specific research computing meeting in the middle of last year, I spoke with four research software managers, all of whom asked me somewhat wistfully and almost immediately upon introduction, “Do you get to write much code?”).

The second thing is even though I do still spend some time looking at monospaced fonts, I thought the whole “code font with ligatures” thing was another goofy fad. But eventually I started using Fira Code and I love love love it. It’s a small change but it does make code just a little more pleasant and easier to read. Whether I can still contribute anything useful to it is a different conversation.

Anyway, JetBrains, which makes lovely IDEs, just openly released their new code font with ligatures, JetBrains Mono. I’m not sure I’m going to switch from Fira but it looks pretty sharp.

You may be aware that I’m fairly comfortable posting contrarian thoughts about research computing on the internet; but even I would have thought long and hard before hitting “Publish” on a blog post entitled The TeX Pestilence (Why TeX/LaTeX Sucks). Xah Lee wrote this in 2004, but it was updated in late 2019 and started circulating again this week.

For what it’s worth I think he’s only half right here. TeX is a lovely low level typesetting tool; even he admits “it is set out to do typesetting and it does that well”. We disagree on whether or not that’s a useful thing to be; I think it’s needed. What I don’t understand is why there aren’t viable candidates to replace LaTeX and variants. I find writing LaTeX really cumbersome; these days I usually start in markdown, and use pandoc to generate the first pass so I can skip all the boilerplate. LaTeX started out as a macro language which expanded out to TeX; I wish there were tools that took what we have learned in the last decades about both programming languages and document workflows and “compiled down” to TeX for typesetting, ideally interactively.

On the other hand, maybe the absence of such tools disproves the premise; maybe TeX isn’t useful as an intermediate typesetting language? If so I’d love to understand why.

Have others moved away from LaTeX? Are there tools you prefer to use for writing up technical material?

(Advocating for) Research Computing Funding

The ELIXIR Core Data Resources: fundamental infrastructure for the life sciences, the ELIXIR team Funding of research computing and data resources is hard. As was pointed out in a recent blogpost at US-RSE, research infrastructure is generally funded as a project, which ends (“and they all lived happily ever after”), rather than as a product which continues to be used; “sustainability” is a word that comes up very quickly in research computing conversations.

This is a good paper that makes a familiar case for funding the component core data resources, but also points out that as linking data sets becomes more important (or combining data and computing resources, or any of those with research software, or..), one has to look not only at an individual resource but it’s place in a wider ecosystem to understand its value. Many of us are, sadly, going to have to make cases like this on an ongoing basis, so its useful to see what they chose to collect, how they collected the data, and how it was presented

Data

Introducing Flyte: A Cloud Native Machine Learning and Data Processing Platform It’s interesting to watch commercial tech-world data analysis analysis frameworks get closer and closer to addressing research needs. Flyte, like a lot of such tools, puts a heavy emphasis on automation and flexibility, but also includes automatic tracking of data versions and provenance. It looks like it’s still both too heavyweight to set up and too prescriptive about the kinds of steps being run to be of use in general research computing, but I have to wonder for how much longer that’ll be routinely true for academic data analysis pipelines.

On the other side of the data-infrastructure complexity scale, there has been a lot of buzz in the last month or so about “low-code” or “no-code” tooling. Some of us have seen this side of the pendulum swing a few times (anyone else remember fifth-generation languages “5GL”?). Last time it was filled with hype about solving all problems, which of course it didn’t. This time it seems more circumspect, and there’s no question that largely templated frameworks for solving common problems can be useful tools to have in the kit.

So it’s in this context that I apparently became the last person in the world to hear about Datasette, a lovely python package that lets a researcher (or anyone else) make a data in a read-only SQLite database available through an API or a simple web UI, for querying or for running SQL queries. It’s pretty basic, but addresses a key building block in a lot of simple projects.

Security

It’s always interesting to read a debugging or security incident story - we all love mysteries; in Investigating a Backdoor.SH.SHELLBOT.AA Infection, Jakob gives us both that and a sobering reminder that you can’t even put a Raspberry Pi out on the internet without risking getting owned.

Conferences on Research Computing or Research Management

Several very relevant meetings announced in the UK on research computing, and a couple in Canada:

Research Software London East 2020, 6 Feb, London

UKRI Cloud Workshop 2020, 3 March, London

Canadian Research Software Conference, May 26-27, Montreal (Webpage not updated for 2020 yet)

R&D Management Symposium 2020 - Vancouver Aug 4-5 2020 Focussing on the role of academic research on innovation and commercialization

RSECon 2020 (5th annual!), 8-10 September, Birmingham