#42 - Link Roundup, 18 Sept 2020

Hi -

Sorry for sending the newsletter out somewhat late today. There’s a lot of good material in the roundup - noticing change, understanding why your team is doing things you don’t think they should, Rust in science, transferring knowledge, telling stories with video, HDF5, a better tar, and underwater data centres.

As always, reply or email me (jonathan@researchcomputing.org) if there are things you’d like to hear more about, stories or topics you think our readers would like to read, or any other feedback.

Now on to the roundup!

Managing Teams

Noticing Change - Aviv Ben-Yosef

One of the recurring themes of this newsletter is that research computing is important enough to do with professionalism, and that professionalism is nothing more than being deliberate about what you’re doing, while continuously learning from what you and others have done.

Learning from what you and your team has done necessarily means noticing that there’s been a change. And there’s no way to systematically notice improvements - or regressions - without gathering data, taking notes, and otherwise keeping track of changes and their response.

One suggestion Ben-Yosef makes that I’ve been meaning to try is to make “decision logs” as a manager - or even as a team. When you make a change, or a decision, log it and periodically review to see how that decision played out.

From Procedural to New Knowledge: Leveraging Your Team’s Know-How - RC Victorino, Slab

This article comes at an interesting time for our team, where we’ve been trying to figure out how to do exactly this.

The argument Victorino makes is that “procedural knowledge” - how to do X - is undervalued, and hard to usefully document. One of your team members learns how to do the thing - say how to (in our team’s case) set up a particular test suite, or configure VS code to develop in a remote container, or convert certain workflows to a new tool. Now, sure, they can probably write up a document that has all that information in one place, but many of those facts probably already existed in other documentation - it still takes someone new real time to learn how to do it.

But this procedural knowledge is actually really useful, and is a stepping stone to more valuable knowledge. Now that people have learned skill X, they can use it in all sorts of ways – “Hey I can use X to speed up that thing Y that’s been holding it back”. So transmitting it within (or beyond!) the team is valuable. And the best way to do that transmission is with shared experiences - pair programming, talks, etc. Anyway, it’s a good article and if it interests you, you should read it.

From our teams point of view, this would also have the advantage of helping the team grow together, would help more junior team members get practice giving talks and mentoring, and potentially put videos up of their talks helping grow their (and our project’s) visibility. So we’re going to try to figure out a way to do this - current best idea is to batch 2-3 short (10-15 min) talks/demos/workshops on related topics into mini “conferences” for mainly internal consumption but with relevant people from interacting groups invited.

Has your team tried similar things? How has it gone?

Your Values are the Rules You Break - Stephen Prater

When they don’t know what to do, they’ll do what they know. - Marcus Blankenship

These are two interesting articles on understanding two different reasons why your team may not be doing what you are telling them to do. They’re probably not intentionally thwarting you or ignoring you; they’re probably responding to different incentives.

Prater’s article focusses on larger picture concerns. Say your organization, or even you, routinely tell people you “value” X — high quality code, collegiality, inclusion and diversity — and want them to, too; and yet they repeatedly break those rules and behave differently. The issue here can very easily the difference between those stated values and the actual values or behaviours of you or the organization. If you “value” high quality code but actually reward or reprimand based on meeting overly-tight deadlines. you aren’t going to get high-quality code. They’ll break that rule in favour of the real values, cranking out whatever to get stuff out the door. Similarly, if the organization “values” inclusion and diversity but doesn’t actually act in a way that brings in people from diverse backgrounds and truly include them, they’ll break those rules too because they see that those “values” aren’t actually lived up to.

The second article is more focussed on your actions as a manager - if a new project, task, or behaviour isn’t clearly and continuously communicated, with instructions and examples as to how to Do The Thing, people will inevitably drift back to what they know how to do. Because then they’re doing something, instead of sitting around being uncertain. Part of a being a manager is communicating about something in a zillion different ways until you’re sick and tired of hearing it - because it’s just about then is when it’s starting to get heard by people.

Managing Your Own Career

Bucketing your time - James Stanier, The Engineering Manager

We’ve talked about organizing tasks in buckets before - In Issue 37 I’ve mentioned my experiments with trello, and in Issue 39 I linked to an article about having a “dashboard” that covers both issues, things to keep an eye on, and future-looking work.

This is a nice article about why I find these approaches work well for me - it’s a way of systematizing the discipline of not just getting lost in the day-to-day while also highlighting important-but-not-urgent tasks at a variety of timescales. If this something that you wrestle with too, this is a nice article to read. I’ve certainly found being able to keep track of “today/this week/coming month/coming quarter” tasks useful to keep my eye on the ball. Stanier also distinguishes between recurring tasks and one-off tasks, as a way of showing what tasks would be more valuable to delegate.

OffBoarding as an Engineering Leader - Iccha Sethi

We’ve had a few articles here about your-first-90-days at a new job; this is an article about your last days as a manager as you move to a new position.

Sethi mentions several areas to focus on:

Your team - informing them, and documenting pending performance issues, salary, equity, or promotion status, and then informing them of the departure
Documenting the status of the team as a whole and their projects
Documenting the status of any projects or initiatives you were pushing for
Stakeholders - peer teams, and for us research groups - informing them and having a clear plan to hand them off to a specific individual
Documenting a clear transition plan

There’s a lot to do! The good news is that if you’ve been following the practices we’ve been recommending in this newsletter - one-on-ones with notes, regularly quarterly planning and reviews, project documention, and the like - many of these tasks become much easier. And if not - well there’s no need to wait until you’re going to start a new job!

Modern Data Engineer Roadmap - Alexandra Abbas

One thing I’ve learned from maintaining the research computing manager job board is that there are a lot of “Manager, Data Science” or “Manager, Data Engineering” jobs out there, and that a lot of research computing managers could pivot into those kinds of roles. Here’s a proposed roadmap of technology expertise that would be needed to run a modern data engineering team; note that a lot of them will be very familiar to those working in modern research computing environments.

Product Management and Working with Research Communities

Storytelling for Nonprofits: Using video to tell your story - Chiara C, Tech Soup

One of the areas research support groups can learn from nonprofits is how to communicate important ideas, regularly, on a shoestring budget. Most research computing groups do very little on this; but with a very modest amount of effort, consistently applied, and an even more modest amount of money, one can reach and influence a large number of people, drawing them to your resources, courses, and services - or their equivalents at their home institutions.

This article gives a quick rundown on the importance of story telling for nonprofits, and some resources for use. The list of useful resources would be a little different for educational institutions (which generally don’t get the same price cuts as nonprofits), but it’s still a good resource list. The other thing I’d add is that there are a number of services out there that will help you build short animated explainer videos or promotional videos for a very modest amount of money. I used renderforest - which was fine but there’s lots of others out there - to make a quick explainer video for our current project which has worked very well, for a total cost of one afternoon and $27 USD.

European Commission Declares €8 Billion Investment in Supercomputing - Oliver Peckham, HPC Wire
State of the Union: Commission sets out new ambitious mission to lead on supercomputing - European Commission

The EC has proposed a new, significantly larger, tranche of funding for supercomputing, expanding and extending the 2018 EuroHPC Joint Undertaking, as a way of underpinning other R&D goals. The funding, from 2021-2033, would include hardware, software development, and also has a significant “digital sovereignty” component. From the HPC Wire article:

“I am pleased to announce an investment of 8 billion euros in the next generation of supercomputers – cutting-edge technology made in Europe,” van der Leyen said. “And we want the European industry to develop our own next-generation microprocessor that will allow us to use the increasing data volumes energy-efficient and securely. This is what Europe’s digital decade is all about!”

Research Software Development

C++20 Is Now Final, C++23 at Starting Blocks - Sergio de Simone, InfoQ

C++20 is now finalized, and you can expect to see increasing levels of support in the newest version of various compilers. Big new features include:

Modules - significantly improving C++ modularity and namespacing
Coroutines
Traits - implemented as templates with constrained types

Performance of the Versioned HDF5 Library - Melissa Weber Mendonça, QuanSight
HDF5-UDF - Lucas C. Villa Real, Gerd Heber

Lots of interesting work going on with HDF5 lately. Last issue we talked about HSDS, an HDF5 data service on S3-like object storage; two weeks earlier in in issue 39 we introduced versioned HDF5.

This week, two HDF5 articles - first is giving a performance summary of the Versioned HDF5 library in terms of both time and space. The performance is quite good! No obvious overheads on the file size, diffs are handled quite efficiently if the chunk size is chosen correctly. The speed performance is also better than I would have expected.

The UDFs are something else - user defined functions compiled and stored in the HDF5 file, which allow for “views” of the data, or processed results, or even synthetic/procedural data - anything you’d like to implement. Wrappers for Lua, Python, C/C++.

Seven technology leadership lessons from TV show writing - Daniel Jarjoura, TLT21

The community has built a lot of analogies between software development and engineering, but engineering isn’t the only discipline where people have to work together to build complex and intricate stuff under tight deadlines and shifting requirements. Jarjoura tells us seven lessons from successful show runners that he believes carry over to computer systems or software development teams

They know their show and tell everyone what it is - there’s a common, shared, and continuously communicated vision.
They create a safe space - show runners need to create an environment where everyone feels safe to share so the best ideas can surface, even in an environment where dozens and dozens of ideas will get knocked down before one or two are chosen to work with
They make writers pitch - no episode gets written just because “it’s Joe’s turn” or “because Jessica said so”
They give everyone a chance to talk - along with 2, show runners make sure everyone speaks and has their voice heard
They combine creative thinking and passion
They rotate writers and make them work collaboratively - well before “pair programming” was a thing
They write and rewrite quickly

The same analogies over and over again can be limiting, and I like the idea of borrowing from other successful industries

Rust in Science and ever-changing requirements - Amanjeev Sethi

Co-worker Sethi writes here about scientific programming, the need for prototyping and adapting to changing requirement. He argues that a statically typed language - and particularly one like Rust which is very particular about a variety of correctness checks at compile time - has advantages over languages like Python even for protyping:

This is where I would conclude that if you are starting a journey and are sure that the things will change many times over, you may be better off with a language that: gives you a solid structure to build upon, with tools to warn you when parts of that structures are moved, by giving your advice on “why”, “where” and “how” to do that.

Research Computing Systems

Microsoft’s underwater server experiment resurfaces after two years - Chaim Gartenberg, The Verge
Microsoft sank a data center the size of a shipping container 2 years ago in a wild experiment and just brought it up to see how it went - Mary Meisenzahl, Business Insider
Microsoft proves practicality of renewables-powered underwater data centres - Renewables Now

That’s not a water-cooled datacenter; this is a water-cooled datacenter…

Microsoft submerged a small containerized datacenter under a Scottish sea (the Loch NAS monster? Has anyone done that yet?) with over 800 servers and 27 PB of storage, for two years. The idea is that actually underwater, the conditions (temperature, humidity) are extremely constant, so there should be less thermal shock, etc to components - plus of course the ease of radiating heat out to the entire lake. They claim that they see 1/8th the component failures as in their more terrestrial regular data centres - which is a good thing, because they can’t just send someone down in SCUBA gear to swap DIMMs.

Gartenberg’s article concentrates on the Scottish experiment, Meisenzhal’s goes into a little bit more background (and a lot more pictures) about the history of the experiment (including an earlier Pacific Ocean test) and shares that another reason for longer equipment life is that the container was filled with nitrogen gas, reducing the atmosphere content. Microsoft will look at how the atmosphere in the container changed over time.

The Renewables Now article adds the additional colour that the datacenter was powered solely by renewable energies.

Archivetar - A better tar for Big Data - Brock Palen

Long-time research computing podcaster and Director, Advanced Research Computing – Technology Services at UMich introduces us to his ArchiveTar which wraps tar to solve problems with tar’ing large project directories that include both big data files and small code files to archive. It uses mpiFileUtils to parallelize scans of large number of files, clumps small files together in a number of tar files to keep object counts low while making it easier to pull out that one file the researcher wants, and provides a manifest.

Emerging Data & Infrastructure Tools

Data Cleaning IS Analysis, Not Grunt Work - Randy Au, Counting Things

Au’s article can be summed up in one pull quote:

The act of cleaning data is the act of preferentially transforming data so that your chosen analysis algorithm produces interpretable results. That is also the act of data analysis.

Again, professionalism is doing things deliberately. I think we tend to get sloppy about things that are “just” data cleaning or “just” having decent uptime or “just” putting together a script - but these things matter.

Data cleaning in particular requires pretty deep expertise in both the data generation process that lead to the data, and the data analysis steps that will come after it. It may not be glamorous or exciting, it may not use sexy recent algorithms or cool frameworks, but it very much is, in and of itself, data analysis.

Calls for Proposals

Sixth International Workshop on Serverless Computing (WoSC6) 2020 - Call for Papers due 28 Sept

Topics of interest include but are not limited to:

Infrastructure and network optimizations for serverless applications
Debugging serverless applications
Programming models
Use cases, experiences
Benchmarks
Cost models, pricing models, and economics of serverless
DevOps
Other topics related to serverless computing

HPC-Europa3 - Next call for applications - 12 Nov 2020

The next call to the HPC-Europa3 program for travel for HPC-based scientific collaboration is open, with applications due in early November. This is a really interesting program and one I wish there were analogies to in research computing more broadly (not just HPC) and in North America.

Events: Conferences, Training

US Food and Drug Administration: 8th Annual Scientific Computing Days (SCD) - 29-30 Sept

A recurring theme of the pandemic is that the rise of virtual events means that events that never would have even been on one’s radar suddenly become interesting possibilities. The 2-day meeting, usually pretty internal to the FDA, has a broad range of food and drug related talks (and an extensive poster session that may be of interest to some readers.

Random

Now that C++20 is final, Microsoft Visual Studio now finally supports C11 - and C17.

Five ways to undo a commit in Git.

The incorporation of typing into Python continues to allow a lot of cool tools, such as Nagani, a static verifier for Python modules.

Robustness testing with fuzzing for scientific codes continues to get easier with tools like libFuzzer which allows you to fuzz-test libraries built with LLVM compilers.

Jailbreak your TI-CE calculators.

We sometimes underestimate the importance and value of research papers - and we often underestimate their influence. Even wikipedia pages can be hugely influential - a randomized control trial showed that edits to chemistry wikipedia pages can not just change citation rates but arguably research directions (via twitter).

There’s finally a Numpy paper - and it’s in Nature. On the one hand I’d like other kinds of research contributions to be recognized, not just papers; but on the other, a high-profile journal like Nature publishing a software article is huge.

Now it’s official, all of the major cloud companies are making strong plays to be able to serve all areas of research computing, including HPC - Google Hires Longtime Intel Exec Bill Magro to Lead HPC Strategy.

The GitHub CLI is now out in v1.0.