#30 - Link Roundup, 26 June 2020

Hi!

As we get ready to move our team from one institution to another, we’re taking the opportunity to reconsider how we do things on our team and reset expectations about how we operate and what we’re aiming to achieve - as a team, and as a project.  The switch to working from home was an earlier opportunity to clarify and reset, too.

In Canada, 77% of IT organizations say that expectations around everyone going to the office every day have been changed, probably irreversibly, by the pandemic.  With universities, hospitals, and research institutes constantly tight on space, I have to think that a lot of our organizations are having similar conversations.  In the jobs section, there’s a first for the newsletter - a manager position listed as a remote position.  At your org, are people discussing making work from home a serious option in the future?  Is anyone considering operating as a fully- or mostly-distributed team indefinitely?

On to the link roundup:

Managing Teams

New Manager Training: The 4 concepts to Teach - Claire Lew, Know Your Team
Being Glue - Tanya Reilly

If you’re at the point where you’re starting to manage (or groom) team leads or managers, Claire Lew’s article and collection of resources on four things to teach new managers is useful.  Her four concepts to teach (which she covers in detail, with resources to use to teach them) are:

  • The mindset shift: IC → Manager.  (This is so important; no promotion or job change is as tough a transition as that move from IC to manager - and I say that after switching fields from astrophysics to genomics!)
  • The importance of trust 
  • 1-on-1 meetings are your most high-leverage tool as a manager
  • Answer the questions, ”What’s going on?” and ”Where are we going?” for your team

Tanya Reilly’s talk focuses on an aspect of the first one - that as a manager, an increasing amount of your time is spent doing “glue work” rather than the technical work of the team.  And while that glue work is necessary for everything to come together (and stay together), it isn’t as valued as the technical work.  So much so that ICs stepping up and doing that glue work aren’t seen as overreaching or as promotion material, but more often as being less capable than the ICs focussed on the nuts and bolts of their tasks.


The Manager’s Guide to Inclusive Leadership — Small Habits That Make a Big Impact - First Round Review

This is a really nice article about more inclusive leadership that’s well worth your time.  They give four high-level topics:

  • Invite and display authenticity
  • Build self-awareness and curiosity
  • Seek out and respond well to feedback
  • Lift up other perspectives consistently

But then they go relatively deeply into each of them and give specific questions you can ask and approaches you can take.  The contributors are trainers at LifeLabs Learning, which leads training and facilitation around Diversity, Equity, and Inclusion (DEI), and the article links to a much longer 25-page open-sourced playbook for starting up a DEI team, which also looks promising.


You Might Not Be Hearing Your Team’s Best Ideas - Michael Parke and Elad N. Sherf, HBR

We’ve talked about the importance of disagreement and input before, and how important it is that people feel ok speaking up.  This is another article on the topic, and it breaks the steps down into managing what people are saying but also managing the silence, what people aren’t saying, which I think is a useful way to think about things.


Managing Your Own Career

How to “manage up” from home - The Economist

Since many of us have been working from home for the past three months, it’s been harder managing our team, but maybe even harder being visible to our bosses (and other senior stakeholders) and maintaining those connections.

As with our teams, there’s no magic here - we just need to do what we should always be doing: communicating, listening, and documenting.  And as with our teams, while we could get away with being less explicit and intentional about some of these steps when we were all in an office together, distance means we need to put more discipline into these sorts of actions.


Product Management and Working with Research Communities

From UBC: an open access resource for teaching online - Tony Bates

The pandemic has shifted expectations about how we deliver training and education; that shift might be long-lasting.

Now that many of us have gotten past the emergency, “let’s just make this work” phase, it may be useful to have team members start a more comprehensive training program.  The University of British Columbia’s Centre for Teaching, Learning, and Technology has made its Online Teaching Program modules for faculty available more widely.  There’s a lot of good material there, and the material itself is a good model for teaching online.


Research Software Development

Draft of my perf book is ready! - Denis Bakhvalov

A draft of Denis Bakhvalov’s book, “Performance Analysis and Tuning on Modern CPUs”, is available for review, and will be available as a free PDF when it is final.

This is a very low-level book on performance on modern CPUs, discussing speculative execution, branch misprediction, instruction-level parallelism issues, and the like.  I look forward to seeing this out in its final form.


Update from GENCI: A journey of optimizing applications on Arm platforms - Fabrice Dupros, Christelle Piechurski, Laurent Nguyen and Cyril Mazauric

A blog post describing GENCI’s (France’s HPC organization) ~18-month-long process of porting and optimizing 8 HPC applications for Arm, focusing on getting the most out of vector performance and tuning MPI parameters, using emulators and eventually real hardware.


Open Software Packaging for Science - QuantStack

An open source, fast replacement for conda-forge and for conda package hosting.  Interesting to see work being done on sustainability and openness of package repositories.  Also, it uses libsolv, a very fast SAT solver, for meeting versioning constraints.  SAT solvers are scandalously underused - if your project ever finds itself needing to solve a bunch of binary constraints, don’t do it yourself any more than you’d hand-implement linear algebra packages; use a SAT solver.
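
To make the "binary constraints" idea concrete, here is a minimal sketch of how version requirements might be written as boolean clauses.  The package names and constraints are made up for illustration, and the brute-force search below only stands in for a real SAT solver such as libsolv; it is not how libsolv works internally.

```python
from itertools import product

# Hypothetical package versions; each boolean variable means "install this version".
VARIABLES = ["numpy_1", "numpy_2", "scipy_1", "scipy_2"]

# Clauses in conjunctive normal form: each clause is a list of
# (variable, wanted_value) literals, and at least one literal must hold.
CLAUSES = [
    [("scipy_1", True), ("scipy_2", True)],    # some scipy must be installed
    [("numpy_1", False), ("numpy_2", False)],  # at most one numpy version
    [("scipy_1", False), ("scipy_2", False)],  # at most one scipy version
    [("scipy_1", False), ("numpy_1", True)],   # scipy_1 requires numpy_1
    [("scipy_2", False), ("numpy_2", True)],   # scipy_2 requires numpy_2
    [("numpy_1", False)],                      # numpy_1 is ruled out (say, by the user)
]


def satisfies(assignment, clauses):
    """True if every clause has at least one literal matching the assignment."""
    return all(
        any(assignment[var] == wanted for var, wanted in clause)
        for clause in clauses
    )


def brute_force_solve(variables, clauses):
    """Stand-in for a real SAT solver: tries every assignment (exponential cost)."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(assignment, clauses):
            return assignment
    return None  # unsatisfiable


solution = brute_force_solve(VARIABLES, CLAUSES)
installed = sorted(name for name, chosen in solution.items() if chosen)
print(installed)
```

The useful part is the encoding, not the search: once the constraints are clauses, a real solver can take over the search and will scale far beyond what this loop can handle.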


Research Computing Systems

Simplifying HPC Workflows with NVIDIA NGC Container Environment Modules - Akhil Docca and Scott McMillan, NVIDIA

A blog post announcing NVIDIA’s open source “Container Environment Modules” tool - a very cute setup that uses Lua-based Lmod and Singularity containers to let users run, e.g., ‘module load gromacs’ and have the gmx command run a Gromacs Singularity container with NVIDIA GPU support.  Even slicker, if the latest container hasn’t been pulled yet, the first instance of running gmx will pull it down.


Supporting the Transformative Impact of Research Infrastructures on European Research - ESFRI

European research infrastructures ‘not financially sustainable’ - Ben Upton, Research Professional News (Paywalled)

An expert panel has taken a look at the current financial sustainability of European research infrastructures (which includes but is not limited to digital research infrastructure).  Despite a push for sustainability over the past several years, there’s still no systemic financial sustainability for these Europe-wide “core facilities” for research.

From the executive summary of the report:

However, issues of misalignment of national roadmap exercises and funding plans for RIs need to be overcome to make the implementation of the full RI system supporting the European Research Area (ERA) more time efficient and cost-effective.  Despite long-term sustainability being increasingly emphasised in EU funding instrument call texts, this issue remains challenging for the vast majority of RIs. We find that few RIs outside the European Intergovernmental Research Organisation forum (EIROforum) grouping are able to demonstrate the characteristics required to achieve long-term sustainability. In addition, we note that unique research infrastructures are also operated by networks and that the full deployment of competitive research services in Europe cannot be pursued by implementing an ever-growing number of autonomous legal RI entities.

The report calls for member governments to step up and align funding, for better ”life cycle management” and maturity assessments of the research infrastructures, more integration activities, and better coordination across funding pillars.

The report is very comprehensive, and has overviews of ~40 such EU-wide infrastructures; it’s interesting reading if this is your jam.  (It is mine.)


The Runbooks Project - Ian Miell

In an effort to help get people started with runbooks for operations, Ian Miell of Container Solutions has started an open source set of runbooks, the Open Runbooks Project, starting with their own.  Worth checking out as a set of templates, and keeping an eye on as more get added.


Emerging Data & Infrastructure Tools

Rust for Data-Intensive Computation - Frank McSherry

Frank McSherry, of differential privacy and timely dataflow fame, has a blog post out on his company’s site on the use of Rust for data-intensive computation:

Specifically, I’ve found several of Rust’s key idioms line up very well with the performance and correctness needs of data-intensive computing.

He describes the benefits of Rust’s approach to generics (traits), and use of higher-level programming with closures, for work in this space.  Maybe more controversially, he suggests Rust’s focus on ownership, borrowing, and lifetimes is a relatively low bar as well as helpful for data-intensive work:

Fortunately, they are crucial concepts in data-intensive computation, and putting them right in your face both makes you think about them, and makes your users accept that they are a thing worth thinking about too.


Calls for Proposals

Series of Online Research Software Events (SORSE) - Ongoing

An ongoing call for online Research Software Engineering events, being organized jointly by a number of national RSE organizations.  First review will occur on 12 July, with notifications on the 17th, and the next is 31 July.  Been meaning to give a talk (or panel, or workshop) on research software engineering topics?  This might be a great venue for testing out a new talk.


CZI Essential Open Source Software for Science, Cycle 3 - Applications Due 4 Aug

The latest round of the invaluable CZI funding for open source software that underpins scientific computing (especially, but not exclusively, in the life sciences).  We need more programs like this.


Events: Conferences, Training

Practice and Experience in Advanced Research Computing 2020 (PEARC ‘20) - 27 - 31 July, $100 USD

The schedule for the (virtual, of course) PEARC 2020 is out, with a mix of tutorials and talks about providing advanced research computing services.  There’s everything here from data governance (managing persistent identifiers) to deploying the OpenHPC stack and maintaining science gateways.  It’s not hard to find $100 worth of material on the schedule.


Deep Learning for Science School - 9 July to 17 Sept, Free

This virtual ”summer school” will be held on Thursdays at 9:30-11am Pacific over Zoom, with recordings on YouTube if you can’t make the sessions, and registrants will get an invite to a Slack group.   The still-tentative agenda covers:

  • Torch
  • Hyperparameter optimization 
  • Generative Models
  • Reproducibility
  • Uncertainty Quantification
  • Transfer Learning
  • PDEs
  • Incorporating Symmetry in Neural Networks 
  • Distributed Training 
  • Attention models

Random

Some readers might be interested in WireViz, a tool that generates wiring diagrams from YAML files describing the items and their connections.

In praise of (the original) Hungarian Notation for software development, to help ”make wrong code look wrong”.
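
As a small illustration of the idea, here is a sketch that assumes the prefix convention from Joel Spolsky's well-known essay ("us" for unsafe, unescaped strings and "s" for safe ones); the function names are hypothetical and not taken from the linked piece.

```python
import html


def us_from_request(form, key):
    """'us' prefix: an unsafe string that came straight from the user."""
    return form[key]


def s_from_us(us_text):
    """Convert an unsafe string into a safe ('s') one by HTML-escaping it."""
    return html.escape(us_text)


def render_greeting(s_name):
    """Only accepts safe strings; the parameter name says so."""
    return f"<p>Hello, {s_name}</p>"


us_name = us_from_request({"name": "<script>alert(1)</script>"}, "name")
s_name = s_from_us(us_name)
page = render_greeting(s_name)
print(page)

# render_greeting(us_name) would look wrong on sight:
# a "us" value is being passed where an "s" value is expected.
```

The prefixes encode what kind of value a variable holds rather than its machine type, so a mismatch is visible on a single line without reading anything else.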

The case for SQLite as an application file format for anything that outputs stuff that looks like tables.
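
A minimal sketch of what that can look like with Python's built-in sqlite3 module; the table layout, the file name, and the use of PRAGMA user_version as a format version are illustrative assumptions, not something prescribed by the linked article.

```python
import os
import sqlite3
import tempfile
from contextlib import closing

SCHEMA_VERSION = 1  # hypothetical version number for this made-up format


def save_results(path, rows):
    """Write (sample, value) rows into a single-file SQLite 'document'."""
    with closing(sqlite3.connect(path)) as conn:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS results "
                "(sample TEXT PRIMARY KEY, value REAL)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO results VALUES (?, ?)", rows
            )
            conn.execute(f"PRAGMA user_version = {SCHEMA_VERSION}")


def load_results(path):
    """Read back the format version and all rows, ordered by sample name."""
    with closing(sqlite3.connect(path)) as conn:
        version = conn.execute("PRAGMA user_version").fetchone()[0]
        rows = conn.execute(
            "SELECT sample, value FROM results ORDER BY sample"
        ).fetchall()
    return version, rows


path = os.path.join(tempfile.mkdtemp(), "run42.results")
save_results(path, [("b", 2.5), ("a", 1.0)])
version, rows = load_results(path)
print(version, rows)
```

The output file is an ordinary SQLite database, so it can also be opened and queried with the sqlite3 command-line shell or any other SQLite client.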

AWS has released Honeycode, its low/no-code tool for simple web and mobile apps.  It’s clearly aimed at being, amongst other things, an Airtable-killer.  Is anyone using these kinds of tools or other low/no-code packages to quickly deliver applications in a research computing context?


That’s it…

And that’s it for another week.

Have a great weekend, and good luck in the coming week with your research computing team,

Jonathan


Jobs Leading Research Computing Teams

Senior Director of Scientific Computing Group - Progenity, San Diego CA USA
Develops and aligns partners on a vision and roadmap for the scientific computing needs of R&D, technologies and capabilities to support and accelerate Progenity research goals. Leads and builds a high performing team of scientific technology personnel, execution and maintenance for all scientific computing and including genomics, inventory management, regulatory tools, LIMS and others.

Senior Software Manager, High Performance Inference Platform - NVIDIA, Santa Clara CA USA
You will pioneer new ways of developing and productizing high performance, highly available AI workflows infrastructure for medical imaging and healthcare. Build and demonstrate platform, profiling, orchestration capabilities, integrating Open Source and 3rd party components as well as building core components. Develop and integrate profiling and orchestration capabilities for new Deep Learning and machine learning workloads.

Manager of Data Science - TEEMA, Vancouver BC CA
Reporting to the Director of Digital, the Manager of Data Science will be responsible for leading a geographically distributed (Vancouver, Santiago and Sparwood) and rapidly growing team of data scientists, data engineers and data analysts within the digital organization

Sr. Manager, Product & Science - Amazon, Vancouver BC CA
Discover, define, and apply scientific, engineering, and business best practice while delivering science for $1B+ opportunities. · Partner with scientists, economists, and engineers to help deliver scalable ML and econometric models while building tools to help our customers gain and apply insights.

Sr. Manager, Data Science - Amazon, Seattle WA USA
You will help us define new ways to evaluate, visualize, predict, and understand talent outcomes and decisions like hiring, promotions, and transfers. You will lead research and drive greenfield invention as part of a team of economists, data scientists, software engineers, applied scientists, product managers, and UX designers. This is an opportunity to fundamentally redefine talent management for one of the largest and most complex workforces in the world.

Senior Full Stack Engineer - Ufonia, Oxford UK
On a day-to-day basis you will work alongside our Chief Product Officer, taking a key leadership role defining the technology strategy and architecture for our back- and front-end services. You will set this strategy from the frontline, writing code, working all the way from database to the UI and perhaps tweaking our NLP pipeline in between.

Scientific I&IT Coordinator - Ministry of the Solicitor General, Ontario, Toronto ON CA
Lead and manage multiple medium to large scale projects of laboratory databases, content management systems, internet/intranet resources and automation. Provide support to the Laboratory Information Management System (LIMS) in the CFS.

Deputy Director, Data Management - Sanofi, Toronto ON CA
The Deputy Director, Data Management will deliver high quality data engineering solutions to support data scientists, data analysts, and business user applications. They will report to the Director Data Science. The span of responsibility for this position covers data management strategy, data project delivery, and overseeing data operations team

Programming Manager - GlaxoSmithKline, Brentford UK
In recognition of the developing sophistication and technical requirements of the role, Clinical Programming was formed as a standalone department distinct to Clinical Statistics. Programming asset teams are now stepping up to achieve the goal of being the Biostatistics’ leaders of delivery and execution, in a way that optimises, expedites and delivers to the highest quality.

Manager, Data & Analytics - Canada Life Assurance Company, London ON CA
The Intake and Operations Manager is responsible for managing data intake, planning, and the end to end data pipeline, with change management acting as a key area of focus to drive cultural change.

Lead Data Manager - Covance, WFH UK
As the study Lead Data Manager; be accountable for all DM deliverables per the established timeline; providing instruction to their DM study team(s) and review of their study team’s output to ensure the highest delivery quality, while adjusting resource allocations accordingly.

Data Manager - NHS Scotland, Edinburgh UK
Lead and manage members of the team involved in data management services ensuring data are timely, accurate and fit for purpose

Building Project Manager, Data Centres - Michael Page Property, London, UK
A multinational multi-disciplinary Consultancy client is seeking Senior Project Manager with Data Centre experience