Calibrations for software engineering interviews

Written in November 2020 · 4-minute read · by Diego Ballona

After posting my takes on posture in system design interviews in TechWriters, David Golden inspired me to write about calibrations. This article focuses narrowly on two perspectives:

How peer interviewers can reach a shared understanding of who best fits a role;
How hiring managers can increase confidence they're making the right hire for their team and organisation.

I aim to be objective and transferable, but calibrations are inherently nuanced. They vary across organisations and even within one. This is a simplified take — covering calibrations fully would require diving into your specific hiring process, which is out of scope.

Context & Pre-requisites

Interview calibrations help interviewers find common ground on a candidate's performance, then compare candidates against each other to decide who moves forward.

Unstructured calibrations — just talking through impressions — tend to go wrong. Hiring managers and senior interviewers easily influence others' opinions, increasing bias.

Pre-requisites for effective calibrations:

An organisation-wide leveling framework defining roles, expectations, and impact for each level;
A well-defined interview process per level or career track, applied equitably to all candidates regardless of referrals or returning status;
A database of interview questions with clear guidelines for structured stages like technical interviews;
Mandatory interviewer training covering practicalities, D&I, and unconscious bias.

These establish a foundation for objective comparison, standardised assessment, and fair evaluation.

The interviewer's role

Study the leveling framework. Understanding expectations for a role is the starting point for fair assessment. Beyond the list of duties ("does x, y, z"), know what impact the role demands. This lets you evaluate candidates more objectively.

Gather comparative data across candidates. When there are fewer roles than qualified candidates, deciding who fits best is hard — especially if different people interviewed them. Two approaches help: pairing a lead interviewer with a shadow interviewer, or running multiple interviews of the same type per candidate. Both reduce bias and give more angles for comparison.

Define good outcomes for each interview stage upfront. Know what you're calibrating against before the interview happens. "Better" is relative — discussing it on shared terms keeps debates specific. For a system design interview, criteria might include completion, prioritisation, depth and breadth of knowledge, communication, and trade-off awareness. Limit grading to the stage's defined criteria.
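
One way to keep grading limited to the agreed criteria is to make the scorecard itself reject anything else. The sketch below is only an illustration: the criteria names mirror the system design example above, while the function name `validate_scorecard` and the 1-to-4 scale are assumptions of mine, not part of any particular hiring tool.

```python
# Hypothetical criteria names, taken from the system design example above.
SYSTEM_DESIGN_CRITERIA = (
    "completion",
    "prioritisation",
    "depth_and_breadth",
    "communication",
    "trade_off_awareness",
)


def validate_scorecard(scores, criteria=SYSTEM_DESIGN_CRITERIA, scale=(1, 4)):
    """Return the scores in criteria order, or raise if the card strays from the rubric.

    The 1-4 scale is an assumed default; use whatever your process defines.
    """
    unknown = sorted(set(scores) - set(criteria))
    missing = sorted(set(criteria) - set(scores))
    if unknown:
        raise ValueError(f"not part of this stage's criteria: {unknown}")
    if missing:
        raise ValueError(f"criteria left ungraded: {missing}")
    low, high = scale
    for name, value in scores.items():
        if not low <= value <= high:
            raise ValueError(f"{name}: {value} is outside the {low}-{high} scale")
    return {name: scores[name] for name in criteria}


scorecard = validate_scorecard({
    "communication": 3,
    "completion": 2,
    "prioritisation": 3,
    "depth_and_breadth": 4,
    "trade_off_awareness": 3,
})
```

A scorecard that adds an off-rubric entry such as "culture_fit", or skips a criterion, raises an error instead of quietly shaping the debate.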

Map stage performance to candidate leveling. Candidate leveling considers overall interview performance plus prior accomplishments. Stage performance should support the overall assessment, backed by interview criteria. For example, seasoned engineers typically show deeper trade-off awareness in system design interviews. Knowing what to expect per level makes the decision fairer — but avoid turning assessment into a purely mechanical process.

Give candidates the benefit of the doubt. Interviewing is stressful. Rather than judging whether someone is fit for a role, treat the interview as a snapshot of observed behaviours. Stay open to being convinced either way.

For ties, delegate the decision to other stages. A 45-minute interview gives a limited picture. At some point, splitting hairs between candidates reflects your bias more than their qualities. Let other criteria drive the decision.

The hiring manager's role

Hire for the organisation first, then the team. Things go wrong — projects get deprioritised, or your team may no longer exist in a year. A narrow hiring strategy backfires when context changes. If leveling is standardised across the organisation, moving people to new work is less disruptive.

Look beyond your team's immediate capability gaps. As a hiring manager, you should know your problem space well, but people need room to grow. Check whether similar needs exist elsewhere in the organisation. Validate assumptions with peers and consider progression paths before hiring. If things go well, your hire will have options for their path forward.

Calibrate candidates against the leveling framework. Most companies have overlapping responsibilities and compensation bands between levels. Promotion requires consistently performing at the next level, so level newcomers correctly — and fairly relative to their peers. When unsure, down-level. Promoting someone in 6-12 months is simpler than demoting them.

Calibrate candidates against their potential peer group. Team-driven hiring can erode overall talent quality; an article from Google explains how. Look at adjacent teams with people at the same level and discipline. Understand their duties, impact, and responsibilities. This reveals how the candidate fits the broader picture and maintains talent quality across the organisation.

Calibrate interviewers against each other. Different interviewers grade differently, even with a solid framework. Some skew positive; others are conservative. Most applicant tracking systems have calibration reporting that shows how interviewers tend to grade candidates who received offers. Use this to weight assessments.
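
If your tracking system only exports raw grades, a rough version of this weighting can be done by hand. The sketch below expresses each new grade relative to the interviewer's own grading history (a z-score). This is a swapped-in, simplified technique, not what any specific applicant tracking system does; the function name, interviewer labels, and scores are all made up for illustration.

```python
from statistics import mean, pstdev


def normalise_scores(history, new_scores):
    """Express each new grade in standard deviations from that interviewer's usual grade.

    history:    interviewer -> list of past grades (hypothetical data)
    new_scores: interviewer -> grade given to the current candidate
    """
    adjusted = {}
    for interviewer, score in new_scores.items():
        past = history[interviewer]
        mu = mean(past)
        sigma = pstdev(past)
        # An interviewer who always gives the same grade has no spread to compare against.
        adjusted[interviewer] = 0.0 if sigma == 0 else (score - mu) / sigma
    return adjusted


# Made-up histories: one interviewer skews positive, the other is conservative.
history = {
    "lenient": [4, 4, 3, 4, 5],
    "strict": [2, 3, 2, 1, 2],
}
new_scores = {"lenient": 4, "strict": 3}

adjusted = normalise_scores(history, new_scores)
```

With these numbers, the lenient interviewer's 4 is simply their average grade, while the strict interviewer's 3 sits well above theirs, so the lower raw grade is the stronger signal. Treat the result as a prompt for discussion, not a verdict, since short histories make the spread unreliable.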

For split decisions, host an interview packet review meeting. Bring all interviewers together, build consensus, and make a final decision. These meetings surface ambiguity in criteria, provide feedback loops for interviewers, and improve the process over time.

This is just the tip of the iceberg. Effective calibrations require cross-disciplinary effort from talent acquisition, HR, and technical leadership. Fair hiring demands continuous measurement and iteration — both quantitative and qualitative.