Key Management for MSPs: A Practical Approach to the Full Lifecycle
Cryptographic keys underpin almost everything in a modern IT environment. BitLocker protecting endpoints. TLS certificates securing web services. API keys authenticating integrations. VPN pre-shared keys connecting sites. Backup encryption keys protecting recovery data. Entra ID synchronisation keys keeping identity in sync.
At most organisations — and most MSPs — these are managed inconsistently. Some keys are in a password manager. Some are in a shared spreadsheet. Some are in someone's head. Rotation happens when a key expires or, more often, when something breaks because a key expired without anyone noticing.
This is a practical post about building a key management process that covers the full lifecycle, from generation through to destruction, using tools that MSPs are already likely to have: Azure Key Vault, Hudu, and — for the long term — Azure Event Grid and Azure Functions.
The Key Management Lifecycle
Key management is typically broken into six phases. Each one has failure modes that organisations discover at the worst possible time.
1. Generation
Keys must be generated using cryptographically strong methods. The minimum acceptable key length for symmetric encryption is 128 bits; 256 bits is the standard recommendation and should be the default. For asymmetric keys, RSA 2048 is the minimum; 4096-bit or elliptic curve alternatives are preferable for new deployments.
For secrets (API keys, passwords, tokens), generate them with a cryptographically secure source rather than typing something memorable. A 64-character random secret generated by a password manager such as NordPass is appropriate for most use cases; drawn from letters and digits, that is roughly 380 bits of entropy (64 random bytes would give 512 bits, but printable characters carry less than 8 bits each). If the dependent application imposes a length restriction, document why and flag it for future review.
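As a minimal sketch of generating such a secret programmatically, the following uses Python's standard `secrets` module, which draws from the operating system's cryptographically secure random source. The alphabet, the function names, and the 64-character default are illustrative assumptions; narrow the alphabet if the target system rejects certain characters.

```python
import math
import secrets
import string

# Assumed alphabet: letters and digits only (62 symbols).
ALPHABET = string.ascii_letters + string.digits


def generate_secret(length: int = 64, alphabet: str = ALPHABET) -> str:
    """Return a random secret built from the OS CSPRNG via the secrets module."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


def entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy in bits of `length` symbols chosen uniformly from the alphabet."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    secret = generate_secret()
    print(len(secret), "characters,",
          round(entropy_bits(len(secret), len(ALPHABET))), "bits of entropy")
```

The entropy helper makes the trade-off visible: shortening the secret or shrinking the alphabet to satisfy an application's restrictions reduces the figure, which is worth recording alongside the reason for the restriction.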
Hardware Security Modules (HSMs) provide the strongest generation method: they use hardware-based true random number generation and store keys in a tamper-resistant enclosure. Cloud equivalents (Azure Managed HSM, AWS CloudHSM) provide HSM-level security without the on-premises hardware cost. For most MSP clients, Azure Key Vault's Standard tier (software-protected keys) is sufficient; HSM-backed keys, available in the Premium tier or Managed HSM, are worth considering for the most sensitive use cases.
2. Distribution
Keys must reach their destination without being intercepted. This sounds obvious, but "send it over Slack" remains a common distribution method.
For secrets that need to be distributed to systems, Azure Key Vault handles this programmatically — applications authenticate to Key Vault and retrieve the secret at runtime, rather than storing it in a config file or environment variable on the server.
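A rough sketch of that runtime retrieval, assuming the `azure-identity` and `azure-keyvault-secrets` packages are installed and the application runs under an identity that has read access to the vault. The vault URL and secret name are placeholders, and `build_client` and `load_secret` are hypothetical helper names, not part of any SDK.

```python
def build_client(vault_url: str):
    """Create a Key Vault secret client using the ambient Azure identity."""
    # Imported here so the module still loads where the Azure SDK is absent.
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    return SecretClient(vault_url=vault_url, credential=DefaultAzureCredential())


def load_secret(client, name: str) -> str:
    """Fetch the current version of a secret at runtime; nothing is written to disk."""
    secret = client.get_secret(name)
    if secret.value is None:
        raise ValueError(f"secret {name!r} has no value")
    return secret.value


if __name__ == "__main__":
    # Placeholder vault URL and secret name, for illustration only.
    client = build_client("https://example-vault.vault.azure.net")
    api_key = load_secret(client, "integration-api-key")
    print("retrieved secret of length", len(api_key))
```

Passing the client in as a parameter keeps the retrieval logic testable without network access, and means the only thing stored on the server is the vault URL and secret name.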
For human distribution — handing an API key to a colleague or a client — use an encrypted channel. A purpose-built secret-sharing tool is better than email. For physical distribution of HSM-protected keys, the physical media and the PIN should travel separately.
Access should be controlled with role-based access control (RBAC). In Entra ID, this means creating a group for each key or class of keys, and assigning only the users or service principals that genuinely need access. A junior technician should not have access to the VPN pre-shared key or the backup encryption secret.
3. Storage
In Azure-based environments, Azure Key Vault is the right answer for cryptographic keys, certificates, and secrets. It provides access logging (every retrieval is recorded), RBAC integration, soft-delete and purge protection (preventing accidental or malicious deletion), and optional HSM-backed key storage.
Keys should not be stored in:
- Source code repositories (even private ones)
- Plain text configuration files on servers
- Email threads or chat histories
- Shared spreadsheets without access controls
BitLocker recovery keys should be escrowed to Entra ID automatically. This is a configuration option in Microsoft Intune (formerly branded as part of Microsoft Endpoint Manager) and should be enabled for all managed devices. The key is then available to authorised administrators without anyone having to remember it or store it separately.
4. Usage
Monitoring key usage is under-implemented at most organisations. Azure Key Vault's diagnostic logs (sent to a Log Analytics workspace) give you a record of every access: which key, which identity, at what time. Anomalous access patterns — a service principal retrieving a key outside its normal operating hours, or a key being retrieved far more frequently than expected — are detectable if you're looking.
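To illustrate the two patterns mentioned above, here is a small sketch that flags out-of-hours retrievals and unusually frequent retrievals in a list of access records. The record fields (`identity`, `key`, `time`) are a simplified, hypothetical shape rather than the actual Key Vault log schema, and the working hours and threshold are assumptions to tune per client.

```python
from collections import Counter
from datetime import datetime


def flag_anomalies(records, start_hour=7, end_hour=19, max_per_hour=20):
    """Return (reason, identity, key, time) tuples for suspicious access records."""
    findings = []
    per_hour = Counter()
    for record in records:
        when = record["time"]
        # Pattern 1: retrieval outside the assumed normal operating hours.
        if not (start_hour <= when.hour < end_hour):
            findings.append(("out_of_hours", record["identity"], record["key"], when))
        # Count retrievals per identity, key and clock hour for pattern 2.
        bucket = when.replace(minute=0, second=0, microsecond=0)
        per_hour[(record["identity"], record["key"], bucket)] += 1
    # Pattern 2: far more retrievals in one hour than expected.
    for (identity, key, bucket), count in per_hour.items():
        if count > max_per_hour:
            findings.append(("high_frequency", identity, key, bucket))
    return findings


if __name__ == "__main__":
    sample = [{"identity": "svc-backup", "key": "backup-key",
               "time": datetime(2025, 3, 1, 3, 15)}]
    for finding in flag_anomalies(sample):
        print(finding)
```

In practice the same logic would more likely live in a scheduled query against the Log Analytics workspace; the Python form is only meant to make the rules explicit.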
Access controls should follow the principle of least privilege. A web application that only needs to verify tokens should not have access to the key used to sign them — it should have access only to the public verification key.
5. Rotation
Keys should be rotated on a schedule, not only in response to incidents. Regular rotation limits the damage window if a key is compromised without anyone realising.
The practical challenge is that rotation requires updating every system that uses the key. For some systems this is straightforward; for others — legacy applications, hardware devices, third-party integrations — it requires coordination and testing. This is why rotation schedules need to be documented and tracked, not left to memory.
A pragmatic approach for MSPs without full automation in place:
- Set an expiry date (6 or 12 months, depending on sensitivity) on every key in Azure Key Vault
- Record the expiry date in Hudu, linked to the relevant client and asset
- Configure Hudu to send notification emails at 30, 14, 7, and 1 day before expiry — with escalating subject lines so urgency is clear
- Document the rotation procedure in Hudu so any technician can perform it, not just the person who created the key
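The reminder schedule in the list above can be sketched in a few lines. This is independent of Hudu and only shows the date arithmetic and escalating subject lines; the label wording and the function name are illustrative assumptions.

```python
from datetime import date, timedelta

# Days before expiry paired with an escalating subject prefix (assumed wording).
REMINDERS = [(30, "NOTICE"), (14, "REMINDER"), (7, "URGENT"), (1, "FINAL WARNING")]


def reminder_schedule(key_name: str, expiry: date):
    """Return (send_date, subject) pairs for each reminder before expiry."""
    schedule = []
    for days_before, label in REMINDERS:
        unit = "day" if days_before == 1 else "days"
        subject = f"[{label}] {key_name} expires in {days_before} {unit}"
        schedule.append((expiry - timedelta(days=days_before), subject))
    return schedule


if __name__ == "__main__":
    for send_on, subject in reminder_schedule("vpn-psk", date(2025, 12, 31)):
        print(send_on.isoformat(), subject)
```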
For the longer term, Azure Event Grid can listen for near-expiry events from Azure Key Vault and trigger Azure Functions to handle the notification and, where possible, the rotation automatically. Azure Key Vault has built-in rotation policies for its own keys; secrets that need to be updated in external systems will require the Azure Function to call the relevant API.
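As a hedged sketch of the notification half of that automation, the function below turns a Key Vault near-expiry event, delivered in the Event Grid event schema, into a short summary. It assumes the event arrives as a plain dictionary and that the `data` payload carries `VaultName`, `ObjectType`, `ObjectName` and an `EXP` Unix timestamp; verify the field names against the current event schema before relying on them. The wiring into an Azure Function trigger and the rotation call to any external API are deliberately left out.

```python
from datetime import datetime, timezone
from typing import Optional

NEAR_EXPIRY_TYPES = {
    "Microsoft.KeyVault.SecretNearExpiry",
    "Microsoft.KeyVault.KeyNearExpiry",
    "Microsoft.KeyVault.CertificateNearExpiry",
}


def summarise_near_expiry(event: dict) -> Optional[dict]:
    """Return a summary for near-expiry events, or None for anything else."""
    if event.get("eventType") not in NEAR_EXPIRY_TYPES:
        return None
    data = event["data"]
    # EXP is treated as seconds since the epoch; int() also accepts a numeric string.
    expires = datetime.fromtimestamp(int(data["EXP"]), tz=timezone.utc)
    return {
        "vault": data["VaultName"],
        "object_type": data["ObjectType"],
        "name": data["ObjectName"],
        "expires": expires.isoformat(),
    }


if __name__ == "__main__":
    example = {
        "eventType": "Microsoft.KeyVault.SecretNearExpiry",
        "data": {"VaultName": "example-vault", "ObjectType": "Secret",
                 "ObjectName": "integration-api-key", "EXP": "1767225600"},
    }
    print(summarise_near_expiry(example))
```

The returned dictionary is what a notification step (email, ticket, chat message) would consume; a rotation step would branch on `object_type` and `name` to decide which external API to call.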
6. Revocation and Destruction
When a key is compromised, the response needs to be immediate and complete. Revocation in Azure Key Vault is a two-step process: disable the key (blocking new use immediately) and then delete it. With soft-delete and purge protection enabled, deletion is reversible for a defined period — which matters if you later discover the revocation was in error.
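The disable-then-delete sequence might look like the following for a secret, assuming a `SecretClient` from the `azure-keyvault-secrets` package is passed in. `revoke_secret` is a hypothetical helper name. Called without a version, the disable step applies to the latest version of the secret. For cryptographic keys, `KeyClient` from `azure-keyvault-keys` offers the analogous `update_key_properties` and `begin_delete_key` methods.

```python
def revoke_secret(client, name: str) -> None:
    """Disable a secret immediately, then soft-delete it."""
    # Step 1: disable the latest version so new retrievals are refused.
    client.update_secret_properties(name, enabled=False)
    # Step 2: delete the secret. With soft-delete enabled this is recoverable
    # until the retention period ends. wait() blocks until deletion completes.
    client.begin_delete_secret(name).wait()


if __name__ == "__main__":
    # Requires the Azure SDK and suitable permissions; placeholder names only.
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    vault = SecretClient(vault_url="https://example-vault.vault.azure.net",
                         credential=DefaultAzureCredential())
    revoke_secret(vault, "compromised-api-key")
```

Disabling first matters because deletion is a longer-running operation; the disable call closes the window during which a compromised secret could still be read.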
After revocation, every system that used the key needs to be updated with a replacement. This is where the documentation work pays off — if you know what uses each key, you can work through the list methodically rather than discovering dependencies as systems fail.
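At end of life, keys should be deleted from Key Vault and then purged so that no recovery path remains. Note that with purge protection enabled, a deleted key cannot be purged manually; it is removed permanently only once the soft-delete retention period has elapsed, so confirm destruction after that date. For HSM-backed keys, the destruction should be logged and confirmed by the HSM's audit trail.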
The Minimum Viable Process
For an MSP that doesn't yet have any formal key management process, the minimum viable starting point is:
- Inventory: catalogue every key, secret and certificate across your clients' environments — where it is, what it's used for, when it expires
- Centralise: move everything that can be moved into Azure Key Vault
- Assign ownership: every key should have a named owner responsible for rotation
- Set expiry dates and document them in your PSA or documentation system
- Write the rotation procedure for each key and store it where any technician can find it
- Enable access logging and review it quarterly
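The inventory, ownership and expiry steps above amount to a small data model plus a periodic check. The sketch below is one possible shape, not a prescribed schema: the `KeyRecord` fields, the 30-day window and the finding labels are all assumptions, and in practice the records would live in your documentation system rather than in code.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional, Tuple


@dataclass
class KeyRecord:
    name: str
    client: str
    location: str                      # where the key lives today
    purpose: str                       # what it is used for
    expires: Optional[date] = None
    owner: Optional[str] = None        # named person responsible for rotation
    used_by: List[str] = field(default_factory=list)  # dependent systems


def needs_attention(records: List[KeyRecord], today: date,
                    window_days: int = 30) -> List[Tuple[str, str]]:
    """Return (key name, reason) pairs for records that break the process."""
    findings = []
    for record in records:
        if record.owner is None:
            findings.append((record.name, "no owner"))
        if record.expires is None:
            findings.append((record.name, "no expiry date"))
        elif record.expires < today:
            findings.append((record.name, "expired"))
        elif record.expires <= today + timedelta(days=window_days):
            findings.append((record.name, "expires soon"))
    return findings


if __name__ == "__main__":
    inventory = [
        KeyRecord("vpn-psk", "Contoso", "kv-contoso", "site-to-site VPN",
                  date(2025, 6, 10), "alice", ["fw-01"]),
        KeyRecord("backup-key", "Contoso", "spreadsheet", "backup encryption"),
    ]
    for name, reason in needs_attention(inventory, date(2025, 6, 1)):
        print(name, "->", reason)
```

The `used_by` list is what makes revocation manageable later: it is the record of which systems need a replacement key when one is withdrawn.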
This doesn't require Azure Event Grid or Azure Functions. It requires discipline and documentation. The automation layer is worth building, but the human process is the foundation.
A key that nobody knows about, that nobody rotates, and that grants access to a backup encryption store is a risk that exists whether or not you've acknowledged it.
Key management isn't a particularly glamorous area of security. It doesn't have the narrative pull of threat hunting or incident response. But compromised or expired keys are a consistent source of outages and breaches, and the fix — inventory, centralise, document, rotate — is entirely within reach for any MSP with the process discipline to follow through on it.