
It was the Synthient threat data that ate most of my time this week, and it continues to do so now, the weekend after recording this video. Data like this is equal parts enormously damaging to victims and frustratingly noisy to process. I have to be confident enough that it's new enough, legit enough and impactful enough to justify loading and that the value presented to breach victims sufficiently offsets the inevitable chorus of "what am I meant to do with this, tell me exactly what password was exposed for my record". It's an expensive exercise too; we're currently running an Azure SQL Hyperscale database at 80 cores to analyse the ~2 billion credential stuffing email addresses in this corpus. That's 2 billion unique email addresses too 😮 More on that in the next video, let's just work out if it's going to go live in the system first.
Click to Open Code Editor