I feel the need, the need for speed.
Faster, Faster, until the thrill of speed overcomes the fear of death.
If you're in control, you're not going fast enough.
And so on and so forth. There's a time and a place for going fast, and there's no better place to do that than when querying Have I Been Pwned's Pwned Passwords service. (Ok, a lot less glamorous than the context of the previous statements, but also less likely to have a catastrophic outcome.)
In December last year, Pwned Passwords not only received a fresh batch of 225M new passwords from the NCA, it also welcomed the ongoing ingestion of new passwords from the FBI. This created a lot of excitement, which is great, but it also led to a very important question: what's the fastest way to query the entire corpus of data? That API is returning more than 99% of queries from the Cloudflare edge so it should be super fast, but how fast? What happens if you want to check millions of passwords? As all the Pwned Passwords code is now open source, we thought it would be cool to open this challenge up to the community and see what you can come up with in terms of an app to do just this. And, in the spirit of open source, that code should be available to all so that your good work can benefit the masses.
Here's what we've done: chief Pwned Passwords wrangler Stefán Jökull Sigurðarson has stood up a repository which you can begin working on right now. If you'd like to contribute in another language, leave a comment below and we'll create an all-new repo in your favourite language; folks using this code need to be able to read, understand and trust it, so the more the merrier. There's no API key for this service, so no secrets management, just plain and simple code that queries well-documented existing APIs.
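To give a feel for how little code is involved, here's a minimal Python sketch of a bulk check against the documented range API: SHA-1 the password, send only the first 5 hex characters to `https://api.pwnedpasswords.com/range/{prefix}`, then look for the remaining 35 characters in the `SUFFIX:COUNT` lines that come back. This is not Stefán's implementation, just an illustration; the function names, the User-Agent string and the worker count are all made up for the example and would need tuning in a real tool.

```python
import hashlib
import urllib.request
from concurrent.futures import ThreadPoolExecutor

RANGE_URL = "https://api.pwnedpasswords.com/range/{prefix}"


def split_hash(password):
    """Return (first 5 chars, remaining 35 chars) of the upper-case SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def parse_range(body):
    """Turn a range response ('SUFFIX:COUNT' per line) into {suffix: count}."""
    counts = {}
    for line in body.splitlines():
        suffix, _, count = line.strip().partition(":")
        if suffix and count:
            counts[suffix] = int(count)
    return counts


def fetch_range(prefix):
    """Fetch one hash range over HTTPS. The User-Agent value is a placeholder."""
    request = urllib.request.Request(
        RANGE_URL.format(prefix=prefix),
        headers={"User-Agent": "pwned-speed-sketch"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def pwned_counts(passwords, fetch=fetch_range, workers=32):
    """Return {password: times seen}, issuing one request per distinct prefix."""
    # Group by prefix so passwords sharing a range cost a single request.
    by_prefix = {}
    for password in passwords:
        prefix, suffix = split_hash(password)
        by_prefix.setdefault(prefix, []).append((password, suffix))

    def check(item):
        prefix, entries = item
        counts = parse_range(fetch(prefix))
        return [(password, counts.get(suffix, 0)) for password, suffix in entries]

    results = {}
    # Threads are enough here: the work is almost entirely waiting on the network.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in pool.map(check, by_prefix.items()):
            results.update(batch)
    return results
```

Calling `pwned_counts(["password", "correct horse battery staple"])` would hit the live API and return a dictionary of counts, with 0 meaning the password wasn't found. The `fetch` parameter is there so a local cache (or a test double) can be swapped in for the network call.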
A few little tips for you:
And that's pretty much it. This is a pretty simple little challenge that I hope you can have some fun with, but it's also a challenge that will do a great deal of good for many organisations and individuals alike.
Stefán has already made a head start on a C# .NET version and has achieved some blistering results:
Mileage will vary based on factors such as cache level (his Cloudflare edge node already had 100% of those requests cached) and bandwidth (let's just say Reykjavík doesn't have Australia-level connectivity), but those 3k+ requests per second are a pretty good benchmark to begin with. Wanna go really fast? Check out the speed against locally cached passwords:
So, can you beat it in your language of choice? Give it a go 🙂
And just to save this coming up in the comments, no organisation should be storing customer passwords in a format they could readily feed into this challenge. This is useful in cases where passwords have been obtained in plain text, and that ranges from credential stuffing lists to malware campaigns to law enforcement agencies identifying compromised passwords in the course of their investigations.