About three years ago, I worked on a project at Uber to try to detect when photos taken client-side were blurry. The goal was to be able to fail fast, ask the user to retake them if necessary, and save the cycle of waiting for something we could programmatically determine would be a rejection. In short — we could lower the turnaround time for errors from 3 days to ~200ms with RenderScript.
We ended up not shipping this, so the code isn't the most refined. That wasn't because it didn't work, but admittedly more because it fell victim to ever-shifting priorities. We wanted to share the implementation though, in the hope that someone in the community might take it up or otherwise find it valuable or interesting to learn about. The algorithm itself isn't a new concept, but doing it in RenderScript was a fun adventure.
The process is pretty straightforward, and you can get by with just the Java bindings for RenderScript's API (no need to write custom .rs files). At a high level, we run an edge-detection convolution over the image and then look for the resulting edges. If you run this on a blurry image, you'll get back an essentially black image with no lines. If you run it on a sharp image, you'll get bright white lines like those pictured above.
The steps go as follows:
- Apply a soft blur with ScriptIntrinsicBlur. In our testing, this helped remove artifacting. By "soft", I mean really soft. We used a blur radius of
- Greyscale the image with a ScriptIntrinsicColorMatrix. After we run edge detection, we just want to check luminosity to detect the "white" pixels of the sharp lines.
- Run an edge-detection convolution using ScriptIntrinsicConvolve3x3. We experimented with a few different convolutions, but the classic edge-detection kernel (anecdotally) seemed to yield the best results. You can find more kinds of convolutions here.
- Iterate over the pixels and look for one with a luminosity above a certain threshold. You could also tweak this to be smarter (look for actual lines), or simply require a minimum number of pixels above the threshold.
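The edge-detection step can be sketched in plain Java, independent of RenderScript. The 3x3 kernel below is my assumption of what the "classic edge detection" convolution refers to, and the class and method names are illustrative, not from the actual gist:

```java
// Minimal sketch of an edge-detection convolution, mirroring what
// ScriptIntrinsicConvolve3x3 does with the classic edge kernel.
public class EdgeDetectSketch {
    // Classic edge-detection kernel: center weight 8, neighbors -1.
    // A flat region sums to zero; a sharp transition produces a large value.
    static final float[] KERNEL = {
        -1, -1, -1,
        -1,  8, -1,
        -1, -1, -1,
    };

    // Convolve a grayscale image (values 0..255), clamping output to 0..255.
    // Border pixels are left at 0 for simplicity.
    static int[][] convolve(int[][] gray) {
        int h = gray.length, w = gray[0].length;
        int[][] out = new int[h][w];
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                float sum = 0;
                for (int ky = -1; ky <= 1; ky++) {
                    for (int kx = -1; kx <= 1; kx++) {
                        sum += KERNEL[(ky + 1) * 3 + (kx + 1)] * gray[y + ky][x + kx];
                    }
                }
                out[y][x] = Math.max(0, Math.min(255, Math.round(sum)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A flat ("blurry") patch: every pixel 128, so the kernel sums to zero.
        int[][] flat = new int[5][5];
        for (int[] row : flat) java.util.Arrays.fill(row, 128);
        // A hard vertical edge: left three columns 0, right two columns 255.
        int[][] edge = new int[5][5];
        for (int y = 0; y < 5; y++)
            for (int x = 3; x < 5; x++) edge[y][x] = 255;

        System.out.println(convolve(flat)[2][2]); // prints 0 (no edges anywhere)
        System.out.println(convolve(edge)[2][3]); // prints 255 (bright line at the edge)
    }
}
```

This is also why the blurry case comes out black: blurring smears neighboring pixels toward the same value, so the kernel's positive and negative weights cancel out almost everywhere.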
And that’s it! It’s a bit light on technical detail, but it’s much easier to just read the code, which you can find here: https://gist.github.com/ZacSweers/aac5f2e684e125e64ae6b1ad0a2540aa
For the sake of demonstration, the code above just uses #CECECE as the threshold, and the calculated mostLuminousColor is compared against it. The plan was to try different values and analyze which yielded the best results. There are a few dials worth turning like that, including the blur radius and the choice of convolution. Gathering this data would just be a matter of running the detection client-side in the background, then later comparing whether or not that document was ultimately accepted or rejected, and whether the rejection was for blurriness.
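As a hypothetical sketch of that final comparison (the Rec. 601 luma weights and all names here are my assumptions; the gist may compute luminosity differently):

```java
// Sketch of the threshold check: treat #CECECE as the cutoff and compare
// the most luminous pixel of the edge-detected bitmap against it.
public class BlurCheckSketch {
    // The demonstration threshold from the post; one of the "dials" to tune.
    static final int THRESHOLD = 0xCECECE;

    // Rec. 601 luma approximation from an ARGB pixel (assumed formula).
    static double luminance(int argb) {
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;
        return 0.299 * r + 0.587 * g + 0.114 * b;
    }

    // "Sharp" if any pixel of the edge map is at least as luminous as the cutoff.
    static boolean looksSharp(int[] pixels) {
        double cutoff = luminance(THRESHOLD); // 206.0 for #CECECE
        for (int p : pixels) {
            if (luminance(p) >= cutoff) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // One white pixel (a bright edge line) among black is enough.
        System.out.println(looksSharp(new int[]{0xFF000000, 0xFFFFFFFF})); // prints true
        // An all-dark edge map means no edges were found: likely blurry.
        System.out.println(looksSharp(new int[]{0xFF000000, 0xFF101010})); // prints false
    }
}
```

A smarter variant, as noted above, would count pixels above the cutoff and require some minimum number rather than stopping at the first hit.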
There are many ways to do this, and I'm no computer vision or graphics guru. For a team that wants to do some client-side noise filtering without importing a 30MB OpenCV SDK, though, this was a great MVP. We were getting execution times of ~200ms on a Nexus 6P at the time, and that was with a really slow implementation! See the story below for details. A few caveats:
- Doesn’t work with glare/flash situations
- Requires API 17+ or the androidx RenderScript support library
- May be slow on older devices
Bonus debugging story
Initially, I hit a strange scenario where I'd see the white lines if I set the edge-detected bitmap as the image drawable on an ImageView with a black background, yet when I peeked at the pixels directly in code (i.e. via getPixels()), they all reported themselves as transparent black pixels. At some point while debugging this, I yolo-tried writing the edge-detected bitmap to disk and reading it back on the off chance that it changed anything, only to find that it suddenly worked. We had no clue why, but it was still fast enough (<200ms) for testing, so we ran with it. Instead of writing to disk, though, we tweaked it to write to an in-memory byte array and read it back ¯\_(ツ)_/¯.
Some time later, I met Romain Guy and asked for his advice on what we might be missing. After I sent over a sample snippet, he pointed out that all we needed to do was disable the alpha channel on the bitmap with setHasAlpha(false). This also explained why writing out the stream worked: it wrote the bitmap out as an alpha-channel-less JPG and read it back. He also pointed out several areas where performance could be improved, and the linked code above now includes all of those improvements.
This was originally posted on my Medium account, but I've migrated to this personal blog.