Question: How does OCR work in Google Drive?

I have noticed that Google Drive recognizes text in PDFs (and in other files such as images and text documents). Out of curiosity, I want to know what they did to make img tags show selectable and searchable text. For instance, when I inspect a Google Drive document in Chrome Developer Tools, each page is an image, but it doesn't behave like one because the text is selectable. Also, when I zoom in, it seems that another image with a higher resolution is loaded. I think that's the same trick that Scribd uses.

I also read that Google has been improving tesseract-ocr and that the Google Books team helped with the OCR implementation in Google Drive, but I'm not sure what process they use to generate the img tags this way.

What is going on behind the scenes?

asked Sep 13, 2013
edited Sep 12, 2013
