Thursday, August 15, 2013

Android and Tesseract (Part 2)

Now that we have an environment with the Tesseract library loaded, we can write some code that uses it. I put together a simple sample app that captures an image and then spits out the text the OCR managed to pick up. Let's walk through what you'll need.

Classes:

I used four classes, including MainActivity:
1. MainActivity.java - The main activity
2. ExternalStorage.java - Operations for saving the image we'll be using
3. OCRActivity.java - Activity in which we interact with the image and OCR
4. OCROperation.java - Backend call to the OCR library

So let's start by looking at the code for MainActivity.java

package com.rkts.tipassistant;

import java.io.File;

import android.app.Activity;
import android.content.Context;
import android.content.Intent;
import android.net.Uri;
import android.os.Bundle;
import android.provider.MediaStore;
import android.util.Log;
import android.view.Menu;
import android.view.View;
import android.widget.Button;
import android.widget.ImageView;
import android.widget.TextView;

public class MainActivity extends Activity {

    // Application context, shared so ExternalStorage can reach the app's assets.
    public static Context appContext;

    protected static final String PHOTO_TAKEN = "photo_taken";
    // Request code used to recognize the camera result in onActivityResult().
    protected static final int CAMERA_REQUEST = 0;

    protected Button _button;
    protected ImageView _image;
    protected TextView _field;
    protected String _path;
    protected boolean _taken;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        appContext = getApplicationContext();

        // Create the image folder and copy the Tesseract data onto storage.
        ExternalStorage.createImageStore();

        _image = (ImageView) findViewById(R.id.image);
        _field = (TextView) findViewById(R.id.field);
        _button = (Button) findViewById(R.id.button);
        _button.setOnClickListener(new ButtonClickHandler());

        ExternalStorage es = new ExternalStorage();
        _path = es.getImageStore().toString() + File.separator + "receipt.jpg";

        // Remove any image left over from a previous run.
        File existingReceipt = new File(_path);
        if (existingReceipt.exists()) {
            if (es.deleteExistingReceipt(existingReceipt)) {
                Log.d("Debug(MainActivity): ", "Removed existing image");
            } else {
                Log.d("Debug(MainActivity): ", "Failed to remove existing image");
            }
        } else {
            Log.d("Debug(MainActivity): ", "No existing image to remove");
        }
    }

    @Override
    public boolean onCreateOptionsMenu(Menu menu) {
        // Inflate the menu; this adds items to the action bar if it is present.
        getMenuInflater().inflate(R.menu.activity_main, menu);
        return true;
    }

    @Override
    protected void onSaveInstanceState(Bundle outState) {
        super.onSaveInstanceState(outState);
        outState.putBoolean(MainActivity.PHOTO_TAKEN, _taken);
    }

    @Override
    protected void onRestoreInstanceState(Bundle savedInstanceState) {
        super.onRestoreInstanceState(savedInstanceState);
        Log.i("MakeMachine", "onRestoreInstanceState()");

        if (savedInstanceState.getBoolean(MainActivity.PHOTO_TAKEN)) {
            onPhotoTaken();
        }
    }

    public class ButtonClickHandler implements View.OnClickListener {
        public void onClick(View view) {
            startCameraActivity();
        }
    }

    protected void startCameraActivity() {
        // Ask the camera app to write the photo straight to our receipt file.
        File file = new File(_path);
        Uri outputFileUri = Uri.fromFile(file);
        Intent intent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE);
        intent.putExtra(MediaStore.EXTRA_OUTPUT, outputFileUri);

        startActivityForResult(intent, CAMERA_REQUEST);
    }

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        Log.i("MakeMachine", "resultCode: " + resultCode);
        if (requestCode != CAMERA_REQUEST) {
            return;
        }
        switch (resultCode) {
        case Activity.RESULT_CANCELED: // 0
            Log.i("MakeMachine", "User cancelled");
            break;

        case Activity.RESULT_OK: // -1
            onPhotoTaken();
            break;
        }
    }

    protected void onPhotoTaken() {
        _taken = true;
        beginOCROp();
        Log.d("Debug(MainActivity.onPhotoTaken:", "End of method");
    }

    public void beginOCROp() {
        Intent intent = new Intent();
        intent.setClass(this, OCRActivity.class);
        startActivity(intent);
    }
}





What you'll see here is that during onCreate() we set ourselves up to write to device storage by calling createImageStore(), and then build the image path from getImageStore(). If memory serves, this should work both for devices with separated internal storage and for those with actual removable external storage. We also wire up our views, store a context for other classes to reference, and set the click listener on our one button.

Further down you'll see ButtonClickHandler, which simply calls startCameraActivity(). That method points the camera app at the file where the image should be stored, which is the file we later analyze with the OCR library. The important part of startCameraActivity() is the result that comes back from the camera. That result lands in onActivityResult(), whose switch on the result code decides what happens next: if a photo was actually taken (result code -1, Activity.RESULT_OK), onPhotoTaken() calls beginOCROp(), which starts OCRActivity; if the user cancels (result code 0, Activity.RESULT_CANCELED), we simply stay in our original state.

Now let's take a look at ExternalStorage.java

package com.rkts.tipassistant;

import java.io.Closeable;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import android.content.Context;
import android.content.res.AssetManager;
import android.os.Environment;
import android.util.Log;

/**
 * @author ryan
 * This class is for operations that affect the device external storage.
 */
public class ExternalStorage {
    String _path;
    static File imageStoreDirectory;

    public ExternalStorage() {
    }

    public ExternalStorage(File receipt) {
    }

    // Creates the image store folder on external storage if it does not
    // exist yet, then copies the Tesseract data files into it.
    public static void createImageStore() {
        File externalStorage = Environment.getExternalStorageDirectory();
        imageStoreDirectory = new File(externalStorage + File.separator + "tipAssistant");

        Log.d("Debug:", "ImageStoreDirectory exists?: " + imageStoreDirectory.exists());

        if (!imageStoreDirectory.exists()) {
            imageStoreDirectory.mkdirs();
        }

        ExternalStorage es = new ExternalStorage();
        es.copyAssets();
    }

    public boolean deleteExistingReceipt(File receipt) {
        return receipt.delete();
    }

    public File getImageStore() {
        return imageStoreDirectory;
    }

    // Copies every file in the root of the app's assets into
    // <image store>/tessdata, which is where Tesseract looks for its data.
    private void copyAssets() {
        Context context = MainActivity.appContext;
        _path = getImageStore().toString() + File.separator + "tessdata";
        File tessdata = new File(_path);
        if (!tessdata.exists()) {
            tessdata.mkdir();
            Log.d("Debug(ExternalStorage(copyAssets):", "Making tessdata dir");
        }

        AssetManager assetManager = context.getAssets();
        String[] files = null;
        try {
            files = assetManager.list("");
        } catch (IOException e) {
            Log.e("tag", "Failed to get asset file list.", e);
        }
        if (files == null) {
            return; // nothing to copy without a file list
        }

        for (String filename : files) {
            InputStream in = null;
            OutputStream out = null;
            try {
                // Entries that are directories fail to open and are logged below.
                in = assetManager.open(filename);
                out = new FileOutputStream(_path + File.separator + filename);
                copyFile(in, out);
                out.flush();
            } catch (IOException e) {
                Log.e("tag", "Failed to copy asset file: " + filename, e);
            } finally {
                closeQuietly(in);
                closeQuietly(out);
            }
        }
    }

    private void copyFile(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[1024];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
    }

    // Closes a stream, ignoring a null reference or a failure on close.
    private void closeQuietly(Closeable stream) {
        if (stream == null) {
            return;
        }
        try {
            stream.close();
        } catch (IOException e) {
            Log.e("tag", "Failed to close stream.", e);
        }
    }
}

This class handles our I/O with storage on the device. All we really need to do is create the folder where the image is stored temporarily, make sure the files Tesseract requires are in the proper place, and clear out any old image so the camera can save a fresh one for us to analyze. createImageStore() preps the storage environment: it checks whether the folder exists and creates it if it does not. deleteExistingReceipt() deletes the existing image file and returns whether the deletion succeeded, and getImageStore() returns the path to the image store. After that comes what we use to make sure Tesseract will behave: copyAssets() copies the data files Tesseract needs from the app's assets into a tessdata folder on the device. Lastly, copyFile() is pretty self-explanatory: it copies a stream through a 1 KB byte buffer.
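To see the buffered copy on its own, here is a minimal plain-Java sketch of the same loop that copyFile() uses. The class name StreamCopy and the byte-count return value are my own additions for illustration and are not part of the app.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: the same read/write loop as copyFile(), outside Android.
final class StreamCopy {
    private StreamCopy() {}

    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[1024]; // same 1 KB buffer size as copyFile()
        long total = 0;
        int read;
        // read() returns -1 once the input is exhausted.
        while ((read = in.read(buffer)) != -1) {
            // Write only the bytes actually read, not the whole buffer.
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }
}
```

The caller still owns both streams and is responsible for closing them, which is why the copy loop itself never calls close().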

Next we have OCRActivity.java


package com.rkts.tipassistant;

import java.io.File;

import android.app.Activity;
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.os.Bundle;
import android.support.v4.app.NavUtils;
import android.view.Menu;
import android.view.MenuItem;
import android.widget.ImageView;

public class OCRActivity extends Activity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_ocr);
        // Show the Up button in the action bar.
        //getActionBar().setDisplayHomeAsUpEnabled(true);

        // Path to the image the camera saved for us.
        ExternalStorage es = new ExternalStorage();
        String _path = es.getImageStore().toString() + File.separator + "receipt.jpg";
        System.out.println(_path);

        // Show a downsampled preview so the OCR output can be checked by eye.
        try {
            System.out.println("setting imageview to receipt...");
            ImageView receipt = (ImageView) findViewById(R.id.previewReceipt);
            BitmapFactory.Options options = new BitmapFactory.Options();
            options.inSampleSize = 4; // decode at 1/4 of the width and height
            Bitmap myBitmap = BitmapFactory.decodeFile(_path, options);
            receipt.setImageBitmap(myBitmap);
        } catch (Exception e) {
            e.printStackTrace();
        }

        // Run the OCR; the recognized text is printed to the log.
        try {
            OCROperation ocr = new OCROperation(_path);
            ocr.runOCR(_path);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    @Override
    public boolean onCreateOptionsMenu(Menu menu) {
        // Inflate the menu; this adds items to the action bar if it is present.
        getMenuInflater().inflate(R.menu.activity_ocr, menu);
        return true;
    }

    @Override
    public boolean onOptionsItemSelected(MenuItem item) {
        switch (item.getItemId()) {
        case android.R.id.home:
            // This ID represents the Home or Up button. In the case of this
            // activity, the Up button is shown. Use NavUtils to allow users
            // to navigate up one level in the application structure. For
            // more details, see the Navigation pattern on Android Design:
            //
            // http://developer.android.com/design/patterns/navigation.html#up-vs-back
            //
            NavUtils.navigateUpFromSameTask(this);
            return true;
        }
        return super.onOptionsItemSelected(item);
    }
}


This one is pretty straightforward. In onCreate() the activity builds the path to the image we captured, decodes it, and then calls runOCR() from the OCROperation class. It also puts the image in an ImageView so you can check the text that comes out the other side against what you photographed.
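The listing hard-codes inSampleSize = 4, which decodes the image at a quarter of its width and height. If you would rather pick the value from the actual image size, something along these lines could work. This is only a sketch: the class and method names are hypothetical, and on Android the source dimensions would have to come from a bounds-only decode (BitmapFactory.Options.inJustDecodeBounds), which is not shown here.

```java
// Hypothetical helper: choose a power-of-two sample size for a target size.
final class SampleSize {
    private SampleSize() {}

    /**
     * Returns the largest power of two that still leaves the decoded image
     * at least reqWidth x reqHeight pixels, or 1 if the image is already
     * smaller than the requested size.
     */
    static int calculate(int width, int height, int reqWidth, int reqHeight) {
        if (reqWidth <= 0 || reqHeight <= 0) {
            throw new IllegalArgumentException("requested size must be positive");
        }
        int inSampleSize = 1;
        // Keep doubling while the next step would still satisfy both dimensions.
        while (width / (inSampleSize * 2) >= reqWidth
                && height / (inSampleSize * 2) >= reqHeight) {
            inSampleSize *= 2;
        }
        return inSampleSize;
    }
}
```

For OCR it is worth being conservative here, since downsampling too aggressively makes small receipt text harder to recognize.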

Lastly we have OCROperation.java. This is perhaps the meat and potatoes of what we are trying to do here.


package com.rkts.tipassistant;

import java.io.File;
import java.io.IOException;

import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.graphics.Matrix;
import android.media.ExifInterface;
import android.os.Environment;

import com.googlecode.tesseract.android.TessBaseAPI;

/**
 * @author ryan
 *
 */
public class OCROperation {

    public OCROperation(String _path) {
    }

    public void runOCR(String _path) throws IOException {
        // _path = path to the image to be OCRed

        BitmapFactory.Options options = new BitmapFactory.Options();
        options.inSampleSize = 4;

        Bitmap bitmap = BitmapFactory.decodeFile(_path, options);
        if (bitmap == null) {
            // decodeFile() returns null when the file is missing or unreadable.
            throw new IOException("Could not decode image: " + _path);
        }

        // Read the orientation the camera recorded in the EXIF data.
        ExifInterface exif = new ExifInterface(_path);
        int exifOrientation = exif.getAttributeInt(
                ExifInterface.TAG_ORIENTATION,
                ExifInterface.ORIENTATION_NORMAL);

        int rotate = 0;

        switch (exifOrientation) {
        case ExifInterface.ORIENTATION_ROTATE_90:
            rotate = 90;
            break;
        case ExifInterface.ORIENTATION_ROTATE_180:
            rotate = 180;
            break;
        case ExifInterface.ORIENTATION_ROTATE_270:
            rotate = 270;
            break;
        }

        if (rotate != 0) {
            int w = bitmap.getWidth();
            int h = bitmap.getHeight();

            // Setting pre rotate
            Matrix mtx = new Matrix();
            mtx.preRotate(rotate);

            // Rotate the bitmap so the text is upright
            bitmap = Bitmap.createBitmap(bitmap, 0, 0, w, h, mtx, false);
        }

        // Convert to ARGB_8888, required by tess, whether or not we rotated
        bitmap = bitmap.copy(Bitmap.Config.ARGB_8888, true);

        TessBaseAPI baseApi = new TessBaseAPI();
        // path = folder that contains the tessdata directory
        // lang for which the language data exists, usually "eng"
        File externalStorage = Environment.getExternalStorageDirectory();
        File baseDir = new File(externalStorage + File.separator + "tipAssistant");
        String path = baseDir.toString() + File.separator;
        baseApi.init(path, "eng");
        baseApi.setImage(bitmap);
        String recognizedText = baseApi.getUTF8Text();
        baseApi.end();
        System.out.println(recognizedText);
    }
}
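The orientation switch in runOCR() can also be looked at in isolation. The sketch below is plain Java so it runs off-device: the class name is hypothetical, and the constants are local copies of the values behind android.media.ExifInterface's ORIENTATION_* constants, which come from the EXIF orientation tag.

```java
// Hypothetical helper: map an EXIF orientation value to a rotation in degrees.
final class ExifRotation {
    // Local copies of the ExifInterface.ORIENTATION_* values (EXIF tag 0x0112).
    static final int ORIENTATION_NORMAL = 1;
    static final int ORIENTATION_ROTATE_180 = 3;
    static final int ORIENTATION_ROTATE_90 = 6;
    static final int ORIENTATION_ROTATE_270 = 8;

    private ExifRotation() {}

    /** Returns the clockwise rotation needed to display the image upright. */
    static int degreesFor(int exifOrientation) {
        switch (exifOrientation) {
        case ORIENTATION_ROTATE_90:
            return 90;
        case ORIENTATION_ROTATE_180:
            return 180;
        case ORIENTATION_ROTATE_270:
            return 270;
        default:
            // Normal, undefined and mirrored orientations are left unrotated,
            // just as in runOCR().
            return 0;
        }
    }
}
```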


OCROperation takes the image stored on the device and preps it for the OCR library. To do this we use ExifInterface and the Bitmap class. In tandem these make sure we hand Tesseract the right bitmap format and that a rotated photo won't screw us up too badly. At the end we create the TessBaseAPI object, initialize it with the data path and language, give it the bitmap, and pull a string out of the image. Interestingly enough, the actual call that gets you the string is quite simple:


String recognizedText = baseApi.getUTF8Text();

Once you set that string value you can take the information and do anything you'd like with it. The only thing limiting you at that point is your ability to manipulate strings.
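As one example of that string manipulation, here is a rough sketch that looks for a total on a receipt and works out a tip. Everything in it is an assumption on my part: the class and method names are made up, and the "TOTAL 23.50" line format is a hypothetical layout. Real OCR output is noisier than this, so treat it as a starting point only.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for pulling a total out of recognized receipt text.
final class ReceiptText {
    // Matches a whole line such as "TOTAL 23.50" or "Total: $23.50".
    // Anchoring at the line start keeps "Subtotal" lines from matching.
    private static final Pattern TOTAL = Pattern.compile(
            "(?im)^\\s*total\\s*:?\\s*\\$?\\s*(\\d+\\.\\d{2})\\s*$");

    private ReceiptText() {}

    /** Returns the first total found in the text, or null if there is none. */
    static BigDecimal findTotal(String recognizedText) {
        Matcher m = TOTAL.matcher(recognizedText);
        return m.find() ? new BigDecimal(m.group(1)) : null;
    }

    /** Returns the given percentage of the total, rounded to two decimals. */
    static BigDecimal tip(BigDecimal total, int percent) {
        return total.multiply(BigDecimal.valueOf(percent))
                .divide(BigDecimal.valueOf(100), 2, RoundingMode.HALF_UP);
    }
}
```

BigDecimal is used instead of double so that money amounts are not subject to binary floating-point rounding.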


I hope this was somewhat informative; I had a bit of fun playing around with the library in order to write this. Feel free to comment or shoot me an email if you have questions or concerns.


