Implement Lucene Search in Android App – How to make it work

I built an Android app for the World Health Organization a while back. I had written several search engines with full-text indexing before, but this time I wanted to try something different, and I had only 4 days to build the app. The content came in several formats:

  1. HTML files
  2. Plain text files
  3. PDF files

They needed a search engine that could search all of this content and show a unified result set (all results on the same page, ordered by relevance). I could have implemented it in SQLite, but that would have meant a lot of work for much less gain than a document-based search engine like Apache Lucene.
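To make "unified result set" concrete, here is a toy sketch in plain Java. It is not how Lucene scores or stores anything: it simply counts how often a term occurs in each item and returns one ranked list regardless of where the text came from. The class name, the item IDs, and the sample texts are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy unified search over mixed content types; NOT Lucene's scoring model. */
final class UnifiedSearchSketch {

    /** One searchable item; "type" records where the text came from (html, text, pdf). */
    static final class Item {
        final String id;
        final String type;
        final String text;

        Item(String id, String type, String text) {
            this.id = id;
            this.type = type;
            this.text = text;
        }
    }

    private final List<Item> items = new ArrayList<>();

    void add(String id, String type, String text) {
        items.add(new Item(id, type, text));
    }

    /** Returns the ids of matching items, most occurrences of the term first. */
    List<String> search(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        final Map<String, Integer> scores = new HashMap<>();
        List<String> hits = new ArrayList<>();
        for (Item item : items) {
            int count = 0;
            // Crude tokenizer: split on anything that is not a word character.
            for (String token : item.text.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (token.equals(needle)) {
                    count++;
                }
            }
            if (count > 0) {
                scores.put(item.id, count);
                hits.add(item.id);
            }
        }
        // Highest term count first; the content type plays no role in the ordering.
        hits.sort((a, b) -> Integer.compare(scores.get(b), scores.get(a)));
        return hits;
    }

    public static void main(String[] args) {
        UnifiedSearchSketch index = new UnifiedSearchSketch();
        index.add("a.html", "html", "Malaria overview");
        index.add("b.txt", "text", "Malaria symptoms; malaria treatment; malaria prevention");
        index.add("c.pdf", "pdf", "Cholera fact sheet");
        System.out.println(index.search("malaria"));
    }
}
```

Lucene does the same job with a real inverted index, analyzers, and a proper relevance formula, which is exactly the work I did not want to redo by hand on top of SQLite.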

Learning the basics of Lucene is easy if you are in the habit of picking up new technology and computer science topics. I would say you can do basic things after a week of study at most. So learn the basics from tutorialspoint, then come back here.

The nightmare

Integrating Lucene into Android turned out to be painful, because the Java libraries available on Android differ from those on non-Android platforms. I started with the latest version; it didn’t work. I tried two 5.x versions; they didn’t work either. The problems were dependency mismatches and some minor bugs in the Lucene libraries. Then I tried the 4.x line. The 4.x versions are interesting: although they were stated to target Java 6, they mixed in Java 7 features. After trying several 4.x minor versions, only 4.1 worked! I lost two days and ended up reading the Lucene source code, because neither Google nor Stack Overflow could help. It finally worked just as I was thinking of dropping Lucene altogether, with the app only half done.
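One likely source of such mismatches (this is an assumption on my part, not something I verified for every version) is that newer Lucene releases expect a `java.nio.file.Path` where 4.1 takes a `java.io.File`, and older Android runtimes do not ship the `java.nio.file` package. A small probe like the sketch below can tell you whether a class a library needs is present on the runtime you are targeting. `RuntimeProbe` and `hasClass` are names made up for this example.

```java
/** Checks whether a class is available on the current runtime without initializing it. */
final class RuntimeProbe {

    static boolean hasClass(String className) {
        try {
            // initialize=false: only look the class up, do not run its static initializers.
            Class.forName(className, false, RuntimeProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class was found but cannot be linked on this runtime.
            return false;
        }
    }

    public static void main(String[] args) {
        // Candidate classes are illustrative; swap in whatever the stack trace complains about.
        String[] candidates = { "java.io.File", "java.nio.file.Path" };
        for (String name : candidates) {
            System.out.println(name + " present: " + hasClass(name));
        }
    }
}
```

Running the same check inside the app (logging the result instead of printing it) is a quick way to confirm or rule out a missing platform class before digging through library source code.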

Lucene 4.1 initializes things differently from other versions, so I recommend following this code to make it work:

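import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.Set;

import android.content.Context;
import android.util.Log;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

// Disease, Section, DiseaseManager and DiseaseLuceneFieldNames are the app's own classes (not shown here).

public class DiseaseIndexer {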

    public static final String LOG_TAG = "DiseaseIndexer";

    private IndexWriter writer;

    private Analyzer analyzer;

    private IndexWriterConfig iwc;

    public DiseaseIndexer( String indexDirPath ) throws IOException {

        Log.v( LOG_TAG, "Creating index at " + indexDirPath );

        Directory indexDir = FSDirectory.open( new File( indexDirPath ) ); // 4.1 takes a java.io.File here; later Lucene versions replaced it with java.nio.file.Path

        analyzer = new StandardAnalyzer( Version.LUCENE_41 );

        iwc = new IndexWriterConfig( Version.LUCENE_41,  analyzer );

        // Configuration for the index. Changes made to iwc after the IndexWriter is created have no effect on it; for later changes, use writer.getConfig().

        iwc.setOpenMode( IndexWriterConfig.OpenMode.CREATE_OR_APPEND ); // CREATE removes any previous index; CREATE_OR_APPEND opens an existing index or creates a new one

        iwc.setRAMBufferSizeMB( 16.0 ); // flush buffered documents to disk once they use about 16 MB of RAM

        // Let failures propagate: swallowing them here would leave writer null and cause a NullPointerException later.

        writer = new IndexWriter( indexDir, iwc );

    }

    public void close() throws IOException {

        writer.close();

    }

    public void createIndex(Context context) {

        // index every disease except the disabled ones

        Map<String, Disease> diseases = DiseaseManager.getInstance().getDiseases();

        Set<String> diseaseKeys =  diseases.keySet();

        for ( String key: diseaseKeys ) {

            Disease disease = diseases.get( key );

            if ( disease.isDisabled() )
                continue;

            indexDisease( context, disease );

        }

    }

    public void indexDisease( Context context, Disease disease ) {

        Log.v(LOG_TAG, "Indexing disease: " + disease.getFolder() );

        try {

            Document doc = disease.getDocument();

            if ( writer.getConfig().getOpenMode() == IndexWriterConfig.OpenMode.CREATE ) {

                writer.addDocument(doc);

            } else {

                Term idTerm = new Term( DiseaseLuceneFieldNames.ID.getStr(), disease.getFolder() );

                writer.updateDocument(  idTerm , doc );

            }

        } catch ( IOException e ) {

            Log.e( LOG_TAG, "Disease index failed: " + disease.getFolder(), e );

            return;
        }

        List<Section> sections = disease.getSections();

        for( Section section : sections ) {

            try {

                Document doc = section.getDocument( context, disease.getFolder() );

                Log.v( LOG_TAG, "Indexing section doc: " + section.toString() );

                if ( writer.getConfig().getOpenMode() == IndexWriterConfig.OpenMode.CREATE ) {

                    writer.addDocument(doc);

                } else {

                    Term idTerm = new Term( DiseaseLuceneFieldNames.ID.getStr(), disease.getFolder() + "-" + section.getId() );

                    writer.updateDocument(  idTerm , doc );

                }

            } catch (IOException e) {

                Log.e( LOG_TAG, "Section index failed: " + section.getId(), e );

            }

        }

    }

}

That was the indexer class, much like the one in the tutorialspoint tutorial. The core difference is the initialization code for Lucene 4.1, so read that part carefully.
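One detail worth spelling out: because the writer is opened with CREATE_OR_APPEND, the indexer always goes through `updateDocument`, which deletes any document containing the given ID term and then adds the new one. Re-running the indexer therefore replaces documents instead of duplicating them. The sketch below models only that behavior with a plain map; it is not Lucene code. `UpsertModel` is a made-up name, and it assumes section IDs can be treated as strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-in for the index: one entry per ID, mimicking updateDocument's delete-then-add. */
final class UpsertModel {

    private final Map<String, String> docsById = new LinkedHashMap<String, String>();

    /** Same key scheme as the indexer: the folder for a disease, folder-sectionId for a section. */
    static String sectionKey(String diseaseFolder, String sectionId) {
        return diseaseFolder + "-" + sectionId;
    }

    /** Replaces any earlier document stored under the same key. */
    void update(String key, String body) {
        docsById.put(key, body);
    }

    int size() {
        return docsById.size();
    }

    String get(String key) {
        return docsById.get(key);
    }

    public static void main(String[] args) {
        UpsertModel index = new UpsertModel();
        index.update("malaria", "disease doc, first run");
        index.update(sectionKey("malaria", "symptoms"), "section doc");
        index.update("malaria", "disease doc, second run"); // replaces, does not duplicate
        System.out.println(index.size() + " documents, malaria = " + index.get("malaria"));
    }
}
```

For the real index this only holds if the ID field is stored as a single untokenized term (for example a `StringField`). If it went through StandardAnalyzer, an ID such as "malaria-symptoms" would be split into several tokens and the exact-term match used by `updateDocument` would not find it.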

Then comes the searcher class. There is nothing fancy here except the initialization code for 4.1:

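import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class DiseaseSearcher {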

    public static final int MAX_SEARCH = 20;

    IndexSearcher indexSearcher;

    IndexReader reader;

    MultiFieldQueryParser queryParser;

    Query query;

    public DiseaseSearcher ( String indexDirPath ) throws IOException {

        Directory indexDir = FSDirectory.open( new File( indexDirPath ) );

        reader = DirectoryReader.open( indexDir );

        indexSearcher = new IndexSearcher( reader );

        String[] matchFields = { DiseaseLuceneFieldNames.CONTENT.getStr(), DiseaseLuceneFieldNames.DISEASE_ID.getStr(), DiseaseLuceneFieldNames.NAME.getStr() };

        queryParser = new MultiFieldQueryParser(
                Version.LUCENE_41,
                matchFields,
                new StandardAnalyzer( Version.LUCENE_41 )
        );

    }

    public TopDocs search (String searchQuery ) throws IOException, ParseException {

        query = queryParser.parse( searchQuery );

        return indexSearcher.search( query, MAX_SEARCH );

    }

    public Document getDocument (ScoreDoc scoreDoc ) throws IOException {

        return indexSearcher.doc( scoreDoc.doc );

    }

    public void close() throws IOException {

        reader.close();

    }

}

Contact me or comment if you still can’t get it working.
