Brainmaker

Nanos gigantium humeris insidentes!

luncene

  • August 8, 2012 8:49 pm

http://kalanir.blogspot.com/2008/06/creating-search-index-in-database.html

 

Creating Lucene Index in a Database – Apache Lucene

My previous post, Indexing a database and searching the content using Lucene, shows how to index records (or stored files) in a database. In that case the index is created in the local file system. However in real scenarios most of the applications run on clustered environments. Then the problem comes where to create the search index.

Creating the index in the local file system is not a solution for the particular situation as the index should be synchronized and shared by every node. One solution is clustering the JVM while using a Lucene RAMDirectory(keep in mind it disappears after a node failure) instead of a FSDirectory. Terracotta framework can be used to cluster the JVM. This blog entry shows a code snippet.

Anyway I thought not to go that far and decided to create the index in the database so that it can be shared by everyone. Lucence contains the JdbcDirectory interface for this purpose. However the implementation of this interface is not shipped with Lucene itself. I found a third party implementation of that. Compass projectprovides the implementation of JdbcDirectory. (No need to worry about compass configurations etc. JdbcDirectory can be used with pure Lucene without bothering about Compass Lucene stuff).

Here is a simple example

  1. //you need to include lucene and jdbc jars
  2. import org.apache.lucene.store.jdbc.JdbcDirectory;
  3. import org.apache.lucene.store.jdbc.dialect.MySQLDialect;
  4. import com.mysql.jdbc.jdbc2.optional.MysqlDataSource;

.

  1. //code snippet to create index
  2. MysqlDataSource dataSource = new MysqlDataSource();
  3. dataSource.setUser(“root”);
  4. dataSource.setPassword(“password”);
  5. dataSource.setDatabaseName(“test”);
  6. dataSource.setEmulateLocators(true); //This is important because we are dealing with a blob type data field
  7. JdbcDirectory jdbcDir = new JdbcDirectory(dataSource, new MySQLDialect(), “indexTable”);
  8. jdbcDir.create(); // creates the indexTable in the DB (test). No need to create it manually

.

  1. //code snippet for indexing
  2. StandardAnalyzer analyzer = new StandardAnalyzer();
  3. IndexWriter writer = new IndexWriter(jdbcDir, analyzer, true);
  4. indexDocs(writer, dataSource.getConnection());
  5. System.out.println(“Optimizing…”);
  6. writer.optimize();
  7. writer.close();
  8. static void indexDocs(IndexWriter writer, Connection conn)
  9. throws Exception {
  10.     String sql = “select id, name, color from pet”;
  11.     Statement stmt = conn.createStatement();
  12.     ResultSet rs = stmt.executeQuery(sql);
  13.     while (rs.next()) {
  14.         Document d = new Document();
  15.         d.add(new Field(“id”, rs.getString(“id”), Field.Store.YES, Field.Index.NO));
  16.         d.add(new Field(“name”, rs.getString(“name”), Field.Store.YES, Field.Index.TOKENIZED));
  17.         d.add(new Field(“color”, rs.getString(“color”), Field.Store.YES,  Field.Index.TOKENIZED));
  18.         writer.addDocument(d);
  19.     }
  20. }

This is the indexing part. Searching part is same as the one in my previous post.

Print Friendly