Importing / Indexing MySQL data into Solr
Apache Solr is a fast, full featured, open-source Java search server. It enables you to easily create search engines which index, search & analyze data from websites, databases or files.
In schema.xml file you have to define the
Search for <uniqueKey> tag and define your primary key name
Download MySQL jdbc driver jar and copy it into <SOLR_HOME>/lib directory
Step 5:
(Re)Start your webserver
You can check the status of the command by accessing
http://<host>:<port>/solr/dataimport
To do an incremental or delta indexing of data since the last full or delta, issue the command
http://<host>:<port>/solr/dataimport?command=delta-import
Similarly any other database can be configured all you need is
Before we proceed further I presume that you have already configured Solr server without data or refer to Solr in 5 minutes tutorial to configure your Solr server first.
Configure your Solr server
Step 1:
To index your data firstly you need to define schema by editing file called schema.xml present at <SOLR_HOME>/example/solr/conf/
Schema.xml in solr serves the similar purpose that table does in mysqlIn schema.xml file you have to define the
- Fields with there field type
- Field which should be used as unique key
- Which fields are required, indexed
- Default search field (deprecated in new versions)
- Default search operator (AND | OR deprecated in new versions)
Search for <fields> tag and define your fields under this
<fields>
<field name="blog_id" type="string" indexed="true" stored="true" omitNorms="true"/>
<field name="tags" type="text_general" indexed="true" stored="true"/>
<field name="desc" type="text_general" indexed="true" stored="true"/>
<field name="created" type="tlong" indexed="true" stored="true" omitNorms="true"/>
<field name="uid" type="int" indexed="true" stored="true" omitNorms="true"/>
</fields>
Search for <uniqueKey> tag and define your primary key name
<uniqueKey>blog_id</uniqueKey>
Step 2:
Now you need to define how this schema is mapped to your database. For this purpose you have to create a new file data-config.xml under <SOLR_HOME>/example/solr/conf directory.
<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
<dataSource type="JdbcDataSource"
driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/myDatabase"
user="dbuser"
password="p@ssword"/>
<document name="doc">
<entity name="blog"
query="select b.blog_id, b.tags, concat(b.title,' ' ,b.desc) as desc, b.created, b.uid, from blogs b order by b.blog_id;">
<field column="blog_id" name="blog_id" />
<field column="tags" name="tags" />
<field column="desc" name="desc" />
<field column="created" name="created"/>
<field column="uid" name="uid"/>
</entity>
</document>
</dataConfig>
Step 3:
Edit file solrconfig.xml which is located at <SOLR_HOME>/example/solr/conf directory. Add the following requestHandler entry if not already existing.
<requestHandler name="/dataimport">
<lst name="defaults">
<str name="config">data-config.xml</str>
</lst>
</requestHandler>
Step 4:
Download MySQL jdbc driver jar and copy it into <SOLR_HOME>/lib directory
Step 5:
(Re)Start your webserver
Performing full or delta indexing
If everything works correctly, you can get solr to fully index the configured tables by accessing the following command via your browser.http://<host>:<port>/solr/dataimport?command=full-import
You can check the status of the command by accessing
http://<host>:<port>/solr/dataimport
To do an incremental or delta indexing of data since the last full or delta, issue the command
http://<host>:<port>/solr/dataimport?command=delta-import
Similarly any other database can be configured all you need is
-
jdbc jar of that database
-
make changes to data-config.xml
Comments
Post a Comment