Reuters tutorial
- Reuters tutorial
- Step 1: Talk to Solr
- Step 2: Add a results widget
- Step 3: Add a pager widget
- Step 4: Add a tagcloud widget
- Step 5: Display the current filters
- Step 6: Add a free-text widget
- Step 7: Add an autocomplete widget
- Step 8: Add a map widget
- Step 9: Add a calendar widget
- Step 10: Extra credit
In this tutorial, we'll go step-by-step through building the AJAX Solr demo site.
Before we start, we write the HTML to which the JavaScript widgets will attach themselves. In practice, this HTML will often be the non-JavaScript version of your search interface, which you now want to improve with unobtrusive JS.
If you want to run a local instance of the Solr server used in this demo, this tarball contains a Solr index of the Reuters data. Replace the data directory of your Solr instance with this tarball's data directory. Then, add the following to the schema.xml file in the conf directory of your Solr instance:
<field name="places" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
<field name="countryCodes" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
<field name="topics" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
<field name="organisations" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
<field name="exchanges" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
<field name="companies" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
<field name="allText" type="text_general" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
<copyField source="title" dest="allText"/>
<copyField source="text" dest="allText"/>
<copyField source="places" dest="allText"/>
<copyField source="topics" dest="allText"/>
<copyField source="companies" dest="allText"/>
<copyField source="exchanges" dest="allText"/>
You may need to replace the date field definition with the following (this changes the type to pdate):
<field name="date" type="pdate" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
On Solr 3.5.0, you may also need the following dateline field definition:
<field name="dateline" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
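One way to check that the new fields are picked up is to ask Solr for facet counts on them. The sketch below only builds the request URL and does not send it. The base URL http://localhost:8983/solr/select and the helper name buildFacetUrl are assumptions for illustration, not part of AJAX Solr; q, rows, facet, facet.field and wt are standard Solr query parameters.

```javascript
// Hypothetical helper (not part of AJAX Solr): builds a Solr select URL
// that requests facet counts for the given fields and no result rows.
function buildFacetUrl(baseUrl, query, facetFields) {
  const params = new URLSearchParams();
  params.append('q', query);
  params.append('rows', '0');      // only facet counts are of interest here
  params.append('facet', 'true');
  for (const field of facetFields) {
    params.append('facet.field', field); // one facet.field parameter per field
  }
  params.append('wt', 'json');
  return baseUrl + '?' + params.toString();
}

// Assumed base URL; adjust the host, port and path to your Solr instance.
const url = buildFacetUrl(
  'http://localhost:8983/solr/select',
  '*:*',
  ['topics', 'organisations', 'exchanges']
);
console.log(url);
```

Opening the printed URL in a browser should return a JSON response whose facet counts list values for each field, provided the index and schema are in place.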
These partial instructions are taken from the SolrJS wiki. The commands below download the data from the Reuters-21578 Text Categorization Collection and check out the old SolrJS code. The instructions don't yet include adding the Reuters data to the Solr index, because those commands have not been tested. A starting point for that follows the commands below.
svn checkout --depth empty -r 824380 http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/
svn update -r 824380 --set-depth infinity solr/client/javascript
cd solr/client/javascript/example/reuters/testdata
curl -O http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.tar.gz
tar xf reuters21578.tar.gz
cd ../../..
solr/client/javascript/example/reuters/importer/java/org/apache/solr/solrjs/ReutersService.java defines an importer, which can be run with ant reuters-import. However, the above instructions have not gotten that far yet.
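As a rough alternative starting point, the sketch below shows how one article from the Reuters SGML files might be turned into a document with the field names used in the schema above. It is not how ReutersService.java works. The tag-to-field mapping, the id field and the function names are assumptions, and it assumes each article is a REUTERS element whose category tags wrap their values in D elements. Date conversion and SGML entity decoding are left out.

```javascript
// Returns the trimmed contents of the first <tag>...</tag> in the article,
// or an empty string if the tag is absent.
function textValue(article, tag) {
  const m = article.match(new RegExp('<' + tag + '>([\\s\\S]*?)</' + tag + '>'));
  return m ? m[1].trim() : '';
}

// Returns the values of all <D> elements inside <tag>...</tag>.
function listValues(article, tag) {
  const block = textValue(article, tag);
  return Array.from(block.matchAll(/<D>([\s\S]*?)<\/D>/g), (m) => m[1].trim());
}

// Hypothetical mapping from SGML tags to Solr fields (an assumption).
function toSolrDoc(article) {
  const id = article.match(/NEWID="(\d+)"/);
  return {
    id: id ? id[1] : null,
    title: textValue(article, 'TITLE'),
    dateline: textValue(article, 'DATELINE'),
    text: textValue(article, 'BODY'),
    topics: listValues(article, 'TOPICS'),
    places: listValues(article, 'PLACES'),
    organisations: listValues(article, 'ORGS'),
    exchanges: listValues(article, 'EXCHANGES'),
    companies: listValues(article, 'COMPANIES'),
  };
}

// A made-up, shortened article used only to exercise the functions.
const sample =
  '<REUTERS NEWID="1"><TOPICS><D>cocoa</D></TOPICS>' +
  '<PLACES><D>el-salvador</D><D>usa</D></PLACES>' +
  '<ORGS></ORGS><EXCHANGES></EXCHANGES><COMPANIES></COMPANIES>' +
  '<TEXT><TITLE>SAMPLE TITLE</TITLE><DATELINE> SALVADOR, Feb 26 - </DATELINE>' +
  '<BODY>Sample body text.</BODY></TEXT></REUTERS>';

const doc = toSolrDoc(sample);
console.log(JSON.stringify(doc, null, 2));
```

Documents built this way would still need to be sent to Solr's update handler. That step is not shown here, since it has not been tested.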
(Attribution: The demo site is based on the SolrJS demo site.)