Plug-in: Search Engine

General introduction

This script lets your visitors search the content of your web site. You don't need any programming knowledge to use this software.

The Sun Java VM is required for installation of this package. This is a free product that you can get at http://java.sun.com.

Notice to PHP newbies: This software requires PHP, so it will probably not run from your local hard disk, but only on a web server. To run this software on your web site, your provider needs to support PHP. If in doubt, ask your provider.

Features

 

Included files

Tools for creation of search indexes:
  tools/index.jar                 contains the program code
  tools/start.bat                 start file
  tools/suchIndexErstellen.bat    command line interface (German)
  tools/searchIndex.bat           command line interface (English)

Misc. files:
  plugins/search                  directory for program files of the search engine
  plugins/search_admin            administrative application
  skins/default/search            template files

Installation


Tip: If you prefer to work offline on your local machine, create the search index first. To do this, use the file "start.bat" in the directory "tools/". For further details on the usage of these tools, see the next section. The search index itself is created automatically and saved to the files keywords.dat and documents.dat. Once these files are created, your search engine is ready for use.

Install search engine:

  1. Open a connection to your account via your FTP program.
  2. Copy your changes to a directory on your web site.
  3. Close the FTP connection.

Tip: The files in directory "tools/" do not need to be uploaded to the internet.

Creating the search index

To speed up searching, the search engine uses an index of all searchable documents. This listing is called a "search index". Before you can use the search engine, this index needs to be created. You may either:

  1. create it offline on your local machine and upload it to your server (either by FTP or via the administration menu of the program), or
  2. create it online via the administrator's menu of the program.

Both alternatives are available, so just choose the one that serves you best. The next section first explains how to create the search index offline on your own computer. A second section then explains how to create the search index online on a web server.

Creating the search index offline

You don't have to enter all pages of your web site by hand. The package contains a small Java tool that can do the work for you.


Figure: Graphical user interface to create the search index

Installation of Java

It does not matter whether you are running this software under Windows, Linux, Unix or Mac OS. What you do need is a working Java runtime environment. Get the most recent version for your system for free at java.sun.com. You need Java 2 version 1.4.2 or later, with the Swing and Beans libraries installed. For Linux, Java is already included in many distributions. For Windows this is not necessarily the case.

To find out whether Java is already installed on your system or not, just search for the file "java.exe". On Windows choose "Start" > "Search" > "Files and Folders". If the file is not found, Java still needs to be installed.

After installation, the folder containing this file needs to be added to the system's path variable. You can check this in Windows by opening the "DOS command line" from the "Start" menu and typing "java". A list of options should then be displayed. If not, the folder still needs to be added to the system's path.
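
The check described above can also be sketched in a few lines of Python, shown here purely as an illustration. The function name find_in_path is an assumption made for this example and is not part of the package. It reports the first folder listed in a path value that actually contains the given file.

```python
import os


def find_in_path(file_name, path_value):
    """Return the first folder listed in path_value that contains file_name.

    Returns None if none of the listed folders contains the file.
    (Hypothetical helper for illustration, not part of the search engine.)
    """
    for folder in path_value.split(os.pathsep):
        folder = folder.strip().strip('"')  # tolerate quoted entries
        if folder and os.path.isfile(os.path.join(folder, file_name)):
            return folder
    return None


if __name__ == "__main__":
    # On Windows the file is "java.exe"; elsewhere it is usually "java".
    name = "java.exe" if os.name == "nt" else "java"
    print(find_in_path(name, os.environ.get("PATH", "")))
```

If the result is None, the folder with the Java program is not on the path yet.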

In Windows XP you do this in the system settings under "System" > "Advanced" > "Environment variables" > "System variables". Restart your computer afterwards and try again by entering "java" at the command line. If you still do not succeed at this point, you have no other choice than to install Java again.

For previous versions of Windows please edit the file "autoexec.bat". Add the following entry, where JAVA-folder stands for the directory in which Java is installed (don't forget to make yourself a copy of this file first):
SET PATH=JAVA-folder\bin;%PATH%

You can get the installed version of Java under Windows by typing "java -version" at the DOS command line.
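
As a rough illustration of how to read the result, the following Python sketch compares the version text against the minimum of 1.4.2 named in this manual. The function names and the sample strings are assumptions made for this example. The exact wording printed by "java -version" differs between Java releases, so the pattern only looks for the quoted version number.

```python
import re

MINIMUM = (1, 4, 2)  # minimum Java version named in this manual


def parse_java_version(output):
    """Extract (major, minor, patch) from text printed by "java -version".

    Returns None if no version number is found. Missing parts count as 0.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?(?:\.(\d+))?', output)
    if match is None:
        return None
    return tuple(int(part) if part else 0 for part in match.groups())


def is_supported(output):
    """True if the reported version is at least MINIMUM."""
    version = parse_java_version(output)
    return version is not None and version >= MINIMUM


if __name__ == "__main__":
    # Sample strings for illustration only.
    for sample in ('java version "1.4.2_05"', 'java version "1.3.1"'):
        print(sample, "->", is_supported(sample))
```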

Start indexing software

Just start the file start.bat by double-clicking it. The program has a graphical interface with colorful icons, which should be self-explanatory. The program asks for all necessary input and creates the search index at the click of a button.

Alternatively: you may use the command line interface

Tip: You do NOT have to use the command line if you don't want to. There is a graphical user interface, which you can use by running the file "start.bat". If this does not work, or you don't like it, you can still run the program from the command line.

You can use the DOS command line, or utilize the file "suchIndexErstellen.bat" to run it.

By default this file is set to search the current directory plus all subdirectories. If you want to change this, edit the file or run the program manually from the command line. Otherwise it is enough to simply execute the file by double-clicking it.

Using the command line interface:

java -classpath index.jar suchindexErstellen FOLDER FOLLOW META_TAGS

FOLDER: the directory that is to be searched. For example, the current directory is .\ and the parent directory is ..\ ; for a directory named "home", use home\ and so on.
Please use relative path names only (e.g. not C:\\...), otherwise files would be linked to your local hard disk.
FOLLOW: lets you choose whether subdirectories should be searched as well. Choose true if you want this, or false if not.
META_TAGS: lets you choose whether keywords included in meta tags should be included in the search results. Choose true if you want this, or false if not.

  Examples:

1) If your front page is at "C:\Homepage" and your search engine is installed at "C:\Homepage\search", the Java files need to be copied to "C:\Homepage\search". Then enter the following call at the command line:

java -classpath index.jar suchindexErstellen ..\ true true

2) Your web pages are stored in "C:\Homepage" and your search engine is again stored at "C:\Homepage\search", but you just want to search the directory "C:\Homepage\test" without subdirectories. Then enter the following:

java -classpath index.jar suchindexErstellen ..\test\ false true

3) Your web site is (again) stored in "C:\Homepage", but this time your search engine is in the same directory and you just want to search the directory "C:\Homepage\test" and its subdirectories. Then the call would be:

java -classpath index.jar suchindexErstellen .\test\ true true
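
The three calls above all follow the same pattern. As a minimal sketch, the following Python function assembles such a call from the three arguments and rejects absolute paths, as this manual recommends. The function name build_index_command is an assumption made for this example. The sketch only builds the command and does not run the indexing tool.

```python
import ntpath  # Windows path rules, available on every platform


def build_index_command(folder, follow, meta_tags):
    """Build the command line for the indexing tool as a list of arguments.

    folder    -- relative Windows path, e.g. "..\\" or ".\\test\\"
    follow    -- True to search subdirectories as well
    meta_tags -- True to include keywords from meta tags
    """
    drive, rest = ntpath.splitdrive(folder)
    if drive or rest.startswith(("\\", "/")):
        raise ValueError("please use a relative path, not " + folder)
    return [
        "java", "-classpath", "index.jar", "suchindexErstellen",
        folder,
        str(bool(follow)).lower(),     # "true" or "false"
        str(bool(meta_tags)).lower(),  # "true" or "false"
    ]


if __name__ == "__main__":
    # The three examples from this section.
    print(" ".join(build_index_command("..\\", True, True)))
    print(" ".join(build_index_command("..\\test\\", False, True)))
    print(" ".join(build_index_command(".\\test\\", True, True)))
```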

Batch file:

Open the file searchIndex.bat in a text editor of your choice (e.g. Notepad). Here you will find the same call as stated above, including the description. Change the settings as you see fit, save and start the file.

Using the administrator's menu to upload the search index
  1. Open the front page of the Yana Framework installed on your server in your web browser.
  2. Log in using your user name and password. Please note that you require administrator privileges to be able to save the search index on the server.
  3. At the "Sitemap" click on "show administration menu".
  4. Make sure the setup plug-in of the search engine and the search engine itself are activated.
  5. Choose "Search engine Setup" from the options menu (see following figure).
  6. In the menu "upload search index" click the button "choose" to select the files of the search index stored on your local computer.
  7. Click on "Go" to start the upload.

Figure: Form to upload the search index at the administrator's menu

Creating the search index online

As an alternative to creating the search index offline on your local computer, you may also have it created online on your web server. This way you don't have to upload the index files to the web server. On the other hand, all files that are to be searched need to be available on the same web server at the same time. In addition, you can't check the index files by hand before putting them online.

Using the administrator's menu
  1. Open the front page of the Yana Framework installed on your server in your web browser.
  2. Log in using your user name and password. Please note that you require administrator privileges to be able to create the search index on the server.
  3. At the "Sitemap" click on "show administration menu".
  4. Make sure the setup plug-in of the search engine and the search engine itself are activated.
  5. Choose "Search engine Setup" from the options menu (see following figure).
  6. Scroll to section "create search index online".
  7. Click on "Go" to start building the index.

Figure: Administrator's menu to create a search index

Frequently Asked Questions

What information is stored?

The database contains all words found in a document, its "key words" and the title of the page.
The search is restricted to HTML documents (*.htm, *.html, *.xml, *.shtml). Besides the words found, the page's URL and the first 80 characters, which serve as a description, are stored too.

All directories whose names begin with an underscore ("_") are ignored. The reason is that some web site authoring programs (e.g. Frontpage) use this to mark system directories that, of course, are not to be searched.
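
The selection rules above can be illustrated with a short Python sketch. This is not the code of the indexing tool, only a sketch of the two rules stated here (file extensions and underscore directories). The function name find_documents and the case-insensitive comparison of file extensions are assumptions made for this example.

```python
import os

# File extensions named in this manual.
INDEXED_EXTENSIONS = (".htm", ".html", ".xml", ".shtml")


def find_documents(folder, follow=True):
    """List files below folder that would qualify for the search index.

    Directories whose names begin with "_" are skipped. If follow is
    False, subdirectories are not searched at all. Paths are returned
    relative to folder, using "/" as separator.
    """
    found = []
    for current, dirs, files in os.walk(folder):
        if follow:
            # Prune ignored directories so os.walk does not enter them.
            dirs[:] = [d for d in dirs if not d.startswith("_")]
        else:
            dirs[:] = []
        for name in files:
            if name.lower().endswith(INDEXED_EXTENSIONS):
                relative = os.path.relpath(os.path.join(current, name), folder)
                found.append(relative.replace(os.sep, "/"))
    return sorted(found)
```

For a folder that contains index.html, notes.txt and a directory _private/, the function lists only index.html.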

Some documents have changed. What do I have to do so that the new files are found as well?

Create and upload a new search index using either "searchIndex.bat" or "start.bat". Both files, "keywords.dat" and "documents.dat", need to be replaced. Also flush the cache of the search engine (if any) stored in the directory "cache/" by deleting the file(s) "*.cache".

I have created a new search index, but the old results are still shown.

This may occur while old data is still in the cache of the search engine. To fix this, open the directory "cache/", delete all files with the extension "*.cache" and try again.
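
If you have many cache files, deleting them can be automated. The following Python sketch shows one possible way. The function name flush_cache is an assumption made for this example, and the path of the cache directory depends on your installation. Only files ending in ".cache" directly inside the given directory are removed.

```python
from pathlib import Path


def flush_cache(cache_dir):
    """Delete all "*.cache" files in cache_dir and return their names."""
    removed = []
    for path in sorted(Path(cache_dir).glob("*.cache")):
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed
```

Calling flush_cache("cache") from the installation directory returns the names of the deleted files, or an empty list if the cache was already empty.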

Nothing or 0 hits are shown.

Displaying the hit list requires JavaScript. This makes the search engine extremely fast and reduces the server load, since no data has to be requested from the server while paging through the results. However, for this to work JavaScript has to be activated in your web browser. If the page stays empty, check if JavaScript is deactivated or blocked by your firewall software.

What is the difference between "start.bat", "suchIndexErstellen.bat" and "searchIndex.bat"?

The directory "tools" contains 3 files: "start.bat", "suchIndexErstellen.bat" and "searchIndex.bat". They all do the same thing. The difference is that "start.bat" offers a graphical interface, which asks for all necessary data. The other two start the program directly at the command line, so you need to adjust the arguments by hand. "searchIndex.bat" shows an explanation of the arguments in English, while "suchIndexErstellen.bat" shows the same text in German. Choose the one that suits you best.

I'm getting an error: "Error 500 search index not found".

You probably have not created the search index yet, or the files are stored in the wrong directory. To find out how to create a search index, see the section "Creating the search index".
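
To see quickly which of the two index files is missing, a check like the following Python sketch may help. The function name missing_index_files is an assumption made for this example. The directory to check depends on where your installation expects the index, which this sketch does not know.

```python
import os

# The two files that make up the search index, as named in this manual.
INDEX_FILES = ("keywords.dat", "documents.dat")


def missing_index_files(directory):
    """Return the names of the index files not present in directory."""
    return [
        name for name in INDEX_FILES
        if not os.path.isfile(os.path.join(directory, name))
    ]
```

An empty list means that both files are present in the given directory.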

I'm getting an error: "Error 404 page not found".

You have probably forgotten to upload some important files to your web server. Please check if all files are present.

When searching I'm getting an error: "Error 403 access denied".

This may occur if insufficient access rights are set. Use your FTP program to check whether the folders "cache/" and "config/", plus all their subdirectories and files, have sufficient access rights set (they should be readable AND writable). On UNIX you can set this by using the command CHMOD 777. A detailed explanation of access restrictions can be found in the installation guide for beginners.
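
If you have shell access to the server, the same check can be sketched in Python as follows. The function name paths_lacking_access is an assumption made for this example. Note that the check applies to the user who runs the script, which may differ from the user account the web server runs under, so treat the result as a hint only.

```python
import os


def paths_lacking_access(folder):
    """Return folder and all paths below it that are not both
    readable and writable for the current user."""
    candidates = [folder]
    for current, dirs, files in os.walk(folder):
        candidates.extend(os.path.join(current, name) for name in dirs + files)
    return sorted(
        path for path in candidates
        if not os.access(path, os.R_OK | os.W_OK)
    )
```

Run it once for "cache/" and once for "config/". An empty list means no problems were found for the current user.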

I tried to run the file "start.bat", but it didn't work.

Please check if you have at least version 1.4.2 of the Java Virtual Machine installed. To check this, enter "java -version" at the command line. If you have an older version, download a newer version of the Java Virtual Machine for free at java.sun.com. Alternatively, try using the file "searchIndex.bat" to run the program from the command line. While this is not as comfortable, it should work even with version 1.3 of the Java Virtual Machine.

Author: Thomas Meyer, www.yanaframework.net