Gisgraphy user guide
To suggest a change or a correction to any part of the documentation, please send a mail to
David Masclet.
Introduction
About
Gisgraphy is a free, open source framework that provides fulltext search and find-nearby services for places on earth (aka : toponyms)
from several databases (aka : gazetteers) on the Web (mainly Geonames, but not only : ESRI for instance). It provides an importer that loads the data into a strongly typed Postgres / Postgis database, and exposes them through two web services (geolocalisation and fulltext) or a Java API, in many formats (XML, JSON, PHP, Ruby, Python, Atom, RSS / GeoRSS). Here are the main functionalities :
- Importers for Geonames CSV files. Just give the country(ies) and / or the placetypes you wish to import, and Gisgraphy downloads the files, imports them with all the alternateNames (optional), and syncs the database with the fulltext search engine
- REST WebService
- Several output formats supported : XML, json, PHP, ruby, python, Atom, RSS / GeoRSS
- Full text search (based on Lucene / Solr) with default filters optimized for city search
(case insensitivity, separator characters stripping, ...) via a Java API or a web service
- Find-nearby function (with limits, pagination, restriction to a specific country and/or language, and other useful options) via a Java API or a web service
- An admin / back office interface with statistics
- Fully replicated / scalable / high performance / cached services
- Search for zipcode name, IATA, ICAO
- Internationalized (with support for Cyrillic, Arabic, Chinese, ... alphabets)
- Dojo widgets / prototype / Ajax to ease search, but it can be used even if JavaScript is not enabled on the client side
- Opensearch module
- Platform / language independent
- Provides all the countries flags in svg and png format
A little bit of technic...
- Spring Services / Hibernate mapping with its own hibernate dialect
- Lucene Schema
- Uses Postgres / Postgis, but will soon be compliant with other spatial providers (Oracle, MySQL, ...)
- Built using Maven 2
- Automatic synchronization between Postgres and Solr
- High test coverage
- Event - based design
- UTF-8 all over the place
- Gisgraphy has its own maven repository with all the dependencies
- Designed with DSL (Domain Specific Language)
- Designed with the DDD paradigm
What's Gisgraphy for ?
- Find places (city, lake, forest, country, stream, castle, airport : more than 650 different features) via a web service or a Java API
- Find city by Zipcode
- Import geonames features into a strongly typed database with error correction.
- Use Spring hibernate to manage places
- Find places around a GPS point or another place via a REST API or a Java API
- Use Dojo widgets / prototype / JavaScript to ease search
Some examples of uses
- I've got a project in a language other than Java and I want to offer a city fulltext search, and open a popup if more than one result matches
- I've got a Java project and I need Spring DAOs and services / Hibernate mapping for geolocalisation
- I want to import all the Geonames features for the United States and use them in Python
- Get the zipcode of Washington
- Find all the forests around Paris and their distances, with a radius of 30 km in spanish, order by distance
- Find all the cities that match 'Paris' and paginate from 1 to 10 with scoring.
- Find all the lakes of Ireland in JSON format and limit the search to 5 results.
- Find the GPS position of London.
- Search 'Saint-André', 'st andre', 'SaInT andré', 'st-andré' : they all return the same instance.
- Search 'paris texas' : it will not return the French capital.
- Type 'lutece' or 'paname' and get Paris.
- Locate the results on a Google map or Yahoo map.
- Search the elevation of Madrid.
- What is the flag of Argentina ?
- Find the administrative division of New York.
- Search the nearest city for the GPS point lat=47.5/Long=5 .
- Search all the monuments / restaurants and hotels around a given GPS position.
- Search the country for a specified GPS point.
- Find the neighbouring countries of Brazil.
Requirements
- Gisgraphy needs Java 1.5 or greater
- PostgreSQL with the Postgis extension (Postgis 1.3.1 or greater is HIGHLY recommended for good performance)
- A servlet container if you want to use it as a servlet (not programmatically). Currently Gisgraphy has been tested on Tomcat and Jetty, but any servlet container should be OK
What technologies are used ?
- Java
- Maven 2
- Spring
- Hibernate
- PostgreSQL / Postgis
- SolR / Lucene
- Jersey (probably CXF in next version)
- Hibernate Spatial
- Appfuse
- Dojo / prototype (in Version 1.0)
Placetype schema
Here is a simple diagram that represents the relations between the placetype, gisFeature, Adm, country, and languages.
GisFeature is the parent class of all the placetypes; it holds all the basic information.
Language represents a spoken language.
AlternateName is a specific name for a specific language.
All the placetypes extend GisFeature, such as :
- City
- Country
- Adm (aka : administrative division)
- All the other placetypes
Adms are organized in a tree structure.
A GisFeature :
- Belongs to an Adm
- Belongs to a country
- Has some alternate names for a specific language
License
Gisgraphy is licensed under the terms of the LGPL V3.0 License
Installation
Import Data
You must use the admin interface. See the admin interface section for more information.
Debug Mode
Each Gisgraphy service (Geoloc and fulltext) is a servlet.
Each servlet can be run in debug mode, in which error messages are verbose. To do so, set the servlet init parameter "debugMode" to true (in the web.xml) as shown :
<servlet>
    <servlet-name>geoloc service</servlet-name>
    <servlet-class>com.gisgraphy.servlet.GeolocServlet</servlet-class>
    <init-param>
        <!-- If true, the output field 'error' will contain the exception message. Defaults to false -->
        <param-name>debugMode</param-name>
        <param-value>true</param-value>
    </init-param>
    <load-on-startup>1</load-on-startup>
</servlet>
<servlet>
    <servlet-name>fulltext service</servlet-name>
    <servlet-class>com.gisgraphy.servlet.FulltextServlet</servlet-class>
    <init-param>
        <!-- If true, the output field 'error' will contain the exception message. Defaults to false -->
        <param-name>debugMode</param-name>
        <param-value>true</param-value>
    </init-param>
    <load-on-startup>2</load-on-startup>
</servlet>
Options and environment specific settings
All the options and environment specific settings are located in the env.properties file, which is in the $GISGRAPHYDISTRIBUTION/webapps/ROOTWEB-INF/classes directory of the Gisgraphy distribution.
Take care with white space in the properties file : MyProperties=bar is not the same as MyProperties= bar (white space after the equals sign is taken into account)
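One case of stray white space that is easy to miss is a space at the end of a value. The sketch below is only an illustration : it assumes the file is parsed with the standard java.util.Properties class, which this guide does not confirm, and the class name TrailingWhitespaceDemo is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Illustration only: java.util.Properties keeps trailing white space in a value. */
final class TrailingWhitespaceDemo {

    /** Parses properties-file content held in a string. */
    static Properties parse(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            // Reading from an in-memory string should never fail.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Note the space after "true" on this line.
        Properties props = parse("importer.retrieveFiles=true \n");
        String value = props.getProperty("importer.retrieveFiles");

        // The brackets make the trailing space visible.
        System.out.println("[" + value + "]");
        // The value is no longer the plain string "true".
        System.out.println("true".equals(value));
    }
}
```

With such a parser, a value with a trailing space no longer equals the expected lower case 'true' or 'false', so it is worth checking the end of each line in env.properties.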
importer.geonames.dir
This option determines the directory where the Geonames files are located. It must end with / or \ according to the system path separator. It is also the directory where the Geonames files will be downloaded from the importer.geonames.downloadURL URL. On Windows the '\' character must be escaped as in the example below. The path can be absolute or relative (from the directory where you launched Gisgraphy). It is not recommended to put spaces in the path. This option is case sensitive if the underlying file system is case sensitive (e.g : Linux / Unix).
Examples on Linux :
importer.geonames.dir=./data/prod/
importer.geonames.dir=/home/user/data/prod/
Example on Windows
importer.geonames.dir=.\\data\\prod\\
Don't forget the ending slash (or backslash if you use windows) !
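The need to double the backslash can be checked with a few lines of Java. This is a sketch under an assumption : it uses the standard java.util.Properties parser, which may or may not be what Gisgraphy uses to read env.properties, and the class name WindowsPathEscapeDemo is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Illustration only: how java.util.Properties treats backslashes in a value. */
final class WindowsPathEscapeDemo {

    /** Parses properties-file content and returns the value stored under key. */
    static String read(String content, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            // Reading from an in-memory string should never fail.
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String key = "importer.geonames.dir";

        // File content: importer.geonames.dir=.\\data\\prod\\
        // (each backslash is doubled once more because this is a Java string literal)
        String escaped = key + "=.\\\\data\\\\prod\\\\";
        System.out.println(read(escaped, key)); // .\data\prod\

        // File content: importer.geonames.dir=.\data\prod
        // The single backslashes are silently dropped by the parser.
        String unescaped = key + "=.\\data\\prod";
        System.out.println(read(unescaped, key)); // .dataprod
    }
}
```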
importer.geonames.downloadURL
This option determines the URL of the server from which the files to be processed are downloaded. This option is case sensitive.
Example :
importer.geonames.downloadURL=http://download.geonames.org/export/dump/
Don't forget the ending slash for the URL !
importer.filesToDownload
This option determines the files to be downloaded from the importer.geonames.downloadURL URL. The files must be ';' separated. You can specify any files, but if you download from the Geonames server, you should specify country ZIP files (or allcountries.zip) and alternateNames.zip. This option is case sensitive.
Examples :
importer.filesToDownload=AD.zip;CY.zip
importer.filesToDownload=allCountries.zip
If you run an import, then change this option and re-run an import, you must delete the old downloaded files before re-running the import. If you don't, the files you downloaded previously will be processed.
It is not necessary to download Countries.txt or iso-languagecodes.txt, because they are already in the Gisgraphy distribution. Just focus on the countries you want to process, and on alternateNames.zip if you want to import alternate names.
If the allCountries.txt file is in the importer.geonames.dir directory, the other country files will be ignored.
importer.retrieveFiles
Whether the files defined by the importer.filesToDownload option should be downloaded. If 'true', the importer will download the files listed in the importer.filesToDownload option. If 'false', it will use the files already present in importer.geonames.dir. This option should be in lower case.
Examples :
importer.retrieveFiles=true
importer.retrieveFiles=false
fulltextSearchUrl
The URL of the SolR server. If you use the SolR server of the Gisgraphy distribution, the URL should be the Gisgraphy URL followed by solr (the name of the war file). If you need better performance (that is to say, running Gisgraphy and the SolR server in two distinct JVMs, see JVM optimisation), specify the URL of the server you want to use. This option is case sensitive.
Examples :
fulltextSearchUrl=http://localhost:8080/solr/
Example with the default SolR port
fulltextSearchUrl=http://localhost:8983/solr/
importerConfig.wrongNumberOfFieldsThrows
Whether an exception should be thrown and the import stopped if a line in an imported file does not have the expected number of fields (CSV fields). This option should be in lower case. It is recommended to set it to false in a standard import. If you use Gisgraphy for error correction, set it to true.
Even if you want to do error correction and set it to false, you can see the reported errors in the log files and fix them at the end of the import. It can be easier to fix all the errors at once.
Examples :
importerConfig.wrongNumberOfFieldsThrows=true
importerConfig.wrongNumberOfFieldsThrows=false
importerConfig.missingRequiredFieldThrows
Whether an exception should be thrown and the import stopped if a required field is missing. This option should be in lower case. It is recommended to set it to false in a standard import. If you use Gisgraphy for error correction, set it to true.
Even if you want to do error correction and set it to false, you can see the reported errors in the log files and fix them at the end of the import. It can be easier to fix all the errors at once.
Examples :
importerConfig.missingRequiredFieldThrows=true
importerConfig.missingRequiredFieldThrows=false
importerConfig.acceptRegExString
List of regular expressions, separated by ';', that determines the feature classes / codes to be imported. The default value is .* (all the feature classes / codes).
The regular expressions must match featureClass.featureCode. The gisFeatures which match the "A.ADM." (administrative divisions) and "A.PCL." (countries) regexes are automatically imported. This option is case sensitive and should be set in upper case, because feature classes / codes are in upper case.
Examples :
.* : import all gisFeatures, no matter their feature class / code
P[.]PPL[A-Z&&[^QW]];P[.]PPL$;P[.]STLMT$ : import Israeli settlements and all the cities except destroyed and abandoned cities
V.FRST. : import all the forests
P[.]PPL[A-Z&&[^QW]];P[.]PPL$;P[.]STLMT$;V.FRST. : import Israeli settlements and all the cities except destroyed and abandoned cities, and the forests
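To see how such a list of expressions filters features, here is a minimal sketch using java.util.regex. It is not the importer's code : the class AcceptRegExDemo is hypothetical, and it assumes that each expression must match the whole 'featureClass.featureCode' string (Matcher.matches), which this guide does not specify.

```java
import java.util.regex.Pattern;

/** Illustration only: filtering "featureClass.featureCode" strings with a ';' separated list. */
final class AcceptRegExDemo {

    /**
     * Returns true if at least one expression of the ';' separated option
     * matches the whole "featureClass.featureCode" string.
     */
    static boolean accepts(String option, String featureClass, String featureCode) {
        String candidate = featureClass + "." + featureCode;
        for (String regex : option.split(";")) {
            if (Pattern.compile(regex).matcher(candidate).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Second example of the list above: settlements and cities,
        // except the codes ending in Q or W.
        String option = "P[.]PPL[A-Z&&[^QW]];P[.]PPL$;P[.]STLMT$";

        System.out.println(accepts(option, "P", "PPL"));   // true
        System.out.println(accepts(option, "P", "PPLA"));  // true
        System.out.println(accepts(option, "P", "PPLQ"));  // false
        System.out.println(accepts(option, "P", "STLMT")); // true
        System.out.println(accepts(option, "V", "FRST"));  // false: not in this list
    }
}
```

The character class [A-Z&&[^QW]] is Java's intersection syntax : any upper case letter except Q and W.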
importerConfig.tryToDetectAdmIfNotFound
If this option is set to true, the importer will try to detect the Adm of a feature when its AdmXcode values do not correspond to a known Adm : setting this option to true activates error correction. If it is set to false, error correction is disabled and, if no Adm is found for the AdmXcodes, the feature will be linked to a null Adm.
This option is case sensitive and must be set in lower case.
Example : suppose the datastore holds an Adm of level 2 with adm1Code = 'A1' and adm2Code = 'B2', and a gisFeature has Adm1code = 'A3' and Adm2Code = 'B2'. Gisgraphy detects an error because there is no Adm with those codes. If this option is set to true, Gisgraphy corrects the error and links the feature to the Adm with adm1Code = 'A1' and adm2Code = 'B2'. If this option is set to false, Gisgraphy does not try to correct the error : it writes a warning message in the logs and links the feature to a null Adm.
Examples :
importerConfig.tryToDetectAdmIfNotFound=true
importerConfig.tryToDetectAdmIfNotFound=false
importerConfig.syncAdmCodesWithLinkedAdmOnes
This option is a little difficult to understand, and an example is often simpler than a long speech ;). First, there are a few things to know : a feature has the following properties :
FeatureId......Adm...Adm1Code...Adm2Code...Adm3Code...Adm4Code...adm1Name...Adm2Name...adm3Name...Adm4Name...
and an Adm is a feature too, so it has the same properties.
So a feature is linked to an administrative division (aka : Adm). For performance reasons, the codes and names of the Adms are stored in the feature itself too.
Now consider the example above : if there is an error, the codes of the linked Adm will not be the same as the codes in the CSV files. This option allows you to choose between two strategies :
- If importerConfig.syncAdmCodesWithLinkedAdmOnes is set to 'true', the admXcodes and admXnames of the features will be the same as those of the linked Adm (in our example : if importerConfig.tryToDetectAdmIfNotFound is set to 'true', the adm1Code will be 'A1' and the Adm2Code will be 'B2', that is to say the same as the Adm ones)
- If importerConfig.syncAdmCodesWithLinkedAdmOnes is set to 'false', the admXcodes and admXnames of the features will be the same as in the CSV file (in our example : if importerConfig.tryToDetectAdmIfNotFound is set to 'true', the adm1Code will be 'A3' and the Adm2Code will be 'B2', and the linked Adm will be null)
In other words, if you want the importer to set the admXcodes and admXnames from the CSV file, set this option to false. If you want those codes to be the same as those of the linked Adm, set it to true.
If you don't know what to do, set it to the recommended value : true. This option is case sensitive and must be set in lower case.
importerConfig.tryToDetectAdmIfNotFound and importerConfig.syncAdmCodesWithLinkedAdmOnes are orthogonal concepts.
importerConfig.admXExtracterStrategyIfAlreadyExists
In order to import the Adms before the other features, Gisgraphy extracts the Adm1, Adm2, Adm3 and Adm4 files. This option tells what to do if an AdmX file (determined by the importerConfig.admXFileName option) is already present in the importer.geonames.dir directory. This option is case sensitive. 3 values are available :
- skip : the extraction will be skipped, and the file already present will be used
- backup : the file already present will be backed up (with the current date and time), and a new file will be extracted and used
- reprocess : the file will be replaced by the new one
Examples :
importerConfig.adm3ExtracterStrategyIfAlreadyExists=reprocess
importerConfig.adm4ExtracterStrategyIfAlreadyExists=skip
importerConfig.adm1FileName
Specifies the filename of the CSV file with the administrative divisions of level 1. Should normally be 'admin1Codes.txt'. This option is case sensitive if the underlying file system is case sensitive
importerConfig.adm2FileName
Specifies the filename of the CSV file with the administrative divisions of level 2. Should normally be 'admin2Codes.txt'. This option is case sensitive if the underlying file system is case sensitive
importerConfig.adm3FileName
Specifies the filename of the CSV file with the administrative divisions of level 3. Should normally be 'admin3Codes.txt'. This file name will be used to extract the Adms of level 3. This option is case sensitive if the underlying file system is case sensitive
importerConfig.adm4FileName
Specifies the filename of the CSV file with the administrative divisions of level 4. Should normally be 'admin4Codes.txt'. This file name will be used to extract the Adms of level 4. This option is case sensitive if the underlying file system is case sensitive
importerConfig.languageFileName
Specifies the filename of the CSV file with the languages. Should normally be 'iso-languagecodes.txt'. This option is case sensitive if the underlying file system is case sensitive
importerConfig.countriesFileName
Specifies the filename of the CSV file with the countries information. Should normally be 'countries.txt'. This option is case sensitive if the underlying file system is case sensitive
importerConfig.alternateNamesFileName
Specifies the filename of the CSV file with the alternate names. Should normally be 'alternateNames.txt'. This option is case sensitive if the underlying file system is case sensitive
importerConfig.importGisFeatureEmbededAlternateNames
Some of the alternate names are provided in each country dump file, and all the alternate names, with languages and additional information, are in a separate file. The alternate names provided in the country files are incomplete. If you want to import the alternate names of the country files (faster, but a lot of information is lost), set this option to true; in that case importerConfig.alternateNamesFileName will be ignored. If you want a full import with the separate alternate names file, set this option to false. This option is case sensitive and must be set in lower case.
Examples :
importerConfig.importGisFeatureEmbededAlternateNames=true
importerConfig.importGisFeatureEmbededAlternateNames=false
fulltextsearch.maxConnectionsPerHost
Limits the number of connections to the SolR server per host. Recommended : 32.
fulltextsearch.maxTotalConnections
Limits the number of connections to the SolR server for all hosts. Recommended : 128.
geolocsearch.defaultGeolocSearchPlaceType
Defines the default placetype class for the geoloc query. An empty or wrong value will search for any placetype by default. This option is case sensitive, and the placetype must be in the entity package.
Examples :
geolocsearch.defaultGeolocSearchPlaceType=City
geolocsearch.defaultGeolocSearchPlaceType=
The name of the class should not end with '.class', and it is case sensitive.
Import failure
When an error occurs during an import, you have to :
- Find and repair the error
- Reset the database (a script is provided in the sql directory), because some data are already in the database and you would get duplicate key / constraint exceptions.
- Restart the web application, in order to flush the cache and reset the configuration : the importers keep track of what has been imported, and restarting clears that information
- Re-run the import
Never re-run an import before cleaning the database and restarting the web application : it will fail !
Full text service
Description
The full text service allows you to search for features / places.
You can :
- Specify one or more words
- Search for text or zip code
- Limit the results to a specific
- Paginate the results
- Specify the output verbosity
- Choose whether the output should be indented
The search is case insensitive, uses synonyms (Saint/st, ...), strips separator characters, ...
Parameters
- The searched text (required) : The text for the query. It can be a zip code, or one or more strings.
Examples :
- Start pagination index (optional) : The first pagination index. Numbered from 1. If the number is < 1 or not specified, it will be set to the default value : 1.
- End pagination index (optional) : The last pagination index. If it is < 1 or not specified, it will be set to start index + 10.
- The output format (optional) : The formats available are :
- The language code (optional) : The ISO 639 alpha-2 or alpha-3 language code.
Some properties, such as the AlternateName, AdmNames and countryname, belong to a certain language code. The language parameter can limit the output of those fields to a certain language (it only applies to the FULL style) :
- If the language code does not exist or is not specified, the properties for all the languages are retrieved
- If it exists, only the properties with the specified language code are retrieved
Use the alpha-2 code when possible; only use the alpha-3 code when no alpha-2 code exists for the language.
- The output style verbosity (optional) : Determines the output verbosity. 4 styles are available :
- Short : feature_id, name, fully_qualified_name, zipcode (if city), placetype, country_code, country_name
- Medium (default) : Short + lat, lon, feature_class, feature_code, population, fips
- Long : Medium + adm1_name, adm2_name, adm3_name, Adm4_name, adm1_code, Adm2_code, Adm3_code, Adm4_code
- Full : Long + alternateNames, country_alternate_name, adm1_alternate_name, adm2_alternate_name. If the language parameter is specified, only the alternate names with the specified language are retrieved; otherwise, all the alternate names for all the languages are retrieved
For a full list and description of the output fields : see below.
- The placetype (optional) : limits the search to the specified place type. A place type groups some feature classes and feature codes. You need to specify the class corresponding to the place type you want to search. Default : search for all features
See a full list and explanation of placetype :
here.
- The country code (optional) : limits the search to the specified ISO 3166 country code. Default : search in all countries
- The indentation (optional) : indents the results. Defaults to false. The possible values are true|false (or "on" when used with the REST service). GeoRSS and Atom won't be indented, for performance reasons
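The default rules given above for the start and end pagination indexes can be sketched as follows. This is an illustration of the rules as written in this guide, not Gisgraphy's own code : the class PaginationDefaults and its method are hypothetical, and a missing parameter is represented by null.

```java
/** Illustration only: the pagination defaults described in the parameter list. */
final class PaginationDefaults {

    static final int DEFAULT_FROM = 1;
    static final int DEFAULT_WINDOW = 10;

    /**
     * Applies the documented defaults and returns {from, to}.
     * A null argument stands for a parameter that was not specified.
     */
    static int[] normalize(Integer from, Integer to) {
        // Start index: numbered from 1, falls back to 1 if missing or < 1.
        int start = (from == null || from < 1) ? DEFAULT_FROM : from;
        // End index: falls back to start index + 10 if missing or < 1.
        int end = (to == null || to < 1) ? start + DEFAULT_WINDOW : to;
        return new int[] {start, end};
    }

    public static void main(String[] args) {
        int[] defaults = normalize(null, null);
        System.out.println(defaults[0] + " to " + defaults[1]); // 1 to 11

        int[] fromFive = normalize(5, null);
        System.out.println(fromFive[0] + " to " + fromFive[1]); // 5 to 15
    }
}
```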
Web service
The full text web service uses a servlet to wrap the Java API. It maps web parameters to a fulltext query and outputs the results via HTTP.
All the parameters should be case insensitive. If you have problems with case, please report a bug.
All the parameters should be encoded in UTF-8 and the URL MUST be encoded.
Here is a summary of the Web parameters mapping :
Parameter name | Web parameter name |
The searched text | q |
Start pagination index | from |
End pagination index | to |
Output format | format |
Language code | lang |
Output style verbosity | style |
placetype | placetype |
country code | country |
indent | indent |
If you use a checkbox in a form to indent the results, the value will be "on" or "off"; so, for simplicity, the value of indent for the fulltext web service can be "true" or "on".
Examples :
http://localhost:8080/fulltext/fulltextsearch?q=paris&from=1&to=10&format=xml&lang=fr&style=short&placetype=city&country=fr&indent=true
http://localhost:8080/fulltext/fulltextsearch?q=paris
Currently, the web service limits the number of results to 10.
By default the fulltext service is mapped to the /fulltext pattern, but you can change it in the WEB-INF/web.xml
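Since the URL must be encoded, a client written in Java could build the request as sketched below. The class FulltextUrlBuilder is hypothetical; the web parameter names and the base URL are the ones of the examples above, and the encoding relies on the standard java.net.URLEncoder with UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Illustration only: building an encoded URL for the fulltext web service. */
final class FulltextUrlBuilder {

    /** URL-encodes a single name or value as UTF-8. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Appends name/value pairs to the base URL, encoding each of them. */
    static String build(String baseUrl, String... nameValuePairs) {
        if (nameValuePairs.length % 2 != 0) {
            throw new IllegalArgumentException("parameters must come in name/value pairs");
        }
        StringBuilder url = new StringBuilder(baseUrl);
        for (int i = 0; i < nameValuePairs.length; i += 2) {
            url.append(i == 0 ? '?' : '&')
               .append(encode(nameValuePairs[i]))
               .append('=')
               .append(encode(nameValuePairs[i + 1]));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // "\u00e9" is the letter e with an acute accent.
        String url = build("http://localhost:8080/fulltext/fulltextsearch",
                "q", "st andr\u00e9", "country", "FR", "format", "json");
        System.out.println(url);
        // http://localhost:8080/fulltext/fulltextsearch?q=st+andr%C3%A9&country=FR&format=json
    }
}
```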
Output fields description
Here is a description of all the output fields :
Field | Description | Available from style |
error | A string only present if an error occurred (e.g : empty query). The field 'error' appears in the path response/responseHeader/error | ERROR |
feature_id | A unique id that identifies the feature | SHORT |
Name | The name of the feature | SHORT |
fully_qualified_name | A name of the form : Paris, Département de Ville-De-Paris, Ile-De-France, (FR) (adm1Name and adm2Name are printed) | SHORT |
placetype | The place Type of the Feature | SHORT |
country_code | The ISO 3166 country code | SHORT |
country_name | The name of the country the feature belongs to | SHORT |
zipcode | The zipcode (only for the city placetype) | SHORT |
google_map_url | The URL to get the location on Google Map | MEDIUM |
yahoo_map_url | The URL to get the location on Yahoo Map | MEDIUM |
country_flag_url | The relative URL to get the country flag image | MEDIUM |
feature_class | The feature Class. More... | MEDIUM |
feature_code | The feature Code. More... | MEDIUM |
population | How many people live in this feature | MEDIUM |
elevation | Elevation in meters | MEDIUM |
name_ascii | The ascii name | MEDIUM |
timezone | The timezone (e.g :Europe/Paris) | MEDIUM |
gtopo30 | Average elevation of 30'x30' (ca 900mx900m) area in meters | MEDIUM |
lat | The latitude (north-south) | MEDIUM |
lng | The longitude (east-West) | MEDIUM |
adm1_code | The internal code for the administrative division of level 1 | LONG |
adm2_code | The internal code for the administrative division of level 2 | LONG |
adm3_code | The internal code for the administrative division of level 3 | LONG |
adm4_code | The internal code for the administrative division of level 4 | LONG |
adm1_name | The name of the administrative division of level 1 | LONG |
adm2_name | The name of the administrative division of level 2 | LONG |
adm3_name | The name of the administrative division of level 3 | LONG |
adm4_name | The name of the administrative division of level 4 | LONG |
name_alternate | The alternate names of the feature without a specific language code | LONG |
name_alternate_languagecode | The alternate names of the feature for this language code | LONG |
adm1_name_alternate | The alternate names of the administrative division of level 1 without a specific language code | FULL |
adm1_name_alternate_languagecode | The alternate names of the administrative division of level 1 for this language code | FULL |
adm2_name_alternate | The alternate names of the administrative division of level 2 without a specific language code | FULL |
adm2_name_alternate_languagecode | The alternate names of the administrative division of level 2 for this language code | FULL |
adm3_name_alternate | The alternate names of the administrative division of level 3 without a specific language code | FULL |
adm3_name_alternate_languagecode | The alternate names of the administrative division of level 3 for this language code | FULL |
adm4_name_alternate | The alternate names of the administrative division of level 4 without a specific language code | FULL |
adm4_name_alternate_languagecode | The alternate names of the administrative division of level 4 for this language code | FULL |
country_name_alternate | The alternate names of the country without a specific language code | FULL |
country_name_alternate_languagecode | The alternate names of the country for this language code | FULL |
Java API
The fulltext API looks like this
Here is an example :
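// paginate() is assumed to be statically imported from Pagination
Pagination pagination = paginate().from(1).to(10);
// XML output, French language code, SHORT verbosity, indented
Output output = Output.withFormat(OutputFormat.XML)
        .withLanguageCode("FR").withStyle(OutputStyle.SHORT)
        .withIndentation();
// Search for cities matching "Paris Texas", restricted to the US
FulltextQuery fulltextQuery = new FulltextQuery("Paris Texas",
        pagination, output, City.class, "US");
String result = fullTextSearchEngine.executeQueryToString(fulltextQuery);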
You can output the results to an OutputStream (useful for servlet use) or to a String.
The API is thread safe.
It is possible to create a query directly from an HTTP servlet request.
The methods are designed as a DSL (Domain Specific Language), and can be chained as in the example above.
Geolocalisation service
Description
The geolocalisation service allows you to search for features around a location on earth.
You can :
- Specify a GPS position
- Limit the results to a specific place type (e.g : search all monuments around a point)
- Limit the results to a specified radius
- Paginate the results
- Choose whether the output should be indented (currently this applies only to XML, not JSON, for performance reasons; this may change in the next version)
Parameters
- The latitude (required) (north-south) of the location point to search around. The value is a float and can be negative. It uses GPS coordinates.
Examples :
- The longitude (required) (east-west) of the location point to search around. The value is a float and can be negative. It uses GPS coordinates.
Examples :
- The radius (optional) : the distance in meters from the location point within which to search. The value is a number > 0. If it is not specified or incorrect, the default value will be used (10 km).
Examples :
- Start pagination index (optional) : The first pagination index. Numbered from 1. If the number is < 1 or not specified, it will be set to the default value : 1.
- End pagination index (optional) : The last pagination index. If it is < 1 or not specified, it will be set to start index + 10.
- The output format (optional) : The formats available are :
- The placetype (optional) : limits the search to the specified place type. A place type groups some feature classes and feature codes. You need to specify the class corresponding to the place type you want to search. Default : search for all features
For performance reasons, it is highly recommended to specify a placetype (if you use a web GUI, make the placetype a required parameter)
See a full list and explanation of placetype :
here.
- The indentation (optional) : indents the results. Defaults to false. The possible values are true|false (or "on" when used with the REST service).
If you use a checkbox in a form to indent the results, the value will be "on" or "off"; so, for simplicity, the value of indent for the geoloc web service can be "true" or "on".
Currently, only XML can be indented. This is not a bug : indenting JSON is possible but decreases performance. If it is a critical need, you can see an example of how to indent JSON in the source code of GeolocSearchEngine.
Web service
The geolocalisation web service uses a servlet to wrap the Java API. It maps web parameters to a geoloc query and outputs the results via HTTP.
All the parameters should be case insensitive. If you have problems with case, please report a bug.
All the parameters should be encoded in UTF-8 and the URL MUST be encoded.
Here is a summary of the Web parameters mapping :
Parameter name | Web parameter name |
Latitude | lat |
Longitude | lng |
Radius | radius |
Start pagination index | from |
End pagination index | to |
Output format | format |
Placetype | placetype |
Indent | indent |
Examples :
http://localhost:8080/geoloc/findnearbylocation?lat=4.5&lng=5.7&radius=5000&from=1&to=10&format=xml&placetype=city&indent=true
http://localhost:8080/geoloc/findnearbylocation?lat=4.5&lng=5.7
Currently, the web service limits the number of results to 10.
By default the geolocalisation service is mapped to the /geoloc pattern, but you can change it in the WEB-INF/web.xml
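A Java client could assemble the geolocalisation request as in the sketch below. The class GeolocUrlBuilder is hypothetical; the parameter names (lat, lng, radius) and the base URL come from the examples above. The range checks on latitude and longitude are an addition of this sketch, not a documented behaviour of the service. The radius is left out when it is not positive, so that the default value described above applies.

```java
/** Illustration only: building a URL for the geolocalisation web service. */
final class GeolocUrlBuilder {

    /**
     * Builds the request URL. The radius is in meters; a value <= 0 is
     * left out so that the service uses its default radius.
     */
    static String build(String baseUrl, double lat, double lng, int radiusMeters) {
        // Sanity checks added by this sketch (valid GPS ranges).
        if (Double.isNaN(lat) || lat < -90 || lat > 90) {
            throw new IllegalArgumentException("latitude must be in [-90, 90]");
        }
        if (Double.isNaN(lng) || lng < -180 || lng > 180) {
            throw new IllegalArgumentException("longitude must be in [-180, 180]");
        }
        // StringBuilder.append(double) always uses '.' as decimal separator,
        // whatever the default locale is.
        StringBuilder url = new StringBuilder(baseUrl)
                .append("?lat=").append(lat)
                .append("&lng=").append(lng);
        if (radiusMeters > 0) {
            url.append("&radius=").append(radiusMeters);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        String base = "http://localhost:8080/geoloc/findnearbylocation";
        System.out.println(build(base, 4.5, 5.7, 5000));
        // http://localhost:8080/geoloc/findnearbylocation?lat=4.5&lng=5.7&radius=5000
        System.out.println(build(base, 4.5, 5.7, 0));
        // http://localhost:8080/geoloc/findnearbylocation?lat=4.5&lng=5.7
    }
}
```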
Output fields description
Here is a description of all the output fields :
Field | Description |
error | A string only present if an error occurred (e.g : empty latitude or longitude) |
numFound | The number of results displayed for this query (it takes the pagination into account) |
QTime | The execution time of the query in ms |
distance | The distance between the point and the gisFeature in meters |
Name | The name of the feature |
asciiName | The ASCII name of the feature |
feature_id | A unique id that identifies the feature |
countryCode | The ISO 3166 country code |
google_map_url | The URL to get the location on Google Map |
country_flag_url | The relative URL to get the country flag image |
yahoo_map_url | The URL to get the location on Yahoo Map |
zipcode | The zipcode (only for the city placetype) |
featureClass | The feature Class. More... |
featureCode | The feature Code. More... |
placeType | The Type of Feature see faq |
population | How many people live in this feature |
lat | The latitude (north-south) |
lng | The longitude (east-West) |
adm1Code | The internal code for the administrative division of level 1 |
adm2Code | The internal code for the administrative division of level 2 |
adm3Code | The internal code for the administrative division of level 3 |
adm4Code | The internal code for the administrative division of level 4 |
adm1Name | The name of the administrative division of level 1 |
adm2Name | The name of the administrative division of level 2 |
adm3Name | The name of the administrative division of level 3 |
adm4Name | The name of the administrative division of level 4 |
timezone | The time zone (e.g : Europe/Paris) |
gtopo30 | Average elevation of a 30"x30" (30 arc-seconds, ca 900mx900m) area in meters |
elevation | The elevation in meters |
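To sanity-check the distance field of a result, you can recompute the distance yourself from the lat / lng of the query point and of the feature. The sketch below uses the haversine formula on a spherical Earth. This is an assumption made for illustration: this guide does not say how Gisgraphy computes the distance, so small differences with the returned value are to be expected. The GreatCircle class is hypothetical and not part of Gisgraphy.

```java
/** Hypothetical helper (not part of Gisgraphy): great-circle distance on a spherical Earth. */
final class GreatCircle {

    // Mean Earth radius in meters (spherical approximation).
    static final double EARTH_RADIUS_M = 6371000.0;

    /** Haversine distance in meters between two points given in decimal degrees. */
    static double distanceInMeters(double lat1, double lng1, double lat2, double lng2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLng = Math.toRadians(lng2 - lng1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLng / 2) * Math.sin(dLng / 2);
        // Math.min guards against rounding errors pushing the value slightly above 1.
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    public static void main(String[] args) {
        // Distance between the query point of the examples above and a point 0.1 degree further east.
        System.out.println(distanceInMeters(4.5, 5.7, 4.5, 5.8));
    }
}
```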
Java API
Here is an example of the geoloc API :
// Point to search around
Point point = GeolocHelper.createPoint(-3.5F, 45F);
// Return results 1 to 10
Pagination pagination = paginate().from(1).to(10);
// Indented XML output
Output output = Output.withFormat(OutputFormat.XML).withIndentation();
// Search for cities within a radius of 100000 around the point
GeolocQuery geolocQuery = new GeolocQuery(point, 100000, pagination, output, City.class);
String result = geolocSearchEngine.executeQueryToString(geolocQuery);
The methods are designed as a DSL (Domain Specific Language) and can be chained, as in the example above.
You can output the results to an OutputStream (useful for servlet use) or to a String.
The API is thread safe.
It is possible to create a query directly from an HTTP servlet request.
Admin Interface
To access the admin interface :
Login - Password
You can insert the two default users with the provided script in the sql directory : insert_users.sql
There are two default users already set :
user | password | profile | Description |
user | user | ROLE_USER | user with simple rights : cannot administer other users, can only edit his own profile, cannot import data |
admin | admin | ROLE_ADMIN | user with all rights : can administer other users and profiles, can edit options, can import data |
It is highly recommended to change the default users of the admin interface. To do so, either log in as 'admin' with password 'admin' and change them, or edit the 'insert_users.sql' file in the sql directory, set the users / passwords / roles, and run the script.
Import data
To import data, you must log in as a user with admin rights. Then go to the Administration menu -> Run importer and check the configuration. Click on the 'Run importer' link.
A page with the import status will be displayed and refreshed every minute.
The import process may take more than 24 hours, depending on how much data you import and the machine the importer runs on (some dumps will be available soon).
The admin interface is based on Appfuse. If you have questions about the basic features of the admin interface, see the Appfuse documentation.
Screenshots
Some screenshots are available here
Security
Default admin password
It is highly recommended to change the default users of the admin interface. To do so, log in as 'admin' with password 'admin'.
Protect webServices
Some users want to restrict the Solr engine to the host 'localhost' in order to prevent clients from querying the Solr search engine directly. You can use a firewall, or restrict access to the webapp with the following configuration :
With Tomcat :
<Context path="/path/to/secret_files">
<Valve className="org.apache.catalina.valves.RemoteAddrValve"
allow="127.0.0.1" deny=""/>
</Context>
With Jetty :
A server XML configuration file can look something like this :
<Configure class="org.mortbay.jetty.Server">
...
<Call name="addContext">
...
<Call name="addHandler">
<Arg>
<New class="IPAccessHandler">
<Set name="Standard">deny</Set>
<Set name="AllowIP">192.168.0.103</Set>
<Set name="AllowIP">192.168.0.100</Set>
</New>
</Arg>
</Call>
</Call>
</Configure>
See more on http://www.jdocs.com/jetty/5.1.11.rc0/org/mortbay/http/handler/IPAccessHandler.html
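Both configurations above express the same rule : deny by default and let through only the listed addresses. If you prefer to enforce it in your own code (for instance in a servlet filter, using the address returned by request.getRemoteAddr()), the check could be sketched as follows. The IpAllowList class is hypothetical : it is neither a Tomcat nor a Jetty class, and it only does exact string matching on the address.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper (not a Tomcat or Jetty class): deny by default, allow listed addresses. */
final class IpAllowList {

    private final Set<String> allowed;

    IpAllowList(String... allowedAddresses) {
        this.allowed = new HashSet<>(Arrays.asList(allowedAddresses));
    }

    /** Returns true only if the remote address is exactly one of the allowed addresses. */
    boolean isAllowed(String remoteAddr) {
        return remoteAddr != null && allowed.contains(remoteAddr);
    }

    public static void main(String[] args) {
        IpAllowList localOnly = new IpAllowList("127.0.0.1");
        System.out.println(localOnly.isAllowed("127.0.0.1"));     // local request
        System.out.println(localOnly.isAllowed("192.168.0.103")); // remote request
    }
}
```

Note that on some machines local requests arrive with the IPv6 loopback address instead of 127.0.0.1, so you may need to add it to the list.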
Performance
Jmeter
Some JMeter benchmarks (scripts and results) are available here.
Database optimisations
To get good performance, it is recommended to use database indexes. As far as I know, Hibernate annotations can create an index but cannot choose its type (BTREE, GIST, and so on), so you must create the indexes yourself with the following code :
DROP INDEX IF EXISTS locationindex ;
CREATE INDEX locationindex
ON gisfeature
USING gist
("location");
VACUUM FULL ANALYZE;
You can do the same for all the tables that have geometry columns.
A script named 'createGISTIndex.sql' is provided in the 'SQL' directory of the Gisgraphy distribution to create the GIST indexes for all the tables.
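If you would rather generate the statements yourself than use the provided script, the pattern shown above for gisfeature can be repeated per table. Here is a minimal sketch that only builds the SQL text. It relies on several assumptions : the GistIndexSql class is hypothetical, the 'city' table name in the usage example is only illustrative, the geometry column is assumed to be named "location" in every table, and the index name (one distinct name per table) is a choice made for this example.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Gisgraphy): generates GIST index statements. */
final class GistIndexSql {

    /** Builds the DROP / CREATE statements for a GIST index on the "location" column. */
    static String forTable(String table) {
        // Refuse anything that is not a plain lowercase identifier, to avoid SQL injection.
        if (table == null || !table.matches("[a-z_][a-z0-9_]*")) {
            throw new IllegalArgumentException("Invalid table name: " + table);
        }
        // Assumed naming scheme: one distinct index name per table.
        String index = "locationindex_" + table;
        return "DROP INDEX IF EXISTS " + index + ";\n"
             + "CREATE INDEX " + index + " ON " + table + " USING gist (\"location\");";
    }

    public static void main(String[] args) {
        // 'gisfeature' comes from the example above; 'city' is an illustrative table name.
        List<String> tables = Arrays.asList("gisfeature", "city");
        for (String table : tables) {
            System.out.println(forTable(table));
        }
    }
}
```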
You will get better performance if you specify a placetype. If you search for placetype 'gisFeature', your query will be slower.
You can use the command line, but PGAdmin may be a friendlier way.
If you use PostGIS 1.3.1 or greater, you do not have to reference the GIST indexes explicitly in your queries, because they are used automatically.
It is recommended to run the following on Postgres after an import :
VACUUM FULL ANALYZE;
JVM optimisations
You will get better performance if you run Gisgraphy and the Solr server in two distinct JVMs.
To run Solr in a separate JVM, copy the solr directory (default parameter) with schema.xml, solrconfig.xml, the data directory, and so on to a Solr distribution and start it with java -jar start.jar (or any other way of your choice; this is the easiest one).
Then you can remove solr.war from the Gisgraphy release and configure the fulltextsearch URL to point to the new Solr URL.
It is also recommended to use the Sun JVM (not the GCJ one) and to pass the -server VM argument.
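To check which JVM a process is actually running on, you can print the standard java.vm.* system properties. This is only a small diagnostic sketch (the JvmInfo class is hypothetical and not part of Gisgraphy). On HotSpot-based JVMs, the VM name usually contains "Server VM" when the server compiler is in use, but the exact wording depends on the vendor and version.

```java
/** Hypothetical diagnostic helper (not part of Gisgraphy): describes the running JVM. */
final class JvmInfo {

    /** Returns "vendor / name / version" of the running virtual machine. */
    static String describe() {
        return System.getProperty("java.vm.vendor")
                + " / " + System.getProperty("java.vm.name")
                + " / " + System.getProperty("java.vm.version");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```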