Gisgraphy user guide
To suggest a change or a correction to any part of the documentation, please send an email to David Masclet.
Introduction
version
This documentation covers the latest version of Gisgraphy (including snapshots, nightly builds, and not-yet-released versions). Documentation and API docs for older versions are available separately.
About
Gisgraphy is a free and open source framework. Its goal is to provide tools to use free GIS data on the Web. It currently manages Geonames and OpenStreetMap data (42 million entries). It provides an importer to inject the data into a strongly typed Postgres / Postgis database and to use them via web services : worldwide geocoding, worldwide reverse geocoding, fulltext search and find nearby. Results can be output in XML, Atom, RSS, JSON, PHP, Ruby, and Python. Here are the main functionalities :
- Importers for Geonames CSV files. Just give the countries you wish to import and / or the placetypes, and Gisgraphy downloads the files, imports them with all the alternate names (optional), and syncs the database with the fulltext search engine
- Importers for OpenStreetMap data in CSV format
- Worldwide geocoding / worldwide reverse geocoding / street search web services
- REST web service
- Several output formats supported : XML, JSON, PHP, Ruby, Python, Atom, RSS / GeoRSS
- Full text search (based on Lucene / Solr, with default filters optimized for city search : case insensitivity, separator characters stripping, ...) via a Java API or a web service
- Find nearby function (with limits, pagination, restriction to a specific country and / or language, and other useful options) via a Java API or a web service
- An admin / back office with a statistics interface
- Fully replicated / scalable / high performance / cached services
- Search by zip code, name, IATA, ICAO
- Internationalized (with support for Cyrillic, Arabic, Chinese, ... alphabets)
- Dojo widgets / prototype / Ajax to ease search, but it can be used even if JavaScript is not enabled on the client side
- Opensearch module
- Platform / language independent
- Provides all the country flags in SVG and PNG format
Technical information
- Spring Services / Hibernate mapping with its own hibernate dialect
- Lucene Schema
- Uses Postgres / Postgis, but will soon be compliant with other spatial providers (Oracle, MySQL, ...)
- Built using Maven 2
- Automatic synchronization between Postgres and Solr
- High test coverage
- Event - based design
- UTF-8 all over the place
- Gisgraphy has its own maven repository with all the dependencies
- Designed with DSL (Domain Specific Language)
- Designed with the DDD paradigm
What's Gisgraphy for ?
- Find streets via a web service or a Java API
- Do worldwide geocoding and reverse geocoding
- Find places (city, lake, forest, country, stream, castle, airport, and more than 650 different features) via a web service or a Java API
- Find a city by zip code
- Import Geonames features into a strongly typed database with error correction
- Use Spring / Hibernate to manage places
- Find places around a GPS point or another place via a REST API or a Java API
- Use Dojo widgets / prototype / JavaScript to ease search
Some examples of uses
- I want to search for the nearest street from a GPS point
- I want to search for a street in a city
- I want to find a street by name
- I want to know the length and GPS position of a street
- I've got a project in a language other than Java and I want to offer a city fulltext search, and open a popup if more than one result matches
- I've got a Java project and I need Spring DAOs and services / Hibernate mapping for geolocation
- I want to import all the Geonames features for the United States and use them in Python
- Get the zip codes of Washington
- Find all the forests around Paris and their distances, within a radius of 30 km, in Spanish, ordered by distance
- Find all the cities that match 'Paris' and paginate from 1 to 10 with scoring.
- Find all the lakes of Ireland in JSON format and limit the search to 5 results.
- Find the GPS position of London.
- Search 'Saint-André', 'st andre', 'SaInT andré', 'st-andré' : all will return the same instance.
- Searching 'paris texas' will not return the French capital.
- Type 'lutece' or 'paname' and get Paris.
- Locate the results on a Google map or a Yahoo map.
- Search for the elevation of Madrid.
- What is the flag of Argentina ?
- Find the administrative division of New York.
- Search for the nearest city to the GPS point lat=47.5 / long=5.
- Search for all the monuments / restaurants and hotels around a given GPS position.
- Search for the country of a specified GPS point.
- Find the neighboring countries of Brazil.
Requirements
- Gisgraphy needs Java 1.5 or greater
- PostgreSQL with the Postgis extension (it is HIGHLY recommended to have Postgis 1.3.1 or greater for good performance)
- A servlet container if you want to use it as a servlet (not programmatically). Gisgraphy has currently been tested on Tomcat and Jetty, but any servlet container should be OK
What technologies are used ?
- Java
- Maven 2
- Spring
- Hibernate
- PostgreSQL / Postgis
- SolR / Lucene
- Jersey (probably CXF in next version)
- Hibernate Spatial
- Appfuse
- Dojo / prototype (in Version 1.0)
Placetype schema
Here is a simple diagram that represents the relations between the placetypes, GisFeature, Adm, country, and languages.
GisFeature is the parent class of all the placetypes; it holds all the basic information.
Language represents a spoken language.
AlternateName is a specific name for a specific language.
All the placetypes extend GisFeature, such as :
- City
- Country
- Adm (aka : administrative division)
- All the other placetypes
Adms are organized in a tree structure.
A GisFeature :
- Belongs to an Adm
- Belongs to a country
- Has some alternate names for a specific language
License
Gisgraphy is licensed under the terms of the LGPL V3.0 License
Installation
Import Data
To import data, you must use the admin interface. See the admin interface section for more information.
Debug Mode
Each Gisgraphy service (Geoloc and fulltext) is a servlet.
Each servlet can be run in debug mode, in which case error messages are verbose. To do so, set the servlet init parameter "debugMode" to true (in the web.xml) as shown :
<servlet>
<servlet-name>street service</servlet-name>
<servlet-class>
com.gisgraphy.servlet.StreetServlet
</servlet-class>
<init-param>
<!-- if true, the output field error will contain the exception message. Defaults to false -->
<param-name>debugMode</param-name>
<param-value>true</param-value>
</init-param>
<load-on-startup>1</load-on-startup>
</servlet>
<servlet>
<servlet-name>geoloc service</servlet-name>
<servlet-class>
com.gisgraphy.servlet.GeolocServlet
</servlet-class>
<init-param>
<!-- if true, the output field error will contain the exception message. Defaults to false -->
<param-name>debugMode</param-name>
<param-value>true</param-value>
</init-param>
<load-on-startup>1</load-on-startup>
</servlet>
<servlet>
<servlet-name>fulltext service</servlet-name>
<servlet-class>
com.gisgraphy.servlet.FulltextServlet
</servlet-class>
<init-param>
<!-- if true, the output field error will contain the exception message. Defaults to false -->
<param-name>debugMode</param-name>
<param-value>true</param-value>
</init-param>
<load-on-startup>2</load-on-startup>
</servlet>
Options and environment specific settings
All the options and environment specific settings are located in the env.properties file. The env.properties file is located in the $GISGRAPHYDISTRIBUTION/webapps/ROOTWEB-INF/classes directory of the Gisgraphy distribution.
Take care with whitespace in the properties file : MyProperties=bar is not the same as MyProperties= bar (whitespace after the equals sign is taken into account).
importer.geonames.dir
This option determines the directory where the Geonames files are located. It must end with / or \ according to the system path separator. It is also the directory where the Geonames files will be downloaded from the importer.geonames.downloadURL URL. On Windows, the '\' character must be escaped as in the example below. The path can be absolute or relative (from the directory where you launched Gisgraphy). It is not recommended to put spaces in the path. This option is case sensitive if the underlying file system is case sensitive (e.g. Linux / Unix).
Examples on Linux :
importer.geonames.dir=./data/prod/
importer.geonames.dir=/home/user/data/prod/
Example on Windows
importer.geonames.dir=.\\data\\prod\\
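The escaping rule for Windows paths follows from the standard Java properties file format, where a backslash is an escape character. The sketch below is only an illustration, assuming the file is read with java.util.Properties; the class name GeonamesDirCheck and the helper endsWithSeparator are hypothetical and are not part of Gisgraphy.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class GeonamesDirCheck {

    // Parses text written exactly as it would appear in a properties file.
    static Properties load(String fileContent) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileContent));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Hypothetical helper: true when the directory ends with '/' or '\'.
    static boolean endsWithSeparator(String dir) {
        return dir.endsWith("/") || dir.endsWith("\\");
    }

    public static void main(String[] args) {
        // Backslashes are doubled once more in Java source, so this literal
        // stands for the file line: importer.geonames.dir=.\\data\\prod\\
        String fileContent = "importer.geonames.dir=.\\\\data\\\\prod\\\\\n";
        String dir = load(fileContent).getProperty("importer.geonames.dir");
        System.out.println(dir);                    // the value after unescaping
        System.out.println(endsWithSeparator(dir)); // checks the trailing separator
    }
}
```

After loading, each escaped pair of backslashes in the file becomes a single backslash in the value, which is what the file system expects.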
importer.openstreetmap.dir
This option determines the directory where the OpenStreetMap files are located. It must end with / or \ according to the system path separator. It is also the directory where the OpenStreetMap files will be downloaded from the importer.openstreetmap.downloadURL URL. On Windows, the '\' character must be escaped as in the example below. The path can be absolute or relative (from the directory where you launched Gisgraphy). It is not recommended to put spaces in the path. This option is case sensitive if the underlying file system is case sensitive (e.g. Linux / Unix).
Examples on Linux :
importer.openstreetmap.dir=./data/prod/
importer.openstreetmap.dir=/home/user/data/prod/
Example on Windows
importer.openstreetmap.dir=.\\data\\prod\\
importer.geonames.enabled
Whether the importers related to Geonames will be processed. If 'true', the import will be run. The value should be in lower case.
Examples :
importer.geonames.enabled=true
importer.geonames.enabled=false
importer.openstreetmap.enabled
Whether the importers related to OpenStreetMap will be processed. If 'true', the import will be run. The value should be in lower case.
Examples :
importer.openstreetmap.enabled=true
importer.openstreetmap.enabled=false
Don't forget the trailing slash (or backslash if you use Windows) in the directory options !
importer.geonames.downloadURL
This option determines the URL of the server from which the Geonames files to process are downloaded. This option is case sensitive.
Example :
importer.geonames.downloadURL=http://download.geonames.org/export/dump/
Don't forget the ending slash for the URL !
importer.openstreetmap.downloadURL
This option determines the URL of the server from which the OpenStreetMap files to process are downloaded. This option is case sensitive.
Example :
importer.openstreetmap.downloadURL=http://download.gisgraphy.com/openstreetmap/
Don't forget the ending slash for the URL !
importer.geonamesfilesToDownload
This option determines the files to be downloaded from the importer.geonames.downloadURL URL. The files must be ';' separated. You can specify any files, but if you download from the Geonames server, you should specify country ZIP files (or allCountries.zip) and alternateNames.zip. This option is case sensitive.
Examples :
importer.geonamesfilesToDownload=AD.zip;CY.zip
importer.geonamesfilesToDownload=allCountries.zip
If you run an import, then change this option and re-run the import, you must delete the previously downloaded files first. If you don't, the files you downloaded earlier will be processed.
This option has been renamed from importer.filesToDownload to importer.geonamesfilesToDownload so that a different list can be specified for OpenStreetMap (importer.openstreetmapfilesToDownload).
It is not necessary to download countryInfo.txt and iso-languagecodes.txt, because they are already in the Gisgraphy distribution. Just focus on the countries you want to process, and on alternateNames.zip if you want to import alternate names.
If the allCountries.txt file is in the importer.geonames.dir directory, the other country files will be ignored.
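As an illustration of why the ';' separator and the trailing slash of the download URL matter, here is a minimal sketch that turns the two option values into a list of URLs. It assumes that each file name is simply appended to the value of importer.geonames.downloadURL; this is an assumption about the behaviour, not a description of the actual importer code, and the class FilesToDownloadDemo is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class FilesToDownloadDemo {

    // Splits the ';' separated option and appends each file name to the base URL.
    // Assumption: plain concatenation, hence the need for a trailing slash.
    static List<String> downloadUrls(String baseUrl, String filesOption) {
        List<String> urls = new ArrayList<>();
        for (String file : filesOption.split(";")) {
            if (!file.isEmpty()) {
                urls.add(baseUrl + file);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        String baseUrl = "http://download.geonames.org/export/dump/";
        for (String url : downloadUrls(baseUrl, "AD.zip;CY.zip")) {
            System.out.println(url);
        }
    }
}
```

Under that assumption, a base URL without its trailing slash would produce a wrong address such as .../dumpAD.zip.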
importer.openstreetmapfilesToDownload
This option determines the files to be downloaded from the importer.openstreetmap.downloadURL URL. The files must be ';' separated. You can specify any files, but if you download from the Gisgraphy server, you should specify country ZIP files (or allcountries.zip) and alternateNames.zip. This option is case sensitive.
Examples :
importer.openstreetmapfilesToDownload=AD.zip;CY.zip
importer.openstreetmapfilesToDownload=allCountries.zip
importer.retrieveFiles
Whether the files defined by the importer.filesToDownload option should be downloaded. If 'true', the importer will download the files according to the importer.filesToDownload option. If 'false', it will use the files already present in importer.geonames.dir. The value should be in lower case.
Examples :
importer.retrieveFiles=true
importer.retrieveFiles=false
fulltextSearchUrl
The URL of the SolR server. If you use the SolR server of the Gisgraphy distribution, the URL should be the Gisgraphy URL followed by solr (the name of the war file). If you need better performance (that is to say, running Gisgraphy and the SolR server in two distinct JVMs, see JVM optimisation), specify the URL of the server you want to use. This option is case sensitive.
Examples :
fulltextSearchUrl=http://localhost:8080/solr/
Example with the default SolR port
fulltextSearchUrl=http://localhost:8983/solr/
importerConfig.wrongNumberOfFieldsThrows
Whether an exception should be thrown and the import stopped if a line in an imported file does not have the expected number of (CSV) fields. The value should be in lower case. It is recommended to set it to false for a standard import. If you use Gisgraphy for error correction, set it to true.
Even if you want to do error correction and set it to false, you can see the reported errors in the log files and fix them at the end of the import. It can be easier to fix all the errors at once.
Examples :
importerConfig.wrongNumberOfFieldsThrows=true
importerConfig.wrongNumberOfFieldsThrows=false
importerConfig.missingRequiredFieldThrows
Whether an exception should be thrown and the import stopped if a required field is missing. The value should be in lower case. It is recommended to set it to false for a standard import. If you use Gisgraphy for error correction, set it to true.
Even if you want to do error correction and set it to false, you can see the reported errors in the log files and fix them at the end of the import. It can be easier to fix all the errors at once.
Examples :
importerConfig.missingRequiredFieldThrows=true
importerConfig.missingRequiredFieldThrows=false
importerConfig.acceptRegExString
List of regular expressions, separated by ';', that determine the feature classes / codes to be imported. If the value is not specified, the default is .* (all the feature classes / codes).
The regular expressions must match featureClass.featureCode. The GisFeatures that match the "A.ADM." (administrative divisions) and "A.PCL." (countries) regular expressions are automatically imported. This option is case sensitive and should be set in upper case, because feature classes / codes are in upper case.
Examples :
.* : import all gisfeatures, no matter their feature class / code
P[.]PPL[A-Z&&[^QW]];P[.]PPL$;P[.]STLMT$ : import Israeli settlements and all the cities except destroyed and abandoned city
V.FRST. : import all the forests
P[.]PPL[A-Z&&[^QW]];P[.]PPL$;P[.]STLMT$;V.FRST. : import Israeli settlements and all the cities except destroyed and abandoned city, and the forests
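The city example above relies on the Java character class intersection [A-Z&&[^QW]] (any upper case letter except Q and W). The sketch below shows how such a ';' separated list behaves on a few Geonames codes. It is an illustration only: it assumes that a feature is accepted when at least one expression is found in the "featureClass.featureCode" string (Matcher.find()), which may differ from the matching strategy Gisgraphy really uses, and the class AcceptRegexDemo is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class AcceptRegexDemo {

    // Compiles each ';' separated regular expression of the option value.
    static List<Pattern> compile(String option) {
        List<Pattern> patterns = new ArrayList<>();
        for (String regex : option.split(";")) {
            patterns.add(Pattern.compile(regex));
        }
        return patterns;
    }

    // Assumption: a feature is accepted if any expression is found in
    // the "featureClass.featureCode" string.
    static boolean accepted(List<Pattern> patterns, String classAndCode) {
        for (Pattern pattern : patterns) {
            if (pattern.matcher(classAndCode).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Pattern> cities = compile("P[.]PPL[A-Z&&[^QW]];P[.]PPL$;P[.]STLMT$");
        String[] samples = {"P.PPL", "P.PPLA", "P.PPLQ", "P.PPLW", "P.STLMT", "V.FRST"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + accepted(cities, sample));
        }
    }
}
```

With this list, P.PPL, P.PPLA and P.STLMT are accepted, while P.PPLQ (abandoned), P.PPLW (destroyed) and V.FRST (forest) are rejected.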
importerConfig.tryToDetectAdmIfNotFound
If this option is set to true, the importer will try to detect the Adm of a feature when the AdmXcode values do not correspond to a known Adm. Setting this option to true activates error correction. If it is set to false, error correction is disabled and, if no Adm is found for the AdmXcodes, the feature will be linked to a null Adm.
This option is case sensitive and must be set in lower case.
Example : there is an Adm of level 2 with adm1Code = 'A1' and adm2Code = 'B2' in the datastore. Suppose there is a GisFeature with adm1Code = 'A3' and adm2Code = 'B2'. Gisgraphy will detect an error because there is no Adm with those codes. If this option is set to true, Gisgraphy will correct the error and link the feature to the Adm with codes adm1Code = 'A1' and adm2Code = 'B2'. If this option is set to false, Gisgraphy won't try to correct the error; it will put a warning message in the logs and link the feature to a null Adm.
Examples :
importerConfig.tryToDetectAdmIfNotFound=true
importerConfig.tryToDetectAdmIfNotFound=false
importerConfig.syncAdmCodesWithLinkedAdmOnes
This option is a little difficult to understand, and an example is often clearer than a long explanation. First, there are a few things to know. A feature has the following properties :
FeatureId......Adm...Adm1Code...Adm2Code...Adm3Code...Adm4Code...adm1Name...Adm2Name...adm3Name...Adm4Name...
An Adm is a feature too and has the same properties.
So a feature is linked to an administrative division (aka Adm). For performance reasons, the codes and names of the Adms are also stored in the feature itself.
Now consider the example given for importerConfig.tryToDetectAdmIfNotFound : if there is an error, the codes of the linked Adm will not be the same as the codes in the CSV files. This option lets you choose between two strategies :
- If importerConfig.syncAdmCodesWithLinkedAdmOnes is set to 'true', the admXcodes and admXnames of the feature will be the same as those of the linked Adm (in our example : if importerConfig.tryToDetectAdmIfNotFound is set to 'true', the adm1Code will be 'A1' and the adm2Code will be 'B2', that is to say the same as the Adm ones)
- If importerConfig.syncAdmCodesWithLinkedAdmOnes is set to 'false', the admXcodes and admXnames of the feature will be the same as in the CSV file (in our example : if importerConfig.tryToDetectAdmIfNotFound is set to 'true', the adm1Code will be 'A3' and the adm2Code will be 'B2', and the linked Adm will be null)
In other words, if you want the importer to set the admXcodes and admXnames from the CSV file, set this option to false. If you want those codes to be the same as those of the linked Adm, set it to true.
If you don't know what to do, set it to the recommended value : true. This option is case sensitive and must be set in lower case.
importerConfig.tryToDetectAdmIfNotFound and importerConfig.syncAdmCodesWithLinkedAdmOnes are orthogonal concepts.
importerConfig.admXExtracterStrategyIfAlreadyExists
In order to import the Adms before the other features, Gisgraphy extracts the Adm1, Adm2, Adm3 and Adm4 files. This option tells what to do if an AdmX file (determined by the importerConfig.admXFileName option) is already present in the importer.geonames.dir directory. This option is case sensitive. Three values are available :
- skip : the extraction will be skipped, and the file already present will be used
- backup : the file already present will be backed up (with the current date and time), and a new file will be extracted and used
- reprocess : the file will be replaced by the new one
Examples :
importerConfig.adm3ExtracterStrategyIfAlreadyExists=reprocess
importerConfig.adm4ExtracterStrategyIfAlreadyExists=skip
importerConfig.adm1FileName
Specifies the filename of the CSV file with the administrative divisions of level 1. Should normally be 'admin1Codes.txt'. This option is case sensitive if the underlying file system is case sensitive.
importerConfig.adm2FileName
Specifies the filename of the CSV file with the administrative divisions of level 2. Should normally be 'admin2Codes.txt'. This option is case sensitive if the underlying file system is case sensitive.
importerConfig.adm3FileName
Specifies the filename of the CSV file with the administrative divisions of level 3. Should normally be 'admin3Codes.txt'. This file name will be used to extract the Adms of level 3. This option is case sensitive if the underlying file system is case sensitive.
importerConfig.adm4FileName
Specifies the filename of the CSV file with the administrative divisions of level 4. Should normally be 'admin4Codes.txt'. This file name will be used to extract the Adms of level 4. This option is case sensitive if the underlying file system is case sensitive.
importerConfig.languageFileName
Specifies the filename of the CSV file with the languages. Should normally be 'iso-languagecodes.txt'. This option is case sensitive if the underlying file system is case sensitive.
importerConfig.countriesInfosFileName
Specifies the filename of the CSV file with the country information. Should normally be 'countryInfo.txt'. This option is case sensitive if the underlying file system is case sensitive. This option is not the list of countries to import.
To be clearer, the option importerConfig.countriesFileName has been renamed to importerConfig.countriesInfosFileName (version >= 2.0 beta2).
importerConfig.alternateNamesFileName
Specifies the filename of the CSV file with the alternate names. Should normally be 'alternateNames.txt'. This option is case sensitive if the underlying file system is case sensitive.
importerConfig.alternateNameFeaturesFileName
Specifies the name of the file into which the alternate names of features that are not adm1, adm2, or country are extracted. Should normally be 'alternateNames-features.txt'. This option is case sensitive if the underlying file system is case sensitive.
importerConfig.alternateNameAdm1FileName
Specifies the name of the file into which the alternate names of the Adms of level 1 are extracted. Should normally be 'alternateNames-adm1.txt'. This option is case sensitive if the underlying file system is case sensitive.
importerConfig.alternateNameAdm2FileName
Specifies the name of the file into which the alternate names of the Adms of level 2 are extracted. Should normally be 'alternateNames-adm2.txt'. This option is case sensitive if the underlying file system is case sensitive.
importerConfig.alternateNameCountryFileName
Specifies the name of the file where the alternate names of countries are stored. Should normally be 'alternateNames-country.txt'. This option is case sensitive if the underlying file system is case sensitive.
importerConfig.importGisFeatureEmbededAlternateNames
Some of the alternate names are provided in each country dump file, while all the alternate names, with languages and additional information, are in a separate file. The alternate names provided in the country files are incomplete. If you want to import the alternate names from the country files (faster, but a lot of information is lost), set this option to true; in that case the importerConfig.alternateNamesFileName option will be ignored. If you want a full import using the separate alternate names file, set this option to false. This option is case sensitive and must be set in lower case.
Examples :
importerConfig.importGisFeatureEmbededAlternateNames=true
importerConfig.importGisFeatureEmbededAlternateNames=false
fulltextsearch.maxConnectionsPerHost
Limits the number of connections to the SolR server per host. Recommended : 32.
fulltextsearch.maxTotalConnections
Limits the total number of connections to the SolR server for all hosts. Recommended : 128.
geolocsearch.defaultGeolocSearchPlaceType
Defines the default placetype class for the geoloc query. With an empty or wrong value, the search will cover any placetype. This option is case sensitive, and the placetype must be in the entity package.
Examples :
geolocsearch.defaultGeolocSearchPlaceType=City
geolocsearch.defaultGeolocSearchPlaceType=
The name of the class should not end with '.class', and it is case sensitive.
spellchecker.enabled
Enables or disables the spellchecker for the fulltext search engine. 'true' and 'false' are the possible values. This option is case sensitive.
Examples :
spellchecker.enabled=true
spellchecker.enabled=false
Spellchecking is only available from gisgraphy 1.1.
spellchecker.activeByDefault
Defines the default value used when the spellchecking parameter is not set. 'true' and 'false' are the possible values. This option is case sensitive.
Examples :
spellchecker.activeByDefault=true
spellchecker.activeByDefault=false
spellchecker.spellcheckerDictionaryName
The name of the SolR spellchecker to use. It must match the name in the solrconfig.xml file. This option is case sensitive, and the name must be set in lower case both in the solrconfig.xml file and in env.properties. By default two spellcheckers are defined : 'levenstein' and 'jarowinkler'. In practice jarowinkler gives better results.
Examples :
spellchecker.spellcheckerDictionaryName=jarowinkler
spellchecker.spellcheckerDictionaryName=levenstein
spellchecker.collateResults
For a request with several words, returns a string with the best suggestion for each word. For instance, for 'pariss frence', 'Paris France' will be suggested. This option is case sensitive. 'true' and 'false' are the possible values.
Examples :
spellchecker.collateResults=true
spellchecker.collateResults=false
googleMapAPIKey
The Google Maps API key. It is required if you want to use the Google Maps features (to see the result of a geocoding search, for example). See the Google Maps page to sign up.
Examples :
googleMapAPIKey=ABQIAAAAC0kUg2SfDYBO-AEagcTgvhQ5aXWj7Kef4ih_G0qG0UGxHdmrpFrmSD7sGMwTJIN1g7C45waZ5ybiQ
googleanalytics.uacctcode
The google analytics code to have statistics (see more on the Google analytics page)
Examples :
googleMapAPIKey=ABQIAAAAC0GGGDYBO-AEagcTgvhQ5aXWj7Kef4ih_G0qG0UGxHdmrpFrmSD7sGMwTJIN1g7C45waZ5ybiQ
Import failure
When an error occurs during an import, you have to :
- Find and repair the error
- Reset the database (a script is provided in the sql directory), because some data are already in the database and you would get duplicate key / constraint exceptions
- Restart the web application, in order to flush the cache and reset the configuration : importers keep track of what has been imported, so you must restart the web application to clear that information
- Re-run the import
Never re-run an import before cleaning the database and restarting the web application : it will fail !
street / geocoding service
Description
The street / geocoding service allows you to search for streets around a location on Earth.
You can :
- Specify a GPS position
- Give the beginning of the name of the street
- Limit the results to a specific type (e.g. pedestrian, highway, residential, ... 25 types are available)
- Limit the results to a specified radius
- Limit the results to one-way streets
- Paginate the results
- Specify whether you want the output to be indented (currently this applies only to XML, not JSON, for performance reasons; this may change in a future version)
The service is designed to allow search with autosuggestion and autocompletion.
Parameters
- The latitude (required) : the north-south coordinate of the location point to search around. The value is a float and can be negative. It uses GPS coordinates.
- The longitude (required) : the east-west coordinate of the location point to search around. The value is a float and can be negative. It uses GPS coordinates.
- The beginning of the name of the street (optional) : limits the search to the streets whose name starts with the specified name (WITHOUT STREET NUMBER). You must give the name and the type, for example 'Boulevard Paul'. The search is case insensitive. Default : search for all streets.
- The search mode (optional, defaults to FULLTEXT) : two modes are available to search for a street name :
  - CONTAINS : the name of the street should contain the text given in the request. This mode is used for autocompletion. The position of the words matters.
  - FULLTEXT (default value) : the name of the street should contain the words given in the request. This mode is used for autosuggestion and fulltext search. The position of the words does not matter.
  In both modes the given name is case insensitive and accent insensitive.
  Example : we'd like to find "Champs'-Elysées" :
  - With the CONTAINS mode : 'amps el' will match, 'elysees c' won't match, 'elysees champs' won't match, 'champs elysees' will match
  - With the FULLTEXT mode : 'amps el' won't match, 'elysees c' won't match, 'elysees champs' will match, 'champs elysees' will match
- The radius (optional) : the distance in meters from the location point to search around. The value is a number > 0. If it is not specified or incorrect, the default value is used (10 km).
- One-way (optional) : limits the search to one-way streets.
- Start pagination index (optional) : the first pagination index, numbered from 1. If the number is < 1 or not specified, it is set to the default value : 1.
- End pagination index (optional) : the last pagination index. If it is < 1 or not specified, it is set to the start index + 50.
- The output format (optional) : the format of the results (for example XML or JSON).
- Distance field : whether or not the distance field should be filled. Disabling it is useful to improve performance when the distance does not matter (e.g. when searching by name). Of course, the results then won't be sorted by distance.
- The indentation (optional) : indents the results. Defaults to false. The possible values are true or false (or "on" when used with the REST service).
If you use a checkbox in a form to indent the results, the value will be "on" or "off"; so, for simple use, the value of indent for the web service can be "true" or "on".
Currently, only XML can be indented. This is not a bug : indenting JSON is possible but decreases performance. If it is a critical need, you can see an example of how to indent JSON in the source code of streetSearchEngine.
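To make the difference between the two search modes concrete, the sketch below models them in plain Java and reproduces the "Champs'-Elysées" examples given in the parameter list. It is only an illustrative model, assuming that CONTAINS behaves like a substring test and FULLTEXT like a whole-word test on a normalized name; it is not the implementation used by Gisgraphy, and the class StreetNameModesDemo and its methods are hypothetical.

```java
import java.text.Normalizer;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class StreetNameModesDemo {

    // Lower-cases, strips accents, and replaces punctuation with single spaces.
    static String normalize(String text) {
        String noAccents = Normalizer.normalize(text, Normalizer.Form.NFD)
                .replaceAll("\\p{M}", "");
        return noAccents.toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", " ")
                .trim();
    }

    // CONTAINS model: the query must appear as-is, so word order matters.
    static boolean containsMode(String streetName, String query) {
        return normalize(streetName).contains(normalize(query));
    }

    // FULLTEXT model: every query word must be a whole word of the name, in any order.
    static boolean fulltextMode(String streetName, String query) {
        Set<String> words = new HashSet<>(Arrays.asList(normalize(streetName).split(" ")));
        for (String word : normalize(query).split(" ")) {
            if (!words.contains(word)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String street = "Champs'-Elys\u00e9es";
        String[] queries = {"amps el", "elysees c", "elysees champs", "champs elysees"};
        for (String query : queries) {
            System.out.println("'" + query + "' CONTAINS=" + containsMode(street, query)
                    + " FULLTEXT=" + fulltextMode(street, query));
        }
    }
}
```

In this model a partial word such as 'amps' can only match in CONTAINS mode, while a reordered query such as 'elysees champs' can only match in FULLTEXT mode.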
Web service
The street / geocoding web service uses a servlet to wrap the Java API. It maps web parameters to a street query and outputs the results via HTTP.
All the parameters should be case insensitive. If you have any problem with case, please report a bug.
All the parameters should be encoded in UTF-8 and the URL MUST be encoded.
Here is a summary of the Web parameters mapping :
Parameter description | Web parameter name |
Latitude | lat |
Longitude | lng |
name of the street | name |
search mode | mode |
one way street | oneway |
Type of the street | streettype |
Radius | radius |
Start pagination index | from |
End pagination index | to |
Output format | format |
Distance field | distance |
Indent | indent |
Examples :
http://localhost:8080/street/streetsearch?lat=4.5&lng=5.7&radius=5000&from=1&to=10&format=xml&name=strip&mode=fulltext&indent=true
http://localhost:8080/street/streetsearch?lat=4.5&lng=5.7
Currently, the web service limits the number of results to 50, but this can be changed (at compilation time).
By default the street service is mapped to the /street pattern, but you can change it in the WEB-INF/web.xml file.
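Since the parameters must be UTF-8 encoded and the URL must be encoded, a client can build the query string as in the sketch below. It uses only the standard library (URLEncoder.encode with a Charset requires Java 10 or later) and the parameter names from the mapping table above; the base URL is the one from the examples, the class StreetSearchUrlDemo is hypothetical, and the coordinates are arbitrary sample values.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class StreetSearchUrlDemo {

    // Appends the parameters to the base URL, URL-encoding names and values in UTF-8.
    static String buildUrl(String baseUrl, Map<String, String> params) {
        StringBuilder url = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8))
               .append('=')
               .append(URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("lat", "48.87");
        params.put("lng", "2.3");
        params.put("name", "Champs Elys\u00e9es");
        params.put("mode", "fulltext");
        params.put("format", "xml");
        System.out.println(buildUrl("http://localhost:8080/street/streetsearch", params));
    }
}
```

The accented character and the space in the street name are the parts that need encoding; the numeric parameters pass through unchanged.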
street type
Here is the list of types a street can have :
- BYWAY
- MINOR
- SECONDARY_LINK
- CONSTRUCTION
- UNSURFACED
- BRIDLEWAY
- PRIMARY_LINK
- LIVING_STREET
- TRUNK_LINK
- STEPS
- PATH
- ROAD
- PEDESTRIAN
- TRUNK
- MOTORWAY
- CYCLEWAY
- MOTORWAY_LINK
- PRIMARY
- FOOTWAY
- TERTIARY
- SECONDARY
- TRACK
- UNCLASSIFIED
- SERVICE
- RESIDENTIAL
Output fields description
Here is a description of all the output fields :
Field | Description | Applicable for |
error | A String only present if an error occurred (e.g : empty latitude or longitude) | When an error occurred |
numFound | The number of results displayed for this query (it takes the pagination into account) | |
QTime | The execution time of the query in ms | |
Query | The name of the street that has been searched (aka : name) | |
distance | The distance, in meters, between the point and the nearest point of the street | |
Name | The name of the feature | |
gid | Unique id of the street | |
streetType | The type of the street (see street type list) | |
oneWay | Whether the street is a one way street or not | |
lat | The latitude of the middle of the street (north-south) | |
lng | The longitude of the middle of the street (east-west) | |
countryCode | The ISO 3166 country code | |
Some fields were not available in older versions of Gisgraphy. Please see
old versions
Java API
Here is an example of a street search query built with the Java API :
// Point to search around
Point point = GeolocHelper.createPoint(-3.5F, 45F);
// Results 1 to 10
Pagination pagination = paginate().from(1).to(10);
// Indented XML output
Output output = Output.withFormat(OutputFormat.XML)
        .withIndentation();
// Pedestrian streets within a radius of 100000 meters, name "Avenue des c"
StreetSearchQuery streetQuery = new StreetSearchQuery(point, 100000,
        pagination, output, StreetType.PEDESTRIAN, false, "Avenue des c");
String result = geolocSearchEngine.executeQueryToString(streetQuery);
The methods are designed as a DSL (Domain Specific Language) and can be chained, as in the example above.
You can output results to an OutputStream (useful for servlet use) or a String.
The API is thread safe.
It is possible to create a query directly from an HTTP servlet request.
Full text service
Description
The full text service allows you to search for features / places.
You can :
- Specify one or more words
- Search for text or zip code
- Limit the results to a specific
- Paginate the results
- Specify the output verbosity
- Specify whether the output should be indented
The search is case insensitive, uses synonyms (Saint/st, ...), strips separator characters, ...
Parameters
- The searched text (required) : The text for the query. It can be a zip code or one or more words.
Examples :
- Start pagination index (optional) : The first pagination index. Numbered from 1. If the number is < 1 or not specified, it will be set to the default value : 1.
- End pagination Index (optional) : The last pagination index. if < 1 or not specified, it will be set to startindex + 10.
- The output format (optional) : The formats available are :
- The language code (optional) : The ISO 639 alpha2 or alpha3 language code.
Some properties, such as the AlternateName, AdmNames and countryname, belong to a certain language code. The language parameter can limit the output of those fields to a certain language (it only applies to the FULL style) :
- If the language code does not exist or is not specified, the properties for all the languages are retrieved
- If it exists, only the properties with the specified language code are retrieved
Use the alpha2 code when possible; only use the alpha3 code when no alpha2 code exists for the language.
- The output style verbosity (optional) : Determines the output verbosity. 4 styles are available :
- Short : feature_id, name, fully_qualified_name, zipcode (if city or city subdivision), placetype, country_code, country_name
- Medium (default) : Short + lat, lon, feature_class, feature_code, population, fips,
- Medium (if country) continent, currency_code, currency_name, fips_code, isoalpha2_country_code, isoalpha3_country_code, postal_code_mask, postal_code_regex,
phone_prefix, spoken_languages, tld, capital_name, area
- Medium (adm only) level
- Long : Medium + adm1_name, adm2_name, adm3_name, Adm4_name, adm1_code, Adm2_code, Adm3_code, Adm4_code
- Full : Long + alternateNames, country_alternate_name, adm1_alternate_name, adm2_alternate_name. If the language parameter is specified, only the alternate names with the specified language are retrieved; otherwise, all the alternate names for all the languages are retrieved
For a full list and description of the output fields : see
below.
- The placetype (optional) : limits the search to the specified place type. A place type groups some feature classes and feature codes. You need to specify the class corresponding to the place type you want to search. Default : search for all features
See a full list and explanation of placetype :
here.
- The country code (optional) : limits the search to the specified ISO 3166 country code. Default : search in all countries
- The indentation (optional) : indents the results. Defaults to false. The possible values are true|false (or "on" when used with the REST service, see more). GeoRSS and Atom won't be indented, for performance reasons
- The spellchecking (optional) : whether some suggestions should be provided if no results are found. The default value is the value of the spellchecker.activeByDefault option (see more).
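The pagination defaults described in the parameter list above can be sketched as follows. This is only an illustration of the documented rule (start index < 1 or missing becomes 1, end index < 1 or missing becomes start index + 10); the class PaginationDefaults is hypothetical and is not the Gisgraphy Pagination class. The case where the end index is lower than the start index is not covered by the description and is left untouched here.

```java
/** Hypothetical illustration of the documented pagination defaults. */
final class PaginationDefaults {

    static final int DEFAULT_FROM = 1;   // pagination is numbered from 1
    static final int DEFAULT_RANGE = 10; // end index defaults to start index + 10

    /**
     * Returns {from, to} after applying the defaults.
     * A null argument stands for "not specified".
     */
    static int[] normalize(Integer from, Integer to) {
        int start = (from == null || from < 1) ? DEFAULT_FROM : from;
        int end = (to == null || to < 1) ? start + DEFAULT_RANGE : to;
        return new int[] {start, end};
    }

    public static void main(String[] args) {
        int[] range = normalize(null, null);
        System.out.println(range[0] + " to " + range[1]);
    }
}
```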
Web service
The full text web service uses a servlet to wrap the Java API. It maps web parameters to a fulltext query and outputs the results via HTTP.
All the parameters should be case insensitive. If you have problems with case, please report a bug.
All the parameters should be encoded in UTF-8 and the URL MUST be encoded.
Here is a summary of the Web parameters mapping :
Parameter name | Web parameter name |
The searched text | q |
Start pagination index | from |
End pagination index | to |
Output format | format |
Language code | lang |
Output style verbosity | style |
placetype | placetype |
country code | country |
indent | indent |
spellchecking | spellchecking |
If you use a checkbox in a form to indent the results, the value will be "on" or "off", so for a simple use : the value of indent, for the fulltext web service can be "true" or "on".
Examples :
http://localhost:8080/fulltext/fulltextsearch?q=paris&from=1&to=10&format=xml&lang=fr&style=short&placetype=city&country=fr&indent=true
http://localhost:8080/fulltext/fulltextsearch?q=paris
Currently, the web service limits the number of results to 10.
By default the fulltext service is mapped to the /fulltext pattern, but you can change it in WEB-INF/web.xml
Output fields description
Here is a description of all the output fields :
Field | Description | Available from style |
error | A String only present if an error occurred (e.g : empty query). The field 'error' appears in the path response/responseHeader/error | ERROR |
feature_id | A unique id that identifies the feature | SHORT |
Name | The name of the feature | SHORT |
fully_qualified_name | A name of the form : (adm1Name and adm2Name are printed) Paris, Département de Ville-De-Paris, Ile-De-France, (FR) | SHORT |
placetype | The place Type of the Feature | SHORT |
country_code | The ISO 3166 country code | SHORT |
country_name | The name of the country the feature belongs to | SHORT |
zipcode | The zipcodes | SHORT |
google_map_url | The URL to get the location on Google Map | MEDIUM |
yahoo_map_url | The URL to get the location on Yahoo Map | MEDIUM |
country_flag_url | The relative URL to get the country flag image | MEDIUM |
feature_class | The feature Class. More... | MEDIUM |
feature_code | The feature Code. More... | MEDIUM |
population | How many people live in this feature | MEDIUM |
elevation | Elevation in meters | MEDIUM |
name_ascii | The ascii name | MEDIUM |
timezone | The timezone (e.g :Europe/Paris) | MEDIUM |
gtopo30 | Average elevation of 30'x30' (ca 900mx900m) area in meters | MEDIUM |
lat | The latitude (north-south) | MEDIUM |
lng | The longitude (east-West) | MEDIUM |
continent | The continent the country belongs to (only for country placetype) | MEDIUM |
currency_code | The ISO 4217 currency code (only for country placetype) | MEDIUM |
currency_name | The name of the currency of the country (only for country placetype) | MEDIUM |
fips_code | The FIPS Code of the country (only for country placetype) | MEDIUM |
isoalpha2_country_code | The ISO 3166 alpha 2 code of the country (only for country placetype) | MEDIUM |
isoalpha3_country_code | The ISO 3166 alpha 3 code of the country (only for country placetype) | MEDIUM |
postal_code_mask | The mask that postal codes should verify. e.g : ##### (only for country placetype) | MEDIUM |
postal_code_regex | The regular expression that postal codes should verify (only for country placetype) | MEDIUM |
phone_prefix | The phone prefix of the country. e.g : +33 (only for country placetype) | MEDIUM |
spoken_languages | List of languages spoken in the country (only for country placetype) | MEDIUM |
tld | Top level domain of the country (only for country placetype) | MEDIUM |
capital_name | Name of the capital of the country (only for country placetype) | MEDIUM |
area | Area of the country in m² (only for country placetype) | MEDIUM |
level | Level of the Adm : 1, 2, 3, or 4 (only for Adm placetype) | MEDIUM |
adm1_code | The internal code for the administrative division of level 1 | LONG |
adm2_code | The internal code for the administrative division of level 2 | LONG |
adm3_code | The internal code for the administrative division of level 3 | LONG |
adm4_code | The internal code for the administrative division of level 4 | LONG |
adm1_name | The name of the administrative division of level 1 | LONG |
adm2_name | The name of the administrative division of level 2 | LONG |
adm3_name | The name of the administrative division of level 3 | LONG |
adm4_name | The name of the administrative division of level 4 | LONG |
name_alternate | The alternate names of the feature without a specific language code | LONG |
name_alternate_languagecode | The alternate names of the feature for this language code | LONG |
adm1_name_alternate | The alternate names of the administrative division of level 1 without a specific language code | FULL |
adm1_name_alternate_languagecode | The alternate names of the administrative division of level 1 for this language code | FULL |
adm2_name_alternate | The alternate names of the administrative division of level 2 without a specific language code | FULL |
adm2_name_alternate_languagecode | The alternate names of the administrative division of level 2 for this language code | FULL |
adm3_name_alternate | The alternate names of the administrative division of level 3 without a specific language code | FULL |
adm3_name_alternate_languagecode | The alternate names of the administrative division of level 3 for this language code | FULL |
adm4_name_alternate | The alternate names of the administrative division of level 4 without a specific language code | FULL |
adm4_name_alternate_languagecode | The alternate names of the administrative division of level 4 for this language code | FULL |
country_name_alternate | The alternate names of the country without a specific language code | FULL |
country_name_alternate_languagecode | The alternate names of the country for this language code | FULL |
Some fields were not available in older versions of Gisgraphy. Please see
old versions
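The "Available from style" column can be read as a threshold: a field is returned when the requested style is at least as verbose as the style it is available from (Short, then Medium, then Long, then Full). Here is a minimal sketch of that reading. The enum Verbosity and the class StyleAvailability are hypothetical names for this example, not the Gisgraphy OutputStyle class, and only a few rows of the table are copied. Fields restricted to a placetype (country, Adm) need an extra check that is not shown.

```java
import java.util.Map;

/** Hypothetical illustration of the "Available from style" column. */
final class StyleAvailability {

    /** Styles in increasing order of verbosity, as listed in the style description. */
    enum Verbosity { SHORT, MEDIUM, LONG, FULL }

    // A few rows copied from the output fields table: field name -> first style that includes it.
    static final Map<String, Verbosity> AVAILABLE_FROM = Map.of(
            "feature_id", Verbosity.SHORT,
            "lat", Verbosity.MEDIUM,
            "adm1_name", Verbosity.LONG,
            "country_name_alternate", Verbosity.FULL);

    /** True when the field is known and the requested style is verbose enough to include it. */
    static boolean isReturned(String field, Verbosity requested) {
        Verbosity first = AVAILABLE_FROM.get(field);
        return first != null && requested.compareTo(first) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(isReturned("lat", Verbosity.SHORT));
        System.out.println(isReturned("lat", Verbosity.LONG));
    }
}
```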
Java API
Here is an example of a fulltext query built with the Java API :
// Results 1 to 10
Pagination pagination = paginate().from(1).to(10);
// Indented XML output, SHORT style, language code FR
Output output = Output.withFormat(OutputFormat.XML)
        .withLanguageCode("FR").withStyle(OutputStyle.SHORT)
        .withIndentation();
// Search for "Paris Texas", limited to cities in the US
FulltextQuery fulltextQuery = new FulltextQuery("Paris Texas",
        pagination, output, City.class, "US");
String result = fullTextSearchEngine.executeQueryToString(fulltextQuery);
You can output results to an OutputStream (useful for servlet use) or a String.
The API is thread safe.
It is possible to create a query directly from an HTTP servlet request.
The methods are designed as a DSL (Domain Specific Language) and can be chained, as in the example above.
Geolocalisation service
Description
The geolocalisation service allows you to search for features around a location on Earth.
You can :
- Specify a GPS position
- Limit the results to a specific place type (e.g : search all monuments around a point)
- Limit the results to a specified radius
- Paginate the results
- Specify whether the output should be indented (currently, this only applies to XML, not JSON, for performance reasons; this may change in a future version)
Parameters
- The latitude (required) (north-south) for the location point to search around. The value is a float and can be negative. It uses GPS coordinates.
Examples :
- The longitude (required) (east-west) for the location point to search around. The value is a float and can be negative. It uses GPS coordinates.
Examples :
- The radius (optional) : distance from the location point, in meters, to search around. The value is a number > 0. If it is not specified or incorrect, the default value is used (10 km).
Examples :
- Start pagination index (optional) : The first pagination index. Numbered from 1. If the number is < 1 or not specified, it will be set to the default value : 1.
- End pagination Index (optional) : The last pagination index. if < 1 or not specified, it will be set to startindex + 10.
- The output format (optional) : The formats available are :
- The placetype (optional) : limits the search to the specified place type. A place type groups some feature classes and feature codes. You need to specify the class corresponding to the place type you want to search. Default : search for all features
For performance reasons, it is highly recommended to specify a placetype (if you use a web GUI, make the placetype a required parameter)
See a full list and explanation of placetype :
here.
- Distance field : Whether (or not) the distance field should be filled. Turning it off is useful to improve performance when you don't care about the distance. Of course, the results then won't be sorted by distance
If you use a checkbox in a form to indent the results, the value will be "on" or "off", so for a simple use : the value for the web service can be "true" or "on".
- The indentation (optional) : indents the results. Default to false. the possible values are true|false (or "on" when used with the rest service see
If you use a checkbox in a form to indent the results, the value will be "on" or "off", so for a simple use : the value of indent, for the geoloc web service can be "true" or "on".
Currently, only XML can be indented. This is not a bug : indenting JSON is possible but decreases performance. If you really need it, the source code of GeolocSearchEngine contains an example of how to indent JSON.
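The radius rule from the parameter list above (a number > 0 in meters, otherwise the 10 km default) can be sketched on the client side as follows. The class GeolocParams is hypothetical and does not reproduce the Gisgraphy implementation. The latitude and longitude bounds are the usual GPS ranges, which this guide does not state explicitly, so treat them as an assumption.

```java
/** Hypothetical client-side checks for the geolocalisation parameters. */
final class GeolocParams {

    static final double DEFAULT_RADIUS_METERS = 10_000; // 10 km, per the radius description

    /** Parses the radius; falls back to the default when missing, not a number, or not > 0. */
    static double radiusOrDefault(String raw) {
        if (raw == null) {
            return DEFAULT_RADIUS_METERS;
        }
        try {
            double radius = Double.parseDouble(raw.trim());
            // NaN fails the "> 0" test; infinite values are rejected explicitly.
            return (radius > 0 && !Double.isInfinite(radius)) ? radius : DEFAULT_RADIUS_METERS;
        } catch (NumberFormatException e) {
            return DEFAULT_RADIUS_METERS;
        }
    }

    // Assumption: usual GPS ranges, not stated in this guide.
    static boolean isValidLatitude(double lat) {
        return lat >= -90 && lat <= 90;
    }

    static boolean isValidLongitude(double lng) {
        return lng >= -180 && lng <= 180;
    }

    public static void main(String[] args) {
        System.out.println(radiusOrDefault("5000"));
        System.out.println(radiusOrDefault("not a number"));
    }
}
```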
Web service
The geolocalisation web service uses a servlet to wrap the Java API. It maps web parameters to a geoloc query and outputs the results via HTTP.
All the parameters should be case insensitive. If you have problems with case, please report a bug.
All the parameters should be encoded in UTF-8 and the URL MUST be encoded.
Here is a summary of the Web parameters mapping :
Parameter name | Web parameter name |
Latitude | lat |
Longitude | lng |
Radius | radius |
Start pagination index | from |
End pagination index | to |
Output format | format |
Placetype | placetype |
Distance field | distance |
Indent | indent |
Examples :
http://localhost:8080/geoloc/findnearbylocation?lat=4.5&lng=5.7&radius=5000&from=1&to=10&format=xml&placetype=city&indent=true
http://localhost:8080/geoloc/findnearbylocation?lat=4.5&lng=5.7
Currently, the web service limits the number of results to 10.
By default the geolocalisation service is mapped to the /geoloc pattern, but you can change it in WEB-INF/web.xml
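To call the service from Java without the Gisgraphy API, the example URLs above can be wrapped in a standard HTTP request. The sketch below uses the JDK HTTP client (Java 11 or later); the class FindNearbyRequest, the 10 second timeout and the local base URL are assumptions made for this example. The placetype value is concatenated as is, so it must not contain characters that need URL encoding.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Hypothetical helper: prepares a GET request for the find nearby web service. */
final class FindNearbyRequest {

    // Assumption: local instance and default /geoloc mapping, as in the examples.
    static final String BASE = "http://localhost:8080/geoloc/findnearbylocation";

    static HttpRequest build(double lat, double lng, int radiusMeters, String placetype) {
        String url = BASE
                + "?lat=" + lat
                + "&lng=" + lng
                + "&radius=" + radiusMeters
                + "&placetype=" + placetype
                + "&format=xml";
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10)) // arbitrary timeout for the sketch
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build(4.5, 5.7, 5000, "city");
        System.out.println(request.uri());
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), which returns the response body as a String.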
Output fields description
Here is a description of all the output fields. Some fields are specific to certain placetypes (e.g : area is only available if the feature is a country) :
Field | Description | Applicable for |
error | A String only present if an error occurred (e.g : empty latitude or longitude) | When an error occurred |
numFound | The number of results displayed for this query (it takes the pagination into account) | All placetype |
QTime | The execution time of the query in ms | All placetype |
distance | The distance between the point and the gisFeature in meters | All placetype |
Name | The name of the feature | All placetype |
asciiName | The ASCII name of the feature | All placetype |
feature_id | A unique id that identifies the feature | All placetype |
countryCode | The ISO 3166 country code | All placetype |
google_map_url | The URL to get the location on Google Map | All placetype |
country_flag_url | The relative URL to get the country flag image | All placetype |
yahoo_map_url | The URL to get the location on Yahoo Map | All placetype |
featureClass | The feature Class. More... | All placetype |
featureCode | The feature Code. More... | All placetype |
placeType | The Type of Feature see faq | All placetype |
population | How many people live in this feature | All placetype |
lat | The latitude (north-south) | All placetype |
lng | The longitude (east-West) | All placetype |
adm1Code | The internal code for the administrative division of level 1 | All placetype |
adm2Ccode | The internal code for the administrative division of level 2 | All placetype |
adm3Code | The internal code for the administrative division of level 3 | All placetype |
adm4Code | The internal code for the administrative division of level 4 | All placetype |
adm1Name | The name of the administrative division of level 1 | All placetype |
adm2Name | The name of the administrative division of level 2 | All placetype |
adm3Name | The name of the administrative division of level 3 | All placetype |
adm4Name | The name of the administrative division of level 4 | All placetype |
timezone | The time zone (e.g : Europe/Paris) | All placetype |
gtopo30 | Average elevation of 30'x30' (ca 900mx900m) area in meters | All placetype |
elevation | The elevation in meters | All placetype |
zipcode | The zipcodes (only for city and city subdivision), one node by zipcode | City,CitySubdivision, |
level | The level of the Administrative division (1-4) | Adm |
area | The area of the country | Country |
tld | top-level domain name, (last part of an Internet domain name) of the country | Country |
capitalName | The Capital of the country | Country |
continent | The continent the country belongs to | Country |
postalCodeRegex | The regexp that all zipcode/postalcode of the country matches | Country |
currencyCode | The Currency code (ISO_4217) of the country | Country |
currencyName | The Currency name of the country | Country |
fipsCode | The fips Code of the country | Country |
equivalentFipsCode | The FIPS code of the country when no code is available | Country |
iso3166Alpha2Code | The iso 3166 Alpha 2 code of the country | Country |
iso3166Alpha3Code | The iso 3166 Alpha 3 code of the country | Country |
phonePrefix | The phone prefix of the country | Country |
postalCodeMask | The mask that all postal code of the country matches | Country |
Some fields were not available in older versions of Gisgraphy. Please see
old versions
Java API
Here is an example of a geoloc query built with the Java API :
// Point to search around
Point point = GeolocHelper.createPoint(-3.5F, 45F);
// Results 1 to 10
Pagination pagination = paginate().from(1).to(10);
// Indented XML output
Output output = Output.withFormat(OutputFormat.XML)
        .withIndentation();
// Cities within a radius of 100000 meters around the point
GeolocQuery geolocQuery = new GeolocQuery(point, 100000,
        pagination, output, City.class);
String result = geolocSearchEngine.executeQueryToString(geolocQuery);
The methods are designed as a DSL (Domain Specific Language) and can be chained, as in the example above.
You can output results to an OutputStream (useful for servlet use) or a String.
The API is thread safe.
It is possible to create a query directly from an HTTP servlet request.
Client libraries
Python
You can find the python client here
Java
Not done yet, if you wish to contribute, please send a mail
PHP
Not done yet, if you wish to contribute, please send a mail
Ruby
Not done yet, if you wish to contribute, please send a mail
Admin Interface
to access the admin interface :
Login - Password
You can insert the two default users with the provided script in the sql directory : insert_users.sql
There are two default users already set :
user | password | profile | Description |
user | user | ROLE_USER | user with simple rights : cannot administer other users, can only edit his own profile, cannot import data |
admin | admin | ROLE_ADMIN | user with all rights : can administer other users and profiles, can edit options, can import data |
It is highly recommended to change the default users of the admin interface. To do so, either log in as 'admin' with password 'admin', or edit the 'insert_users.sql' file in the sql directory, set the users / passwords / roles, and run the script.
Import data
To import data, you must log in as a user with admin rights. Then go to the Administration menu -> Run importer and check the configuration. Click on the 'Run importer' link.
A page with the import status will be displayed and refreshed every minute.
The import process may take more than 24 hours, depending on how much data you import and on the machine the importer runs on. (some
dumps will soon be available)
Because the admin interface is based on Appfuse, if you have questions about basic features of the admin interface, see
Appfuse documentation
Screenshots
Some screenshots are available here
Security
Default admin password
It is highly recommended to change the default users of the admin interface. To do so, log in as 'admin' with password 'admin'.
Protect webServices
Some users want to restrict the Solr engine to the host 'localhost', so that users cannot query the Solr search engine directly. You can use a firewall and restrict access to the webapp with the following configuration :
With Tomcat :
<Context path="/path/to/secret_files">
<Valve className="org.apache.catalina.valves.RemoteAddrValve"
allow="127.0.0.1" deny=""/>
</Context>
With Jetty :
A server configuration-XML-file can look something like this:
<Configure class="org.mortbay.jetty.Server">
...
<Call name="addContext">
...
<Call name="addHandler">
<Arg>
<New class="IPAccessHandler">
<Set name="Standard">deny</Set>
<Set name="AllowIP">192.168.0.103</Set>
<Set name="AllowIP">192.168.0.100</Set>
</New>
</Arg>
</Call>
See more on http://www.jdocs.com/jetty/5.1.11.rc0/org/mortbay/http/handler/IPAccessHandler.html
Performance
Jmeter
Some JMeter benchmarks are available (scripts and results) here.
Database optimisations
In order to get good performance, it is recommended to use database indexes. As far as I know, it is possible to tell Hibernate to create an index, but not to choose the type of index (BTREE, GIST, and so on) with annotations, so you must create your own indexes with the following code :
DROP INDEX IF EXISTS locationindex ;
CREATE INDEX locationindex
ON gisfeature
USING gist
("location");
VACUUM FULL ANALYZE;
You can do the same for all the tables that have geometry columns.
A script named 'createGISTIndex.sql' is provided in the 'SQL' dir of the Gisgraphy distribution to create the GIST indexes for all the tables.
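If you prefer to generate the statements yourself instead of using the provided script, the pattern of the SQL above can be repeated per table. The following sketch only builds the SQL text; the class GistIndexDdl and the index naming scheme (table_column_gist) are assumptions made for this example, and the identifiers are concatenated without any escaping, so they must come from a trusted list.

```java
/** Hypothetical generator for the GIST index statements shown above. */
final class GistIndexDdl {

    /** Builds the DROP and CREATE statements for one table and its geometry column. */
    static String forTable(String table, String geometryColumn) {
        // Assumption: naming scheme chosen for this sketch, not the one used by Gisgraphy.
        String index = table + "_" + geometryColumn + "_gist";
        return "DROP INDEX IF EXISTS " + index + ";\n"
                + "CREATE INDEX " + index + " ON " + table
                + " USING gist (\"" + geometryColumn + "\");";
    }

    public static void main(String[] args) {
        // gisfeature / location are the table and column used in the SQL above.
        System.out.println(forTable("gisfeature", "location"));
    }
}
```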
You will get much better performance if you specify a placetype; if you search for the placetype 'gisFeature', your query will be slower.
You can use the command line, but PGAdmin could be a friendlier way.
If you use Postgis 1.3.1 or greater, you don't have to use the GIST indexes explicitly, because they will be used automatically.
More .
It is recommended to run :
VACUUM FULL ANALYZE;
on postgres, after an import
JVM optimisations
You will get better performance if you run Gisgraphy and the Solr server in two distinct JVMs.
To run Solr in a separate JVM, copy the solr directory (default parameter) with the schema.xml, solrconfig.xml, the data directory, and so on, to a Solr distribution and start it with java -jar start.jar
(or any other way of your choice, this is just the easiest one).
Then you can remove the solr.war from the Gisgraphy release and configure the fulltextsearch URL to point to the new Solr URL.
It is also recommended to use the Sun JVM (not the GCJ one) and to use the -server VM argument.