International Address Parser documentation

Description
Free access
Webservice
Java API
How it works
batch processing
Languages supported
Implemented countries
Countries not yet implemented
Supported formats by country
Output fields
Output format
Known issues
Links

Description

The address parser is part of the Gisgraphy project (a free, open source worldwide geocoder). Address parsing is the process of dividing a single address string into its individual component parts. The parser is case insensitive, handles several languages and several alphabets (non-ASCII characters are accepted), accepts addresses written on multiple lines, and handles abbreviations and synonyms. Numbers can be written as digits, as words, or as Roman numerals.
It implements all the Universal Postal Union specifications and the usage rules frequently used in each country (PO boxes, street intersections, workarounds, ...). It is based on semantic processing and dictionaries.

Free access

The address parser web service is available for free :

If you need dedicated access, please send an email.

Webservice

All parameters should be encoded in UTF-8, and the URL MUST be URL-encoded.


Here is a summary of the web parameters that the address parser accepts:

Parameter name | Required | Default value | Description
address | yes | none | The address to parse
country | yes | none | The ISO 3166 country code of the country of the address
format | no | XML | Output format of the response: XML, JSON, PHP, PYTHON, RUBY
callback | no | none | The callback method name, used to wrap the content in a method call. It must be alphanumeric and only applies to script output formats (JSON, PHP, Ruby, Python)
indent | no | XML | Whether the output should be indented. The value can be 'true', 'false', or 'on' (this is useful if you use a checkbox in a form)

Examples :
http://addressparser.gisgraphy.com/addressparser/?address=123 3/4 N name with space 1 number blvd south floor 2 Missouri CA 12345-4536&country=us&indent=true&format=json
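The example address contains spaces and slashes, so the query string has to be encoded before the request is sent. Here is a minimal sketch of building such a URL, assuming Java 10 or later (for `URLEncoder.encode(String, Charset)`) and assuming the service accepts standard form encoding, where spaces become `+`. The class and method names are illustrative and not part of the Gisgraphy API; the endpoint is the one used in the example.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class AddressParserUrlSketch {
    // Endpoint copied from the example URL in this documentation.
    static final String ENDPOINT = "http://addressparser.gisgraphy.com/addressparser/";

    // Builds the request URL, encoding every parameter name and value in UTF-8.
    static String buildUrl(Map<String, String> params) {
        StringBuilder url = new StringBuilder(ENDPOINT);
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8))
               .append('=')
               .append(URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("address", "123 3/4 N name with space 1 number blvd south floor 2 Missouri CA 12345-4536");
        params.put("country", "us");
        params.put("indent", "true");
        params.put("format", "json");
        System.out.println(buildUrl(params));
    }
}
```

A `LinkedHashMap` is used so that the parameters appear in the URL in the order they were added.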



Java API

	AddressParserClient addressParser = new AddressParserClient();
	String rawAddress = "101 Avenue des Champs-Elysées 75008 Paris";
	// Build the query from the raw address and the ISO 3166 country code
	AddressQuery query = new AddressQuery(rawAddress, "FR");
	AddressResultsDto results = addressParser.geocode(query);
	/* or, starting from a structured address:
	Address address = new Address();
	address.setCity("Paris");
	address.setZipCode("75008");
	address.setHouseNumber("101");
	address.setStreetType("Avenue");
	address.setStreetName("des Champs-Elysées");
	AddressResultsDto results = addressParser.execute(address, "FR");
	*/
	System.out.println("Query took " + results.getQTime() + " ms and" +
		" returned " + results.getNumFound() + " result(s)");
	// Print the main fields of each parsed address
	for (Address address : results.getResult()) {
		System.out.println("housenumber : " + address.getHouseNumber());
		System.out.println("streetType : " + address.getStreetType());
		System.out.println("streetname : " + address.getStreetName());
		System.out.println("PObox : " + address.getPOBox());
		System.out.println("city : " + address.getCity());
		System.out.println("district : " + address.getDistrict());
		System.out.println("state : " + address.getState());
		// see the "Output fields" section below for the description of all the fields
	}


Output fields

Here is an exhaustive list of all the fields that the address parser can extract:

Field | Description | Example values | Examples in an address
id | Identifier of a feature | 123456 | N/A
name | Name of the place. It is null for a plain address but filled for a common place. The name is different from the recipient name. | Tour eiffel | Tour eiffel Paris
recipientName | Name of the organisation or person at the given address | Jack bauer | Jack Bauer street of philadelphia city, apt 5A, Washington
houseNumber | Official number assigned to an address by the municipality; several languages are supported | 3;151-125;eight | 123 street of philadelphia city, apt 5A, Washington
houseNumberInfo | Any extra information about the house number | bis, ter, quater | 125 bis rue de la france 75000 Paris
streetName | The official name of the street or the ordinal number | Main, 8TH | 100 MAIN ST POB 1022 SEATTLE WA 98104
streetType | The type of the street | street,st,bd,dr,bvd,... | 100 MAIN ST POB 1022 SEATTLE WA 98104
city | The city or locality. A small town or village name is sometimes included in an address when the delivery point is outside the boundary of the main post town that serves it. | APPLEFORD | Leda Engineering Ltd APPLEFORD ABINGDON OX14 4PG
dependentLocality | "Sub" city attached to a big city | Dublin | boulevard of liberty Washington
PostTown | A city; it is a required part of all postal addresses in the United Kingdom | London | 49 Featherstone Street LONDON EC1Y 8SY
state | The state or county when applicable; can be the full name or an abbreviation | WA | 100 MAIN ST POB 1022 SEATTLE WA 98104
district | The district, mainly used for Russia | ALEKSCEVSKTY (r-n) | ul. Lesnaya d. 5 pos. Lesnoe ALEKSCEVSKTY r-n VORONEJSKAYA obl 247112 RUSSIAN FEDERATION
quarter | A section of an urban settlement | DOĞANBEY MAH (Turkey), French Quarter | Mebusevleri Mah. Önder Cad. Ankara Ap. 11/8 ALEKSCEVSKTY
zipCode | The zip or post code | 98104 | 100 MAIN ST POB 1022 SEATTLE WA 98104
extraInfo | Information on floor, unit, and sometimes PO box, ... | floor 6, Hangar of the century | 100 MAIN ST POB 1022 SEATTLE WA 98104
 | | | 100 MAIN ST 3rd floor SEATTLE WA 98104
POBox | Post office box, Boite postale, Casilla de Correo, ... | POB 45, POBOX 52, boite postale 89, Casilla de Correo 17 | 100 MAIN ST POB 1022 SEATTLE WA 98104
 | | | 100 MAIN ST 3rd floor SEATTLE WA 98104
POBoxInfo | Extra information on the Post office box, Boite postale, Casilla de Correo, ... | CEDEX 01 | 5, rue Foobar, 75725 Paris CEDEX 01
POBoxAgency | Agency where the Post office box, Boite postale, Casilla de Correo is located | KHOURIBGA PRINCIPALE | P.O 1737 KHOURIBGA PRINCIPALE 25005 KHOURIBGACEDEX
preDirection | The cardinal direction before the name of the street | N,NE;North | N broadway bd
postDirection | The cardinal direction after the name of the street | N,NE;North | boulevard of liberty north Washington
streetNameIntersection | The official name of the intersection street | Main | N street of philadelphia & W boulevard of liberty Washington
streetTypeIntersection | The type of the intersection street | street,st,bd,dr,bvd,... | N street of philadelphia & W boulevard of liberty Washington
preDirectionIntersection | The cardinal direction before the name of the intersection street | N,NE;North | N street of philadelphia & W boulevard of liberty Washington
postDirectionIntersection | The cardinal direction after the name of the intersection street | N,NE;North | N street of philadelphia & boulevard of liberty SOUTH Washington
civicNumberSuffix | The number that follows the house number (Canada only) | 1/2 | 10-123 1/2 main street NW MONTREAL QC H3Z 2Y7
floor | The floor in an address, not a floor number in a unit (Brasilia only) | 8o andar | SBN - Quadra 13 - Bloca B - 8o andar BRASILIA-DF 70002-900
sector | The sector in an address (Brasilia only) | SBN | SBN - Quadra 13 - Bloca B - 8o andar BRASILIA-DF 70002-900
quadrant | The quadrant in an address (Brasilia only) | Quadra 13 | SBN - Quadra 13 - Bloca B - 8o andar BRASILIA-DF 70002-900
block | The block in an address (Brasilia only) | Bloca B | SBN - Quadra 13 - Bloca B - 8o andar BRASILIA-DF 70002-900
 | The block in Austrian, Singaporean, ... addresses | 2 | Rennbahnweg 25/2/15 1220 WIEN
country | The country name | USA | Paris - France
 | | United States |
 | | France |
countrycode | The country code given in the request | FR | N/A
 | | US |
 | | DE |

Some other metadata fields are also available:

Field | Description | Example values
message | Filled when information needs to be given | Contrycode XX is not implemented
qtime | Number of milliseconds the request has taken | 100
numFound | Number of results found | 10

Note that only the unit is supported, not company, gender, first name, or last name. "Mozilla Corporation, 1981 second street building K Mountain View CA 94043-0801" is not a parsable address, but "1981 second street building K Mountain View CA 94043-0801" is. Support for splitting off the first part of the address (up to the first comma) is planned.
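Until that support exists, the split at the first comma can be approximated on the client side before calling the parser. This is a minimal sketch, assuming the recipient is always the part before the first comma (which is not true of every address, for example when the first comma follows the street). The class and method names are hypothetical and not part of the Gisgraphy API.

```java
final class RecipientSplitSketch {
    // Returns a two-element array: {part before the first comma, rest of the address}.
    // When the input has no comma, the first element is empty and the second is the whole input.
    static String[] splitAtFirstComma(String rawAddress) {
        int comma = rawAddress.indexOf(',');
        if (comma < 0) {
            return new String[] {"", rawAddress.trim()};
        }
        return new String[] {
            rawAddress.substring(0, comma).trim(),
            rawAddress.substring(comma + 1).trim()
        };
    }
}
```

Only the second element would then be sent as the `address` parameter; the first can be kept by the caller as the recipient.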

How it works

The international address parser is based on a modular engine in which each country has a list of syntaxes, so a new country, or a new syntax for an existing country, can be added very simply. Libraries and dictionaries make the engine highly customisable and allow quick development.
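To illustrate the idea of one list of syntaxes per country, here is a minimal sketch. It is not the Gisgraphy implementation: it assumes that a syntax can be modelled as a regular expression with named groups, and that a default list is tried when a country has no syntax of its own. All class, method, and group names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ModularParserSketch {
    // Key under which the fallback syntaxes are registered (an assumption of this sketch).
    static final String DEFAULT = "*";

    // One syntax: a regular expression plus the names of the groups it captures.
    static final class Syntax {
        final Pattern pattern;
        final List<String> fieldNames;

        Syntax(String regex, String... fieldNames) {
            this.pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
            this.fieldNames = Arrays.asList(fieldNames);
        }
    }

    private final Map<String, List<Syntax>> syntaxesByCountry = new HashMap<>();

    // Adding a country, or a new syntax for a country, is a single call.
    void register(String countryCode, Syntax syntax) {
        syntaxesByCountry
            .computeIfAbsent(countryCode.toUpperCase(Locale.ROOT), k -> new ArrayList<>())
            .add(syntax);
    }

    // Tries the syntaxes of the country in registration order and returns the fields
    // of the first one that matches the whole address, or an empty map if none does.
    Map<String, String> parse(String address, String countryCode) {
        List<Syntax> syntaxes = syntaxesByCountry.get(countryCode.toUpperCase(Locale.ROOT));
        if (syntaxes == null) {
            syntaxes = syntaxesByCountry.getOrDefault(DEFAULT, Collections.emptyList());
        }
        for (Syntax syntax : syntaxes) {
            Matcher matcher = syntax.pattern.matcher(address.trim());
            if (matcher.matches()) {
                Map<String, String> fields = new LinkedHashMap<>();
                for (String name : syntax.fieldNames) {
                    fields.put(name, matcher.group(name));
                }
                return fields;
            }
        }
        return Collections.emptyMap();
    }
}
```

With this layout, supporting a new address format means registering one more `Syntax` for the country code, without touching the matching loop.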

Languages supported

The parser is based on semantic analysis. It uses dictionaries for street types, units, ordinal numbers, etc. Here is a list of the languages already implemented:
An implemented language is a language for which units, street types, numbers, directions (cardinal points), Post Office Boxes, etc. are handled. Note that not all languages need all of these types. If a dictionary is not accurate, the parser will fail on some addresses. If you want to help, please send an email.
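As an illustration of what a per-language dictionary does, here is a minimal sketch of a street type lookup that maps abbreviations and synonyms to one canonical form. It is an assumption about the general technique, not the parser's actual data structure; the class name, method names, and the sample entries in the usage note are hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

final class StreetTypeDictionarySketch {
    // language code -> (normalized token -> canonical street type)
    private final Map<String, Map<String, String>> entriesByLanguage = new HashMap<>();

    // Registers a canonical street type and its abbreviations or synonyms for a language.
    void add(String language, String canonical, String... synonyms) {
        Map<String, String> entries =
            entriesByLanguage.computeIfAbsent(language, k -> new HashMap<>());
        entries.put(normalize(canonical), canonical);
        for (String synonym : synonyms) {
            entries.put(normalize(synonym), canonical);
        }
    }

    // Returns the canonical street type for a token, or empty if the token is unknown.
    Optional<String> lookup(String language, String token) {
        Map<String, String> entries = entriesByLanguage.get(language);
        if (entries == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(entries.get(normalize(token)));
    }

    // Case-insensitive matching that also ignores one trailing dot ("St." and "st" are equal).
    private static String normalize(String token) {
        String t = token.trim().toLowerCase(Locale.ROOT);
        return t.endsWith(".") ? t.substring(0, t.length() - 1) : t;
    }
}
```

For example, after `add("en", "street", "st")`, both `lookup("en", "ST.")` and `lookup("en", "Street")` resolve to the same canonical entry, while a token missing from the dictionary yields an empty result, which is the case where parsing can fail.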

Implemented countries

Algeria; Angola; American Samoa; Argentina; Aruba; Australia; Austria; Belgium; Bonaire, Saint Eustatius and Saba; Brazil; Cameroon; Canada; China; Congo (Democratic Republic of); Curaçao; Denmark; Falkland Islands; Faroe Islands; Finland; France; French Guiana; Germany; Guadeloupe; Guernsey; Gibraltar; Greenland; Hong Kong; India; Indonesia; Iran; Italy; Isle of Man; Jersey; Kazakhstan; Martinique; Morocco; Netherlands; Netherlands Antilles; Northern Mariana Islands; Norway; Puerto Rico; Poland; Portugal; Reunion; Russia; Saint Helena; Saint Martin; Saint Pierre and Miquelon; San Marino; Saudi Arabia; South Georgia and the South Sandwich Islands; Senegal; Singapore; Sint Maarten; Spain; Sudan; Sweden; Switzerland; Tunisia; Turkey; Turks and Caicos Islands; Ukraine; United States Minor Outlying Islands; United Kingdom; United States; U.S. Virgin Islands; Vatican



Countries not yet implemented

Here is a list of all the countries that are not yet implemented; for these, the default pattern is used. If you want a new country to be implemented, please contact me. It can usually be implemented very quickly, depending on the complexity of the address format.

Aland Islands; Albania; Andorra; Anguilla; Antarctica; Armenia; Azerbaijan; Bahamas; Bahrain; Bangladesh; Barbados; Belarus; Belize; Benin; Bermuda; Bhutan; Bolivia; Bosnia and Herzegovina; Botswana; Bouvet Island; British Indian Ocean Territory; British Virgin Islands; Brunei; Bulgaria; Burkina Faso; Burundi; Cambodia; Cameroon; Cape Verde; Cayman Islands; Central African Republic; Chad; Chile; Christmas Island; Cocos Islands; Colombia; Comoros; Cook Islands; Costa Rica; Croatia; Cuba; Cyprus; Czech Republic; Denmark; Djibouti; Dominica; Dominican Republic; East Timor; Ecuador; Egypt; El Salvador; Equatorial Guinea; Eritrea; Estonia; Ethiopia; Fiji; French Polynesia; French Southern Territories; Gabon; Gambia; Georgia; Ghana; Greece; Grenada; Guam; Guatemala; Guinea; Guinea-Bissau; Guyana; Haiti; Heard Island and McDonald Islands; Honduras; Hungary; Iceland; Iraq; Ireland*; Israel; Ivory Coast; Jamaica*; Japan; Jordan; Kenya; Kiribati; Kosovo; Kuwait; Kyrgyzstan; Laos; Latvia; Lebanon; Lesotho; Liberia; Libya; Liechtenstein; Lithuania; Luxembourg; Macao; Macedonia; Madagascar; Malawi; Malaysia; Maldives; Mali; Malta; Marshall Islands; Mauritania; Mauritius; Mayotte; Mexico; Micronesia; Moldova; Monaco; Mongolia; Montenegro; Montserrat; Mozambique; Myanmar; Namibia; Nauru; Nepal; New Caledonia; New Zealand; Nicaragua; Niger; Nigeria; Niue; Norfolk Island; North Korea; Oman; Pakistan; Palau; Palestinian Territory; Panama; Papua New Guinea; Paraguay; Peru; Philippines; Pitcairn; Qatar; Republic of the Congo; Romania; Rwanda; Saint Barthélemy; Saint Kitts and Nevis; Saint Lucia; Saint Vincent and the Grenadines; Samoa; Sao Tome and Principe; Serbia; Serbia and Montenegro; Seychelles; Sierra Leone; Slovakia; Slovenia; Solomon Islands; Somalia; South Africa; South Korea; Sri Lanka; Suriname; Svalbard and Jan Mayen; Swaziland; Syria; Taiwan; Tajikistan; Tanzania; Thailand; Togo; Tokelau; Tonga; Trinidad and Tobago; Turkmenistan; Tuvalu; Uganda; United Arab Emirates; Uruguay; Uzbekistan; Vanuatu; Venezuela; Vietnam; Wallis and Futuna; Yemen; Zambia; Zimbabwe

Countries that won't be implemented

Due to a lack of information, the following countries won't be implemented. It could be done with the help of people living in these countries:
Antigua and Barbuda; Western Sahara; Afghanistan

Supported formats by country

How to read the patterns: words between brackets are optional. "zip" can also mean postal code. "state" can also mean province, and commonly represents an administrative division (full name or abbreviation). Words between commas are required. E.g.:

Isle of Man

See United Kingdom

Guadeloupe

See France


Hong Kong


Jersey

See United Kingdom

Kazakhstan

See Russia


Martinique

See France


Morocco

Turks and Caicos Islands

See United Kingdom


Netherlands


Netherlands Antilles

See Netherlands


Northern Mariana Islands

See United States


Norway


Puerto Rico

See United States


Poland


Portugal


Reunion

See France

Russia


Saint Helena

See United Kingdom

Saint Martin

See France

Saint Pierre and Miquelon

See France

San Marino

See Italy

Saudi Arabia

Senegal

See France

Singapore


Sint Maarten

See Netherlands

South Georgia and the South Sandwich Islands

See United Kingdom

Spain


France


Sudan


Sweden


Switzerland


Turkey


Tunisia


Ukraine


United States Minor Outlying Islands

See United States

United Kingdom


United States

Special notes:

U.S. Virgin Islands

See United States

Vatican

See Italy

Known issues

Links