Commit 922e57c

Author: Alfiya Tarasenko
Add Python code samples for address standardization, validation, isolines
1 parent aecc544 commit 922e57c

13 files changed

Lines changed: 1155 additions & 3 deletions

README.md

Lines changed: 44 additions & 1 deletion
@@ -64,7 +64,50 @@ This example demonstrates how to perform reverse geocoding to retrieve addresses
6464
#### APIs Used:
6565
- [Geoapify Reverse Geocoding API](https://www.geoapify.com/reverse-geocoding-api/)
6666

67-
---
68+
69+
70+
### **5. Python: Address Standardization Example**
71+
72+
#### Description:
73+
This example demonstrates how to use the Geoapify Geocoding API to geocode addresses and generate standardized address strings based on a custom format.
74+
75+
#### Features:
76+
- Batch geocoding of address lists.
77+
- Flexible address formatting using placeholders (e.g., `{street}`, `{city}`, `{postcode}`).
78+
- Output in both NDJSON (raw results) and CSV (standardized format).
79+
80+
#### APIs Used:
81+
- [Geoapify Forward Geocoding API](https://www.geoapify.com/geocoding-api/)
82+
83+
### **6. Python: Address Validation Example**
84+
85+
#### Description:
86+
This example shows how to validate address accuracy using confidence levels returned by the Geoapify Geocoding API.
87+
88+
#### Features:
89+
- Batch address validation with detailed confidence analysis.
90+
- Classification into `CONFIRMED`, `PARTIALLY_CONFIRMED`, and `NOT_CONFIRMED`.
91+
- Output CSV includes validation results and reasons for uncertainty.
92+
93+
#### APIs Used:
94+
- [Geoapify Forward Geocoding API](https://www.geoapify.com/geocoding-api/)
95+
96+
### **7. Python: Isoline Visualization Example**
97+
98+
#### Description:
99+
This example demonstrates how to use the Geoapify Isoline API to generate and display **isochrones** (time-based) or **isodistances** (distance-based) as interactive polygons on a map using **Folium**.
100+
101+
#### Features:
102+
- Visualizes travel range from a specific location by time or distance.
103+
- Supports multiple travel modes (`drive`, `walk`, `bicycle`, etc.).
104+
- Accepts advanced options like traffic modeling, route optimization, and avoidance.
105+
- Saves and opens an interactive HTML map with isoline overlays.
106+
107+
#### APIs Used:
108+
- [Geoapify Isoline API](https://www.geoapify.com/isoline-api/)
109+
- [Geoapify Map Tiles](https://www.geoapify.com/map-tiles/)
110+
- [Folium Library](https://python-visualization.github.io/folium/)
68111

69112
## Upcoming Code Samples
70113

Lines changed: 253 additions & 0 deletions
@@ -0,0 +1,253 @@
1+
# Standardize Addresses with Geoapify API
2+
3+
This project demonstrates how to batch-geocode addresses using the [Geoapify Geocoding API](https://www.geoapify.com/geocoding-api/) and produce standardized outputs in a customizable format.
4+
5+
The script:
6+
- Reads addresses from a text file
7+
- Geocodes each address using the Geoapify Forward Geocoding API
8+
- Saves full geocoding results to an NDJSON file
9+
- Writes a standardized address list to a CSV file using a user-defined template
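
As a rough illustration of the NDJSON step listed above, the sketch below writes one JSON object per line and reads the lines back. The helper names (`write_ndjson`, `read_ndjson`) and the sample records are hypothetical and are not taken from the script; real geocoding results contain many more fields.

```python
import io
import json


def write_ndjson(results, stream):
    """Write each result as a single JSON object on its own line."""
    for result in results:
        stream.write(json.dumps(result, ensure_ascii=False) + "\n")


def read_ndjson(stream):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in stream if line.strip()]


# Hypothetical, heavily trimmed records used only for this illustration.
sample_results = [
    {"query": "Platz der Republik, Berlin, Germany", "city": "Berlin", "country_code": "de"},
    {"query": "Unknown Place", "error": "not found"},
]

buffer = io.StringIO()  # stands in for the file passed via --output
write_ndjson(sample_results, buffer)
buffer.seek(0)
round_trip = read_ndjson(buffer)
```

Because every line is an independent JSON document, a partially written file can still be read line by line, which is convenient for long batch runs.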
10+
11+
12+
## Requirements
13+
14+
- Python 3.12 or higher (the script uses `itertools.batched()`, which was added in Python 3.12)
15+
- `pip` (Python package manager)
16+
17+
## Setup Instructions
18+
19+
### 1. Clone the Repository
20+
21+
```bash
22+
git clone https://github.com/geoapify/maps-api-code-samples.git
23+
cd maps-api-code-samples/python/
24+
```
25+
26+
### 2. (Optional) Create a Virtual Environment
27+
28+
```bash
29+
python -m venv env
30+
source env/bin/activate # On Windows: env\Scripts\activate
31+
```
32+
33+
### 3. Install Dependencies
34+
35+
```bash
36+
pip install requests
37+
```
38+
39+
40+
## Usage
41+
42+
```bash
43+
cd address-standardization
44+
45+
python address_standardization.py \
46+
--api_key YOUR_API_KEY \
47+
--input input_example.txt \
48+
--output geocoded.ndjson \
49+
--standardized_output standardized_addresses.csv \
50+
--format "{street} {housenumber}, {city}, {state_code}, {postcode}, {country}"
51+
```
52+
53+
54+
## Command-Line Arguments
55+
56+
| Argument | Required | Description |
57+
|--------------------------|----------|-----------------------------------------------------------------------------|
58+
| `--api_key` | Yes | Your [Geoapify API key](https://myprojects.geoapify.com). |
59+
| `--input` | Yes | Path to the input file (one address per line). |
60+
| `--output` | Yes | Path to the output NDJSON file with full geocoding results. |
61+
| `--standardized_output` | Yes | Path to the CSV file for formatted addresses. |
62+
| `--country_code`         | No       | Restrict results to a specific country (for example, `us`, `de`, `fr`).     |
63+
| `--format` | Yes | Template for standardized output using placeholders (see below). |
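
The table above could map onto an `argparse` parser along the following lines. This is a minimal sketch that assumes the script uses `argparse`; the function name `build_parser`, the help texts, and the `address_format` attribute name are illustrative choices, not the script's actual code.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented command-line arguments."""
    parser = argparse.ArgumentParser(
        description="Batch-geocode addresses and write standardized output."
    )
    parser.add_argument("--api_key", required=True, help="Geoapify API key")
    parser.add_argument("--input", required=True, help="Input file, one address per line")
    parser.add_argument("--output", required=True, help="NDJSON file for full geocoding results")
    parser.add_argument("--standardized_output", required=True, help="CSV file for formatted addresses")
    parser.add_argument("--country_code", default=None, help="Optional country restriction, e.g. 'us'")
    # Stored as 'address_format' here to avoid confusion with the built-in format().
    parser.add_argument("--format", dest="address_format", required=True, help="Output template")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is self-contained.
args = build_parser().parse_args([
    "--api_key", "YOUR_API_KEY",
    "--input", "input.txt",
    "--output", "geocoded.ndjson",
    "--standardized_output", "standardized_addresses.csv",
    "--format", "{street}, {city}",
])
```

Omitting `--country_code` leaves `args.country_code` as `None`, which a caller can treat as "no country restriction".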
64+
65+
66+
## Address Format Placeholders
67+
68+
The `--format` option defines how each standardized address is written. You can combine any of the following placeholders:
69+
70+
- `{name}` – Place name
71+
- `{housenumber}` – House/building number
72+
- `{street}` – Street name
73+
- `{suburb}` – Suburb or neighborhood
74+
- `{district}` – District or borough
75+
- `{postcode}` – Postal code
76+
- `{city}` – City or town
77+
- `{county}` – County or administrative division
78+
- `{county_code}` – County code (if available)
79+
- `{state}` – State or province
80+
- `{state_code}` – State code (e.g., `"CA"` for California)
81+
- `{country}` – Country name
82+
- `{country_code}` – Country code (ISO 3166-1 alpha-2)
83+
84+
Missing fields will be replaced with an empty string.
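
One way to get the "missing fields become empty strings" behavior is a `dict` subclass with `__missing__`, combined with `str.format_map()`. The sketch below is an illustration under that assumption; `EmptyOnMissing` and `standardize` are hypothetical names and not the script's own classes.

```python
class EmptyOnMissing(dict):
    """A dict that returns an empty string for any key it does not contain."""

    def __missing__(self, key):
        return ""


def standardize(template, fields):
    """Fill the template; absent or None fields are rendered as empty strings."""
    present = {key: value for key, value in fields.items() if value is not None}
    return template.format_map(EmptyOnMissing(present))


# The house number is absent, so its placeholder is replaced with "".
example = standardize(
    "{street} {housenumber}, {city}, {country_code}",
    {"street": "Champs-Élysées", "city": "Paris", "country_code": "FR"},
)
```

Note that only the placeholder is emptied: the literal text around it stays, so the example yields `Champs-Élysées , Paris, FR` with a space before the first comma. Templates that rely on optional fields may need a cleanup pass for such leftover separators.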
85+
86+
Here are some ready-to-use format examples:
87+
88+
```bash
89+
--format "{street} {housenumber}, {city}, {state_code}, {postcode}, {country}"
90+
```
91+
**Standardized Output:**
92+
`Main Street 12, San Francisco, CA, 94105, United States`
93+
94+
---
95+
96+
```bash
97+
--format "{name}, {street} {housenumber}, {postcode} {city}, {country_code}"
98+
```
99+
**Standardized Output:**
100+
`Googleplex, Amphitheatre Parkway 1600, 94043 Mountain View, US`
101+
102+
---
103+
104+
```bash
105+
--format "{country}, {postcode}-{city}, {street} {housenumber}"
106+
```
107+
**Standardized Output:**
108+
`Germany, 10117-Berlin, Unter den Linden 77`
109+
110+
---
111+
112+
```bash
113+
--format "{housenumber} {street}, {suburb}, {city}, {state}, {country}"
114+
```
115+
**Standardized Output:**
116+
`221B Baker Street, Marylebone, London, England, United Kingdom`
117+
118+
---
119+
120+
```bash
121+
--format "{street}, {city}, {country_code}"
122+
```
123+
**Standardized Output:**
124+
`Champs-Élysées, Paris, FR`
125+
126+
## Example
127+
128+
**Input (`input.txt`):**
129+
```
130+
1600 Amphitheatre Parkway, Mountain View, CA 94043, USA
131+
Unknown Place
132+
123 Example St, Springfield
133+
Platz der Republik, Berlin, Germany
134+
```
135+
136+
**Run:**
137+
```bash
138+
python address_standardization.py \
139+
--api_key YOUR_API_KEY \
140+
--input input.txt \
141+
--output geocoded.ndjson \
142+
--standardized_output standardized_addresses.csv \
143+
--format "{street} {housenumber}, {city}, {state_code}, {postcode}, {country}"
144+
```
145+
146+
**Output (`standardized_addresses.csv`):**
147+
```
148+
Original Address,Standardized Address
149+
"1600 Amphitheatre Parkway, Mountain View, CA 94043, USA","Amphitheatre Parkway 1600, Mountain View, CA, 94043, United States"
150+
"Unknown Place",""
151+
"123 Example St, Springfield","Example St 123, Springfield, IL, 62704, United States"
152+
"Platz der Republik, Berlin, Germany","Platz der Republik, Berlin, BE, 10557, Germany"
153+
```
154+
155+
## How the Script Works
156+
157+
The script performs **address standardization in two main steps**:
158+
159+
### 1. **Geocode Addresses with Rate Limiting**
160+
161+
The function `geocode_addresses()` sends requests to the **Geoapify Geocoding API** in controlled batches.
162+
To comply with the Free plan’s limit of **5 requests per second (RPS)**:
163+
- The addresses are split into chunks of 5.
164+
- Each chunk is processed in parallel using threads.
165+
- After every batch, the script pauses for 1 second before sending the next one.
166+
167+
```python
def geocode_addresses(api_key, addresses, country_code):
    # Split addresses into batches
    addresses = list(it.batched(addresses, REQUESTS_PER_SECOND))

    # Request results asynchronously for each address batch
    tasks = []
    with ThreadPoolExecutor(max_workers=10) as executor:
        for batch in addresses:
            logger.info(batch)
            tasks.extend([executor.submit(geocode_address, address, api_key, country_code) for address in batch])
            sleep(1)
    # Wait for results
    wait(tasks, return_when=ALL_COMPLETED)

    return [task.result() for task in tasks]
```
184+
185+
1. **Batching Input with `itertools.batched()`**
186+
The function uses [`itertools.batched()`](https://docs.python.org/3/library/itertools.html#itertools.batched) to split the input list into chunks of 5 addresses.
187+
This helps control request throughput so that the script doesn't exceed the Geoapify Free plan's **5 requests per second (RPS)** limit.
188+
189+
2. **Asynchronous Execution with `concurrent.futures.ThreadPoolExecutor`**
190+
Each address within a batch is submitted to a thread pool using [`concurrent.futures.ThreadPoolExecutor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor).
191+
This lets the requests within a batch run concurrently, so a batch takes roughly as long as its slowest request rather than the sum of all of them.
192+
193+
3. **Rate Limiting via `sleep(1)`**
194+
After each batch, the function waits 1 second (`sleep(1)`) to prevent exceeding the allowed request rate.
195+
196+
4. **Waiting for All Results**
197+
The function uses [`concurrent.futures.wait()`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.wait) to block until all geocoding tasks are complete.
198+
199+
5. **Returning Results**
200+
It collects the final results from each task using `.result()` and returns them as a list.
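
The five steps above can be tried without an API key using the self-contained sketch below. It is an approximation, not the script itself: `fake_geocode` is a stand-in for the real HTTP request, `run_in_batches` and its `pause` parameter are hypothetical, and the local `batched` helper only mimics `itertools.batched()` so that the example also runs on Python versions before 3.12.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(items, size):
    """Yield consecutive lists of at most `size` items."""
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def fake_geocode(address):
    """Stand-in for a geocoding request; performs no network access."""
    return {"query": address, "formatted": address.upper()}


def run_in_batches(addresses, per_second=5, pause=1.0, worker=fake_geocode):
    """Submit one batch, pause, then submit the next; return results in input order."""
    futures = []
    with ThreadPoolExecutor(max_workers=per_second) as executor:
        for batch in batched(addresses, per_second):
            futures.extend(executor.submit(worker, address) for address in batch)
            time.sleep(pause)  # throttle submissions to roughly `per_second` per pause
    # Leaving the `with` block waits for all tasks, so result() returns immediately.
    return [future.result() for future in futures]


# A short pause keeps the demonstration fast; a real run would use pause=1.0.
results = run_in_batches([f"address {i}" for i in range(12)], per_second=5, pause=0.01)
```

This approach limits how quickly requests are submitted, not how many are in flight. If individual requests take longer than the pause, responses from consecutive batches can overlap, which is usually acceptable for a per-second request quota.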
201+
202+
### 2. **Generate Standardized Addresses**
203+
204+
Once the geocoding results are retrieved:
205+
- The function `generate_standard_addresses()` formats each address using a **user-defined template** (via the `--format` argument).
206+
- Placeholders like `{street}`, `{postcode}`, `{country}` are filled using data from the geocoding response.
207+
- If a result is missing or empty, the standardized address will be an empty string.
208+
- The output is written to a CSV file, pairing the original address with the formatted version.
209+
210+
```python
def generate_standard_addresses(output, addresses, address_format, geocode_results):
    # Write csv with standardized addresses
    with open(output, 'w', newline='') as f:
        csv_writer = csv.writer(f)
        csv_writer.writerow(["Original Address", "Standardized Address"])
        for address, result in zip(addresses, geocode_results):
            # For empty geocoding result set empty string
            if not result or result.get('error'):
                standardized_address = ''
            else:
                # Fill template with values, fallback missing data to empty string
                standardized_address = address_format.format_map(GeocodeResult(**result))
            csv_writer.writerow([address, standardized_address])
```
225+
226+
1. **Opens a CSV file** using [`csv.writer`](https://docs.python.org/3/library/csv.html#csv.writer) and writes a header row:
227+
`"Original Address", "Standardized Address"`
228+
229+
2. **Iterates** through original addresses and geocoding results using `zip()`.
230+
231+
3. **Handles invalid results** (missing or containing `"error"`) by outputting an empty string.
232+
233+
4. **Formats valid results** using [`str.format_map()`](https://docs.python.org/3/library/stdtypes.html#str.format_map) and a `GeocodeResult` dict subclass that safely substitutes missing values with empty strings.
234+
235+
5. **Writes each pair** to the output CSV.
236+
237+
## Learn More
238+
239+
- [Geoapify Geocoding API Documentation](https://apidocs.geoapify.com/docs/geocoding/)
240+
Details about available parameters, usage limits, and response formats.
241+
242+
- [Geocoding API Playground](https://apidocs.geoapify.com/playground/geocoding/)
243+
Try out forward and reverse geocoding interactively.
244+
245+
- [Address Standardization Overview](https://www.geoapify.com/solutions/address-lookup/)
246+
Learn what address standardization is and how to implement it effectively.
247+
248+
- [Get your free Geoapify API key](https://myprojects.geoapify.com/)
249+
Sign up and start using the API with free daily limits.
250+
251+
## License
252+
253+
MIT License. See `LICENSE` file for details.
