Geocoding & reverse-geocoding with Google Maps

Google’s documentation is not very clear on how to actually implement the Geocoding API, especially for a newbie like me. Here are the steps to geocode an address or reverse-geocode a set of coordinates, and to pull the keys you want out of the result.

Reverse-geocoding

Here is a short piece of code in Python

import googlemaps
gmaps = googlemaps.Client(key='<enter your Google API key here>')

result = gmaps.reverse_geocode((40.714224, -73.961452))
print(result)

If you run that code, you will get a very long JSON response. An online JSON viewer makes the structure of the result much easier to understand. Here is how it looks:

full result

Go through each numbered item to see which keys interest you. Say you are interested in item [0] of the JSON. Here is the code to get it:

import googlemaps
gmaps = googlemaps.Client(key='<enter your Google API key here>')

result = gmaps.reverse_geocode((40.714224, -73.961452))
print(result[0])

Here is how it looks

partial result

Continue to parse through the JSON to get the key you want, such as result[0]['geometry']…
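As a rough sketch of that parsing step, the snippet below pulls the formatted address and the coordinates out of the first result. The sample_result list is a trimmed, hand-written stand-in for a real response (an assumption, so the example runs without an API key), and summarize is a hypothetical helper name. The keys formatted_address and geometry → location → lat/lng follow the structure the Geocoding API returns.

```python
# Hand-written stand-in for the list returned by gmaps.reverse_geocode(...).
# Values are illustrative only; a real response has many more keys.
sample_result = [
    {
        "formatted_address": "277 Bedford Ave, Brooklyn, NY 11211, USA",
        "geometry": {
            "location": {"lat": 40.714232, "lng": -73.9612889},
            "location_type": "ROOFTOP",
        },
        "types": ["street_address"],
    }
]


def summarize(result):
    """Return (address, lat, lng) from the first entry, or None if the list is empty."""
    if not result:  # guard against an empty list (nothing matched)
        return None
    first = result[0]
    location = first["geometry"]["location"]
    return first["formatted_address"], location["lat"], location["lng"]


print(summarize(sample_result))
```

The same helper should work on the output of a geocoding call, since both calls return a list of results with the same shape.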

Geocoding

import googlemaps
gmaps = googlemaps.Client(key='<enter your Google API key here>')

result = gmaps.geocode('1600 Amphitheatre Parkway, Mountain View, CA')
print(result)

The rest is similar to reverse-geocoding.

Concatenate strings to form a full address

If you have address, city, state, zip code and countries in different columns, one quick way to form a full address is

df['fulladdress'] = df[['Address', 'City', 'State', 'Zip']].apply(lambda x: ' '.join(x.astype(str)), axis=1)

df stands for the current data frame. df['fulladdress'] creates a new column called “fulladdress”. The whole line joins the values on the same row of the Address, City, State and Zip columns, separated by spaces, to form a full address.
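Here is a small, self-contained sketch of that line on made-up rows. The sample values are hypothetical; only the column names come from the line above. It also assumes the zip codes were read in as floats, which often happens with Excel files, and converts them to whole numbers first so the address does not end in “.0”.

```python
import pandas as pd

# Hypothetical sample rows; the column names match the ones used above.
df = pd.DataFrame({
    "Address": ["123 Main St Suite 4", "456 Oak Ave"],
    "City": ["Omaha", "Omaha"],
    "State": ["NE", "NE"],
    "Zip": [68131.0, 68182.0],  # zip codes often load from Excel as floats
})

# Convert the zip to a whole number first so it does not end in ".0".
df["Zip"] = df["Zip"].astype(int)

# Join the four columns of each row with spaces.
df["fulladdress"] = df[["Address", "City", "State", "Zip"]].apply(
    lambda row: " ".join(row.astype(str)), axis=1
)
print(df["fulladdress"].tolist())
```

Note that astype(int) raises an error if a zip code is missing, so rows with blank zip codes need to be handled before this step.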

Hope it helps

Turn Excel into GEOJSON

My Capstone project requires me to turn an Excel file into GEOJSON for mapping purposes. Handling and preparing data is 50% of the whole process. I’d like to share what I did step by step, hoping that it will be useful to some who are learning the ropes like I am.

I am using Python as the programming language of choice and PyCharm as the IDE. Create a folder on your computer and store the Excel file in question in it. Open the folder in PyCharm and create a new Python file. Here is how it looks in my PyCharm:

ExceltoGEOJSON_1

Before we move forward, it’s important to know what a GEOJSON is and how it looks. This website offers a great review on GEOJSON. In terms of structure, a GEOJSON file looks like this

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [0, 0]
      },
      "properties": {
        "name": "null island"
      }
    }
  ]
}

As far as I know, we can have as many variables under “properties” as we want. The rest should follow the standard structure as closely as possible.
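To illustrate the point about “properties”, here is a minimal sketch that builds the same structure in Python and prints it as JSON. The helper name make_point_feature and the extra visited property are made up for the example; the “type”, “geometry” and “coordinates” keys mirror the sample above.

```python
import json


def make_point_feature(longitude, latitude, **properties):
    """Build a GeoJSON Point feature; coordinates are [longitude, latitude]."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [longitude, latitude]},
        "properties": properties,  # any number of key/value pairs
    }


collection = {
    "type": "FeatureCollection",
    "features": [make_point_feature(0, 0, name="null island", visited=False)],
}
print(json.dumps(collection, indent=2))
```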

This is how the Excel file looks. Notice that there are coordinates available already. In the future, I’ll work on geocoding an address into coordinates.

ExceltoGEOJSON_2

Let’s start working on the Python code.

import pandas as pd

df = pd.read_excel('CommunityPartner.xls')

Import the “pandas” package. Shorten the package’s name as pd because who would want to repeat a long name many times in the code?

The following line reads the Excel file into a data frame called df. You can name it however you want. Since the Excel file and the Python code are in the same folder, the file name alone is enough. Otherwise, you need to give the full path.

collection = {'type': 'FeatureCollection', 'features': []}

The next step is to create a shell dictionary. Refer back to the sample structure of a GEOJSON file above to see why I structure the collection variable like that.

df['description'] = df[['Address', 'City', 'State', 'Zip']].apply(lambda x: ' '.join(x.astype(str)), axis=1)

Since we don’t have a full address, the above line is to combine four columns together to form a full address string. The next step is to populate the dictionary

def feature_from_row(CommunityPartner, latitude, longitude, fulladdress, Primary, Website, Phone):
    feature = {'type': 'Feature', 'properties': {'PartnerName': '', 'Address': '', 'marker-color': '',
                                                 'Website': '', 'PrimaryMission': '', 'Phone': ''},
               'geometry': {'type': 'Point', 'coordinates': []}
               }
    feature['geometry']['coordinates'] = [longitude, latitude]
    feature['properties']['PartnerName'] = CommunityPartner
    feature['properties']['Address'] = fulladdress
    feature['properties']['Website'] = Website
    feature['properties']['PrimaryMission'] = Primary
    feature['properties']['Phone'] = Phone
    if Primary == "Economic Sufficiency":
        feature['properties']['marker-color'] = "FF5733"
    elif Primary == "Social Justice":
        feature['properties']['marker-color'] = "FFF033"
    elif Primary == "Health and Wellness":
        feature['properties']['marker-color'] = "74FF33"
    elif Primary == "Environmental Stewardship":
        feature['properties']['marker-color'] = "338DFF"
    elif Primary == "Educational Support":
        feature['properties']['marker-color'] = "CE33FF"
    else:
        feature['properties']['marker-color'] = "FF3374"
    collection['features'].append(feature)
    return feature

Create a function that will undertake the data processing. Between the parentheses are the input variables.

feature = {'type': 'Feature', 'properties': {'PartnerName': '', 'Address': '', 'marker-color': '',
                                             'Website': '', 'PrimaryMission': '', 'Phone': ''},
           'geometry': {'type': 'Point', 'coordinates': []}
           }

Create a “feature” variable as above. Try to mirror the standard GEOJSON (see above) in “type” and “geometry” as much as possible. Leave the “coordinates” value empty to fill in later. Under “properties”, list the keys you want.

feature['geometry']['coordinates'] = [longitude, latitude]
feature['properties']['PartnerName'] = CommunityPartner
feature['properties']['Address'] = fulladdress
feature['properties']['Website'] = Website
feature['properties']['PrimaryMission'] = Primary
feature['properties']['Phone'] = Phone

Time to populate the keys. Remember to keep the names of the keys and input variables consistent with what was already posted so far.

You must wonder: what about “marker-color”? You can use a conditional statement to assign values to it as follows:

if Primary == "Economic Sufficiency":
    feature['properties']['marker-color'] = "FF5733"
elif Primary == "Social Justice":
    feature['properties']['marker-color'] = "FFF033"
elif Primary == "Health and Wellness":
    feature['properties']['marker-color'] = "74FF33"
elif Primary == "Environmental Stewardship":
    feature['properties']['marker-color'] = "338DFF"
elif Primary == "Educational Support":
    feature['properties']['marker-color'] = "CE33FF"
else:
    feature['properties']['marker-color'] = "FF3374"

If you wonder about the HTML color code, just Google “HTML Color Code” and you’ll see it.
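As an alternative to the if/elif chain, the same mapping can be written as a dictionary lookup. This is only a sketch of a different style, not what the script in this post uses; the names MISSION_COLORS, DEFAULT_COLOR and marker_color are made up for the example, while the colors are copied from the code above.

```python
# Colors copied from the if/elif chain above.
MISSION_COLORS = {
    "Economic Sufficiency": "FF5733",
    "Social Justice": "FFF033",
    "Health and Wellness": "74FF33",
    "Environmental Stewardship": "338DFF",
    "Educational Support": "CE33FF",
}
DEFAULT_COLOR = "FF3374"  # used for any other mission area


def marker_color(primary):
    """Return the marker color for a mission area, or the default."""
    return MISSION_COLORS.get(primary, DEFAULT_COLOR)


print(marker_color("Social Justice"))
print(marker_color("Something Else"))
```

Adding a new mission area then means adding one line to the dictionary instead of another elif branch.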

collection['features'].append(feature)
return feature

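The first line of the block above appends the feature built from the current row to the “features” key of the collection variable. Since the function runs once per row, every row of the Excel file ends up there. The “return” statement is optional in Python; here it hands the feature back so that the caller can collect the results.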

geojson_series = df.apply(
    lambda x: feature_from_row(x['CommunityPartner'], x['Lat'], x['Longitude'], x['description'], x['Primary'],
                               x['Website'], x['Phone']),
    axis=1)

jsonstring = pd.io.json.dumps(collection)

The first statement passes every row of the Excel file to the function, which builds the features needed for the GEOJSON. The second line turns the collection into a JSON string.
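pd.io.json.dumps is an internal pandas helper and may not be available in every pandas version. As a hedged alternative, the sketch below does the same job with the standard-library json module. The sample rows and the clean helper are made up for the example. The helper converts missing values to None, so they are written as null, and converts NumPy numbers to plain Python ones.

```python
import json

import pandas as pd

# Hypothetical rows standing in for the Excel data.
df = pd.DataFrame({
    "CommunityPartner": ["Partner A", "Partner B"],
    "Lat": [41.27, 41.25],
    "Longitude": [-95.95, -96.01],
    "Website": [None, "https://example.org"],
})


def clean(value):
    """Turn missing values into None and NumPy numbers into plain Python ones."""
    if pd.isna(value):
        return None
    return value.item() if hasattr(value, "item") else value


features = []
for _, row in df.iterrows():
    features.append({
        "type": "Feature",
        "geometry": {"type": "Point",
                     "coordinates": [clean(row["Longitude"]), clean(row["Lat"])]},
        "properties": {"PartnerName": clean(row["CommunityPartner"]),
                       "Website": clean(row["Website"])},
    })

collection = {"type": "FeatureCollection", "features": features}
jsonstring = json.dumps(collection)
print(jsonstring)
```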

output_filename = 'CommunityPartner.geojson' 
with open(output_filename, 'w') as output_file:
    output_file.write(format(jsonstring))

Name the file however you want and use the next two lines to write the string into a GEOJSON file. The complete script looks like this:

import pandas as pd

df = pd.read_excel('CommunityPartner.xls') #Get the Excel file from static/Excel

collection = {'type': 'FeatureCollection', 'features': []}

df['description'] = df[['Address', 'City', 'State', 'Zip']].apply(lambda x: ' '.join(x.astype(str)), axis=1)


def feature_from_row(CommunityPartner, latitude, longitude, fulladdress, Primary, Website, Phone):
    feature = {'type': 'Feature', 'properties': {'PartnerName': '', 'Address': '', 'marker-color': '',
                                                 'Website': '', 'PrimaryMission': '', 'Phone': ''},
               'geometry': {'type': 'Point', 'coordinates': []}
               }
    feature['geometry']['coordinates'] = [longitude, latitude]
    feature['properties']['PartnerName'] = CommunityPartner
    feature['properties']['Address'] = fulladdress
    feature['properties']['Website'] = Website
    feature['properties']['PrimaryMission'] = Primary
    feature['properties']['Phone'] = Phone
    if Primary == "Economic Sufficiency":
        feature['properties']['marker-color'] = "FF5733"
    elif Primary == "Social Justice":
        feature['properties']['marker-color'] = "FFF033"
    elif Primary == "Health and Wellness":
        feature['properties']['marker-color'] = "74FF33"
    elif Primary == "Environmental Stewardship":
        feature['properties']['marker-color'] = "338DFF"
    elif Primary == "Educational Support":
        feature['properties']['marker-color'] = "CE33FF"
    else:
        feature['properties']['marker-color'] = "FF3374"
    collection['features'].append(feature)
    return feature


geojson_series = df.apply(
    lambda x: feature_from_row(x['CommunityPartner'], x['Lat'], x['Longitude'], x['description'], x['Primary'],
                               x['Website'], x['Phone']),
    axis=1)

jsonstring = pd.io.json.dumps(collection)

output_filename = 'CommunityPartner.geojson' #The file will be saved under static/GEOJSON
with open(output_filename, 'w') as output_file:
    output_file.write(format(jsonstring))

ExceltoGEOJSON_3

This is how the GEOJSON looks:

{
   "type":"FeatureCollection",
   "features":[
      {
         "type":"Feature",
         "properties":{
            "PartnerName":"75 North",
            "Address":"4383 Nicholas St Suite 24 Omaha NE 68131.0",
            "marker-color":"FF5733",
            "Website":null,
            "PrimaryMission":"Economic Sufficiency",
            "Phone":"402-502-2770"
         },
         "geometry":{
            "type":"Point",
            "coordinates":[
               -95.957309,
               41.276479
            ]
         }
      },
      {
         "type":"Feature",
         "properties":{
            "PartnerName":"A Time to Heal",
            "Address":"6001 Dodge St CEC 216 Suite 219C  Omaha NE 68182.0",
            "marker-color":"74FF33",
            "Website":null,
            "PrimaryMission":"Health and Wellness",
            "Phone":"402-401-6083" …
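Before using the file on a map, a quick sanity check can save some debugging time. The sketch below is an optional extra, not part of the script above. It parses a small GEOJSON string in the same shape as the output and reports features whose coordinates are missing or look swapped. The function name find_problems is made up, and the check only handles Point features.

```python
import json

# A small GeoJSON string in the same shape as the output above.
geojson_text = """
{"type": "FeatureCollection",
 "features": [
   {"type": "Feature",
    "properties": {"PartnerName": "75 North"},
    "geometry": {"type": "Point", "coordinates": [-95.957309, 41.276479]}}
 ]}
"""


def find_problems(collection):
    """Return a list of human-readable problems found in a FeatureCollection."""
    problems = []
    if collection.get("type") != "FeatureCollection":
        problems.append("top-level type is not FeatureCollection")
    for index, feature in enumerate(collection.get("features", [])):
        coords = feature.get("geometry", {}).get("coordinates", [])
        if len(coords) != 2:
            problems.append(f"feature {index}: expected [longitude, latitude]")
            continue
        longitude, latitude = coords
        if not (-180 <= longitude <= 180 and -90 <= latitude <= 90):
            problems.append(f"feature {index}: coordinates out of range (swapped?)")
    return problems


print(find_problems(json.loads(geojson_text)))
```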

One important note. If you are a fan of Jupyter Notebook, beware that there may be a problem when it comes to the last step of the process. Here is how the collection variable looks before being dumped into the GEOJSON file.

ExceltoGEOJSON_4

But I ran into errors in the last step. I spent quite some time trying to fix them but I couldn’t.

ExceltoGEOJSON_5

Creating the Python code in Pycharm is much easier and produces the same result. It’s even more convenient if you are in the middle of an application development project.

Hope this post helps. Much thanks to appendto and geoffboeing for inspiration.

 

Create a hover effect on Mapbox

I am sharing my experience in trying to create a hover effect on Mapbox. The first thing to do is to read their example and understand what is going on. Let’s unpack a little bit:

<!DOCTYPE html>
<html>
<head>
    <meta charset='utf-8' />
    <title>Create a hover effect</title>
    <meta name='viewport' content='initial-scale=1,maximum-scale=1,user-scalable=no' />
    <script src='https://api.tiles.mapbox.com/mapbox-gl-js/v0.49.0/mapbox-gl.js'></script>
    <link href='https://api.tiles.mapbox.com/mapbox-gl-js/v0.49.0/mapbox-gl.css' rel='stylesheet' />
    <style>
        body { margin:0; padding:0; }
        #map { position:absolute; top:0; bottom:0; width:100%; }
    </style>
</head>

This is the <head> of the HTML, which loads the script and stylesheet from Mapbox. Just copy them as they are and you’ll be fine. Change the text in <title> to have your own page title.

<body>

<div id='map'></div>

--- Your real code goes here ---

</body>

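Your real work will go between <body> and </body>, with the JavaScript inside a <script> tag. The <div> is a container for the map; its id must match the “container” value used when the map is created. The next part is the Mapbox token: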

mapboxgl.accessToken = '<your access token here>';

To get a token, just create a free account on Mapbox. A free account is allowed up to 50,000 requests a month if I am not mistaken. It should be enough for a student or an enthusiast wishing to try it out. Once you have a token, just put it between the quotes in the above line.

Let’s have a base map

var map = new mapboxgl.Map({
    container: 'map',
    style: 'mapbox://styles/mapbox/streets-v9',
    center: [-100.486052, 37.830348],
    zoom: 2
});

The “center” coordinates are where you want the map to focus. Get your chosen destination’s coordinates and put them there. Mapbox expects them in [longitude, latitude] order, so swap the two figures if the map opens in the wrong place. “Zoom” is how closely you look at the chosen destination. The greater the number, the closer the zoom.

var hoveredStateId =  null;

map.on('load', function () {
    map.addSource("states", {
        "type": "geojson",
        "data": "https://www.mapbox.com/mapbox-gl-js/assets/us_states.geojson"
    });

hoveredStateId is a placeholder variable that will be used later for the hover effect. The following code block adds the data source once the map has loaded. Just follow the template. Three things to note here:

  • “states” is the name of the source that contains the GEOJSON data. You can name it whatever you want
  • “geojson” is the type of the source. Mapping requires GEOJSON data, whether you load it from an external source, like we do in this case, or from a hardcoded file
  • The link that goes with “data” is where the author stores the data.

One note here: if you use Github or any cloud platform to store and source your file, be careful. For instance, let’s look at a file I have on github.

Github_Link

Just copying the usual link when you access your file on Github like that won’t work. To get the link that works, click on “Raw” and here is how it shows on the screen

Github_Content_Link

Copy the link in the browser. It should work.
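If you have many files, the raw link can also be built from the regular one, since the two follow a fixed pattern. The Python sketch below shows the conversion. The function name to_raw_url and the repository in the example are made up, and it assumes the usual github.com/<user>/<repo>/blob/<branch>/<path> layout.

```python
from urllib.parse import urlparse


def to_raw_url(github_url):
    """Convert a github.com 'blob' URL into its raw.githubusercontent.com form."""
    parts = urlparse(github_url)
    segments = parts.path.strip("/").split("/")
    # Expected layout: <user>/<repo>/blob/<branch>/<path to file>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError("not a GitHub file URL: " + github_url)
    user, repo, _, *rest = segments  # drop the "blob" segment
    return "https://raw.githubusercontent.com/" + "/".join([user, repo, *rest])


# Hypothetical repository and file, for illustration only.
print(to_raw_url("https://github.com/someuser/somerepo/blob/master/data/partners.geojson"))
```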

Back to the HTML. Add the two “map.addLayer” code sections to what you already have. It should look like the below

map.on('load', function () {
    map.addSource("states", {
        "type": "geojson",
        "data": "https://www.mapbox.com/mapbox-gl-js/assets/us_states.geojson"
    });

    map.addLayer({
        "id": "state-fills",
        "type": "fill",
        "source": "states",
        "layout": {},
        "paint": {
            "fill-color": "#627BC1",
            "fill-opacity": ["case",
                ["boolean", ["feature-state", "hover"], false],
                1,
                0.5
            ]
        }
    });

    map.addLayer({
        "id": "state-borders",
        "type": "line",
        "source": "states",
        "layout": {},
        "paint": {
            "line-color": "#627BC1",
            "line-width": 2
        }
    });

The first addLayer is for the polygon itself while the second one is for the lines between the states. “id” refers to the name of the layer for future reference. Remember to tie the “source” value back to the name of map.addSource. In this case, it’s “states”. The rest is a Mapbox standard template for hover effect. You can change the color however you like.

The next step is to work on “hover effect”. Add the following code to the end of the previous block

    map.on("mousemove", "state-fills", function(e) {
        if (e.features.length > 0) {
            if (hoveredStateId) {
                map.setFeatureState({source: 'states', id: hoveredStateId}, { hover: false});
            }
            hoveredStateId = e.features[0].id;
            map.setFeatureState({source: 'states', id: hoveredStateId}, { hover: true});
        }
    });

    // When the mouse leaves the state-fill layer, update the feature state of the
    // previously hovered feature.
    map.on("mouseleave", "state-fills", function() {
        if (hoveredStateId) {
            map.setFeatureState({source: 'states', id: hoveredStateId}, { hover: false});
        }
        hoveredStateId =  null;
    });

The first thing to notice is here: map.on("mousemove", "state-fills", function(e) {

“state-fills” is the “id” of the polygon layer mentioned previously. So whatever name is chosen for that addLayer, it should be used here.

source: 'states'

In this case, ‘states’ refers to the source of the data in the map.addSource section above. Remember to use the same reference name for the source. The rest is just a standard template. If you have time, feel free to explore. I am under pressure to deliver features for my Capstone, so I just prefer not touching or changing any of it.

Here is an important note; if you don’t follow it, the hover effect won’t work. I used the same code as Mapbox’s example and changed only the GEOJSON source. The hover effect didn’t work.

The key is the data source. Let’s look at the data that Mapbox uses. Here is the tree view of the first item in the polygon array, just to show its structure

GEOJSON_2

Here is the structure of the data I used that led to the unsuccessful “hover effect”

GEOJSON_3

Notice the difference? As far as I can tell, the hover template in question needs the data to have a certain structure. Otherwise, the code won’t work. Now, there should be other ways to go around this, but if you don’t have time, I’d suggest modifying the data to mirror Mapbox’s example. Here is the structure of my modified data

GEOJSON_4

Does the code work? You bet!

GEOJSON_5
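One structural requirement can be read straight from the hover code: it uses e.features[0].id, so each feature needs an “id” at the top level, next to “type” and “properties”. Whether that was the exact difference in my data is an assumption, but if your file lacks those ids, a short Python sketch like the one below can add them. The function name add_feature_ids and the sample polygon are made up for the example.

```python
import json


def add_feature_ids(collection):
    """Give every feature a numeric top-level "id" if it lacks one."""
    # Start at 1: the hover code checks "if (hoveredStateId)", and 0 counts as false.
    for index, feature in enumerate(collection["features"], start=1):
        feature.setdefault("id", index)
    return collection


# Hypothetical data: a polygon feature with no top-level id.
data = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"name": "District A"},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}},
    ],
}
print(json.dumps(add_feature_ids(data)["features"][0], indent=2))
```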

Hopefully this post will be useful to beginners like me.

 

 

Some basic SQL & Plotly in Python

In this post, I’ll document how to use SQL and plotly to create a basic bar chart in Python, specifically in Jupyter Notebook. I’ll try to be as detailed and visual as possible.

The dataset I am going to use is Nike Manufacturing Map:

The first step is to download the dataset as an Excel file. Once the file is downloaded, open it and remove the first row, which holds only the name of the file, not the column names we need. Since I haven’t figured out how to skip the first row in code, the fast way is to delete it by hand before starting, to make our life easier. Save the file and the data is ready to go.
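One way to skip the title row in code instead is the skiprows argument of the pandas readers. The sketch below uses a small in-memory CSV so that it runs without the real file. The title text and the “Factory Name” column are made up, and pd.read_excel accepts the same skiprows argument.

```python
import io

import pandas as pd

# Stand-in for the downloaded file: a title row, then the real header row.
raw = io.StringIO(
    "Nike Manufacturing Map export\n"
    "Factory Name,Total Workers\n"
    "Factory A,1200\n"
    "Factory B,800\n"
)

# skiprows=1 drops the title row so the second row becomes the header.
df = pd.read_csv(raw, skiprows=1)
print(df.columns.tolist())
```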

Open Jupyter Notebook. In my case, I use Anaconda. The first thing to do on a new Jupyter Notebook is to import necessary packages. The “as ABC” is to create an alias for packages with long names so that we won’t have a hard time calling those packages later in the code. Imagine having to type “plotly.graph_objs” 10 times instead of “go”.

import plotly.plotly as py #this package is for visualization (plotly 4+ moved it to the separate chart_studio package)
import plotly.graph_objs as go #this package is for visualization
import pandas as pd #this package is for handling the dataframe
from pandasql import sqldf #This is to allow users to use SQL queries to create a new dataframe
pysqldf = lambda q: sqldf(q, globals())

Packages

Then, load the file. To check if the file is loaded correctly, use df.head(). The command will show a snapshot of the data frame. If you want to see the whole data frame, just use df

df = pd.read_excel('/Users/camapcon/Documents/Github/Nike_Factory_Locations/export.xls') #read the file in
df.head()

Read the file in

We need to change the names of the two columns highlighted in the screenshot. The way that they are named will make it tricky to process SQL queries. I tried it. It’s better to have more straightforward column names with no blank space in between. It’s pretty easy:

df = df.rename(columns={'Nike, Inc. Brand(s)': 'Brands', '% Female Workers': 'FemaleWorkers', 'Total Workers': 'TotalWorkers'})
df.head()

Column name change

The next step is to use SQL to create a summary table whose data will be used to draw bar charts. What we will analyze includes:

  • The total number of workers by brands
  • The number of factories by brands
  • Average number of workers at each factory by brands
  • Average percentage of female workers at each factory by brands

The code to use SQL to create a subset is below:

Primary = """SELECT Brands, Sum(TotalWorkers) as TotalWorkers, Count(Brands) as Count, AVG(TotalWorkers) as AvgTotalWorkers, AVG(FemaleWorkers) as FemaleWorkers From df Group By Brands""" #create a data frame using a SQL query
Primary_df = pysqldf(Primary)
Primary_df

Here is the result:

SQL 1

Notice that the figures aren’t pretty, right? Let’s round them and make them look prettier

Second = """SELECT Brands, TotalWorkers, Count, Round(AvgTotalWorkers, 0) as AvgTotalWorkers, Round(FemaleWorkers,0) as FemaleWorkers From Primary_df"""
Primary_df = pysqldf(Second)
Primary_df

Here is how it looks:

SQL 2
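For comparison, the same summary can be sketched with pandas alone, without pandasql. This is an alternative style, not what the rest of this post uses. The sample rows are made up, and only the column names come from the renamed data frame above.

```python
import pandas as pd

# Hypothetical rows using the renamed columns from above.
df = pd.DataFrame({
    "Brands": ["Nike", "Nike", "Converse"],
    "TotalWorkers": [1000, 500, 300],
    "FemaleWorkers": [80, 60, 70],
})

summary = (
    df.groupby("Brands")
      .agg(
          TotalWorkers=("TotalWorkers", "sum"),      # Sum(TotalWorkers)
          Count=("TotalWorkers", "size"),            # number of rows per brand
          AvgTotalWorkers=("TotalWorkers", "mean"),  # AVG(TotalWorkers)
          FemaleWorkers=("FemaleWorkers", "mean"),   # AVG(FemaleWorkers)
      )
      .round(0)        # same rounding as the second SQL query
      .reset_index()   # turn Brands back into a regular column
)
print(summary)
```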

It’s time to create the bar charts:

trace1 = go.Bar( #create x and y axes
    x=Primary_df['Brands'], #to use column Brands as the X axis
    y=Primary_df['TotalWorkers'], #to use column TotalWorkers as the Y axis
    name='Total Global Workers by Nike Brands'
)
data1 = [trace1]
layout1 = go.Layout( #starting to plot a grouped bar chart
    title='Total Global Workers by Nike Brands',
    barmode='group'
)

fig = go.Figure(data=data1, layout=layout1)
py.iplot(fig, filename='grouped-bar')

Here is how it looks

newplot

You can change the Y axis to another column as you wish. Here is an example:

trace2 = go.Bar( #average percentage of female workers by brand
    x=Primary_df['Brands'],
    y=Primary_df['FemaleWorkers'],
    name='Average Percentage of Female Workers by Nike Brands'
)
data2 = [trace2]
layout2 = go.Layout( #starting to plot a grouped bar chart
    title='Average Percentage of Female Workers by Nike Brands',
    barmode='group'
)

fig = go.Figure(data=data2, layout=layout2)
py.iplot(fig, filename='grouped-bar')

Try to label different charts differently to avoid confusion and errors. It would make debugging easier as well.

newplot-2

That’s all I have for this post. Just some basic SQL queries and Python code to handle and visualize data with Plotly and Pandas. I still need to practice and explore more. And I urge you to do the same, if you are as inexperienced and interested in data analysis as I am.