Turn Excel into GEOJSON

My Capstone project requires me to turn an Excel file into GEOJSON for mapping purposes. Handling and preparing data is about 50% of the whole process. I’d like to share what I did step by step, hoping that it will be useful to others who are learning the ropes like I am.

I am using Python as the programming language of choice and Pycharm as the IDE. Create a folder on your computer and store the Excel file in question in it. Open the folder in Pycharm and create a new Python file. Here is how it looks in my Pycharm:

ExceltoGEOJSON_1

Before we move forward, it’s important to know what GEOJSON is and how it looks. This website offers a great review of GEOJSON. In terms of structure, a GEOJSON file looks like this:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [0, 0]
      },
      "properties": {
        "name": "null island"
      }
    }
  ]
}

As far as I can tell, we can have as many key-value pairs under “properties” as we want. The rest of the structure should follow the standard as closely as possible.
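To make the structure concrete, here is a minimal sketch that builds the same sample as a Python dictionary and checks the pieces a point feature needs. The helper name looks_like_point_collection is my own invention for illustration, not part of any library, and it only checks the handful of keys shown above rather than the full standard.

```python
import json

sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [0, 0]},
            # "properties" can hold whatever keys we like
            "properties": {"name": "null island", "visited": False},
        }
    ],
}


def looks_like_point_collection(obj):
    """Rough structural check (hypothetical helper, not a full validator)."""
    if obj.get("type") != "FeatureCollection":
        return False
    features = obj.get("features")
    if not isinstance(features, list):
        return False
    for feature in features:
        geometry = feature.get("geometry") or {}
        coordinates = geometry.get("coordinates")
        if feature.get("type") != "Feature" or geometry.get("type") != "Point":
            return False
        if not isinstance(coordinates, list) or len(coordinates) < 2:
            return False
        if not isinstance(feature.get("properties"), dict):
            return False
    return True


# Round-trip through JSON text to confirm the dictionary serializes cleanly
round_tripped = json.loads(json.dumps(sample))
is_valid = looks_like_point_collection(round_tripped)
```

A check like this is handy later on, once the dictionary is built from real data instead of typed by hand.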

This is how the Excel file looks. Notice that there are coordinates available already. In the future, I’ll work on geocoding an address into coordinates.

ExceltoGEOJSON_2

Let’s start working on the Python code.

import pandas as pd

df = pd.read_excel('CommunityPartner.xls')

Import the “pandas” package. Shorten the package’s name to pd because who would want to repeat a long name many times in the code?

The next line reads the Excel file into a data frame called df. You can name it however you want. Since the Excel file and the Python code are in the same folder, the file name alone is enough. Otherwise, you need to provide the full path to the file.

collection = {'type': 'FeatureCollection', 'features': []}

The next step is to create a shell dictionary. Refer back to the sample structure of a GEOJSON file above to see why I structure the collection variable like that.

df['description'] = df[['Address', 'City', 'State', 'Zip']].apply(lambda x: ' '.join(x.astype(str)), axis=1)

Since we don’t have a full address column, the line above combines four columns into a full address string. The next step is to populate the dictionary.
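One thing to watch with the address line: in the output further down, the zip code shows up as “68131.0”. My guess is that the Zip column was read as a decimal number, so astype(str) keeps the trailing “.0”. Below is a small sketch of one way to tidy that up. The one-row data frame is made up to mimic the spreadsheet, and clean_zip is a hypothetical helper of mine, not a pandas function.

```python
import pandas as pd

# Made-up single row mimicking the spreadsheet; Zip arrives as a float
df = pd.DataFrame({
    "Address": ["4383 Nicholas St Suite 24"],
    "City": ["Omaha"],
    "State": ["NE"],
    "Zip": [68131.0],
})


def clean_zip(value):
    """Hypothetical helper: turn 68131.0 into '68131', leave text untouched."""
    if pd.isna(value):
        return ""
    if isinstance(value, float) and value.is_integer():
        # zfill restores leading zeros lost when the zip was read as a number
        return str(int(value)).zfill(5)
    return str(value)


address_columns = ["Address", "City", "State", "Zip"]

# Same join as in the post, before and after cleaning the Zip column
before = df[address_columns].apply(lambda x: " ".join(x.astype(str)), axis=1)
df["Zip"] = df["Zip"].map(clean_zip)
after = df[address_columns].apply(lambda x: " ".join(x.astype(str)), axis=1)
```

With the address in place, the function below populates the dictionary.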

def feature_from_row(CommunityPartner, latitude, longitude, fulladdress, Primary, Website, Phone):
    feature = {'type': 'Feature', 'properties': {'PartnerName': '', 'Address': '', 'marker-color': '',
                                                 'Website': '', 'PrimaryMission': '', 'Phone': ''},
               'geometry': {'type': 'Point', 'coordinates': []}
               }
    feature['geometry']['coordinates'] = [longitude, latitude]
    feature['properties']['PartnerName'] = CommunityPartner
    feature['properties']['Address'] = fulladdress
    feature['properties']['Website'] = Website
    feature['properties']['PrimaryMission'] = Primary
    feature['properties']['Phone'] = Phone
    if Primary == "Economic Sufficiency":
        feature['properties']['marker-color'] = "FF5733"
    elif Primary == "Social Justice":
        feature['properties']['marker-color'] = "FFF033"
    elif Primary == "Health and Wellness":
        feature['properties']['marker-color'] = "74FF33"
    elif Primary == "Environmental Stewardship":
        feature['properties']['marker-color'] = "338DFF"
    elif Primary == "Educational Support":
        feature['properties']['marker-color'] = "CE33FF"
    else:
        feature['properties']['marker-color'] = "FF3374"
    collection['features'].append(feature)
    return feature

Create a function that will undertake the data processing. Between the parentheses are the input variables.

feature = {'type': 'Feature', 'properties': {'PartnerName': '', 'Address': '', 'marker-color': '',
                                             'Website': '', 'PrimaryMission': '', 'Phone': ''},
           'geometry': {'type': 'Point', 'coordinates': []}
           }

Create a “feature” variable as above. Try to mirror the standard GEOJSON structure (see above) in “type” and “geometry” as much as possible. Leave the “coordinates” value empty to fill in later. Under “properties”, list the keys you want.

feature['geometry']['coordinates'] = [longitude, latitude]
feature['properties']['PartnerName'] = CommunityPartner
feature['properties']['Address'] = fulladdress
feature['properties']['Website'] = Website
feature['properties']['PrimaryMission'] = Primary
feature['properties']['Phone'] = Phone

Time to populate the keys. Remember to keep the names of the keys and input variables consistent with what has been used so far.
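Notice that the coordinates go in as [longitude, latitude], which is the order GEOJSON expects, even though most of us say “lat, long” out loud. Swapping them is an easy mistake, so here is a small sketch of a sanity check. The function check_lon_lat is a hypothetical helper of mine, and it only catches swaps where the value sitting in the latitude slot falls outside the -90 to 90 range.

```python
def check_lon_lat(coordinates):
    """Hypothetical helper: raise if a [longitude, latitude] pair is out of range."""
    longitude, latitude = coordinates
    if not -180 <= longitude <= 180:
        raise ValueError(f"longitude out of range: {longitude}")
    if not -90 <= latitude <= 90:
        raise ValueError(f"latitude out of range: {latitude}")
    return coordinates


# A point in Omaha, in GEOJSON order: longitude first, then latitude
good = check_lon_lat([-95.957309, 41.276479])

# The same point with the two values swapped by mistake
try:
    check_lon_lat([41.276479, -95.957309])
    swapped_detected = False
except ValueError:
    swapped_detected = True
```

For Omaha the swap is caught because -95.957309 is not a possible latitude. For places where both numbers are between -90 and 90, the check passes either way, so it is still worth looking at the map.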

You may wonder: what about “marker-color”? You can use a conditional statement to assign a value to it as follows:

if Primary == "Economic Sufficiency":
    feature['properties']['marker-color'] = "FF5733"
elif Primary == "Social Justice":
    feature['properties']['marker-color'] = "FFF033"
elif Primary == "Health and Wellness":
    feature['properties']['marker-color'] = "74FF33"
elif Primary == "Environmental Stewardship":
    feature['properties']['marker-color'] = "338DFF"
elif Primary == "Educational Support":
    feature['properties']['marker-color'] = "CE33FF"
else:
    feature['properties']['marker-color'] = "FF3374"

If you wonder about the HTML color codes, just Google “HTML Color Code” and you’ll find them.
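If the list of missions grows, the if/elif chain gets long. An alternative I could have used is a dictionary lookup with a fallback color. This is only a sketch: the names MISSION_COLORS, DEFAULT_COLOR and color_for are mine, while the color values are copied from the chain above.

```python
# Colors copied from the if/elif chain above
MISSION_COLORS = {
    "Economic Sufficiency": "FF5733",
    "Social Justice": "FFF033",
    "Health and Wellness": "74FF33",
    "Environmental Stewardship": "338DFF",
    "Educational Support": "CE33FF",
}
DEFAULT_COLOR = "FF3374"  # same value as the final else branch


def color_for(primary):
    """Return the marker color for a mission, or the default if unknown."""
    return MISSION_COLORS.get(primary, DEFAULT_COLOR)


health_color = color_for("Health and Wellness")
unknown_color = color_for("Something Else")
```

Inside the function, the whole chain would then shrink to a single assignment to feature['properties']['marker-color'].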

collection['features'].append(feature)
return feature

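The first line of the block above appends the feature built from the current row to the “features” key of the collection variable. Since the function is called once per row, every row of the Excel file ends up in the collection. The “return” statement hands the finished feature back to the caller. It is not strictly mandatory in Python (a function without it returns None), but returning the feature makes the function easier to reuse and test.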
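Because the function returns the feature, it does not really need to touch the collection variable at all. Here is a sketch of a variant that only builds and returns, with the collection assembled afterwards. The name make_feature is hypothetical, the rows are plain dictionaries such as the ones df.to_dict('records') would produce, and I left out “marker-color” to keep the example short.

```python
def make_feature(row):
    """Hypothetical variant of feature_from_row that only builds and returns."""
    return {
        "type": "Feature",
        "properties": {
            "PartnerName": row["CommunityPartner"],
            "Address": row["description"],
            "Website": row["Website"],
            "PrimaryMission": row["Primary"],
            "Phone": row["Phone"],
        },
        "geometry": {
            "type": "Point",
            # GEOJSON order: longitude first, then latitude
            "coordinates": [row["Longitude"], row["Lat"]],
        },
    }


# One sample row using the column names from the spreadsheet
rows = [
    {
        "CommunityPartner": "75 North",
        "description": "4383 Nicholas St Suite 24 Omaha NE 68131",
        "Website": None,
        "Primary": "Economic Sufficiency",
        "Phone": "402-502-2770",
        "Lat": 41.276479,
        "Longitude": -95.957309,
    }
]

collection = {
    "type": "FeatureCollection",
    "features": [make_feature(row) for row in rows],
}
```

The benefit is that the function can be tested on a single row without any global variable being set up first.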

geojson_series = df.apply(
    lambda x: feature_from_row(x['CommunityPartner'], x['Lat'], x['Longitude'], x['description'], x['Primary'],
                               x['Website'], x['Phone']),
    axis=1)

jsonstring = pd.io.json.dumps(collection)

The first statement passes every single row of the Excel file to the function so that we can build the features needed for the GEOJSON. The second statement turns the collection into a JSON string.
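As far as I know, pd.io.json.dumps is an internal pandas helper rather than a documented function, so it may not exist in every pandas version. If it is missing on your machine, a rough alternative is Python’s built-in json module. The catch is that empty Excel cells come through as NaN, which is not valid JSON, so they have to be turned into None first. The sketch below shows one way to do that; nan_to_none is a hypothetical helper of mine and the small collection is made up for the example.

```python
import json
import math


def nan_to_none(value):
    """Recursively replace float NaN with None so it serializes as null."""
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, dict):
        return {key: nan_to_none(item) for key, item in value.items()}
    if isinstance(value, list):
        return [nan_to_none(item) for item in value]
    return value


# Made-up collection with a missing website, as an empty Excel cell would give
collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"PartnerName": "75 North", "Website": float("nan")},
            "geometry": {"type": "Point", "coordinates": [-95.957309, 41.276479]},
        }
    ],
}

# allow_nan=False makes json.dumps raise instead of silently writing NaN
jsonstring = json.dumps(nan_to_none(collection), allow_nan=False)
```

One more caveat: the json module does not know how to serialize NumPy integers, so whole-number columns may need converting to plain int before this step.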

output_filename = 'CommunityPartner.geojson' 
with open(output_filename, 'w') as output_file:
    output_file.write(format(jsonstring))

Name the file however you want; the last two lines write the JSON string to the GEOJSON file. The complete script looks like this:

import pandas as pd

df = pd.read_excel('CommunityPartner.xls') #Get the Excel file from static/Excel

collection = {'type': 'FeatureCollection', 'features': []}

df['description'] = df[['Address', 'City', 'State', 'Zip']].apply(lambda x: ' '.join(x.astype(str)), axis=1)


def feature_from_row(CommunityPartner, latitude, longitude, fulladdress, Primary, Website, Phone):
    feature = {'type': 'Feature', 'properties': {'PartnerName': '', 'Address': '', 'marker-color': '',
                                                 'Website': '', 'PrimaryMission': '', 'Phone': ''},
               'geometry': {'type': 'Point', 'coordinates': []}
               }
    feature['geometry']['coordinates'] = [longitude, latitude]
    feature['properties']['PartnerName'] = CommunityPartner
    feature['properties']['Address'] = fulladdress
    feature['properties']['Website'] = Website
    feature['properties']['PrimaryMission'] = Primary
    feature['properties']['Phone'] = Phone
    if Primary == "Economic Sufficiency":
        feature['properties']['marker-color'] = "FF5733"
    elif Primary == "Social Justice":
        feature['properties']['marker-color'] = "FFF033"
    elif Primary == "Health and Wellness":
        feature['properties']['marker-color'] = "74FF33"
    elif Primary == "Environmental Stewardship":
        feature['properties']['marker-color'] = "338DFF"
    elif Primary == "Educational Support":
        feature['properties']['marker-color'] = "CE33FF"
    else:
        feature['properties']['marker-color'] = "FF3374"
    collection['features'].append(feature)
    return feature


geojson_series = df.apply(
    lambda x: feature_from_row(x['CommunityPartner'], x['Lat'], x['Longitude'], x['description'], x['Primary'],
                               x['Website'], x['Phone']),
    axis=1)

jsonstring = pd.io.json.dumps(collection)

output_filename = 'CommunityPartner.geojson' #The file will be saved under static/GEOJSON
with open(output_filename, 'w') as output_file:
    output_file.write(format(jsonstring))

ExceltoGEOJSON_3

This is how the GEOJSON looks:

 
{
   "type":"FeatureCollection",
   "features":[
      {
         "type":"Feature",
         "properties":{
            "PartnerName":"75 North",
            "Address":"4383 Nicholas St Suite 24 Omaha NE 68131.0",
            "marker-color":"FF5733",
            "Website":null,
            "PrimaryMission":"Economic Sufficiency",
            "Phone":"402-502-2770"
         },
         "geometry":{
            "type":"Point",
            "coordinates":[
               -95.957309,
               41.276479
            ]
         }
      },
      {
         "type":"Feature",
         "properties":{
            "PartnerName":"A Time to Heal",
            "Address":"6001 Dodge St CEC 216 Suite 219C  Omaha NE 68182.0",
            "marker-color":"74FF33",
            "Website":null,
            "PrimaryMission":"Health and Wellness",
            "Phone":"402-401-6083" …
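A quick way to confirm that the file is valid JSON is to read it back in. The sketch below writes a tiny made-up collection and loads it again with the built-in json module. It saves to a temporary folder only so that the example does not overwrite a real file; that part is my assumption, and in the actual script you would keep your own output_filename.

```python
import json
import os
import tempfile

# Tiny made-up collection standing in for the real one
collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"PartnerName": "75 North", "Website": None},
            "geometry": {"type": "Point", "coordinates": [-95.957309, 41.276479]},
        }
    ],
}

# Write to a temporary folder so nothing real gets overwritten
output_filename = os.path.join(tempfile.mkdtemp(), "CommunityPartner.geojson")
with open(output_filename, "w") as output_file:
    json.dump(collection, output_file)

# Read the file back; json.load raises an error if the content is not valid JSON
with open(output_filename) as input_file:
    loaded = json.load(input_file)

feature_count = len(loaded["features"])
first_name = loaded["features"][0]["properties"]["PartnerName"]
```

If the feature count matches the number of rows in the Excel file, the conversion most likely went through.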

One important note. If you are a fan of Jupyter Notebook, beware that there may be a problem when it comes to the last step of the process. Here is how the collection variable looks before being dumped into the GEOJSON file.

ExceltoGEOJSON_4

But I ran into errors in the last step. I spent quite some time trying to fix them but I couldn’t.

ExceltoGEOJSON_5

Creating the Python code in Pycharm is much easier and produces the same result. It’s even more convenient if you are in the middle of an application development project.

Hope this post helps. Many thanks to appendto and geoffboeing for inspiration.

 
