How to create Rolling average in Power BI without data tailing off in the end

Today, I will talk about how to create a rolling average in Power BI and how to stop the line from tailing off at the end.

Rolling average is a technique used to address the short term volatility and fluctuation of data. In this example, I used the Apple Mobility Data for the state of New York. The data is from 1st March 2020 to 14th June 2020, which is the latest date when the data is available as of this writing. My dataset has only two columns: date and mobility. Here is what it looks like in Power BI

Apple Mobility data for 11th and 12th May 2020 is not available, but you don’t see the gap on the graph above because I smoothed it out by setting the date field to “continuous”.

Creating a rolling average in Power BI is actually quite simple. Click on “New Quick Measure” and choose “Rolling Average”, then you just need to fill in the details

Depending on what you are trying to do, you can set an appropriate period as well as the “periods before” and “periods after” fields. In this example, I am looking at a 30-day rolling average. Therefore, I set it up that way.

Your data will look like this. There are a few things worth pointing out:

  • To make a new quick measure work, the time variable on the X-axis has to be a date hierarchy
  • That’s why you no longer see the whole graph on the screen. Instead, to see every data point, you now have to scroll horizontally
  • Even though the data stops on 14th June 2020, the graph doesn’t stop until 14th July 2020. It is because right now, you don’t tell Power BI when to stop projecting the data. The value on 14th July 2020 is exactly the value of 14th June 2020 and the upward trend is misleading because the more the graph moves to the right, the fewer data points there are.

If you look at the code for the New Quick Measure, here is what it looks like

To fix the tailing off issue, you just need to modify the code a little bit in the “Return” part, as follows:

The result will look like this

You can see the original data with its own short-term fluctuation in orange and the 30-day rolling average in the blue line. After the code is modified, the blue line now stops on 14th June 2020.
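
For readers who want to see the logic outside Power BI, here is a minimal Python sketch of the same idea. It is not the DAX that the Quick Measure generates. It assumes a trailing window and made-up readings, and the function name rolling_average and the sample data are hypothetical. The key step mirrors the fix described above: return a blank for any date after the last date that has data.

```python
from datetime import date, timedelta

def rolling_average(values, axis_dates, window_days=30):
    """Trailing rolling average that stops at the last date with data.

    values: dict mapping date -> reading (dates without data are absent)
    axis_dates: every date shown on the chart's X-axis
    """
    last_date = max(values)  # last date that actually has data
    result = {}
    for day in axis_dates:
        if day > last_date:
            # Past the end of the data: return a blank instead of averaging
            # a shrinking set of old points (this is the tailing off).
            result[day] = None
            continue
        window = [values[day - timedelta(days=k)]
                  for k in range(window_days)
                  if (day - timedelta(days=k)) in values]
        result[day] = sum(window) / len(window) if window else None
    return result

# Hypothetical data: 10 days of readings, chart axis runs 5 days longer.
start = date(2020, 6, 5)
readings = {start + timedelta(days=i): 100 + i for i in range(10)}
axis = [start + timedelta(days=i) for i in range(15)]
smoothed = rolling_average(readings, axis, window_days=3)
```

Without the check against last_date, the five extra dates on the axis would each get an average built from fewer and fewer old readings, which is the misleading tail seen in the first chart.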

To enhance user experience by eliminating the need to scroll horizontally, make sure that the date hierarchy also has “Year” as you can see below

Don’t you think the line chart looks smoother and better now?

Hope this little tutorial helps

Weekly readings – 21st December 2019

Argument against direct listings

What Happens After Prisoners Learn to Code?

Google Culture War Escalates as Era of Transparency Wanes


The Wilderness of Suburban Saigon in 1904. Source: Saigoneer

Popcorn is a serious business at AMC theaters

Why Kansas City’s Free Transit Experiment Matters. Regardless of how this experiment will turn out, it will provide a valuable case study, data and motivation for other cities.

The Man Who Built Amazon’s Delivery Machine

The curtain on Vision Fund and Masa was pulled back a little bit more.

The fall from an icon of Sheryl Sandberg

The horrifying truth behind the track of our location data

Good practices in coding

Like many things in our society, there is also recommended etiquette in coding. There are two practices, in particular, that I find important and useful.

First, it’s beneficial to painstakingly document your code. At the beginning of any program, jot down some lines on what the program is about. Then, before any function, write something about it. If you give aliases to variables or tables that have long names, put down some notes as well. If there is any logic behind the code, make it visible to others too. Oftentimes, folks may understand the mechanics of the code, but don’t understand what the code actually does since they don’t understand the logic.

Below is an excerpt from a document in one of my first coding classes. In our assignments, if we forgot to document our code, we would have 5-10% of our grade taken away.

As highlighted in the screenshot, detailed documentation is very helpful not only to others looking at your code, but also to yourself later on. If a program is complex and there is no documentation, you’ll find it more difficult than it should be to refresh your memory on the code. I have been there and I don’t even write complex code!

Above is an example I had from my programming class. In practice, it doesn’t need to be that detailed, but the description section and the date are necessary in my opinion.
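
As a rough illustration of that lighter style, here is a small Python sketch with a header that carries a description and a date, plus a documented function. The file name, the function and the placeholder fields are all hypothetical; it only shows the shape of the documentation, not a required template.

```python
"""
Program: shot_summary.py (hypothetical example)
Description: Computes the share of made shots from a list of shot results.
Author: <your name>
Date: <date written>
"""

def made_share(results):
    """Return the fraction of shots that were made.

    results: list of booleans, True for a made shot, False for a miss.
    Returns 0.0 for an empty list so callers do not have to guard against
    division by zero.
    """
    if not results:
        return 0.0
    return sum(results) / len(results)

share = made_share([True, False, True, True])
```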

The second practice that I think is useful is to format the code. Normally, we tend to get carried away while coding and neglect how the whole program actually looks. Lines are not aligned. Blocks of code are nested and difficult to read. Brackets are all over the place, making it challenging to debug and understand the code. What I usually do is that after I am sure my program works as expected, I search for a website to help with the formatting of code (it’s easy, just google, for instance, HTML formatter) and have the website re-format the code so that it’s easier to digest.

Python: Most frequent words in a string

This post is my practice on getting the most frequent words in a string in Python. Here is the code

#import the necessary packages
from collections import Counter
import pandas as pd
#open the text file. Here in this case, it is named "text"
with open('text.txt') as fin:
    counter = Counter(fin.read().strip().split())

numbers = sorted(counter.most_common(), key=lambda student: student[1], reverse=True)
top15 = numbers[0:15]

counter.most_common() returns all the words and their respective counts from the string, already sorted from most to least frequent. If you put a number, let’s say 10, in the brackets, you get only the 10 most frequent words. Here is how counter.most_common(10) looks:

[('to', 27), ('the', 26), ('of', 20), ('and', 19), ('in', 16), ('a', 15), ('is', 12), ('for', 11), ('it', 9), ('new', 9)]

numbers = sorted(counter.most_common(), key=lambda student: student[1], reverse=True)
top15 = numbers[0:15]

The above code gets all the words and sorts them in descending order of frequency. top15 holds the first 15 elements of the sorted list. Here is how top15 looks:

[('to', 27), ('the', 26), ('of', 20), ('and', 19), ('in', 16), ('a', 15), ('is', 12), ('for', 11), ('it', 9), ('new', 9), ('are', 9), ('their', 8), ('video', 7), ('has', 7), ('by', 7)]
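
Because most_common() already returns its results in descending order of count, the explicit sort is optional. Here is a small sketch of the shorter route; the sample sentence is made up and stands in for the contents of text.txt.

```python
from collections import Counter

# Hypothetical sample text standing in for the contents of text.txt.
sample = "the cat and the dog and the bird"

counter = Counter(sample.split())

# most_common(n) already returns the n most frequent words in descending
# order of count, so no extra sorted() call is needed.
top3 = counter.most_common(3)
```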

After we get the top 15, we should put them into a data frame so that data processing can be easier. Here is how

text = [] #an array for the words
number = [] #an array for the frequency
for i in top15: #iterate through the top 15
   text.append(i[0])
   number.append(i[1])

#create the data frame
rawdata = {'words': text, 'frequency': number}
df = pd.DataFrame(rawdata, columns = ['words', 'frequency'])

This is the final data frame

    words  frequency
0      to         27
1     the         26
2      of         20
3     and         19
4      in         16
5       a         15
6      is         12
7     for         11
8      it          9
9     new          9
10    are          9
11  their          8
12  video          7
13    has          7
14     by          7

Data Analytics: Klay Thompson’s Performance

This is my data analytics practice, analyzing Klay Thompson’s performance so far in the 2018-2019 season up to 22nd Dec 2018. Klay Thompson is the shooting guard of the Golden State Warriors. He is a three-time NBA champion and I am a big fan of his playing style and deadly explosiveness. This post features my findings from analyzing his shot data this season from the NBA website here. My code is available on my personal GitHub for your reference.

Findings

  • Klay made about 44% of his shots so far
  • Klay’s successful shots’ average distance to the basket is 15.92 feet
  • He made more shots in the first half than he did in the second half
  • 67% of Klay’s made shots are two pointers. The rest are three pointers
  • Living up to his name, Klay’s favorite play type is “catch and shoot jump shot”
  • Regarding Klay’s made two-pointers, below is the distribution by distance. He seems to be more effective within 10 feet of the basket and from 15 to 20 feet.
  • In regards to Klay’s three pointers, the distribution by distance to the basket is as follows: (no surprise that the farther he is from the basket, the less lethal he is)

  • As one of the best three point shooters in the league, Klay seems to be equally good throughout the periods of a game, except for the first quarter

Technical lessons I learned from this practice:

Pie chart in Python with Matplotlib

Let’s say you have two variables: TwoPT and ThreePT that stand for the shooting percentage of Klay’s two and three pointers respectively. Here is the code to draw a pie chart

import matplotlib.pyplot as plt

labels = '2PT Field Goal', '3PT Field Goal'
sizes = [TwoPT, ThreePT]
colors = ['green', 'gold']
explode = (0, 0)  # no slice is pulled out; use (0.1, 0) to explode the 1st slice
 
# Plot
plt.pie(sizes, explode=explode, labels=labels, colors=colors,
        autopct='%1.1f%%', shadow=True, startangle=140)
 
plt.axis('equal')
plt.title("Klay's made shots by shot types")
plt.show()

Nunique function

Imagine if you have a data frame as the following

If you want to count Klay’s events (whether missed or made shots) by period, instead of using SQL, the alternative is to use the nunique function, which counts the distinct values in each group. An advantage of using the aggregate is that the outcome is automatically a data frame. The code is as follows:

#the data frame's name is madeshot; pd is the usual alias for pandas
periodstats = madeshot.groupby(by='period', as_index=False).agg({"game_date": pd.Series.nunique, 'time_remaining': pd.Series.nunique})

The result is:
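
For readers without the data at hand, the following plain-Python sketch shows what a distinct count per group amounts to. The rows are invented and only mimic the period and game_date columns; it is an illustration of the idea, not of how pandas computes it.

```python
from collections import defaultdict

# Hypothetical rows shaped like the madeshot data frame: (period, game_date).
rows = [
    (1, "2018-10-16"), (1, "2018-10-16"), (1, "2018-10-19"),
    (2, "2018-10-16"), (2, "2018-10-19"), (2, "2018-10-21"),
]

# Collect the distinct game dates seen in each period.
dates_by_period = defaultdict(set)
for period, game_date in rows:
    dates_by_period[period].add(game_date)

# Equivalent of nunique: the number of distinct values per group.
unique_dates = {period: len(dates) for period, dates in dates_by_period.items()}
```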

Sort and get the top 10 of a data frame

If your data frame looks like the one below and your intention is to get the top 10 records in terms of “times”, what will you do?


The code I used is pretty straightforward and simple. (The data frame’s name is shotdistance.)

shotdistance = shotdistance.sort_values(by='times', ascending=False)
shotdistance_top10 = shotdistance.head(10)

Categorize a data frame by bins

If you want to categorize Klay’s shot by distance in terms of “less than 10 feet”, “from 10 to 15 feet” and “from 15 to 20 feet”, for instance, what will you do? The code to turn the distance to categories is:

df1 = pd.cut(TwoPTtype['shot_distance'], bins=[0, 10, 15, 20, 23], include_lowest=True, labels=['Less than 10 feet', 'From 10 to 15 feet', 'From 15 to 20 feet', 'From 20 to 23 feet'])

#pd stands for Pandas
#TwoPTtype is the name of the data frame in question

The result is:

If you merge that data frame with the frequencies in the original data frame:

df1 = pd.cut(TwoPTtype['shot_distance'], bins=[0, 10, 15, 20, 23], include_lowest=True, labels=['Less than 10 feet', 'From 10 to 15 feet', 'From 15 to 20 feet', 'From 20 to 23 feet'])

newdf = pd.concat([df1, TwoPTtype['times']], axis=1)
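
To make the binning rule concrete, here is a plain-Python sketch that follows the same edges and labels. It assumes the pd.cut defaults used above: each bin includes its right edge, and include_lowest=True lets the first bin also include 0. The function name categorize is hypothetical.

```python
from bisect import bisect_left

edges = [0, 10, 15, 20, 23]
labels = ['Less than 10 feet', 'From 10 to 15 feet',
          'From 15 to 20 feet', 'From 20 to 23 feet']

def categorize(distance):
    """Return the label of the right-closed bin that contains distance."""
    if distance == edges[0]:
        return labels[0]  # include_lowest=True: the left edge joins the first bin
    index = bisect_left(edges, distance) - 1
    if 0 <= index < len(labels):
        return labels[index]
    return None  # outside all bins, where pd.cut would give NaN

examples = [categorize(d) for d in (0, 10, 12, 23, 30)]
```

Note that a shot from exactly 10 feet lands in the first bin even though its label says “Less than 10 feet”, because the bins are closed on the right.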

Checking if a point is in a polygon in Python and Javascript

This post will detail how to determine if a point with a longitude and a latitude resides within a polygon – an area comprised of multiple coordinates. There are two ways to accomplish the task, either in Python or in Javascript. 

In Javascript – Turf package

In order to accomplish the task in Javascript, I used the turf package. The first thing to do is to add this line to the HTML where the task takes place

<script src='https://npmcdn.com/@turf/turf/turf.min.js'></script>

Then, this is how the task is completed:

var point = turf.point([longitude, latitude]); // the point in question

var polygon = turf.polygon(/* the polygon in question */); // an array of linear rings, each an array of [longitude, latitude] points whose first and last points are identical

turf.booleanPointInPolygon(point, polygon); // returns either true or false

In Python

First, install the shapely package (for example with pip install shapely), then import the following:

from shapely.geometry import shape, Point

Then, this is how the task is accomplished:

coord = Point([longitude, latitude]) #this is the point in question

polygon = shape(feature['geometry']) #feature is one item from the "features" list of a typical GEOJSON

polygon.contains(coord) #should return either True or False
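
If you are curious what such a check does under the hood, here is a minimal ray-casting sketch in plain Python. It is not how shapely or Turf are implemented; it treats coordinates as flat x/y values, ignores holes, and does not define what happens for points exactly on an edge. The function name and the sample square are hypothetical.

```python
def point_in_polygon(longitude, latitude, ring):
    """Ray-casting test: is the point inside the polygon ring?

    ring: list of (longitude, latitude) vertices; the ring may or may not
    repeat its first vertex at the end.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > latitude) != (y2 > latitude):
            # Longitude where the edge crosses that horizontal line.
            crossing = x1 + (latitude - y1) * (x2 - x1) / (y2 - y1)
            if longitude < crossing:
                inside = not inside  # each crossing to the right flips the state
    return inside

# Hypothetical square around Omaha, NE.
square = [(-96.2, 41.1), (-95.8, 41.1), (-95.8, 41.4), (-96.2, 41.4)]
in_square = point_in_polygon(-95.957309, 41.276479, square)
out_of_square = point_in_polygon(-94.0, 41.276479, square)
```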

Hope it is helpful to whoever is looking for an answer to this issue 

Mapbox – Two ways to show a map

For the past few months, I have been charged with visualizing data onto maps using either Google Maps or Mapbox. I chose the latter. After days and nights of struggle, I am pretty close to the finishing line and have gained quite a bit of experience in Mapbox that I want to share. 

Long story short, to map data onto maps, you need to structure data into a specific structure called GEOJSON. It looks like this:

You can put anything in the “properties” key, but the rest essentially have to follow the above format. The coordinates will be used to locate markers or data on the map. 
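
As a rough sketch of that structure, here is a minimal GEOJSON built as a Python dictionary and serialized with the standard json module. The “Mission Area” key matches the property used later in this post; the partner name and the variable names are made up for illustration.

```python
import json

# A minimal GEOJSON FeatureCollection with a single point feature.
community_data = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GEOJSON order is [longitude, latitude]
                "coordinates": [-95.957309, 41.276479],
            },
            # Anything can go under "properties"; these keys are illustrative.
            "properties": {"Name": "Example partner", "Mission Area": "Social Justice"},
        }
    ],
}

geojson_text = json.dumps(community_data, indent=2)
```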

Let’s say the data that I want to map has 6 different mission areas (see the screenshot above) and my job is to map them onto the map in 6 different colors. 

I have an array that contains 6 different mission areas like this and 6 different colors representing those areas

var missionarea = [area1, area2, area3, area4, area5, area6]
var colorcode = [color1, color2, color3, color4, color5, color6]

One layer approach

//create a map canvas

var map = new mapboxgl.Map({
    container: 'map',
    style: 'mapbox://styles/mapbox/light-v9',
    center: [-96.797885, 39.363438],
    // initial zoom
    zoom: 3.3
});

//load data and create the layer

map.on("load", function() {

    map.addSource('communityData', {   //name the data as communityData
        type: 'geojson',
        data: communityData, //this is the name of your GEOJSON
    });

    map.addLayer({      //add the layer
        "id": "commMap", //your single layer's name
        "type": "circle", //meaning that each item will be a dot
        "source": "communityData", //the name of your data assigned above
        'layout': {},
        'paint': {
            "circle-radius": 8,
            "circle-opacity": 1,
            "circle-color": {
                "property": "Mission Area",
                "type": 'categorical',
                "stops": [ //assign color to respective mission areas
                    [missionarea[0], colorcode[0]],
                    [missionarea[1], colorcode[1]],
                    [missionarea[2], colorcode[2]],
                    [missionarea[3], colorcode[3]],
                    [missionarea[4], colorcode[4]],
                    [missionarea[5], colorcode[5]],
                ]
            }
        }
    });
});

Multiple Layers

In this example, I’ll label each map layer as “show0”, “show1”, … “show5”

var showlist = [] //the array of layerIDs, used to categorize layers
var base = "show"
for (var i = 0; i < missionarea.length; i++) { //populate showlist array
    var text = base + i
    showlist.push(text)
}
var map = new mapboxgl.Map({
    container: 'map',
    style: 'mapbox://styles/mapbox/light-v9',
    center: [-95.957309, 41.276479],
    // initial zoom
    zoom: 6
});

map.on("load", function() {

    map.addSource('communityData', {
        type: 'geojson',
        data: communityData,
    });
    //*********************************** Load partners *****************************************************

    communityData.features.forEach(function(feature) {
        var primary = feature.properties["Mission Area"];
        // iterate through the mission area array
        for (var i = 0; i < missionarea.length; i++) {
            if (primary == missionarea[i]) {
                // assign a name to each layer
                var layerID = showlist[i];
                if (!map.getLayer(layerID)) {
                    map.addLayer({
                        "id": layerID, //layer's name
                        "type": "circle",
                        "source": "communityData",
                        "paint": {
                            "circle-radius": 8,
                            "circle-opacity": 1,
                            "circle-color": colorcode[i], //color
                        },
                        "filter": ["all", ["==", "Mission Area", primary]]
                    })
                }
            }
        }
    });
});

Which approach is better?

If the intention is just to show the dots, there is no difference and the choice depends on personal preference. However, if your code gets more complicated (in my case, I had to create at least 6 filters on the map), things will get messy and one of the approaches will no longer allow you to do what you want. Unfortunately, I don’t have that much experience yet to tell you more and I personally believe it’s a case-by-case thing.

Tool: Repl.it

I recently and fortunately came across a very interesting tool called Repl.it. Here is what it brings to the table:

Usually, the normal steps in programming include writing code in an IDE such as Pycharm or Eclipse, uploading it to a repository such as GitHub and deploying it to a PaaS like Heroku or PythonAnywhere. However, even an IDE such as Pycharm requires some installation and housekeeping that can seem daunting to beginners.

Repl.it lowers that entry barrier. It allows coding in many popular languages right from a browser. Below is a quick piece of code I wrote to have a dropdown menu from 1 to 49:

repl

All it takes is an Internet connection, a browser and a one-minute sign-up.

As of now, Repl.it seems to be focused on students. It’s free and its premium packages are very student-friendly. The Classroom Pro package is only $1/student/month. I think coding is fun and Repl.it seems to be highly useful in making coding accessible.

I am not an investor in the tool or one of its employees. Just a fan. I am glad that the startup recently raised some funding from the VCs.

Turn Excel into GEOJSON

My Capstone project requires me to turn Excel into GEOJSON for mapping purposes. Handling and preparing data is 50% of the whole process. I’d like to share what I did step by step, hoping that it will be useful to some who are learning the ropes like I am.

I am using Python as the programming language of choice and Pycharm as the IDE. Create a folder on your computer and store the Excel file in question in it. Open the folder in Pycharm and create a new Python file. Here is how it looks on my Pycharm

ExceltoGEOJSON_1

Before we move forward, it’s important to know what a GEOJSON is and how it looks. This website offers a great review on GEOJSON. In terms of structure, a GEOJSON file looks like this

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [0, 0]
      },
      "properties": {
        "name": "null island"
      }
    }
  ]
}

I am pretty sure we can have as many variables under “properties” as we want. The rest should follow the standard structure as closely as possible.

This is how the Excel file looks. Notice that there are coordinates available already. In the future, I’ll work on geocoding an address into coordinates.

ExceltoGEOJSON_2

Let’s start working on the Python code.

import pandas as pd

df = pd.read_excel('CommunityPartner.xls')

Import the “pandas” package. Shorten the package’s name as pd because who would want to repeat a long name many times in the code?

The following line reads the Excel file into a data frame called df. You can name it however you want. Since the Excel file and the Python code are in the same folder, the file name alone is enough. Otherwise, it’s necessary to give the full path to the file.

collection = {'type': 'FeatureCollection', 'features': []}

The next step is to create a shell dictionary. Refer back to the sample structure of a GEOJSON file above to see why I structure the collection variable like that.

df['description'] = df[['Address', 'City', 'State', 'Zip']].apply(lambda x: ' '.join(x.astype(str)), axis=1)

Since we don’t have a full address, the above line is to combine four columns together to form a full address string. The next step is to populate the dictionary

def feature_from_row(CommunityPartner, latitude, longitude, fulladdress, Primary, Website, Phone):
    feature = {'type': 'Feature', 'properties': {'PartnerName': '', 'Address': '', 'marker-color': '',
                                                 'Website': '', 'PrimaryMission': '', 'Phone': ''},
               'geometry': {'type': 'Point', 'coordinates': []}
               }
    feature['geometry']['coordinates'] = [longitude, latitude]
    feature['properties']['PartnerName'] = CommunityPartner
    feature['properties']['Address'] = fulladdress
    feature['properties']['Website'] = Website
    feature['properties']['PrimaryMission'] = Primary
    feature['properties']['Phone'] = Phone
    if Primary == "Economic Sufficiency":
        feature['properties']['marker-color'] = "FF5733"
    elif Primary == "Social Justice":
        feature['properties']['marker-color'] = "FFF033"
    elif Primary == "Health and Wellness":
        feature['properties']['marker-color'] = "74FF33"
    elif Primary == "Environmental Stewardship":
        feature['properties']['marker-color'] = "338DFF"
    elif Primary == "Educational Support":
        feature['properties']['marker-color'] = "CE33FF"
    else:
        feature['properties']['marker-color'] = "FF3374"
    collection['features'].append(feature)
    return feature

Create a function that will undertake the data processing. Between the brackets are the input variables.

feature = {'type': 'Feature', 'properties': {'PartnerName': '', 'Address': '', 'marker-color': '',
                                             'Website': '', 'PrimaryMission': '', 'Phone': ''},
           'geometry': {'type': 'Point', 'coordinates': []}
           }

Create a “feature” variable as above. Try to mirror the “type” and “geometry” keys of the standard GEOJSON (see above) as much as possible. Leave the “coordinates” value empty to fill in later. Under “properties”, list the keys you want.

feature['geometry']['coordinates'] = [longitude, latitude]
feature['properties']['PartnerName'] = CommunityPartner
feature['properties']['Address'] = fulladdress
feature['properties']['Website'] = Website
feature['properties']['PrimaryMission'] = Primary
feature['properties']['Phone'] = Phone

Time to populate the keys. Remember to keep the names of the keys and input variables consistent with what was already posted so far.

You must wonder: what about “marker-color”? You can use conditional statements to assign values to the variable as follows:

if Primary == "Economic Sufficiency":
    feature['properties']['marker-color'] = "FF5733"
elif Primary == "Social Justice":
    feature['properties']['marker-color'] = "FFF033"
elif Primary == "Health and Wellness":
    feature['properties']['marker-color'] = "74FF33"
elif Primary == "Environmental Stewardship":
    feature['properties']['marker-color'] = "338DFF"
elif Primary == "Educational Support":
    feature['properties']['marker-color'] = "CE33FF"
else:
    feature['properties']['marker-color'] = "FF3374"

If you wonder about the HTML color code, just Google “HTML Color Code” and you’ll see it.
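
As an alternative to the if/elif chain, the same mapping can be kept in a dictionary, which is shorter to extend when a new mission area appears. This is only a sketch of that option, not the code used in this post: the names MISSION_COLORS, DEFAULT_COLOR and marker_color are hypothetical, while the color codes are the ones from the block above.

```python
# Color codes copied from the if/elif chain above.
MISSION_COLORS = {
    "Economic Sufficiency": "FF5733",
    "Social Justice": "FFF033",
    "Health and Wellness": "74FF33",
    "Environmental Stewardship": "338DFF",
    "Educational Support": "CE33FF",
}
DEFAULT_COLOR = "FF3374"  # used for any mission area not listed above

def marker_color(primary):
    """Return the marker color for a primary mission area."""
    return MISSION_COLORS.get(primary, DEFAULT_COLOR)

known = marker_color("Social Justice")
unknown = marker_color("Something else")
```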

collection['features'].append(feature)
return feature

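The first line of the block above adds the feature built from the current row of the Excel file to the “features” key of the collection variable. The return statement hands the feature back to the caller. It is not mandatory in Python (a function without it returns None), but returning the feature lets df.apply collect the results, as in the next step.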

geojson_series = df.apply(
    lambda x: feature_from_row(x['CommunityPartner'], x['Lat'], x['Longitude'], x['description'], x['Primary'],
                               x['Website'], x['Phone']),
    axis=1)

jsonstring = pd.io.json.dumps(collection)

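The first line passes every single row of the Excel file to the function so that the collection variable is filled with all the features needed for the GEOJSON. The second line turns the collection into a JSON string. Note that pd.io.json.dumps is not part of the documented public pandas API, so it may not be available in every pandas version.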
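
If that pandas helper is not available, the standard json module can do the serialization. The sketch below is an assumption about how one might adapt the step, not the code used in this post. It replaces NaN (how pandas represents a blank cell) with None so that the output contains null; values that come out of pandas as NumPy integers may also need converting with int() before json.dumps accepts them. The helper name clean and the sample feature are hypothetical.

```python
import json
import math

def clean(value):
    """Replace NaN (how pandas represents a blank cell) with None."""
    if isinstance(value, float) and math.isnan(value):
        return None
    return value

# Hypothetical feature with a blank Website cell, as in the Excel file.
feature = {
    "type": "Feature",
    "properties": {"PartnerName": "75 North", "Website": float("nan")},
    "geometry": {"type": "Point", "coordinates": [-95.957309, 41.276479]},
}
feature["properties"] = {k: clean(v) for k, v in feature["properties"].items()}

collection = {"type": "FeatureCollection", "features": [feature]}
jsonstring = json.dumps(collection)  # None is written as null
```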

output_filename = 'CommunityPartner.geojson' 
with open(output_filename, 'w') as output_file:
    output_file.write(format(jsonstring))

Name the file however you want and use the second line to write the string into a GEOJSON file. The complete code looks like this:

import pandas as pd

df = pd.read_excel('CommunityPartner.xls') #Get the Excel file from static/Excel

collection = {'type': 'FeatureCollection', 'features': []}

df['description'] = df[['Address', 'City', 'State', 'Zip']].apply(lambda x: ' '.join(x.astype(str)), axis=1)


def feature_from_row(CommunityPartner, latitude, longitude, fulladdress, Primary, Website, Phone):
    feature = {'type': 'Feature', 'properties': {'PartnerName': '', 'Address': '', 'marker-color': '',
                                                 'Website': '', 'PrimaryMission': '', 'Phone': ''},
               'geometry': {'type': 'Point', 'coordinates': []}
               }
    feature['geometry']['coordinates'] = [longitude, latitude]
    feature['properties']['PartnerName'] = CommunityPartner
    feature['properties']['Address'] = fulladdress
    feature['properties']['Website'] = Website
    feature['properties']['PrimaryMission'] = Primary
    feature['properties']['Phone'] = Phone
    if Primary == "Economic Sufficiency":
        feature['properties']['marker-color'] = "FF5733"
    elif Primary == "Social Justice":
        feature['properties']['marker-color'] = "FFF033"
    elif Primary == "Health and Wellness":
        feature['properties']['marker-color'] = "74FF33"
    elif Primary == "Environmental Stewardship":
        feature['properties']['marker-color'] = "338DFF"
    elif Primary == "Educational Support":
        feature['properties']['marker-color'] = "CE33FF"
    else:
        feature['properties']['marker-color'] = "FF3374"
    collection['features'].append(feature)
    return feature


geojson_series = df.apply(
    lambda x: feature_from_row(x['CommunityPartner'], x['Lat'], x['Longitude'], x['description'], x['Primary'],
                               x['Website'], x['Phone']),
    axis=1)

jsonstring = pd.io.json.dumps(collection)

output_filename = 'CommunityPartner.geojson' #The file will be saved under static/GEOJSON
with open(output_filename, 'w') as output_file:
    output_file.write(format(jsonstring))

ExceltoGEOJSON_3

This is how the GEOJSON looks:

 
{
   "type":"FeatureCollection",
   "features":[
      {
         "type":"Feature",
         "properties":{
            "PartnerName":"75 North",
            "Address":"4383 Nicholas St Suite 24 Omaha NE 68131.0",
            "marker-color":"FF5733",
            "Website":null,
            "PrimaryMission":"Economic Sufficiency",
            "Phone":"402-502-2770"
         },
         "geometry":{
            "type":"Point",
            "coordinates":[
               -95.957309,
               41.276479
            ]
         }
      },
      {
         "type":"Feature",
         "properties":{
            "PartnerName":"A Time to Heal",
            "Address":"6001 Dodge St CEC 216 Suite 219C  Omaha NE 68182.0",
            "marker-color":"74FF33",
            "Website":null,
            "PrimaryMission":"Health and Wellness",
            "Phone":"402-401-6083" …

One important note. If you are a fan of Jupyter Notebook, beware that there may be a problem when it comes to the last step of the process. Here is how the collection variable looks before being dumped into the GEOJSON file.

ExceltoGEOJSON_4

But I ran into errors in the last step. I spent quite some time on fixing it but I couldn’t.

ExceltoGEOJSON_5

Creating the Python code in Pycharm is much easier and produces the same result. It’s even more convenient if you are in the middle of an application development project.

Hope this post helps. Much thanks to appendto and geoffboeing for inspiration.

 

Create a hover effect on Mapbox

I am sharing my experience in trying to create a hover effect on Mapbox. The first thing to do is to read their example and understand what is going on. Let’s unpack a little bit:

<!DOCTYPE html>
<html>
<head>
    <meta charset='utf-8' />
    <title>Create a hover effect</title>
    <meta name='viewport' content='initial-scale=1,maximum-scale=1,user-scalable=no' />
    <script src='https://api.tiles.mapbox.com/mapbox-gl-js/v0.49.0/mapbox-gl.js'></script>
    <link href='https://api.tiles.mapbox.com/mapbox-gl-js/v0.49.0/mapbox-gl.css' rel='stylesheet' />
    <style>
        body { margin:0; padding:0; }
        #map { position:absolute; top:0; bottom:0; width:100%; }
    </style>
</head>

It’s the <head> of the HTML that has scripts from Mapbox. Just follow them and you’ll be fine. Change the text in <title> to have your own page title.

<body>


<div id='map'></div>
<script>
// Your real code goes here
</script>

</body>

Your real work will go between <script> and </script>. The <div> is a container that refers to the map you are working on. The next part is the Mapbox token

mapboxgl.accessToken = '<your access token here>';

To get a token, just create a free account on Mapbox. A free account is allowed up to 50,000 requests a month if I am not mistaken. It should be enough for a student or an enthusiast wishing to try it out. Once you have a token, just put it between the quotation marks in the above line.

Let’s have a base map

var map = new mapboxgl.Map({
    container: 'map',
    style: 'mapbox://styles/mapbox/streets-v9',
    center: [-100.486052, 37.830348],
    zoom: 2
});

The “center” coordinates refer to where you want the map to focus. Mapbox expects them in [longitude, latitude] order, so if the map lands in the wrong place, check whether the two figures are swapped. “Zoom” is how closely you look at the chosen location. The greater the number, the closer the zoom.

var hoveredStateId =  null;

map.on('load', function () {
    map.addSource("states", {
        "type": "geojson",
        "data": "https://www.mapbox.com/mapbox-gl-js/assets/us_states.geojson"
    });

hoveredStateId is a placeholder variable that will be used later for the hover effect. The following code block loads the data source. Just follow the template. Three things to note here:

  • “states” refers to the name of the source that contains the GEOJSON data. You can name it whatever you want
  • “geojson” refers to the type of the source. Mapping requires GEOJSON data, whether you load it from an external source, like we do in this case, or from a hardcoded file
  • The link that goes with “data” is where the author stores the data.

One note here: if you use Github or any cloud platform to store and source your file, be careful. For instance, let’s look at a file I have on github.

Github_Link

Just copying the usual link when you access your file on Github like that won’t work. To get the link that works, click on “Raw” and here is how it shows on the screen

Github_Content_Link

Copy the link in the browser. It should work.

Back to the HTML. Add the two “map.addLayer” code sections to what you already have. It should look like the below

map.on('load', function () {
    map.addSource("states", {
        "type": "geojson",
        "data": "https://www.mapbox.com/mapbox-gl-js/assets/us_states.geojson"
    });

    map.addLayer({
        "id": "state-fills",
        "type": "fill",
        "source": "states",
        "layout": {},
        "paint": {
            "fill-color": "#627BC1",
            "fill-opacity": ["case",
                ["boolean", ["feature-state", "hover"], false],
                1,
                0.5
            ]
        }
    });

    map.addLayer({
        "id": "state-borders",
        "type": "line",
        "source": "states",
        "layout": {},
        "paint": {
            "line-color": "#627BC1",
            "line-width": 2
        }
    });

The first addLayer is for the polygon itself while the second one is for the lines between the states. “id” refers to the name of the layer for future reference. Remember to tie the “source” value back to the name of map.addSource. In this case, it’s “states”. The rest is a Mapbox standard template for hover effect. You can change the color whenever you feel like.

The next step is to work on “hover effect”. Add the following code to the end of the previous block

    map.on("mousemove", "state-fills", function(e) {
        if (e.features.length > 0) {
            if (hoveredStateId) {
                map.setFeatureState({source: 'states', id: hoveredStateId}, { hover: false});
            }
            hoveredStateId = e.features[0].id;
            map.setFeatureState({source: 'states', id: hoveredStateId}, { hover: true});
        }
    });

    // When the mouse leaves the state-fill layer, update the feature state of the
    // previously hovered feature.
    map.on("mouseleave", "state-fills", function() {
        if (hoveredStateId) {
            map.setFeatureState({source: 'states', id: hoveredStateId}, { hover: false});
        }
        hoveredStateId =  null;
    });
}); // closes map.on('load', ...)

The first thing to notice is here: map.on("mousemove", "state-fills", function(e) {

“state-fills” is the “id” of the polygon layer mentioned previously. So whatever name is chosen for that addLayer, it should be used here.

source: 'states'

In this case, ‘states’ refers to the source of the data in the map.addSource section above. Remember to use the same reference name for the source. The rest is just a standard template. If you have time, feel free to explore. I am under pressure to deliver features for my Capstone, so I just prefer not touching or changing any of it.

Here is an important note. If you don’t follow it, the hover effect won’t work. I used the same code as Mapbox’s example, just changing the GEOJSON source. The hover effect didn’t work, as you can see below:

The key is the data source. Let’s look at the data that Mapbox uses. Here is the tree view of the first item in the polygon array, just to show its structure

GEOJSON_2

Here is the structure of the data I used that led to the unsuccessful “hover effect”

GEOJSON_3

Notice the difference? As far as I can tell, the hover template in question needs the data to have a certain structure. Otherwise, the code won’t work. Now, there should be other ways to go around this, but if you don’t have time, I’d suggest modifying the data to mirror Mapbox’s example. Here is the structure of my modified data

GEOJSON_4

Does the code work? You bet!

GEOJSON_5
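
The screenshots are not reproduced here, so the following is an assumption about the kind of change involved rather than a record of what I did. map.setFeatureState identifies features by their id, so one common reason for a hover effect to fail is that the features in the GEOJSON have no top-level “id”. Here is a minimal Python sketch that adds one to each feature; the function name and the sample data are hypothetical.

```python
def add_feature_ids(geojson):
    """Give every feature a top-level integer "id" if it lacks one."""
    for index, feature in enumerate(geojson["features"]):
        feature.setdefault("id", index + 1)
    return geojson

# Hypothetical two-feature collection without ids.
data = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "A"}, "geometry": None},
        {"type": "Feature", "properties": {"name": "B"}, "geometry": None},
    ],
}
add_feature_ids(data)
ids = [feature["id"] for feature in data["features"]]
```

The ids start at 1 rather than 0 because the hover code above tests if (hoveredStateId), which would treat an id of 0 as if nothing were hovered.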

Hopefully this post will be useful to starters like I am.