Feature engineering: all I learnt about Geo-spatial features

Top 9 tricks on Geo-spatial features and visualization techniques

Aug 16, 2019

We come across many kinds of features when working with real data sets. Geo-spatial features are ones I ran into recently while working on some data sets from Kaggle. As an amateur, this was my first time using geo-spatial data. I wondered what to do with it, as the points seemed to carry no meaning in relation to one another. All I saw was two numbers with a lot of noise around them, and I couldn’t even find an ordinal property in them, the way you can with age, year, experience and so on.

I asked, “If it isn’t useful, why can’t we simply drop it?”

Nah! That might not be the case. The numbers may not carry any meaning on their own, but they are points in space (geographical space, so to speak). If we find a technique to interpret them, they could be a lot more useful than any other feature in our dataset. All it takes is a little bit of creativity.

Imagination is the beginning of creation. You imagine what you desire, You will what you imagine, And at last, You create what you will — George Bernard Shaw

I had a hard time figuring out such techniques, as there are not many learning resources out there, or rather, they are scattered all over the place. So, in this article I would like to pen down some of the tips and tricks I learned over the last week from books, blogs, and kernels where some of the masters used these ideas to power up their machine learning algorithms towards a global cause. Bravo!

I hope this article helps someone like me who is getting into the business of building machine learning models.

OK! Here is the deal. To make it interesting, I’d like to narrate a story built on the concepts and keep you wondering. Happy learning!

Conceptual story…

The story (a.k.a. the dataset) starts like this. There were two friends named Latitude and Longitude, frustrated IT employees residing on Cartesian street. On the way home from the office, they saw a huge billboard ad for a midnight party happening near the Mid-Town area over the weekend. They both badly wanted to go.

But they can’t! The ad said, “*Only couples are allowed”. Our boys don’t have girlfriends. Pity them!

While programs and math were busy rattling around in their mind-space trying to solve the girlfriend problem, the Tinder app came to the rescue! They took their mobiles out and opened it. TA-DA! Both found their first matches, named ‘r’ and ‘φ’, residing on Polar street.

But still, they used the formulas below to double-check that the girls suited their mindset, which adds to the story.

Trick(1): Add two new features of Polar coordinates to the dataset

x = Longitude; y = Latitude
r = sqrt(x^2 + y^2)
φ = atan2(y, x)

Note: This basic transform gives the model some extra intuition. The polar coordinates created from the Cartesian lat/long can also be used for further processing.
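Here is a minimal sketch of Trick(1) in plain Python. It treats longitude as x and latitude as y, as above, and uses the standard Cartesian-to-polar conversion. The function name `polar_features` and the sample point are illustrative placeholders, not taken from any particular kernel.

```python
import math

def polar_features(lon, lat):
    """Return (r, phi) for a point, treating lon as x and lat as y."""
    r = math.hypot(lon, lat)    # radial distance from the origin (0, 0)
    phi = math.atan2(lat, lon)  # angle in radians, in the range [-pi, pi]
    return r, phi

# Example: one hypothetical coordinate pair in decimal degrees
r, phi = polar_features(lon=-73.98, lat=40.75)
print(r, phi)
```

On a full dataset the same two lines would be applied column-wise, giving two new columns next to the original latitude and longitude.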

“What next?” they both whispered. Yeah! It’s party time.

On a beautiful evening, the boys are getting ready for the party. They have a very simple pattern for getting dressed up, something like the one below:

rot_x = x * cosθ + y * sinθ
rot_y = y * cosθ - x * sinθ

Trick(2): Add 4 new features of rotational Cartesian coordinates

Notes: Rotating the coordinates provides more spatial information to tree-type models, which can be extremely beneficial compared to plain x-y coordinates. A tree splits on one feature at a time, so a rotated axis lets a single split follow a diagonal boundary. The rotated features let the model view the coordinates from a different perspective (viewing angle) and pick up patterns it can learn from. We can also do this rotation with Principal Component Analysis (PCA), which gives us more options in our bucket. We will see it in a bit!

Now they hop into the car and ask Google, the neighbor standing beside them, “How far is Polar street from here?” He is a technical geek and gives more info than we actually need. He talks a lot!

He proposed a couple of distance measures that would be useful.
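A minimal sketch of Trick(2) follows. It uses the standard rotation form, in which rot_y = y * cosθ - x * sinθ (the sign of rot_y does not change what a tree model can learn from it). The angles 30° and 45° are an assumption chosen only so that two rotations give four features, and the names `rotate` and `rotation_features` are placeholders.

```python
import math

def rotate(x, y, angle_deg):
    """Rotate the coordinate axes by angle_deg and return (rot_x, rot_y)."""
    theta = math.radians(angle_deg)
    rot_x = x * math.cos(theta) + y * math.sin(theta)
    rot_y = y * math.cos(theta) - x * math.sin(theta)
    return rot_x, rot_y

def rotation_features(lon, lat, angles_deg=(30, 45)):
    """Build rot_<angle>_x / rot_<angle>_y features for each angle."""
    feats = {}
    for angle in angles_deg:
        rot_x, rot_y = rotate(lon, lat, angle)
        feats[f"rot_{angle}_x"] = rot_x
        feats[f"rot_{angle}_y"] = rot_y
    return feats

# Example: four new features for one hypothetical point
print(rotation_features(lon=-73.98, lat=40.75))
```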

Haversine distance:

Basically, Haversine distance calculates the great-circle distance between any two points on a sphere, given their latitude and longitude. It comes from spherical trigonometry and is widely used in navigation. In our case, we have to convert the coordinates to radians and also feed in the mean Earth radius to calculate the distance between the two points.

Trick(3): Add a new feature of Haversine distance to the dataset

Notes: Useful for calculating the distance between the city center and our point of interest, or the travel distance between pickup and drop-off points (in a taxi fare dataset), and so on. It is a really good feature for our dataset. We can also extract the distance to the center as a separate new feature.
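A minimal sketch of the Haversine feature in plain Python, assuming coordinates in decimal degrees and a mean Earth radius of 6371 km. The function name `haversine_km` and the sample pickup/drop-off coordinates are illustrative placeholders.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    a = min(1.0, a)  # guard against rounding slightly above 1
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Example: distance between a hypothetical pickup and drop-off point
print(haversine_km(40.7580, -73.9855, 40.6413, -73.7781))
```

The same function gives the "distance to the center" feature if one of the two points is fixed to the city center.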

Manhattan distance:

Sometimes Manhattan distance comes in handy. It measures the distance between two points along axes at right angles, which suits cities with a grid structure (square blocks), such as New York.

Trick(4): Add a new feature named Manhattan distance to the dataset

Notes: Haversine and Manhattan distance combined give a more intuitive sense of the data.

Now that our guys know the distance to the girls’ place, they also want to know which direction to head in. Makes sense, doesn’t it?
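One common way to approximate Manhattan distance on latitude/longitude is to add a north-south leg and an east-west leg, each measured with Haversine. The sketch below assumes that approach, a 6371 km mean Earth radius, and a street grid aligned with the compass directions (which real cities only roughly follow). The names `haversine_km` and `manhattan_km` are placeholders.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

def manhattan_km(lat1, lon1, lat2, lon2):
    """Approximate grid distance: north-south leg plus east-west leg."""
    north_south = haversine_km(lat1, lon1, lat2, lon1)  # change latitude only
    east_west = haversine_km(lat1, lon1, lat1, lon2)    # change longitude only
    return north_south + east_west

# Example: compare both distances for a hypothetical trip
trip = (40.7580, -73.9855, 40.6413, -73.7781)
print(haversine_km(*trip), manhattan_km(*trip))
```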

Anyway, Google had the answer for that too.

Bearing degree:

Bearing represents the direction of one point relative to another. It’s like reading a compass while standing on one point and looking at the other.

Trick(5): Add a new feature named bearing degree to the dataset

Notes: It is more powerful when the dataset contains moving subjects, as in a taxi fare/time prediction dataset.
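A minimal sketch of the bearing feature, using the standard initial-bearing (forward azimuth) formula on a sphere. It assumes coordinates in decimal degrees and returns a compass angle in [0, 360), with 0 meaning north and 90 meaning east. The function name `bearing_deg` is a placeholder.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing in degrees from point 1 towards point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_lambda = math.radians(lon2 - lon1)
    y = math.sin(d_lambda) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(d_lambda))
    theta = math.degrees(math.atan2(y, x))  # in the range [-180, 180]
    return (theta + 360.0) % 360.0          # normalise to [0, 360)

# Example: direction of travel for a hypothetical trip
print(bearing_deg(40.7580, -73.9855, 40.6413, -73.7781))
```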

Hmm! That’s a lot of info from Google.

Meanwhile, the girls found a beauty shop named PCA to get them ready for the party.

Rotation with PCA:

You may have heard of Principal Component Analysis (PCA) as a dimensionality reduction algorithm. We can also use it to rotate the Cartesian coordinate space. Here, the idea is only to rotate the coordinates, which helps decision tree splits in typical tree-based models.

Trick(6): Add two new features of rotational coordinates using PCA

Notes: PCA from the sklearn package automatically learns a rotation aligned with the directions of greatest variance of the points in the coordinate space, and can then transform those points within the same space, which may improve accuracy.

Meanwhile, the boys arrived at their doorstep, met them, and had a good conversation. Now everything falls into place. They hop into the car and rush to the destination.

Nearing the place, what they see is a large open area covered with grass. With the car slowly rolling along the pathway, they step out onto a red carpet leading them to a marvelous entrance. The operator quickly guides them to the reception, where the entry formalities are taken care of. He asks them to provide their IDs, and everyone agrees.

The operator quickly scans their IDs with a device running the code below.
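To show what the PCA rotation does, here is a dependency-free sketch for the 2-D case. For two features, the principal axis angle has a closed form, so the sketch computes it directly instead of calling sklearn. With sklearn, `PCA(n_components=2).fit(coords).transform(coords)` should give the same components up to sign. The function names and sample points are placeholders.

```python
import math

def pca_rotation_angle(points):
    """Angle (radians) of the first principal axis, plus the mean point."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return theta, (mean_x, mean_y)

def pca_rotate(points):
    """Centre the points and rotate them onto their principal axes."""
    theta, (mean_x, mean_y) = pca_rotation_angle(points)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [((x - mean_x) * cos_t + (y - mean_y) * sin_t,
             (y - mean_y) * cos_t - (x - mean_x) * sin_t)
            for x, y in points]

# Example: hypothetical (lon, lat) points lying roughly along a diagonal
coords = [(-74.00, 40.70), (-73.98, 40.73), (-73.96, 40.75), (-73.95, 40.78)]
for pca0, pca1 in pca_rotate(coords):
    print(pca0, pca1)
```

The two outputs per point are the new features for Trick(6). The first one carries most of the spread of the data, the second the remainder.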

Reverse-Geocoder:

This is one of the awesome packages I found on the Internet. Given latitude and longitude coordinates, it returns the nearest city or town along with its administrative regions and country code, all for free (i.e., no API keys). It works offline, which makes it really fast compared to other packages like Geopy, Geo-python, etc.

Trick(7): Add new feature about location in the dataset

Notes: It is really helpful when we know something about the area, such as whether it is official or residential, its history, whether it holds government or private buildings, and so on. It provides a new perspective on the data. The package has many other options. Check them out.

So, everything looks perfect! He assigns each of them a new tag, as everyone is anonymous in there. “That’s where the fun is!” he whispers in their ears.

Geohash:

It is a public domain geocode system that encodes coordinates into a short string of letters and digits. Geohash divides the Earth into “buckets” of different sizes based on the number of characters (short Geohash codes cover big areas, and longer codes cover smaller areas).

Trick(8): Add new feature of encoded coordinates in the dataset

Notes: With precision set to 1, it produces a single-character code that acts much like a cluster label. The difference is that Geohash cells partition the whole world rather than a specific region.
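To make the encoding concrete, here is a small pure-Python sketch of the Geohash algorithm: it repeatedly halves the longitude and latitude ranges, records one bit per step, and maps every 5 bits to a base-32 character. In practice you would more likely call a Geohash package. The function name `geohash_encode` is a placeholder and not the API of any specific library.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)

def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a Geohash of `precision` chars."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits = bits << 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits = bits << 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

# Example: coarse and fine codes for the same point
print(geohash_encode(57.64911, 10.40744, precision=1))
print(geohash_encode(57.64911, 10.40744, precision=5))
```

Because a shorter code is always a prefix of a longer one for the same point, truncating the string is an easy way to get coarser "buckets" as extra categorical features.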

Everyone got their respective hash-codes with a fancy wrist band that glows in the dark. They say it’s a NEON!

With smiles on their faces, they go inside. A huge crowd is dancing to loud music, with a star performer on a stage far away from them. They somehow have to settle into the respective clusters written on their tickets, as the whole party is separated into different clusters for safety.

Clusters:

So, we usually have a huge number of coordinates in a dataset. We can divide them into a number of clusters. This gives us extra power to compute things that would be hard to get any other way.

For example, in a taxi fare prediction problem, we can:

aggregate pickup points in one cluster and drop-off points in another, and calculate the distance between the centers of those clusters.
calculate how many taxis are going out of the cluster and vice versa.
calculate the distance between the center of the assigned cluster and the respective point of interest.
check the trend/density of each cluster, i.e., how many taxis are present inside the cluster at any point in time.
calculate the average distance traveled by taxis that are going out of the cluster and vice versa.
analyze the density of clusters on weekdays and weekends to get insights on which trips are official ones, and so on.

Trick(9): Add a new feature that assigns coordinates to different clusters

Notes: Our model will benefit if we manage to include a varied set of features based on clusters.

After a bit of a struggle they found their respective clusters, and from now on it’s all about music, dance, and whatnot!

“Are we safe?” the girls asked. Yes! The place is equipped with multiple cameras around every corner of the cluster, and some of them may be ‘Radar’ cameras.
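As a rough illustration of cluster-based features, the sketch below runs a tiny k-means (Lloyd's algorithm) in plain Python, then returns each point's cluster label and its distance to the assigned cluster center. On a real dataset you would more likely use a library such as sklearn. The Euclidean distance on raw degrees, the fixed iteration count, the seed, and all function names are simplifying assumptions.

```python
import math
import random

def nearest(point, centers):
    """Index of the center closest to point (Euclidean on raw degrees)."""
    return min(range(len(centers)), key=lambda c: math.dist(point, centers[c]))

def kmeans(points, k, n_iter=20, seed=0):
    """Tiny Lloyd's k-means; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen points
    for _ in range(n_iter):
        labels = [nearest(p, centers) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = (sum(m[0] for m in members) / len(members),
                              sum(m[1] for m in members) / len(members))
    labels = [nearest(p, centers) for p in points]
    return centers, labels

def cluster_features(points, k, seed=0):
    """For each point: (cluster label, distance to its cluster center)."""
    centers, labels = kmeans(points, k, seed=seed)
    return [(lab, math.dist(p, centers[lab])) for p, lab in zip(points, labels)]

# Example: hypothetical (lon, lat) pickup points forming two groups
pickups = [(-73.99, 40.75), (-73.98, 40.76), (-73.97, 40.75),
           (-73.79, 40.64), (-73.78, 40.65), (-73.77, 40.64)]
for label, dist_to_center in cluster_features(pickups, k=2):
    print(label, dist_to_center)
```

Counts per cluster, average trip distance per cluster, and the other ideas in the list can then be built by grouping rows on the cluster label.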

Speaking about the ‘Radar’ cameras,

Visualization:

I found an amazing code snippet in a kernel on the Kaggle platform. The author plotted the coordinates as a scatter plot with tiny points that literally shows the structure of the streets when zoomed in.

Image sourced from a kernel in kaggle’s competition — Link in the Reference section

Notes: By choosing a very small alpha value, we can achieve this type of visualization with Cartesian coordinates.

It is clear that radar cameras show the X-ray structure of the groups present at the event. But what about normal cameras? Let’s get on to it.

Folium:

Folium is a Python package built on the Leaflet.js JavaScript library, designed for visualizing geographical coordinates. We can mark a single point on a real map, a cluster of points together, and so on.

Notes: This package creates an interactive map visualization that you can zoom in and out of. Check out their website for more.

Finally, the long party was over. They parted ways and lived happily ever after :)

This is my first blog. If you made it through, thanks for your valuable time.

Reference:

Folium → https://github.com/python-visualization/folium
Kaggle kernel → https://www.kaggle.com/gaborfodor/from-eda-to-the-top-lb-0-367/notebook#Feature-Extraction
sklearn → https://scikit-learn.org/stable/modules/generated/sklearn.cluster.k_means.html#sklearn.cluster.k_means
Geohash → https://github.com/vinsci/geohash/
Reverse-Geocoder → https://github.com/thampiman/reverse-geocoder
Books → Data Mining techniques from Wiley written by Gordon S. Linoff and Michael J.A.Berry
https://www.commonlounge.com/discussion/8bc2062e262440f28f29b53015243e31
https://datascience.stackexchange.com/questions/23651/can-gps-coordinates-latitude-and-longitude-be-used-as-features-in-a-linear-mod

Disclaimer:

If I made any mistakes or you didn’t like the story, I apologize. Kindly let me know in the comment section below.