I’m writing this as a way to keep track of my thoughts while trying to solve a problem. You may not find this at all interesting, but since it has to do with GPS data I thought that I would post it on my blog. If you decide to write or comment about this and expect any type of reply, please ask relevant questions. Please also remember that I am not obligated to reply to you and will do so only if I feel like it. I will probably try to write the processing formulas in PHP and use a graphing library to display the results.
March 9/2008
I’ve been looking at raw GPS data trying to find a way to explain this information or present it in a more usable form that businesses could use easily to make decisions on movement in a geographic dimension.
To find a market for this I have to identify the reasons why people would want to have a visual representation of what this data means. In essence the data only represents time and a location in a 3 dimensional space.
The standard way of interpreting this data is to show the user their location in a 3 dimensional space and to also give them directions to another location in a 3 dimensional space. This interpretation is based on the user’s current position in only 1 historical data-set (which is most likely 2D).
My goal is to allow a user to have more than one historical data-set that the user can compare - thereby enabling the user to make a decision based on the differences between the data-sets.
To make this data relevant in a business environment the user would have to be able to give a value to TIME and a value to DISTANCE. One could make the mistake of thinking that the equation is simply that of comparing TIME and DISTANCE but a comparison this simple does not allow for multiple data-sets to be used.
Without knowing the end-user’s intended use or the value that they place on TIME and DISTANCE we are limited to drawing a relationship between TIME and DISTANCE for each data-set and placing it beside the representations of the other data-sets.
I’ve had to consider the nature of the data that will be available. The data sometimes consists of values that have 12 decimal places and because of data collection limitations it would be very unlikely to get exact START and END points for all the data to be compared.
The assumption is that the START and END points will be the same for every data-set. I have decided that it would be most accurate to take the START 3D location reading and the END 3D location reading of every data-set and average each of them across the data-sets.
The average of the START X,Y & Z coordinates across the data-sets allows me to have the average START point between the data-sets. The same is true for that of the END point data.
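A minimal sketch of this averaging, written in Python for brevity rather than PHP. The function names and the sample tracks are made up for illustration; each data-set is assumed to be a non-empty list of (x, y, z) readings in recording order.

```python
from statistics import fmean

def average_point(points):
    """Average a list of (x, y, z) tuples coordinate by coordinate."""
    xs, ys, zs = zip(*points)
    return (fmean(xs), fmean(ys), fmean(zs))

def average_endpoints(datasets):
    """Return (average START, average END) across several data-sets.

    Assumption: every data-set is a non-empty list of (x, y, z)
    readings, first reading = START, last reading = END.
    """
    starts = [track[0] for track in datasets]
    ends = [track[-1] for track in datasets]
    return average_point(starts), average_point(ends)

# Two invented tracks that begin and end near the same places.
tracks = [
    [(0.0, 0.0, 10.0), (5.0, 5.0, 12.0), (10.0, 10.0, 20.0)],
    [(2.0, 0.0, 14.0), (12.0, 10.0, 24.0)],
]
avg_start, avg_end = average_endpoints(tracks)
print(avg_start)  # (1.0, 0.0, 12.0)
print(avg_end)    # (11.0, 10.0, 22.0)
```

The tracks do not need the same number of readings, since only the first and last reading of each one is used.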
Take the first X coordinate of every data-set and put it into an array. Count the number of values in your array. Add the values of the array together and divide by the number of counted values to give you the average X co-ordinate. Do this for Y and Z as well to give the average START location in 3D space. Repeat the process to obtain the average END co-ordinate.
Calculating for time is the easiest of the calculations because the data collection process includes a time-stamp with every reading - so the difference between the START and END point time-stamp data is the value of TIME.
For each data-set find the difference between START and END time-stamps. This is the value that represents TIME.
If the value of TIME is unknown we can give a visual representation of time by showing the user a line-graph because a line-graph can display difference without the need of a base value.
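The time-stamp difference can be sketched in a few lines of Python. The time-stamp layout below is an assumption, not the actual export format of any particular GPS unit, and would need adjusting to match the real data.

```python
from datetime import datetime

# Assumed time-stamp layout; change this to match the real export.
STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"

def elapsed_seconds(stamps):
    """TIME for one data-set: END time-stamp minus START time-stamp."""
    first = datetime.strptime(stamps[0], STAMP_FORMAT)
    last = datetime.strptime(stamps[-1], STAMP_FORMAT)
    return (last - first).total_seconds()

# Invented time-stamps for one data-set, in recording order.
stamps = [
    "2008-03-09 10:00:00",
    "2008-03-09 10:00:30",
    "2008-03-09 10:12:05",
]
total_time = elapsed_seconds(stamps)
print(total_time)  # 725.0 (12 minutes 5 seconds)
```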
We could easily use just the X and Y co-ordinates to find the most efficient route, or the differences in distance between two points using different routes, but that is not how I would like to have the data represented. Adding the Z co-ordinate allows us to find the true distance. For example: If I need to get to a point that is only 5 meters in front of me - in a 2D environment it is only 5 meters away. If I add the Z dimension to represent a 3D environment and find out that the point 5 meters in front of me is 500 meters up a cliff then the shortest distance I can travel to get to my end point is 500.02499937503 meters.
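The cliff example is just the straight-line distance formula with a third term: the square root of 5² + 500² = 250025. A small Python check, with function names chosen here only for illustration:

```python
import math

def distance_2d(a, b):
    """Flat distance using only the X and Y co-ordinates."""
    return math.hypot(b[0] - a[0], b[1] - a[1])

def distance_3d(a, b):
    """Straight-line distance using X, Y and Z."""
    return math.dist(a, b)

me = (0.0, 0.0, 0.0)
cliff_top = (5.0, 0.0, 500.0)  # 5 meters ahead, 500 meters up

print(distance_2d(me, cliff_top))  # 5.0
print(distance_3d(me, cliff_top))  # about 500.02499937503
```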
To the average viewer this will probably seem mundane - but being able to answer the same question over a huge data-set can be helpful in many industries. Different values can be associated with the X, Y and Z coordinates allowing the data to be visualized as COST of movement. In a practical sense - if I know that my cost of gaining or losing elevation on the way to a certain point is very high, I can try to avoid a route that has a lot of elevation differences and choose one that stays flatter along the Z axis.
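One way to picture COST of movement is to price horizontal travel, climbing and descending separately and add them up along a route. This is only a sketch: the three per-meter rates and both routes are invented, and a real user would supply their own values.

```python
import math

def movement_cost(track, flat_rate, up_rate, down_rate):
    """Sum a per-meter cost over every leg of a route.

    Assumption: horizontal travel, climbing and descending are each
    charged at their own hypothetical rate per meter.
    """
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(track, track[1:]):
        total += math.hypot(x2 - x1, y2 - y1) * flat_rate
        dz = z2 - z1
        if dz > 0:
            total += dz * up_rate
        else:
            total += -dz * down_rate
    return total

# A short route over a hill and a longer route around it.
over_hill = [(0, 0, 0), (30, 40, 100), (60, 80, 0)]
around_hill = [(0, 0, 0), (100, 0, 0), (100, 80, 0), (60, 80, 0)]

print(movement_cost(over_hill, 1.0, 5.0, 2.0))    # 800.0
print(movement_cost(around_hill, 1.0, 5.0, 2.0))  # 220.0
```

With these made-up rates the longer, flatter route is much cheaper even though it covers more than twice the horizontal distance.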
March 9/2008
The Data Storage Model that the typical GPS devices can’t quite handle — yet.
This is a graphical representation of how data can be stored in a 3D environment. The value or COST of moving to another point in the 3D environment can be stored in a database allowing for calculations to be made so one can choose the best route from one point to another in the 3D grid.
Every plane between intersections represents a space in which data can be stored about the transition between the two points. The data stored in the grid is incomplete in this illustration. The size of the grid would be directly related to the density of the data that can be observed or obtained (Data-Resolution). In this model I cannot see any limitations on the fields of data that can be stored about the transitional points other than those imposed by your data storage and processing capacity.
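As a rough sketch of how such a grid could be used, the transition costs can be kept in a lookup keyed by grid point and searched with Dijkstra's shortest-path algorithm. The in-memory dictionary stands in for the database, and the grid points and costs are invented; this is one possible approach, not a description of any existing GPS device.

```python
import heapq

def cheapest_route(transitions, start, goal):
    """Dijkstra's algorithm over stored transition costs.

    transitions maps a grid point (x, y, z) to a dict of
    {neighbouring point: cost of moving there}. Costs must be >= 0.
    Returns (total cost, list of points), or (inf, []) if unreachable.
    """
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        cost, point = heapq.heappop(queue)
        if point == goal:
            break
        if cost > best.get(point, float("inf")):
            continue  # stale queue entry, a cheaper path was found already
        for neighbour, step_cost in transitions.get(point, {}).items():
            new_cost = cost + step_cost
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = point
                heapq.heappush(queue, (new_cost, neighbour))
    if goal not in best:
        return float("inf"), []
    # Walk backwards from the goal to rebuild the route.
    route = [goal]
    while route[-1] != start:
        route.append(previous[route[-1]])
    route.reverse()
    return best[goal], route

# Invented costs: moving up early (z: 0 -> 1) is expensive.
transitions = {
    (0, 0, 0): {(1, 0, 0): 1.0, (0, 0, 1): 10.0},
    (1, 0, 0): {(1, 0, 1): 10.0, (1, 1, 0): 1.0},
    (1, 1, 0): {(1, 1, 1): 3.0},
    (0, 0, 1): {(1, 0, 1): 1.0},
    (1, 0, 1): {(1, 1, 1): 1.0},
}
cost, route = cheapest_route(transitions, (0, 0, 0), (1, 1, 1))
print(cost)   # 5.0
print(route)  # [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
```

Any extra fields stored about a transition could be folded into the single cost number before the search runs.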
At this point I’ve started to ask myself about the relevance of such a database of information and although I’m not ready to expose what type of usage I have in mind for it - I can assure you that there are industries that exist today that would probably demand this processing ability.
Getting Back to My Original Plan
Sometimes I get a little bit ahead of myself and that last little bit was all that. I’ve decided to scrap the line graph as a display of time. The whole vision of how I was going to display the information has been scrapped and I’ve opted for something totally different. My new design will be a two-axis graph with one axis being TIME and the other axis being the % DISTANCE FROM END.
This is an illustration of how I would like to visually represent and translate the raw GPS data-sets. Right now it’s just a sketch with some Photoshop cleanup. The finished version will be full color which will help explain the function of the graph.
You can see that the START point is the same for all of the data-sets. The END point falls at a different time for each data-set, but the height in the graph is the same because the other axis is % DISTANCE FROM END point (which will need to be calculated from the total distance travelled in that particular data-set).
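A possible way to calculate % DISTANCE FROM END for each reading: add up the 3D distance of every leg to get the total for that data-set, then express what is still left to travel as a percentage of that total. The function name and the sample track are placeholders.

```python
import math

def percent_distance_from_end(track):
    """For each reading, the share of the route still to be travelled.

    track is a list of (x, y, z) readings in recording order. The first
    value is 100.0 and the last is 0.0 whenever the track has moved.
    """
    steps = [math.dist(a, b) for a, b in zip(track, track[1:])]
    total = sum(steps)
    if total == 0:
        return [0.0 for _ in track]  # no movement recorded
    travelled = 0.0
    remaining = [100.0]
    for step in steps:
        travelled += step
        remaining.append(100.0 * (total - travelled) / total)
    return remaining

# Invented track with legs of 5, 12 and 3 meters (20 meters in total).
track = [(0, 0, 0), (3, 4, 0), (3, 4, 12), (3, 4, 15)]
percentages = percent_distance_from_end(track)
```

Plotting these values against the time-stamp of each reading gives one line per data-set, all starting at 100% and ending at 0%.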
I wanted to include some more useful information that I could acquire from the data as well - I achieve this by including the TOTAL ELEVATIONAL DISTANCE TRAVELLED in the chart legend.
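One reading of TOTAL ELEVATIONAL DISTANCE TRAVELLED is the sum of every change in Z along the route, counting climbs and descents alike. That interpretation is an assumption, and the sample elevations are invented.

```python
def total_elevation_travelled(track):
    """Sum of absolute Z changes between consecutive (x, y, z) readings."""
    return sum(abs(b[2] - a[2]) for a, b in zip(track, track[1:]))

# Elevations of 100, 120, 110 and 150 meters: 20 up, 10 down, 40 up.
track = [(0, 0, 100), (10, 0, 120), (20, 0, 110), (30, 0, 150)]
print(total_elevation_travelled(track))  # 70
```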
Since I will now be arranging my data on a time scale I need to look at the values stored in my data-set. When exported into a CSV my GPS gives me a human-readable date-stamp with every reading that it saved. Sometimes there are multiple entries during a one-second period.
I’m going to process my data-set and average the readings that share the same one-second time-stamp. Doing this allows me to arrange my data into data-sets that have one-second resolution without any data-overlap.
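A sketch of the one-second averaging, assuming each reading has already been read into a (time-stamp, x, y, z) tuple, that readings are in recording order, and that readings from the same second carry an identical time-stamp string. The sample values are invented.

```python
from collections import defaultdict
from statistics import fmean

def average_per_second(readings):
    """Collapse readings that share a time-stamp into one averaged reading."""
    buckets = defaultdict(list)
    for stamp, x, y, z in readings:
        buckets[stamp].append((x, y, z))
    averaged = []
    # Dicts keep insertion order, so the output stays in recording order.
    for stamp, points in buckets.items():
        xs, ys, zs = zip(*points)
        averaged.append((stamp, fmean(xs), fmean(ys), fmean(zs)))
    return averaged

readings = [
    ("10:00:00", 500000.0, 5400000.0, 100.0),
    ("10:00:00", 500002.0, 5400004.0, 102.0),
    ("10:00:01", 500004.0, 5400006.0, 103.0),
]
for row in average_per_second(readings):
    print(row)
# ('10:00:00', 500001.0, 5400002.0, 101.0)
# ('10:00:01', 500004.0, 5400006.0, 103.0)
```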
My data exports to UTM in a CSV file so that is the standard format that I will use for my data-sets.
So far here is where I am at:
- Processed 1 Second Data Resolution
- Equal Start and End Points
- Visual Layout for Data Representation
- Standard UTM Data-set


