Wednesday, February 22, 2017

A location model based on rent

I have developed a location model based on rent. In this model, the rent of each cell is the average income of the agents located in that area. Agents differ in income level and in how much space they require. Each agent wants to locate in the most accessible area it can afford that also meets its space preference.

There are two types of agents: residents and employers. Residents are high-income (e.g. financial services), middle-income (e.g. teachers and other professional occupations) or low-income workers, classed as ‘commerce’, ‘service’ and ‘industry’ respectively. These classes are further broken down by age into young (18-34), middle-aged (35-65) and old (66+). An agent’s age is drawn at random (18-67) when it is first created. Each agent desires a certain amount of space, which depends on its age category.

Employer agents were designed to reflect the residential agents’ employers, so the same three groups of ‘commerce’, ‘service’ and ‘industrial’ are used to represent employers’ different roles. Instead of an age, employers have a tenure between 0 and 6. Employer agents decrease their tenure each iteration until it reaches zero; once zero is reached, the employer can move. As with residents, employers have a space requirement. For example, industrial firms are driven by the need for large amounts of land, while financial services (i.e. ‘commerce’ employers) need less land but want a more central location. Each employer also has an income which is four times that of residents.

It is assumed that younger residential agents move more frequently (every 2 iterations on average) than middle-aged agents (every 5 iterations), with older residents moving the least (every 10 iterations). Employers, on the other hand, only move when their tenure is 0. Once an employer agent has found a suitable location and moved, its tenure is reset to 6 and it cannot move for 6 iterations of the model.
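As a rough illustration of these movement rules (a Python sketch, not the model's actual Netlogo code), a resident can be given a per-iteration move probability of 1/interval so that moves happen every `interval` iterations on average, while an employer counts its tenure down to zero. The function and constant names are hypothetical, and treating "on average" as a fixed per-iteration probability is an assumption.

```python
import random

# Average number of iterations between moves per resident age group (from the text).
MOVE_INTERVAL = {"young": 2, "middle": 5, "old": 10}
EMPLOYER_TENURE = 6  # tenure assigned after an employer relocates


def resident_wants_to_move(age_group, rng):
    """Assumption: move with probability 1/interval in each iteration."""
    return rng.random() < 1.0 / MOVE_INTERVAL[age_group]


def step_employer(tenure):
    """Return (can_move, new_tenure) for one iteration.

    The employer may move only when its tenure is 0. This sketch assumes a
    suitable location is always found, so the tenure is reset straight away.
    """
    if tenure == 0:
        return True, EMPLOYER_TENURE
    return False, tenure - 1
```

Calling `step_employer` repeatedly from a tenure of 6 gives six iterations in which the employer stays put, followed by one in which it may move.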

Agents of either type, residential or employer, want to be located in the most accessible area they can afford where their preferences for space are matched. An alternative zonal system is used, based on a series of small overlapping areas. Because the search is not restricted to fixed zone boundaries, agents can search the entire area and identify clusters that spread across such boundaries.

When an agent decides to move, it goes through the list of areas and finds the most attractive one (in this case, attractiveness is based on accessibility). The agent initially moves to the centre of the area, then searches the area for an affordable neighborhood.
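The rent and location-choice rules can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionary keys (`accessibility`, `rent`, `space`) are hypothetical, and "affordable" is taken to mean that the rent does not exceed the agent's income, which the post does not specify.

```python
from statistics import mean


def cell_rent(incomes):
    """Rent of a cell: the average income of the agents located in it (0 if empty)."""
    return mean(incomes) if incomes else 0.0


def choose_cell(cells, income, space_needed):
    """Pick the most accessible cell that is affordable and offers enough space.

    Each cell is a dict with the hypothetical keys 'accessibility', 'rent'
    and 'space'. Returns None when no cell qualifies.
    """
    candidates = [
        c for c in cells
        if c["rent"] <= income and c["space"] >= space_needed
    ]
    return max(candidates, key=lambda c: c["accessibility"], default=None)
```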

The results with one city center:

The results with new city center:

The code can be found here:

Saturday, April 23, 2016

Walk This Way: Pedestrian agent-based model using mobility datasets

This is a Netlogo reimplementation of the pedestrian model in “Walk This Way: Improving Pedestrian Agent-Based Models through Scene Activity Analysis” by Andrew Crooks et al. The purpose of pedestrian models in general is to better understand and model how pedestrians utilize and move through space. This model makes use of mobility datasets from video surveillance to explore the potential that this type of information offers for the improvement of agent-based pedestrian models.

The visualization of the model looks like this:
(Grey boxes are the obstacles. Yellow triangles are the agents.)

Here is a video showing the simulation process:

There are 16 entrances and 18 exits in the model. An agent is created at an entrance and chooses one exit as its destination. Agents move towards their destinations along the shortest route while avoiding both the fixed obstacles and the other agents. The rule for selecting the shortest route is simple: among the patches the agent can see, set the one with the lowest gradient as the target and move towards it. An agent can see a patch if the patch is within its vision and not blocked by obstacles. The method of calculating gradients is explained below.
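To make the target-selection rule concrete, here is a small Python sketch (not the Netlogo code of the model). It assumes a grid of cells, a boolean obstacle grid, and a simple line-of-sight test that samples points along the straight line between two cells; the sampling approach and all names are assumptions made for illustration.

```python
import math


def is_visible(obstacles, start, target):
    """True if no obstacle cell lies on the sampled line from start to target.

    obstacles[r][c] is True for obstacle cells. The line is sampled at
    half-cell steps, which is an approximation of a true line-of-sight test.
    """
    (r0, c0), (r1, c1) = start, target
    steps = 2 * max(abs(r1 - r0), abs(c1 - c0), 1)
    for i in range(1, steps + 1):
        t = i / steps
        r = round(r0 + t * (r1 - r0))
        c = round(c0 + t * (c1 - c0))
        if obstacles[r][c]:
            return False
    return True


def pick_target(obstacles, gradient, pos, vision):
    """Return the visible cell within `vision` that has the lowest gradient."""
    best = None
    for r in range(len(obstacles)):
        for c in range(len(obstacles[0])):
            if (r, c) == pos or obstacles[r][c]:
                continue
            if math.dist(pos, (r, c)) > vision:
                continue
            if not is_visible(obstacles, pos, (r, c)):
                continue
            if best is None or gradient[r][c] < gradient[best[0]][best[1]]:
                best = (r, c)
    return best
```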

Diagram of the route-planning algorithm:

Two types of empirical data are used in this model. First, the empirical probability of each entrance and exit being chosen is used when creating agents and assigning their entrances and exits. Second, empirical data on how people moved across this map on August 25th is used to construct the gradient maps, according to which agents select their paths towards their destinations. The more frequently a patch was chosen as part of a path and the closer it is to the destination, the lower its gradient. When the empirical gradient maps are not used, the gradient maps are constructed purely from the distance to the destination. Four scenarios are designed to compare the simulation results with the empirical result, in order to show how mobility data could help to improve pedestrian models.

Scenario 1: No Realistic Information about Entrance/Exit Probabilities or Heat Maps
In this scenario, entrance and exit locations are considered known, but traffic flow through them is considered unknown. Under such conditions, we run the model to understand its basic functionality without calibrating it with real data about entrance and exit probabilities, nor activity-based heat maps. This will serve as a comparison benchmark, to assess later on how the ABM calibration through such information improves (or reduces) our ability to model movement within our scene.

Scenario 2: Realistic Entrance/Exit Probabilities But Disabled Heat Maps
In this scenario, we explore the effects of introducing realistic entrance and exit probabilities on the model. The heat map models used are distance-based, and not informed by the real datasets. Instead, we use distance-based gradients (i.e., agents choose an exit and walk the shortest route to that exit).

Scenario 3: Realistic Heat Maps but Disabled Entrance/Exit Probabilities
In this scenario we introduce real data-derived heat maps in the model calibration. These activity-based, heat map-informed gradients are derived from harvesting the scene activity data; however, entrance and exit probabilities are turned off. In a sense one could consider this a very simple form of learning how agents walk on paths more frequently traveled within the scene. It also allows us to compare the extent to which the quality of the results is due to the heat maps versus the entrance and exit probabilities.

Scenario 4: Realistic Entrance/Exit Probabilities and Heat Maps Enabled
In the final scenario we use all available information to calibrate our ABM, namely, the heat map-informed gradients and entrance-exit combinations and see how this knowledge impacts the performance of the ABM.

Please note that there is one gradient map for each pair of entrance and exit; therefore, 16 * 18 = 288 maps are loaded. However, the final result is compared to only one path frequency map, which is the empirical data obtained on August 25th. Also note that when the entrance/exit probabilities table is used, some entrances and exits have a probability of zero of being chosen. When the table is not used, agents choose randomly among all entrances and exits.
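The difference between using and not using the probabilities table can be sketched in a few lines of Python (an illustration only; the function name and the list-based table are assumptions, not the model's Netlogo code). With a table, gates are drawn in proportion to their empirical probability; without one, every gate is equally likely.

```python
import random


def choose_gate(gates, probabilities=None, rng=random):
    """Pick an entrance or exit.

    With a probability table, a gate whose probability is 0 is never drawn.
    Without a table, all gates are equally likely.
    """
    if probabilities is None:
        return rng.choice(gates)
    return rng.choices(gates, weights=probabilities, k=1)[0]


# One gradient map per (entrance, exit) pair, as stated in the text.
N_GRADIENT_MAPS = 16 * 18
```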

Please find the model here:

Monday, February 15, 2016

Pedestrian model of agents exiting a building

I built a model of pedestrians who try to leave the floor through one or two exits. The map being used is from GMU’s Krasnow Institute. The model records the frequency of each cell being chosen as a path and draws the result into a path graph, which can be exported to ArcGIS for further analysis.

Here is a graph showing the path graph opened in ArcGIS:  

Here is a video showing the simulation process:

Each patch has a variable called elevation, which is determined by (1) the shortest distance to the exit and (2) if the patch is in a room, its distance to the room's gate: the closer to the gate, the lower the elevation. If there is more than one exit patch, the elevation is based on the distance to the closest one. People use the gravity model (always flow to lower elevation, if space is available) to move to the exit.

In this model, the “elevation” of a patch is decided by its distance to the exits as well as how close it is to the gate of its room, so that people can run out of rooms. When running the model, people always try to move to lower elevation. This algorithm can also be used to build a rainfall model to analyze the movement of rain drops on the ground. See this link for the Rainfall model.
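A minimal Python sketch of the elevation idea is shown below. It is not the model's Netlogo code: it swaps in a breadth-first search from all exit cells, so that the walking distance around walls plays the role of the two-part rule (distance to exit plus closeness to the room gate). The grid representation and function names are assumptions.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def elevation_map(walls, exits):
    """Step distance from every free cell to its nearest exit (None if unreachable).

    walls[r][c] is True for blocked cells; exits is a list of (row, col) cells.
    """
    rows, cols = len(walls), len(walls[0])
    elev = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r, c in exits:
        elev[r][c] = 0
        queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and not walls[nr][nc] and elev[nr][nc] is None):
                elev[nr][nc] = elev[r][c] + 1
                queue.append((nr, nc))
    return elev


def step_downhill(elev, pos, occupied=frozenset()):
    """Move to the lowest free neighbouring cell that is lower than `pos`."""
    r, c = pos
    options = []
    for dr, dc in NEIGHBORS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(elev) and 0 <= nc < len(elev[0]):
            e = elev[nr][nc]
            if e is not None and e < elev[r][c] and (nr, nc) not in occupied:
                options.append((e, (nr, nc)))
    return min(options)[1] if options else pos
```

An agent that finds every lower neighbour occupied simply stays where it is for that step, which mirrors the "if space is available" condition.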

I have also added the export function to export the path frequency graph to an asc file. You may open the file in ArcGIS for further analysis.
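For readers who want to produce such a file outside Netlogo, the sketch below writes a 2-D grid in the Esri ASCII raster layout (a six-line header followed by one line per row, starting with the northernmost row). It is an illustration only; the function name and the default cell size and corner coordinates are assumptions.

```python
def to_ascii_grid(freq, cellsize=1.0, xll=0.0, yll=0.0, nodata=-9999):
    """Format a 2-D list (row 0 = northernmost row) as Esri ASCII raster text."""
    header = [
        f"ncols {len(freq[0])}",
        f"nrows {len(freq)}",
        f"xllcorner {xll}",
        f"yllcorner {yll}",
        f"cellsize {cellsize}",
        f"NODATA_value {nodata}",
    ]
    # Cells without a value (None) are written as the NODATA value.
    rows = [
        " ".join(str(nodata if v is None else v) for v in row)
        for row in freq
    ]
    return "\n".join(header + rows) + "\n"
```

The returned text can be written to a file with an `.asc` extension, for example `open("paths.asc", "w").write(to_ascii_grid(grid))`, where `paths.asc` is a placeholder name.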

Here is the code:

Saturday, February 6, 2016

Agents Exiting A Room

This is a model of agents who try to leave the room through the exit on the right hand side. The model also records the frequency of each cell being chosen as a path and draws the result into a path graph, which can be exported to GIS for further analysis.  

Here is a graph showing the path graph opened in GIS:

To calculate the “elevation”, each patch calculates its distance to each exit patch and sets the lowest distance as its elevation. When running the model, people always try to move to lower elevation. This algorithm can also be used to build a rainfall model to analyze the movement of rain drops on the ground.

A video showing the process:


Saturday, January 30, 2016

Path finding model using the A-star algorithm in Netlogo

This is a path-finding model using the A-star algorithm to find the shortest path. The model uses the map of George Mason University, including the buildings, walkways, driveways, and water bodies. Commuters randomly select a building as their destination, then find and follow the shortest path to reach it.

The following is the original map this model uses. It has been simplified in the model for faster computation.

Here is a video showing the process:

How does it work?

In the beginning, each commuter randomly selects a destination and then identifies the shortest path to it. The A-star algorithm is used to find the shortest path in terms of distance. The commuters move one node per tick. When they reach the destination, they stay there for one tick, and then pick the next destination and move again.

The code for path selection can be explained simply as follows:

Each node has a variable "distance" that records the shortest known distance to the origin. It is set to 9999 by default. The origin has distance 0.

While not all nodes have updated their neighbors:
     ask those nodes to update their neighbors
           if the distance through this node is shorter than a neighbor's existing distance, update that neighbor and mark it as "has not updated its neighbors"
           mark this node as "has updated its neighbors"

The loop stops when all nodes have updated their neighbors, in other words, no node can be updated with a shorter distance. The nodes of the shortest path are then put into a list for the commuter to follow.
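The loop described above can be sketched in Python as follows. This is an illustration, not the Netlogo code of the model: as written, the procedure relaxes distances until nothing changes (a label-correcting search), and the sketch follows that description rather than adding an A-star heuristic. It uses infinity instead of 9999 as the default distance, and the graph format and the `parent` bookkeeping used to recover the path are assumptions.

```python
def shortest_path(graph, origin, destination):
    """Return the nodes on the shortest path from origin to destination.

    graph maps each node to a dict {neighbor: edge_length}; every node must
    appear as a key. Returns [] when the destination cannot be reached.
    """
    dist = {node: float("inf") for node in graph}
    parent = {}
    dist[origin] = 0
    pending = {origin}  # nodes that have not yet updated their neighbors
    while pending:
        node = pending.pop()
        for nb, length in graph[node].items():
            if dist[node] + length < dist[nb]:
                dist[nb] = dist[node] + length
                parent[nb] = node
                pending.add(nb)  # its own neighbors must be re-checked
    if dist[destination] == float("inf"):
        return []
    path = [destination]
    while path[-1] != origin:
        path.append(parent[path[-1]])
    return path[::-1]
```

With non-negative edge lengths the loop always terminates, because a node re-enters `pending` only when its distance strictly decreases.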

How is the map simplified?

For faster computation this model simplifies the original data by reducing the number of nodes. To do that, the walkway data is loaded to the 20 x 20 grid in Netlogo, which is small, and therefore, many nodes fall on the same patch. In each patch, we only want to keep one node, and duplicate nodes are removed, while their neighbors are connected to the one node left.

Also, links are created in this model to represent roads. This is so far the best way I have found to deal with road-related problems in Netlogo. However, because I create links by connecting nodes one by one (see the code for more details), some roads are likely to be left disconnected. Since I could not find a better way, I also use a loop in setup to delete nodes that are not connected to the main network.
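One way to express the clean-up step is to keep only the largest connected component of the network. The Python sketch below illustrates that idea; whether the Netlogo model identifies the "whole network" in exactly this way is an assumption, and the function names are hypothetical.

```python
def largest_component(graph):
    """Return the node set of the largest connected component.

    graph maps each node to the set of its neighbors (undirected links).
    """
    seen, best = set(), set()
    for start in graph:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for nb in graph[node]:
                if nb not in component:
                    component.add(nb)
                    stack.append(nb)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def prune(graph):
    """Drop every node that is not part of the largest component."""
    keep = largest_component(graph)
    return {node: graph[node] & keep for node in keep}
```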

The code and data are here:

Wednesday, November 18, 2015

Segregation Model and Calculation of Moran's I

Recently I created a segregation model that calculates Moran's I, a measure of spatial autocorrelation developed by Patrick Alfred Pierce Moran. In this model, I am using the map of Washington DC. The data is in vector format.

Each turtle here represents a household that is either blue or red. All turtles want to have neighbors of the same color. The simple rule is that they move to unoccupied patches until they are happy with their neighbors.

Here is the map I am using in this model.     

In the beginning, 10 to 80 turtles are created in each polygon, depending on the population data. Turtles are either blue or red. Red polygons have 60% red and 40% blue. Blue polygons have 60% blue and 40% red.

In each tick, turtles look at two kinds of neighborhoods to decide whether they are happy or not. One is their geometrically neighboring polygons; the other is their 8-connected neighbors. If the share of differently colored neighbors in either neighborhood exceeds the specified unhappiness threshold, the turtle moves to an unoccupied patch in a polygon that is either unoccupied or has the same color as the turtle. The color of each polygon is decided by the majority of turtles living in it, and the colors are updated every tick.
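The happiness test and the majority rule can be sketched in Python like this (an illustration, not the Netlogo code). It assumes each neighborhood is summarized as a list of the colors observed in it and that the threshold is a fraction between 0 and 1; both are assumptions, as are the function names.

```python
from collections import Counter


def is_unhappy(own_color, polygon_neighbor_colors, adjacent_colors, threshold):
    """True if either neighborhood has too large a share of other colors."""
    def share_different(colors):
        if not colors:
            return 0.0  # an empty neighborhood cannot make a turtle unhappy
        return sum(c != own_color for c in colors) / len(colors)

    return (share_different(polygon_neighbor_colors) > threshold
            or share_different(adjacent_colors) > threshold)


def polygon_color(turtle_colors):
    """Majority color of the turtles in a polygon, or None if it is unoccupied."""
    if not turtle_colors:
        return None
    return Counter(turtle_colors).most_common(1)[0][0]
```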

Here is a video recording the simulation process.

How to identify polygon neighbors?

It is tricky to find the geometrical neighbors of each polygon, since Netlogo does not have this function. I used the Polygon Neighbors tool in ArcGIS 10.2 to create a text file that maps each polygon to its neighbors. Then, I deleted unnecessary information such as headers and asked Netlogo to read the file. Notice that neighbors are polygons that share either a boundary (edge) or a corner (node).

How to export to ArcGIS?

There is an Export button that exports the map to GIS. It writes the current map to finalmap.csv in the data folder. The information includes the color and percentage red for each polygon. To analyze it in ArcGIS, open the csv file in ArcGIS, and export the data as a dbf file to replace the original DC.dbf file.

How to calculate Moran's I and verify it?

Moran’s I is a measure of spatial autocorrelation. Values range from −1 (indicating perfect dispersion) to +1 (perfect correlation). If the different items are randomly distributed, Moran’s I is close to 0. There is a slider to choose whether to do row standardization or not. Row standardization is a technique for adjusting the weights in a spatial weights matrix. When weights are row standardized, each weight is divided by its row sum. The row sum is the sum of weights for a feature’s neighbors.
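For reference, the standard formula is I = (N / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all weights. The Python sketch below computes it with binary contiguity weights and optional row standardization. It illustrates the formula and is not the Netlogo code of the model; the dictionary-based inputs are an assumption.

```python
def morans_i(values, neighbors, row_standardize=False):
    """Global Moran's I with binary contiguity weights.

    values maps each feature id to its attribute value; neighbors maps each
    id to the ids of its neighbors. With row_standardize=True, each weight is
    divided by the number of neighbors in its row (the row sum).
    Not defined when all values are equal (zero variance).
    """
    n = len(values)
    mean = sum(values.values()) / n
    dev = {k: v - mean for k, v in values.items()}
    cross = 0.0    # sum of w_ij * dev_i * dev_j
    total_w = 0.0  # sum of all weights
    for i, nbs in neighbors.items():
        nbs = list(nbs)
        if not nbs:
            continue
        w = 1.0 / len(nbs) if row_standardize else 1.0
        for j in nbs:
            cross += w * dev[i] * dev[j]
            total_w += w
    variance_sum = sum(d * d for d in dev.values())
    return (n / total_w) * (cross / variance_sum)
```

For four features in a row with values 1, 1, 0, 0, this gives 1/3 without row standardization and 0.5 with it, which shows how the standardization choice changes the result.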

I have verified the Moran's I calculated in my model against ArcGIS, and the values are the same. To verify it, open the final map in GIS and create a new numeric field equal to pcetred. Then, use the tool "Spatial Autocorrelation (Morans I)" in ArcGIS. Choose the numeric field as input, "CONTIGUITY_EDGES_CORNERS" as the conceptualization of spatial relationships, and whether to do Row Standardization. See below for the settings.

 Compare the results.

Here is the code:

Friday, October 30, 2015

Tutorial on Using and Exporting GIS Vector Dataset in Netlogo

Hi, this is a tutorial on how to import and export vector datasets in Netlogo. It could be helpful when you want to study a specific area and you have the data in ArcGIS, for example as a shapefile. In this post, you will learn how to import the data into Netlogo and run simulations with it, as well as how to export the data back to ArcGIS for further analysis. Note that this tutorial is for vector data. If you are interested in using raster datasets in Netlogo, please check my post on the Urban Growth Model.

In this post, I will talk about how I developed a Schelling segregation model using the map of Washington DC. The model and data are available here:

And a video showing how it works:

Importing and drawing Vector data

Netlogo has a GIS extension that allows us to read data files from GIS and copy the values to patches or turtles for simulation. I am using the following lines to load the Vector data into Netlogo. Note that by loading the shapefile, you will also load the attributes in the .dbf file and the projection information in the .prj file.

Next, I drew the map on the display. To do that, I am using the following lines. In these lines I go through each vector-feature (each polygon), change the drawing color according to the SOC attribute, and then fill the polygon with the corresponding color. In the last two lines, I use gis:draw to draw the boundary of polygon data using white color.

Copying attributes to patches

In Netlogo, we cannot ask a polygon to do anything. Therefore, in order to study the area, we need to copy the attributes into patches. To do that, I loop through each polygon and copy the color attribute to the patch at its centroid. In this way, I am using one patch to represent one polygon. Keep in mind that you may need a larger Netlogo world, so that two centroids do not fall on the same patch.

Since this is a segregation model, I have also found the neighbors of all polygons. ArcGIS has a tool called Polygon Neighbors. For each polygon, the tool finds all the polygons that have coincident edges with it, and reports the information in a table. I exported this file to a text file and deleted all the headers and labeling numbers. Then, I use the following lines to ask each patch that represents a polygon to read the file and set neighbors.

Exporting Vector dataset

After running the simulation, how do we export the final map into ArcGIS for further analysis? So far the GIS extension does not allow us to modify or create a vector dataset. My idea is to write the information into a text file, open it in ArcGIS, save it as a .dbf file, and replace the original .dbf file. Here is the code I used to do that.

The color information is stored in the patches that represent polygons. Therefore, I loop through those patches and ask them to write down their ID and then color. Note that more attributes could be easily included in the table if necessary.
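The shape of such an export can be sketched in Python with the standard `csv` module. This is an illustration only, not the Netlogo code used in the post; the column names `ID` and `color` and the function name are assumptions.

```python
import csv
import io


def export_polygons(records, out):
    """Write one row per polygon: its ID, then its color.

    records is an iterable of (polygon_id, color) pairs; out is any writable
    text stream. More attributes could be added as extra columns.
    """
    writer = csv.writer(out)
    writer.writerow(["ID", "color"])  # hypothetical column names
    for polygon_id, color in records:
        writer.writerow([polygon_id, color])


# Example: write two polygons to an in-memory buffer.
buffer = io.StringIO()
export_polygons([(1, "red"), (2, "blue")], buffer)
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")`.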

It does take an extra step to use this attribute table in ArcGIS.  Simply open the text file in ArcGIS, save as .dbf, and replace the original one (remember to make a copy).

I hope you find this tutorial helpful, and let me know if you have any questions!