The role of Social Networks in the projection of international migration flows: an Agent-Based approach

Carla Anjos (University of Aveiro)
Pedro Campos (Statistics Portugal and University of Porto)
Work Session on Demographic Projections, 29 April 2010, Lisbon

Contents
Motivation, goals
The context
◦ Demography and migrations
◦ Social Networks
◦ The Multi-agent System
The Model
◦ Variables
◦ Gravitational Model
◦ Simulation/Parameters
Results
Final Remarks

Anjos & Campos, 2010

Demography and Migrations
Population estimates (component method):

$$P_t = P_{t-1} + N - M + I - E$$

P_t = population at time t
P_{t-1} = population at time t-1
N = number of births between t-1 and t
M = number of deaths between t-1 and t
I = number of immigrants between t-1 and t
E = number of emigrants between t-1 and t

Motivation
Population projections
◦ Needed to design social policies
Importance of studies of migration flows
◦ More accurate demographic forecasts
◦ Lack of information on migration flows
“New” approaches based on Agent-Based Computational Demography (ABCD)
◦ Bottom-up approach (Billari et al. (2003a); Billari and Prskawetz (2005))

Interaction between social mechanisms
[Diagram: the macro level and the micro level, linked by the situational mechanism, the mechanism of formation and the transformational mechanism. Source: Billari and Prskawetz (2005)]

Main goals
Verify the effect of the structure of social networks on migration flows
◦ Social network analysis
  - Density
  - Degree centralization: input, output, general

Social Networks
Relationships and individuals
Agents or actors: “vertices”
◦ Graph theory
◦ Organized within a society
  - Well-defined structure (or not?)
◦ A set of units
  - Social
  - Economic
  - Cultural
Links between individuals
◦ Oriented (“arcs”): directed transmission of something (goods, services, information)
◦ Non-oriented (“links”): undirected links between pairs of agents

Indicators of Social Networks
Agents
◦ Degree: number of adjacent agents
  - Non-oriented networks: total number of links
  - Oriented networks:
    Indegree: number of links that an agent receives
    Outdegree: number of links that depart from an agent
    General: number of adjacent agents (total, Indegree + Outdegree)
Networks
◦ Density: proportion between the number of existing links and the number of possible links among all the agents
  - More links: more cohesion, a structure with higher density
◦ Degree centralization: evaluates the structure of communication in the network
  - More variation in agents' centrality: more centralized network
  - Indegree, Outdegree, General

Multi-Agent Systems
Agent
◦ Entity that lives in a certain environment and has the capacity to interact with other agents
Characteristics:
◦ Action and interaction
  - Agents interact with other agents and with the environment
◦ Communication
◦ Individual goals and autonomy
  - Agents are oriented towards specific goals
◦ (Limits of) perception
  - “Limited rationality”: limited computational resources

Our study: the Variables

| Variable | Description | Domain |
|---|---|---|
| y | Age of the agent | {1, …, 95} |
| e | Educational level of the agent | {1, 2, 3} |
| r | Income of the household ($/1000) | [2; +∞[ |
| p | Number of individuals in the household | {1, 2, …, 15} |
| s | Number of individuals in the agent's social network | {2, …, 20} |
| w | Labour status (working / not working) | {0, 1} |

Gravitational Model, M_a

$$M_a = CM \;\square\; F_m \;\square\; PM$$

(□ marks an operator that is not legible in the source.)

M_a = propensity of an agent to migrate
CM = migration cost
F_m = force of migration
PM = propensity to migrate

Migration Level (ML)
◦ If ML is greater than M_a, the agent remains in the country of origin. Otherwise, the agent migrates to, or stays in, the USA. We assumed that three different levels of ML may occur (low, medium and high).
These values are defined as 1.5, 4.0 and 5.0, respectively.

Gravitational Model: propensity to migrate (PM) and migration cost (CM)

$$M_a = CM \;\square\; F_m \;\square\; PM \qquad PM = \frac{f_{EUA} \;\square\; f_O}{h \;\square\; 100}$$

(□ marks an operator that is not legible in the source.)

f_EUA = per capita income of the USA
f_O = per capita income of the country of origin
h = geographical distance between the two countries
CM (migration cost)
◦ U(0.5; 0.9) from the country of origin to the USA
◦ U(0.1; 0.4) from the USA to the country of origin

Gravitational Model: force of migration (F_m)

$$F_m = G \, \frac{M_N \, m_a}{d^2}$$

F_m = force of migration
m_a = agent's mass
M_N = mass of the social network
d = average distance between agents, written d_a for agent a:

$$d_a = \frac{1}{s} \sum_{i=1}^{s} d_{a,i}$$

Agent's mass and mass of the social network (the median enters the numerator of M_N):

$$m_a = \frac{\log(r) \;\square\; p}{w \;\square\; y/10} \qquad M_N = \frac{\log(r_N) \;\square\; \operatorname{median}\, p_N}{w_N \;\square\; y_N/10}$$

The data
IPUMS (Integrated Public Use Microdata Series, Ruggles et al. (2009))
The extracted database contains data on migration flows to the United States between 2001 and 2008.
Four communities in the USA were considered, originating from four different countries (Portugal, Mexico, China and Germany).

Parameters of the simulation
Countries
◦ Germany
◦ China
◦ Mexico
◦ Portugal
Three different continents
◦ Different territorial and social dynamics
Different development stages
Different migration flows
◦ Migrants have different characteristics in the USA

Parameters of the simulation
Initial considerations
◦ The majority of individuals migrate to the communities created by other individuals of the same nationality.
◦ The simulated population is proportional to the population in the IPUMS database
◦ Individuals are created within three clusters found in the original population
◦ Simulation period: 2000 to 2008

Simulation
2000
◦ Agents are created (respecting the clusters found in IPUMS)
2001 to 2008
◦ Ageing of agents in the USA
  - Agents decide their situation as migrants
◦ Creation of potential new migrants according to the original migrants
  - Agents decide to migrate to the USA or to stay in their country of origin
Three different scenarios (15 runs each)
◦ Simulation I (ML = 1.5): migration level is low, number of agents is high
◦ Simulation II (ML = 4.0): migration level is medium, number of agents is low
◦ Simulation III (ML = 5.0): migration level is high, number of agents is low

Validation
Stability of the model, assessed through the variability of the means over the 15 runs
Simulated data are similar to reality for the following variables:

| Country of origin | Variable | Scenario | Z* | p-value |
|---|---|---|---|---|
| Germany | Working situation (w) | I | -1.718 | 0.0858 |
| China | Household income (r) | I | -1.362 | 0.1731 |
| Mexico | Working situation (w) | I | -0.889 | 0.3743 |
| Mexico | Household income (r) | I | -1.362 | 0.1731 |
| Mexico | Household income (r) | II | -1.244 | 0.2135 |

* Wilcoxon test (significance threshold p < 0.05)

Density and degree centrality
[Figure not recoverable. Slide dated Porto, 15 March 2010]

Density: Mexico, Simulation I
[Figure not recoverable]

Final Remarks
Trends between 2000 and 2008
◦ Variables
  - Number of individuals in the household and age show different trends in simulated and real data
  - Income and working condition are similar for some scenarios
◦ Density
  - The greater the diameter of the networks, the lower the density
  - Links disappear
◦ Centralization
  - Indegree: the importance of the arrival of information to the agents in the network is high in the first periods and stabilizes afterwards.
    Agents in the USA are important to the arrival of new agents.
  - Outdegree: the importance of the information leaving each agent decreases over the period.
    Agents in the USA tend to lose their links to the other agents in the network.
  - General: follows the same trend as indegree.
In general, communication in the network is higher in the first years and stabilizes subsequently.

Limitations and further work
The model is not able to predict the trend of the main variables in the simulation
◦ A calibration procedure should be introduced at an intermediate period (2004?)
The structure of the networks has some influence on the flow of migrants

Some references
Billari, F. C., F. Ongaro, et al. (2003a), "Introduction: Agent-Based Computational Demography", in F. C. Billari and A. Prskawetz (eds.), Agent-Based Computational Demography: Using Simulation to Improve Our Understanding of Demographic Behaviour, Contributions to Economics, pp. 1-15, Heidelberg: Physica-Verlag.
Billari, F. C. and A. Prskawetz (2005), "Studying Population Dynamics from the Bottom-Up: The Crucial Role of Agent-Based Computational Demography", International Union for the Scientific Study of Population XXV International Population Conference, Tours, France.
Carrilho, M. J. (2005), "Metodologias De Cálculo Das Projecções Demográficas: Aplicação Em Portugal", Revista de Estudos Demográficos, Vol. 37, pp. 5-24.
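
The force-of-migration term and the migration decision rule of the gravitational model presented earlier can be illustrated with a minimal sketch. It rests on assumptions: the function names are hypothetical, the slides give no value for G (so it is a parameter with a placeholder default of 1.0), and the propensity M_a is passed in as a number rather than computed, because the slides do not legibly show how CM, F_m and PM are combined.

```python
from statistics import mean

# Migration levels (ML) for the three scenarios, as given in the slides.
ML_LEVELS = {"low": 1.5, "medium": 4.0, "high": 5.0}


def average_distance(distances):
    """d_a: mean distance between an agent and the members of its network."""
    return mean(distances)


def migration_force(agent_mass, network_mass, distance, g=1.0):
    """F_m = G * M_N * m_a / d^2 (g=1.0 is a placeholder, not from the slides)."""
    return g * network_mass * agent_mass / distance ** 2


def stays_in_origin(propensity, ml):
    """Decision rule from the slides: the agent remains in the country of
    origin when ML is greater than M_a; otherwise it migrates (or stays in
    the USA)."""
    return ml > propensity
```

For instance, an agent with propensity 4.5 would migrate under the medium scenario (ML = 4.0) but remain in the country of origin under the high scenario (ML = 5.0).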

IMPORTÂNCIA DAS REDES SOCIAIS NOS FLUXOS MIGRATÓRIOS: Aplicação de Sistemas Multi-agente
(The Importance of Social Networks in Migration Flows: An Application of Multi-Agent Systems)
Carla Anjos
Master's in Data Analysis and Decision Support Systems
Supervisor: Dr. Pedro Campos
Faculty of Economics, University of Porto
Porto, 15 March 2010

Migration
“Movement of a person across a given spatial boundary, with the intention of changing residence temporarily or permanently. Migration is subdivided into international migration (migration between countries) and internal migration (migration within a country).”
Instituto Nacional de Estatística (INE, 2003a)

Social networks: measures for agents
Degree
◦ Non-oriented networks: equal to the number of adjacent vertices
◦ Oriented networks:
  - Indegree: links received by the vertex
  - Outdegree: links leaving the vertex
  - General: number of adjacent vertices
Centrality
◦ Proportion between the number of links of an agent and the total number of links
  - Degree centrality: number of direct connections of each agent in a graph
  - Closeness centrality: measure of the length of the shortest path linking two agents
  - Betweenness centrality: proportion of all geodesic paths between a pair of vertices that include a given vertex, relative to the total possible number
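
The agent-level degree measures listed above can be computed directly from a list of arcs. The following is a small sketch in plain Python, not the Pajek procedure used in the study; the function name `degree_measures` and the example network are hypothetical. "General" is taken here as the number of distinct adjacent vertices, which equals indegree + outdegree only when no pair of agents is linked in both directions.

```python
from collections import defaultdict


def degree_measures(arcs):
    """Return indegree, outdegree and general degree for every vertex.

    arcs: iterable of (source, target) pairs of an oriented network.
    """
    received = defaultdict(int)    # links arriving at each vertex
    sent = defaultdict(int)        # links leaving each vertex
    neighbours = defaultdict(set)  # distinct adjacent vertices
    for source, target in arcs:
        sent[source] += 1
        received[target] += 1
        neighbours[source].add(target)
        neighbours[target].add(source)
    return {
        v: {"indegree": received[v], "outdegree": sent[v], "general": len(adj)}
        for v, adj in neighbours.items()
    }


# Hypothetical network: a and c are linked in both directions.
example_arcs = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]
example_degrees = degree_measures(example_arcs)
```

In this example agent `a` receives one link, sends two, and has two distinct neighbours, so its general degree (2) is smaller than indegree + outdegree (3).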

Algorithm
◦ Age (y):
  - If y_t ≤ 94, then y_{t+1} = y_t + 1
  - If y_t = 95, then the agent dies
◦ Educational level (e) depends on age:
  - If e_t = 1 and 1 ≤ y_{t+1} ≤ 14, then e_{t+1} = e_t = 1
  - If e_t = 1 and 15 ≤ y_{t+1} ≤ 18, then e_{t+1} = U(1, min(2, max_e))
  - If e_t = 1 and 19 ≤ y_{t+1} ≤ 94, then e_{t+1} = U(1, min(2, max_e))
  - If e_t = 2 and 19 ≤ y_{t+1} ≤ 94, then e_{t+1} = U(2, min(3, max_e))
◦ Income (r) varies in [2; +∞[ and depends on the inflation rate of the USA (taken as 3%). In t+1, the value of r is given by r_{t+1} = r_t + [U(-1, 1) × 0.03]
◦ Labour status (w) depends on age:
  - If 1 ≤ y_{t+1} ≤ 15, then w_{t+1} = 0
  - If 16 ≤ y_{t+1} ≤ 94, then w_{t+1} = Bernoulli(k), where k is the fraction of working people in the USA
◦ Number of individuals in the household (p):
  - If p_t = 1, then p_{t+1} = p_t + U(0, 1)
  - If p_t = 15, then p_{t+1} = p_t + U(-1, 0)
  - If 2 ≤ p_t ≤ 14, then p_{t+1} = p_t + U(-1, 1)
◦ The number of individuals in the agent's social network (s) varies according to the value of M_N in the previous year.
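
The yearly update rules above can be sketched as a single step function. This is an illustration, not the Repast implementation used in the study, and it makes several assumptions: U(a, b) is read as a discrete uniform draw on integers for e and p and as a continuous draw for income; income is floored at 2 to respect its domain; the defaults for `max_e` and `k` are placeholders; the names `Agent` and `step` are hypothetical. The update of s is omitted because the slide does not state its rule.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Agent:
    y: int    # age, 1..95
    e: int    # educational level, 1..3
    r: float  # household income ($/1000), at least 2
    p: int    # number of individuals in the household, 1..15
    w: int    # labour status, 0 or 1


def step(agent: Agent, rng: random.Random,
         max_e: int = 3, k: float = 0.6) -> Optional[Agent]:
    """Advance one agent from year t to t+1; return None if the agent dies."""
    if agent.y >= 95:
        return None
    y = agent.y + 1

    # Education can only rise, and only once the age thresholds are reached.
    e = agent.e
    if e == 1 and 15 <= y <= 94:
        e = rng.randint(1, max(1, min(2, max_e)))
    elif e == 2 and 19 <= y <= 94:
        e = rng.randint(2, max(2, min(3, max_e)))

    # Income: r_{t+1} = r_t + U(-1, 1) * 0.03, floored at 2 (assumption).
    r = max(2.0, agent.r + rng.uniform(-1.0, 1.0) * 0.03)

    # Labour status: not working up to age 15, Bernoulli(k) from 16 to 94.
    if y <= 15:
        w = 0
    elif y <= 94:
        w = 1 if rng.random() < k else 0
    else:
        w = agent.w  # no rule is given for age 95; keep the previous value

    # Household size stays within 1..15.
    if agent.p == 1:
        p = agent.p + rng.randint(0, 1)
    elif agent.p == 15:
        p = agent.p + rng.randint(-1, 0)
    else:
        p = agent.p + rng.randint(-1, 1)

    return Agent(y=y, e=e, r=r, p=p, w=w)
```

Passing a seeded `random.Random` instance keeps runs reproducible, which matters when the same scenario is repeated 15 times.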

Simulation parameters
Age (y)
◦ 1 ≤ y ≤ 95
◦ Assignment of y: normal distribution, N(μ_y, σ_y)
Education (e)
◦ Possible values of e
  - 1: fewer than 9 years of schooling
  - 2: between 9 and 12 years of schooling
  - 3: more than 12 years of schooling
◦ Restrictions
  - y ≤ 14: e = 1
  - 15 ≤ y ≤ 18: e = 1 or e = 2
◦ Assignment of e: uniform random distribution, U(min_e, max_e)
Household income (r)
◦ r ∈ [2; +∞[
◦ Assignment of r: normal distribution, N(μ_r, σ_r)

Simulation parameters
Labour status (w)
◦ Possible values of w
  - w = 0 if the agent is not working
  - w = 1 if the agent is employed (y > 15)
◦ Assignment of w: Bernoulli(k), where k is the fraction of individuals working in the USA
Number of individuals in the household (p)
◦ 1 ≤ p ≤ 15
◦ Assignment of p: uniform random distribution, U(1st quartile of p, 3rd quartile of p)
Number of individuals in the agent's social network (s)
◦ 2 ≤ s ≤ p + 10, with a maximum of s = 20
◦ Assignment of s: uniform random distribution, U(p, max_s)

Social networks: measures for networks
Clustering (transitivity)
◦ Probability that two neighbours of a given vertex are linked
Density
◦ Proportion between the number of existing relations and the number of possible relations
  - Oriented network: the number of possible relations is the number of vertices N multiplied by N-1
  - Non-oriented network: the number of possible relations is N(N-1)/2
Average path length
◦ Average number of links in the shortest path between any two vertices
Diameter
◦ Maximum number of links in the shortest path between any two vertices
Degree centralization
◦ Variation in centrality across the network

Resources used
Database
◦ IPUMS: collection of real migration data
Software
◦ SPSS: data processing
◦ Repast: running the simulation of the model
◦ Pajek: social network analysis

Model stability
Variability of the means of the 15 simulations: Germans, Simulation I

| Variable | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 |
|---|---|---|---|---|---|---|---|---|---|
| Household size | 2.40±0.03 (1.4%) | 2.73±0.07 (2.5%) | 2.90±0.06 (2.2%) | 3.01±0.06 (1.9%) | 3.11±0.06 (1.8%) | 3.17±0.04 (1.3%) | 3.23±0.05 (1.6%) | 3.27±0.05 (1.6%) | 3.30±0.05 (1.5%) |
| Age | 43.8±0.7 (1.6%) | 39.4±1.1 (2.7%) | 38.0±0.8 (2.0%) | 37.4±0.8 (2.2%) | 37.1±0.6 (1.7%) | 37.1±0.6 (1.5%) | 37.2±0.6 (1.7%) | 37.6±0.6 (1.6%) | 38.0±0.6 (1.5%) |
| Social network size | 7.85±0.21 (2.7%) | 7.31±0.14 (1.9%) | 7.39±0.13 (1.8%) | 7.57±0.15 (2.0%) | 7.79±0.14 (1.8%) | 8.02±0.14 (1.7%) | 8.22±0.15 (1.8%) | 8.39±0.16 (1.9%) | 8.53±0.15 (1.8%) |
| Income | 65.5±1.5 (2.2%) | 61.9±1.6 (2.5%) | 61.4±1.7 (2.8%) | 61.1±1.7 (2.8%) | 61.0±1.7 (2.7%) | 61.1±1.8 (2.9%) | 61.5±1.8 (2.9%) | 61.4±1.7 (2.7%) | 61.4±1.5 (2.4%) |
| Fraction of workers | 0.476±0.023 (4.9%) | 0.552±0.017 (3.1%) | 0.504±0.022 (4.4%) | 0.473±0.016 (3.3%) | 0.465±0.017 (3.7%) | 0.460±0.011 (2.3%) | 0.455±0.010 (2.3%) | 0.457±0.014 (3.1%) | 0.460±0.010 (2.2%) |
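
Each cell of the stability table reports a mean over the 15 runs, a spread, and a percentage that appears to be the spread divided by the mean. A minimal sketch of that summary follows, assuming the spread is the sample standard deviation (the slides do not say which measure was used); the function name `run_stability` is hypothetical.

```python
from statistics import mean, stdev


def run_stability(run_means):
    """Summarise one variable in one year across simulation runs.

    run_means: the value of the variable's mean in each run.
    Returns (mean, spread, spread as a percentage of the mean).
    """
    m = mean(run_means)
    s = stdev(run_means)  # sample standard deviation (assumption)
    return m, s, 100.0 * s / m
```

A low percentage across all years, as in the table, indicates that repeated runs of the same scenario give similar averages.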