*****************************************
* Cluster Analysis Paper ****************
* Regressions of Outcomes on Clusters ***
* Date created: December 22, 2009 *******
* Last update: December 23, 2009 10am ***
*****************************************

* We use the data created in mmp_cluster_data10.dta.
* Cluster membership data come from the Matlab code cluster_indivs.m.
* We run the regressions in Stata rather than Matlab
* because Stata reports more diagnostics (e.g., pseudo-R2).

clear
set mem 2000m
set matsize 10000
set more off

cd "/Users/fgarip/Documents/Data/MMP Data"
* cd "/Volumes/s-vol$/Home/Faculty/fgarip/Documents/Data/MMP Data"
* cd "M:\Documents\Data\MMP Data\"

* Import cluster assignments (one row per individual, in the same order as init_data)
insheet using "/Users/fgarip/Documents/Clustering Paper 07/Analysis/cluster.txt"
gen id = _n
ren v1 clcur
sort id
save "Clustering/clusters.dta", replace

* Merge cluster assignments onto the individual-level data
use "Clustering/init_data.dta", clear
sort id
merge id using "Clustering/clusters.dta"
drop _merge

* Cluster dummies; inc_max (income maximizers) is the omitted category below
tab clcur, gen(c)
ren c1 inc_max
ren c2 risk_div
ren c3 network
ren c4 urban

***************
* Regressions *
***************

* Year dummies
tab year, gen(y)

cd "/Users/fgarip/Documents/Clustering Paper 07/Analysis/Tables Graphs"

global vars "risk_div network urban y1-y35"

set logtype text
log using "outcome_regressions.txt", replace

* Data Available for Most Observations *
****************************************

regress ttrip $vars
regress logtexp $vars
logit repmig $vars
logit hhmigc1 $vars
logit othdest $vars
logit us_agri $vars
logit us_manuf $vars
logit us_serv $vars
logit us_none $vars

* Receiving residency seems more likely for income maximizers because they
* have been observed over a longer period of time.
logit resid $vars

* Data Available for Limited No. of Observations *
**************************************************

tabstat undoc legmig coyote crossfamfr crosstij loguswagem, stats(mean N) by(clcur)
codebook undoc legmig coyote crossfamfr crosstij loguswagem if relhead==1
codebook undoc legmig coyote crossfamfr crosstij loguswagem if relhead~=1

* These variables are available mostly for household heads, and therefore for
* income maximizers or urban migrants more than for risk diversifiers or
* network migrants. Comparisons across clusters are not reliable.
logit undoc $vars
logit legmig $vars
logit coyote $vars
logit crossfamfr $vars
logit crosstij $vars
regress loguswagem $vars

* Remittance information is available for household heads on their first trip,
* which should also be their last trip in the data. The same holds for
* English skills and social relationships in the U.S. (recorded for last trips).
logit remsavfl $vars
regress logremsavmc $vars
logit paistrip $vars
logit chicanos $vars
regress english $vars
ologit english $vars
logit reltrip $vars
logit sport $vars
logit social $vars
logit blacks $vars
logit anglos $vars
logit latinos $vars
logit chicanos_fr $vars
logit latinos_fr $vars
logit anglos_fr $vars
logit blacks_fr $vars

* Indicator for relationships with Chicanos or Latinos (missing if neither is observed)
gen chic_latinos = .
replace chic_latinos = 0 if chicanos==0 & latinos==0
replace chic_latinos = 1 if chicanos==1 | latinos==1
regress chic_latinos $vars

log cl
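
* ----------------------------------------------------------------------
* Optional sketch, not part of the original analysis: the logits on the
* widely available binary outcomes could be run in a single loop instead
* of one command per outcome. The local macro name binary_outcomes is
* hypothetical; the variable names are copied from the logit commands
* above. This assumes the global $vars is still defined. To capture the
* output in the log file, the loop would have to be placed before the
* log is closed.
* ----------------------------------------------------------------------
local binary_outcomes repmig hhmigc1 othdest us_agri us_manuf us_serv us_none resid
foreach v of local binary_outcomes {
    * Print the outcome name so each block of output is easy to locate
    display as text _n "Outcome: `v'"
    logit `v' $vars
}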