Linking

  1. Linking across units of analysis
  2. Linking IPUMS MICS Data to UNICEF MICS Files (single sample)

Linking across units of analysis

IPUMS MICS provides data files for different units of analysis. Most variables are unique to a specific unit of analysis, but users may want to link the downloaded IPUMS MICS files across units. In Stata, this is straightforward using linking keys and the merge command.

Key variables

In most situations the key identifying variables are the sample, cluster, household number, and line number of the person.

The variables SAMPLE, CLUSTER, and HHNO are identically named in all of the units of analysis.

Line number of individual persons

| Individual person | Unit of analysis available | Key variable for person's line number |
|---|---|---|
| Persons in household (household members) | HL | LINENO |
| Women | WM, BH | LINEWM |
| Men | MN | LINEMN |
| Children 0-4 and 5-17 | CH, FS | LINECH |
| Woman who is the mother or caretaker of a child age 0-4 or 5-17 | CH, FS | LINEMC |

Key variables, which link the observations of one data file to those of the other, must have the same names in both data files. If names are not the same, renaming of key variables in one of the two data files is required.

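Women, men, and children 0-4 and 5-17 can be linked to their corresponding records in the household member file. Because that file stores the line number as LINENO, rename LINEWM, LINEMN, or LINECH to LINENO before merging.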
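
As a minimal sketch of linking women to their household member records, the example below uses invented toy data in place of real extracts. The non-key variables (age, wm_educ) and all values are hypothetical. The woman's line number is renamed so that the key has the same name in both files.

```stata
* Toy household member (HL) file; values and the variable "age" are invented.
clear
input sample cluster hhno lineno age
1 1 1 1 40
1 1 1 2 35
1 1 2 1 28
end
tempfile hl
save `hl'

* Toy women (WM) file; "wm_educ" is a hypothetical variable.
clear
input sample cluster hhno linewm wm_educ
1 1 1 2 3
1 1 2 1 1
end

* Rename the key so it matches LINENO in the HL file.
rename linewm lineno

* Each woman matches exactly one household member record, so the merge is 1:1.
* keep(master match) drops household members who are not in the women file.
merge 1:1 sample cluster hhno lineno using `hl', keep(master match)
```

Run as a do-file, this should leave one row per woman with her household member variables attached.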

Births of women 15-49 can be linked to the corresponding woman's record using LINEWM. No renaming is needed because both files use LINEWM.

If the mother or caretaker of a child age 0-4 or 5-17 is also a woman respondent in the household, LINEMC can be used to link to the corresponding woman's record (LINEWM).

Merge relationships

An important step when merging IPUMS MICS data files is to know the type of relationship between two files, as well as to define the desired unit of analysis.

There are two types of relationships: “one to many” and “one to one”. In a “one to many” relationship, one record relates to many others, for example the relationship between a household and its women, men, or children. Viewed from the file that holds the many records, the same relationship is “many to one” (m:1 in Stata). A “one to one” relationship is, for example, the relationship between the list of household members and the women, men, and children records.

The desired unit of analysis establishes which one of the data files will be the base dataset (called "master dataset" in Stata terminology). That is the dataset to which the information from the other dataset (called "using dataset" in Stata terminology) is linked.

For example, if you want to add information about mother's characteristics to a child, you would link the WM dataset ("using") to the CH dataset ("master"/base) in order to match women (WM) who are mothers/caretakers of the child (CH) using LINEMC.
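
The child-to-mother example can be sketched as follows, using invented toy data in place of real CH and WM extracts. The variable wm_educ and all values are hypothetical. The sketch assumes the CH file does not already contain a variable named linewm.

```stata
* Toy women (WM) file, the "using" dataset; "wm_educ" is a hypothetical variable.
clear
input sample cluster hhno linewm wm_educ
1 1 1 2 3
1 1 2 1 1
end
tempfile wm
save `wm'

* Toy children (CH) file, the base/"master" dataset.
clear
input sample cluster hhno linech linemc
1 1 1 3 2
1 1 1 4 2
1 1 2 2 1
end

* The key must have the same name in both files:
* LINEMC in CH points to LINEWM in WM.
rename linemc linewm

* Several children can share one mother, so the merge is m:1.
* keep(master match) drops women who are not a mother or caretaker of any child.
merge m:1 sample cluster hhno linewm using `wm', keep(master match)
```

Each child row keeps its own LINECH and gains the mother's variables. Children whose mother or caretaker was not interviewed would remain with _merge equal to 1.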

Merge relationships

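| Base/"master" dataset (rows), "using" dataset (columns) | HH | HL | WM | MN | CH | FS | BH |
|---|---|---|---|---|---|---|---|
| HH | | x | x | x | x | x | x |
| HL | m:1 | | 1:1 | 1:1 | 1:1 | 1:1 | 1:1 |
| WM | m:1 | 1:1 | | | x | x | x |
| MN | m:1 | 1:1 | | | | | |
| CH | m:1 | 1:1 | m:1 | | | | |
| FS | m:1 | 1:1 | m:1 | | | | |
| BH | m:1 | 1:1 | m:1 | | | | |

x = merging in that direction is not recommended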

NOTE: In most Children 5-17 (FS) samples, only one child in that age range was randomly selected per household. However, because some children were substituted after incomplete interviews, it is recommended to use the m:1 merge instead of the 1:1 merge.

Steps

Open base/"master" file

NOTE: You may have to sort your "base"/"master" and "using" data files before merging.

sort sample cluster hhno [line number] 

Merging

merge 1:1 sample cluster hhno [lineno] using [IPUMS_MICS.dat "using" file]
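
The steps above can be sketched end to end with invented toy data: a household member (HL) file as the base/"master" dataset and a household (HH) file as the "using" dataset. The variable urban and all values are hypothetical. SAMPLE is included in the key so that the same commands also work on an extract that contains more than one sample.

```stata
* Toy household (HH) file, the "using" dataset; "urban" is a hypothetical variable.
clear
input sample cluster hhno urban
1 1 1 1
1 1 2 0
end
tempfile hh
save `hh'

* Toy household member (HL) file, the base/"master" dataset.
clear
input sample cluster hhno lineno
1 1 1 1
1 1 1 2
1 1 2 1
end

* Confirm that the full key uniquely identifies persons in the master file;
* isid stops with an error if it does not.
isid sample cluster hhno lineno

* Many persons share one household, so the merge is m:1 on the household key.
merge m:1 sample cluster hhno using `hh'

* _merge is 3 for matched records, 1 for master only, and 2 for using only.
tabulate _merge
```

Checking _merge after every merge is a quick way to catch mismatched keys before continuing with the analysis.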

Linking IPUMS MICS Data to UNICEF MICS Files (single sample)

Users may want to link additional variables from the original UNICEF MICS SPSS files (that are not yet in the IPUMS MICS data collection) to an IPUMS MICS Stata data extract. To ensure correct linkage, a unique identifier that is identical in name and in character length, sometimes known as a linking key, must be used.

Import the original SPSS MICS dataset to Stata

import spss using [spss .sav file]

In Stata, create key variables whose names match the IPUMS MICS harmonized variable names:

gen cluster = HH1
gen hhno = HH2

* Line number of child (linech), woman (linewm), or man (linemn), if needed
gen lineno = LN

Merge the IPUMS MICS Stata file

merge 1:1 cluster hhno [lineno] using [IPUMS_MICS data file]
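
A minimal sketch of the whole workflow for a women's file is shown below. It uses invented toy data in place of the imported UNICEF file and the IPUMS MICS extract. The variables EXTRA and wm_educ are hypothetical, and the sketch assumes that LN holds the woman's line number in the original file.

```stata
* Toy IPUMS MICS women extract (single sample), the "using" dataset.
* "wm_educ" is a hypothetical variable.
clear
input sample cluster hhno linewm wm_educ
1 1 1 2 3
1 1 2 1 1
end
tempfile ipums
save `ipums'

* Toy stand-in for the original UNICEF MICS file; in practice this data
* would come from "import spss". "EXTRA" is a hypothetical variable
* that is not in IPUMS MICS.
clear
input HH1 HH2 LN EXTRA
1 1 2 7
1 2 1 9
end

* Create key variables under the IPUMS MICS names.
gen cluster = HH1
gen hhno = HH2
gen linewm = LN

* Stop with an error if the key does not uniquely identify records.
isid cluster hhno linewm

* Link the two files on the shared key and inspect the result.
merge 1:1 cluster hhno linewm using `ipums'
tabulate _merge
```

With real data, records with _merge not equal to 3 are worth inspecting, since they exist in only one of the two files.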