Stata's facilities for survey data analysis are centered around the svy prefix command. Will the results of such an analysis be identical to those I would obtain by selecting only the cases in that subpopulation and running the analysis on those cases? How can I analyze a subpopulation of my survey data in Stata?
This example uses an if statement to define the subpopulation based on the value of the age variable ridageyr. Repeated replication methods, subpopulation analysis, and missing data options in SAS v9. Stata offers a large variety of statistical features, including power analysis, cluster analysis, linear regression, and choice modeling. Restricting the data to the subpopulation before the analysis would give correct point estimates but incorrect standard errors. Stata, SPSS, and Epi Info have also developed special procedures or modules to analyze complex sample survey data. McGovern, Harvard Center for Population and Development Studies; Geary Institute and School of Economics, University College Dublin; August 2012. Abstract: This document provides an introduction to the use of Stata. The formulas for the estimates of a total and its variance for a simple sampling design with no clusters or stratification are shown in the entry for subpopulation estimation in the [SVY] Survey Data Reference Manual.
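For a simple random sample of n units drawn without replacement from a population of N units, with no clusters or stratification, the total and its variance can be sketched in standard textbook notation, which may differ from the manual's. The 0/1 subpopulation indicator \delta_j and the zero-filled variable z_j = \delta_j y_j are symbols introduced here for illustration.

```latex
\hat{Y} = \frac{N}{n}\sum_{j=1}^{n} y_j,
\qquad
\hat{V}(\hat{Y}) = N^2\left(1 - \frac{n}{N}\right)\frac{s_y^2}{n},
\qquad
s_y^2 = \frac{1}{n-1}\sum_{j=1}^{n}\left(y_j - \bar{y}\right)^2

\hat{Y}_S = \frac{N}{n}\sum_{j=1}^{n} \delta_j y_j = \frac{N}{n}\sum_{j=1}^{n} z_j,
\qquad
\hat{V}(\hat{Y}_S) = N^2\left(1 - \frac{n}{N}\right)\frac{s_z^2}{n},
\qquad
s_z^2 = \frac{1}{n-1}\sum_{j=1}^{n}\left(z_j - \bar{z}\right)^2
```

Both sums for the subpopulation run over all n sampled units, not only the units that fall in the subpopulation. Dropping the other cases leaves the estimated total unchanged but replaces n and s_z^2 in the variance, which is why the standard error changes.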
In Stata, this is the subpop() option, while in SAS it is the DOMAIN statement. Nonzero values are included in the analysis, except for missing values, which are excluded from the analysis. I have a survey sample and I am using a subpopulation of the data set for my analysis. A closer examination of subpopulation analysis of complex-sample survey data, article in Stata Journal 8(4). I see the option in the SPSS Complex Samples procedures to analyze only a subpopulation. This entry discusses subpopulation estimation and explains why you should use the subpop() option instead of if and in for your survey data analysis. svy survey data: data analysis and statistical software. Stata has two subpopulation options that are very flexible and easy to use. We conclude with important considerations for subpopulation analyses. I have created dummy records that represent values from the missing strata and clusters when I queried my set of interest. So, Stata does adjust the subpopulation weights, but it doesn't adjust them to the subpopulation size.
Some of the assumptions don't really apply, though, because of the extremely large sample size involved. According to Heeringa, West, and Berglund (Applied Survey Data Analysis, CRC Press, chapters 4 and 7), yes, it matters. The model-predicted log-odds for the categorical subpopulations will be. The authors recommended the unconditional subclass analysis because it preserves the main survey design and provides larger standard errors. A practical introduction to Stata, Harvard University. Version control ensures statistical programs will continue to produce the same results no matter when you wrote them. [SVY] subpopulation estimation: subpopulation estimation for survey data. [SVY] direct standardization: direct standardization of means, proportions, and ratios. It seems to me that adjusting to the full subpopulation size is the correct thing to do, but maybe I'm missing something.
Thomas Lumley, April 3, 2020. Estimating a mean or total in a subpopulation (domain) from a survey, e.g., the mean blood pressure in women, is not done simply by taking the subset of data in that subpopulation and pretending it is a new survey. If you are looking at cigarette smokers aged 12-17, code so that. Datasets for Stata Survey Data Reference Manual, Release 9. The Household Component of the Medical Expenditure Panel Survey (hereafter, MEPS) is a complex, multistage probability sample that incorporates stratification, clustering, and oversampling of some subpopulations (Black, Hispanic, and Asian in some years). Using Stata for subpopulation analysis of complex sample survey data. West, Center for Statistical Consultation and Research.
Stata wants the subpopulation variable to be coded 0/1. This approach has the effect of producing estimates for the population of interest while incorporating the full sample design information for variance estimation. The purpose of this seminar is to explore some issues in the analysis of survey data using Stata 11. Hi all, I am new to SPSS Complex Samples and have a question regarding subpopulation analysis. Sep 24, 2014: PPT, Using Stata for subpopulation analysis of complex sample survey data. Guidelines for analyzing Add Health data, Carolina Population. Before we begin, you will want to be sure that your copy of Stata is up to date. Introduction to structural equation modeling using Stata, Chuck Huber, StataCorp. Subpopulation: a part or subdivision of a population, especially one originating from some other population. In statistics, a population is a set of similar items or events which is of interest for some question or experiment. In the analysis of complex sample data, jackknife repeated replication is a method to estimate the variance of a statistic that can result in substantially reduced variance when compared to traditional estimators.
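The 0/1 coding that Stata expects for a subpopulation variable can be sketched as follows. This is a minimal illustration, not a definitive recipe: apart from ridageyr, the variable names are assumptions based on NHANES conventions (sdmvpsu, sdmvstra, wtmec2yr for the design, bpxsy1 as an example outcome), the indicator name adult is hypothetical, and the age cutoff of 18 is arbitrary.

```stata
* Sketch only: all names except ridageyr are assumed (NHANES-style)
* Declare the full survey design once, on the complete data set
svyset sdmvpsu [pweight = wtmec2yr], strata(sdmvstra)

* Indicator: 1 = in the subpopulation, 0 = not in it
* Observations with missing age are left missing rather than coded 1
generate byte adult = (ridageyr >= 18) if !missing(ridageyr)

* Point estimates use adults only; variance uses the full design
svy, subpop(adult): mean bpxsy1
```

Guarding against missing values matters here because Stata treats a missing numeric value as larger than any number, so ridageyr >= 18 alone would be true for observations with missing age.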
When the observed variables are categorical, CFA is also referred to as item response theory (IRT) analysis (Fox, 2010). Exports graphs to EPS or TIF, to PNG or SVG, and so on. I will discuss several examples of possible subpopulation analysis approaches that analysts could take using the Stata svy. The following syntax demonstrates, generally, how an analyst can conduct subpopulation analysis using IPUMS data without compromising the design structure of the data. As UCLA explains in its Stata FAQs, when the subpopulation option is used, only the cases defined by the subpopulation are used in the calculation of the estimate, but all cases are used in the calculation of the standard errors. Another option is to create a dichotomous variable where the subpopulation of. Stata applies the poststratification adjustment according to the estimation.
Remarks and examples: Subpopulation estimation involves computing point and variance estimates for part of the population. Subpopulation: definition of subpopulation by The Free. Subpopulation estimation from the technical point of view. Using the haven library will allow you to import data in several formats. Subpopulation definition and meaning, Collins English Dictionary. NCES 2014-364 will help you use subsetting commands in SAS and Stata to.
Use the subpop() option to select a subpopulation for analysis, rather than selecting the study population in the Stata program while preparing the data file. As of Stata 11, most of the diagnostic commands that you would use after regress, logistic, etc. In this presentation, I provide an overview of important considerations that analysts of large public-use survey datasets must keep in mind when attempting to make inferences for finite subpopulations of research interest. Subpopulation analysis: when using the svy commands to analyze only a portion of the sample (a subpopulation), it is important to analyze the entire data set and to use the subpopulation option to identify those observations you want to include in the estimate. Data analysis with Stata 12 tutorial, University of Texas. Applied Survey Data Analysis in Stata 11, IDRE Stats. How to prepare and analyze pair data in the National Survey on Drug Use and Health. Thus, this paper demonstrates examples of subpopulation analysis with jackknife replicate weights using a reweighting method. How is the number of observations computed for subpopulation estimation? Dec 17, 2012: Introduction to Stata, generating variables using the generate, replace, and label commands. UCLA further adds that using if in the subpop() option does not remove cases from the analysis. Propensity score analysis for complex survey data using pscore. Software for analysis of YRBS data, Centers for Disease Control. Swire is a software interface enabling us to query Stata to execute basic operations like reading or writing data.
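The difference between an if qualifier on the command and an if expression inside subpop() can be sketched side by side. As before, only ridageyr comes from the text above; the design variables and the outcome bpxsy1 are assumed NHANES-style names, and the cutoff of 18 is arbitrary.

```stata
* Sketch only: all names except ridageyr are assumed (NHANES-style)
svyset sdmvpsu [pweight = wtmec2yr], strata(sdmvstra)

* Discouraged: the if qualifier restricts the estimation sample itself,
* so the design information from the excluded cases is lost
svy: mean bpxsy1 if ridageyr >= 18 & !missing(ridageyr)

* Preferred: if inside subpop() only marks the subpopulation;
* every observation still contributes to variance estimation
svy, subpop(if ridageyr >= 18 & !missing(ridageyr)): mean bpxsy1
```

The two commands should report the same mean, and any difference should show up in the standard errors and degrees of freedom.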
Number of observations for subpopulation estimation. This is probably better than the seemingly arbitrary result I get in t2, but I'd really like at least the option for my result to be adjusted up to the subpopulation size. Combines and reshapes datasets, manages variables, and creates graphs. This way Stata knows to use only the subpopulation data in the mean.
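One way to see how the observation counts and the estimated sizes differ is to inspect the results that svy stores after a subpopulation estimate. This is a sketch under the same assumptions as above: NHANES-style design variable names, a hypothetical outcome bpxsy1, and an arbitrary age cutoff.

```stata
* Sketch only: all names except ridageyr are assumed (NHANES-style)
svyset sdmvpsu [pweight = wtmec2yr], strata(sdmvstra)
svy, subpop(if ridageyr >= 18 & !missing(ridageyr)): mean bpxsy1

* Number of observations used from the full design
display e(N)
* Number of observations that fall in the subpopulation
display e(N_sub)
* Weighted estimates of the population and subpopulation sizes
display e(N_pop)
display e(N_subpop)
```

Comparing e(N_sub) with e(N_subpop) separates the raw count of subpopulation cases from the weighted size they are estimated to represent.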
He proposed that correlations between tests of mental abilities. However, the sampling design information from the entire sample is used in computing variance estimates and related quantities, including standard errors and degrees of freedom. Checking assumptions when doing a subpopulation analysis can be even trickier. Confirmatory factor analysis (CFA) is used to study the relationships between a set of observed variables and a set of continuous latent variables.
CiteSeerX: SAS Global Forum 2010, Statistics and Data. I am using a survey sample and am trying to analyze a subpopulation. Description: Subpopulation estimation focuses on part of the population. To present this numerically, you can ask Stata for the skewness and kurtosis statistics, including p-values, as we did in section 3. SPSSX Discussion: Complex Samples subpopulation question. The analysis of subpopulations is one place where survey data and.
Subpopulation: interested in a specific subgroup of your sample? Charles Spearman (1904) is credited with developing the common factor model. When analyzing these subpopulations (also known as domains), you need to use the appropriate option. My dataset of interest is a subpopulation of a large administrative database. The correct way to perform a subpopulation analysis with survey data in Stata is to specify the subpopulation in a svy. I aim to clarify this point in this FAQ, which is organized as follows. A common and frequently desired technique for analysis of survey data in practice is the restriction of estimation to a subpopulation of interest. In recent years, general-purpose statistical software packages have incorporated new procedures that feature several useful options for design-based analysis of complex-sample survey data. Stata's facilities for survey data analysis are centered around the svy prefix command.
The Stata News, a periodic publication containing articles on using Stata and tips on using the software, announcements of new releases and updates, feature highlights, and other announcements of interest to Stata users, is sent to all Stata users and those who request information about Stata from us. PDF: Using Stata to analyze data from a sample survey. I have an indicator variable (0 = data of interest, 1 = dummy records). For general information on analysis of complex sample survey data, refer to section E, chapter 19 of the United Nations book Household Sample Surveys in Developing and. IBM: How does using a group as a subpopulation in the SPSS. As stated in the Stata survey manual, when the subpop() option is used, the subpopulation is actually defined by the 0s (false), which indicate those cases to be excluded from the subpopulation. Berglund, Institute for Social Research, University of Michigan, Ann Arbor, Michigan.
Stata is a suite of applications used for data analysis, data management, and graphics. When you specify a subpopulation analysis, only the cases in that subpopulation are used to compute the point estimates, such as means and regression coefficients. A statistical population can be a group of existing objects. Subpopulation command, National Center for Education. Subpopulation analysis is used to compute point and variance estimates for a subset of the sampled population. In this article, we aim to focus in more detail on the practical issues underlying subpopulation analysis of complex-sample survey data. You create a new variable that equals 1 for observations in the subpopulation and 0 for observations not in the subpopulation. Though the JRR for subpopulation analysis is a standard feature in Stata 10. I am trying to get the mean, median, 10th percentile, and 90th percentile of a continuous variable for my subpopulation of interest. This number may not be obvious for subpopulation estimation for survey data. Participants will have time for a question-and-answer session building upon the examples. The data set and documentation can be downloaded from the NHANES web site.