Research Terminology – In its most common sense, methodology is the study of research methods. However, the term can also refer to the methods themselves or to the philosophical discussion of associated background assumptions. A method is a structured procedure for bringing about a certain goal, like acquiring knowledge or verifying knowledge claims. This normally involves various steps, like choosing a sample, collecting data from this sample, and interpreting the data. The study of methods concerns a detailed description and analysis of these processes. It includes evaluative aspects by comparing different methods.
In this way, their benefits and drawbacks are evaluated, as well as the research goals for which they may be used. These descriptions and evaluations are predicated on philosophical background assumptions; examples include how to conceptualize the phenomena under study and what constitutes evidence in favor of or against them. In its broadest sense, methodology encompasses the discussion of these more abstract issues.
Research Terminology
Knowing common research terminology helps you understand how to read and interpret scholarly journal articles so you can more effectively apply the results to real world human performance. The following are basic research terms and definitions.
A | Accuracy: In survey research, accuracy refers to the match between a sample and the target population. It also indicates how close a value obtained from a survey instrument or assessment is to the actual (true) value. Action Research: Action research is conducted to solve problems, inform policy, or improve the way that issues are addressed and problems solved. There are two broad types of action research: participatory action research and practical action research. Administrative Data: Administrative data are used in support of the operations and service delivery of government departments and other organizations. Examples are information about individual children, families, and/or providers of early care and education and other family benefits and services. The data are collected and maintained primarily for administrative (not research) purposes. Alternative Hypothesis: The experimental hypothesis stating that there is some real difference between two or more groups. It is the alternative to the null hypothesis, which states that there is no difference between groups. Average: A single value (mean, median, mode) representing the typical, normal, or middle value of a set of data. Axiom: A statement widely accepted as truth. |
B | Bar Chart/Graph Bar charts are used by researchers to visually represent the frequencies or percentages with which different categories of a variable occur. They are most often used when describing and comparing the percentages of different groups with a specific characteristic. Between-Group Variance A measure of the difference between the means of various groups. Between-Subject Design Experimental design in which a different group of subjects is used for each level of the variable under study. Bias Influences that distort the results of a research study. Beta The probability of making an error when comparing groups and stating that differences between the groups are the result of chance variations when in reality the differences are the result of the experimental manipulation or intervention. Also referred to as the probability of making a Type II error. |
C | Case Study An intensive investigation of the current and past behaviors and experiences of a single person, family, group, or organization. Categorical Data Variables with discrete, non-numeric or qualitative categories (e.g., gender or marital status). The categories can be given numerical codes, but they cannot be ranked, added, multiplied or measured against each other. Also referred to as nominal data. Categorical Data Analysis Categorical data classify responses or observations into discrete categories (e.g., respondents’ highest level of education is often classified as less than high school, high school, college, and post-graduate). While there are many techniques for analyzing such data, ‘categorical data analysis’ usually refers to the analysis of one or more categorical dependent variables and their relationships to one or more predictor variables (e.g., logistic regression). Causal Analysis An analysis that seeks to establish the cause and effect relationships between variables. Census The collection of data from all members, instead of a sample, of the target population. Central Tendency A measure that describes the “typical” or average characteristic; the three main measures of central tendency are mean, median and mode. Cluster Analysis Cluster analysis is a multivariate method used to classify a sample of subjects (or objects) in such a way that subjects in the same group (called a cluster) are more similar (e.g., in terms of their personal attributes, beliefs, preferences) to each other than to those in other groups (clusters). Cluster Sampling A type of sampling method where the population is divided into groups, called clusters. Cluster designs are often used to control costs. For example, researchers first randomly select clusters of potential respondents, and then respondents are selected at random from within the pre-identified clusters. Codes Values, typically numeric, that are assigned to different levels of variables to facilitate
analysis of the variable. For example, codes such as strongly disagree=1, disagree=2, agree=3, and strongly agree=4 are often assigned. Coding The process of assigning values, typically numeric values, to the different levels of a variable. The process of assigning values to behaviors observed in parent-child interactions and assigning numeric values to responses to open-ended survey questions are examples of coding. Coefficient of Determination A coefficient, ranging between 0 and 1, that indicates the goodness of fit of a regression model. Cognitive Interviewing A research method used to pretest interview questions or items on a questionnaire. Cognitive interviews collect information on how respondents answer questions, their interpretation of the questions asked and their reasons for responding in a particular way. Cohort A group of people sharing a common demographic experience who are observed through time. For example, all the people born in the same year constitute a birth cohort. All the people married in the same year constitute a marriage cohort. Comparability The quality of two or more objects that can be evaluated for their similarity and differences. Completion Rate In survey research, this is the number of people who answered a survey divided by the number of people in the sample. It is sometimes used interchangeably with response rate. Control The processes of making research conditions uniform or constant, so as to isolate the effect of the experimental condition. When it is not possible to control research conditions, statistical controls often will be implemented in the analysis. Control Group In an experiment, the control group does not receive the intervention or treatment under investigation. This group may also be referred to as the comparison group. Control Variable A variable that is not of interest to the researcher, but which interferes with the statistical analysis. 
In statistical analyses, control variables are held constant or their impact is removed to better analyze the relationship between the outcome variable and other variables of interest. |
D | Data Information collected through surveys, interviews, or observations. Statistics are produced from data, and data must be processed to be of practical use. Data Analysis The process by which data are organized to better understand patterns of behavior within the target population. Data analysis is an umbrella term that refers to many particular forms of analysis such as content analysis, cost-benefit analysis, network analysis, path analysis, regression analysis, etc. Data Collection The observation, measurement, and recording of information in a research study. Data Imputation A method used to fill in missing values (due to nonresponse) in surveys. The method is based on careful analysis of patterns of missing data. Types of data imputation include mean imputation, multiple imputation, hot deck and cold deck imputation. Data imputation is done to allow for statistical analysis of surveys that were only partially completed. Data Reduction Data reduction is the process of transforming numerical or alphabetical digital information into a corrected, ordered, and simplified form. The basic concept is the reduction of large amounts of data down to the meaningful parts. Deduction The process of reasoning from the more general to the more specific. Dispersion In statistics, dispersion refers to the spread of a variable’s values. Techniques that are used to describe dispersion include range, variance, standard deviation, and skew. |
E | Error The difference between the actual observed data value and the predicted or estimated data value. Predicted or estimated data values are calculated in statistical analyses, such as regression analysis. Experimental Group In experimental research, the group of subjects who receive the experimental treatment or intervention under investigation. Explanatory Analysis A method of inquiry that focuses on the formulating and testing of hypotheses. For example, instead of, or in addition to, describing Black and White differences in the reading and math skills of preschool children, the analysis focuses on testing whether factors that may contribute to these differences (e.g., resources available to children at home and in their child care programs) are in fact associated with those differences. |
F | Face Validity The extent to which a survey or a test appears to actually measure what the researcher claims it measures. For example, a researcher may create survey questions that s/he claims measure gender role attitudes. To have face validity, other researchers who read the survey questions must also agree that the questions do appear to measure gender role attitudes. Field Notes A text document that details behaviors, conversations, or setting characteristics as recorded by a qualitative researcher. Field notes are the principal form of data gathered from direct observation and participant observation. Field Research Research conducted where research subjects live or where the activities of interest take place. Field Work Observing human behavior or interviewing individuals within their own communities. Field work is generally used to collect qualitative data. It often involves long-term relocation of researchers to the community under study. Data collection generally takes place over an extended period of time. |
G | Grounded Theory Grounded theory (GT) is an inductive research methodology used in the social sciences. It involves the construction of theory from the data collected in research and analyses of those data. Thus, it is quite different from the traditional deductive approach, where the researcher collects and analyzes data to test an existing theory and a set of research hypotheses derived from that theory. Grounded theory is used widely in qualitative research. |
H | Heterogeneity The degree of dissimilarity among cases with respect to a particular characteristic. Histogram A visual presentation of data that shows the frequencies with which each value of a variable occurs. Each value of a variable typically is displayed along the bottom of a histogram, and a bar is drawn for each value. The height of the bar corresponds to the frequency with which that value occurs. |
I | Independence The lack of a relationship between two or more variables. For example, annual snowfall and the Yankees’ season record are independent, but annual snowfall and coat sales are not independent. Independent Variable The variable that the researcher expects to be associated with an outcome of interest. For example, if a researcher wants to examine the relationship between parental education and children’s language development, parent education (years of schooling or highest level of education completed) is the independent variable. Sometimes this variable is referred to as the treatment variable or the causal variable. Index A type of composite measure (i.e., a measure that is created from more than one data item, such as answers to a series of survey questions) that summarizes responses to several specific observations (e.g., items on a parent questionnaire that ask whether their child participates in a variety of extracurricular activities) and represents a more general dimension (e.g., extracurricular activities might include music and dance lessons, different sports, and youth clubs). An index is often created by simply summing the responses to a series of yes/no questions. Index Variable A variable that is a summed composite of other variables that are assumed to reflect the same underlying construct. An example is a count of the number of caregiving activities (e.g., bathing and feeding) a father engages in with his infant child. Indicator An observation or measure that is assumed to be evidence of the attributes or properties of some phenomenon. Indicators are monitored over time and are used to assess progress toward the achievement of intended outcomes, goals, and objectives. 
Child well-being indicators include children’s letter knowledge, frequency of pro- and anti-social behaviors, being read to on a regular basis by family members and attending high quality Interval Variable A variable wherein the distance between units is the same but the zero point is arbitrary. |
J | Jackknife Technique A (usually) computer-intensive resampling method used to estimate population parameters (for example, means and percentages), and/or to gauge uncertainty in these estimates (e.g., standard error). The name is derived from the approach that involves removing each observation (i.e., cut with a knife) one at a time (or two at a time for the second-order Jackknife, and so on), calculating the mean for each new sample (original sample minus the omitted case) and then averaging the means of the new samples. |
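The leave-one-out procedure described in the Jackknife Technique entry can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `jackknife_mean` is hypothetical, the ages are borrowed from the glossary's Mean entry, and the standard-error formula is the usual first-order jackknife formula, which the glossary itself does not spell out.

```python
from math import sqrt
from statistics import mean


def jackknife_mean(values):
    """First-order (leave-one-out) jackknife estimate of the mean and its standard error."""
    n = len(values)
    total = sum(values)
    # Mean of each new sample: the original sample minus one omitted case.
    leave_one_out_means = [(total - v) / (n - 1) for v in values]
    # Average the means of the new samples.
    estimate = mean(leave_one_out_means)
    # Usual jackknife standard error (an assumption; not stated in the glossary).
    squared_diffs = sum((m - estimate) ** 2 for m in leave_one_out_means)
    standard_error = sqrt((n - 1) / n * squared_diffs)
    return estimate, standard_error


ages = [21, 35, 40, 46, 76]  # ages from the glossary's Mean entry
estimate, standard_error = jackknife_mean(ages)
print(estimate, standard_error)
```

For a simple mean, the jackknife estimate coincides with the ordinary sample mean, and the jackknife standard error matches the sample standard deviation divided by the square root of the sample size. The technique becomes more useful for statistics that lack such a simple formula.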
K | Kurtosis A statistic that measures how outlier-prone a distribution is. Under the common excess-kurtosis convention, the kurtosis of a normal distribution is 0. If the kurtosis is different from 0, then the distribution produces outliers that are either more extreme (positive kurtosis) or less extreme (negative kurtosis) than are produced by the normal distribution. |
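A small Python sketch may help make the sign of the kurtosis concrete. It assumes the "excess kurtosis" convention (fourth standardized moment minus 3), which is the convention under which a normal distribution scores 0. The function name and both data sets are hypothetical, and the simple population formula is used here, whereas statistical packages often apply small-sample corrections.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: fourth standardized moment minus 3."""
    n = len(values)
    avg = sum(values) / n
    second_moment = sum((v - avg) ** 2 for v in values) / n
    fourth_moment = sum((v - avg) ** 4 for v in values) / n
    return fourth_moment / second_moment ** 2 - 3.0


# Hypothetical data: evenly spread values, with no extreme observations.
flat = [1, 2, 3, 4, 5]
# Hypothetical data: mostly identical values plus two extreme observations.
outlier_prone = [3, 3, 3, 3, 3, 3, 3, 3, -7, 13]

print(excess_kurtosis(flat))           # negative: less outlier-prone than a normal distribution
print(excess_kurtosis(outlier_prone))  # positive: more outlier-prone than a normal distribution
```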
L | Latent Variables In statistics, latent variables are variables not directly observed and measured but inferred from other observed and measured variables. Mathematical models (e.g., factor analysis, structural equation modeling, item response theory models) are used to examine the relationships between a set of observed variables (indicators) in order to identify the latent variable. For example, the latent variable ‘teacher attitudes toward math’ may be modeled from a series of survey items asking about their feelings toward math and how they feel when doing math. Least Squares A commonly used method for calculating a regression equation. This method minimizes the sum of the squared differences between the observed data points and the data points that are estimated by the regression equation. Likert Scale A Likert Scale is a type of rating scale used to measure attitudes, values, or opinions about a subject. Survey respondents are asked to indicate their level of agreement or disagreement with a series of statements. Limited Dependent Variable A limited dependent variable is a variable with the range of possible values “restricted in some important way.” Examples include binary variables that have only two values (e.g., child attends child care or not; child is promoted to next grade or not). Other examples are variables that can only take on certain values (e.g., discrete variables that have a limited set of categories or continuous variables that can only have positive values such as hours worked or wages earned). Linear Regression A statistical technique used to find a linear relationship between one or more (multiple) continuous or categorical predictor (or independent) variables and a continuous outcome (or dependent) variable. Literature Review A comprehensive survey of the research literature on a topic. Generally the literature review is presented at the beginning of a research paper and explains how the researcher arrived at his or her research questions. |
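The Least Squares and Linear Regression entries can be illustrated with a one-predictor example. The sketch below uses the standard closed-form solution for a straight line and also computes the coefficient of determination described under the letter C. The function names and the four data points are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for one predictor (closed-form solution)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: share of outcome variation explained by the line."""
    y_bar = sum(ys) / len(ys)
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    total_ss = sum((y - y_bar) ** 2 for y in ys)
    return 1 - residual_ss / total_ss


# Hypothetical data: predictor values and observed outcomes.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

slope, intercept = fit_line(xs, ys)
fit = r_squared(xs, ys, slope, intercept)
print(slope, intercept, fit)
```

With these points the fitted line has a slope of about 1.9 and an intercept of about 0, and the coefficient of determination is about 0.96, meaning the line accounts for most of the variation in the outcome.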
M | Main Effect The effect of a predictor (or independent) variable on an outcome (or dependent) variable. Mean A descriptive statistic used as a measure of central tendency. To calculate the mean, all the values of a variable are added and then the sum is divided by the number of values. For example, if the ages of the respondents in a sample were 21, 35, 40, 46, and 76, the mean age of the sample would be (21+35+40+46+76)/5 = 43.6. Measurement Error The difference between the value measured in a survey or on a test and the “true” value, if the difference is due to factors beyond the control of the respondent. Some factors that contribute to measurement error include the environment in which a survey or test is administered (e.g., administering a math test in a noisy classroom could lead students to do poorly even though they understand the material), poor measurement tools (e.g., using a ruler that is only marked in feet to measure height would lead to inaccurate measurement), and rater effects (e.g., if a police officer in uniform conducted interviews with individuals about drug use, they might not feel comfortable revealing their drug use). There are many more such factors that can contribute to measurement error. Measures of Association Statistics that measure the strength and nature of the relationship between variables. For example, correlation is a measure of association. Median A descriptive statistic used to measure central tendency. The median is the value that is the middle value of a set of values. 50% of the values lie above the median, and 50% lie below the median. For example, if a sample of individuals are ages 21, 34, 46, 55, and 76, the median age is 46. Member Checking During open-ended interviews, the practice of a researcher restating, summarizing, or paraphrasing the information received from a respondent to ensure that what was heard or written down is in fact correct. 
Meta-Analysis A statistical technique that combines and analyzes data across multiple studies on a topic. In early childhood and education research, a meta-analysis combines a number of studies (usually conducted by a number of different researchers in a variety of contexts) to quantify the effect a given independent or treatment variable (e.g., full-day versus part-day kindergarten and class size) has on a given outcome (e.g., children’s academic skills and prevalence of positive and negative classroom behavior). Methodology The principles, procedures, and strategies of research used in a study for gathering information, analyzing data, and drawing conclusions. There are broad categories of methodology such as qualitative methods or quantitative methods; and there are particular types of methodologies such as survey research, case study, and participant observation, among many others. Mode A descriptive statistic that is a measure of central tendency. It is the value that occurs most frequently in the data. For example, if survey respondents are ages 21, 33, 33, 45, and 76, the modal age is 33. |
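The worked examples in the Mean, Median, and Mode entries can be reproduced with Python's standard `statistics` module. The variable names below are illustrative; the ages are the ones given in the three entries.

```python
from statistics import mean, median, mode

mean_ages = [21, 35, 40, 46, 76]    # ages from the Mean entry
median_ages = [21, 34, 46, 55, 76]  # ages from the Median entry
mode_ages = [21, 33, 33, 45, 76]    # ages from the Mode entry

mean_age = mean(mean_ages)        # sum of the values divided by the number of values
median_age = median(median_ages)  # middle value of the sorted data
modal_age = mode(mode_ages)       # most frequently occurring value

print(mean_age, median_age, modal_age)  # 43.6 46 33
```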
N | Nominal Scale A scale that allows for the classification of elements into mutually exclusive categories based on defined features but without numeric value. Nonlinear Models A nonlinear model describes nonlinear relationships between the dependent and independent variable(s). A linear model assumes that the dependent variable changes by a fixed amount for each unit change in the independent variable. A nonlinear model, on the other hand, does not make this assumption. Instead of the relationship between the dependent and independent variable being represented by a straight line, a nonlinear relationship is characterized by one or more curves. Nonlinear Trends When analyzing time series data, a linear trend is one where the data increase by a constant amount at each successive time period. A linear trend is represented by a straight line. However, data do not always increase by the same amount. For example, data that increase by varying amounts at each successive time period show a nonlinear, curvilinear trend. Nonparametric Statistics Nonparametric statistics refer to the group of statistical methods that require fewer assumptions about the distribution of the data. For example, nonparametric tests of significance such as the Chi-square test do not require the data to fit a normal distribution. Nonparametric statistical methods are used when analyzing nominal, ordinal or ranked data. Null Hypothesis This hypothesis states that there is no difference between groups. The alternative hypothesis states that there is some real difference between two or more groups. |
O | Observation Unit The actual unit observed during a study in order to measure something about it. In child care and early education research typical observation units include programs and schools, classrooms and teachers, children and their parents. Ordinal Data Data that are categorical, but that can also be ranked (ordered). However, the distance between the categories is not known and may not be equal. For example, parents might rate their satisfaction with their child’s child care provider as “very dissatisfied,” “dissatisfied,” “satisfied,” and “very satisfied,” using numerical values of 1, 2, 3 and 4, respectively. A parent with a satisfaction score of 1 is more dissatisfied than a parent with a score of 2, but not necessarily twice as dissatisfied. And the difference between scores of 1 and 2 and the difference between scores of 3 and 4 are not necessarily the same. Ordinal Scale A scale that allows for classification and labeling into mutually exclusive categories based on features that are ranked or ordered with respect to one another, although equal differences between numbers do not reflect an equal magnitude of difference. Ordinary Least Squares Estimation A commonly used method for calculating a regression equation. This method minimizes the sum of the squared differences between the observed data points and the data points that are estimated by the regression equation. Outcomes Outcomes are the measured behaviors, attitudes, or other characteristics of a sample or population that research seeks to explain. There may be one or more than one outcome of interest in a single research study. Outcomes may be measured at different levels (e.g., communities, schools/early childhood programs, classrooms, families and children). Outlier An observation in a data set that is much different from the other observations in the data set. The data point is unusually large or unusually small compared to the other data points. |
P | Panel Study A type of longitudinal study in which data are collected from the same group of individuals (a panel) at two or more points in time. Although the samples selected for panel studies often include individuals (e.g., children, young adults), panel studies may also sample from other populations such as households, schools, and classrooms and collect data on these over a period of time. Parameter In statistics, a parameter is a characteristic of a population. It is a numerical quantity that tells us something about a population and is distinct from a statistic, which is a characteristic of a sample. Percentage A proportion times 100. Population In statistics, the population includes all members of a clearly defined group. The population can be composed of a group of individuals (e.g., all children ages zero to 5) or of organizations (e.g., all programs providing early childhood education to 3- and 4-year-old children). Samples are drawn from the population and the statistical results that are derived from random samples can be used to estimate characteristics of the whole population. Probit Models A probit model is a type of regression where the dependent variable can only have two values. For example, a child from a low income family is either enrolled in a Head Start program or not. Program Evaluation Research that is conducted in order to determine the effectiveness of an intervention program. Projection Estimates of the future size and other demographic characteristics of a population, based on an assessment of past trends and assumptions about the future course of demographic behavior. Propensity Score Matching Propensity score matching is a statistical matching technique that is used to estimate the effect of a treatment or intervention when data come from a nonrandomized (observational) design. It uses a set of observable characteristics to predict the probability that participants will be assigned the treatment. 
Its purpose is to eliminate or reduce systematic differences between those who received the treatment and those who did not, thus mimicking a randomized controlled trial design. Proxy Variable A variable used to “stand in” for another variable. Proxy variables are used when the variable of interest is not available in the data, either because it was not collected or because it was too difficult to measure in a survey or interview. |
Q | Qualitative Research A field of social research that is carried out in naturalistic settings and generates data largely through observations and interviews. Compared to quantitative research, which is principally concerned with making inferences from randomly selected samples to a larger population, qualitative research is primarily focused on describing small samples in non-statistical ways. Quasi-Experimental Research Research in which individuals cannot be assigned randomly to two groups, but some environmental factor influences who belongs to each group. For example, if researchers want to look at the effects of smoking on health, they cannot ethically assign individuals to a group that smokes and a group that does not smoke. Researchers might rely on some environmental factor, for example an ad campaign that discourages smoking, to examine changes in health following the campaign. Questionnaire A survey document with questions that are used to gather information from individuals to be used in research. Quota Sampling A non-probability sampling method in which a given number of subjects are selected from a specific group or groups. For example, a researcher might design a sample of 200 parents of newborns that sets quotas of 100 mothers and 100 fathers. Quota sampling is widely used in opinion polling and market research. |
R | Random Sampling A sampling technique in which individuals are selected from a population at random. Each individual has a chance of being chosen, and each individual is selected entirely by chance. Random Selection Random selection refers to the process of selecting individuals (schools, programs, classrooms) from the population to participate in a study. In random selection, each individual is chosen by chance and has a fixed and known probability of selection into the study sample. Range A measure of how widely the data (values) for a specific variable are dispersed or spread. The larger the range the more dispersed the data. The range is calculated by subtracting the value of the lowest data point from the value of the highest data point. For example, in a sample of children between the ages of 2 and 6 years the range would be 4 years. When reporting the range, researchers typically report the lowest and highest value (Range = 2-6 years of age). Ratio A ratio is a relationship between the numbers in two groups of objects. It tells us how many times the number in the first group contains the number in the second group. For example, if we have 10 elementary schools and 3 middle schools in a community, the ratio of elementary to middle schools is 10 to 3 (10:3), or roughly 3 elementary schools to every 1 middle school. Ratio Scale A scale in which the differences between the values on the scale are equivalent and the scale has a fixed zero point; values on the scale can be meaningfully measured against each other. Research Question A clear statement in the form of a question of the specific issue that a researcher wishes to answer using data from one or more sources. Examples include: Do children who attend center-based early care and education programs have stronger academic and social skills than children who are cared for in a home-based child care setting? 
Does the Black-White achievement gap narrow or widen as children move through the elementary school grades? Respondent The person who responds to a survey questionnaire and provides information for analysis. |
S | Sample A group that is selected from a larger group (the population). By studying the sample the researcher tries to draw valid conclusions about the population. Sample Size The number of subjects in a study. Larger samples are preferable to smaller samples, all else being equal. Sampling The process of selecting a subgroup of a population (i.e. sample) that will be used to represent the entire population. Sampling Bias Distortions that occur when some members of a population are systematically excluded from the sample selection process. For example, if interviews are conducted over the phone, only individuals with telephones will be in the sample. This could produce bias if the researcher intends to draw conclusions about the entire population, including those with a phone and those without a phone. Sampling Design (Sample Design) The part of the research plan that specifies the method of selection and the number of individuals or organizations (schools, programs) who will be selected and asked to participate in the study. The sampling design (sample design) specifies the target population, the frame or list from which cases from that population will be selected, the approach that will be used to select the sample members (simple random sampling, stratified sampling, cluster sampling, or combinations of these), the number of sample units to be selected to achieve the study objectives. Sampling Distribution The frequency with which data values appear in the sample. The sampling distribution can be characterized by the mean and the variance of the sample. Sampling Error This is the error that occurs because all members of the population are not sampled and measured. The value of a statistic (e.g., mean or percentage) that is calculated from different samples that are drawn from the same population will not always be the same. For example, if several different samples of 5 people are drawn at random from the U.S. 
population, the average income of the 5 people in those samples will differ. (In one sample, Bill Gates may have been selected at random from the population, which would lead to a very high mean income for that sample.) It is not incorrect to have sampling error, and in fact statistical techniques take into account that sampling error will occur. Sampling Frame A list of the entire population eligible to be included within the specific parameters of a research study. Scale A group of survey questions that measures the same concept. For example, a researcher may be interested in individuals’ gender role attitudes, and use several questions to determine their attitudes. This group of questions makes up a gender role attitude scale. Scaled Score A mathematical transformation of a raw score so that scores can be compared across individuals and over time. The purpose of scaled scores is to report scores for all study participants on a consistent scale. Scatter Plot A display of the relationship between two quantitative or numeric variables. A scatter plot shows the value of one variable plotted against the value of another variable. Subjectivity A reflection of the person’s mind, or thoughts, which is the result of his/her experiences, moods or attitudes. Subjects Those who participate in research and from whom data are collected. |
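The Sampling Error entry can be illustrated with a short simulation. The sketch below builds an entirely hypothetical income population (including one extreme earner, in the spirit of the Bill Gates example), draws several random samples of 5 people, and prints each sample's mean next to the population mean. The seed, population size, and income values are arbitrary assumptions made for illustration.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 999 ordinary incomes plus one extremely high income.
population = [30_000 + 50 * i for i in range(999)] + [1_000_000_000]
population_mean = mean(population)

# Draw 10 random samples of 5 people and compute the mean income of each sample.
sample_means = [mean(rng.sample(population, 5)) for _ in range(10)]

for sample_mean in sample_means:
    print(f"sample mean: {sample_mean:,.0f}   population mean: {population_mean:,.0f}")
```

The sample means differ from one another and from the population mean even though every sample was drawn at random from the same population. That variation is the sampling error the entry describes.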
T | Test-Retest Reliability The degree to which a measure produces consistent results over several administrations. Theoretical Sampling The selection of individuals within a naturalistic research study based on emerging findings as the study progresses to ensure that key issues are adequately represented. Theory A general statement that describes a hypothesized relationship between different phenomena or characteristics. Theories should be specific enough to be testable with a well-designed research study. |
U | Unbalanced Scale A scale where the number of favorable and unfavorable categories is not the same. Unbiased A statistic that is free of systematic bias. Systematic bias occurs when the recorded data from a sample are systematically higher or lower than the true data values within the population. Systematic bias can occur as a result of sampling bias or measurement bias. Sampling bias is an error in sampling when some subgroup of the target population is unintentionally left out of the sampling process. Measurement bias is an error in data collection when some occurrence distorts the responses in the same way (e.g., a test is administered in a noisy classroom). Bias is a serious error in data collection and should be handled through a researcher’s careful attention to sources of bias. |
V | Validity The degree to which data and results are accurate reflections of reality. Validity refers to the concepts that are investigated; the people or objects that are studied; the methods by which data are collected; and the findings that are produced. Variable A measurable attribute or characteristic of a person, group or object that varies within the sample under investigation (e.g., age, weight, IQ, child care type). In research, variables are typically classified as dependent, independent, intervening, moderating, or as control variables (see definitions elsewhere in glossary). Variance A commonly used measure of dispersion for variables. The variance is calculated by squaring the standard deviation. The variance is based on the square of the difference between the values for each observation and the mean value. |
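The relationship between the variance, the standard deviation, and the range can be checked with a few lines of Python. This sketch reuses the ages from the Mean entry and assumes the sample (n - 1) form of the variance, since the glossary does not say which denominator it intends. The variable names are illustrative.

```python
from statistics import mean, stdev, variance

ages = [21, 35, 40, 46, 76]  # ages from the glossary's Mean entry

# Variance from its definition: squared differences between each value and the mean.
average = mean(ages)
squared_differences = [(age - average) ** 2 for age in ages]
sample_variance = sum(squared_differences) / (len(ages) - 1)  # n - 1 form is an assumption

# The same quantity via the standard library, and via squaring the standard deviation.
library_variance = variance(ages)
squared_stdev = stdev(ages) ** 2

# Range: highest value minus lowest value.
value_range = max(ages) - min(ages)

print(sample_variance, library_variance, squared_stdev, value_range)
```

All three variance figures agree (about 413.3 for these ages), and the range is 55 years.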
