MACHINE LEARNING The ability of a program to learn from experience, i.e., to modify its execution on the basis of newly acquired information. In epidemiology and bioinformatics, examples include artificial neural networks, support vector machines, Bayesian networks, and other methods that update their procedures as new data are provided.176
MANN-WHITNEY TEST A test that compares two groups of ordinal scores, showing the probability that they form parts of the same distribution. It is a nonparametric equivalent of the t-test.
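The U statistic on which the test is based can be obtained by counting, over all pairs formed by taking one score from each group, how often the score from the first group exceeds the score from the second, with ties counted as one half. The sketch below is a minimal illustration under stated assumptions: the scores are hypothetical, the function name mann_whitney_u is chosen here for illustration, and the calculation of a P value is omitted.

```python
def mann_whitney_u(group_a, group_b):
    """Return U for group_a: the number of (a, b) pairs with a > b, ties counting 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical ordinal scores for two groups (illustrative values only).
scores_a = [3, 4, 2, 5]
scores_b = [1, 2, 3]

u_a = mann_whitney_u(scores_a, scores_b)
u_b = mann_whitney_u(scores_b, scores_a)

# The two U values always sum to the number of pairs.
print(u_a, u_b, len(scores_a) * len(scores_b))
```

Dividing U by the number of pairs gives the estimated probability that a randomly chosen score from the first group exceeds one from the second, ties counting one half.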
MANTEL-HAENSZEL ESTIMATE, MANTEL-HAENSZEL ODDS RATIO Mantel and Haenszel provided an adjusted (“summary”) odds ratio estimate that may be derived from grouped and matched sets of data.272 It is now known as the Mantel-Haenszel estimate, one of the few eponymous terms of modern epidemiology.
The statistic may be regarded as a type of weighted average of the individual odds ratios, derived from dividing a sample into a series of strata. Ideally, the strata would be internally homogeneous with respect to confounding factors. The Mantel-Haenszel method can also be extended to the summarization of rate ratios and rate differences from follow-up studies.
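As a minimal sketch of the weighted-average idea, the summary odds ratio can be computed from a list of stratum-specific 2 × 2 tables. The cell labels (a, b, c, d), the function name, and the counts are assumptions made here for illustration; they are not taken from the entry.

```python
def mantel_haenszel_or(strata):
    """Summary odds ratio across strata.

    Each stratum is a tuple (a, b, c, d) where
    a = exposed cases, b = exposed noncases,
    c = unexposed cases, d = unexposed noncases.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n      # sum of a*d/n over strata
        denominator += b * c / n    # sum of b*c/n over strata
    return numerator / denominator


# Two hypothetical strata, each with a stratum-specific odds ratio of 4.
strata = [(10, 20, 5, 40), (30, 10, 15, 20)]
summary_or = mantel_haenszel_or(strata)
print(summary_or)
```

Because both hypothetical strata share the same odds ratio, the summary estimate equals that common value; when stratum-specific odds ratios differ, each contributes in proportion to its weight b × c / n.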
MANTEL-HAENSZEL TEST (SYN: COCHRAN-MANTEL-HAENSZEL TEST) A summary chi-square test developed by Mantel and Haenszel for stratified data and used when controlling for confounding. It is a slight modification of an earlier test by William Gemmell Cochran.
MANTEL’S TREND TEST A regression test of the odds ratio against a numerical variable representing ordered categories of exposure. It generalizes the Mantel-Haenszel test and can be used to analyze results of any study, including a case-control study.
MARGINAL STRUCTURAL MODELS Statistical models that use inverse probability weighting for the estimation of causal effects in longitudinal studies in which there are time-varying confounders affected by prior exposure.273 Marginal structural models aim, for instance, to control for the effects of time-dependent confounders affected by prior treatment. These models cannot be used to estimate the effects of dynamic treatment regimes. They can, however, be used to estimate the effect of a nondynamic treatment regime when the data are derived from a cohort study in which the treatment regime is dynamic. They can be an alternative to G-estimation of structural nested models.
MARGINALS The row and column totals of a contingency table.
MARGIN OF SAFETY An estimate of the ratio of the no-observed-effect level (NOEL) to the level accepted in regulations. See also no-observed-adverse-effect level.
MARKETING See social marketing.
MARKOV PROCESS, MARKOV CHAIN A stochastic process such that the conditional
probability distribution for the state at any future instant, given the present state, is unaffected by any additional knowledge of the past history of the system. Invented by Andrei A. Markov (1856–1922). A family of regression models for correlated data used to study event histories that include transitions between several states; e.g., Markov chains are a common way of modeling the progression of a chronic disease through various severity states; for these models, a transition matrix with the probabilities of moving from one state to another for a specific time interval is usually estimated from cohort data. Several types of Markov models (e.g., “hidden Markov models”) are applied in health services research, health economics, clinical epidemiology, infectious disease epidemiology, genetic epidemiology, and systems biology. See also Monte-Carlo study.
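The role of the transition matrix can be sketched as follows. The three severity states and all transition probabilities are hypothetical values chosen for illustration; in practice they would be estimated from cohort data.

```python
def next_distribution(distribution, transition):
    """Advance a distribution over states by one time interval."""
    n_states = len(transition)
    return [
        sum(distribution[i] * transition[i][j] for i in range(n_states))
        for j in range(n_states)
    ]


# Hypothetical states: 0 = mild, 1 = severe, 2 = dead (absorbing).
# Row i holds the probabilities of moving from state i to each state.
transition = [
    [0.9, 0.1, 0.0],
    [0.0, 0.8, 0.2],
    [0.0, 0.0, 1.0],
]

distribution = [1.0, 0.0, 0.0]  # everyone starts in the mild state
for _ in range(2):              # two time intervals
    distribution = next_distribution(distribution, transition)

print(distribution)
```

Only the current distribution is needed to compute the next one, which is the Markov property described above.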
MASKED STUDY See blind(ed) study.
MASKING (Syn: blinding) Procedures intended to keep participants in a study from knowing some facts or observations that might bias or influence their actions or decisions regarding the study.
MASS ACTION PRINCIPLE A fundamental principle of epidemic theory:274,275 the incidence of an infectious disease one serial interval in the future is dependent on the product of the current prevalence and the number of susceptibles in the population:
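In symbols, the principle is commonly written as shown below. The notation is an assumption made here for illustration: C_t is the number of cases (current prevalence) in serial interval t, S_t is the number of susceptibles in that interval, and r is a transmission parameter.

```latex
C_{t+1} = r \, C_t \, S_t
```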
MATCHED CONTROLS See controls, matched.
MATCHING The process of making a study group and a comparison group similar or
identical with respect to their distribution of extraneous factors.12,31,97 Several kinds of matching can be distinguished:
MATERNAL MORTALITY Several definitions related to maternal mortality have been
agreed upon by internationally representative groups under the auspices of the WHO. A maternal death is death of a woman while pregnant or within 42 days of termination of pregnancy, irrespective of the duration and the site of pregnancy, from any cause related to or aggravated by the pregnancy or its management but not from accidental or incidental causes.
A late maternal death is the death of a woman from direct or indirect obstetric causes more than 42 days but less than 1 year after termination of pregnancy.
A pregnancy-related death is death of a woman while pregnant or within 42 days of termination of pregnancy, irrespective of the cause of death.
Direct obstetric deaths are those resulting from obstetric complications of the pregnant state (pregnancy, labor, and the puerperium), from interventions, omissions, incorrect treatment, or a chain of events resulting from any of the above.
Indirect obstetric deaths are those resulting from previous existing disease or disease that developed during pregnancy and not due to direct obstetric causes but aggravated by the physiological effects of pregnancy.
In order to improve the quality of maternal mortality data and provide alternative methods of collecting data on deaths during pregnancy or related to it, as well as to encourage the recording of deaths from obstetric causes occurring more than 42 days following termination of pregnancy, the 43rd World Health Assembly in 1990 adopted the recommendation that countries consider the inclusion on death certificates of questions regarding current pregnancy and pregnancy within 1 year preceding death.
MATERNAL MORTALITY (RATE) The risk of dying from causes associated with childbirth. The numerator is the deaths arising during pregnancy or from puerperal causes (i.e., deaths occurring during and/or due to deliveries, complications of pregnancy, childbirth, and the puerperium). Women exposed to the risk of dying from puerperal causes are those who have been pregnant during the period. Their number being unknown, the number of live births is used as the conventional denominator for computing comparable maternal mortality rates. The formula is:
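Written out from the numerator and denominator described above, the rate takes the following form. The multiplier k is an assumption made here; values such as 1,000 or 100,000 are conventional.

```latex
\text{Maternal mortality (rate)} =
  \frac{\text{deaths during pregnancy or from puerperal causes in a given period}}
       {\text{live births in the same period}} \times k
```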
There is variation in the duration of the postpartum period in which death may occur and be certified as due to “puerperal causes,” i.e., maternal mortality. Although the WHO defines maternal mortality as death during pregnancy or within 42 days of delivery, in some areas a period as long as a year is used. Maternal deaths may be subdivided into two groups: direct obstetric deaths and indirect obstetric deaths.
MATHEMATICAL MODEL A representation of a system, process, or relationship in mathematical form in which equations are used to simulate the behavior of the system or process under study. The model usually consists of two parts: the mathematical structure itself (e.g., Newton’s inverse square law or Gauss’s “normal” law), and the particular constants or parameters associated with them (such as Newton’s gravitational constant or the Gaussian standard deviation). A mathematical model is fully deterministic if the dependent variables involved take on values not allowing for any play of chance.
A model is said to be stochastic, or random, if random variation is allowed to enter the picture. See also model.
MATRIX
MAXIMUM ALLOWABLE CONCENTRATION (MAC) See safety standards.
MAXIMUM LIKELIHOOD ESTIMATE The value for an unknown parameter in a model that maximizes the probability of obtaining exactly the data that were observed. Most often used to find estimates of coefficients in logistic models.
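A minimal sketch of the idea, using a single proportion rather than a logistic model: for a hypothetical sample with 3 events in 10 independent trials, the candidate value of the proportion that maximizes the (log) likelihood is found by a simple grid search. The function names and the data are assumptions for illustration only.

```python
import math


def log_likelihood(p, events, trials):
    """Binomial log-likelihood of proportion p (constant term omitted)."""
    return events * math.log(p) + (trials - events) * math.log(1.0 - p)


def grid_mle(events, trials, steps=1000):
    """Return the candidate proportion on a grid that maximizes the likelihood."""
    candidates = [i / steps for i in range(1, steps)]  # excludes 0 and 1
    return max(candidates, key=lambda p: log_likelihood(p, events, trials))


estimate = grid_mle(events=3, trials=10)
print(estimate)
```

For this simple model the grid search lands on the observed proportion, events / trials, which is the known closed-form maximum likelihood estimate; logistic models require iterative numerical methods instead.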
M-BIAS Bias in collider (C)-specific or C-adjusted exposure (E) – disease (D) associations
arising from an “M pattern” within the underlying causal structure (in which all or part of the C–E association arises from shared causes A of C and E, and all or part of the C–D association arises from shared causes B of C and D).82 It is called “M” because of the M shape of the corresponding causal diagram, the “M diagram”, in which events are temporally ordered from top (earliest) to bottom (latest), C is a collider on the “back-door” path from E to D passing through A, C and B (a back-door path from E to D is a path that begins with an arrow pointing to E; such paths are sources of confounding). Like other collider-stratification bias, M-bias arises from adjustment for a variable C that numerically behaves like a classical confounder (in that the effect estimate changes upon adjustment for C). Unlike other collider-stratification bias, M-bias attributable to C-adjustment may not be apparent from the time order of the events, for C may be determined before E or D; hence, one may be led to adjust for C (and thus introduce bias) if one uses traditional confounder-selection criteria, even if one takes care to not adjust for variables affected by E or D.82 See also confounding bias.
MCNEMAR’S TEST A form of the chi-square test for matched-pairs data. It is a special case of the Mantel-Haenszel test.
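The test statistic depends only on the discordant pairs. In the sketch below, b and c denote the counts of the two kinds of discordant pairs; these labels, the function names, and the counts are assumptions for illustration, and no continuity correction is applied.

```python
import math


def mcnemar_statistic(b, c):
    """Chi-square statistic (1 degree of freedom, no continuity correction)."""
    return (b - c) ** 2 / (b + c)


def chi_square_1df_p_value(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical matched-pairs data: 15 pairs discordant one way, 5 the other.
statistic = mcnemar_statistic(15, 5)
p_value = chi_square_1df_p_value(statistic)
print(statistic, p_value)
```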
MDR (MULTIDRUG RESISTANT) See drug resistance, multiple.
MEAN, ARITHMETIC The sum of all the individual values in a set of measurements divided by the number of values in the set. A measure of central tendency. See also
MEAN-DIFFERENCE PLOT (Syn: Tukey mean-difference plot) A scatter plot that shows
changes in the percentiles of the distribution of a measure from samples obtained at two time points. The differences of each percentile from the earlier to the later time points are plotted on the vertical axis, and the means of the two values of each percentile are on the horizontal axis. Data are typically shown for the 2.5, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, and 97.5 percentile points. Departure of the plotted points from a horizontal line indicates change of shape of the distribution.
MEAN, GEOMETRIC A measure of central tendency. This is calculated by adding the logarithms of the individual values, calculating their arithmetic mean, and converting back by taking the antilogarithm. Can be calculated only for positive quantities.
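The calculation described above can be sketched as follows; the function name and the example values are illustrative assumptions.

```python
import math


def geometric_mean(values):
    """Antilogarithm of the arithmetic mean of the logarithms."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    mean_log = sum(math.log(v) for v in values) / len(values)
    return math.exp(mean_log)


result = geometric_mean([1, 10, 100])
print(result)  # approximately 10
```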
MEAN, HARMONIC A measure of central tendency computed by summing the reciprocals of all the individual values and dividing the resulting sum into the number of values.
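A minimal sketch of this calculation, with an illustrative function name and hypothetical values:

```python
def harmonic_mean(values):
    """Number of values divided by the sum of their reciprocals."""
    reciprocal_sum = sum(1.0 / v for v in values)
    return len(values) / reciprocal_sum


result = harmonic_mean([1, 2, 4])
print(result)  # 3 / 1.75, about 1.714
```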
MEASUREMENT The procedure of applying a standard scale to a variable or to a set of values.
1. Systematic error (bias) in a measurement.
2. Systematic error arising from inaccurate measurements (or classification) of subjects on study variable(s). See information bias.
MEASUREMENT, TERMINOLOGY OF There is sometimes ambiguity about the terms used to describe the properties of measurement: accuracy, precision, validity, reliability, repeatability, and reproducibility. Accuracy and precision are often used synonymously, validity is defined variously, and reliability, repeatability, and reproducibility are often used interchangeably. Etymologies are helpful in making a case for preferred usages, but they are not always decisive. Accuracy is from the Latin cura (care), and while this may be of interest to those in the health field, it does not illuminate the origins of the standard definition, that is, “conforming to a standard or a true value” (OED). Accuracy is distinguished from precision in this way: A measurement or statement can reflect or represent a true value without detail; e.g., a temperature reading of 37.5 °C may be accurate, but it may not be precise if a thermometer that registers 37.527 °C is taken as the reference. See also accuracy.
Precision (from Latin praecidere, cut short) is the quality of being sharply defined through exact detail. A faulty measurement may be expressed precisely but may not be accurate. Measurements should be both accurate and precise, but the two terms are not synonymous. See also precision.
Consistency or reliability describes the property of measurements or results that conform to themselves.
Reliability (Latin religare, to bind) is defined by the OED as a quality that is sound and dependable. Its epidemiological usage is similar; a result or measurement is said to be reliable when it is stable (i.e., when repetition of an experiment or measurement gives the same results). The terms repeatability and reproducibility are synonymous (the OED defines each in terms of the other), but they do not refer to a quality of measurement, only to the action of performing something more than once. Thus, a way of discovering whether or not a measurement is reliable is to repeat or reproduce it. The terms repeatability and reproducibility, formed from their respective verbs, are used inaccurately when they are substituted for reliability, a noun that refers to the measuring procedure rather than the attribute being measured. However, in common usage, both repeatability and reproducibility refer to the capacity of a measuring procedure to produce the same result on each occasion in a series of procedures conducted under identical conditions.
Validity is used correctly when it agrees with the standard definition given by the OED: “sound and sufficient.” If, in the epidemiological sense, a test measures what it purports to measure (it is sufficient) then the test is said to be valid. See also accuracy; precision; reliability; repeatability; validity; validity, study.
MEASUREMENT SCALE The range of possible values for a measurement (e.g., the set of possible responses to a question, the physically possible range for a set of body weights).
Measurement scales can be classified according to the quantitative character of the scale:
1. dichotomous scale (Syn: binary scale): One that arranges items into either of two mutually exclusive categories; e.g., yes/no, alive/dead.
2. nominal scale (Syn: polytomous scale, polytomy): Classification into unordered qualitative categories; e.g., race, religion, and country of birth. Measurements of individual attributes are purely nominal scales, as there is no inherent order to their categories.
3. ordinal scale: Classification into ordered qualitative categories, e.g., social class (I, II, III, etc.), where the values have a distinct order but their categories are qualitative in that there is no natural (numerical) distance between their possible values. See also ranking scale.
4. interval scale: An (equal) interval involves assignment of values with a natural distance between them, so that a particular distance (interval) between two values in one region of the scale meaningfully represents the same distance between two values in another region of the scale. Examples include Celsius and Fahrenheit temperature, date of birth.
5. ratio scale: A ratio is an interval scale with a true zero point, so that ratios between values are meaningfully defined. Examples are absolute (Kelvin) temperature, weight, height, blood count, and income, as in each case it is meaningful to speak of one value as being so many times greater or less than another value.
Dichotomous, nominal, and ordinal scales are sometimes called qualitative or “categorical,” but the latter term has other meanings, such as discrete (as opposed to continuous). An example of a categorical scale that is also a ratio scale is household size (1, 2, 3, …). Interval and ratio scales are sometimes called quantitative scales.
MEASURE OF ASSOCIATION A quantity that expresses the strength or degree of association between variables. Commonly used measures of association are ratios and differences between means, proportions, risks, or rates, and correlation and regression coefficients.
MEASURE OF EFFECT See effect measure.
MEASURES OF CENTRAL TENDENCY A general term for several values of the
distribution of a set of values or measurements located at or near the middle of the set. The principal measures of central tendency are the mean (average), median, and mode.
MECHANICAL TRANSMISSION Transmission of pathogens by a vector (e.g., a housefly) without biological development in or dependence on the vector. Many fecal-oral infections are spread by this means. See also vector-borne infection.
MECHANISM In epidemiology and other health, life, and social sciences, the way in which a particular health-related event or outcome occurs, often described in terms of the agents and steps involved. Whereas the focus is often on biological mechanisms, environmental, social, and cultural mechanisms are also relevant to epidemiology, public health, medicine, and related disciplines.
MECHANISTIC BIAS A form of interpretive bias that occurs if interpretation of scientific evidence is less rigorous when basic science furnishes credibility for the putative mechanisms underlying the findings than when it does not.34 See also biological plausibility; coherence.
MECHANISTIC EPIDEMIOLOGY Epidemiological research that focuses on mechanisms underlying and explaining associations between determinants and health-related events or states. It is not a formal branch or specialty of epidemiology, nor is it an epidemiological method or philosophy. Loosely, the opposite of “black-box epidemiology.” See also applied epidemiology.
MEDIAN A measure of central tendency. The simplest division of a set of measurements is into two parts – the lower and the upper half. The point on the scale that divides the group in this way is called the “median.”
MEDIATOR (MEDIATING) VARIABLE See intermediate variable.
MEDICAL AUDIT A health service evaluation procedure in which selected data from patients’ charts are summarized in tables displaying such data as average length of stay or duration of an episode of care, the frequency of diagnostic and therapeutic procedures, and outcomes of care arranged by diagnostic category. These are often compared with predetermined norms.
MEDICAL CARE See health care.
MEDICAL GEOGRAPHY (Syn: geographical pathology). A branch of science concerned
with the spatial variations in environmental conditions related to health and disease. It combines biology, ecology, medicine, epidemiology, and geography and applies techniques such as mapping to medical and health problems.276,277 Satellite imaging and remote sensing have facilitated mapping the distribution of epidemiologically important biota, such as phytoplankton and zooplankton, and have strengthened the integration of epidemiology, ecology, and geography in studies of medical geography and geographical medicine. Cartographic methods such as choroplethic and isodemographic maps provide a useful visual display of the geographical variations in the distribution of disease, medical and other health care facilities, etc. See also geographic information system; geomatics.
MEDICAL RECORD A file of information relating to a transaction(s) in personal health care. In addition to facts about a patient’s illness, medical records nearly always contain other information. The information in medical records includes the following:
MEDICAL STATISTICS The branch of biostatistics concerned with medical problems and research.
MEDICALIZATION The process by which problems traditionally considered nonmedical come to be defined and treated as medical issues. The process of identification of a personal or social condition as a medical issue subject to medical intervention. The expansion of the medical profession’s influence and authority into the domains of everyday existence.108,143,144,207,217 See also genetization; integration; reductionism.
MENDELIAN RANDOMIZATION
1. An approach or “strategy” of observational epidemiology that uses findings from association studies of well-characterized functional genetic variants to assess causal inferences about modifiable environmental exposures. One of the instrumental variable approaches for making causal inferences from observational data in the face of uncontrolled confounders.212,278 It is based on the fact that inheritance of one genetic trait is independent of (i.e., randomized with respect to) other unlinked traits. Functional variants will not be associated with other genetic variants apart from those with which they are in linkage disequilibrium; this assumption follows from the law of independent assortment (sometimes referred to as Mendel’s second law), hence the term Mendelian randomization. At a population level, traits influenced by genetic variants are generally not associated with the social, behavioral, and environmental factors that confound relationships in conventional epidemiological studies. Thus genetic variants can serve as an indicator of the action of environmentally modifiable exposures. Example: studies have pointed out that the autosomal dominant condition of lactase persistence is positively associated with drinking milk; thus protective associations of lactase persistence with osteoporosis, bone mineral density, or fracture risk provide evidence that milk drinking protects against these conditions. Mendelian randomization may help to avoid confounding, bias due to reverse causation or reporting tendency, and underestimation of associations due to variability in behaviors and phenotypes.
Factors limiting the inferential power of Mendelian randomization include confounding of associations between genotype, intermediate phenotype, and disease through linkage disequilibrium or population stratification; pleiotropy and the multifunctionality of genes; canalization and developmental stability; and lack of suitable polymorphisms for studying modifiable exposures of interest.212,278
2. Originally, random assortment of genetic variants at conception, used to provide an unconfounded study design for estimating treatment effects for childhood malignancies.
MENDEL’S LAWS Derived from the pioneering genetic studies of Gregor Mendel (1822–1884). Mendel’s first law states that genes are particulate units that segregate; i.e., members of the same pair of genes are never present in the same gamete, but always separate and pass to different gametes. Mendel’s second law states that genes assort independently; i.e., members of different pairs of genes move to gametes independently of one another.
META-ANALYSIS A statistical analysis of results from separate studies, examining sources of differences in results among studies, and leading to a quantitative summary of the results if the results are judged sufficiently similar to support such synthesis. In the biomedical sciences, the systematic, organized, and structured evaluation of a problem of interest, using information (commonly in the form of statistical tables or other data) from a number of independent studies of the problem. A frequent application is the pooling of results from a set of randomized controlled trials, which in aggregate have more statistical power to detect differences at conventional levels of statistical significance. Meta-analysis has a qualitative component (i.e., classification of studies according to predetermined characteristics capable of influencing results, such as study design, completeness and quality of data, absence of biases), and a quantitative component (i.e., extraction and analysis of the numerical information). The aim is to integrate the findings, if possible, and to identify overall trends or patterns in the results.279 Studies must be subject to critical appraisal, and various biases in the selection of subjects, detection of events, or presentation of results (e.g., publication bias) must be assessed.14,106,280 See also systematic review.
METAPHOR A word, image, expression, concept, or symbol used as a cognitive device to convey or comprehend an idea – sometimes an abstract or complex concept. In epidemiology, a classic example is the “web of causation.” Metaphors are important in many scientific and professional endeavors, including many epidemiology-related activities (e.g., health promotion, risk assessment, risk communication); and, of course, in teaching epidemiology.143,144,169,216,217 They may inspire or otherwise form the basis of subsequent formal developments (e.g., causal diagrams partly stem from and formalize the web-of-causation metaphor).
METHODOLOGY The scientific study of methods. Methodology should not be confused with methods. The word methodology is all too often used when the writer means method.
MIASMA THEORY An explanation for the origin of epidemics, the “miasma theory” was implied by many ancient writers and made explicit by Lancisi in De noxiis paludum effluviis (1717). It was based on the notion that when the air was of a “bad quality” (a state that was not precisely defined but that was supposedly due to decaying organic matter), the persons breathing that air would become ill. Malaria (“bad air”) is the classic example of a disease that was long attributed to miasmata. “Miasma” was believed to pass from cases to susceptibles in those diseases considered contagious.
MIGRANT STUDIES Studies taking advantage of migration to one country by those from other countries with different physical and biological environments, cultural background, and/or genetic makeup, and different morbidity or mortality experience.
Comparisons are made between the mortality or morbidity experience of the migrant groups with that of their current country of residence and/or their country of origin. Sometimes the experiences of a number of different groups who have migrated to the same country have been compared.
MILLENNIUM DEVELOPMENT GOALS (MDGs) Goals drawn from the actions and targets contained in the Millennium Declaration, which was adopted by 189 nations during the United Nations Millennium Summit in September 2000. To be achieved by 2015, the eight MDGs break down into 18 quantifiable targets measured by 48 indicators. The MDGs recognize the interdependence between health, growth, poverty reduction, and sustainable development; they acknowledge that development rests on democratic governance, the rule of law, respect for human rights, and peace and security.281 The eight MDGs are:
Goal 1: Eradicate extreme poverty and hunger.
Goal 2: Achieve universal primary education.
Goal 3: Promote gender equality and empower women.
Goal 4: Reduce child mortality.
Goal 5: Improve maternal health.
Goal 6: Combat HIV/AIDS, malaria, and other diseases.
Goal 7: Ensure environmental sustainability.
Goal 8: Develop a Global Partnership for Development.
MILL’S CANONS In A System of Logic (first edition 1843), John Stuart Mill (1806–1873) devised logical strategies (“canons”) from which causal relationships may be inferred. Four in particular are pertinent to epidemiology:10
Method of agreement (first canon): “If two or more instances of the phenomenon under investigation have only one circumstance in common, the circumstance in which alone all the instances agree, is the cause (or effect) of the given phenomenon.”
Method of difference (second canon): “If an instance in which the phenomenon under investigation occurs, and an instance in which it does not occur, have every circumstance in common save one, that one occurring only in the former, the circumstance in which alone the two instances differ is the effect, or cause or a necessary part of the cause, of the phenomenon.”
Method of residues (fourth canon): “Subduct from any phenomenon such part as is known by previous inductions to be the effect of certain antecedents, and the residue of the phenomenon is the effect of the remaining antecedents.”
Method of concomitant variation (fifth canon): “Whatever phenomenon varies in any manner whenever another phenomenon varies in some particular manner, is either a cause or an effect of that phenomenon, or is connected with it through some fact of causation.” See also causal criteria.
MINIMAL CLINICALLY IMPORTANT DIFFERENCE The smallest effect of a treatment that patients perceive as beneficial and that, in the absence of unacceptable side effects, inconvenience, and costs, mandates that the treatment be given. A term used in clinical trials.
MINIMUM DATA SET (Syn: uniform basic data set) A widely agreed upon and generally accepted set of terms and definitions constituting a core of data acquired for medical records and employed for developing statistics suitable for diverse types of analyses and users. Such sets have been developed for birth and death certificates, ambulatory care, hospital care, and long-term care. See also birth certificate; death certificate; hospital discharge abstract system.
MISCLASSIFICATION The erroneous classification of an individual, a value, or an attribute into a category other than that to which it should be assigned.12,31,97 The probability of misclassification given the true value may be the same in all study groups (nondifferential misclassification) or may vary between groups (differential misclassification, e.g., accuracy of diagnoses of cases depends on their alcohol consumption).254 It is wrong to assume that nondifferential misclassification can produce only bias toward the null in measures of association or effect; other conditions must also be satisfied in order to ensure that bias is toward the null, most prominently that the misclassification must be independent of (unrelated to) the occurrence of other errors.12 Such independence is rare in clinical and epidemiological research.
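The simplest case, in which nondifferential misclassification of a binary exposure does bias the odds ratio toward the null, can be sketched with expected counts. All numbers below (true counts, sensitivity, specificity) are hypothetical, and the sketch assumes that the same sensitivity and specificity apply to cases and controls and that the errors are independent of any other errors; as the entry notes, bias toward the null is not guaranteed when such conditions fail.

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed and unexposed cases; c, d = exposed and unexposed controls."""
    return (a * d) / (b * c)


def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected counts classified as exposed and unexposed."""
    observed_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    observed_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return observed_exposed, observed_unexposed


# Hypothetical true counts: 60/40 among cases, 40/60 among controls.
true_or = odds_ratio(60, 40, 40, 60)

# The same (nondifferential) error rates are applied to cases and controls.
case_exp, case_unexp = misclassify(60, 40, sensitivity=0.8, specificity=0.9)
ctrl_exp, ctrl_unexp = misclassify(40, 60, sensitivity=0.8, specificity=0.9)
observed_or = odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp)

print(true_or, observed_or)
```

In this example the observed odds ratio lies between 1 and the true odds ratio. Using different sensitivity or specificity for cases and controls would turn the sketch into an example of differential misclassification, for which the direction of bias is not predictable in general.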
MISSION The purpose for which an organization exists. See also goal, objective, target.
MMWR Morbidity and Mortality Weekly Report. A publication of the U.S. Centers for Disease Control and Prevention (www.cdc.gov/mmwr).
MOBILITY, GEOGRAPHIC Movement of persons from one permanent place of residence (country or region) to another.
MOBILITY, SOCIAL Movement from one defined socioeconomic group to another, either upward or downward. Downward social mobility, which can be related to impaired health (e.g., alcoholism, schizophrenia, mental retardation), is sometimes referred to as “social drift.”
MODE The most frequently occurring value in a set of observations. One of the measures of central tendency. See also average.
In epidemiology, the use of models began with an effort to predict the onset and course of epidemics. In the second report of the Registrar-General of England and Wales (1840), William Farr developed the beginnings of a predictive model for communicable disease epidemics. He had recognized regularities in the smallpox epidemics of the 1830s. By calculating frequency curves for these past outbreaks, he estimated the deaths to be expected. See also demonstration model; mathematical model; theoretical epidemiology.
MODEL LIFE TABLE Simulated life table constructed for a country, used mainly when vital statistics are deficient. The model may be based on averaging of empirical data or on more sophisticated methods. The Coale-Demeny method is a range of models for life expectancies ranging from 20 to 80+ years with four variations of mortality patterns.
MODIFYING FACTOR See effect modifier.
MOLECULAR EPIDEMIOLOGY An approach to study the molecular mechanisms, pathophysiology, and etiology of disease; less frequently, early detection, treatment and prognosis. A way of practicing integrative research; sometimes, a level of measurement, but not really a discipline with substantive research content.23,282–285 From an instrumental viewpoint, the use in epidemiological research of the techniques of molecular and cellular biology, genetics, systems biology, proteomics and other “omics” approaches to analyze biomarkers.192,193 Molecular techniques are used in cancer epidemiology to identify, characterize, and measure molecular changes involved in carcinogenesis (xenobiotic DNA adducts, somatic genetic mutations); metabolic polymorphisms; and many other genetic and epigenetic processes. Molecular epidemiology is making valuable contributions to biomedical, clinical, and population sciences; e.g., research on the role of gene-environment interactions in the etiology of many diseases is generating knowledge about biological mechanisms as well as about primary prevention.80 See also HuGENet.
MONOGENIC DISEASES Diseases in which a genetic variant of high penetrance confers a high risk of developing the disease and may thus be thought to be the sole cause of the disease, although the penetrance and expressivity of the gene are sometimes regulated by other genes or even by lifestyle and environmental exposures (e.g., diet, access to effective medical treatment). An antonym of polygenic diseases.
MONOTONIC SEQUENCE A sequence is said to be monotonically increasing if each value is greater than or equal to the previous one and monotonically decreasing if each value is less than or equal to the previous one. If equality of values is excluded, we speak of a strictly (increasing or decreasing) monotonic sequence. A sequence that is monotonic in either direction is said to be monotone, or to display monotonicity.
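The definition can be checked mechanically by comparing each value with its predecessor. The following is a minimal sketch; the function names are illustrative and not part of any standard library.

```python
def is_monotonic_increasing(values, strict=False):
    """True if each value is >= the previous one (> if strict)."""
    pairs = list(zip(values, values[1:]))  # consecutive (previous, current) pairs
    if strict:
        return all(prev < cur for prev, cur in pairs)
    return all(prev <= cur for prev, cur in pairs)


def is_monotonic_decreasing(values, strict=False):
    """True if each value is <= the previous one (< if strict)."""
    pairs = list(zip(values, values[1:]))
    if strict:
        return all(prev > cur for prev, cur in pairs)
    return all(prev >= cur for prev, cur in pairs)


def is_monotone(values):
    """True if the sequence is monotonic in either direction."""
    return is_monotonic_increasing(values) or is_monotonic_decreasing(values)


print(is_monotonic_increasing([1, 2, 2, 3]))               # ties are allowed
print(is_monotonic_increasing([1, 2, 2, 3], strict=True))  # ties are excluded
print(is_monotone([1, 3, 2]))                              # changes direction
```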
MONTE-CARLO STUDY, TRIAL Complex relationships that are difficult to solve by mathematical analysis are sometimes studied by computer experiments that simulate and analyze a sequence of events using random numbers. Such experiments are called Monte Carlo trials or studies, in recognition of Monte Carlo as one of the gambling capitals of the world. See also Markov process; simulation.
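As an illustration only, the sketch below uses random numbers to estimate the probability of observing at least 4 cases in a sample of 20 people when the prevalence is 10%. The scenario, the function names, and the parameter values are assumptions chosen for the example. This particular problem also has an exact binomial solution, which is computed alongside so that the simulated estimate can be compared with it; in practice, Monte Carlo methods are reserved for problems where no such closed form is available.

```python
import random
from math import comb


def simulate_tail_probability(n_people, prevalence, threshold,
                              n_trials=20000, seed=1):
    """Monte Carlo estimate of P(number of cases >= threshold)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(n_trials):
        # Each person is a case with probability equal to the prevalence.
        cases = sum(rng.random() < prevalence for _ in range(n_people))
        if cases >= threshold:
            hits += 1
    return hits / n_trials


def exact_tail_probability(n_people, prevalence, threshold):
    """Exact binomial tail probability, used here only as a check."""
    return sum(
        comb(n_people, k) * prevalence**k * (1 - prevalence)**(n_people - k)
        for k in range(threshold, n_people + 1)
    )


estimate = simulate_tail_probability(20, 0.10, 4)
exact = exact_tail_probability(20, 0.10, 4)
print(f"simulated: {estimate:.3f}  exact: {exact:.3f}")
```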
MOOSE Meta-analysis Of Observational Studies in Epidemiology. A consensus checklist to improve the quality of reports of meta-analyses of observational studies. It contains specifications on background, search strategy, methods, results, discussion, and conclusion.95,106,286 See also consort; quadas; quorom; stard; strobe; trend.
See also health index; incidence rate; notifiable disease; prevalence.
MORBIDITY RATE A term, preferably avoided, used to refer to the incidence rate and sometimes (incorrectly) to the prevalence of disease.
MORBIDITY SURVEY A method for estimating the prevalence and/or incidence of disease in a population. A morbidity survey is usually designed simply to ascertain the facts as to disease distribution and not to test a hypothesis. See also cross-sectional study; health survey.
MORTALITY:INCIDENCE RATIO See cancer mortality:incidence ratio.
MORTALITY RATE See death rate.
MORTALITY STATISTICS Statistical tables compiled from the information contained in death certificates. Most administrative jurisdictions in all nations produce tables of mortality statistics. These may be published at regular intervals; they usually show numbers of deaths and/or rates by age, sex, cause, and sometimes other variables.
MOVING AVERAGES (Syn: rolling averages) A set of methods for smoothing irregularities in trend data, such as long-term secular trends in incidence or mortality rates. Graphical display of (say, 3- or 5-year) moving averages makes it easier to discern long-term trends in rates that otherwise might be obscured by short-term fluctuations. The span over which the average is taken is sometimes called the window width. Within that window, the averages may be weighted by proximity to the point at which the rate is being estimated. This weighting function is sometimes called a “kernel function,” and the process is then called kernel smoothing.
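A minimal sketch of both ideas follows, assuming a centered window of odd width and yearly rates held in a list. The function names and the example rates are hypothetical. The kernel weights (1, 2, 1) are one simple choice that gives more weight to the central year.

```python
def moving_average(rates, window=3):
    """Unweighted centered moving average over an odd window width.

    Values are returned only for positions with a complete window.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    return [
        sum(rates[i - half:i + half + 1]) / window
        for i in range(half, len(rates) - half)
    ]


def kernel_smooth(rates, weights):
    """Centered moving average weighted by proximity to the central point."""
    if len(weights) % 2 == 0:
        raise ValueError("the kernel must have odd length")
    half = len(weights) // 2
    total = sum(weights)
    return [
        sum(w * r for w, r in zip(weights, rates[i - half:i + half + 1])) / total
        for i in range(half, len(rates) - half)
    ]


yearly_rates = [10, 20, 0, 20, 10]             # hypothetical rates per 100,000
print(moving_average(yearly_rates, window=3))  # 3-year moving average
print(kernel_smooth(yearly_rates, [1, 2, 1]))  # triangular kernel
```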
MRC Medical Research Council (UK, Canada, other countries). Government-appointed oversight groups that define and set policies for research in all aspects of medical science and usually implement policies by allocating funds for research training, research programs, and research projects.
MSM Men who have sex with men. In this group, high-risk practices for HIV infection may occur.
MULTICOLLINEARITY In multiple regression analysis, a situation in which at least some of the regressors (independent variables) are highly correlated with each other. Such a situation can result in inaccurate or undefined estimates of the parameters in the regression model.12,20
MULTIDRUG-RESISTANT (MDR) See drug resistance, multiple.
MULTIFACTORIAL ETIOLOGY See multiple causation.
MULTILEVEL ANALYSIS (Syn: contextual analysis, hierarchical analysis) Integration of contextual, group, or macrolevel factors with individual-level factors in epidemiological analyses of health states and outcomes. The rationale is that the distribution of health and disease in populations is not explained only by characteristics of individuals.287 Methodologies that analyze outcomes in relation to determinants simultaneously measured at different levels (e.g., individual, workplace, neighborhood, region). One aim of multilevel analyses is to explain how group- and individual-level variables interact in shaping health. Such analyses require one to select the appropriate contextual units and contextual variables, to correctly specify the model, and to account for residual correlation between individuals within contexts.120,137
MULTILEVEL MODEL (Syn: hierarchical model) A regression model in which the coefficients of the regressors are themselves modeled as functions of properties of the regressors. For example, in a regression of colon cancer incidence in relation to food intakes, the food coefficients may be modeled as functions of their nutrient contents. In a regression of lung-cancer incidence in relation to occupation, an occupational coefficient may be modeled as a function of the chemical exposures in the occupation. The properties used to model the coefficients are called second-level or second-stage covariates. Multilevel models are equivalent to random-coefficient models or mixed models in which the second-level covariates have random coefficients.
MULTINOMIAL DISTRIBUTION The probability distribution associated with the classification of each of a sample of individuals into one of several mutually exclusive and exhaustive categories, assuming that the individual classifications are independent of one another. When the number of categories is two, the distribution is called the binomial distribution.
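For illustration, the probability of observing a particular set of category counts is n!/(c1! c2! ... ck!) multiplied by the product of each category probability raised to its count. The sketch below computes this; the function name and the example probabilities are assumptions made for the example.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of the given category counts under a multinomial model.

    counts: number of individuals classified into each category
    probs:  probability of each category (should sum to 1)
    """
    n = sum(counts)
    coefficient = factorial(n)
    for c in counts:
        coefficient //= factorial(c)  # stays an exact integer at every step
    return coefficient * prod(p**c for p, c in zip(probs, counts))


# Three categories, three individuals, one classified into each category.
print(multinomial_pmf([1, 1, 1], [0.2, 0.3, 0.5]))

# With two categories the same function gives the binomial probability.
print(multinomial_pmf([2, 1], [0.5, 0.5]))
```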
MULTIPHASE SAMPLING Method of sampling that gathers some information from a large sample and more detailed information from subsamples within this sample, either at the same time or later. Contrast to multistage sampling.
MULTIPHASIC SCREENING See screening.
MULTIPLE CAUSATION (Syn: multifactorial etiology) The concept that a given health state or health-related process may have more than one cause. A combination of causes or alternative combinations of causes is often required to produce the health outcome. See also causal diagram; diseases of complex etiology; probability of causation; risk factor; web of causation.
MULTIPLE CAUSE THEORY A theory coherent with multiple causation, i.e., with the fact that multiple coexisting causes may influence the occurrence of disease and other health outcomes. By contrast, Henle-Koch postulates do not admit multiple causes of a single disorder, nor do they contemplate causal relations not susceptible to experimentation. Early in the twentieth century, German scientists raised questions about the limitation of such postulates and paved the way for new ideas on multifactorial causality.288 Consensus about multiple causation coalesced a half-century later, when chronic noninfectious disease had become a leading public health concern. Since then, the theory has permeated epidemiology.6–11,66,67,120,169,253,262,273
MULTIPLE COMPARISON PROBLEMS Problems that arise from the fact that the greater the number of conventional statistical tests of significance conducted on a data set, the greater the probability that at least one test will falsely reject the null hypothesis solely because of the play of chance. Adjustment of the alpha level in this situation is a debatable option; it has been strongly criticized because it will dramatically raise the false-negative rate (rate of type-II error, failure to reject a false null).12 See also P value; significance, statistical.
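The size of the problem can be illustrated under a simplifying assumption that is not part of the definition: if the tests are independent and every null hypothesis is true, the probability that at least one test falsely rejects is 1 − (1 − α)^k for k tests at level α. The function name below is illustrative.

```python
def familywise_error_rate(alpha, n_tests):
    """P(at least one false rejection) for independent tests of true nulls."""
    return 1 - (1 - alpha) ** n_tests


# The chance of at least one false positive grows quickly with the
# number of tests, even though each test is run at the 0.05 level.
for k in (1, 5, 20, 100):
    print(k, round(familywise_error_rate(0.05, k), 3))
```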
MULTIPLE COMPARISON TECHNIQUES Statistical procedures to adjust for differences in probability levels in setting up simultaneous confidence limits involving several distributions or sets of data or in comparing the means of several groups. Tukey’s method is the most conservative; it uses the difference between the largest and smallest means as a measure of their dispersion; the q statistic, based on the α level (acceptable rate of type-I error), and the number of groups are used as multipliers of the standard deviation. The Bonferroni correction adjusts the α error level to compensate for multiple comparisons between three or more groups or two or more response variables.
Such conventional multiple comparisons techniques are problematic because they raise the false-negative rate (rate of type-II error, failure to reject a false null), often to the point that it may become impossible to detect any true effects. Modern techniques that attempt to address both type-I and type-II error have been developed, especially under the topic of empirical-Bayes methods and shrinkage estimation.
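As a minimal sketch of the Bonferroni correction in its usual form, the α level is divided by the number of comparisons, and each P value is judged against that stricter threshold. The function names and the P values are hypothetical.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons


def bonferroni_reject(p_values, alpha=0.05):
    """For each P value, whether its null is rejected at the corrected level."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


p_values = [0.001, 0.02, 0.04]      # hypothetical results of three tests
print(bonferroni_threshold(0.05, len(p_values)))
print(bonferroni_reject(p_values))  # compare with unadjusted p <= 0.05
```

In this example all three P values fall below the unadjusted 0.05 level, but only the smallest survives the correction, which illustrates the loss of power described above.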
MULTIPLE LOGISTIC MODEL See logistic model.
MULTIPLE OF THE MEDIAN A simple method of adjusting for variables such as age and sex in direct proportion to the magnitude of the original measurements; the method is not much affected by variation in measurement errors. However, the method is criticized because the multiple of the median is affected by the distribution of results used to determine the median, and there is no correction for the spread of the data. For these reasons the z score is preferable.
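In its usual form, a multiple of the median is an individual measurement divided by the median of a reference group with the same characteristics (e.g., the same age and sex). The sketch below contrasts it with a z score, which also takes the spread of the reference data into account. The function names and reference values are hypothetical, and the z score here uses the sample mean and sample standard deviation of the reference group.

```python
from statistics import mean, median, stdev


def multiple_of_median(value, reference_values):
    """Measurement expressed as a multiple of the reference-group median."""
    return value / median(reference_values)


def z_score(value, reference_values):
    """Distance from the reference mean in standard-deviation units."""
    return (value - mean(reference_values)) / stdev(reference_values)


reference = [20, 25, 30, 35, 40]  # hypothetical reference measurements
print(multiple_of_median(45, reference))
print(round(z_score(45, reference), 2))
```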
MULTIPLE REGRESSION TECHNIQUES Techniques for regression analysis that allow the inclusion of multiple regressors (independent variables).20
MULTIPLE RISK Where more than one risk factor for the development of a disease or other outcome is present and their combined presence results in an increased risk, we speak of “multiple risk.” The increased risk may be due to the additive effects of the risks associated with the separate risk factors, or to synergism. See also multiple causation.
MULTIPLICATIVE MODEL A model in which the joint effect of two or more causes is the product of their individual effects. For instance, if factor X multiplies risk by the amount x in the absence of factor Y, and factor Y multiplies risk by the amount y in the absence of factor X, then the multiplicative risk model states that the two factors X and Y together will multiply the risk by x × y. See also additive model.
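A worked example, using hypothetical numbers: with a baseline risk of 0.01, x = 3, and y = 5, the multiplicative model gives a joint risk of 0.01 × 3 × 5 = 0.15. For contrast, the sketch also computes the joint risk under an additive model in which the excess risks of the two factors are summed. That formulation of the additive model is an assumption of this example.

```python
def joint_risk_multiplicative(baseline_risk, x, y):
    """Joint risk when the two factors multiply risk by x and by y."""
    return baseline_risk * x * y


def joint_risk_additive(baseline_risk, x, y):
    """Joint risk when the excess risks of the two factors add.

    Equivalent to a joint risk ratio of x + y - 1.
    """
    return baseline_risk * (1 + (x - 1) + (y - 1))


baseline = 0.01  # hypothetical risk with neither factor present
print(joint_risk_multiplicative(baseline, 3, 5))
print(joint_risk_additive(baseline, 3, 5))
```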
MULTISTAGE MODEL A mathematical model, mainly for carcinogenesis, based on the theory that a specific carcinogen may affect any one of several stages in the development of cancer.
MULTISTAGE SAMPLING Selection, random or otherwise, of entities (such as geographical regions, schools, workplaces) followed by random sampling of persons within each sampled group. The method has advantages such as convenience and feasibility, but it complicates analysis. Contrast to multiphase sampling.
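A minimal two-stage sketch follows, assuming that the first stage is also a simple random selection and that the sampling frame is a dictionary mapping each group to a list of its members. The function name, the school names, and the sample sizes are hypothetical.

```python
import random


def two_stage_sample(groups, n_groups, n_per_group, seed=0):
    """Randomly select groups, then randomly sample persons within each.

    groups: dict mapping a group name to the list of its members
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    # Stage 1: sample the groups (schools, workplaces, regions).
    chosen = rng.sample(sorted(groups), n_groups)
    sample = {}
    for name in chosen:
        members = groups[name]
        # Stage 2: sample persons within each selected group.
        k = min(n_per_group, len(members))
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical frame: 10 schools with 30 pupils each.
schools = {
    f"school_{i}": [f"s{i}_pupil_{j}" for j in range(30)]
    for i in range(10)
}
sample = two_stage_sample(schools, n_groups=3, n_per_group=5)
for school, pupils in sample.items():
    print(school, pupils)
```

Because persons from the same group tend to resemble one another, an analysis of such a sample would normally need to account for the clustering, which is the complication mentioned in the definition.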
MULTIVARIATE ANALYSIS A set of techniques used when the variation in several variables has to be studied simultaneously. In statistics, any analytical method that allows the simultaneous study of two or more dependent variables (regressands).12,20,31,97
MUTAGEN A physical or chemical agent that raises the frequency of mutation above the spontaneous rate. Any substance that can cause genetic mutations. Mutagens cause mutations in several different ways.
MUTAGENIC That which causes mutations. Contrast clastogenic and aneugenic.
MUTATION Any change in a DNA sequence. In a clinical sense, any such change that disrupts the information contained in DNA and leads to disease. Many types of mutations and mechanisms leading to mutations exist.23 Change in the genetic material not caused by genetic segregation or recombination that is transmitted to daughter cells and to succeeding generations provided that it is not a dominant lethal factor.
MUTATION RATE The frequency with which mutations occur per gene or per generation.