Scientific Contribution
“A Disease by any other Name:
musings on the concept of a genetic disease”
Kelly C. Smith
Clemson University
Department of Philosophy & Religion
101 Hardin Hall
Clemson, SC 29634
USA
(864) 656-5366 Voice
(864) 656-2858 Fax
kcs@clemson.edu
ABSTRACT
What exactly is a genetic disease? For a phrase one hears on a daily basis, there has been surprisingly little analysis of the underlying concept. Medical doctors seem perfectly willing to admit that the etiology of disease is typically complex, with a great many factors interacting to bring about a given condition. On such a view, descriptions of diseases like cancer as genetic seem at best highly simplistic, and at worst philosophically indefensible. On the other hand, there is clearly some practical value to be had by classifying diseases according to their predominant cause when this can be accomplished in a theoretically satisfactory manner. The question therefore becomes exactly how one should go about selecting a single causal factor among many to explain the presence of disease. When an attempt to defend such causal selection is made at all, the standard accounts offered (Koch’s postulates, Hill’s epidemiological criteria, manipulability) are all clearly inadequate. I propose, however, an epidemiological account of disease causation which walks the fine line between practical applicability and theoretical considerations of causal complexity and attempts to compromise between patient-centered and population-centered concepts of disease. The epidemiological account is the most basic framework consistent with our strongly held intuitions about the causal classification of disease, yet it avoids the difficulties encountered by its competitors.
There is a crucial distinction that must be drawn between two related problems in causal analysis. Germund Hesslow (1983, 1984, 1988) does this admirably well in his discussion of causal connection vs. causal selection. The problem of causal connection is the problem of determining which factors in a complex set, many of which may be correlated with the outcome, are actually causes of the outcome. It is this question that takes up the bulk of the scientist’s attention, and for obvious reasons. For example, the current debate over whether one’s intake of dietary salt increases the risks of coronary disease is a debate about whether and to what extent salt is actually a causal factor (as opposed to merely a correlationally attractive bystander) in the disease etiology.
Causal selection is a fundamentally different problem, however. Causal selection involves identifying which factor(s), among those known to be causally involved in a particular outcome, to cite as explaining that outcome. In complex causal situations (which are particularly common in medicine), we simply cannot, for practical reasons, cite the entire causal matrix as the appropriate explanation[i]. For example, we would tend to cite the application of a burning match to a pile of hay as the cause of a barn fire, without meaning to imply that other factors (presence of oxygen, absence of large quantities of water, etc.) were not also causally involved[ii]. Causal factors other than the one selected (presence of oxygen, etc.) are typically referred to as causal conditions (or, derogatorily, as mere conditions).
Most human diseases have a highly complex etiology, involving as they do causal factors at all levels: genetic, physiological, systemic, psychological, social, etc. It is thus understandable that, as long as questions of causal connection remain unanswered, considerable effort is expended in an attempt to further elucidate the causal matrix. What we must keep firmly in mind, however, is that describing a disease simply in terms of its etiology is answering a fundamentally different question than that of causal selection. This becomes clear once we realize that, even if all causal connection problems were answered once and for all, causal selection problems would remain. This is because what we regard as an appropriate explanation has a pragmatic component that is not captured in causal connection analysis.
Allow me to illustrate this point with a hypothetical example. Suppose an alien physician were to examine a human hospital patient. Suppose further that the alien’s technology is so advanced that he is able to determine the precise causal sequence leading to that particular patient’s cancer. The alien has thus completely solved the causal connection problem and could, at least in principle, draw a causal map of the development of cancer accurate down to the subatomic level. If you were to ask the alien, “What caused the cancer?” he might simply point to the causal diagram. However, this is unlikely to satisfy us because we are not asking for an exhaustive list of causal connections, but for some selection as to which of those factors we should cite as being most explanatory.
The alien might be able to satisfy this desire for explanation if some causal factor were either necessary or sufficient for the development of the cancer. For example, if everyone with a particular gene developed this cancer, the gene would be an obvious choice for causal selection and we could legitimately call this a genetic disease in that sense. We could also select in a relatively straightforward fashion if nobody without some (non-universal) environmental influence developed the cancer. For example, if nobody developed the cancer unless they drank large amounts of cranberry juice, we might label it an environmental disease. However, there are two basic problems here. First, note that the examples above require information about populations of humans. It would not be possible for the alien physician to judge the necessity or sufficiency of the causal factors present in this isolated case without knowing a great deal more about humans than he could reasonably derive from a single patient. Second, given what we know about diseases like cancer, it seems highly unlikely that there will be such a neat identification of sufficient and/or necessary conditions, since the causal matrices are just too complex.
The two philosophers to think most clearly about this problem were J.S. Mill and J.L. Mackie, both of whom ultimately expressed pessimism about the possibility of non-arbitrary causal selection. Mill (1859) points out that, rather than simple necessary or sufficient conditions, what one typically finds are sets of conditions, the components of which are sufficient only when considered jointly, with the set itself being non-necessary[iii]. For example, the fire in the barn is the result of a great many factors such as the application of burning match to hay, absence of water, presence of oxygen, etc. – none of which would bring about a fire by itself (i.e., they are individually insufficient). On the other hand, there are an infinite number of such sets which could bring about a fire in the barn – we might apply white phosphorus to the wall or suddenly increase the air pressure 100,000,000 fold, for example. Therefore, the particular set of factors in any case, while (jointly) sufficient, is not necessary.
Mill sympathizes with the desire to causally select one factor over another in such situations, but feels such exercises are ultimately arbitrary:
“The real cause is the whole of these antecedents; and we have, philosophically speaking, no right to give the name of cause to one of them exclusively of the others… Nothing can better show the absence of any scientific ground for the distinction between the cause of a phenomenon and its conditions, than the capricious manner in which we choose to denominate the cause.” (Mill 1859, pp. 214-215)
Given that most diseases are causally complex, this problem is inescapable when classifying diseases in terms of their etiology, as modern medicine so often does[iv]. The fact of the matter is that much of importance hinges on the causal classification of a disease: social acceptability, funding for research, etc. If such classifications really are fundamentally arbitrary, then we have much to correct.
II. Simplistic Methods of Causal Selection
“He who would do away with philosophy is the slave of the worst philosophy.”
- Friedrich Engels
Ultimately, some of what we mean to capture by classifying diseases as genetic can be salvaged. However, people tend to think about causal selection, when they think about it at all, in highly simplistic terms[v]. In particular, there are three popular approaches which are used in an attempt to settle the causal selection problem in this context, all of which are inadequate.
Perhaps the most common, but also certainly the most hopelessly flawed of these techniques, holds that we are justified in claiming a disease is genetic, provided that genes are causally involved. This might be a defensible claim, as we will see, if the analysis went further than this to discuss what kind of involvement there is and why it is the most important aspect of the explanation of the disease. However, the typical conclusion of a medical genetics paper does not even begin to consider such things. The recent literature is thus rife with examples of researchers claiming that, because there is good evidence of genetic involvement, the disease is now known to be genetic[vi]. Little more needs to be said to show the absurdity of this than to point out that any disease whatsoever can be classified as genetic on these grounds. All disease must involve, directly or indirectly, genes – if nothing else, genes code for the proteins which make all of biology possible. Even something so obviously non-genetic as lead poisoning, for example, could be called genetic in the sense that some people are likely to have more efficient (gene-based) mechanisms for handling high lead levels and thus will tend not to get the condition when others around them do, etc. We could, with equal justification, argue that all such disease should be classified as protein based, since all disease must also involve proteins. Clearly, this kind of analysis does nothing to illuminate the causal situation.
The second approach was first systematically developed by Robert Koch in his set of three key postulates. Since these postulates have done much to shape the current thinking about medical causation, they warrant close examination. Koch argues that we are justified in saying a particular pathogen causes a particular disease whenever three basic conditions are met:
1) The pathogen is always found in individuals with the disease.
2) The pathogen is never found in individuals with conditions other than the disease.
3) The pathogen always produces the disease when introduced into healthy individuals.
Note that postulate #1 is simply a requirement that the pathogen is necessary, while postulates #2 and #3 are requirements that it also be sufficient[vii]. It is certainly true that these postulates served Koch and others well in the early field work for which they became famous. The pathogenic diseases Koch studied are probably uniquely suited to this type of analysis, involving as they do a clearly identifiable infectious agent with well-defined and dramatic onset of a stereotypical set of symptoms. But even for infectious diseases, these postulates are, strictly speaking, too strong. For example, many people are infected with the TB bacillus yet never exhibit the disease, but we do not conclude that TB is not caused by the bacillus[viii]. In any event, infectious diseases are not representative of disease in general and genetic disease in particular.
Consider the case of Cystic Fibrosis, often put forward as a classic example of a genetic disease. As the story is typically told, Cystic Fibrosis is caused by any of several hundred known mutations in the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) gene. Given this account, one might reasonably expect that having a CFTR mutation will lead to CF and, conversely, that lacking a CFTR mutation prevents CF. As often happens in such cases, however, the actual clinical picture that emerges is much more complex:
INSERT TABLE 1
If we apply Koch’s postulates to the case of CF, we see the following pattern:
1) Is the genetic anomaly always found in individuals with the disease? NO
2) Is the genetic anomaly never found in individuals with other conditions? NO
3) Does the genetic anomaly always produce the disease in otherwise healthy individuals? NO
According to Koch, therefore, there is no reason to describe CF as a genetic disease – despite the fact that we have a relatively good idea of its etiology and have excellent reason to believe that a gene is a crucial causal player.
There are also epidemiological criteria which are essentially more complex versions of Koch’s postulates. Sir Austin Hill’s analysis lists eight different criteria which should be considered:
1) Strength: The correlation between the causal factor and the disease should be strong.
2) Consistency: The correlation between causal factor and disease should be observed under varying conditions/individuals.
3) Specificity: The causal factor should be correlated only, or most strongly, with the disease.
4) Temporality: The causal factor should precede the disease in temporal sequence.
5) Biological gradient: The disease should exhibit a dose response curve for the causal factor.
6) Plausibility: There should be a plausible causal story as to how the factor causes the disease.
7) Coherence: The causal story about the factor and disease must cohere with other knowledge we have about other causal factors, etc.
8) Analogy: Similar causal factors should cause similar diseases.
This is undeniably an improvement over Koch’s original formulation, since it allows for a much more nuanced description of the causal relationship between a particular factor and the disease. However, this is also something of a drawback: where Koch’s postulates will usually yield a clear (if sometimes misleading) answer, Hill’s criteria will often not yield much of an answer at all (though they will not as frequently mislead). For example, Hill provides no clear method of ranking or weighting the various factors. Thus, it is relatively mysterious what we are to make of a putative causal agent which scores well on one criterion and poorly on another. When we examine the suitability of a genetic explanation for CF, we get something like the following pattern:
1) Strength: Unknown[ix]
2) Consistency: No
3) Specificity: No
4) Temporality: Yes
5) Biological gradient: Possibly
6) Plausibility: Yes
7) Coherence: Yes
8) Analogy: Unknown
With three fairly positive indicators, two negative and three uncertain, perhaps one could make a better case on Hill’s criteria than on Koch’s postulates that CF is actually a genetic disease. Hill’s account remains disturbingly vague, however.
“A little inaccuracy saves a world of explanation.” - C.E. Ayres
There are actually a wide variety of techniques that have been developed to make sense of causal selection[x]. However, the one most pertinent by far to the question of genetic disease is manipulability. The manipulability criterion says that a disease is genetic if and only if it is best controlled (prevented) through manipulation of the genes. This has an obvious appeal for medicine - since the physician’s primary interest is in preventing disease, he is likely to advocate (at least implicitly) a manipulability criterion. The basic problem here is that the intuitive appeal of manipulability, like that of Hill’s criteria, has been purchased at the price of clarity.
There are two main sorts of ambiguity here. The first concerns how we are to interpret the phrase “best controlled”. That is, what exactly makes one form of manipulation better than another? Perhaps if one technique is cheaper, less invasive, more permanent, and easier to employ, this will not be a difficult decision. However, how often will the choice actually be this clear? It seems at least plausible, perhaps likely, that within 50 years we will have developed the technology to alter the somatic genes of CF patients and restore them to something like normal health. However, this treatment is likely to be expensive and may carry significant risks. Would the gene therapy be a better manipulation than second generation DNase inhalers, a cheap and effective (if impermanent) treatment? This seems like the kind of decision we would want to let the patient make himself, based on his own personal preferences. Our instincts lead us in this direction precisely because it is very unclear how to decide which treatment is in fact best, or even if there is an objective best. If we leave the decision to each patient while maintaining the manipulability criterion, however, CF becomes a genetic disease for patients who prefer the gene therapy and a non-genetic disease for those preferring the inhaler. This seems a very odd sort of relativity to introduce into our causal classification of a disease.
The second point of ambiguity concerns what kind of manipulability we have in mind. When people speak of manipulability, they are rarely clear as to whether they mean manipulability in practice or manipulability in principle. Either way, there are difficulties. Suppose we mean manipulability in practice – a disease is genetic if there are or very soon will be effective genetic interventions for individuals with the disease. We would all like to believe that such treatments are just around the corner, but is there really any compelling evidence to this effect? To date, no somatic gene therapy has been successfully vetted in a clinical trial[xi]. Moreover, there are several serious technical difficulties that will have to be resolved before the treatment becomes the standard of care (e.g., targeting all and only the desired cells, etc.). Of course, we might be willing to endorse selective breeding or genetic engineering, where permanent alterations are made to the germ line of the population. This would be more tractable from a purely technical point of view (we have bred agricultural organisms for thousands of years), but these techniques still are not really practical because they have been rejected by virtually all modern countries for compelling ethical reasons. In such a case, the cure literally seems worse than the disease. At the moment, therefore, there are very poor grounds for describing any disease as genetic on the basis of its manipulability in practice. If what we mean by a genetic disease is that it is genetically manipulable in practice, then we will have to refrain from describing any diseases as genetic for a long time to come.
“Metaphysics is nothing more than an extremely obstinate effort to think clearly.”
– William James
There must be a way to satisfy both our practical need for disease classification and our theoretical need to do justice to the complexities of causal systems in biology. There must be a causal analysis which captures both the practically-minded focus on individual patients and the necessary elements of population thinking. The proper system would be as clearly decidable as Koch’s postulates, but allow for the kind of causal complexity incorporated in Hill’s criteria. This system will certainly not be able to classify all human disease, indeed it may fail to clearly classify most human disease, given the complex causal system the human body is. However, such an account would still represent a solid foundation on which to build. The epidemiological account of disease is an attempt to meet this challenge.
In order to avoid as many difficulties as possible, I set myself a relatively modest goal: I want to provide an account of disease causation which is minimally adequate – I make no pretense that this account cannot be greatly improved or even that different versions of it may not be appropriate for answering different kinds of questions. I must also make it clear that my goal is primarily to present a normative account of how the phrase “genetic disease” should be used, rather than a descriptive account of how it is actually used. To be sure, a careful description and categorization of the various uses would be an interesting project, but it is not mine at present. Of course, there is a fine line here - I have no wish to develop a disease concept that is so far from common practice that nobody will ever use it. On the other hand, I do not want to adopt uses of the term which are incompatible or theoretically indefensible.
My epidemiological account is thus designed to satisfy the two most basic and widely held intuitions about disease causation as simply as possible:
1) If a disease is genetic, this must mean that those with the gene are more likely than not to develop the disease[xii]. We might call this the bottom-up or individual causal viewpoint.
2) If a disease is genetic, this must mean that most cases of disease in the population are caused by the gene. We might call this the top-down, or populational causal viewpoint.
These seem entirely unexceptionable intuitions that any adequate account of causal selection will have to incorporate. How could a disease be said to be genetic if those with the gene are not (in some sense) likely to develop the disease? How could a disease be genetic if most people who are its victims do not have the gene in question? I thus treat these intuitions as primitive and in need of no further support.
I call my account the epidemiological account because it draws its inspiration from the field of epidemiology. Like epidemiology, it is an analysis of disease that crucially depends on statistical methods applied to populations rather than individuals. It is unlike epidemiology, however, in one crucial particular: while epidemiologists invest a great deal of their effort in finding causes to associate with disease (solving the causal connection problem), my concern is in the explanation of disease (solving the causal selection problem) on the assumption that we have already distinguished causes from conditions[xiii]. To be sure, answering the causal selection question depends crucially on accurate information concerning causal connection. I do not want to appear to underestimate the complexity of this task, but the methods and problems of causal connection analysis have been and will continue to be discussed quite widely in the literature and thus are not the focus of this paper.
Epidemiology is an examination of the properties of populations and thus it is crucial at the outset to specify a well-defined population to which the analysis will apply[xiv]. As we will see, extremely counterintuitive results will be produced if we are unclear or waffle concerning the population to which the analysis applies. Once we have clearly delineated our population and drawn up our list of causal factors, there are a great many questions one might ask. For our purposes, however, there are four which are most crucial[xv]:
1) The Global Question (for the entire population): What is the probability that those who have the gene will contract or have contracted the disease because of that gene?
2) The Diagnostic Question (for someone with both the disease and the gene): What is the probability that the disease was caused by the gene?
3) The Testing Question (for someone with the disease): What is the probability the disease was caused by the gene?
4) The Prognosis Question (for someone who has the gene): What is the probability that this will cause the disease?
Let’s turn to a particular case to develop the analysis we will need to answer each of these questions. Suppose we look carefully at a population of 10,000 people where 12.5% (1250) have a particular disease and 10% (1000) have a gene thought to cause that disease because 80% (800) of those with the gene develop the disease. We then gather the following data about the distribution of the disease and its associated gene:
INSERT TABLE 2
Unfortunately, this table is not accurate enough to answer our questions precisely. This is because some people who do not have the gene nevertheless develop the disease, due to the involvement of other causal factors. Therefore, some of the people who have the gene will develop the disease, not because of the gene, but because of these other factors. Thus, we must distinguish between cases where people have the gene and it causes them to have the disease and those where people have the gene and develop the disease for other reasons. If we assume that the gene and these other factors act independently, then we can say that approximately the same percentage of gene carriers will develop the disease for other reasons as those in the general population without the gene but with the disease (450/9000 = 5%). We can then move on to the more accurate Table 3:
INSERT TABLE 3
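The correction described above can be sketched numerically. This is only an illustration under stated assumptions: the counts are those quoted in the text, the variable names are my own, and the step that applies the 5% background rate to the 800 diseased carriers is an inference chosen because it reproduces the figure of 760 gene-caused cases used in the answers that follow.

```python
# Counts quoted in the text for the worked example.
population = 10_000
carriers = 1_000                # 10% have the gene
carriers_with_disease = 800     # 80% of carriers develop the disease
total_with_disease = 1_250      # 12.5% of the population

non_carriers = population - carriers                                     # 9000
non_carriers_with_disease = total_with_disease - carriers_with_disease   # 450

# Rate at which factors other than the gene produce the disease,
# estimated from people who lack the gene entirely.
background_rate = non_carriers_with_disease / non_carriers               # 0.05

# Assumption (not stated explicitly in the text): the background rate is
# applied to the 800 diseased carriers, on the premise that the gene and
# the other factors act independently.
carrier_cases_other_causes = round(background_rate * carriers_with_disease)
carrier_cases_due_to_gene = carriers_with_disease - carrier_cases_other_causes

print(f"background rate: {background_rate:.0%}")
print(f"carrier cases from other causes: {carrier_cases_other_causes}")
print(f"carrier cases caused by the gene: {carrier_cases_due_to_gene}")
```

On these assumptions, 40 of the 800 diseased carriers are attributed to other factors and 760 to the gene.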
With this correction, we are now in a position to answer, for this population, the original four questions posed:
1) The Global Question (for the entire population): What is the probability that those who have the gene will contract or have contracted the disease because of that gene?
This would simply be the number of individuals whose disease was caused by the gene divided by the size of the entire population (760/10,000 = 7.6%). This would tell us how large a problem the gene is causing in the population and thus, by extension, what could potentially be fixed by genetic manipulation. It does not really tell us anything interesting about whether the disease is genetic, however. We certainly would not want to say, for example, that rare diseases cannot be considered genetic (in fact, most paradigm examples of genetic disease are quite rare).
2) The Diagnostic Question (for someone with both the disease and the gene): What is the probability that the disease was caused by the gene?
This may be called the Simple Etiologic Fraction (SEF), which in this case would be the number of people whose disease was caused by the gene divided by the total number of people with the gene and the disease (760/800 = 95%). This would tell us how likely it is that a particular patient with the gene will develop a disease because of those genes or, if he already has the disease (and the gene), how likely it is that the condition was caused by the genes. It also indicates what percentage of the population with the gene might potentially be impacted by genetic manipulation to prevent disease.
It is tempting to view a high SEF (above 50%) as an indication that the disease is in fact genetic. However, we are tempted not so much by what SEF actually indicates as by what we might think it indicates. It does not really interest us to know that, in individuals with both the gene and the disease, the gene is or is not usually the causal factor responsible (which is what SEF actually reveals). It would interest us to know that the gene is usually the causal factor amongst those with the disease in general, since this would allow us to discover whether the populational causal intuition was met. But this is not what SEF tells us and we cannot derive that information from SEF (for this we need to answer the testing question). Simply put, SEF tells us nothing about the percentage of diseased individuals who owe their suffering to their genes. To do that, it would have to incorporate information about individuals with the disease, but who lack the gene (which it does not).
Similarly, SEF tells us nothing at all about the likelihood of developing the disease, given the gene (and thus cannot answer the prognosis question). We need this information to decide if the individual causal intuition is met. In order to do that, however, SEF would have to factor in information about people with the gene, but who remain disease-free (which it does not). Tempting as it might appear on first examination, SEF is pretty useless in answering the causal selection problem.
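The blindness of SEF to these two groups can be made concrete with a small sketch. The first population uses the counts from the worked example; the two variants are hypothetical numbers of my own, chosen only to show that SEF does not move when the number of disease-free carriers or of diseased non-carriers changes. The class and function names are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass
class Population:
    carriers: int             # everyone with the gene, diseased or not
    gene_caused_cases: int    # carriers whose disease is due to the gene
    carrier_other_cases: int  # carriers whose disease is due to other factors
    non_carrier_cases: int    # diseased people who lack the gene


def simple_etiologic_fraction(p: Population) -> float:
    """Gene-caused cases divided by all cases among carriers."""
    return p.gene_caused_cases / (p.gene_caused_cases + p.carrier_other_cases)


# Worked example from the text: 760 / 800.
original = Population(1_000, 760, 40, 450)

# Hypothetical variants (not from the paper):
# far more carriers who stay healthy...
many_healthy_carriers = Population(100_000, 760, 40, 450)
# ...or far more diseased people without the gene.
many_non_carrier_cases = Population(1_000, 760, 40, 45_000)

for name, pop in [
    ("original", original),
    ("many healthy carriers", many_healthy_carriers),
    ("many non-carrier cases", many_non_carrier_cases),
]:
    print(f"{name}: SEF = {simple_etiologic_fraction(pop):.0%}")
```

All three populations yield the same SEF, even though the gene is a poor predictor of disease in the second and accounts for a tiny share of all cases in the third.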
3) The Testing Question (for someone with the disease): What is the probability the disease was caused by the gene?
This is asking for what epidemiologists call the Population Etiologic Fraction (PEF). In this case, it would be the number of individuals whose disease was caused by the genes, divided by the total number of diseased individuals (760/1250 ≈ 61%). This does seem to be getting at something important in our intuitive concept of disease. In particular, as long as the PEF > 50%, we know that most cases of disease in the population are in fact caused by the genes. This is precisely the requirement of the populational causal intuition, so it seems we need a stipulation in our concept of genetic disease that the PEF > 50%. We might be tempted to stop here and say that this is the only criterion for genetic status. However, although a high PEF ensures that the populational intuition is met, it does not assure us with respect to the individual intuition. Consider the following variation on our original case:
INSERT TABLE 4
Here, although it is quite true that most cases of disease in the population are caused by the genes (PEF > 50%), the gene does not cause most people to develop the disease. This violates our individual causal intuition and thus PEF needs to be supplemented as a concept of genetic disease.
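A brief sketch may help fix the point. The first calculation uses the counts from the original worked example. The second uses hypothetical numbers of my own (they are not the figures of Table 4) to show a population in which the PEF exceeds 50% while the gene causes disease in only a minority of its carriers.

```python
def population_etiologic_fraction(gene_caused_cases: int, total_cases: int) -> float:
    """Share of all disease cases in the population that are caused by the gene."""
    return gene_caused_cases / total_cases


# Worked example from the text: 760 gene-caused cases among 1250 cases.
pef_original = population_etiologic_fraction(760, 1_250)

# Hypothetical variation (NOT the paper's Table 4): 1000 carriers, of whom
# only 300 develop the disease because of the gene, and 500 cases overall.
carriers = 1_000
gene_caused_cases = 300
total_cases = 500

pef_variant = population_etiologic_fraction(gene_caused_cases, total_cases)
share_of_carriers_affected = gene_caused_cases / carriers

print(f"original PEF: {pef_original:.1%}")
print(f"variant PEF: {pef_variant:.0%}")
print(f"variant share of carriers made ill by the gene: {share_of_carriers_affected:.0%}")
```

In the hypothetical variant, most cases are caused by the gene, yet fewer than a third of carriers become ill because of it.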
4) The Prognosis Question (for someone who has the gene): What is the probability that this will cause the disease?
Here we are asking for what epidemiologists call Attributable Risk (AR). In the original case from Table 3, we calculate AR by dividing the number of people whose disease was caused by the gene by the total number of people with the gene (760/1000 = 76%). Again, this does seem to be getting at something important in our intuitive concept of disease. As long as the AR > 50%, we know that the gene will cause most of its carriers to develop the disease. This meets the requirement of the individual causal intuition, and thus we must also stipulate in our general account of disease that the AR > 50%. Note, however, that just as a high PEF (answering the populational causal intuition) does not guarantee a high AR, so a high AR (answering the individual causal intuition) does not guarantee a high PEF. Consider the following case:
INSERT TABLE 5
Here, although it is quite true that, in most cases, the disease genes cause the disease (AR > 50%), it is equally true that most cases of disease are not caused by the gene (PEF < 50%). This violates our populational causal intuition and thus we cannot use the AR criterion alone.
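The converse situation can be sketched in the same way. The AR figure uses the counts from the original worked example. The variant is hypothetical (it is not the data of Table 5): the carriers are unchanged, but other causal factors are assumed to be common enough that the population contains 2000 cases in all.

```python
def attributable_risk(gene_caused_cases: int, carriers: int) -> float:
    """Share of gene carriers whose disease is caused by the gene."""
    return gene_caused_cases / carriers


# Worked example from the text: 760 gene-caused cases among 1000 carriers.
ar_original = attributable_risk(760, 1_000)

# Hypothetical variation (NOT the paper's Table 5): same carriers, but
# 2000 cases of disease in the population as a whole.
gene_caused_cases = 760
carriers = 1_000
total_cases = 2_000

ar_variant = attributable_risk(gene_caused_cases, carriers)
pef_variant = gene_caused_cases / total_cases

print(f"original AR: {ar_original:.0%}")
print(f"variant AR: {ar_variant:.0%}")
print(f"variant PEF: {pef_variant:.0%}")
```

Here the gene still makes most of its carriers ill, but it accounts for well under half of the disease in the population.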
We can now get down to the business of applying the epidemiological concepts. Given the epidemiological outlook, what exactly constitutes a genetic disease? Clearly, it must involve both PEF and AR, on pain of giving up one of our original intuitions. It remains an open issue, however, how strongly these should be interpreted. One obvious possible answer would be to say that a disease is genetic whenever it is “practically sufficient” within a given population[xvi]:
Practically Sufficient (PS): A disease is genetic whenever the gene’s Attributable Risk AND Population Etiologic Fraction are both 100%[xvii]. In plain English, this means that everyone with the gene has the disease because of that gene AND no one with the disease has the disease because of anything other than the gene.
This accords nicely with our intuitions that being a genetic disease has something important to do with sufficiency and necessity. On the other hand, PS does not make the mistake of claiming that the genes are either sufficient or necessary in the strict sense. Since epidemiological analysis is relativized explicitly to some population of interest, it is entirely possible that what is practically sufficient in one population will not be in another. Practical sufficiency is thus a modification of our ordinary notion of sufficiency to highly complex causal systems - it requires only that the gene(s) be necessary components of each set of sufficient conditions which can occur in the population. PS is thus a more workable empirical notion of causal necessity and sufficiency.
The main difficulty with PS is that, even though it is more practical and restricted than sufficiency and necessity in the strict sense, it is still too strong to apply to the vast majority of human diseases (this would likely be true even if we lowered the threshold value to 95%). Its value for our purposes is thus mainly to anchor the endpoint of epidemiological concepts of disease – it represents as strong a notion of causation as it is possible to generate using population-relative epidemiological analysis of complex causal systems.
However,
the goal is to develop a minimal notion of genetic disease. If we are to preserve our two intuitions,
any epidemiological concept must include both PEF and AR. But what should the threshold value for each
variable be? Clearly, it would have to
be at least 50%, as suggested by the use of the term “most” in our original
intuitions.
Minimally Epidemiological (ME): A disease is classified as genetic whenever both the Population Etiologic Fraction and the Attributable Risk exceed 50%.
If I
refuse to sanction labeling a disease as genetic unless both its PEF and its AR
exceed 50%, then I have ensured that my intuitions are met. Most people with the disease will have the
disease because they have the gene and most people with the gene will develop
the disease because of the gene. We
could, of course, require that the thresholds be set higher than 50%. While I have no principled objection to
this, it will be difficult to defend any number other than 50% as anything
other than arbitrary. Moreover, it will
be difficult enough to establish that current candidates for genetic disease
status are legitimate in a minimal sense.
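The two definitions can be expressed as a simple decision rule. What follows is a minimal sketch, not a definitive implementation: the function name and the ps_threshold parameter are my own inventions, the latter added only to allow for the weaker 95% reading of PS mentioned in the notes.

```python
def classify_genetic(pef: float, ar: float, ps_threshold: float = 1.0) -> str:
    """Classify a disease in a given population as 'PS', 'ME' or 'not genetic'.

    pef and ar are fractions between 0 and 1. ps_threshold is an assumed
    parameter: 1.0 reproduces the strict PS definition (both measures 100%),
    while 0.95 gives the relaxed variant.
    """
    if not (0.0 <= pef <= 1.0 and 0.0 <= ar <= 1.0):
        raise ValueError("pef and ar must be fractions between 0 and 1")
    if pef >= ps_threshold and ar >= ps_threshold:
        return "PS"           # practically sufficient (and therefore also ME)
    if pef > 0.5 and ar > 0.5:
        return "ME"           # minimally epidemiological
    return "not genetic"


print(classify_genetic(pef=760 / 1250, ar=760 / 1000))     # Table 3: ME
print(classify_genetic(pef=1000 / 1800, ar=1000 / 2150))   # Table 4: not genetic
print(classify_genetic(pef=710 / 2800, ar=710 / 1000))     # Table 5: not genetic
```

Note that the ME test uses a strict inequality, following the wording "exceed 50%", so a disease sitting exactly at 50% on either measure does not qualify.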
“The more you know, the more
you know you don't know.”
-Aristotle
Let’s
attempt to fix the endpoints of the epidemiological continuum as it appears in
practice. On one end of the continuum
we have diseases like obesity[xviii]. Obesity has increasingly been described as a
genetic condition, since several genes have recently been discovered that
regulate body weight, at least in mice.
In this case, we can safely say that genes are 1) causally involved, 2)
in principle manipulable and 3)
practically sufficient to induce obesity in certain populations of lab
mice. Even in the mouse populations,
however, we do not have the data we would need to claim that the condition is
genetic in the ME sense. One reason for
this, ironically, is that we do not know enough about healthy mice. That is, we do not know the prevalence of
the “obesity gene” among normal weight mice, and thus cannot accurately
calculate AR. We also do not know how
many diseased mice lack the obesity gene, and thus cannot calculate PEF
precisely either[xix].
Of course,
we could sample normal weight mice and diseased mice and at least
estimate these values in some rough sense, likely with enough accuracy to pass
judgment on the disease’s ME genetic status for populations of mice reared
under “normal” protocols (e.g., with diet and exercise held constant). However, in human populations, at least in
the affluent West, it seems highly unlikely that PEF for genes with respect to
obesity will exceed 50%. Although it is
certainly interesting to find that genes can induce obesity, such cases account for
only a small fraction of all obesity. Other factors like diet (which cannot be controlled well in
human populations, despite the best educational efforts of our medical
community) will have extremely high PEF values. Diet will therefore almost certainly be singled out as the
explanatory factor, contra genes, by any reasonable causal selection
scheme. According to the
epidemiological account, then, human obesity should not be classed as a genetic
disease - genes do seem to play a role here, even an important role, but they
are not the predominant cause of obesity.
At the
other extreme, a condition like Klinefelter’s syndrome seems defensibly genetic[xx]. Klinefelter’s is caused by the presence of
more than one copy of the X chromosome alongside a Y chromosome and results in
numerous problems with the development of sexual characteristics. Here, the genes seem Practically Sufficient
(PS) for the trait (AR and PEF of 100%).
Again, of course, we will have difficulty supporting this claim as strongly as we would like,
since we do not really know the relative numbers of people with the genetic
anomaly who fail to exhibit the condition or who exhibit the condition for
reasons other than their genes.
However, in the case of entire additional chromosomes and complex
symptomologies, we have very sound theoretical grounds to expect that very few
people indeed will fall in these categories.
At the very least, it seems an excellent bet that Klinefelter’s syndrome
will qualify as genetic in the ME sense, and most likely in the PS sense as
well.
Cystic
fibrosis (CF) is a much more complex case.
CF is associated with any of at least 300 different known mutations in
the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) gene. There is no clear relationship between the
severity of symptoms and specific mutations, and there are even cases where individuals
with a CFTR mutation do not have the disease (see Table 1). CF thus could not be classified as genetic
in the PS sense. Even on the ME
account, the case is not perfectly clear.
The PEF for the genes (collectively, though not for any single mutation)
is probably at or close to 100% - we have, at least arguably, decent data here
since it is not unusual for CF sufferers to be tested for the gene. Again, however, the AR is simply not known,
though we do know it is definitely < 100%, since we know of individuals with
the mutations who exhibit no symptoms of the disease. This certainly should counsel caution in describing CF as genetic
even in the ME sense. However, I would
argue that it still seems a good bet (though not an excellent one), given the
fundamental function of the CFTR protein in the cell, that individuals with
the mutations who manage to function normally are relatively rare (and thus
that AR is fairly high). But they need
not be terribly rare - unless these lucky individuals actually outnumber
those with the gene and the disease, CF will still qualify as genetic in the ME
sense (AR > 50%).
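How rare the asymptomatic carriers need to be can be made concrete with a small calculation. The sketch below rests on two simplifying assumptions of mine: that the PEF is effectively 100%, and that every symptomatic carrier has the disease because of the mutation. The counts are purely hypothetical and are not drawn from any CF data.

```python
def carrier_attributable_risk(symptomatic: int, asymptomatic: int) -> float:
    """AR among carriers, assuming every symptomatic carrier's disease is
    caused by the mutation (a simplifying assumption for illustration)."""
    if symptomatic < 0 or asymptomatic < 0 or symptomatic + asymptomatic == 0:
        raise ValueError("need non-negative counts and at least one carrier")
    return symptomatic / (symptomatic + asymptomatic)


SYMPTOMATIC = 1000  # hypothetical number of carriers with the disease

for asymptomatic in (50, 500, 999, 1000, 2000):
    ar = carrier_attributable_risk(SYMPTOMATIC, asymptomatic)
    print(f"{asymptomatic:>5} asymptomatic carriers: "
          f"AR = {ar:.3f}, exceeds 50%: {ar > 0.5}")
```

On these assumptions the ME threshold is met for any number of asymptomatic carriers smaller than the number of symptomatic ones; at exact parity AR is 50% and the strict threshold is no longer exceeded.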
The
epidemiological account is able to handle the complex and highly variable
causal systems we know to be operating in human disease, while also restricting
itself to questions of a decidable
empirical nature. However, this
advantageous arrangement does come at a price – as an inherently statistical
account, any epidemiological explanation of a disease must be explicitly
relativized to some particular population.
There is no guarantee that the explanation for a trait in one population
will hold true for other populations – indeed, there is often excellent reason
to think that it will not[xxi].
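A toy comparison may help to show what this population relativity amounts to. In the sketch below the same gene is assessed in two populations; all of the counts are hypothetical, invented solely for illustration, and the function name is my own.

```python
def me_genetic(caused_by_gene: int, carriers: int, diseased: int) -> bool:
    """ME test for one population: both PEF and AR must exceed 50%."""
    pef = caused_by_gene / diseased   # share of cases due to the gene
    ar = caused_by_gene / carriers    # share of carriers made ill by the gene
    return pef > 0.5 and ar > 0.5


# Hypothetical populations carrying the same gene in different environments.
populations = {
    "population A": dict(caused_by_gene=800, carriers=1000, diseased=1000),
    "population B": dict(caused_by_gene=300, carriers=1000, diseased=1200),
}

verdicts = {name: me_genetic(**counts) for name, counts in populations.items()}
print(verdicts)
```

In population A both measures are 80%, so the condition counts as genetic in the ME sense; in population B the PEF is 25% and the AR is 30%, so the very same gene fails the test.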
The
population relativity of epidemiological explanations leads to a number of
potential worries. For one thing, it is
not clear precisely how one is to choose the population in question. There can be no answer to this other than a
pragmatic one - the population chosen must be one of sufficient interest to
one’s audience. It is true that
sometimes this will result in a great many competing explanations for the same
trait. While this is certainly
imperfect, it is clearly preferable to the alternative of a misleading
uniformity.
In any
event, it is also true that we very often will have a clearly defined
population in mind when we create an explanation. The United States Department of Health and Human Services, for example, is primarily
concerned with the explanation of diseases within the U.S. population. Those still worried about relativism
creeping into our explanations can take solace from the fact that the
population in question must at least be cited explicitly and the rules for what
counts as adequate within any population are well-defined. The epidemiological account, therefore,
hardly constitutes an “anything goes” abandonment of objectivity.
It also might plausibly be objected that such an
account of explanation is completely at a loss when faced with the task of
explaining individual occurrences of a trait[xxii]. For example, if a patient insists on
knowing what caused his particular case of cancer, it does not help him
very much to cite the relative prevalence of causal factors within the larger
population of which he is a part. This
is indeed counter-intuitive, especially given the focus of modern medicine on
the care of individual patients.
However, before taking this criticism too far, we should consider
whether or not there really is an alternative.
Recall the
case of our causally omniscient alien physician: he knows everything there is
to know about the particular patient’s case, but nothing about the more general
population. If he really wanted to
answer the patient’s question about his particular case (other than by simply
indicating the entire causal matrix), he would have only two possible
routes. First, he could engage in
so-called counterfactual analysis, where one tries to make projections
about what would have happened in a particular case were the
circumstances different from what they actually are. This can become tortuous indeed – as when modal logicians speak of
an infinite series of possible worlds, similar in some but not
all respects to our own, in an attempt to make sense of the semantics
here. For example, we might well wonder
what life would be like in a possible world where Hitler postponed the invasion
of Soviet Russia until after the fall of Britain. When it comes to causal analysis of disease in individual cases
without any information from populations, it is hard to see how this is more
than raw speculation of a particularly fanciful kind.
The only
other option left to the alien doctor would be to explicitly import
populational data. In a population, but
not in an individual, there will be variation in causal factors. In a large enough population, one can find
almost any combination of relevant causal (risk) factors. By crunching these numbers, we can calculate
the actual risk of a particular disease, given any set of initial
conditions. In other words, populational
data would allow us to solve the problem - but only by cheating. We simply cannot say much, if anything,
about the relative importance of causal factors if we rely only on the
perspective of the individual patient – the causal selection problem requires
data from populations for its resolution.
To suggest, therefore, that reliance on populations is somehow a
difficulty is to imply endorsement of an impossible alternative. Without some, at least implicit, appeal to
population-level information such as which factors vary and in what way, there
are simply no grounds for causal selection, no matter what account one
favors.
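The contrast between the populational and the individual perspective can be illustrated with a short sketch. The factor labels and all of the records below are hypothetical, and the function is only one simple way of tabulating risk by combination of factors.

```python
from collections import defaultdict


def risk_by_profile(records):
    """Estimate disease risk for each combination of causal factors.

    records is an iterable of (profile, diseased) pairs, where profile is a
    tuple of factor values and diseased is True or False.
    """
    tallies = defaultdict(lambda: [0, 0])  # profile -> [diseased, total]
    for profile, diseased in records:
        tallies[profile][1] += 1
        if diseased:
            tallies[profile][0] += 1
    return {profile: sick / total for profile, (sick, total) in tallies.items()}


# Hypothetical population in which both factors vary.
population = (
    [(("gene", "exposed"), True)] * 30 + [(("gene", "exposed"), False)] * 10
    + [(("gene", "unexposed"), True)] * 10 + [(("gene", "unexposed"), False)] * 30
    + [(("no gene", "exposed"), True)] * 8 + [(("no gene", "exposed"), False)] * 32
    + [(("no gene", "unexposed"), True)] * 2 + [(("no gene", "unexposed"), False)] * 38
)

risks = risk_by_profile(population)
for profile, risk in sorted(risks.items()):
    print(profile, f"{risk:.2f}")

# A single patient supplies exactly one record and no variation at all.
lone_patient = risk_by_profile([(("gene", "exposed"), True)])
print(lone_patient)
```

With variation across the population, the risks for different combinations of factors can be compared. The lone patient yields a single profile with a "risk" of 1.0, which tells us nothing about the relative importance of the factors involved.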
Thus, an
account of causal selection which can explain individual cases is, strictly
speaking, impossible. Accounts other
than the epidemiological one still use populational information because they
must, but they sneak it in the back door without ever being clear about what
they are doing. The result is a fuzzy
and misleading analysis of the problem.
Seen in this light, the epidemiological requirement that the explanation
be relativized explicitly to a carefully delimited population is a virtue, not
a vice.
The concept of a genetic disease is neither
well-developed nor generally defensible as it is employed in the
literature. There are very few
attempts to make the criteria for causal selection explicit in general, and
almost none in the specific case of human disease. One must reason backwards from the kinds of claims one finds in
the literature to implicit notions of causal selection, but the notions thus
uncovered are not able to withstand the harsh light of critical scrutiny. Rather, they owe their survival to their
very ambiguity.
The
epidemiological account of genetic traits is an analysis of genetic disease
which is both practical and theoretically defensible. This account avoids many of the criticisms leveled against its
rivals, while still preserving a use of “genetic trait” which is useful and
informative. However, we must apply it
with great care. In particular, we must
always keep in mind that a great many traits, likely even a large majority,
will not meet any defensible criterion of genetic status. This is not a failure of the epidemiological
account, so much as an admission that the world of biological causation is far
too complex and varied to admit of simplistic categorization.
TABLE 1: PERMUTATIONS OF DIAGNOSTIC CRITERIA AND DISEASE IN CYSTIC FIBROSIS
| CLASSIC SYMPTOMS | SWEAT TEST | GENETIC TEST | FREQUENCY  |
| positive         | positive   | positive     | classic CF |
| positive         | positive   | negative     | occurs     |
| positive         | negative   | positive     | common     |
| positive         | negative   | negative     | common     |
| negative         | positive   | positive     | occurs     |
| negative         | positive   | negative     | occurs     |
| negative         | negative   | positive     | unknown    |
| negative         | negative   | negative     | health     |
Table 2
|                 | DISEASE | NO DISEASE | TOTAL  |
| DISEASE GENE    | 800     | 200        | 1,000  |
| NO DISEASE GENE | 450     | 8,550      | 9,000  |
| TOTAL           | 1,250   | 8,750      | 10,000 |
Table 3
|                                        | DISEASE | NO DISEASE | TOTAL  |
| Gene present, causes disease           | 760     | 0          | 760    |
| Gene present, but not cause of disease | 40      | 200        | 240    |
| NO DISEASE GENE                        | 450     | 8,550      | 9,000  |
| TOTAL                                  | 1,250   | 8,750      | 10,000 |
Simple Etiologic Fraction = 760/800 = 95%
Population Etiologic Fraction = 760/1,250 = 61%
Attributable Risk = 760/1,000 = 76%
Table 4
|                                        | DISEASE | NO DISEASE | TOTAL  |
| Gene present, causes disease           | 1,000   | 0          | 1,000  |
| Gene present, but not cause of disease | 100     | 1,050      | 1,150  |
| NO DISEASE GENE                        | 700     | 7,150      | 7,850  |
| TOTAL                                  | 1,800   | 8,200      | 10,000 |
Simple Etiologic Fraction = 1,000/1,100 = 91%
Population Etiologic Fraction = 1,000/1,800 = 56%
Attributable Risk = 1,000/2,150 = 47%
Table 5
|                                        | DISEASE | NO DISEASE | TOTAL  |
| Gene present, causes disease           | 710     | 0          | 710    |
| Gene present, but not cause of disease | 90      | 200        | 290    |
| NO DISEASE GENE                        | 2,000   | 7,000      | 9,000  |
| TOTAL                                  | 2,800   | 7,200      | 10,000 |
Simple Etiologic Fraction = 710/800 = 89%
Population Etiologic Fraction = 710/2,800 = 25%
Attributable Risk = 710/1,000 = 71%
References
Brandon, R.N., et al. (1994) “Sober on Brandon on Screening-Off and the Levels of Selection” in Philosophy of Science, 61(3):475-486.
Burian, R. (1981-2) “Human Sociobiology and Genetic Determinism” in The Philosophy Forum, XIII(2-3):43-66.
Collingwood, R.G. (1938) “On the So-called Idea of Causation” in Proceedings of the Aristotelian Society, pp. 85-112.
Ducasse, C.J. (1924) “Causation and the Types of Necessity” in University of Washington Publications in the Social Sciences, 1(2):69-200.
Gardenfors, P. (1980) “A Pragmatic Theory of Explanation” in Philosophy of Science, 47:404-23.
Gifford, F. (1990) “Genetic Traits” in Biology and Philosophy, 5(3):327-347.
Hart, H.L.A. and A.M. Honore (1959) Causation in the Law, Oxford, Oxford University Press.
Hesslow, G. (1983) “Explaining Differences and Weighting Causes” in Theoria, 49:87-111.
__________ (1984) “What is a Genetic Disease? On the Relative Importance of Causes” in L. Nordenfelt and B.I.B. Lindahl (eds.) Health, Disease, and Causal Explanations in Medicine, Dordrecht, Reidel.
Hilton, D.J. (1988) “The Problem of Causal Selection” in Contemporary Science and Natural Explanation: Commonsense Conceptions of Causality, D.J. Hilton (ed.), New York, NYU Press.
Mackie, J.L. (1965) “Causes and Conditions” in American Philosophical Quarterly, 2(4):245-64.
__________ (1974) The Cement of the Universe, Oxford, Clarendon Press.
Martin, R. (1978) “Judgments of Contributory Causes and Objectivity” in Philosophy of Social Science, 8:173-86.
Mill, J.S. (1859) A System of Logic, London, Longmans (reprinted 1961).
Nagel, E. (1961) The Structure of Science: Problems in the Logic of Scientific Explanation, New York, HBJ.
Rothman, K.J. (1976) “Causes” in American Journal of Epidemiology, 104:587-592.
Ryle, G. (1949) The Concept of Mind, London, Hutchinson.
Salmon, W. (1984) Scientific Explanation and the Causal Structure of the World, Princeton, Princeton University Press.
Smith, K.C. (1992) “The New Problem of Genetics: A Response to Gifford” in Biology and Philosophy, 7:331-348.
__________ (1998) “Equivocal Notions of Accuracy and Genetic Screening of the General Population” in The Mount Sinai Journal of Medicine, 65(3):178-83.
Sterelny, K., K.C. Smith and M. Dickison (1996) “The Extended Replicator” in Biology and Philosophy, 11:377-403.
Van Fraassen, B.C. (1980) The Scientific Image, Oxford, Oxford University Press.
Wulff, H.R. (1984) “The Causal Basis of the Current Disease Classification” in L. Nordenfelt and B.I.B. Lindahl (eds.) Health, Disease, and Causal Explanations in Medicine, Dordrecht, Reidel.
[i] Indeed, one common
complaint about the holistic approach to causal explanation is that it becomes
very difficult to limit the number of causal factors one is forced to consider,
and a kind of “galloping holism” threatens (Sterelny et al., 1996). In the extreme, for example, one can make a
case that all factors within the light cone of a particular outcome may have to
be included in a truly complete causal explanation.
[ii] I revert to
well-characterized situations like barn fires when it is important to minimize
the intuitive impact of our causal ignorance.
One could in principle construct the same sort of example with a
disease, but our ignorance of the causal factors and their importance will tend
to lead the discussion away from causal selection and towards causal connection
issues.
[iii] Mackie (1965, 1974)
develops a precise vocabulary to discuss these relationships. The application of fire to hay would be seen
as an INUS factor: an
Insufficient but Necessary part of a
set of conditions which is Unnecessary but Sufficient for the effect. A set of INUS factors, which is jointly
sufficient to bring about an effect, is called a Minimally Sufficient Condition
(MSC).
[iv] Wulff (1984) offers an
interesting discussion of the evolution of disease classification away from
symptomology and towards causal agency.
It should be clear, however, that if we define a disease in terms of the
presence of some particular causal factor, then we have already chosen a
particular way to answer the causal selection question. This may work very well for, say, infectious
diseases. However, the coming genetic
revolution is likely to reveal major flaws in this approach as more and more
asymptomatic people with “disease genes” are discovered.
[v] To further complicate matters, questions of genetic disease are
tied to misleading and inappropriate conceptions of the nature and significance
of genes – though this is not often recognized (see Smith 1999).
[vi] Even in those cases where
the researchers are carefully circumspect, reporters covering their work
typically are not.
[vii] It may be a bit unfair to
show Koch as requiring strict sufficiency and necessity through the use
of “always” and “never” in the postulates.
Whatever the historical accuracy of this move, it serves to illustrate
the poverty of such an approach when applied to complex diseases. At a minimum, Koch’s approach would have to
be supplemented with another technique (which he does not provide) on pain of
leaving the causal selection to the whim of the researcher.
[viii] Indeed, exposure to the TB bacillus used to be essentially universal before modern public health measures were instituted in the Western world. Today, exposure to the bacillus tends to be identified as the cause of TB, but in the 1940’s it would be more accurate to identify differences in immunity rather than exposure as the culprit (Stern 1973).
[ix] We have to be very careful
not to overestimate the strength of the correlation. It is true that all individuals with Huntington’s Chorea, for
example, also have the characteristic genetic anomaly. It does not follow from this, however, that
all individuals with the genetic anomaly have Huntington’s. We simply do not know how many asymptomatic
people have the gene in question. What
is more, we will not find out with typical testing procedures, which are
directed only at those who are sick.
[x] These include abnormality
analysis (Hart & Honore 1959, Hilton 1988), unexpected conditions
(Gardenfors 1980, van Fraassen 1980), precipitating causes (Ducasse 1924, Ryle
1949), dispositional conditions (Nagel, 1961, Martin 1978), and instrumental
efficacy (Collingwood 1938). One of the
most promising approaches is to develop a notion of limited or practical
sufficiency (Gifford 1990, Wulff 1984b) which I attempt to incorporate in an
explicit fashion in my own epidemiological account.
[xi] Worse, there has been at
least one unanticipated death which seems to have resulted from gene therapy.
[xii] Actually, this is a bit
more complex than it seems here, but I will discuss this in the specific cases
later.
[xiii] We must specify an explicit
list of causal factors among which we will perform our selection. Should it turn out later that we were
unaware of some important causal factor, for example, we will have to redo the
analysis.
[xiv] In fact, the reference
population must have at least two subpopulations differing in the trait
in question. This is because the
phenomenon actually being explained is the variance in the trait between
the two subpopulations, not the trait simpliciter.
[xv] I am indebted to Henrik
Wulff for much of the insight behind this categorization and for greatly
improving the precision of the numerical analysis which supports it.
[xvi] See also Gifford 1990 and
Wulff 1984b for similar attempts to develop a notion of practical sufficiency.
[xvii] Alternately, one might
require both numbers to exceed some very high figure less than 100%, say 95%,
as we do when calculating statistical significance. This would provide a very strong notion of genetic causation
without ruling out every possible exception.
[xviii] Of course, it is an open
question as to whether a condition like obesity should really qualify as a
disease. It is not my intent to offer a
general account of disease, so I will simply assume for the purposes of
argument here that it does.
[xix] In fact, this sort of
situation poses a major problem for any theoretically adequate notion of
disease causation. Studies to determine
the actual incidence of disease genes in healthy populations will be extremely
expensive and may not be available for many years to come.
[xx] It has been suggested
that perhaps Klinefelter’s is not a genetic disease at all, since it is not
heritable. People’s intuitions seem to
differ widely on the importance of this point and, in any event, my purpose is
to develop a normative account rather than a descriptive one. Unless it can be shown that some particular
environmental factor routinely accounts for the genetic changes, the
epidemiological account will still describe the condition as genetic.
[xxi] In fact, this will be quite
common with some traits: lactose intolerance is a genetic trait for most
populations in Western Europe and the U.S. (where consumption of milk products
is common), yet it is clearly environmental in many Asian countries. See also
Burian (1981-2) and Smith (1992) for a discussion of population-relativity in
phenylketonuria, as well as Stern’s (1973) analysis of tuberculosis.
[xxii] This is because there is no
variation within an individual with respect to the causal factors to be
analyzed.