The Analyst's Cookbook: August 2009

In thinking about what this blog is for I've drifted more and more into considering questions about the production of new knowledge (then again, this could just be the ultimate in yak-shaving). How does one generate knowledge from data? What are the bounds on the validity of this knowledge? To what extent is it valid to infer causal relationships linking related data?

Many systems for generating new knowledge exist: formal and informal, specialist and general, tacit and codified. However, the scientific process is one of the best documented, has at least a significant overlap with the process of information analysis, and is highly influential in today's world. I'm therefore going to look at it first, and in depth.

The process of the scientific method has been quite a battleground in recent decades, as CP Snow's two cultures express their misunderstanding of and disdain for each other. The scientific method in theory, and to some extent in practice, relies on Karl Popper's The Logic of Scientific Discovery, somewhat modified by later contributors.

The approach I'm going to take is to examine in detail Popper's philosophy of scientific knowledge, then the significant modifications and other contributions, then finally explore the current situation and relevance to information analysis. All chapter, section and page references are to the 2002 Routledge Classics edition, written by Popper in 1935 and updated by him up to 1980.

So then... Chapter 1, A Survey of Some Fundamental Problems

This first chapter serves as an overview of the ground that Popper covers in much more detail later on.

1. The Problem of Induction

Popper firmly rules knowledge generation by inductive processes out of his scientific logic. His reasoning is that he can't see a way to justify the process of induction except by using induction (leading to an unsatisfying infinite regress) or by just assuming that it is valid, which he can't justify. So... no induction.

2. Elimination of Psychologism

The process of formulating hypotheses is ruled out of the logic of the scientific process (though it may be of some interest psychologically). Science is there to deductively test the hypotheses that are generated. Hypotheses can be generated any way you please: creative intuition, some irrational inspiration or perhaps (whisper it) by contemplating data inductively.

According to Popper, hypothesis generation is pre-match warm-up and not subject to the rules of the scientific game. The game itself is concerned with rationally testing and eliminating these hypotheses.

3. Deductive Testing of Theories

The game, then, is to deductively test the theories that we have (somehow) generated. This can proceed along four lines: a) is the theory self-consistent; b) is it an empirical theory, i.e. can it make any statements about the world; c) does it repeat, add to, or contradict existing theories; and d) does it produce conclusions that contradict experimental evidence?

If the theory passes the first two tests then it is in the game and Popper's business is to provide a deductive method by which we can examine it in the light of other existing theories and against the evidence.

Here we encounter the key point that exposure to experimental evidence can never verify a theory; it can only (by contradiction) falsify it. No scientific theory can be proved right: a theory can only be more and more highly corroborated by the accumulation of supporting evidence.
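Put as bare deductive logic, the asymmetry looks like this. This is a sketch in my own shorthand rather than Popper's notation: t stands for a theory and p for a prediction deduced from it. Refuting the prediction refutes the theory (modus tollens, a valid inference), whereas confirming the prediction does not establish the theory (affirming the consequent, an invalid one):

```latex
\[
\frac{t \rightarrow p \qquad \neg p}{\therefore\ \neg t}
\quad \text{(valid: modus tollens)}
\qquad\qquad
\frac{t \rightarrow p \qquad p}{\therefore\ t}
\quad \text{(invalid: affirming the consequent)}
\]
```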

4. The Problem of Demarcation

Demarcation is the boundary between metaphysical systems of knowledge (i.e. those which are knowable by pure thought - maths, logic and the like) and scientific systems of knowledge. The Problem of Demarcation that Popper addresses is how to define this boundary in the absence of induction.

Popper surveys and dismisses a range of inductivist approaches to the problem before stating that the structure of his own solution is one of convention and agreement. It is a method, rather than a principle. It is agreed upon rather than derived or proved. And we use it because it works.

5. Experience as a Method

Popper starts to assemble his method: it must be non-contradictory, representing a possible world of experience; it must not cross over into metaphysics; and it must represent our world of experience.

We do this by, yes, deductive testing of theories against empirical evidence formed from our experience of the world.

6. Falsifiability as a Criterion of Demarcation

How do we judge our hypotheses against the evidence? Popper advocates using falsifiability. We cannot, even in principle, verify our hypotheses but we can prove them false: "all swans are white" remains ever-vulnerable to the discovery of a black swan in some dry land... a discovery which immediately falsifies it.
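For the programmers in the audience, the swan example can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names, the shape of the sighting records and the sightings themselves are all made up for the purpose, and nothing here comes from Popper. The point is the asymmetry: any number of white swans leaves the hypothesis merely "corroborated", while a single black one settles the matter.

```python
from typing import Callable, Iterable, Optional

Swan = dict  # hypothetical record shape, e.g. {"colour": "white", "place": "Thames"}


def all_swans_are_white(swan: Swan) -> bool:
    """The universal hypothesis, expressed as a check on one observation."""
    return swan["colour"] == "white"


def first_counterexample(
    hypothesis: Callable[[Swan], bool], observations: Iterable[Swan]
) -> Optional[Swan]:
    """Return the first observation that contradicts the hypothesis, if any."""
    for obs in observations:
        if not hypothesis(obs):
            return obs
    return None


def status(hypothesis: Callable[[Swan], bool], observations: Iterable[Swan]) -> str:
    """A hypothesis is either falsified or (so far) corroborated, never verified."""
    if first_counterexample(hypothesis, observations) is not None:
        return "falsified"
    return "corroborated so far"


# Made-up sightings, purely for illustration.
european_sightings = [{"colour": "white", "place": "Thames"}] * 1000
world_sightings = european_sightings + [{"colour": "black", "place": "Swan River"}]

print(status(all_swans_are_white, european_sightings))
print(status(all_swans_are_white, world_sightings))
```

Notice that there is no branch that returns "verified": a thousand agreeing sightings and a million agreeing sightings give the same answer.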

But there are loopholes. We could try to preserve our exclusive white swan hypothesis by introducing auxiliary hypotheses ("There exists a disease which in some circumstances causes naturally white swans to turn black"), changing definitions ("of course I was referring to European swans"), or refusing to recognise the evidence ("I see no swan"). Some variations on these themes are probably familiar from real-world debates about matters of fact.

This is where the methodological aspects of Popper's theory come in: essentially, part of his definition of science is a gentleman's agreement not to indulge in this kind of malarkey.

7. The Problem of the Empirical Basis

So we can generate theories, and falsify them by comparing against empirical statements of fact... but from where do we get our statements of fact? They would seem to arise, somehow, from our perceptual experience of the world but how can we put them on a firm footing? This is the Problem of the Empirical Basis, and is addressed in the 5th chapter.

8. Scientific Objectivity and Subjective Conviction

Where then does all this leave scientific objectivity? We can't know anything for sure, and we can only dismiss proposed theories if we are convinced that people are following the methodological rules.

Popper hangs scientific objectivity on inter-subjectively testable (i.e. falsifiable) theories. Science is what can be repeated between different scientists, with "scientist" defined as someone who follows the scientific method and so rules out the various forms of shenanigans.

What role do our convictions of the rightness of a theory play? None directly, which comes as no surprise to a modern ear - according to this method there is no corroboration to be had for a theory by saying "I believe it". I get that this wasn't necessarily the case at that point in the great conversation.

Chapter 1 Summary

So then, that's Popper's method in a nutshell. This is a brave piece of work.

It was surprising to me just how assembled the foundations of the scientific method are. Popper just keeps on trucking through a blizzard of problems. He rejects half of inference at the get-go and runs with deductive logic alone. We lose the verification of theories: all we know is what we don't know. We reach for objectivity from a basis of subjective experiences, and - by necessity - embrace methodological rules to assemble the results allowing us to compare our theories against the world. The surprise, and the delight, is that after all of this we still find something that works.

To quote from later in the book:

"Science does not rest upon solid bedrock. The bold structure of its theories rises, as it were, above a swamp. It is like a building erected on piles. The piles are driven down from above into the swamp, but not down to any natural or 'given' base; and when we cease our attempts to drive our piles into a deeper layer, it is not because we have reached firm ground. We simply stop when we are satisfied that they are firm enough to carry the structure, at least for the time being." - Karl Popper

Tuesday, 25 August 2009

The logic of scientific discovery, Part 1
