Reliability and Validity of the Personal Style Survey
Ian Tibbles MA (Cantab.), MSc (London)


Introduction

Common questions from potential licensees and from participants completing a Personal Style Survey are “How reliable is this survey?” “Is it relevant to me at work (or at home)?” and “Are the results accurate?” These are important questions and must be answered clearly if the results of the survey are to have any credibility for that person. The difficulty is that the questions can be answered in a number of different ways depending on the perspective of the questioner and their degree of understanding of the issues of reliability and validity in the design of behavioural surveys. This note seeks to give licensees a framework for dealing with both the technical and non-technical questioner.


What does “Psychometric” Mean?

This term is widely used to describe ability, aptitude, behavioural and personality surveys and questionnaires. Literally “metric” means measure and “psycho” means mind – its dictionary definition is “the science of measuring mental capacities and processes”. This is done through the collection and interpretation of survey data. The Personal Style Survey was designed using psychometric principles of survey construction.


The Difference Between a Test and a Survey

A test is designed to measure some aspect of ability, aptitude, personality or motivation against a pre-determined standard. Potentially it can be threatening to the participant, as there is inevitably a sense of pass or fail in the analysis. It is therefore important that the use of tests is demonstrated to be objective, fair and appropriate. Tests of personality, for example, commonly have measures of:

• faking good
• faking bad and
• consistency

to ensure the results are not distorted.

The benefits of this process are that it is objective (as far as possible) and usually rigorous. The potential disadvantages are that it is threatening and can be a mystery to the participant who is trying to understand how the results were arrived at.


The Personal Style Surveys are not tests and should never be described or used as such. Everybody scores 100%. Each survey simply seeks to measure how the person completing it prefers to behave when things are going well (favourable conditions) and when they are experiencing stress or conflict (unfavourable conditions). The surveys are not situation specific and are not a predictor of effective or ineffective behaviour – each person’s profile is capable of being effective or ineffective depending on their understanding and management of their behavioural strengths and potential weaknesses. Nevertheless the results can be very powerful, giving people insights into how to:

• make more of their strengths
• make more effective use of the strengths of others
• minimise potentially inappropriate or ineffective behaviour and
• get on well with people who are not like them

Ensuring the Reliability and Validity of the Personal Style Survey Findings

The Personal Style Survey is constructed as a “forced choice ranking” of four different endings to each statement. The process of forcing the person completing the survey to choose between 4 behaviours quickly is designed to access the individual’s sub-conscious self-understanding and to bring it into conscious understanding through feedback and discussion of the survey results.

Because the process is non-threatening it is possible to openly discuss and confirm the survey findings with the client – “Does this feel or sound accurate to them?” The licensee can encourage them to discuss and validate the findings with friends and colleagues. It is important to ensure that they choose someone who they trust to know them and to have a constructive opinion to offer. If necessary, they should be allowed to modify the findings to create a “best fit” profile of their behaviour.

However, some aspects of traditional reliability and validity measures are helpful. Below is a description of the measures and how they relate to the Personal Style Survey.


Reliability
Survey scores vary from one measurement to another. A range of factors may cause this:
• differing degrees of effort
• variations in attention levels
• administration
• health
• circumstances etc.
The precision or consistency of measurement displayed by a survey is referred to as its reliability. It is normally expressed in terms of a statistic – the correlation coefficient, often referred to as the reliability coefficient.

The three most common types of reliability measure are:

  1. Test – retest. This compares the results of the same survey being completed by the same candidate at different points in time.
  2. Alternate form (for example, the Personal Style Survey – Version Two). This compares the results of two or more forms of the same survey completed by the same group of subjects.
  3. Internal consistency. This measures the performance of all items (questions) in a survey, either by comparing the two halves of the survey (the split-half technique) or by using the Kuder-Richardson reliability coefficient (the mean of all split-half coefficients).
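Both the test – retest and the split-half measures come down to a product-moment correlation between two sets of scores. The Python sketch below illustrates this under stated assumptions: the function names and all scores are hypothetical, the odd/even split is only one possible way of halving a survey, and the Spearman-Brown adjustment is a standard addition that this note does not itself mention.

```python
from statistics import mean


def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def retest_reliability(first, second):
    """Test-retest: correlate each person's first and second score."""
    return pearson(first, second)


def split_half_reliability(item_scores):
    """Split-half: correlate odd-item totals with even-item totals.

    item_scores holds one list of item scores per respondent.
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    return pearson(odd_totals, even_totals)


def spearman_brown(half_r):
    """Estimate full-length reliability from a half-length correlation
    (assumption: a standard adjustment, not taken from this note)."""
    return 2 * half_r / (1 + half_r)


# Hypothetical scale scores for five people on two occasions.
first_scores = [20, 25, 30, 18, 27]
second_scores = [22, 24, 29, 20, 25]
print(round(retest_reliability(first_scores, second_scores), 2))
```

The same `pearson` helper would serve for an alternate-form study, with the two lists holding each person's scores on the two forms.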

Reliability coefficients are usually expressed as a number between 0 and 1.0. A coefficient of 0.2 would suggest a much lower level of reliability than a coefficient of 0.6. It should be borne in mind, however, that such a comparison is not strictly accurate, as each figure is only an estimate based on a particular group of people. It is not just the statistic but the quality of the study from which it was derived which needs to be understood.


It is important for any survey to measure consistently and with a reasonable degree of accuracy. The reliability coefficient for the Personal Style Survey was derived using Cronbach’s coefficient alpha and is reported below from an analysis by Dr Allan Katcher (co-developer of the Life Orientations® Method) for the eight scales:

Orientations               Favourable   Unfavourable
Supporting/Giving-in       0.54         0.54
Controlling/Taking-over    0.70         0.61
Conserving/Holding-on      0.63         0.46
Adapting/Dealing-away      0.61         0.37
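For readers who want to see how a coefficient alpha of this kind is calculated, the Python sketch below applies the usual formula (the number of items, the sum of the item variances and the variance of the total score) to a small table of scores. The function name and the data are hypothetical; this is not the analysis or the data behind the figures above.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's coefficient alpha for a respondents-by-items table.

    item_scores holds one list per respondent and one score per item.
    """
    k = len(item_scores[0])                # number of items in the scale
    columns = list(zip(*item_scores))      # regroup the scores by item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical scores: five respondents, three items on one scale.
responses = [
    [4, 3, 4],
    [2, 2, 1],
    [3, 3, 4],
    [1, 2, 2],
    [4, 4, 3],
]
print(round(cronbach_alpha(responses), 2))
```

Alpha rises as the items move together across respondents, which is why it is read as a measure of internal consistency.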


Test/retest study – By Dr Allan Katcher
The stability of test results over time is usually reported as part of the data on the performance of any psychological instrument. Test/retest data has a less clear meaning with regard to test reliability than internal consistency data, because it cannot be determined whether the person has changed over time, has reported him or herself from two different standpoints (not test-related) or whether the survey evokes different kinds of reporting at different times. There is also the attenuation problem: on the second completion of the survey, it is no longer really new, even though in the study reported below no meaning was attached to the test between the first and second administration. Still, all in all, one should expect some amount of stability if the test measures salient variables, though apparent shortcomings are very hard to interpret.

The Personal Style Survey was administered to 63 graduate students and then re-administered after five weeks. The subjects were not given their scores or any information about the meaning of the survey until after the second administration. The simple product-moment correlations are as follows:


Orientations               Favourable   Unfavourable
Supporting/Giving-in       0.49         0.53
Controlling/Taking-over    0.61         0.57
Conserving/Holding-on      0.62         0.60
Adapting/Dealing-away      0.69         0.39

It is of interest to see whether the Life Orientations® method style descriptions change from one administration to the next. Each pair of test profiles was analysed to note whether the basic descriptions changed. The results of this analysis are as follows:

No change (favourable)          38 of 63   60%
No change (unfavourable)        31 of 63   49%
No change (considering both)    19 of 63   30%

Even though 30% of those tested showed virtually identical scores on both administrations, it was suspected that those who showed a clearly predominant style preference would be less likely to change; that is, if the test really measures some genotype variables. Again, the test was considered in two parts, the "favourable" style and "unfavourable" style. 21 subjects showed a predominant style choice (5 points more than any other score) on the "favourable" scales and of those, 14, or 67%, showed the same style preference on the second administration. 20 subjects showed a predominant "unfavourable" style with 16, or 80%, showing no change on the second taking.


These same data were also examined to pick out those subjects who had clear "favourable" and "unfavourable" styles that were the same, another gross measure of strength of preference. Of the 27 who showed such a pattern on the original administration 17, or 63%, showed no change with the second administration. The expectation that those who have clear style preferences are less likely to change over time is strongly supported.
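The stability analysis above rests on two simple steps: deciding whether a profile has a predominant style, and counting how many people keep the same style at retest. The Python sketch below shows one way to express both. The function names and the two example profiles are hypothetical, and reading "5 points more than any other score" as "at least 5 points" is an assumption; only the counts in `reported` come from the study.

```python
def predominant_style(scores, margin=5):
    """Return the style that leads every other style by at least
    `margin` points, or None when no style is clearly predominant."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (top_style, top_score), (_, runner_up_score) = ranked[0], ranked[1]
    return top_style if top_score - runner_up_score >= margin else None


def percent(part, whole):
    """Whole-number percentage, in the form quoted in the study."""
    return round(100 * part / whole)


# Hypothetical favourable-condition profiles for one person.
first = {"Supporting/Giving-in": 30, "Controlling/Taking-over": 22,
         "Conserving/Holding-on": 20, "Adapting/Dealing-away": 18}
second = {"Supporting/Giving-in": 26, "Controlling/Taking-over": 24,
          "Conserving/Holding-on": 21, "Adapting/Dealing-away": 19}
print(predominant_style(first), predominant_style(second))

# Counts reported in the study: (unchanged, group size).
reported = {
    "no change (favourable)": (38, 63),
    "no change (unfavourable)": (31, 63),
    "no change (considering both)": (19, 63),
    "predominant favourable style": (14, 21),
    "predominant unfavourable style": (16, 20),
    "same clear style in both conditions": (17, 27),
}
for label, (same, total) in reported.items():
    print(f"{label}: {same} of {total} = {percent(same, total)}%")
```

Recomputing the percentages in this way confirms that the quoted figures are consistent with the reported counts.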

Overall, it is evident that the Personal Style Survey measures pretty much the same thing in people over time though, as stated earlier, the interpretation of less than perfect stability is difficult. Some anecdotal evidence suggests that changes in scores could be due to subjects focusing on different parts of their lives as they took the test at different times, or that they could respond differently according to mood. One person reported some progress in his personal therapy between the first and second administrations, and felt the second test results reflected more what he was going after and the first a rather pessimistic view of himself. But this sort of evidence only adds to the confidence in the survey’s reliability and usefulness.

Key Points on the Reliability of the Personal Style Survey

In demonstrating why the survey should be considered to be reliable it is important to make the following points:

  • track record – the survey has been in use internationally in all the major developed countries for over 25 years
  • our experience of using the survey, combined with data from our licensees, is used to constantly improve the product range
  • translations into other languages are carefully checked by experienced survey developers from each country for accuracy in terms of the culture and linguistic nuances – rather than just literally translated
  • over 8 million people have completed the survey
  • the model is based on well respected and soundly based psychological theories:

    - Erich Fromm in Man For Himself
        - the strength/weakness paradox
        - 4 behavioural orientations

    - Carl Rogers, the founding father of client centred therapy
        - client centred development
        - communication congruency

  • the standard statistical measure of reliability often quoted is a correlation coefficient of 0.7 or above. However, this measure is relevant for psychometric tests, which are often used in isolation from other data. The Personal Style Survey is not a test and its structure can be easily explained, so its results can be checked and explored openly and fully with the participant. Therefore a lower measure of statistical reliability, 0.4 – 0.6, is perfectly acceptable
  • conclusions are easily understood by the participants and (because the process is non-threatening) can be openly checked against previous scores, with reasons for differences explored jointly to establish confidence in the findings.

Future Developments
The technically minded will be aware that the transparent construction of the survey limits its performance in test/retest. Once the survey has been completed, completing the same survey at a later date can allow some unconscious manipulation of data: if the individual has had feedback on their profile (unlike the study described above) they may answer on the second occasion as they think they should. Licensees may not be aware that we already have a Personal Style Survey – Version Two for use with individuals who wish to assess how their behaviours may have changed. During 1998 we will be making available for the first time a range of surveys where the sequence of the answers has been randomised. We will notify licensees in the quarterly newsletter when they are available to purchase.

The Relationship Between Reliability and Validity
Reliability is important because of its relationship to the validity of the survey. Whilst reliability is about the measurement, validity is about the relevance and usefulness of what is measured. It is possible for a survey to be reliable, i.e. to measure the same thing consistently and with precision, and yet for what it measures to be of no use or invalid for a given purpose. For example, knowledge of a person’s behavioural preferences is not a valid measure of their intellectual ability (the Personal Style Survey does not measure this). However, it is not possible for survey results to be valid if the data is not reliable.
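The dependence of validity on reliability can be made concrete with a standard result from classical test theory, which this note does not itself state. The symbols are assumptions introduced here for illustration: r_xy is the correlation between survey scores and some criterion, and r_xx and r_yy are the reliabilities of the survey and of the criterion measure.

```latex
r_{xy} \le \sqrt{r_{xx}\, r_{yy}}
```

As a hypothetical illustration, a scale with a reliability of 0.49 set against a perfectly reliable criterion could show a validity coefficient of at most 0.7, the square root of 0.49. Low reliability therefore places a ceiling on the validity that any study can demonstrate.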

Validity Measures
We shall distinguish three types of non-technical 'validity' which in a sense could be argued not to be validity at all:

• face validity
• content-analytic validity
• faith validity

And four main types of technical validity:

• content validity
• construct validity
• concurrent validity
• predictive validity


Face validity
Face validity is concerned with whether an instrument appears to measure what it was designed to measure. Whilst face validity has no technical or statistical basis, it must not be overlooked if a survey is to be accepted by participants or (psychometrically) untrained managerial staff.


Content-analytic validity
One sometimes hears test users speak of content-analytic validity where the item content of a test has been analysed and related subjectively to abilities that are of assumed importance in the job. As an illustration, the argument might go:

  1. We wish to select a good salesman.
  2. This survey asks questions about selling.
  3. Therefore this is a valid survey.

This is often what untrained people call validity but it has obvious flaws in failing to define what the specific characteristics of a good salesman are and how the survey will measure these.

Faith validity
This is often the most difficult to deal with. It is a belief in the validity of an instrument without any objective data to back it up, and the evidence is not wanted!

The more empirically based concepts are:

Content validity
This is mainly in relation to attainment tests e.g. a spelling test containing only the names of politicians in America would be a poor test of general spelling in the United Kingdom. High content validity should always be checked with one of the empirical methods of validation described below when using any survey as a test.

Construct validity
Construct validity is more abstract than the other forms of validity and is the extent to which a test measures some theoretical construct or trait. Such constructs might be mechanical, verbal or spatial ability, emotional stability or intelligence. Building up a picture of the construct validity of a test can be a long process and involves any information that throws some light on the nature of the construct under investigation. Factor analysis is the complex statistical technique often met in construct validation; it goes beyond the simple visual inspection of inter-correlations between different tests.

Other information, which can lead to an understanding of the construct validity of a test, includes internal consistency and the effect of experimentally controlled variables and also variables such as age, sex and culture on test scores.

Concurrent validity
Concurrent validity is the relationship between test scores and some criterion of performance obtained at the same time. Thus, if we were to test a group of computer programmers and correlate the results with supervisors' ratings of work performance, we would have undertaken a concurrent validity study.
Where we wish to know the current status of an individual, concurrent validity is the most appropriate form of validity. Some organisations, for example, use attainment tests of job knowledge at the end of training courses or in making decisions on staff promotion. However, although a test may be of high concurrent validity it does not necessarily mean that it will be useful in predicting later performance.

Predictive validity
This is the extent to which a test predicts some future outcome or criterion. This is of crucial importance in personnel selection and placement. Two difficulties in relation to this form of study are:

  1. The timescales for undertaking studies are often lengthy, reducing the practical use of the findings.
  2. Results can be distorted by the tests themselves; for example, measuring whether individuals assessed as high flyers achieved their potential can produce false results. Success may be partly a function of having been identified by tests as having potential, which in itself enhances a person's prospects, rather than the potential identified by testing being validated by actual performance leading to career progression.

Statistical benchmarks for validity studies are set at much lower levels than for reliability: usually a correlation of between 0.2 and 0.3, as opposed to 0.6 to 0.7 for reliability, reflecting the difficulty of achieving secure findings in validity studies.

The Personal Style Survey and Validity
Of the non-empirical measures only face validity has any relevance – the other non-empirical measures are seriously flawed and therefore inapplicable.


The whole range of Personal Style Surveys has very high face validity according to feedback received from licensees and course participants over many years. The reasons for this are:

  1. The transparency of the analysis – clients can see how the results are derived.
  2. The “deceptive” simplicity of the model – it is easily understood by participants yet also produces powerful insights into their behavioural strategies.
  3. Comparison of the feedback with self-perception – the forced choice ranking accesses sub-conscious self-understanding and brings it into conscious analysis, giving the client more choices to consider.
  4. The ability to cross reference the survey findings with the views of others who know the individual (either in discussion or from analysis of the results of the Personal Style Feedback Survey).

Face validity is important for both the user and the administrator of the survey to have confidence in the appropriateness of the instrument in individual, team or organisational development.


Faith and Content-Analytic validity are unsound measures and should be discounted.
The empirical measures all presuppose some form of testing as they all require some form of standard to measure the survey against:

  • content validity depends on the purpose for which the survey is being used, since that purpose sets the standard against which the content is measured.
  • concurrent validity and predictive validity are both trying to measure against a set of performance characteristics.

The difficulty here is that the Personal Style Survey is not designed to measure performance or ability – only behavioural preferences. As it is not used in isolation as a test there is no basis for doing such studies. A number of studies do exist on the use of the survey in career development and assessment centres but these are measuring the overall effectiveness of the process i.e. the combination of instruments and exercises - not the Personal Style Survey on its own. Information from licensees consistently indicates that the survey is very useful in processes where other instruments and processes can validate its results. It provides a helpful focus, which can be explored in more depth with the other techniques.

Conclusion
The Personal Style Survey is one of the most widely used behavioural surveys in the world. Because of the open process which is employed it is one of the most reliable and meaningful insights an individual can have into their subconscious self-understanding. The individual completing the survey can validate the findings against their self-experience and against the knowledge of them that others have. This information can be used to amend and extend the analysis provided by the survey results, which ensures a refinement of measurement, which is subtler and more robust than a statistical coefficient in isolation.

The ability of the individual to understand, explore and check out the survey results against real life data creates a more meaningful and valid outcome than a validity study can provide – the understanding and ownership of the conclusions are with the client rather than the coach/counsellor. Statistically the level of confidence achieved by validity studies is much lower than that derived from reliability studies and there are numerous examples where difficulties in measuring with confidence and flawed study techniques can all too often undermine the quality of the data generated.

Using a statistical framework to prove the reliability and validity of findings can (unintentionally) disempower clients, as many perceive it as an incomprehensible “black box”. This can create unnecessary threat and provoke caution and scepticism, which is inhibiting and unhelpful in a development setting.

In contrast the Personal Style Survey and associated development exercises give the client ownership of the analysis using a client-centred process, promoting understanding and the confidence to consider new behavioural choices validated by their self-understanding and the feedback of friends and colleagues.
