Testing the strength of evidence (TestE)
HR&S has developed a practical, evidence-based strategy for assessing whether outcomes and impact have been reached. The strategy is called TestE (Testing the strength of Evidence).
We aim for scientifically sound evidence, benefiting from controls, randomization, quantitative statistics, qualitative probability assessment, and contribution tracing.
We compile evidence for each expected outcome and each expected impact. The strongest cases use multiple forms of evidence, some addressing the weaknesses of others. Reported effects should be plausible as outcomes of the programme activities and consistent with the Strategy for Change, and the observations compiled as evidence should be easier to reconcile with the programme's claims than with other possible explanations.
Whether or not an outcome was achieved is of no importance to the exercise itself; this is plain testing of evidence, free of bias. Lessons learned from a lack of outcome are just as important.
Monitoring data shall eventually be uploaded in real time to RISEsupport.se, a designated IT platform accessible through computers, tablets and cell phones. We offer RISEsupport.se to our partners, thus enabling efficient communication around tools, strategies, and results.
Our evaluations shall be real-time: the evaluation is designed as the programme is designed, baseline data are collected prior to implementation, and we then benefit from every opportunity for evaluation while implementing. During the evaluation design we agree on how to collect, compile and analyse monitoring data, when and by whom. Lessons learned are compiled and addressed in real time.
We identify the study sample and the control prior to implementing the programme, reflect on sample size and external validity to meet the requirements of the selected statistical method, and benefit from simple randomized evaluation whenever possible. We base the evaluation topics on Outcome Challenges and the quantification on Progress Markers, and then measure against the Baseline and, if randomizing, against the Control.
Micro data Survey design
Agree on and implement how to collect, compile and analyse monitoring data, when and by whom.
Simple randomized evaluations
Randomized evaluations are a type of impact evaluation that uses a specific methodology for creating a comparison group: random assignment. Scientifically sound impact evaluations usually compare the outcomes of those (individuals, communities, etc.) who participated in the programme against those who did not. Random assignment generates a statistically identical comparison group and therefore produces the most accurate (unbiased) results. Randomized evaluations also tend to produce results that are very easy to explain.
What is the appropriate level or unit of randomization?
What is the appropriate method of randomization?
How would we implement the randomization?
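As one illustration of how unit-level random assignment might be implemented in practice, the sketch below splits a participant list 50/50 into treatment and control. The participant names, the 50/50 ratio, and the recorded seed are assumptions for illustration; the actual unit and method of randomization must be agreed per programme, as the questions above indicate.

```python
import random

def random_assignment(participants, seed=42):
    """Randomly split a list of participants into treatment and control.

    The 50/50 split and the fixed seed are illustrative choices; the
    seed is recorded so the assignment can be reproduced and audited.
    """
    pool = list(participants)
    rng = random.Random(seed)
    rng.shuffle(pool)
    half = len(pool) // 2
    return pool[:half], pool[half:]  # (treatment group, control group)

treatment, control = random_assignment(["P1", "P2", "P3", "P4", "P5", "P6"])
```

Because the seed is fixed, re-running the assignment yields the same groups, which supports transparency towards partners.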
Study sample & control
Ensure ethically sound selections.
Study sample: Who?
Sample size: How many?
External validity: i) Ensure that a person asked is representative of the group, so that we would expect the same answer if we asked someone else. ii) Reflect on whether the results of our randomized evaluations are generalizable to other contexts.
Control: Who, where, how many?
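One way to approach the "how many?" question is the conventional two-sample size formula for detecting a difference in means: n per arm ≈ 2(z₁₋α/₂ + z₁₋β)²σ²/δ². The sketch below assumes the customary 5% significance and 80% power levels (z-values 1.96 and 0.84); these conventions, and the example numbers, are assumptions for illustration, not HR&S requirements.

```python
import math

def sample_size_per_arm(sigma, delta, z_alpha=1.96, z_beta=0.84):
    """Approximate number of participants needed in each arm (treatment
    and control) to detect a difference in means of `delta`, given an
    outcome standard deviation `sigma`, at 5% significance and 80% power.
    """
    n = 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2
    return math.ceil(n)

# E.g. to detect a 5-point change when outcomes vary with sigma = 10:
n_per_arm = sample_size_per_arm(sigma=10, delta=5)
```

Note how the required sample grows with the square of sigma/delta: small expected effects or noisy outcomes demand much larger groups.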
Reflections on ethically sound selections
Without denying access: It can be possible to conduct a randomized evaluation without denying access to the intervention. For example, we could randomly select people to receive encouragement to enrol without denying any interested participants access to the intervention.
Pilot-phase randomization method: An ideal time to conduct a randomized evaluation is during the pilot phase of a programme or before scaling up. During the pilot phase, the effects of a programme on a particular population are unknown. The programme itself may be new or it may be an established programme that is targeting a new population. In both cases programme heads and policymakers may wish to better understand the impact of a programme and how the programme design might be improved. Almost by definition, the pilot programme will reach only a portion of the target population, making it possible to conduct a randomized evaluation. After the pilot phase, if the programme is shown to generate impact, the aim must be that the programme is replicated or scaled up to reach the remaining target population.
Two versions of an intervention
It may be useful to compare two different versions of an intervention, such as an existing version and a version with a new component added.
Quantitative analysis – Statistical method
Basic statistics: The basic assumption is that a set of data, obtained under the same conditions, has a normal (Gaussian) distribution. The primary parameters are the mean (or average) and the standard deviation, and the main tools are the F-test for precision, t-tests for bias, linear correlation and regression, and analysis of variance (ANOVA).
Simple comparison: With randomized evaluations, the simplest method is to measure the average outcome of the targeted group and compare it to the average outcome of the control group. The difference represents the programme’s impact. To determine whether this impact is statistically significant, one can test the equality of means, using a simple t-test. One of the many benefits of randomized evaluations is that the impact can be measured without advanced statistical techniques.
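The comparison of means described above can be sketched in a few lines using only the Python standard library. The outcome values below are invented for illustration; in practice they would be the measured Progress Marker outcomes of the targeted and control groups.

```python
import math
from statistics import mean, variance

def two_sample_t(treated, control):
    """Difference in group means (the estimated programme impact) and
    the pooled two-sample t statistic used to test equality of means."""
    n1, n2 = len(treated), len(control)
    diff = mean(treated) - mean(control)  # estimated impact
    # Pooled sample variance across both groups
    pooled = ((n1 - 1) * variance(treated)
              + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    t = diff / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return diff, t

impact, t_stat = two_sample_t([12, 14, 11, 15, 13], [10, 9, 11, 10, 8])
```

A t statistic well above the critical value for the relevant degrees of freedom (about 2.3 at the 5% level for this small example) would indicate a statistically significant impact.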
Propagation of errors: The final result of a Programme is calculated from several activities (outputs) performed during the implementation, and the total error in a Programme is a combination of the sub-errors made in the various steps. The bias and precision of the whole Programme are usually the relevant parameters.
Qualitative assessment – probability methods
With qualitative assessments, and contrary to statistical methods, the quality of the evidence is not judged by the sample size (the number of observations) but rather by the probability of observing certain pieces of evidence. Qualitative impact evaluation includes assessing the contribution made by a particular intervention towards achieving one or more outcomes, commonly referred to as a 'contribution claim'. TestE benefits from process tracing to assess our Strategy for Change and from contribution tracing to examine the contribution by external stakeholders. We also address Team operations, Cost-benefit, Needs driven, Equal partnership and Unexpected effects.
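Contribution tracing commonly rests on Bayesian updating: confidence in the contribution claim is revised after each observed piece of evidence, weighing how likely that evidence is if the claim is true against how likely it is if the claim is false. A minimal sketch, with probability values invented purely for illustration:

```python
def update_confidence(prior, sensitivity, type1_error):
    """Bayesian update of confidence in a contribution claim after
    observing one piece of evidence.

    `sensitivity` is P(evidence observed | claim true) and
    `type1_error` is P(evidence observed | claim false).
    """
    numerator = sensitivity * prior
    return numerator / (numerator + type1_error * (1 - prior))

# Starting from 50% confidence, observing evidence that is four times
# more likely if the claim is true than if it is false:
posterior = update_confidence(prior=0.5, sensitivity=0.8, type1_error=0.2)
```

Evidence that is almost as likely under rival explanations barely moves the confidence, which mirrors the principle above that observations should be easier to reconcile with the programme's claims than with other explanations.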
Micro data Survey questions
- Progress markers
- Financial outcome
Use semi-structured questions.
- Strategy for Change
- Milestone achievements
- Cost-benefit
- Team operations
- Needs and user driven achievements
- Equal partnership achievements
- Unexpected effects
Micro data Survey manual
Develop written survey manual.
- Number of interactions, where, when by whom. Interviews, group discussions, testimonies.
- Who to interact with.
- Semi-structured interview guide.
- Survey questions.
- Other issues.
Assign the data collection team
Ensure the team has access to the tools required: a written copy of the survey manual, camera, recorder, notebooks and pens, and transportation means; and ensure that each team member is comfortable with the assignments.
The team assignments include:
- Collect testimonies.
- Take photos.
- Record videos.
- Record interviews.
- Take notes in a dedicated notebook about everything that happens.
Ensure water-tight communication with our respondents.
Interview after interview, community after community, the exercise shall progress smoothly, and a wealth of interesting evidence – videos, photos, documents, recordings, etc. – shall be gathered.
At the end of a data collection exercise, it will be obvious that a data collection team has learned several lessons.
Macro data Survey design
Compile & analyse monitoring data
Compile & assess monitoring data
- Programme reports
- Financial reports
- Auditing reports
- Implemented Output & Achieved Outcome
- Micro data survey results
- Macro data survey results: broad-trend data and literature reviews.
Testing Strength of Evidence
Personal stories: Personal stories are not easily classified, categorised, calculated or analysed. More recently, software programmes have become available that facilitate categorisation of story fragments, allowing for analysis of patterns that can lead to quantitative information.
Strategy for Change: A reasonable contribution claim can be made if:
- There is a reasoned Strategy for Change for the intervention: the key assumptions behind why the intervention is expected to work make sense, are plausible, may be supported by evidence and/or existing research, and are agreed upon by at least some of the key stakeholders.
- The activities of the intervention were implemented as set out in the Strategy for Change.
- The Strategy for Change (or key elements thereof) is supported and confirmed by evidence of observed results and underlying assumptions; thus, the chain of expected results occurred.
- The Strategy for Change has not been disproved.
- Other influencing factors have been assessed and either shown not to have made a significant contribution, or their relative role in contributing to the desired result has been recognized.
Lessons learned are compiled and addressed in real time.
Impact reports are shared with all partners and uploaded on our website.
Definitions by HR&S
Ambition The need and ambitions as expressed by the local stakeholder, the Target partner.
Outcome challenge Challenges hindering the Target partner from reaching her ambitions, as expressed by the Target partner.
Activity Activities arranged by the Programme management Partner (PP), addressing the Outcome challenges identified by the Target partner (TP) and generating a specific Output.
Expected Output The Expected Outputs are quantified results of the Activities. The PP is in control of the Outputs. An example is the number of active participants in a certain number of workshops that lasted a certain period of time.
Input Resources required to arrange the Activities.
Expected Outcome Actions taken by the Target partners as a result of the Activities. The programme managers do not have control over the Outcomes.
Progress markers Measurable indicators of progress or non-progress. They are linked to the expected Outcome and are categorised at Level 1, 2 and 3.
Expected Impact We define Expected Impact as Expected Outcomes that have become sustainable over time and do not require backup from the Programme to remain sustainable. The Expected Impact is quantitatively measurable through our Progress Markers, and we are accountable for the Expected Impact. The Expected Impact is measured at the time of closing the programme. We may in addition aim to measure whether our impact is still sustained some period after we have closed the programme, maybe one, two, five or even ten years after.
Possible Impact The Possible Impact is often a wide and qualitative statement, something that is desired and that may or may not happen as a consequence of our interaction, often long after we have closed the programme. We are not accountable for the Possible Impact, and we cannot claim it as a goal that we strategically work towards achieving. If it actually happens, we often do not have evidence for the extent to which it was actually caused by our programme.