Are you planning to commission a mid-term review or evaluation of an ongoing programme? Getting ready to write up the Terms of Reference? Here are some tips for consideration.

Stand alone? Is your programme by itself a major contributor to the impact you want to achieve? In other words, is it a fairly closed system that can be meaningfully reviewed in isolation from the actions of others? If not, would a review of a collective action make more operational sense? What other programmatic interventions would have to be included? Example: The relief responses for the Rohingya refugees in Bangladesh, and for the earthquake/tsunami survivors in Sulawesi, Indonesia, are fragmented, with target populations receiving complementary services from different agencies. Strong sector coordination with weak socio-geographic coordination does not help create integrated approaches. In Sulawesi, two donor agencies jointly supported a mid-term review of the overall response, covering the total of 21 agencies they funded. In Bangladesh, three mid-term evaluations, commissioned separately by three leading agencies, were perceived as an indication of insufficient collaboration.


Everything or focus? Do you need all components of your programme reviewed, or is it more useful to zoom the evaluation in on certain aspects? These could be, for example: the most expensive components; the technically more challenging ones; those that relate to people-centred action (inclusion, participation, information, accountability, protection…); the impacts of your individual or the collective action on local capacities; or the impacts of the intervention on social relations (gender and age, social cohesion…) – all of which are harder to manage and harder to assess. What will bring the greatest value for your, and maybe the collective, ongoing programming? Example: One donor commissioned an evaluation of the accountability-to-affected-populations approaches of all the agencies it funded for post-earthquake reconstruction work in Nepal.

DAC criteria? How do you want to interpret the DAC criteria? Do you want to know whether you fulfilled the terms of your contract, or above all whether your programme is having meaningful impacts in its environment? The two may not fully coincide. Do you mostly want to hear reflections and recommendations about ‘what’ you achieved in your programme (task focus) or, in a context with many other actors and complex power dynamics, also ‘how’ you are engaging with the many stakeholders (relationship focus)? Is the contextual relevance of your programme a question that can be fairly straightforwardly assessed, or does it require deeper contextual understanding and a critical reflection on theories of change? Example: An evaluation of a programme supporting local peace capacities in northern Mali revealed its disconnect from other, larger peace and governance reform initiatives. That required the evaluators to take a broader look at the bigger picture. How, in practice, can efficiency be meaningfully assessed? What data do you have, for example on how ‘cost’ was weighed against other considerations in your decisions? Is it already possible to assess impacts and the question of sustainability, or is it still too early? Are there other sector-wide benchmarks or thematic references beyond the DAC criteria that must or can be reviewed here? Have you, as an organisation, made public commitments to certain ways of working, and do you want the mid-term exercise to assess how you are living up to those?[i] Could you ask the evaluators to look at whether earlier institutional (and sector-wide) learning is being drawn upon?

Must hear? Whose views and experiences do the reviewers definitely need to hear? Who are important stakeholders whose voices and views are seldom heard? If these are, for example, your intended beneficiaries, clients, partners or local authorities, will you give the evaluators enough time to also listen to them after they have met with your staff, donors and other agencies? Might you suggest they first listen to these actors, and only afterwards to you and your closest partners?

Relevant time span? Do you have a prior history of engagement in this environment? Is that relevant for the programme you currently want reviewed? How far back in time should the reviewers ideally go? Do you also want a forward-looking perspective? Can you offer the reviewers/evaluators the additional time to familiarise themselves more deeply with the context and to explore possible future scenarios and the key factors that will shape them? Example: One mid-term evaluation was framed to cover six months. As the evaluators decided not to impose a data-collection stop during the process of successive draft writing, the final report covered eleven months. But as it took into account relevant prior history and, as requested, looked forward at possible scenarios, the overall time horizon became 7.5 years.

Data quality, sources and interpretation? Do you have good enough data on what you would like reviewed – data that go beyond activities and outputs? If not, how much time will the evaluators need to collect such key data? Are the sources of your data relevant and diverse enough to consider the data representative? Are you confident in the interpretation you made of your own and other available data? Do you want the reviewers to check this and test alternative interpretations? Again, what would that take, practically? Have you already tried to capture unintended effects and consequences of your programme, or do you want the reviewers to inquire into this too? What would that take, practically? Example: An evaluator was given the results and trend interpretation of three consecutive surveys. A closer look, however, revealed that some of the questions in the consecutive surveys had changed, and that, unintentionally, the survey respondents were not the same. While this did not fully invalidate the interpretation, it reduced the confidence one could place in it.

Controversies? Are there controversies in your programming environment, e.g. about a policy decision, about coordination leadership, or about which authority to relate to where there are contesting authorities? Do you want the reviewers to look into those and advise you on how to resolve them, or at least navigate them? Example: A mid-term evaluation of a programme supporting internally displaced people could not ignore the new government policy pushing the displaced to return, when many places of origin were not sufficiently secure and had no basic services. Dilemma management is often part of programme or larger strategic management. Why not have a review look into that?

Real-time? Who in your organisation will have the time and stamina to engage attentively with successive drafts of the report? If you want the insights and guidance of this review to serve you in real time, how do you avoid delays while waiting for a final report? Example: The process of successive drafting and commenting on drafts, for a mid-term evaluation of several strands of work in a complex situation, took a good two months. While several of the attention points mentioned in the first draft had been picked up by then, others got lost in the back-and-forth conversations. Put emphasis on the debrief and learning workshop(s) at the end of fieldwork, not only on the final report. Alternatively, if you are operating in a very complex and/or changing situation, could you consider an iterative review process, with some repeat visits? That would maximise the value of the reviewers as they get to know the context and the interventions in it much better, but it also gives them more of an accompaniment role.

Recommendations how? Do you want the reviewers to immediately tell you their recommendations? Or are the questions you ask them to explore actually your questions?[ii] Is it useful to outsource the thinking? Can you ask them to hold back their recommendations and at first only present their findings, so that you have an opportunity to consider where these findings suggest you should go?

In the public domain? Will you put the report in the public domain? Will that generate defensiveness? Can you project yourself as an organisation that welcomes positive but also critical feedback, because that is in line with your culture of continuous improvement?

Evaluability? There are many options to make the review or evaluation exercise a very interesting one. But reviewers/evaluators can’t do miracles. Check the evaluability of your programme – and how much review/evaluation time you can get with your budget. Did you state clear objectives for what you intended to achieve with your action? If not, it will not be possible to evaluate whether you achieved what you set out to do. Do you have baseline information – if not very detailed and quantified, at least a good enough description of the situation as it was when you started out? If not, it will be harder to assess what has changed because of your intervention. Are all stakeholders that the reviewers must speak with accessible? If some are not, what does that mean for the evaluation? Example: One Terms of Reference wanted an evaluation to cover a programme extending over several areas in south-central Somalia. But, for security reasons, the evaluator could not leave a well-protected compound in Mogadishu. That cannot be compensated for with a few phone calls to people outside Mogadishu.

Above all, are your means proportionate to your expectations: do you have the budget to give the evaluators the time they practically need to meet all your expectations? Example: One Terms of Reference asked for an evaluation of a three-year programme of engagement with 10 very different institutions, plus a forward-looking perspective. The budget, however, only allowed for 5 days in-country: not realistic! You can’t buy a Rolls-Royce on the budget of a Fiat. You could, however, consider joining up with another agency to co-fund the exercise. Otherwise, what choices must you make to bring your ambitions in line with your means? Can a methodologically lighter ‘review’ still give you the guidance you need?

[i] See “Humanitarian Evaluation: Include Grand Bargain references”.

[ii] See “These Questions are Ours: Evaluative thinking beyond monitoring and evaluation”.





The Grand Bargain commitments that resulted from the World Humanitarian Summit in 2016 envisage significant changes to how the collective humanitarian ‘system’ operates. Since then, the ten commitments have been extensively discussed in separate working groups, and there is also an initial compilation of progress based on voluntary self-reporting.

GMI’s ongoing work, particularly on the commitments related to a ‘participation revolution’ and ‘localisation’, confirms the observation in the 2017 synthesis report that, in this regard, “there is little evidence yet of structural or systemic change that would allow a more flexible international footprint according to national and local capacities and context, or increase the representation of local actors in humanitarian decision-making.” [i]

One way of driving change is to conduct more reviews and evaluations with reference to the Grand Bargain. Taking our two commitments of core interest, the Grand Bargain document offers some possible indicators that can be directly translated into real-time and retrospective evaluation questions.

Commitment 6: Participation Revolution

§  The leadership and governance mechanisms at the level of the humanitarian country team and cluster/sector mechanism ensure engagement with and accountability to people and communities affected by crisis;

§  Common standards and a coordinated approach are applied for community engagement and participation, with emphasis on inclusion…and supported by a common platform for sharing and analysing data to strengthen decision-making, transparency, accountability and limit duplication;

§  Local dialogue is used, as well as technologies, to support agile, transparent but also secure feedback;

§  There is a systematic link between feedback and corrective action to adjust programming;

§  Donors provide time and resources for this and fund with flexibility to facilitate programme adaptation in response to community feedback;

§  All humanitarian response plans – and strategic monitoring of them – as of the beginning of 2018 demonstrate analysis and consideration of inputs from affected communities.

Commitment 2: More Support and Funding Tools for Local and National Responders

§  There is multi-year investment in the institutional capacities of local and national responders, including preparedness, response and coordination capacities. This is being achieved also in collaboration with development partners and through the incorporation of capacity strengthening in partnership agreements;

§  Barriers that prevent organisations and donors from partnering with local and national responders are removed, and their administrative burden reduced;

§  National coordination mechanisms are supported where they exist, and local and national responders are included in international coordination mechanisms, as appropriate and in keeping with humanitarian principles;

§  Greater use is made of funding tools which increase and improve assistance delivered by local and national responders, such as country-based pooled funds.

Other indicators can also be envisaged for these two commitments. Other Grand Bargain commitments, e.g. to more joint and impartial needs assessments, increased coordination and use of cash programming, and harmonised and simplified reporting requirements, can equally be chosen for evaluative attention.

In 2017, as a member of the Steering Committee of the Dutch Relief Alliance end evaluation, I advised adding a Grand Bargain reference as a third attention area, alongside the two that considered collective performance against classical DAC criteria such as relevance, efficiency, effectiveness and impact.[ii] This may have been the first such exercise.

In February 2018, the Global Mentoring Initiative (GMI) conducted a review of the collective response to the Rohingya crisis in Bangladesh through a Grand Bargain lens, particularly the two commitments mentioned above.[iii] Our general findings confirm those of a real-time review of the DEC agencies’ response but enrich them with a Grand Bargain lens. This is not as innovative as it seems: it follows in the footsteps of the ground-breaking 2005 evaluation of the ‘Impact of the International Response to the Indian Ocean Tsunami on Local & National Capacities’, which one of us was involved in.[iv] Unfortunately, it also reveals that, 13 years later, the international relief industry has not taken on board its recommendations. Nor have the commitments made in Istanbul in May 2016 made any difference to how we have so far responded to the Rohingya crisis.

Grand Bargain commitments are particularly relevant, because they consider a more strategic, collective-action level. Too many reviews and evaluations still focus on projects or smaller-scale programmes which are seldom, by themselves, enough to achieve more lasting impact and more transformative change.

Translating the Grand Bargain commitments into practice will take time and multiple efforts. Introducing them in evaluations is one mechanism to help this happen. To that effect, evaluators of humanitarian action (like humanitarian advisors) will have to become very conversant with the Grand Bargain, which is not yet the case.

[i] OCHA 2017: No Time to Retreat: First annual synthesis report on progress since the World Humanitarian Summit.

[ii] We agree with our ODI colleague T. Pasanen that we should not add more to ToRs that are already disproportionate to the time and budget available, but rather select some criteria for particular attention. 2018: Time to Update the DAC Evaluation Criteria?

[iii] GMI 2018: Debating the Grand Bargain in Bangladesh. How are Grand Bargain commitments shaping the response to the FDM/Rohingya influx?

[iv] Tsunami Evaluation Coalition 2005 :



“They just don’t understand what the realities are!” Have you heard such statements of exasperation from programme people about a potential or actual institutional donor? Not infrequently they are right: even in volatile situations, and with objectives that require behavioural change from influential others, we are often asked to submit to the illusion of ‘control’ and present overly detailed plans, budgets and timelines.

Fortunately, there are also institutional donors that learn. This can be seen in the recent tender ‘Addressing Root Causes of Conflict and Irregular Migration’ (ARC) of the Dutch Ministry of Foreign Affairs. The ARC Fund was open for NGO applications and covered 12 identified countries. Both in process and in content, it represents a sophisticated approach.

1. The Process.

Some donor choices stand out as quite unusual or even innovative:

  • By setting (with strong input from the embassies) country-specific priority objectives that also represent important milestones in the Ministry’s general ‘theory of change’ regarding ‘security and the rule of law’, the tender successfully reconciled country-relevance and policy-relevance;
  • Applying NGOs were not asked to submit a detailed proposal and budget, but a Track Record and a Concept Note in line with detailed guidance. This is in recognition of the fact that developing a detailed proposal requires a major time investment, which may not result in winning a grant. And it leaves more scope for further discussion between donor and operational agency about the specifics of the planned programming – which is meaningful where there is a stronger ‘partnership’ intent;
  • The Track Record, already used in previous tenders, signalled a belief that addressing key drivers or root causes in a country (or a specific priority region within it) is unlikely to succeed if the operational agency is not already well experienced there, or has no prior experience with programming around challenging objectives such as ‘more inclusive and responsive governance’ or generating ‘income opportunities’ that can reduce the drive to migrate;
  • Because the Ministry wants to promote less competition and more collaboration, its intent is towards ‘partnering’ rather than ‘sub-contracting’. This manifests itself in a readiness to provide thematic support on particular interest areas such as conflict sensitivity, gender, partnerships and monitoring & evaluation, and in an invitation to grantees to collaborate in the development of an overarching ‘results framework’ that will make it easier for Ministry staff to report on the overall effectiveness of this Fund investment;
  • No less significant is the fact that the grants are for 3-5 years, with the donor’s understanding that programmatic adaptations will likely be the norm rather than the exception within such time frames.

2. The Content.

The guidance for the Track Record and Concept Note submissions shows a donor administration that has been learning quite well, and wants to see similar learning reflected in the conceptualisation and design of the programmes it funds.

a. The Track Record had to consist of two case studies relevant to the specific objectives for the country that the applicant sought funding for. One of the case studies had to refer to the country itself, and at least one of them had to be supported by an internal or external mid-term review or evaluation.
Explicitly requested attention points in the Track Record were:

  • Substantive involvement of local partners and of the target groups, showing that the action is locally owned and that there is local accountability;
  • ‘Complementarity’ of the action with that of others working in the same environment or on similar issues – though allowing also for innovative programming;
  • Not just ‘gender sensitivity’ but a conscious interest to support ‘gender transformation’, in situations where women do not enjoy equal rights;
  • ‘Conflict sensitivity’ especially in the sense of awareness about the impacts of the programme and the agency on the overall dynamics of conflict or peacefulness in the operating environment;
  • The quality of the monitoring and evaluation practices;
  • The effectiveness of the past programming at the outcome level – with attention to whether the outcomes were formulated in a ‘SMART’ way;
  • The provisional sustainability of the results achieved, accepting that there are multiple dimensions to ‘sustainability’: organisational capacities; financing; something being institutionalised by law; social sustainability (i.e. interiorised behavioural changes), etc.

b. The Concept Note had to be built around four major themes:

  • Context & actor analysis: what are the drivers and contributing factors of conflict and irregular migration (from the local to the national, regional and, where applicable, international levels); who are the relevant actors that are part of the problem or can help resolve it; what are the power dynamics among them; where do the applicants sit within the actor dynamics; and what are the gender relations and gender power dynamics in the proposed programming environment? Those proposing to engage in activities to create more income opportunities had to provide a market analysis. All this analysis should result in a clear problem statement.
  • The Theory of Change for the proposed programme, that creates a credible and plausible link between the problem-statement and the intended outcomes. Attention was required here to the strategies to achieve the outcomes, the assumptions underpinning them and whether the strategies are within the sphere of influence of the applicant. An argument also had to be provided regarding the complementarity of the proposed programme with the efforts of other actors, and the strategies envisaged to increase the chances that positive results can be sustained.
  • Added value and relevance: Here, applicants applying as a consortium were asked to explain the added value of working in a consortium (in general), and the added value of their specific consortium. They could also make a pitch for why their proposed programme is strategically relevant to the identified priority objectives in that country.
  • Conflict sensitivity: The applicants were expected to demonstrate that they can practice conflict sensitivity, and ensure their local partners, contractors etc. can as well. When the proposal concerns an economic activity, they were to demonstrate that they will responsibly manage the risk of market distortion.

The choice for a longer duration, the encouragement of collaborative work and a recognition that adaptations along the road will be almost inevitable, make this ARC Fund and tender a ‘progressive’ endeavour. In its content, it also integrates important ‘lessons identified’ (though not everywhere ‘learned’!): Attention is paid to the potential interconnectedness of ‘contexts’ at different levels, and to power dynamics among actors and between genders. Agencies are asked to reflect about their own position in the actor-landscape. Where they envisage economic activities, they need to go well beyond ‘vocational training’, and underpin their proposals with a credible market analysis that also looks at demand, the role of the private sector, financing facilities etc. The Dutch have opted to use the ‘theory of change’ approach which is potentially more sophisticated than the now mechanically used ‘logframe’. They rightly focus on the likely area of ‘influence’ of the applicants. Complementarity or added value is another explicit attention point, as are gender, conflict-sensitivity and strategies towards sustainability.

It could also have been appropriate to encourage the applicants to do some explicit ‘scenario thinking’ – not a luxury when considering a 3-5 year horizon in rather volatile environments. This can still be introduced at the subsequent stage of detailed proposal development. With regard to assumptions underlying the tender, questions can be asked whether better local income-opportunities will be sufficient to decrease the perceived necessity or desirability of migration. ‘Irregular’ migration tends to be driven by a variety of complex motives, and some have argued that better incomes actually allow prospective migrants to cover the costs of ‘irregular’ migration.

Overall, however, this is a high-quality tender that integrates and promotes various important attention points which in discussions and trainings tend to be treated separately. As such, it promotes and supports sophisticated programming: producing a credible ‘Track Record’ as requested here requires not just effective M&E systems, but overall reflective practice and solid documentation. A ‘Concept Note’ of this calibre cannot be produced by a professional proposal writer without significant substantive input.

The other side of the coin is that the bar here gets very high for quite a number of NGOs. Even if an organisation or consortium/partnership of organisations has been doing very good and relevant programming in one of the target countries, not all of them have the quality of documentation to put together a strong written ‘Track Record’. While many aid workers have now heard about ‘theories of change’, for decades they have been forced to think in ‘logframes’. Presumably many would be hard pressed to explain succinctly how the two differ and why a ‘theory of change’ approach might be superior. ‘Power analysis’ (or what is also referred to as ‘thinking and working politically’) is not part of the daily mind-set in a sector that has long pretended that the problems and their solutions are largely ‘technical’. Most aid-supported interventions related to conflict & peace still tend to work on a particular level (local, intermediary, national…) with no or limited cross-linking between levels.

We can certainly see how these requirements become even more daunting for many national and local organisations. Even if their work is as sophisticated as requested, they may stumble over the need to demonstrate this, in a limited number of pages, in English. Not many made a bid as lead applicant. If the ultimate aim of external support is to strengthen the local capacities for peace and equitable development, then this remains a strategic attention point: not only to enable these organisations to access aid money, but also to better understand where the sophistication in their ways of working may lie.



The concern to demonstrate ‘results’ or ‘impact’ has provided a broad sectoral incentive to invest more in ‘design, monitoring and evaluation’. This has led to a proliferation of manuals for programme staff and practitioners and a fair specialisation of ‘evaluation’, with courses, communities of practice, professional associations and even a few universities offering advanced degrees in ‘evaluation’.

Relevant and valuable as the investment in M&E is, we are, however, missing a critical element: stronger evaluative thinking among programme staff.


