Better quality agricultural innovation, instructional simulation and reinforcement learning
A Note

Economics Unit

Instructional simulation is a powerful means of gaining important insights into any set of relationships that determine the likelihood of success of an agricultural innovation project. Decision analysis makes use of proven and robust operational research techniques, based on reiteration, to identify the best options for investment. Today a "new" machine learning approach called "reinforcement learning" appears to be reinventing the wheel of the cybernetic feedback models of the 1940s. This note explains what is new, and perhaps useful, in this discussion.

Feedback and learning

Priors are a person's existing state of understanding of how things are related and of what happens when any element in a set of associated phenomena changes. Someone's estimate of the probable yield of a crop, made without any information on the weather forecast or on where the crop will be planted, is therefore likely to be wrong. However, the same person provided with an accurate weather forecast and the location where the crop will be planted has a better foundation for estimating the likely yield. The receipt of feedback thus helps a person adjust their priors so that they more closely approximate reality.
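The adjustment of priors by feedback can be illustrated with Bayes' rule. The following is a minimal sketch only: the yield classes, the prior probabilities and the likelihoods of receiving a "good rainfall" forecast are all hypothetical values chosen for illustration, and the function name update_prior is not taken from any library.

```python
def update_prior(prior, likelihood):
    """Bayes' rule: the posterior is proportional to prior times likelihood."""
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalised.values())
    return {h: p / total for h, p in unnormalised.items()}


# Hypothetical belief about the yield class before any feedback is received.
prior = {"low": 0.5, "medium": 0.3, "high": 0.2}

# Assumed probability of a "good rainfall" forecast under each yield class.
likelihood_good_rain = {"low": 0.1, "medium": 0.3, "high": 0.6}

# Feedback (the forecast) shifts belief towards the higher yield classes.
posterior = update_prior(prior, likelihood_good_rain)

for yield_class in prior:
    print(f"{yield_class}: prior {prior[yield_class]:.2f}, "
          f"posterior {posterior[yield_class]:.2f}")
```

Each further item of feedback, such as the planting location, could be applied in the same way by treating the posterior as the new prior.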

Instructional simulation

In complex systems such as agricultural production systems, knowledge of the determinants of crop and animal yields is sufficient to enable us to construct models that simulate the relationships between inputs and yields. Models are usually built by establishing a network of multiple inputs and outputs relating to object properties such as temperature, availability of water, the ability of the soil to retain water, the structure of the soil particles that enables plant roots to access and absorb water and nutrients in solution, seed viability, genotype and other factors.

Setting up field experiments to analyse the impact of changes in each of these variables is beyond the capability of most statistical designs, and the interpretation of results would be very complicated. However, there are techniques, such as Monte Carlo simulation, that can handle as many factors as there are inputs and trace the impacts of change. This is why simulations based on Monte Carlo methods allow project teams to ask many "what if?" type questions and receive replies almost immediately. This question-response cycle has a significant impact on people's priors as a team begins to understand in greater depth how their model or project will respond to changes in inputs. The result of such instructional simulation is that a project team builds up a profound knowledge of a project's resilience as well as of which factors have the most significant impacts on performance. Usually this type of information is lacking because project design tends to be a somewhat haphazard process based on documents, spreadsheets and inadequate analysis of models.
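A "what if?" question of this kind can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a validated crop model: the function simulated_yield, its coefficients and the input distributions are all hypothetical.

```python
import random
import statistics


def simulated_yield(rainfall_mm, temperature_c, seed_viability):
    """Hypothetical yield response in t/ha; coefficients are illustrative only."""
    water_effect = min(rainfall_mm / 500.0, 1.0)  # assumed to saturate at 500 mm
    heat_effect = max(0.0, 1.0 - abs(temperature_c - 22.0) / 15.0)  # assumed optimum 22 C
    return 6.0 * water_effect * heat_effect * seed_viability


def run_scenario(mean_rainfall_mm, n_trials=10_000, seed=42):
    """Monte Carlo run: sample the uncertain inputs and summarise the yields."""
    rng = random.Random(seed)
    yields = []
    for _ in range(n_trials):
        rainfall = max(0.0, rng.gauss(mean_rainfall_mm, 80.0))
        temperature = rng.gauss(22.0, 3.0)
        viability = rng.uniform(0.80, 0.95)
        yields.append(simulated_yield(rainfall, temperature, viability))
    return statistics.mean(yields), statistics.stdev(yields)


# "What if mean seasonal rainfall falls from 450 mm to 300 mm?"
baseline = run_scenario(mean_rainfall_mm=450.0)
dry_year = run_scenario(mean_rainfall_mm=300.0)

print(f"baseline: mean {baseline[0]:.2f} t/ha, spread {baseline[1]:.2f}")
print(f"dry year: mean {dry_year[0]:.2f} t/ha, spread {dry_year[1]:.2f}")
```

Changing one argument and rerunning gives the near-immediate reply described above; repeating this for each input indicates which factors the simulated yield is most sensitive to.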

Decision analysis

Feedback adjusting priors

Whether we speak of the classic decision analysis cycle, instructional simulation or reinforcement learning, the basic model and feedback mechanisms are the same. They all mimic the early cybernetic feedback models used to seek optimal outcomes, built in the 1940s.

[Diagram: the decision analysis cycle, based on a cybernetic feedback loop circa 1940s]

Decision analysis is an advanced set of methods and procedures designed to orientate decision-makers towards rational, effective and efficient decisions. Its prime methods of analysis include model building followed by simulation, as described above. The resulting build-up of knowledge gives a team a comprehensive understanding of which design options are most likely to provide a feasible and successful outcome.

Reinforcement learning

Reinforcement learning is a more recent reinterpretation of cybernetic networks where feedback is used to orientate machine decisions until some optimum result is achieved. It is in reality a mimicking of decision analysis and many operations research techniques including Monte Carlo simulation, linear programming and Markov chains. All of these are based on a reiterative feedback procedure which repeats until a desired or close-to-desired outcome is identified and quantified.

Therefore instructional simulation and decision analysis are essentially the same thing, and reinforcement learning is an attempt to automate the learning process as an embedded digital process. The accompanying diagram summarizes all of these versions of the same decision analysis cycle model.
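The reiterative feedback procedure described in this section can be sketched with one of the simplest reinforcement learning schemes, an epsilon-greedy multi-armed bandit. This is an illustrative choice rather than a method taken from this note: the three options (which might stand for, say, three input levels), their mean rewards and the parameter values are all hypothetical.

```python
import random


def epsilon_greedy(true_mean_rewards, n_rounds=5000, epsilon=0.1, seed=0):
    """Repeat the choose-observe-adjust loop and return the learned estimates."""
    rng = random.Random(seed)
    n_options = len(true_mean_rewards)
    estimates = [0.0] * n_options  # current belief about each option's reward
    counts = [0] * n_options       # how often each option has been tried
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            choice = rng.randrange(n_options)  # explore: try any option
        else:
            choice = max(range(n_options), key=lambda i: estimates[i])  # exploit
        # Noisy feedback from the (hypothetical) environment.
        reward = rng.gauss(true_mean_rewards[choice], 0.5)
        counts[choice] += 1
        # Adjust the estimate towards the observed reward (running mean).
        estimates[choice] += (reward - estimates[choice]) / counts[choice]
    return estimates, counts


estimates, counts = epsilon_greedy([2.0, 3.5, 5.0])
best = max(range(len(estimates)), key=lambda i: estimates[i])
print("estimated rewards:", [round(e, 2) for e in estimates])
print("times chosen:", counts, "preferred option:", best)
```

The loop has the same shape as the feedback cycle discussed above: an estimate (the prior) guides a decision, the outcome is observed, and the estimate is corrected before the next round.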

Implications for agricultural innovation

If we step back through the account above we can prioritize the various approaches in terms of their relevance to better agricultural project design. The priority is clearly:
  1. Decision analysis
  2. Instructional simulation
  3. Reinforcement learning

Other articles on this site explain how decision analysis can be applied to project design and how, in that process, a project team's comprehension of its evolving project design is greatly enhanced by instructional simulation. To some degree the most robust operations research methods applied as simulation procedures are already a form of reinforcement learning, operating on a reiterative basis to identify desirable outcomes.

Why do all of these techniques do the same thing?

As humans we have a complement of cognitive and physical manipulative capabilities. One of the few people to analyse these capabilities of observation and deduction (the construction of priors) and to establish a mathematical logic to explain how we take decisions was George Boole. He set out his unique explanation in the book "An Investigation of the Laws of Thought on Which are Founded the Mathematical Theories of Logic and Probabilities", published in 1854, which included a practical mathematics of logic and probabilities. This work provided the rationale and methodology for reducing complex logical relationships to simpler sets of relationships which can reproduce all of the possible relationships from which the set was derived. This process is known as Boolean reduction. It is used to reduce the size and complexity of digital logic designs so as to produce workable designs for the circuits of digital devices as well as for computer programs. The success of modern digital circuitry manufacturing, including micro-devices and the computer industry based upon these, rests directly upon the practical utility of the mathematics developed by George Boole. He succeeded in establishing, more than 160 years ago, a practical basis for designing expert and knowledge-based systems which mimic the human process of deduction.
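Boolean reduction can be illustrated with a small example. In the sketch below the two expressions are hypothetical: a three-term expression in inputs a, b and c reduces algebraically to a AND (b OR c), and the helper function equivalent (a name introduced here for illustration) confirms this by checking every row of the truth table.

```python
from itertools import product


def original(a, b, c):
    """Unreduced form: three product terms."""
    return (a and b and c) or (a and b and not c) or (a and not b and c)


def reduced(a, b, c):
    """Reduced form: abc + ab(not c) + a(not b)c = ab + a(not b)c = a(b + c)."""
    return a and (b or c)


def equivalent(f, g, n_inputs):
    """True if f and g agree on every combination of Boolean inputs."""
    return all(
        f(*bits) == g(*bits)
        for bits in product([False, True], repeat=n_inputs)
    )


print(equivalent(original, reduced, 3))
```

The reduced form needs two logical operations where the unreduced form needs many more, yet it reproduces every relationship in the original truth table.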

Although broadly appreciated for its brilliance, George Boole's work initially found limited practical application. However, in 1938 Claude Shannon published a paper, based on his 1937 thesis, entitled "A Symbolic Analysis of Relay and Switching Circuits", in which he explained how Boolean logic could contribute to more efficient circuit design. This seminal work initiated a process that eventually made Boolean logic the dominant logic of the digital world, including the World Wide Web, by 2018, more than 160 years after George Boole published his work.

Boole's work provides a mathematical logic for human thought, and digital devices and decision analysis algorithms are all based on the same logic. It therefore becomes self-evident that the many apparently "different" techniques of deduction and optimization of determinants, based on the refinement of priors, all deploy the same logic and all end up doing the same thing.

Is there anything of value to agricultural innovation projects on the horizon?

In a recent Decision Analysis Initiative [1] workshop held in Portsmouth, Hampshire, in the UK, three topics were identified as areas where more intellectual effort is needed to improve project design based on the accumulation of relevant and comparable data within a multi-project portfolio. This leads to the concept of the portfolio data warehouse (PDW). However, for PDWs to provide useful knowledge there is a need for a more refined basis for their design with regard to the identification and sourcing of the datasets they hold. In this regard two emerging techniques/methods are gaining ground as a foundation for refining data quality so as to enable more precise instructional simulation and better quality projects resulting from decision analysis. These are:
  • Data reference models
  • Applied locational state developments

Portfolio data warehouses

Attempts to apply data warehouses to agricultural policy and project development have not been particularly successful because administrators and politicians have encouraged the merging of so-called administrative or regulatory data with physical and accounts data of farm activities into a single large database. This process appears to have been undisciplined, and little thought has been given to specifying the datasets required according to their intended use in supporting decision analysis in policy or project design.

Because the data was often collected for different purposes it is, in many cases, close to illogical to try to combine the data into common functions or even to compare the data. Administrative data often has no basis for determining precision and error, so the combination of data within algorithms only compounds errors. In order to build up a portfolio data warehouse with comparable data there is a need to standardize target datasets even when some data elements have not been collected. The objective is to establish a complete dataset specification, populate the data elements with good quality data and avoid introducing data that might compromise the quality or relevance of analysis and reporting.

Data reference models

Data reference models are a very simple method of mapping out the dataset required to support a range of analyses and reporting requirements. The map has a tabular structure with a series of rows. The top row states the decision and analytical result objective. The rows immediately below contain the algorithms used to complete the analysis and identify the data elements they use. The next row lists the data used in the analytical functions and specifies the data in terms of units and associated properties. The last row identifies where the data can be obtained if it already exists somewhere or, if it does not, establishes the method to secure the data based on surveys or experimental or research methods.
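The rows of such a map can be represented as a small data structure. The sketch below is one possible representation, not a published specification: the class names, field names and the gross margin example are all hypothetical. Its purpose is to show how the map exposes gaps, namely data elements that an algorithm uses but that have not been specified, and specified elements that have no source yet.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataElement:
    name: str
    unit: str
    source: Optional[str] = None  # where the data exists, or how it will be collected


@dataclass
class Algorithm:
    name: str
    inputs: list  # names of the data elements the algorithm uses


@dataclass
class DataReferenceModel:
    objective: str                                 # top row: decision objective
    algorithms: list = field(default_factory=list) # algorithm rows
    elements: dict = field(default_factory=dict)   # data and source rows, keyed by name

    def unspecified_inputs(self):
        """Data elements used by an algorithm but not yet specified."""
        used = {name for alg in self.algorithms for name in alg.inputs}
        return sorted(used - set(self.elements))

    def unsourced_elements(self):
        """Specified data elements that have no source or collection method."""
        return sorted(n for n, e in self.elements.items() if e.source is None)


model = DataReferenceModel(
    objective="Estimate gross margin per hectare",
    algorithms=[Algorithm("gross_margin", ["yield", "price", "variable_costs"])],
    elements={
        "yield": DataElement("yield", "t/ha", source="farm survey"),
        "price": DataElement("price", "currency/t"),
    },
)

print("unspecified:", model.unspecified_inputs())
print("unsourced:", model.unsourced_elements())
```

Working from the objective downwards in this way means that only data with a defined analytical use enters the dataset, which is the discipline argued for above.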

Applied locational state developments

Locational state theory holds that the states or values of the properties of objects in multi-component systems, including all inanimate and animate phenomena, change according to space-time location. More directly, the state of the properties of any object is a function of its current and past location in space and time (McNeill, H. W., "Introduction to Locational State Theory", SEEL, 1988). This is a statement of the evolutionary principle of change and adaptation, or change and decay, embodied in geological formations, ecosystems, the weather and the genetic evolution of plants, animals and mankind. Where humans sense no change, this is because the rate of change is beyond their unaided capacity to sense and observe, as in the case of geological erosion. On the other hand, we can observe many changes with ease, such as the growth of crops and livestock or the development of insects that have a very short life and life stages each with distinct physical forms and activities (e.g. reproduction).


Reinforcement learning is a reinvention of the cybernetic feedback loop learning model. It relates to "machine learning", but in the case of agricultural projects the essential learning needs to be undertaken by project teams. Therefore instructional simulation, as part of the decision analysis procedures of simulation, model exploration and optimization, remains the main poorly exploited resource in project cycle and portfolio management. However, there are new frontiers in this field in the design of portfolio data warehouses, which are dedicated to the rational collection of data relevant to project design and decision analysis. Two evolving techniques/methods will contribute to useful portfolio data warehouses: data reference models and locational state theory.

The sequel to this Note will develop the explanations of how portfolio data warehouses can greatly improve the process of project design to secure a higher quality of portfolio entry by providing an evolving knowledge base for decision analysis.

[1] McNeill, H.W., "The state of the art and the future of decision analysis", DAI Workshop 5-6 May, Sustainable Agricultural Economic Development Session, SEEL, Portsmouth, 2018.