Engineering Sep. 13, 2018

Production Model Execution via Capabilities

“If software ate the world, models will run it” is the conclusion of a recent influential article about how software and data are transforming the world today. The article nicely sums up how Climate is inventing digital agriculture, building on our road to one billion acres of data:

“These companies structure their business processes to put continuously learning models, built on “closed loop” data, at the center of what they do. When built right, they create a reinforcing cycle: Their products get better, allowing them to collect more data, which allows them to build better models, making their products better, and onward… In a data-driven business, the data helps the business; in a model-driven business, the models are the business." Models Will Run the World*

An interesting technical question I am often asked, having done this at several well-known companies, is how to make this happen systematically and predictably in engineering. It is especially challenging in digital agriculture: our models are sophisticated, and the average age of our customers in the United States is in the mid-60s. Doing this at Climate means turning agronomy into easy-to-use software for our Climate FieldView™ digital farming platform, as quickly and predictably as possible. In short, it means making model-driven innovation predictable.

Beginning this fall, we will deliver our next step-change acceleration, transforming production software from hand-crafted model stacks to reusable capabilities and decoupling model logic to enable parallel iteration by science and engineering teams. We are applying best practices proven in Silicon Valley to advance digital agriculture worldwide. This transformation is technically challenging in numerous interesting ways, so we want to begin the dialog here and elaborate on more technical details in forthcoming posts.

Challenges of Scaling Model-Driven Systems

Before digging into what is coming this season, let’s first summarize a few technical challenges we have faced building model-driven systems while inventing digital agriculture so far. As discussed in previous blogs, Climate was built on cloud primitives from AWS, such as ECS for microservices and RDS for data stores. While cloud primitives look advantageous when starting from scratch, modern scalable model-driven systems built with them suffer from challenges in velocity, complexity, maintenance, and verification. Development velocity progressively decreases as each new generation and category of models requires building yet more services, with the usual proliferation of boilerplate and similar-but-not-identical code. Systems complexity grows superlinearly, as the number of distinct cloud primitives scales with the number of deployed models. Maintenance and operational costs both grow superlinearly, driven by progressively expanding systems complexity. Automated testing and model verification complexity grow even faster, as model code authored by scientists in R or Python must be rewritten to integrate with the production runtime stack.

All these moving parts make reasoning, validating, operating, monitoring, and tuning everything increasingly difficult and costly as our universe grows and matures. Wearing our computer science hats, we observe that accelerating our productization lifecycle unintentionally causes our global computational environment to exhibit O(n) complexity: infrastructure and operational burden that grow with every deployed model. Intuitively, we understand this is an impedance mismatch between our problem domain and standard cloud best practices. Upon observing this trend, we set out to instead achieve O(1) complexity: a fixed set of systems whose footprint stays constant no matter how many models we deploy.

Predictable Innovation Through Capabilities

Inventing digital agriculture has repeatedly taught us: problems that look different on the surface are often actually the same when viewed through a different lens. For example, consider seeds and fertilizers. From an agronomic perspective, they could not be more different. Seeding is about the right hybrid, quantity, and placement of seed into soil. In contrast, fertilization drives chemical processes and effective application to soil is shaped by wind and weather. Both are remarkably complex scientifically and have been studied intensely for decades. Most importantly, both are cornerstones of our future fully integrated agricultural solutions, as discussed in previous articles.

Yet, if we step back and think abstractly, we see them both as one and the same. For example, both are physical things applied to a field. Both contribute to variability in the quantity and quality of crop we harvest from the field. Both are measured on a cost per unit basis, such as dollar per acre in the United States. Both are evaluated on a variable profit basis, such as dollar value in crop yield increase less dollar cost of input.
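The shared economics can be written as a single formula. The sketch below is illustrative only; the dollar and bushel figures are made-up numbers, not Climate data:

```python
def variable_profit(yield_gain_bu_per_acre: float,
                    crop_price_per_bu: float,
                    input_cost_per_acre: float) -> float:
    """Variable profit = dollar value of the yield increase minus the input cost."""
    return yield_gain_bu_per_acre * crop_price_per_bu - input_cost_per_acre

# The same formula evaluates either input (hypothetical numbers):
seed_profit = variable_profit(8.0, 3.50, 20.0)         # a hybrid seed upgrade
fertilizer_profit = variable_profit(12.0, 3.50, 35.0)  # a nitrogen application
```

Because seeds and fertilizers share this evaluation structure, code that ranks one kind of input by variable profit can rank the other unchanged.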

In software speak, both seeds and fertilizers are concrete instances of an agronomic input that is associated with a virtual agricultural field for a growing season. We model both seeds and fertilizers as data layers in a multidimensional virtual field, illustrated in the adjacent graphic. If you are familiar with inheritance in object-oriented programming or pattern-driven software architecture, this is the same intuition. We apply this thinking to find duplication in our macro architecture. As we uncover these redundancies, we abstract up and replace the various redundant systems, each solving a specific problem, with a single generic system.
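That object-oriented intuition can be made concrete in a few lines. This is a minimal sketch, not our production data model; the class and layer names are assumptions for illustration:

```python
from abc import ABC, abstractmethod
from typing import Dict


class AgronomicInput(ABC):
    """Anything physically applied to a field: seed, fertilizer, and so on."""

    def __init__(self, name: str, cost_per_acre: float):
        self.name = name
        self.cost_per_acre = cost_per_acre

    @abstractmethod
    def layer_name(self) -> str:
        """Name of this input's data layer in the virtual field."""


class Seed(AgronomicInput):
    def layer_name(self) -> str:
        return f"seeding/{self.name}"


class Fertilizer(AgronomicInput):
    def layer_name(self) -> str:
        return f"fertility/{self.name}"


class VirtualField:
    """A virtual field holds one data layer per applied input."""

    def __init__(self, field_id: str):
        self.field_id = field_id
        self.layers: Dict[str, AgronomicInput] = {}

    def apply(self, agronomic_input: AgronomicInput) -> None:
        self.layers[agronomic_input.layer_name()] = agronomic_input


field = VirtualField("field-1")
field.apply(Seed("hybrid-A", 110.0))
field.apply(Fertilizer("nitrogen-blend", 45.0))
```

Generic code that iterates over `field.layers` never needs to know which concrete inputs it is handling.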

We use the term “capability” for these generic systems that solve many problems. Each capability has a generic API that can be invoked from both our FieldView products and systems across the larger enterprise. We build these capabilities from combinations of horizontally-scalable cloud primitives, ranging from Spark clusters to optimization solvers to macro service endpoints.

Model Execution with Capabilities

Solving many different agronomic problems with a few capabilities enables us to tackle higher-order problems. For example, every season growers must make crucial decisions about seeds and fertilizer for their fields. First, growers must decide what seed to plant on a field. Second, they must decide what fertilizer to apply at what times of the season. In machine learning speak, both problems must translate hundreds of features about a field and agronomic inputs into recommendations for what products to buy and how to use them in the field. These recommendations fit into the larger strategic context of fully integrated agriculture solutions that guide decision making via digital tools.

While the science of seed and fertilization is different, the key engineering insight is that the production code executing recommendation models need not be duplicated for every model or require a hand-crafted software stack. On the contrary, we have learned that we can build a single generic recommendation capability that solves many recommendation problems. For example, a single generic recommendation capability can execute models for both seed hybrids and fertilizer applications, rather than needing a dedicated stack of code for each.
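The essence of that insight fits in a short sketch. This is a toy in-process version, assuming hypothetical model functions and feature names; the point is that one execution path serves two very different agronomic problems:

```python
from typing import Any, Callable, Dict

FeatureVector = Dict[str, Any]
Model = Callable[[FeatureVector], dict]


class RecommendationCapability:
    """One production stack that can execute any registered model."""

    def __init__(self) -> None:
        self._models: Dict[str, Model] = {}

    def register(self, model_id: str, model: Model) -> None:
        self._models[model_id] = model

    def recommend(self, model_id: str, features: FeatureVector) -> dict:
        # The capability knows nothing about seeds or fertilizers;
        # all domain logic lives inside the model itself.
        return self._models[model_id](features)


# Hypothetical submodels: domain logic stays in the model functions.
def seed_model(features: FeatureVector) -> dict:
    return {"product": "hybrid-A" if features["soil_cec"] > 15 else "hybrid-B"}


def fertilizer_model(features: FeatureVector) -> dict:
    return {"rate_lbs_n_per_acre": 1.1 * features["yield_goal_bu"]}


capability = RecommendationCapability()
capability.register("seed-rec", seed_model)
capability.register("fert-rec", fertilizer_model)
```

Adding a third recommendation problem means registering one more model function, not standing up another service stack.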

Our recommendation capability is generic, as it decouples agronomic model code from the production code stack that executes it. The capability can thus execute any model, provided its computational learning type is supported. By decoupling models from their underlying execution, we can organize all our models into model repositories. Organizing models in repositories enables us to solve otherwise difficult problems declaratively, such as versioning and cohort experimentation. Use of identifiers in the model repository provides runtime indirection and enables clients to invoke models without assumptions about each model’s type or runtime. In doing so, we are increasingly thinking about models in terms of metadata and beginning to apply ideas that originated from traditional metadata management.
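A minimal repository sketch shows how identifiers provide that indirection. The record fields and the artifact URI scheme here are assumptions for illustration, not our production schema:

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ModelRecord:
    """Metadata describing a model; execution details stay out of client code."""
    model_id: str
    version: str
    runtime: str       # e.g. "python", "r", "spark" (hypothetical labels)
    artifact_uri: str  # where the serialized model artifact lives


class ModelRepository:
    def __init__(self) -> None:
        self._records: Dict[Tuple[str, str], ModelRecord] = {}
        self._latest: Dict[str, str] = {}

    def publish(self, record: ModelRecord) -> None:
        self._records[(record.model_id, record.version)] = record
        self._latest[record.model_id] = record.version

    def resolve(self, model_id: str, version: Optional[str] = None) -> ModelRecord:
        """Clients pass only an identifier; the version defaults to latest,
        so promoting a new model version is a repository change, not a code push."""
        version = version or self._latest[model_id]
        return self._records[(model_id, version)]


repo = ModelRepository()
repo.publish(ModelRecord("seed-rec", "1.0.0", "python", "store://models/seed-rec/1.0.0"))
repo.publish(ModelRecord("seed-rec", "1.1.0", "python", "store://models/seed-rec/1.1.0"))
```

Pinning a cohort of users to `"1.0.0"` while defaulting everyone else to latest is then a pure metadata decision, which is what makes cohort experimentation declarative.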

So, let’s recap our journey so far, with seeds and fertilizers as our exemplar:

  1. We began thinking about seeds and fertilizers as distinct agronomic things
  2. We abstracted seeds and fertilizers as two concrete instances of generic agronomic input
  3. We abstracted each generic input into an individual layer within a multidimensional virtual field
  4. We abstracted and decoupled agronomic recommendations for seeds and fertilizers into two distinct parts: a single generic recommendation capability and many concrete models that are executed by that capability
  5. We abstracted models and organized them declaratively in a generic model repository

This chain of thinking exemplifies how we are transforming our production software from hand-crafted stacks to reusable capabilities. Doing so enables us to build our recommendation capability to support challenging technical requirements beyond the usual suspects from the consumer Internet world:

  • Spatio-temporal: execution and data environments must be spatio-temporal-aware as agronomic fields are virtually represented as high resolution, multidimensional planes with time series sampling both within and across seasons
  • Hierarchical model graphs: model execution must support hierarchical, directed acyclic graphs (DAGs) of submodels capable of reflecting the complex natural variation arising on fields across the globe
  • Heterogeneous models: integrated models increasingly must combine multiple heterogeneous computational classes of learning and prediction algorithms, including machine learning, statistical, physical-process, and deep learning models
  • Large model universe: execution must support a large and rapidly growing number of distinct submodels to accommodate our expansion across agronomic domains
  • Verifiable high quality: growers use these recommendations for financial decision making, thus verifiable high production quality is imperative
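The hierarchical model graph requirement above can be sketched in a few lines. The submodels, features, and thresholds below are hypothetical, and a real deployment coordinates the DAG with distributed workflow rather than an in-process loop:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+


def run_model_dag(dag, nodes, field_features):
    """Execute a hierarchical DAG of submodels in dependency order.

    dag maps each model_id to the set of its upstream model_ids; each
    submodel receives the raw field features plus its upstream outputs.
    """
    outputs = {}
    for model_id in TopologicalSorter(dag).static_order():
        upstream = {dep: outputs[dep] for dep in dag.get(model_id, set())}
        outputs[model_id] = nodes[model_id](field_features, upstream)
    return outputs


# Hypothetical submodels feeding a seed recommendation:
nodes = {
    "soil":    lambda f, up: {"water_capacity": f["clay_pct"] * 0.4},
    "weather": lambda f, up: {"gdd": f["gdd_forecast"]},
    "seed":    lambda f, up: {
        "hybrid": "A" if up["soil"]["water_capacity"] > 10 else "B"
    },
}
dag = {"soil": set(), "weather": set(), "seed": {"soil", "weather"}}
```

Because the executor only sees identifiers and edges, the same loop runs a two-node graph or a two-hundred-node graph reflecting field-by-field variation.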

Let’s now dig into technical implementation details, as illustrated in the adjacent graphic. The recommendation capability is composed of multiple model-independent subsystems, each built using cloud primitives. The model repository provides the bridge between source assets in git and the production environment. Multiple execution environments support each computational model category, using a combination of EMR and commercial systems. Step Functions implement distributed workflows to coordinate hierarchical model DAGs. SQS is used for asynchronous execution synchronization and distributed fault tolerance. Use case-specific business logic is isolated in services exclusively within the serve tier, using ECS. Operational instrumentation, monitoring, and alerting are implemented across the subsystems for unified model execution telemetry. Finally, code-driven devops provides automated configuration and compliance with security requirements. The capability provides a unified API that isolates callers from evolution in the underlying cloud primitives used for implementation.
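As a toy illustration of the queue-based fault tolerance idea (not our production coordination logic), the sketch below uses a standard-library queue to mimic SQS-style retry semantics, where a failed model execution is returned to the queue and only parked after repeated failures:

```python
import queue


def process_with_retries(tasks, execute, max_attempts=3):
    """At-least-once processing sketch: failed executions are requeued
    and retried; tasks that keep failing land in a dead-letter list."""
    work = queue.Queue()
    for task in tasks:
        work.put((task, 1))

    results, dead_letter = {}, []
    while not work.empty():
        task, attempt = work.get()
        try:
            results[task["model_id"]] = execute(task)
        except Exception:
            if attempt < max_attempts:
                work.put((task, attempt + 1))  # back onto the queue
            else:
                dead_letter.append(task)       # park for inspection
    return results, dead_letter
```

In the real system SQS provides the equivalent semantics (visibility timeouts and dead-letter queues) across many workers instead of one loop.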

Benefits of Capabilities

Our transition from hand-crafted stacks to generic capabilities is bringing numerous engineering benefits:

  • Code minimization: progressively minimize the incremental new lines of code necessary to deliver each new model
  • Model decoupling: deploy new models only via changes in configuration and model inner loop code, rather than pushing a constellation of services
  • Dynamic model assembly: compose, condition, and ensemble models at scale impossible with hand-crafted code due to combinatorial permutations
  • Workload tuning: operate, scale, and optimize model workloads based on their dynamic runtime profiles, instead of trying to do so prematurely via static code organization
  • Operational efficiency: operate, monitor, and scale a few highly-scalable capabilities rather than many hundreds of cloud primitives
  • Model test infrastructure: build test infrastructure once and reuse across models, such as: model tool chains, test harnesses, test data randomization, and model verification frameworks
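The last benefit above, build-once test infrastructure, can be sketched as a model-agnostic verification harness. The harness and golden-case format here are assumptions for illustration, not our actual frameworks:

```python
def verify_model(model, golden_cases, tolerance=1e-6):
    """Run any model against golden input/output pairs, collecting
    mismatches rather than failing fast, so one report covers all cases."""
    failures = []
    for features, expected in golden_cases:
        actual = model(features)
        for key, want in expected.items():
            got = actual.get(key)
            ok = got is not None and (
                abs(got - want) <= tolerance
                if isinstance(want, float) else got == want
            )
            if not ok:
                failures.append((features, key, want, got))
    return failures


# Any model with the same call shape reuses the harness unchanged:
def toy_fertilizer_model(features):
    return {"rate": 1.1 * features["yield_goal"]}


golden = [({"yield_goal": 100}, {"rate": 110.0})]
```

Because the harness depends only on the model's call signature, the same code verifies a seed model, a fertilizer model, or any future input's model.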

As complex science-driven production systems require teamwork amongst scientists and engineers, constantly reducing friction from ideation to production deployment is key. Our desired workflow is parallel iteration by both scientists and engineers, jointly innovating within their respective domains. Doing so requires decoupling models from their code execution environment, eliminating code rewrites and throw-over-the-wall workflow.

From these technical and teamwork benefits come business benefits. We can more predictably improve digital farming every single growing season, fulfilling the heartbeat of our brand promise to growers. Our progress towards full-loop autonomous equipment and automated farm operations is accelerating. We are focused on scaling a handful of capabilities to support centimeter-scale modeling, as necessary to ingest and process the next generation of IoT and sensor networks.

Joint innovation at the intersection of science and software is how our productization lifecycle will become limited only by the organic pace of plant growth. Our capabilities and global data footprint enable us to build systems and products that validate our customers’ confidence that Climate is the global standard for digital agriculture. We are excited to be tackling cutting-edge challenges in scaling model-driven production systems and encourage you to reach out if you are interested in collaborating or learning more about our progress.

Avery Moon is a Senior Director of Engineering at Climate. He previously held senior engineering and research leadership roles at Wealthfront, LinkedIn, RSA, and two venture-backed companies. He graduated summa cum laude with degrees in Industrial / Computational Mathematics, Economics, and Entrepreneurship from the Eller College of Management at University of Arizona.

*Cohen, Steve A., and Granade, Matthew W. (2018). “Models Will Run the World.” Wall Street Journal.
