GAMS Bots

Overview

A "difficult"-level AI player (bot) is coded into the game model, for use in single player and optionally multiplayer. When a player joins a single player room, they have the option of choosing the number of bots to play against. In the game management screen, in addition to regular game parameters, the player can also set the objective of each bot opponent.

The AI bots use the GAMS optimization software to choose their planting and (if available) management decisions each round. However, instead of simply maximizing their score for the current round and disregarding future progress, the bots have built-in logic to optimize a weighted average of overall score and field health over a five-year period.

GAMS Model

The bots make their decisions each round by forming a mixed-integer nonlinear programming problem (MINLP) out of the Fields of Fuel's underlying mathematical model, and using the CPLEX solver within the GAMS environment to find the optimum point. Ultimately, the GAMS model makes three decisions that it passes back to the FoF game: what crop to plant on each field, and, if applicable, which fields to till and which to fertilize. The current GAMS implementation of the Java model uses 79 blocks of equations and 53 blocks of variables (resulting in 522 equations and 398 variables, 80 of which are binary), and has a solve time of less than a second. Two primary variables in the model are referenced below: z1(field,crop) is a binary variable indicating that a specific crop was planted on the given field, and z2(field,mgnt) is a binary variable indicating that a specific management decision was made on the given field. For example, z1('f1','corn') = 1 and z2('f1','till') = 1 indicates the decision to plant corn on field 1 and till it.

The objective function of the MINLP to be maximized is the farm overall sustainability score: Sustainability_Score = Economic_Score * Economic_Weight + Energy_Score * Energy_Weight + Environment_Score * Environment_Weight.

By default, the weight on each score is equal to the weights defined in the game for determining sustainability score (which is 1/3 each, by default). However, the user has the ability to modify the weighting on the bots, effectively determining which strategy the bots will implement. For example, the user may choose to put 100% of the weight into economic_weight in order to observe a strategy for maximizing profit with no concern towards energy production or environmental impact.
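As a rough illustration of how the weights steer the objective, the following Python sketch evaluates the weighted sum for a balanced bot and a profit-only bot. The class name, function name, and sample scores are hypothetical and chosen only for this example; the real objective lives inside the GAMS model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """Objective weights; the 1/3 defaults mirror the game's default weighting."""
    economic: float = 1 / 3
    energy: float = 1 / 3
    environment: float = 1 / 3


def sustainability_score(economic_score, energy_score, environment_score,
                         weights=Weights()):
    """Weighted sum of the three sub-scores, as in the objective function."""
    return (economic_score * weights.economic
            + energy_score * weights.energy
            + environment_score * weights.environment)


# Hypothetical sub-scores for one candidate set of decisions.
balanced = sustainability_score(0.9, 0.3, 0.6)
# All weight on the economic score: energy and environment are ignored.
profit_only = sustainability_score(0.9, 0.3, 0.6, Weights(1.0, 0.0, 0.0))
```

With all of the weight on the economic score, two candidate decisions are ranked purely by their economic sub-score, which is what produces the profit-maximizing strategy described above.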

The GAMS model is divided into nine different sections, each briefly explained below:

  1. Data Input:
    • Due to the inherent cyclical nature of farming, many of this year's decisions are dependent on parameters from the previous year.
    • Specifically, the model is passed the previous year's planting decisions, field organic nitrogen levels, and developed field root biomass, as well as the current year's crop prices.
    • Additionally, in order to estimate the maximum capital for this round (See Heuristics section), the bot's capital and economic score from the previous round are passed.
  2. Heuristics
    • In order to maintain fairness, the bot does not receive any more information than is available to a human player. However, this causes two problems within the model:
      1. The economic score is calculated by dividing the bot's capital by the capital of the player with the most capital (unknown).
      2. The BCI sub-score is dependent on the 'PGrass' parameter, which is the percentage of neighboring fields that plant grass (unknown).
    • As a solution, we used a heuristic process to estimate both of these parameters.
      1. Max Capital:
        • Using the GAMS model to optimize total field health over a 25 year period (See "Identifying Boundaries" section), we found the maximum possible achievable capital increase in one round to be about $45,000.
        • However, using the starting values for field health, the maximum possible capital increases for the first three rounds are $17k, $28k, and $32k.
        • Therefore, to estimate this round's maximum capital, we assume that at least one player is generally maximizing their economic potential, but isn't necessarily able to achieve the theoretical maximum due to limited field health.
        • The resulting heuristic for the round's maximum capital is then: maxCapital = (capital_previous/EcScore_previous) + $40,000.
      2. PGrass:
        • PGrass is a parameter between 0 and 1 that represents the percentage of surrounding fields that have grass planted. However, the bot has no indication of what the surrounding players are planting.
        • The bot does, however, know what it chooses to plant. Since some of the bot's fields are neighbored by the bot's other fields, the PGrass parameter is partially dependent on its own planting decisions.
          • Therefore, we add (0.5/N) to pGrass for every field that the bot decides to plant grass on [N is the total number of fields per farm].
        • Additionally, it is reasonable to assume that the percentage of neighbors who plant grass is partially dependent on the current grass prices. The higher the price, the more likely players are to plant grass.
          • Therefore, we add 0.125 to pGrass if the price of grass is over $50, 0.25 if the price is over $100, 0.375 if the price is over $125, and 0.5 if the price is over $150.
  3. Overarching Model:
    • Probably the simplest component of the GAMS model, this section simply defines the objective equation and adds the constraint that only one crop can be planted per field.
    • As mentioned above, the weights on each scoring component in the objective equation may be defined by the user.
  4. Indicator Variables:
    • In order to maintain linearity in the MIP, we utilized a series of 'tricks' to represent the product of multiple binary variables as a single indicator variable:
      • d_est = (1 - z1_previous) * z1. This binary variable indicates that a perennial plant is being 'established' this year (planted for the first time). Note that corn must always be established, as it is not perennial.
      • d_cont = 1 - d_est. This binary variable indicates that a perennial plant is being continued from the previous year, and does not need to be replanted.
      • d_t = z1(field,crop)*z2(field,'till'). This binary variable indicates the decision to both plant a certain crop on the field, and till that crop.
      • d_f = z1(field,crop)*z2(field,'fert'). This binary variable indicates the decision to both plant a certain crop on the field, and fertilize that crop.
  5. Yield/Nitrogen Model:
    • This section of the GAMS model contains the entire nitrogen-based soil model on which the FoF game is founded.
    • Due to the differing growth nature of not only perennials versus non-perennials, but also the unique nature of switchgrass, there are numerous complicated equations within this section.
      • See Nitrogen Model for detailed explanation of the soil fertility model.
  6. Economy Model:
    • This section calculates total farm revenue and costs based on the planting and management decisions made in order to calculate the change in capital.
    • The Max Capital parameter estimated in the heuristics sections is utilized here to calculate the bot's economic score for this round.
  7. Energy Model:
    • This section calculates the net energy gains/losses based on the planting and management decisions made.
    • The energy score is calculated by determining where the bot's farm's net energy falls in the range between maximum and minimum possible energy outputs.
      • The GAMS model was actually used to determine the range of possible energy outputs used for scoring within the FoF game. See the "Identifying Boundaries" section for details.
  8. Environment Model:
    • This section is divided into four separate submodels: Soil, Water, BCI, and Emissions. The total environmental score is the average of these four subscores.
    • The soil and water subscores are both calculated by comparing the bot's soil/water values on a range of possible values.
      • See "Identifying Boundaries" section for details on determining bounds of possible scores.
    • The BCI subscore utilizes the PGrass parameter estimated in the heuristics section in conjunction with the bot's planting decisions to calculate the farm's Bio-control index (natural pest suppression).
    • The emissions section calculates the total emissions generated (possibly negative) by comparing the emissions generated in farming versus the emissions sequestered by the crops. The score is determined by location on a range of possible values.
      • Similar to the soil and water subscores, the range of possible emissions values is determined in the "Identifying Boundaries" section.
  9. Model Solve
    • The final section of the GAMS model contains the actual solve command, as well as the ultimate logic behind the bot's decisions.
    • Additionally, the FoF Boolean control of management decisions is implemented in GAMS as an upper bound on the z2 binary variable. If management decisions are turned off, z2.up is set to 0, preventing the bot from tilling or fertilizing.
    • See the "Multi-Year Outlook" section for a detailed explanation behind the bot's logic to balance immediate rewards with long-term field health.
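The two heuristics from section 2 above can be sketched in Python as follows. This is a minimal illustration, not the GAMS code: the function and parameter names are hypothetical, the guard against a non-positive economic score is an added assumption, and the price tiers are assumed to be mutually exclusive (only the highest applicable tier is added), which keeps pGrass between 0 and 1.

```python
def estimate_max_capital(capital_previous, ec_score_previous,
                         expected_gain=40_000.0):
    """maxCapital = (capital_previous / EcScore_previous) + $40,000.

    Since the economic score is capital divided by the leading capital,
    capital_previous / ec_score_previous recovers last round's leader.
    """
    if ec_score_previous <= 0:
        # Assumption: the source does not say how a zero score is handled.
        raise ValueError("previous economic score must be positive")
    return capital_previous / ec_score_previous + expected_gain


def estimate_p_grass(own_grass_fields, fields_per_farm, grass_price):
    """Estimate the share of neighboring fields planted with grass."""
    # 0.5/N for each of the bot's own fields planted with grass.
    own_term = 0.5 * own_grass_fields / fields_per_farm
    # Price-driven term; assumed tiered, so only the highest tier applies.
    if grass_price > 150:
        price_term = 0.5
    elif grass_price > 125:
        price_term = 0.375
    elif grass_price > 100:
        price_term = 0.25
    elif grass_price > 50:
        price_term = 0.125
    else:
        price_term = 0.0
    return own_term + price_term
```

For example, a bot that held $50,000 with an economic score of 0.5 last round infers a leading capital of $100,000 and estimates this round's maximum at $140,000.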

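The indicator variables in section 4 above are products of binary quantities. The text does not spell out which 'tricks' were used, so the sketch below shows the standard textbook linearization as an assumption: a product d = z1 * z2 of two binary variables can be replaced by the linear constraints d <= z1, d <= z2, and d >= z1 + z2 - 1. The Python snippet simply enumerates all binary combinations to confirm that these constraints admit exactly one value of d, equal to the product. The function name is hypothetical.

```python
from itertools import product


def feasible_indicator_values(z1, z2):
    """Binary values of d allowed by d <= z1, d <= z2, d >= z1 + z2 - 1."""
    return [d for d in (0, 1)
            if d <= z1 and d <= z2 and d >= z1 + z2 - 1]


# For every binary (z1, z2), the only feasible d is the product z1 * z2.
results = {(z1, z2): feasible_indicator_values(z1, z2)
           for z1, z2 in product((0, 1), repeat=2)}
```

Under this scheme d_t and d_f would each need three linear constraints per (field, crop) pair. An expression such as d_est = (1 - z1_previous) * z1 needs no such treatment if z1_previous is passed in as data, because it is then already linear in z1.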
Multi-Year Outlook

As mentioned above, attempting to implement a multi-year time period model was very impractical in this case. Doing so would require implementing a multi-year mixed-integer nonlinear programming (MINLP) model. However, the current model already contains 398 variables, which increases exponentially for every time period added; for example, extending the problem to include just three time periods would increase our number of variables to 398^3 = 63,044,792. Furthermore, guaranteeing convergence to a global optimum would require the use of a global NLP solver such as lindoglobal or baron. However, these solvers take far longer to solve (if they converge at all), and this, coupled with the exponentially increasing problem size, results in an impractically large solve time. Because the GAMS model is being used in a video game environment, it is crucial that the bot logic converges at least as fast as a human player. Otherwise, players of the game would be forced to wait on the bot at every stage of the game, ruining the experience and discouraging players.

Conversely, in order to truly incorporate the game model into the GAMS model, it is still important to consider multiple-year effects. If we didn't, the bots would never choose to plant switchgrass, since it doesn't provide any immediate harvestable yield.

Therefore, instead of using a time-based model, we implemented multiple-solve loops in order to incorporate time-sensitive logic and attempt to identify the optimal future path for bots to follow. The procedure for each solve loop is as follows:

  1. Solve the MIP, determining the best decisions to make for this current planting year.
  2. Calculate all resulting scores and field values.
  3. Update all "previous" parameters using the results from the previous solve.
  4. Re-estimate the heuristically determined parameters described above.
  5. Estimate the change in crop prices using a supply/demand approximation, assuming the bot is the only player in the game.
  6. Return to step 1), and repeat T times.
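The steps above can be sketched as a plain loop. The Python skeleton below is only an outline of the control flow: the four callables stand in for the GAMS solve and parameter updates, and their names, signatures, and the toy stand-ins used in the demo are all hypothetical.

```python
def run_solve_loop(state, solve_year, apply_decision,
                   update_heuristics, update_prices, horizon=5):
    """Repeat the single-year solve `horizon` (T) times and record the path."""
    decisions, yearly_scores = [], []
    for _ in range(horizon):
        decision = solve_year(state)                     # step 1
        state, score = apply_decision(state, decision)   # steps 2 and 3
        state = update_heuristics(state)                 # step 4
        state = update_prices(state, decision)           # step 5
        decisions.append(decision)
        yearly_scores.append(score)
    return decisions, yearly_scores, state               # step 6: loop ends


# Toy stand-ins (not the real model): corn pays now but drains nitrogen,
# grass pays little but rebuilds it.
def toy_solve(state):
    return "corn" if state["nitrogen"] >= 2 else "grass"


def toy_apply(state, decision):
    delta = -1 if decision == "corn" else 1
    score = 1.0 if decision == "corn" else 0.2
    return {**state, "nitrogen": state["nitrogen"] + delta}, score


decisions, yearly_scores, final_state = run_solve_loop(
    {"nitrogen": 3}, toy_solve, toy_apply,
    update_heuristics=lambda s: s,
    update_prices=lambda s, d: s,
    horizon=5,
)
```

Each pass is still a single-year decision; only the carried-over state links one year to the next, which is what keeps every individual solve small.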

Even though the bot is making an un-forecasted decision (concern only for the current round) within each iteration of the loop, repeating the solve T times gives an impression of the end result of the chosen path. We then utilize this loop twice. For the first loop, we assume that there are no perennial effects in the crops (all crops see immediate yield), and for the second loop all perennial effects are accounted for. Doing so allows us to compare two different rational paths: one in which we attempt to determine the future potential profit that can be realized by investing in perennials now at the sacrifice of immediate results, and the second in which we attempt to identify the long-term results of attempting to maximize immediate results every year (which could potentially destroy the environment, limiting the possibility for future returns).

To compare the paths, each path is assigned an overall score based on a weighted average of total sustainability score attained (70% of the total score), total built up organic nitrogen in the soil (15%), and total root-biomass established (15%). Additionally, because the first path does not account for the first-year perennial losses, the first path's overall score is penalized by a percentage, depending on the length of outlook (T). Doing so ensures that the first path is only chosen if there is a definite potential for greater long term results. Finally, the optimal path chosen is the path with the highest adjusted overall average. The planting decisions passed back to the FoF game are the decisions made in the first year of the chosen loop.
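The path comparison can be illustrated with a short Python sketch. It assumes that the three path totals have already been normalized to a common 0 to 1 scale, which the text does not specify. The penalty value is a free parameter here because the actual percentage is only described as depending on T, and ties are resolved in favor of the second path as an arbitrary choice. All names and sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathResult:
    first_year_decision: str   # what would be passed back to the game
    sustainability: float      # total sustainability score (normalized)
    organic_nitrogen: float    # total organic nitrogen built up (normalized)
    root_biomass: float        # total root biomass established (normalized)


def path_score(path, w_sustain=0.70, w_nitrogen=0.15, w_root=0.15):
    """Weighted average used to rate a path: 70% / 15% / 15%."""
    return (w_sustain * path.sustainability
            + w_nitrogen * path.organic_nitrogen
            + w_root * path.root_biomass)


def choose_path(no_perennial_effects, with_perennial_effects, penalty):
    """Penalize the first path, then keep the higher adjusted score."""
    adjusted_first = path_score(no_perennial_effects) * (1.0 - penalty)
    second = path_score(with_perennial_effects)
    if adjusted_first > second:
        return no_perennial_effects
    return with_perennial_effects


first = PathResult("grass", 0.80, 0.90, 0.90)    # raw score 0.83
second = PathResult("corn", 0.75, 0.60, 0.50)    # raw score 0.69
mild = choose_path(first, second, penalty=0.10)   # 0.747 vs 0.69
harsh = choose_path(first, second, penalty=0.20)  # 0.664 vs 0.69
```

In this example the first path wins under a 10% penalty but loses under a 20% penalty, showing how the penalty sets the margin by which the perennial investment must outperform before it is chosen.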

The path chosen by the GAMS model is highly sensitive to both the choice of weights on the components of path score and the choice of T (number of years to loop over). Increasing the weight on sustainability score increases the bot's tendency to make more immediately beneficial decisions, since it focuses more on its score within the next T years than it does on preserving field health values for the years beyond the current scope. Conversely, increasing the weight on organic nitrogen does just the opposite: it causes the bot to focus more on building up field health at the cost of its in-game sustainability score. The effect of changing the root-biomass weight is difficult to describe, because it is highly dependent on the current game state and can affect the path choice in either direction. Finally, larger values for T tend to cause the bot to choose the first path more often, since the weight of the penalty is minimized over a longer time period and a longer time period gives the bot more time to recognize benefits of continuing perennials. However, excessively large values of T also tend to cause the bots to consistently under-perform in a normal game length (<25 years) because it takes long periods of time to benefit from the immediate investments. Additionally, using larger T values causes the bot to be more often negatively impacted by the changing in-game dynamics, since the changing crop prices and global sustainability parameters cause the bot to see different results than predicted.

Currently, the bots loop over a time period of T = 5 years, which was chosen purely based on experimental results. As demonstrated above in the "Performance" section, the current values generally result in decent bot performance in a 25-year game, but success is limited to certain scenarios. All current values are subject to change with additional testing.