add cloud simulation scripts and docs

2025-04-04 15:55:23 +00:00 · 2019-09-25 12:24:15 -04:00 · 2019-09-25 12:24:15 -04:00 · adbe817eb4
commit adbe817eb4
parent 3871e5a84b
3 changed files with 124 additions and 0 deletions
--- a/simulations/Dockerfile
+++ b/simulations/Dockerfile
@ -0,0 +1,38 @@
+FROM golang:alpine AS build-env
+
+# Set up dependencies
+# bash for debugging
+# git, make for installation
+# libc-dev, gcc, linux-headers, eudev-dev are used for cgo and ledger installation (possibly)
+RUN apk add bash git make libc-dev gcc linux-headers eudev-dev jq
+
+# Install aws cli
+RUN apk add python py-pip
+RUN pip install awscli
+
+# Set working directory for the build
+WORKDIR /root/kava
+# default home directory is /root
+
+COPY go.mod .
+COPY go.sum .
+
+RUN go mod download
+
+# Add source files
+COPY .git .git
+COPY app app
+COPY cli_test cli_test
+COPY cmd cmd
+COPY app app
+COPY Makefile .
+
+# Install kvd, kvcli
+ENV LEDGER_ENABLED False
+RUN make install
+
+# Copy in the simulation script last so that script changes don't invalidate the cached build layers
+COPY simulations simulations
+
+# Run kvd by default, omit entrypoint to ease using container with kvcli
+CMD ["kvd"]
--- a/simulations/README.md
+++ b/simulations/README.md
@ -0,0 +1,43 @@
+# How To Run Sims In The Cloud
+
+Sims run on AWS Batch, with results uploaded to S3.
+
+## AWS Batch
+
+In AWS Batch you define:
+
+- a "compute environment"--just how many machines you want (and of what kind)
+- a "job queue"--just a place to put jobs (pairs them with a compute environment)
+- a "job definition"--a template for jobs
+
+Then to run stuff you create "jobs" and submit them to a job queue.
+
+The number of machines running auto-scales to match the number of jobs. When there are no jobs there are no machines, so you don't pay for anything.
+
+Jobs are defined as a docker image (assumed hosted on dockerhub) and a command string.  
+>e.g. `kava/kava-sim:version1`, `go test ./app`
+
+This can run sims but doesn't collect the results. Collecting them is handled by a custom script.
+
+## Running sims and uploading to S3
+
+The Dockerfile in this repo defines the docker image used to run sims. It's just the normal app build, plus the AWS CLI and the custom script.
+
+The custom script reads some input args, runs a sim, and uploads the combined stdout and stderr to an S3 bucket.
+
+AWS Batch allows for "array jobs", which are a way of specifying many duplicates of a job, each with a different index passed in as an env var (`AWS_BATCH_JOB_ARRAY_INDEX`).
+
+### Steps
+
+- create and submit a new array job (based off the job definition) with
+  - image `kava/kava-sim:<some-version>`
+  - command `run-then-upload.sh <starting-seed> <num-blocks> <block-size>`
+  - array size of how many sims you want to run
+- any change to the code or script requires rebuilding and pushing the image:
+  - `docker build -f simulations/Dockerfile -t kava/kava-sim:<some-version> .`
+  - `docker push kava/kava-sim:<some-version>`
+
+### Tips
+
+- click the compute environment name to get details, then click the ECS Cluster Name link to see the actual machines running
+- for array jobs, click the job name to get details of the individual jobs
--- a/simulations/run-then-upload.sh
+++ b/simulations/run-then-upload.sh
@ -0,0 +1,43 @@
+#!/bin/bash
+
+# This requires the AWS access key env vars to be set (i.e. AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY).
+# These need to be generated from the AWS console.
+
+# For commands passed to the docker container, the working directory is /root/kava (which is the blockchain git repo).
+
+
+# Parse Input Args
+# get seed
+startingSeed=$1
+# compute the seed from the starting and the job index
+# add two nums together, hence the $(()), and use 0 as the default value for array index, hence the ${:-} syntax
+seed=$(($startingSeed+${AWS_BATCH_JOB_ARRAY_INDEX:-0}))
+echo "seed: $seed"
+# get sim parameters
+numBlocks=$2
+blockSize=$3
+
+
+# Run The Sim
+# redirect stdout and stderr to a file
+go test ./app -run TestFullAppSimulation -Enabled=true -NumBlocks=$numBlocks -BlockSize=$blockSize -Commit=true -Period=5 -Seed=$seed -v -timeout 24h > out.log 2>&1
+# get the exit code to determine how to upload results
+simExitStatus=$?
+if [ "$simExitStatus" -eq 0 ]; then
+   echo "simulation passed"
+   simResult="pass"
+else
+   echo "simulation failed"
+   simResult="fail"
+fi
+
+
+# Upload Sim Results To S3
+# read in the job id, using a default value if not set
+jobID=${AWS_BATCH_JOB_ID:-"testJobID:"}
+# job id format is "job-id:array-job-index", this removes the trailing colon (and array index if present) https://stackoverflow.com/questions/3045493/parse-string-with-bash-and-extract-number
+# [0-9] is used instead of \d because \d is not a portable digit class in sed
+jobID=$(echo "$jobID" | sed 's/\(.*\):[0-9]*$/\1/')
+
+# create the filename from the array job index (which won't be set if this is a normal job)
+fileName="out${AWS_BATCH_JOB_ARRAY_INDEX}.log"
+aws s3 cp out.log "s3://simulations-1/$jobID/$simResult/$fileName"
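
Note: the seed arithmetic and job ID parsing in `run-then-upload.sh` can be tried locally without AWS. The following is a minimal sketch and not part of the commit. The function names `compute_seed` and `strip_array_index` are hypothetical, and it assumes job IDs take the form `job-id:array-index` as described in the script's comments. It uses bash parameter expansion where the script uses `sed`.

```shell
#!/bin/bash

# Hypothetical helper: add the array index (default 0) to the starting seed,
# mirroring how the script combines $1 with AWS_BATCH_JOB_ARRAY_INDEX.
compute_seed() {
  local starting_seed=$1
  local index=${2:-0}
  echo $(( starting_seed + index ))
}

# Hypothetical helper: drop the last colon and anything after it.
# IDs without a colon are returned unchanged.
strip_array_index() {
  echo "${1%:*}"
}

compute_seed 100 3             # prints 103
strip_array_index "abc123:7"   # prints abc123
strip_array_index "testJobID:" # prints testJobID
```

With these semantics, every job in one array shares the same S3 prefix (the parent job ID) and differs only in its seed and log file name.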