Docs: update architecture and scheduling sections

Commit 1b5c6406 authored 8 years ago by Francesco Siddi
--- a/docs/docs/developer_docs/architecture.md
+++ b/docs/docs/developer_docs/architecture.md
 # System architecture
+
 The architecture of Flamenco is simple, hierarchical.

-We have one server and one or more managers which control one or more workers each.
+In a production setup we have one server and one or more managers which control one or 
+more workers each.

 ![Architecture diagram](img/architecture_diagram.svg)

@@ -10,52 +12,85 @@ develop/maintain different type of front-ends for it.

 Communication between components happens via HTTP. In particular, Server and Manager use a simple 
 and well-defined REST API. This allows third parties to easily implement their Manager, and 
-integrate it into Flamenco.
+integrate it into Flamenco. Flamenco ships with its own Free and Open Source Manager and Worker
+implementations.

-The system is designed with bottom-up communication in mind. For example:
+The whole system is designed with bottom-up communication in mind. For example:

 - Worker has a loop that sends requests to the Manager
 - Worker sends a request for a Task to the Manager
- Manager checks tasks availability with Server
- Server replies to Manager with available task
- Manager replies to Worker with task to execute
- Worker executes task
+- Manager checks Tasks availability with Server
+- Server replies to Manager with available Tasks
+- Manager replies to Worker with a Task to execute
+- Worker executes the Commands of a Task and reports progress to the Manager

-This allows us to have loops only at the worker level, and keep the overall infrastructure as 
-responsive and available a possible.
+By using update buffers on Manager and Worker components we can keep the overall 
+infrastructure as resilient, responsive and available as possible.
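
The bottom-up flow listed above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not Flamenco's actual code: the class names, the `available_tasks`, `request_task` and `report` methods, the batch size of 2 and the `move_to_output` Command are all hypothetical, and real components would talk over HTTP rather than through direct method calls.

```python
from collections import deque


class Server:
    """In-memory stand-in for the Server, which stores the Tasks."""

    def __init__(self, tasks):
        self.tasks = deque(tasks)

    def available_tasks(self, limit):
        # Hand out up to `limit` Tasks to the requesting Manager.
        batch = []
        while self.tasks and len(batch) < limit:
            batch.append(self.tasks.popleft())
        return batch


class Manager:
    """Mediates between the Server and its Workers (hypothetical API)."""

    def __init__(self, server):
        self.server = server
        self.queue = deque()
        self.updates = []  # buffered progress reports from Workers

    def request_task(self):
        # Only ask the Server when the local queue is empty.
        if not self.queue:
            self.queue.extend(self.server.available_tasks(limit=2))
        return self.queue.popleft() if self.queue else None

    def report(self, task_name, command, status):
        self.updates.append((task_name, command, status))


def worker_loop(manager):
    """Request Tasks until none are left, running each Command in order."""
    executed = []
    while True:
        task = manager.request_task()
        if task is None:
            break
        for command in task["commands"]:
            executed.append(command)  # a real Worker would run the Command here
            manager.report(task["name"], command, "done")
    return executed


server = Server([
    {"name": "frame-1", "commands": ["blender_render", "move_to_output"]},
    {"name": "frame-2", "commands": ["blender_render"]},
])
print(worker_loop(Manager(server)))
```

The only loop lives in the Worker; the Manager and Server simply answer requests, which matches the bottom-up design described in this section.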

 ## Server
+
 In a Flamenco network, there can be only one server. The functionality of the server consists in:

 - storing a list of Managers
 - storing Jobs
- generating and storing Tasks (starting from a job)
- dispatch task for a manager
+- generating and storing Tasks (starting from a Job)
+- dispatching Tasks to a Manager
 - serving entry points to inspect the status of:
-    + jobs
-    + tasks
-    + workers
+    + Jobs
+    + Tasks
+    + Workers
+
+The Server software is based on [Pillar](https://pillarframework.org/), the Free and Open Source
+CMS that provides agnostic user authentication, resource and project management. It requires:
+
+- Linux, macOS or Windows
+- Python 2.7 (soon to become Python 3)
+- MongoDB
+
+## Manager
+
+The goal of the Manager, as the name suggests, is to handle the workload provided by the Server
+and leverage the computing resources available (Workers) to complete the Tasks as soon as possible.
+
+Because the communication between Server and Manager happens via a semi-RESTful API, a Manager
+can be implemented in many ways. At Blender we implemented a Manager in Go, which is available
+as Free and Open Source software. It requires:
+
+- Linux, macOS or Windows
+- Golang
+- MongoDB
+
+## Worker
+
+The lowest-level component of the Flamenco infrastructure, a Worker is directly responsible for
+the execution of a Task, which is composed of an ordered series of Commands. Similarly to the
+Manager, Blender provides a Free and Open Source implementation, with the following requirements:
+
+- Linux, macOS or Windows
+- Python 3.5 or greater
+
+## Jobs, Tasks and Commands

-## Jobs, tasks and commands
 Flamenco is designed to handle several types of jobs, mostly serving computer animated film 
 production, for example:

 - 3D animation rendering
 - simulation baking
- large still image rendering
+- distributed still image rendering
 - video encoding

 A Job is the highest level structure, containing all the necessary information on how to process 
 the Job itself.
-In order to use the computing power of multiple machines, we split the Job into Tasks, according to
-the instructions provided. This process is called Job compilation.
+In order to distribute the execution of a Job, we split it into Tasks, according to the instructions 
+provided. This process is called *Job compilation*.
+
+A Task is essentially a list of Commands. Each Worker can claim one or more Tasks to execute,
+which means it will sequentially run all the Commands contained in them.
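
As a rough illustration of Job compilation, the sketch below splits a render Job into Tasks that each cover a chunk of frames. The Job fields (`frame_start`, `frame_end`, `chunk_size`), the Task layout and the `compile_job` function are assumptions made for this example; the actual Job and Task schemas are not described in this section.

```python
def compile_job(job):
    """Split a render Job into Tasks of at most `chunk_size` frames each."""
    frames = list(range(job["frame_start"], job["frame_end"] + 1))
    size = job["chunk_size"]
    tasks = []
    for i in range(0, len(frames), size):
        chunk = frames[i:i + size]
        tasks.append({
            "job": job["name"],
            "name": "frames-{}-{}".format(chunk[0], chunk[-1]),
            # A Task is an ordered list of Commands (one Command here).
            "commands": [
                {"name": "blender_render", "frames": chunk},
            ],
        })
    return tasks


job = {"name": "shot-010", "frame_start": 1, "frame_end": 10, "chunk_size": 4}
for task in compile_job(job):
    print(task["name"])
```

With these hypothetical values the Job compiles into three Tasks, the last one covering the two leftover frames.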

 - keeping a log of operations related to Jobs (task logging happens on the manager)
 - collecting and storing all the data needed to complete a Job

+## Components and processes documentation

-## Render workflow
-The render workflow is based on jobs. Once a jobs is added to Flamenco, we automatically create 
-tasks (collection of commands) to send to any available worker.
-
-When all tasks are completed, the job is marked as finished.
+In the rest of the documentation we cover in more depth the different components of the
+architecture, as well as the various processes (Job creation, Manager and Worker management, etc.).
--- a/docs/docs/developer_docs/scheduling.md
+++ b/docs/docs/developer_docs/scheduling.md
+# Scheduling
+
+The scheduling system is supposed to hand out Tasks from the Server to each Worker, through a
+Manager. By design, the communication between Server and Workers is *always* mediated by the
+Manager, to allow completely customizable resource management on the computing infrastructure 
+available.
+
+At the core of the scheduling system lies the *Dependency Graph*. The Dependency Graph is a DAG
+(Directed Acyclic Graph) where each node represents a Task. Tasks are initially stored in the
+Server database, in a dedicated collection, and are passed on to the Manager upon request.
+
+The Dependency Graph (DG) is generated with a database query to the Tasks collection and, depending
+on the query, can return hundreds of thousands of Tasks, which are then stored by the Manager in its
+own database, so that they can be served to the Workers.
+
+## Priority rules
+
+The priority for the execution of a Task is determined by three factors:
+
+- position in the DG
+- Job priority
+- Task priority
+
+Therefore, among the Tasks that have no parent Task (or whose parent Tasks are all completed),
+the one with the highest Job priority and the highest Task priority will be dispatched first.
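
The rule above can be sketched as a selection over a list of Tasks. This is a minimal sketch under assumptions that the text does not confirm: the field names (`id`, `parents`, `status`, `job_priority`, `task_priority`) and the `next_task` function are hypothetical, a larger number is taken to mean a higher priority, and Job priority is compared before Task priority.

```python
def next_task(tasks):
    """Return the runnable Task with the highest (Job, Task) priority."""
    completed = {t["id"] for t in tasks if t["status"] == "completed"}
    # A Task is runnable when it is queued and all its parents are completed.
    runnable = [
        t for t in tasks
        if t["status"] == "queued" and set(t["parents"]) <= completed
    ]
    if not runnable:
        return None
    # Assumption: compare Job priority first, then Task priority.
    return max(runnable, key=lambda t: (t["job_priority"], t["task_priority"]))


tasks = [
    {"id": "a", "parents": [], "status": "completed",
     "job_priority": 50, "task_priority": 10},
    {"id": "b", "parents": ["a"], "status": "queued",
     "job_priority": 50, "task_priority": 20},
    {"id": "c", "parents": ["b"], "status": "queued",
     "job_priority": 90, "task_priority": 90},
    {"id": "d", "parents": [], "status": "queued",
     "job_priority": 50, "task_priority": 30},
]
print(next_task(tasks)["id"])
```

Task `c` has the highest priorities but is skipped because its parent `b` is not completed yet, so position in the DG is applied before any priority comparison.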
+
+## Task requirements and resource allocation
+
+**Note: This feature is not implemented yet.**
+
+When a Worker queries the Manager for a Task, we use the *services* offered by it as a query
+parameter to find the highest priority Task that can be executed. For example, a Worker 
+might offer `blender_render`, but not `ffmpeg`. This also extends to hardware settings,
+so that we can specify a minimum amount of RAM or CPU cores required by a Task.
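
Since this feature is not implemented yet, the following is only a sketch of how such matching could work. The `can_run` function and the field names `services`, `min_ram_gb`, `min_cpu_cores`, `ram_gb` and `cpu_cores` are hypothetical and chosen for illustration.

```python
def can_run(task, worker):
    """Check whether a Worker offers everything a Task requires."""
    required = set(task.get("services", []))
    return (
        required <= set(worker["services"])  # all required services offered
        and worker["ram_gb"] >= task.get("min_ram_gb", 0)
        and worker["cpu_cores"] >= task.get("min_cpu_cores", 0)
    )


worker = {"services": ["blender_render"], "ram_gb": 16, "cpu_cores": 8}
render = {"services": ["blender_render"], "min_ram_gb": 8}
encode = {"services": ["ffmpeg"]}
print(can_run(render, worker), can_run(encode, worker))
```

A Manager could apply such a check as a filter before the priority rules, so that a Worker is only offered Tasks it can actually execute.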
+
--- a/docs/docs/img/basic_screenshot.png
+++ b/docs/docs/img/basic_screenshot.png
--- a/docs/docs/index.md
+++ b/docs/docs/index.md
-# Flamenco
+# Flamenco Docs

 Flamenco is a distributed rendering solution for the 3D animation suite
 Blender. It supports many features that make it a perfect fit for a small
@@ -6,11 +6,9 @@ or medium CG studio, such as workstations/nodes availability scheduling,
 project management, render previews via web interface, pre and post-render
 actions.

-![image](img/basic_screenshot.png)
-
 ## Main features

-* Runs on Linux, OSX and Windows
+* Runs on Linux, macOS and Windows
 * Automatic detection of clients
 * Supports multiple Projects
 * Stats on shot completions
@@ -19,7 +17,7 @@ actions.

 ## Supported software

-flamenco is designed to be quite flexible and support other software than
+Flamenco is designed to be quite flexible and support other software than
 just Blender. The integration of other packages is not possible just yet
 but if you are interested to work on this, feel free to get in touch and
 we will figure out how to do it.