author     Matthias Beyer <matthias.beyer@atos.net>  2021-09-15 17:13:29 +0200
committer  Matthias Beyer <matthias.beyer@atos.net>  2021-09-16 12:09:05 +0200
commit     591fbd5afe536bea549009ee0e241e50c9375783 (patch)
tree       f89a2b3f67a37573cd656c6d6846e730165ae9cd
parent     1a8ce54dd38976bde73f2f76126c4d0fb6a9946a (diff)
Schedule to host where fewer containers are running
This is not perfect, of course: we should really check the load on the host rather than an artificial count of running containers (more failed jobs result in more running containers, which results in more jobs scheduled to that host). Still, this is an improvement over the previous state, because utilization alone did not express enough.

The issue was that the utilization only represents the utilization of an endpoint _from the current process_. If butido is called three times, the same host is selected for a job, because the utilization in each butido process results in the same values for this endpoint. In practice, a build for "gcc" was scheduled to the same host when butido was executed three times for "gcc" (e.g. for different platforms). The end result was a load of ~70 on a host where 24 was "100% load". This is not ideal.

Checking the load of a host is not possible via the docker API, so this is what I came up with: builds are now scheduled to the endpoint with the fewest running containers. This might not result in a perfect distribution (or even a "good" one), but it should distribute the workload better than before this patch.

Some completely non-artificial testing resulted in the expected behaviour.

Signed-off-by: Matthias Beyer <matthias.beyer@atos.net>
Tested-by: Matthias Beyer <matthias.beyer@atos.net>
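As a condensed illustration of the ordering this patch introduces (fewest running containers first, falling back to the process-local utilization on a tie), here is a minimal, self-contained sketch. The Endpoint struct and its fields are hypothetical stand-ins for illustration only; in butido the container count comes from an async Docker API call, not a stored field, and the real selection happens inside the scheduler shown in the diff below.

    use std::cmp::Ordering;

    // Hypothetical stand-in for an endpoint, just for this sketch.
    struct Endpoint {
        name: String,
        running_containers: usize,
        utilization: f64,
    }

    /// Pick the endpoint with the fewest running containers; on a tie,
    /// fall back to the (process-local) utilization value.
    fn select_endpoint(endpoints: &[Endpoint]) -> Option<&Endpoint> {
        endpoints.iter().min_by(|a, b| {
            a.running_containers
                .cmp(&b.running_containers)
                .then_with(|| {
                    a.utilization
                        .partial_cmp(&b.utilization)
                        .unwrap_or(Ordering::Equal)
                })
        })
    }

    fn main() {
        let endpoints = vec![
            Endpoint { name: "host-a".into(), running_containers: 3, utilization: 0.5 },
            Endpoint { name: "host-b".into(), running_containers: 1, utilization: 0.9 },
        ];
        // host-b is selected: it runs fewer containers, even though its utilization is higher.
        println!("{}", select_endpoint(&endpoints).unwrap().name);
    }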
-rw-r--r--  src/endpoint/scheduler.rs  47
1 file changed, 43 insertions, 4 deletions
diff --git a/src/endpoint/scheduler.rs b/src/endpoint/scheduler.rs
index dc1e405..014d68a 100644
--- a/src/endpoint/scheduler.rs
+++ b/src/endpoint/scheduler.rs
@@ -88,6 +88,8 @@ impl EndpointScheduler {
}
async fn select_free_endpoint(&self) -> Result<EndpointHandle> {
+ use futures::stream::StreamExt;
+
loop {
let ep = self
.endpoints
@@ -97,13 +99,50 @@ impl EndpointScheduler {
trace!("Endpoint {} considered for scheduling job: {}", ep.name(), r);
r
})
- .sorted_by(|ep1, ep2| {
- ep1.utilization().partial_cmp(&ep2.utilization()).unwrap_or(std::cmp::Ordering::Equal)
+ .map(|ep| {
+ let ep = ep.clone();
+ async {
+ let num = ep.number_of_running_containers().await?;
+ trace!("Number of running containers on {} = {}", ep.name(), num);
+ Ok((ep, num))
+ }
+ })
+ .collect::<futures::stream::FuturesUnordered<_>>()
+ .collect::<Vec<_>>()
+ .await // Vec<Result<_>>
+ .into_iter()
+ .collect::<Result<Vec<_>>>()? // -> Vec<_>
+ .into_iter()
+ .sorted_by(|(ep1, ep1_running), (ep2, ep2_running)| {
+ match ep1_running.partial_cmp(ep2_running).unwrap_or(std::cmp::Ordering::Equal) {
+ std::cmp::Ordering::Equal => {
+ trace!("Number of running containers on {} and {} equal ({}), using utilization", ep1.name(), ep2.name(), ep2_running);
+ let ep1_util = ep1.utilization();
+ let ep2_util = ep1.utilization();
+
+ trace!("{} utilization: {}", ep1.name(), ep1_util);
+ trace!("{} utilization: {}", ep2.name(), ep2_util);
+
+ ep1_util.partial_cmp(&ep2_util).unwrap_or(std::cmp::Ordering::Equal)
+ },
+
+ std::cmp::Ordering::Less => {
+ trace!("On {} run less ({}) containers than on {} ({})", ep1.name(), ep1_running, ep2.name(), ep2_running);
+ std::cmp::Ordering::Less
+ },
+
+ std::cmp::Ordering::Greater => {
+ trace!("On {} run more ({}) containers than on {} ({})", ep1.name(), ep1_running, ep2.name(), ep2_running);
+ std::cmp::Ordering::Greater
+ }
+ }
})
- .next();
+ .next()
+ .map(|(ep, _)| ep);
if let Some(endpoint) = ep {
- return Ok(EndpointHandle::new(endpoint.clone()));
+ trace!("Selected = {}", endpoint.name());
+ return Ok(EndpointHandle::new(endpoint));
} else {
trace!("No free endpoint found, retry...");
tokio::task::yield_now().await