Learning to swarm with knowledge-based neural ordinary differential equations

A recent paper by members of the DCIST alliance uses a deep learning method, knowledge-based neural ordinary differential equations (KNODE), to develop a data-driven approach for extracting single-robot controllers from observations of a swarm’s trajectory. The goal is to reproduce global swarm behavior using the extracted controller. Unlike previous work on imitation learning, this method does not require action data for training. The proposed method can incorporate existing knowledge about the single-robot dynamics, and it builds information decentralization, time delay, and obstacle avoidance into a general model for controlling each individual robot in a swarm. The decentralized information structure and the homogeneity assumption further allow scalable training, i.e., the training time grows linearly with the swarm size. The method was applied to two different flocking swarms, in 2D and 3D respectively, and successfully reproduced global swarm behavior using the learned controllers. In addition to the learning method, the paper proposes a novel application of proper orthogonal decomposition (POD) for evaluating the performance of a learned controller. Furthermore, an extensive analysis of hyperparameters provides more insight into the properties and characteristics of the proposed method.
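
As a rough illustration of how POD can be used to compare a learned swarm against the original one, the sketch below extracts POD modes from a snapshot matrix with an SVD and measures how well one set of modes is captured by another. It is a minimal sketch and not the paper's evaluation code: the function names `pod_modes` and `subspace_gap`, the 99% energy threshold, and the synthetic trajectories are all assumptions made for this example.

```python
import numpy as np

def pod_modes(snapshots, energy=0.99):
    """Leading POD modes of a snapshot matrix.

    snapshots has shape (n_states, n_snapshots): each column stacks the
    state of the whole swarm at one time step.
    """
    fluct = snapshots - snapshots.mean(axis=1, keepdims=True)
    # Left singular vectors of the mean-removed snapshots are the POD modes.
    U, s, _ = np.linalg.svd(fluct, full_matrices=False)
    energy_frac = s**2 / np.sum(s**2)
    # Smallest number of modes whose cumulative energy reaches the threshold.
    k = min(int(np.searchsorted(np.cumsum(energy_frac), energy)) + 1, len(s))
    return U[:, :k], energy_frac[:k]

def subspace_gap(modes_ref, modes_test):
    """Relative part of modes_test that lies outside span(modes_ref)."""
    residual = modes_test - modes_ref @ (modes_ref.T @ modes_test)
    return float(np.linalg.norm(residual) / np.linalg.norm(modes_test))

# Synthetic stand-in data (assumption): two noisy runs sharing a 2-mode structure.
rng = np.random.default_rng(0)
basis, _ = np.linalg.qr(rng.normal(size=(20, 2)))
t = np.linspace(0.0, 4.0 * np.pi, 200)
coeffs = np.vstack([3.0 * np.sin(t), np.cos(2.0 * t)])
true_run = basis @ coeffs + 1e-3 * rng.normal(size=(20, 200))
learned_run = basis @ coeffs + 1e-3 * rng.normal(size=(20, 200))

true_modes, true_energy = pod_modes(true_run)
learned_modes, _ = pod_modes(learned_run)
gap = subspace_gap(true_modes, learned_modes)
print(true_modes.shape, gap)
```

A small gap indicates that the dominant spatial patterns of the two runs span nearly the same subspace, which is one hedged way to read "reproducing global swarm behavior" in terms of POD.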

Capability: T3C4C – Adaptive Swarm Behaviors for Uncertainty Mitigation (Hsieh)

Points of Contact: M. Ani Hsieh (PI) and Tom Z. Jiahao

Video: https://drive.google.com/file/d/1QV4kE8K0nYcoLWHTAZ9BNsSI0b4dUax_/view?usp=sharing

Paper: https://arxiv.org/pdf/2109.04927.pdf

Citation: T. Z. Jiahao, L. Pan, M. A. Hsieh, “Learning to Swarm with Knowledge-Based Neural Ordinary Differential Equations,” arXiv preprint, December 2021.

GNN-based Coverage and Tracking in Heterogeneous Swarms

A recent paper by members of the DCIST alliance designs decentralized mechanisms for coverage control in heterogeneous multi-robot systems, especially for robots with limited sensing ranges operating in complex environments. This work is part of the broader DCIST effort to design GNN-based control architectures that are built from the ground up to operate in harsh operational conditions, leveraging multi-hop communication to overcome local informational limitations. The paper identifies the following salient features of the GNN controller for multi-robot coverage: (1) it presents a model-informed learning solution that leverages relevant (model-based) aspects of the coverage task and propagates them through the network via communication among neighbors in the graph; (2) ablation studies explicitly demonstrate that the resulting policies automatically leverage inter-robot communication for improved performance; (3) the GNN-based coverage controller outperforms Lloyd’s algorithm under a wide range of training and testing conditions, demonstrating scalability and transferability.
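
For readers unfamiliar with the baseline, the following is a minimal sketch of the classical Lloyd's algorithm on a discretized unit square: each sample point is assigned to its nearest robot, and each robot then moves to the density-weighted centroid of its cell. The grid resolution, the Gaussian importance density, the robot count, and the helper names are assumptions for this example; it ignores the limited sensing ranges and heterogeneity that the paper addresses.

```python
import numpy as np

def coverage_cost(robots, points, density):
    """Density-weighted squared distance from each point to its nearest robot."""
    d2 = ((points[:, None, :] - robots[None, :, :]) ** 2).sum(axis=2)
    return float((density * d2.min(axis=1)).sum())

def lloyd_step(robots, points, density):
    """One Lloyd iteration: Voronoi assignment, then move to weighted centroids."""
    d2 = ((points[:, None, :] - robots[None, :, :]) ** 2).sum(axis=2)
    owner = d2.argmin(axis=1)
    new_robots = robots.copy()
    for i in range(len(robots)):
        mask = owner == i
        weight = density[mask].sum()
        if weight > 0:  # a robot with an empty cell stays where it is
            new_robots[i] = (density[mask][:, None] * points[mask]).sum(axis=0) / weight
    return new_robots

# Illustrative setup (assumption): 30x30 grid, importance peaked at (0.7, 0.7).
rng = np.random.default_rng(1)
xs = np.linspace(0.0, 1.0, 30)
points = np.array([(x, y) for x in xs for y in xs])
density = np.exp(-((points - np.array([0.7, 0.7])) ** 2).sum(axis=1) / 0.05)
robots = rng.uniform(0.0, 1.0, size=(4, 2))

costs = [coverage_cost(robots, points, density)]
for _ in range(20):
    robots = lloyd_step(robots, points, density)
    costs.append(coverage_cost(robots, points, density))
print(costs[0], costs[-1])
```

Each iteration cannot increase the coverage cost, which is why Lloyd's algorithm is a natural reference point for a learned controller.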


Capability: T1C5 – Joint Resource Allocation in Perception-Action-Communication Loops

Points of Contact: Vijay Kumar (PI) and Walker Gosrich

Paper: https://arxiv.org/abs/2109.15278

Citation: Walker Gosrich, Siddharth Mayya, Rebecca Li, James Paulos, Mark Yim, Alejandro Ribeiro, and Vijay Kumar. “Coverage Control in Multi-Robot Systems via Graph Neural Networks.” arXiv preprint arXiv:2109.15278 (2021)

Learning Decentralized Controllers with Graph Neural Networks

A recent paper by members of the DCIST alliance develops a perception-action-communication loop framework using Vision-based Graph Aggregation and Inference (VGAI). This multi-agent decentralized learning-to-control framework maps raw visual observations to agent actions, aided by local communication among neighboring agents. The framework is implemented by a cascade of a convolutional and a graph neural network (CNN / GNN), addressing agent-level visual perception and feature learning, as well as swarm-level communication, local information aggregation and agent action inference, respectively. By jointly training the CNN and GNN, image features and communication messages are learned in conjunction to better address the specific task. The researchers use imitation learning to train the VGAI controller in an offline phase, relying on a centralized expert controller. This results in a learned VGAI controller that can be deployed in a distributed manner for online execution. Additionally, the controller exhibits good scaling properties, with training in smaller teams and application in larger teams. Through a multiagent flocking application, the researchers demonstrate that VGAI yields performance comparable to or better than other decentralized controllers, using only the visual input modality (even with visibility degradation) and without accessing precise location or motion state information.
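
The local information aggregation step can be pictured with a generic graph filter, in which each agent combines features received from neighbors up to a fixed number of hops away. The sketch below is a simplified stand-in and not the VGAI architecture itself: the CNN features are replaced by random vectors, and the line-graph topology, dimensions, and function name `graph_filter` are assumptions for this example.

```python
import numpy as np

def graph_filter(S, X, H):
    """Compute Y = sum_k S^k X H[k].

    S: (n, n) communication graph (adjacency matrix).
    X: (n, f_in) per-agent features, e.g. the output of a perception network.
    H: (n_taps, f_in, f_out) filter weights shared by all agents.
    Each multiplication by S is one round of exchange with direct neighbors.
    """
    Z = X
    Y = Z @ H[0]
    for Hk in H[1:]:
        Z = S @ Z          # aggregate features from one hop further away
        Y = Y + Z @ Hk
    return Y

# Illustrative setup (assumption): 5 agents on a line, 3 features in, 2 out.
rng = np.random.default_rng(2)
n_agents, n_in, n_out, n_taps = 5, 3, 2, 3
S = np.zeros((n_agents, n_agents))
for i in range(n_agents - 1):
    S[i, i + 1] = S[i + 1, i] = 1.0
X = rng.normal(size=(n_agents, n_in))
H = rng.normal(size=(n_taps, n_in, n_out))
Y = graph_filter(S, X, H)

# Perturb agent 4: with 3 taps, only agents within 2 hops can be affected.
X_changed = X.copy()
X_changed[4] += 1.0
Y_changed = graph_filter(S, X_changed, H)
print(np.abs(Y - Y_changed).sum(axis=1))
```

Because the weights `H` do not depend on the number of agents, the same filter can be applied to a larger team than the one it was trained on, which is consistent with the scaling property described above.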

Capability: T1C5 – Joint Resource Allocation in Perception-Action-Communication Loops

Points of Contact: Zhangyang “Atlas” Wang and Ting-Kuei Hu

Video: https://www.dropbox.com/sh/adp76y0ro1jb5f2/AADO4xhvkcCrUfIlOGKACDkla?dl=0 

(also appears in ARL press release https://www.youtube.com/watch?v=6sg-4CxNbBk as no. 3)

Paper: https://arxiv.org/pdf/2106.13358.pdf 

Citation: Hu, T. K., Gama, F., Chen, T., Zheng, W., Wang, Z., Ribeiro, A., & Sadler, B. M., “Scalable Perception-Action-Communication Loops with Convolutional and Graph Neural Networks.” IEEE Transactions on Signal and Information Processing over Networks, 2021

Asynchronous and Parallel Distributed Pose Graph Optimization

A recent paper by members of the DCIST alliance has received a 2020 honorable mention from IEEE Robotics and Automation Letters.

The paper presents Asynchronous Stochastic Parallel Pose Graph Optimization (ASAPP), the first asynchronous algorithm for distributed pose graph optimization (PGO) in multi-robot simultaneous localization and mapping. By enabling robots to optimize their local trajectory estimates without synchronization, ASAPP offers resiliency against communication delays and alleviates the need to wait for stragglers in the network. Furthermore, ASAPP can be applied to the rank-restricted relaxations of PGO, a crucial class of non-convex Riemannian optimization problems that underlies recent breakthroughs in globally optimal PGO. Under bounded delay, the authors establish the global first-order convergence of ASAPP using a sufficiently small stepsize. The derived stepsize depends on the worst-case delay and the inherent problem sparsity, and it matches the known result for synchronous algorithms when there is no delay. Numerical evaluations on simulated and real-world datasets demonstrate favorable performance compared to a state-of-the-art synchronous approach, and show ASAPP’s resilience against a wide range of delays in practice.
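
The role of bounded delay and a small stepsize can be illustrated on a toy problem. The sketch below is not ASAPP and does not solve PGO: it runs gradient updates on a simple quadratic over a chain graph, where each agent owns one variable and reads its neighbors' values from a copy that may be up to `max_delay` iterations old. The problem, the delay model, and all parameter values are assumptions chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, max_delay, step, iters = 6, 5, 0.2, 2000

# Toy objective 0.5 * x'Qx - b'x with Q = (chain-graph Laplacian) + identity.
Q = np.eye(n)
for i in range(n - 1):
    Q[i, i] += 1.0
    Q[i + 1, i + 1] += 1.0
    Q[i, i + 1] -= 1.0
    Q[i + 1, i] -= 1.0
b = rng.normal(size=n)

history = [np.zeros(n)]  # recent iterates, newest last
for _ in range(iters):
    current = history[-1]
    new = current.copy()
    for i in range(n):
        # Agent i sees neighbor values that are `delay` iterations old.
        delay = int(rng.integers(0, max_delay + 1))
        stale = history[max(0, len(history) - 1 - delay)]
        view = stale.copy()
        view[i] = current[i]            # an agent always knows its own value
        grad_i = Q[i] @ view - b[i]     # partial gradient w.r.t. x_i
        new[i] = current[i] - step * grad_i
    history.append(new)
    history = history[-(max_delay + 1):]  # keep only what the delay bound needs

x_async = history[-1]
x_star = np.linalg.solve(Q, b)
print(np.max(np.abs(x_async - x_star)))
```

In this toy setting the step of 0.2 is small enough that stale neighbor information still yields a contraction; the paper derives the corresponding stepsize condition for the actual Riemannian problem.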

Source: Yulun Tian, Alec Koppel, Amrit Singh Bedi, and Jonathan P. How, “Asynchronous and Parallel Distributed Pose Graph Optimization,” in IEEE Robotics and Automation Letters, vol. 5, no. 4, pp. 5819-5826, Oct. 2020.

More information: https://www.ieee-ras.org/publications/ra-l/ra-l-paper-awards

Non-Monotone Energy-Aware Information Gathering for Heterogeneous Robot Teams

A recent paper by members of the DCIST alliance considers the problem of planning trajectories for a team of sensor-equipped robots to reduce uncertainty about a dynamical process. Optimizing the trade-off between information gain and energy cost (e.g., control effort, energy expenditure, distance travelled) is desirable but leads to a non-monotone objective function in the set of robot trajectories. Therefore, common multi-robot planning algorithms based on techniques such as coordinate descent lose their performance guarantees. Methods based on local search provide performance guarantees for optimizing a non-monotone submodular function, but they require access to all robots’ trajectories, which makes them unsuitable for distributed execution. This work proposes a distributed planning approach based on local search, and shows how to reduce its computation and communication requirements without sacrificing algorithm performance. The team demonstrates the efficacy of the proposed method by coordinating robot teams composed of both ground and aerial vehicles with different sensing and control profiles, and evaluates the algorithm’s performance in two target tracking scenarios. Results show up to 60% communication reduction and 80-92% computation reduction on average when coordinating up to 10 robots, while outperforming the coordinate-descent-based algorithm in achieving a desirable trade-off between sensing and energy expenditure.
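
To show why subtracting an energy cost makes the objective non-monotone, and what a local search looks like in that setting, here is a minimal centralized sketch. It is a simplification and not the paper's distributed algorithm: the objective (cells covered minus energy cost), the candidate names, and the numbers are hypothetical, and the constraint of one trajectory per robot is ignored.

```python
def objective(selected, coverage, cost):
    """Information proxy (number of cells covered) minus total energy cost."""
    covered = set().union(*(coverage[r] for r in selected))
    return len(covered) - sum(cost[r] for r in selected)

def local_search(candidates, coverage, cost, tol=1e-9):
    """Toggle single elements (add or delete) while the objective improves."""
    selected = set()
    improved = True
    while improved:
        improved = False
        best = objective(selected, coverage, cost)
        for r in candidates:
            trial = selected ^ {r}  # add r if absent, delete it if present
            value = objective(trial, coverage, cost)
            if value > best + tol:
                selected, best, improved = trial, value, True
    return selected

# Hypothetical candidate trajectories: cells each one observes and its energy cost.
coverage = {"a": {1, 2, 3}, "b": {3, 4}, "c": {1, 2, 3, 4}, "d": {5}}
cost = {"a": 1.0, "b": 1.0, "c": 2.5, "d": 1.5}
candidates = ["a", "b", "c", "d"]

plan = local_search(candidates, coverage, cost)
print(plan, objective(plan, coverage, cost))
```

Adding trajectory "d" to the plan would cover one more cell but cost more energy than it gains, so the objective decreases; this is the non-monotone behavior that breaks the usual guarantees of greedy or coordinate-descent planners.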

Source: X. Cai, B. Schlotfeldt, K. Khosoussi, N. Atanasov, G.J. Pappas, J.P. How “Non-Monotone Energy-Aware Information Gathering for Heterogeneous Robot Teams”, IEEE Int. Conf. Robot. Autom. (ICRA), ArXiv preprint: https://arxiv.org/abs/2101.11093, 2021.

Article: https://news.mit.edu/2021/robots-collaborate-search-0513 

Kimera: an Open-Source Library for Real-Time Metric-Semantic Localization and Mapping

A recent paper by members of the DCIST alliance develops an open-source C++ library for real-time metric-semantic visual-inertial Simultaneous Localization And Mapping (SLAM). The library goes beyond existing visual and visual-inertial SLAM libraries (e.g., ORB-SLAM, VINS-Mono, OKVIS, ROVIO) by enabling mesh reconstruction and semantic labeling in 3D. Kimera is designed with modularity in mind and has four key components: a visual-inertial odometry (VIO) module for fast and accurate state estimation, a robust pose graph optimizer for global trajectory estimation, a lightweight 3D mesher module for fast mesh reconstruction, and a dense 3D metric-semantic reconstruction module. The modules can be run in isolation or in combination, so Kimera can easily fall back to a state-of-the-art VIO or a full SLAM system. Kimera runs in real time on a CPU and produces a 3D metric-semantic mesh from semantically labeled images, which can be obtained with modern deep learning methods.

Source: A. Rosinol, M. Abate, Y. Chang, L. Carlone “Kimera: an Open-Source Library for Real-Time Metric-Semantic Localization and Mapping”, IEEE Int. Conf. Robot. Autom. (ICRA), ArXiv preprint: https://arxiv.org/pdf/1910.02490.pdf, 2020.

Open-source Code: https://github.com/MIT-SPARK/Kimera

Task: RA1.A1 The Swarm’s Knowledge Base: Contextual Perceptual Representations

Points of Contact: Luca Carlone (PI), Antoni Rosinol.


Asymptotically Optimal Planning for Non-myopic Multi-Robot Information Gathering

A recent paper by members of the DCIST alliance develops a novel, highly scalable sampling-based planning algorithm for multi-robot active information acquisition tasks in complex environments. Active information gathering scenarios include target localization and tracking, active Simultaneous Localization and Mapping (SLAM), surveillance, environmental monitoring, and others. The goal is to compute control policies for mobile robot sensors that minimize the accumulated uncertainty of a dynamic hidden state over an a priori unknown horizon. To design optimal sensor policies, the authors propose a novel non-myopic sampling-based approach that simultaneously explores both the robot motion space and the information space reachable by the sensors. They show that the proposed algorithm is probabilistically complete, asymptotically optimal, and converges exponentially fast to the optimal solution. Moreover, they demonstrate that by biasing the sampling process towards regions that are expected to be informative, the proposed method can quickly compute sensor policies that achieve user-specified levels of uncertainty in large-scale estimation tasks that may involve large multi-robot teams, workspaces, and dimensions of the hidden state. Extensive simulation results corroborate the theoretical analysis and show that the proposed algorithm can address large-scale estimation tasks.

Target localization and tracking scenario: Two robots with limited field-of-view (blue ellipses) navigate an environment with obstacles to localize and track six targets of interest. Target uncertainty is illustrated in red.

Source: Yiannis Kantaros, Brent Schlotfeldt, Nikolay Atanasov, and George J. Pappas, “Asymptotically Optimal Planning for Non-myopic Multi-Robot Information Gathering,” in Proceedings of the 2019 Robotics: Science and Systems (RSS), Freiburg, Germany, June 2019.

Points of Contact: George J. Pappas
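
The "accumulated uncertainty" objective in the information-gathering entry above can be made concrete with a Kalman filter covariance recursion, which lets a planner score a candidate robot path before executing it. The sketch below is an assumption-laden illustration, not the paper's algorithm: it uses a single static target, a linear position sensor that only works within `sensing_range`, the trace of the covariance as the uncertainty measure, and two hand-written paths instead of sampled ones.

```python
import numpy as np

def accumulated_uncertainty(path, target_estimate, A, W, V, sensing_range, P0):
    """Sum of covariance traces along a candidate path (lower is better)."""
    P = P0.copy()
    H = np.eye(2)  # assumed sensor model: noisy measurement of target position
    total = 0.0
    for robot_pos in path:
        P = A @ P @ A.T + W  # prediction step: uncertainty grows
        if np.linalg.norm(robot_pos - target_estimate) <= sensing_range:
            # Measurement update only when the target is expected to be in range.
            K = P @ H.T @ np.linalg.inv(H @ P @ H.T + V)
            P = (np.eye(2) - K @ H) @ P
        total += float(np.trace(P))
    return total

# Illustrative values (assumptions).
A, W, V, P0 = np.eye(2), 0.01 * np.eye(2), 0.05 * np.eye(2), np.eye(2)
target_estimate = np.array([5.0, 5.0])
steps = np.linspace(0.0, 1.0, 20)[:, None]
path_toward = steps * target_estimate          # drives toward the target
path_away = steps * np.array([-5.0, 0.0])      # drives away from it

cost_toward = accumulated_uncertainty(path_toward, target_estimate, A, W, V, 2.0, P0)
cost_away = accumulated_uncertainty(path_away, target_estimate, A, W, V, 2.0, P0)
print(cost_toward, cost_away)
```

Because the covariance recursion does not depend on the actual measurement values, paths can be compared offline, which is what makes planning over the information space possible.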

Active Exploration in Signed Distance Fields

When performing tasks in unknown environments it is useful for a team of robots to have a good map of the area to assist in efficient, collision-free planning and navigation. A recent paper by members of the DCIST alliance tackles the problem of autonomous mapping of unknown environments using information theoretic metrics and signed distance field maps. Signed distance fields are discrete representations of environmental occupancy in which each cell of the environment stores a distance to the nearest obstacle surface, with negative distances indicating that the cell is within an obstacle. Such a representation has many benefits over the more traditional occupancy grid map including trivial collision checking, and easy extraction of mesh representations of the obstacle surfaces. The researchers use a truncated signed distance field, which only keeps track of cells near obstacle surfaces, and model each cell as a Gaussian random variable with an expected distance and a variance determined incrementally using a realistic RGB-D sensor noise model. The use of Gaussian random variables enables the closed form computation of Shannon mutual information between a Gaussian sensor measurement and the Gaussian cells it intersects. This allows for efficient evaluations of expected information when planning and evaluating possible future trajectories. Using these tools, a robot is able to efficiently evaluate a large number of trajectories before choosing the best next step to increase its information about the environment. The researchers show the resulting active exploration algorithm running on several simulated 2D environments of varying complexity. The figure shows a snapshot of the robot exploring the most complex of the three environments. These simulations can be viewed in more detail in the video linked below.
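
The closed-form expressions mentioned above are easiest to see for a single cell. Assuming a cell distance modeled as a Gaussian with variance `cell_var` and a measurement equal to that distance plus independent Gaussian noise with variance `meas_var`, the mutual information is 0.5 * ln(1 + cell_var / meas_var), and the cell is updated by standard Gaussian fusion. The sketch below covers only this single-cell case with made-up numbers; the paper's treatment of a sensor ray intersecting several cells is not reproduced here.

```python
import math

def mutual_information(cell_var, meas_var):
    """Shannon mutual information (in nats) between a Gaussian cell and a
    noisy Gaussian measurement of it."""
    return 0.5 * math.log(1.0 + cell_var / meas_var)

def fuse(cell_mean, cell_var, z, meas_var):
    """Incremental Gaussian update of a cell's expected distance and variance."""
    gain = cell_var / (cell_var + meas_var)
    return cell_mean + gain * (z - cell_mean), (1.0 - gain) * cell_var

# Illustrative numbers (assumptions): distances in meters.
cell_mean, cell_var = 0.30, 0.04
z, meas_var = 0.25, 0.01

info = mutual_information(cell_var, meas_var)
post_mean, post_var = fuse(cell_mean, cell_var, z, meas_var)

# The information gained equals the reduction in the cell's differential entropy.
entropy_drop = 0.5 * math.log(cell_var / post_var)
print(info, entropy_drop, post_mean, post_var)
```

Since the expected information depends only on variances, a planner can evaluate many candidate trajectories without simulating actual sensor readings.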

Points of Contact: Vijay Kumar (PI), Kelsey Saulnier.

Citation: K. Saulnier, N. Atanasov, G. J. Pappas, & V. Kumar, “Information Theoretic Active Exploration in Signed Distance Fields,” IEEE International Conference on Robotics and Automation (ICRA), Paris, France, June 2020. (Accepted)

Learning Multi-Agent Policies from Observations

A recent paper from the DCIST team introduces a framework for learning to perform multi-robot missions by observing an expert system executing the same mission. The expert system is a team of robots equipped with a library of controllers, each designed to solve a specific task. The expert system’s policy selects the controller necessary to successfully execute the mission at each time step, based on the states of the robots and the environment. The objective of the learning framework is to enable an untrained team of robots (the imitator system), equipped with the same library of controllers but not the expert policy, to learn to execute the mission with performance comparable to that of the expert system. Based on unannotated and noisy observations of the expert system, a multi-hypothesis filtering technique estimates the series of individual controllers executed by the expert policy. Then, the history of estimated controllers and environmental states provides supervision to train a neural network policy for the imitator system. When evaluated on a perimeter protection scenario, experimental results suggest that the learned policy endows the imitator system with performance comparable to that of the expert system.

Source: P. Pierpaoli, H. Ravichandar, N. Waytowich, A. Li, D. Asher, M. Egerstedt, “Inferring and Learning Multi-Robot Policies from Observations”, International Conference on Intelligent Robots and Systems (IROS), 2020 – under review

Points of Contact: Pietro Pierpaoli; Harish Ravichandar {pietro.pierpaoli, harish.ravichandar} @gatech.edu
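
One way to picture the controller-estimation step in the entry above is a discrete Bayes filter that keeps a belief over which controller in the library best explains each observed motion. The sketch below is a hypothetical simplification, not the paper's multi-hypothesis filter: the three toy controllers, the single-robot 2D state, the Gaussian likelihood, and the "sticky" switching prior are all assumptions for this example.

```python
import numpy as np

# Hypothetical controller library: each maps a 2D position to a velocity.
def go_to_origin(x):
    return -x

def circle(x):
    return np.array([-x[1], x[0]])

def hold(x):
    return np.zeros(2)

LIBRARY = {"go_to_origin": go_to_origin, "circle": circle, "hold": hold}
NAMES = list(LIBRARY)

def estimate_controllers(observations, dt=0.1, sigma=0.01, stay_prob=0.9):
    """Most likely active controller for each observed transition."""
    n = len(NAMES)
    switch = (1.0 - stay_prob) / (n - 1)
    T = np.full((n, n), switch) + (stay_prob - switch) * np.eye(n)
    belief = np.full(n, 1.0 / n)
    estimates = []
    for x, x_next in zip(observations[:-1], observations[1:]):
        belief = T.T @ belief  # the expert may have switched controllers
        # Squared error between each hypothesis' prediction and the observation.
        errors = np.array(
            [np.sum((x + dt * LIBRARY[name](x) - x_next) ** 2) for name in NAMES]
        )
        belief = belief * np.exp(-errors / (2.0 * sigma**2))
        belief = belief / belief.sum()
        estimates.append(NAMES[int(np.argmax(belief))])
    return estimates

# Simulated expert (assumption): circles for 20 steps, then heads to the origin.
rng = np.random.default_rng(3)
truth = ["circle"] * 20 + ["go_to_origin"] * 10
x = np.array([1.0, 0.0])
observations = [x + 0.005 * rng.normal(size=2)]
for name in truth:
    x = x + 0.1 * LIBRARY[name](x)
    observations.append(x + 0.005 * rng.normal(size=2))

estimates = estimate_controllers(observations)
accuracy = float(np.mean([e == t for e, t in zip(estimates, truth)]))
print(accuracy)
```

The resulting sequence of (state, estimated controller) pairs is the kind of labeled data that could then supervise a policy network, as the entry describes.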

Sim-to-(Multi)-Real: Transfer of Low-Level Robust Control Policies to Multiple Quadrotors

A recent paper by members of the DCIST alliance uses reinforcement learning techniques to train policies in simulation that transfer remarkably well to multiple different physical quadrotors. Quadrotor stabilizing controllers often require careful, model-specific tuning for safe operation. The policies developed are low-level, i.e., they map the rotorcraft’s state directly to the motor outputs. The trained control policies are very robust to external disturbances and can withstand harsh initial conditions such as throws. The work shows how different training methodologies (change of the cost function, modeling of noise, use of domain randomization) might affect flight performance. This is the first work to demonstrate that a simple neural network can learn a robust stabilizing low-level quadrotor controller (without the use of a stabilizing PD controller) that generalizes to multiple quadrotors.
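
Two of the ideas above, domain randomization and a low-level policy that outputs motor commands directly, can be sketched in a few lines. Everything in this example is an assumption made for illustration: the nominal parameter values, the ±20% sampling range, the 18-dimensional state, the network size, and the untrained random weights. It shows the structure only and is not the paper's simulator or training setup.

```python
import numpy as np

# Hypothetical nominal quadrotor parameters (not taken from the paper).
NOMINAL = {"mass_kg": 0.03, "arm_length_m": 0.05, "motor_lag_s": 0.05}

def randomize(nominal, rng, spread=0.2):
    """Domain randomization: perturb every parameter for a new training episode."""
    return {k: v * rng.uniform(1.0 - spread, 1.0 + spread) for k, v in nominal.items()}

def init_policy(rng, state_dim=18, hidden=32, n_motors=4):
    """Random weights for a small two-layer network (untrained placeholder)."""
    return (
        0.1 * rng.normal(size=(hidden, state_dim)),
        np.zeros(hidden),
        0.1 * rng.normal(size=(n_motors, hidden)),
        np.zeros(n_motors),
    )

def policy(state, weights):
    """Low-level policy: state in, one normalized command in (0, 1) per motor."""
    W1, b1, W2, b2 = weights
    hidden = np.tanh(W1 @ state + b1)
    return 1.0 / (1.0 + np.exp(-(W2 @ hidden + b2)))

rng = np.random.default_rng(4)
episodes = [randomize(NOMINAL, rng) for _ in range(100)]  # one sample per episode
weights = init_policy(rng)
motor_cmd = policy(rng.normal(size=18), weights)
print(episodes[0], motor_cmd)
```

Training the same policy across many sampled parameter sets is what encourages it to work on vehicles whose true parameters differ from any single simulated model.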

Project page:

Sim-to-(Multi)-Real: Transfer of Low-Level Robust Control Policies to Multiple Quadrotors
A. Molchanov, T. Chen, W. Hönig, J. A. Preiss, N. Ayanian, G. S. Sukhatme
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2019

DCIST Task: RA3.A1 Robust Adaptive Machine Learning
Contact: Gaurav Sukhatme