Traffic lights at intersections are managed by simple computers that assign the right of way to nonconflicting directions of traffic. However, studies of travel times in urban areas have shown that delays caused by intersections make up 12-55% of daily commute travel time, a share that could be reduced if these controllers operated more efficiently.
A team of researchers led by Guni Sharon, professor in the Department of Computer Science and Engineering at Texas A&M University, has developed a self-learning system that uses machine learning to improve the coordination of vehicles passing through intersections.
The researchers published their findings in the proceedings of the 2020 International Conference on Autonomous Agents and Multiagent Systems.
Many traffic signals today are equipped with signal controllers that serve as the “brains” of an intersection. They are programmed with various settings to tell the traffic display when to change colors depending on the time of day and traffic movement. This gives the signals the ability to handle fluctuations in traffic throughout the day to minimize traffic congestion.
Recent studies have shown that learning algorithms based on reinforcement learning, a concept from psychology in which favorable outcomes are rewarded, can be used to optimize the controller’s signal. This strategy enables a controller to make a series of decisions and learn which actions improve its operation in the real world. In this instance, the result would be a reduction in the buildup of traffic delays.
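To make the idea of rewarding favorable outcomes concrete, here is a minimal sketch of tabular Q-learning, one standard reinforcement-learning method. It is an illustration only, not the researchers' controller: the state names, action names, reward values, and learning parameters are all assumptions, with the reward standing in for negative vehicle delay.

```python
from collections import defaultdict

# Hypothetical actions: which direction receives the green light.
ACTIONS = ("green_ns", "green_ew")


def q_update(q_table, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new estimate."""
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (target - q_table[(state, action)])
    return q_table[(state, action)]


def greedy_action(q_table, state):
    """Pick the action with the highest learned value in this state."""
    return max(ACTIONS, key=lambda a: q_table[(state, a)])


q = defaultdict(float)
state = "ns_queue_long"  # assumed state: a long north-south queue

# Assumed rewards (negative delay): serving the long queue costs little
# and clears the intersection; serving the other direction costs more
# and leaves the queue in place.
for _ in range(20):
    q_update(q, state, "green_ns", reward=-1.0, next_state="clear")
    q_update(q, state, "green_ew", reward=-5.0, next_state=state)

preferred = greedy_action(q, state)
```

After repeated updates, the action that produced less delay ends up with the higher value, so the greedy choice in this toy state is to serve the long queue.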
But Sharon noted that these optimized controllers would not be practical in the real world because they rely on deep neural networks (DNNs), a type of machine-learning algorithm, to process data. DNNs are commonly used to train and generalize a controller’s actuation policy, which is the decision-making (or control) function that determines what action the controller should take next based on its current situation. The policy takes its input from several sensors that give information about the current state of the intersection.
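An actuation policy can be pictured as a function from sensor readings to the next signal phase. The sketch below shows a deliberately simple, hand-written policy of that shape. The sensor fields, the phase names, the minimum green time, and the rule itself are all assumptions chosen for illustration, and they do not describe any deployed controller or the policy studied by the researchers.

```python
from dataclasses import dataclass

MIN_GREEN_SECONDS = 10.0  # assumed minimum time before a phase may change


@dataclass(frozen=True)
class IntersectionState:
    """Hypothetical snapshot of what the intersection's sensors report."""
    queue_ns: int            # vehicles waiting on the north-south approaches
    queue_ew: int            # vehicles waiting on the east-west approaches
    current_phase: str       # "NS" or "EW": the direction that has green now
    seconds_in_phase: float  # how long the current phase has been active


def actuation_policy(state: IntersectionState) -> str:
    """Return the phase ("NS" or "EW") that should be green next."""
    # Hold the current phase until the minimum green time has elapsed.
    if state.seconds_in_phase < MIN_GREEN_SECONDS:
        return state.current_phase

    if state.current_phase == "NS":
        waiting_current, waiting_other, other_phase = state.queue_ns, state.queue_ew, "EW"
    else:
        waiting_current, waiting_other, other_phase = state.queue_ew, state.queue_ns, "NS"

    # Switch only when the other direction has the longer queue.
    return other_phase if waiting_other > waiting_current else state.current_phase


next_phase = actuation_policy(IntersectionState(2, 8, "NS", 15.0))
```

A rule like this is easy for an engineer to read and audit, which is the property a DNN-based policy lacks.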
Despite how powerful they are, DNNs are very unpredictable and inconsistent in their decision-making. Working out why they take certain actions rather than others is a cumbersome process for traffic engineers, which in turn makes the resulting policies difficult to understand and regulate.
To overcome this, Sharon and his team defined and validated an approach that can successfully train a DNN in real time while transferring what it has learned from observing the real world to a different control function that engineers can understand and regulate.
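One generic way to picture moving knowledge from an opaque model into a readable one is to fit a simple rule so that it imitates the opaque model's decisions. The sketch below does this with a one-parameter threshold rule. It is not the team's method: the black-box function is only a stand-in for a trained DNN, and the function names, the scoring formula, and the grid of queue lengths are all assumptions.

```python
from itertools import product


def black_box_policy(queue_ns, queue_ew):
    """Stand-in for a trained DNN: an opaque score decides the phase."""
    score = 9 * queue_ns - 11 * queue_ew + 3  # assumed, arbitrary weights
    return "NS" if score > 0 else "EW"


def threshold_rule(queue_ns, queue_ew, threshold):
    """Interpretable rule: serve NS when its queue leads by `threshold`."""
    return "NS" if queue_ns - queue_ew >= threshold else "EW"


def fit_threshold(states, teacher, candidates):
    """Choose the threshold whose decisions best match the teacher's."""
    def agreement(threshold):
        return sum(
            threshold_rule(ns, ew, threshold) == teacher(ns, ew)
            for ns, ew in states
        )

    best = max(candidates, key=agreement)
    return best, agreement(best) / len(states)


# Every combination of queue lengths from 0 to 10 vehicles per direction.
states = list(product(range(11), repeat=2))
best_threshold, match_rate = fit_threshold(states, black_box_policy, range(-5, 6))
```

On this toy grid the search settles on a threshold of 1, and the resulting rule agrees with the stand-in on 115 of the 121 states. The fitted rule can be stated in one sentence, which is what makes it possible to review and regulate.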
Using a simulation of a real intersection, the team found that their approach was particularly effective in optimizing their interpretable controller, resulting in up to a 19.4% reduction in vehicle delay in comparison to commonly deployed signal controllers.
Despite the effectiveness of their approach, the researchers observed that once they began training the controller, it took about two days to learn which actions actually helped mitigate traffic congestion from all directions.
“Our future work will examine techniques for jump starting the controller’s learning process by observing the operation of a currently deployed controller while guaranteeing a baseline level of performance and learning from that,” Sharon said.
Other contributors to this research include Josiah P. Hanna, research associate in the School of Informatics at the University of Edinburgh, and James Ault, doctoral student in the Pi Star Lab at Texas A&M.