Publish date March 25, 2020
Reinforcement learning (RL) has proven its worth in a range of artificial intelligence (AI) domains and is starting to show successes in real-world cases.
The best-known application of deep reinforcement learning is AlphaGo, a program built by Google’s DeepMind. DeepMind developed AlphaGo to master Go, widely regarded as the most challenging board game in the world, and it went on to beat top human players. DeepMind then created AlphaGo Zero, a version that trounced AlphaGo. In the real world, RL is playing a significant role in driving decision-making across industries: optimizing manufacturing, solving supply-chain inventory problems, making healthcare safer, and helping cars learn to drive themselves, among many others.
That said, many of the advances in reinforcement learning have proved challenging to leverage in real-world systems. This is because they rest on a series of assumptions about the environment that are rarely satisfied in real practice, where conditions are often poorly defined.
In part 1 of the ‘Introduction to RL’ series, we covered what RL is, how it differs from general-practice ML, and its associated benefits. In this blog, I would like to present the unique challenges that need to be addressed before RL can be effectively productionized in the real world, challenges that can also serve as a testbed for tangible RL research.
Challenges of RL in the real world:
More recent work on RL has recognized that the poorly defined realities of real-world systems are hampering the progress of real-world learning. While games and simple physical simulations have offered benchmark domains for many fundamental developments, it is crucial to develop more sophisticated learning environments to solve complex real-world problems as the field continues to mature. Domain experts have addressed, among others, issues such as limited exploration, unspecified reward functions, and preparing the simulation environment adequately, all of which are highly dependent on the tasks to be performed. These issues make it difficult for control systems to be grounded in the real, physical world.
Here are SIX of the unique high-level challenges we can look at:
Therefore, transferring an RL model out of the training environment into the real world can prove tricky. Tweaking and scaling the neural networks that control the agent is another challenge. To solve these problems, researchers are looking into several traditional concepts such as intrinsic motivation, imitation learning, and hierarchical learning.
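To make one of these concepts concrete, here is a minimal sketch of intrinsic motivation in its simplest count-based form: the agent receives a small extra reward for visiting states it has rarely seen, which encourages exploration when the environment's own reward is sparse. The class name `CountBonus`, the `beta` parameter, and the 1/sqrt(count) decay are assumptions chosen for illustration, not a method taken from this series.

```python
import math
from collections import defaultdict


class CountBonus:
    """Count-based exploration bonus: rarely visited states earn extra reward."""

    def __init__(self, beta=0.1):
        self.beta = beta  # hypothetical scale of the bonus (an assumption)
        self.counts = defaultdict(int)  # visits recorded per state

    def shaped_reward(self, state, extrinsic_reward):
        # Record the visit, then add a bonus that shrinks as 1/sqrt(visit count).
        self.counts[state] += 1
        intrinsic = self.beta / math.sqrt(self.counts[state])
        return extrinsic_reward + intrinsic


bonus = CountBonus(beta=1.0)
first = bonus.shaped_reward("s0", 0.0)  # first visit to "s0"
# Three more visits to the same state; keep the reward from the fourth visit.
fourth = [bonus.shaped_reward("s0", 0.0) for _ in range(3)][-1]
```

Because the bonus decays with each repeat visit, the shaped reward for a familiar state gradually converges to the environment's own reward, so the agent is nudged toward novelty early on without the bonus dominating later.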
The excitement is justified: Driving RL to maturity
Despite its limitations and the research being in its infancy, RL has offered tremendous value and early wins in use cases for industrial robotics, financial services, healthcare, and drug design. As it matures, it may become one of the most effective ways to interact with a customer.
In marketing, for example, a brand’s actions could include all the combinations of solutions, services, products, offers, and messaging, harmoniously integrated across different channels, with each message personalized down to the font, color, words, or images. In supply chain, RL can help decision-makers make better logistics decisions based on inventory forecasts, resource availability, and timely delivery of shipments at a lower cost.
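The marketing example can be sketched in a few lines to show how quickly such an action space grows and how an agent might choose among the combinations. This is a deliberately simplified illustration: it uses an epsilon-greedy bandit (the simplest special case of RL, with no state), and the channel, offer, and color values as well as the `EpsilonGreedy` class are hypothetical.

```python
import itertools
import random

# Hypothetical personalization options (assumptions for illustration only).
channels = ["email", "sms"]
offers = ["discount", "free_trial"]
colors = ["blue", "green"]

# Every combination of channel, offer, and color is one possible action.
actions = list(itertools.product(channels, offers, colors))


class EpsilonGreedy:
    """Epsilon-greedy bandit over a fixed list of actions."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = actions
        self.epsilon = epsilon  # probability of trying a random action
        self.rng = random.Random(seed)
        self.counts = {a: 0 for a in actions}
        self.values = {a: 0.0 for a in actions}  # running mean reward per action

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.values[a])

    def update(self, action, reward):
        # Incremental update of the mean reward observed for this action.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]
```

A usage loop would call `select()` to pick a message variant, observe a reward such as a click or a purchase, and pass it to `update()`. Even these three small option lists already yield eight distinct actions, which hints at why real personalization problems become large so quickly.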
We shall dive into each of these challenges with ways to solve them in upcoming blogs in the RL series.