Energy trading on the continuous intraday electricity market is highly complex and differs significantly from trading on other markets because the traded products are short-lived and volatile. In the PDET project, a first trading agent based on deep reinforcement learning was developed that trades on the energy market using weather forecasts. In a simulation, the agent acted from the perspective of a wind power plant operator and sold the forecasted amount of power on the intraday market, achieving a positive trading result. The results of the Spotlight are published in a paper to introduce the topic to the scientific community.

The project is of interest to utility companies, traders, EPEX, and scientists in the field of energy trading and electricity forecasting.


  • Developing and testing an automated trading agent based on deep reinforcement learning
  • Early entry into the research field of energy trading to present new state-of-the-art results scientifically
  • Significantly increased complexity in energy trading as a result of trading in minute-by-minute resolution
  • Publication in the Energy and AI journal ensures a peer-review process, validating the results


To represent energy trading realistically, a trading environment was created in which hourly electricity products are traded in minute-by-minute resolution. The wind power plant operator setting added a further challenge, since the agent had to regularly adapt its trading volume, and consequently its strategy, to the current wind forecasts (updated every 15 minutes). Both the simulation level and the algorithms used in this project were state of the art. For example, the proximal policy optimization (PPO) algorithm [Schulman et al. 2017] was combined with population based training [Jaderberg et al. 2017], which adapts hyperparameters during training, to ensure the training was used as effectively as possible.
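To make the setting more concrete, the following is a minimal sketch of such an environment for a single product. It is not the PDET simulation: the class name `ToyIntradayEnv`, the random-walk price, the starting values, and the imbalance penalty at market close are all assumptions chosen for illustration. Only the minute-by-minute stepping and the 15-minute forecast update follow the description above.

```python
import random


class ToyIntradayEnv:
    """Toy single-product intraday environment (illustrative assumptions only)."""

    def __init__(self, horizon_minutes=60, forecast_update_every=15,
                 imbalance_penalty=60.0, seed=0):
        self.horizon_minutes = horizon_minutes              # assumed trading window
        self.forecast_update_every = forecast_update_every  # 15 min, as in the text
        self.imbalance_penalty = imbalance_penalty          # assumed, EUR/MWh
        self.seed = seed
        self.reset()

    def reset(self):
        self.rng = random.Random(self.seed)
        self.t = 0            # minutes since the start of the trading window
        self.price = 50.0     # hypothetical starting price, EUR/MWh
        self.forecast = 10.0  # hypothetical forecasted wind energy, MWh
        self.sold = 0.0       # net volume sold so far, MWh
        return self._observation()

    def _observation(self):
        # (minute, current price, volume that is still open)
        return (self.t, self.price, self.forecast - self.sold)

    def step(self, volume):
        """Trade `volume` MWh at the current price (negative = buy back)."""
        reward = volume * self.price
        self.sold += volume
        self.t += 1
        # Assumption: the price follows a simple random walk.
        self.price += self.rng.gauss(0.0, 0.5)
        # The wind forecast is refreshed every `forecast_update_every` minutes.
        if self.t % self.forecast_update_every == 0:
            self.forecast = max(0.0, self.forecast + self.rng.gauss(0.0, 1.0))
        done = self.t >= self.horizon_minutes
        if done:
            # Assumption: any open position at market close is penalized.
            reward -= self.imbalance_penalty * abs(self.forecast - self.sold)
        return self._observation(), reward, done
```

A policy (for example one trained with PPO) would call `step` once per simulated minute and would have to react whenever the open volume in the observation jumps after a forecast update.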


By training on transaction data, a trading agent was created that can independently assess the trading situation on the intraday spot market. This was achieved with a precise reward function that gave the agent direct incentives. As a result, the agent was able to anticipate changes in the trading volume (caused by varying wind forecasts), correctly estimate price trends, and automatically adapt its strategy. To put the results in context, the agent was compared with multiple rule-based baseline methods that acted on the price forecast. The agent outperformed the baselines, demonstrating the potential of deep reinforcement learning.
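As an illustration of what a rule-based baseline driven by a price forecast can look like, the sketch below compares two simple selling rules on made-up numbers. These are not the baselines used in the project: the function names, the rules themselves, and the prices are hypothetical.

```python
def sell_evenly(prices, volume):
    """Spread the volume equally over all minutes; return the revenue in EUR."""
    per_minute = volume / len(prices)
    return sum(price * per_minute for price in prices)


def sell_at_forecast_peak(prices, price_forecast, volume):
    """Sell everything in the minute where the *forecast* price is highest.

    The revenue is computed with the realized price of that minute, so a poor
    forecast directly reduces the result.
    """
    best_minute = max(range(len(price_forecast)), key=price_forecast.__getitem__)
    return prices[best_minute] * volume


# Hypothetical realized prices and price forecast (EUR/MWh) for four minutes.
realized = [48.0, 52.0, 50.0, 46.0]
forecast = [49.0, 51.0, 50.0, 47.0]

print(sell_evenly(realized, 10.0))                      # 490.0
print(sell_at_forecast_peak(realized, forecast, 10.0))  # 520.0
```

A learned agent would be evaluated against rules of this kind on the same price and forecast data, so that differences in revenue can be attributed to the strategy rather than to the market situation.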

Because the scope and the simulation level of the trading agent go beyond the current state of research, the results were published in the Energy and AI journal. The publication places Fraunhofer IEE at the forefront of reinforcement-learning-based energy trading, generating both public visibility and scientific value. At the same time, the experiments showed that open research questions remain in this field. Further research projects building on the PDET results are therefore planned to investigate the direct use of the agent on the electricity market.

Project Schedule

  • Setting up the simulation environment and preparing wind and price forecasts following [Scholz et al. 2020]
  • Implementing the training and analyzing the results based on new traded products
  • Preparing the results in the paper
  • Submitting and reviewing the paper
  • The paper was accepted by the Energy and AI journal

Project partners

Project participants: Malte Lehna, Christoph Scholz, René Heinrich, Björn Hoppmann

  • Publication in the Energy and AI journal (already accepted)
  • Open access is planned, link will be provided later


[Jaderberg et al. 2017] Jaderberg, Max, et al. “Population based training of neural networks.” arXiv preprint arXiv:1711.09846 (2017). 

[Scholz et al. 2020] Scholz, Christoph, et al. “Towards the Prediction of Electricity Prices at the Intraday Market Using Shallow and Deep-Learning Methods.” Workshop on Mining Data for Financial Applications. Springer, Cham, 2020. 

[Schulman et al. 2017] Schulman, John, et al. “Proximal policy optimization algorithms.” arXiv preprint arXiv:1707.06347 (2017). 

Malte Lehna

Fraunhofer IEE

+49 (0) 160 3412279
