Abstract
In recent years, so-called stochastic power producers (with portfolios including wind and solar power generation capacities) have increasingly been asked to participate in electricity markets under the same rules as conventional generators. Stochastic power producers may act strategically in order to decrease the expected penalties induced by imbalances. Many alternative offering strategies based on forecasts in various forms are available in the literature. However, they assume some form of knowledge of the future market state and potential balancing prices. In contrast, we explore here whether algorithms could readily learn from market data and deduce how to offer strategically in order to maximize expected market revenues. Our analysis shows that a direct reinforcement learning algorithm can track the nominal level of the optimal quantile forecast to trade in the day-ahead market, while yielding higher revenues than existing benchmark strategies.