In this paper, we study the problem of assessing the effectiveness of a proactive defense-by-detection policy with a network-based moving target defense. We model the network system using a probabilistic attack graph, a graphical security model. Given a network system with a proactive defense strategy, an intelligent attacker needs to perform reconnaissance repeatedly to learn the locations of intrusion detection systems and re-plan optimally to reach the target while avoiding detection. To compute the attacker's strategy for security evaluation, we develop a receding-horizon planning algorithm using a risk-sensitive Markov decision process with a time-varying reward function. Finally, we implement both the defense and attack strategies in a synthetic network and analyze how the frequency of network randomization and the number of detection systems influence the attacker's success rate. This study provides insights for designing proactive defense strategies against online and multi-stage attacks by a resourceful attacker.