Heating, ventilation, and air conditioning (HVAC) systems are highly energy-intensive, accounting for 40% of total building energy consumption. It is therefore crucial to design energy-efficient building thermal control policies that reduce HVAC energy consumption while maintaining occupant comfort. Implementing such a policy is challenging, however, because it depends on many influencing factors in the building environment that are hard to model and that vary from building to building. To address this challenge, we propose a deep reinforcement learning based framework for energy optimization and thermal comfort control in smart buildings. We formulate building thermal control as a cost-minimization problem that jointly considers the energy consumption of the HVAC system and the thermal comfort of the occupants. To solve this problem, we first adopt a deep neural network based approach to predict the occupants’ thermal comfort, and then adopt Deep Deterministic Policy Gradients (DDPG) to learn the thermal control policy. We implement a building thermal control simulation system and evaluate our method under various settings. The experimental results show that our method improves thermal comfort prediction accuracy and reduces HVAC energy consumption while improving the occupants’ thermal comfort.
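As a rough illustration of the joint objective described above, the per-step cost could be sketched as HVAC energy plus a weighted penalty for predicted discomfort. This is a minimal sketch under stated assumptions, not the formulation used in this work: the function name `thermal_control_cost`, the comfort scale centered on 0, the acceptable band of ±0.5, and the weight `comfort_weight` are all hypothetical choices made for the example.

```python
def thermal_control_cost(energy_kwh, comfort_value,
                         comfort_band=(-0.5, 0.5), comfort_weight=1.0):
    """Hypothetical per-step cost: HVAC energy plus a weighted comfort penalty.

    energy_kwh     -- energy used by the HVAC system in this control step
    comfort_value  -- predicted comfort index (assumed scale: 0 is neutral)
    comfort_band   -- assumed acceptable range for the comfort index
    comfort_weight -- assumed trade-off between energy and comfort
    """
    low, high = comfort_band
    # Distance of the predicted comfort value from the band; 0.0 when inside it.
    discomfort = max(low - comfort_value, 0.0, comfort_value - high)
    return energy_kwh + comfort_weight * discomfort


# Inside the band, only energy contributes to the cost.
cost_comfortable = thermal_control_cost(2.0, 0.0)
# Outside the band, the weighted deviation is added on top of the energy term.
cost_too_warm = thermal_control_cost(2.0, 1.5, comfort_weight=3.0)
```

In a reinforcement learning setup, the negative of such a cost would typically serve as the reward, with `comfort_value` supplied by the comfort predictor and `comfort_weight` controlling how strongly the agent favors comfort over energy savings.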