Building energy efficiency has attracted growing attention in recent years. Occupancy level is a key factor in achieving it, because it directly affects the energy-related control systems in buildings. Among the various sensors used for occupancy estimation, environmental sensors are distinctive in being non-intrusive and low-cost. In general, occupancy estimation with environmental sensors consists of two stages: feature engineering and learning. Traditional feature extraction requires significant features to be selected manually, without any guidelines. This handcrafted process demands strong domain knowledge and will inevitably miss useful implicit features. To address these problems, this chapter presents a Convolutional Deep Bi-directional Long Short-Term Memory (CDBLSTM) method. It uses a convolutional neural network with a stacked architecture to learn local sequential features from scratch, directly from raw environmental sensor data. An LSTM network then encodes the temporal dependencies among these local features, and a bi-directional structure allows past and future contexts to be considered simultaneously during feature learning. We conduct real-world experiments to compare CDBLSTM with several state-of-the-art approaches to building occupancy estimation. The results indicate that CDBLSTM outperforms all of the compared state-of-the-art approaches.
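
The data flow described above (local features from a convolution over the raw sensor sequence, then a forward and a backward recurrent pass whose states are paired at each time step) can be illustrated with a minimal pure-Python sketch. This is not the chapter's implementation. It assumes a single sensor channel, a hand-picked kernel instead of learned filters, and a simple tanh recurrence standing in for the LSTM cell. The function names, weights, and readings are all hypothetical.

```python
import math

def conv1d(seq, kernel):
    """Valid 1D convolution (cross-correlation) followed by ReLU."""
    k = len(kernel)
    return [max(0.0, sum(w * x for w, x in zip(kernel, seq[i:i + k])))
            for i in range(len(seq) - k + 1)]

def rnn_pass(features, w_in, w_rec):
    """Simple tanh recurrence (a stand-in for an LSTM cell).

    Returns one hidden state per input step.
    """
    h, states = 0.0, []
    for x in features:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states

def bidirectional_encode(features, w_in=0.5, w_rec=0.5):
    """Pair each step's forward state (past context) with its
    backward state (future context)."""
    forward = rnn_pass(features, w_in, w_rec)
    backward = rnn_pass(features[::-1], w_in, w_rec)[::-1]
    return list(zip(forward, backward))

# Hypothetical scaled CO2 readings; not data from the chapter.
readings = [0.40, 0.42, 0.47, 0.55, 0.61, 0.60, 0.52]

# Hand-picked kernel acting as a crude rising-trend detector (assumption).
local = conv1d(readings, kernel=[-1.0, 0.0, 1.0])
encoded = bidirectional_encode(local)

for step, (fwd, bwd) in enumerate(encoded):
    print(f"step {step}: forward={fwd:.3f} backward={bwd:.3f}")
```

In a real model the kernel and recurrent weights would be learned, several convolutional and recurrent layers would be stacked, and the paired states would feed a classifier that outputs the occupancy level.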