Huffman coding, a lossless data compression technique, is widely used in text compression. The traditional approach has deficiencies, however: applying the same character-level coding to all text overlooks the special status of keywords and special statements, as well as the regularity of some statements. To address this, this paper proposes a new data compression algorithm based on semantic analysis. The method takes C language keywords as its basic coding elements and targets the compression of C source files. Experimental results show that the compression ratio improves by roughly 150 percent. The method can be extended to text compression for other constrained languages.
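The idea of treating keywords as coding elements can be illustrated with a minimal Python sketch. It is not the algorithm evaluated in this paper: the keyword set `C_KEYWORDS` is a small illustrative subset, the helper names `tokenize`, `huffman_code_lengths`, and `encoded_bits` are hypothetical, and the bit count ignores the cost of storing the code table. The sketch only shows how a keyword can become a single Huffman symbol instead of a run of characters.

```python
import heapq
import re
from collections import Counter

# Assumption: a small illustrative subset of C keywords, not the paper's list.
C_KEYWORDS = {"int", "return", "if", "else", "for", "while", "char", "void"}


def tokenize(source, keywords=C_KEYWORDS):
    """Split source into symbols: whole keywords where matched, else single characters."""
    symbols = []
    # Identifier-like runs are matched whole, so "int" inside "print" is not a keyword.
    pattern = r"[A-Za-z_][A-Za-z0-9_]*|."
    for match in re.finditer(pattern, source, flags=re.DOTALL):
        text = match.group(0)
        if text in keywords:
            symbols.append(text)   # one symbol for the whole keyword
        else:
            symbols.extend(text)   # one symbol per character
    return symbols


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code over the symbols."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(count, i, {sym: 0}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in d1.items()}
        merged.update({s: depth + 1 for s, depth in d2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encoded_bits(symbols):
    """Total payload size in bits when the symbols are Huffman coded."""
    lengths = huffman_code_lengths(symbols)
    return sum(lengths[s] for s in symbols)
```

Comparing `encoded_bits(list(source))` with `encoded_bits(tokenize(source))` on a C file gives a rough sense of the effect of keyword symbols on the payload size. The outcome depends on the input and on the keyword set, so this comparison should not be read as reproducing the reported results.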