In this paper, we propose an enhanced version of the vanilla transformer for data-to-text generation and use it as the generator of a conditional generative adversarial network to improve the semantic quality and diversity of the output sentences. Specifically, the enhanced transformer adds a diagonal mask matrix to the attention scores of the encoder and uses the history of the attention weights in the decoder, which helps prevent semantic defects in the output text. Using this enhanced transformer as the generator and a triplet network as the discriminator of the conditional generative adversarial network further improves the diversity and semantic quality of the generated sentences. To evaluate the proposed model, called conditional generative adversarial with enhanced transformer (CGA-ET), we performed experiments on three different datasets and observed that it achieves better results than the baseline models in terms of the automatic evaluation metrics BLEU, METEOR, NIST, ROUGE-L, CIDEr, BERTScore, and SER, as well as in human evaluation.