Deep submicron CMOS technology for low-power, low-voltage applications requires the use of symmetric n+/p+ poly gate structures. This requirement introduces a number of processing challenges, involving fundamental issues of atomic diffusion over distances of 1Å to ∼30μm. Two of the critical issues are dopant cross-diffusion between P- and NMOS devices with connected gates, resulting in large threshold voltage shifts, and boron penetration through the gate oxide. We show that in devices with W-polycide dual-gat:e structure most of these problems can be alleviated by using rapid thermal annealing, RTA, in combination with a few additional, simple processing steps (e. g., low-temperature recrystallization of a-Si layer and selective nitrogen coimplants). The RTA step, in particular, ensures thai: the boron activation in the p+ poly-Si remains high and negates any effects of arsenic cross-diffusion. CMOS devices with properly processed gates have low gate stack profiles, small threshold voltage shifts (<30mV), and excellent device characteristics.