Depth maps acquired with ToF cameras have a limited accuracy due to the high noise level and to the multi-path interference. Deep networks can be used for refining ToF depth, but their training requires real world acquisitions with ground truth, which is complex and expensive to collect. A possible workaround is to train networks on synthetic data, but the domain shift between the real and synthetic data reduces the performances. In this paper, we propose three approaches to perform unsupervised domain adaptation of a depth denoising network from synthetic to real data. These approaches are respectively acting at the input, at the feature and at the output level of the network. The first approach uses domain translation networks to transform labeled synthetic ToF data into a representation closer to real data, that is then used to train the denoiser. The second approach tries to align the network internal features related to synthetic and real data. The third approach uses an adversarial loss, implemented with a discriminator trained to recognize the ground truth statistic, to train the denoiser on unlabeled real data. Experimental results show that the considered approaches are able to outperform other state-of-the-art techniques and achieve superior denoising performances.