Both facial expression and tone of voice represent key signals of emotional communication but their brain processing correlates remain unclear. Accordingly, we constructed a novel implicit emotion recognition task consisting of simultaneously presented human faces and voices with neutral, happy, and angry valence, within the context of recognizing monkey faces and voices task. To investigate the temporal unfolding of the processing of affective information from human face-voice pairings, we recorded event-related potentials (ERPs) to these audiovisual test stimuli in 18 normal healthy subjects; N100, P200, N250, P300 components were observed at electrodes in the frontal-central region, while P100, N170, P270 were observed at electrodes in the parietal-occipital region. Results indicated a significant audiovisual stimulus effect on the amplitudes and latencies of components in frontal-central (P200, P300, and N250) but not the parietal occipital region (P100, N170 and P270). Specifical)