Cross-dialectal perspective on Mandarin neutral tone

Chenzi Xu
MPhil, DPhil (Oxon)

University of York

2022-10-04 (updated: 2022-10-25)

About me

Postdoctoral Research Associate, University of York

Person-specific Automatic Speaker Recognition: Understanding the behaviour of individuals for applications of ASR


DPhil Candidate, MPhil (Distinction), University of Oxford

Have you heard of?


  • Valley Girl accent
  • Multicultural London English
  • Plastic Mandarin?


Language variation and change

Outline


  1. What is Plastic Mandarin?
  2. Neutral Tone in Mandarin
  3. Research Question
  4. Data
  5. Method
  6. Analysis and Discussion

What is Plastic Mandarin?

Plastic Mandarin

  • Location: A Changsha Mandarin
    • Changsha, Hunan, China
  • Domains and speakers: A crystallised urban youth speech
    • Predominantly used in schools: de facto lingua franca
    • Millennials and younger: competent “trilinguals”

Plastic Mandarin

  • Origin of the name: An inauthentic Mandarin
    • 塑料普通话 sùliào pǔtōnghuà (literally, ‘plastic Mandarin’)
    • 塑普 sùpǔ “Bad Mandarin”
  • Attitudes: A favourable Mandarin

“Everybody speaks Plastic Mandarin here. If I did not speak it, it would be embarrassing.”

“Speaking Plastic Mandarin is down-to-earth without affectation.”

“I am proud of Plastic Mandarin.”

Defining Characteristic

Lexical tones

Standard Mandarin

Plastic Mandarin

Neutral Tone in Mandarin

Occurrence, obligatoriness, and functions


Neutral tone syllables

  • Many are grammatical morphemes
  • Never in initial positions
  • Some have lexical tones in isolation
  • Neutral tone may be optional
  • Share of syllables in colloquial speech: about one-third (Duanmu, 2007)
  • Share of syllables in written texts: 15%-20% (Li, 1981)

Occurrence, obligatoriness, and functions


Obligatory neutral tones

  • Lexical idiosyncrasy
  • Part of speech

Vowel, duration, and pitch patterns


  • Half the duration of the same syllable carrying a lexical tone in the same position in disyllabic phrases (Lin and Yan, 1980)
  • Approximately 60% (Cao, 1986), or between 33% and 60% (Jing, 2002), of the duration of the preceding lexical-tone syllable

Neutral tone syllables

  • Always unstressed
  • Vowel centralisation (often)
  • Reduced duration (often)
  • Segment deletion
  • Voicing or nasalisation change

Vowel, duration, and pitch patterns


  • Varied but predictable, depending on preceding lexical tone
  • Falling, or lower than the offset of the preceding tone, after H, LH, and HL tones; rising, or higher than that offset, after the L tone (T3)

Approaches to neutral tone


Translated from qīng shēng (literally, ‘light or soft tone’)

Articulation: Soft pronunciation with less intensity (Hu & Xu, 1995)

Chao’s view: Neutralisation of contrasts among lexical tones in unstressed positions

  • Phonological representation
    • Feature spreading
    • Underlying or boundary L tone
  • Phonetic models of coarticulation
    • Target interpolation
    • Target approximation
  • Neutral tone typology

Progressive feature spreading


Derivation of surface values of neutral tone by Yip (1980)

  • Yip (1980)
    • Neutral tone: specified with a register feature [-upper]
    • A special rule of H insertion
    • Two-tier tonal framework
    • Tone 3: [-upper] LL
  • Shen (1992)
    • Neutral tone: not specified with any features
    • A floating H deletion rule
    • Matrix tone framework
    • Tone 3: [LLH]

Underlying or boundary L tone


  • Lin (2006)
    • Neutral tone: An underlying L tone
    • After Tone 3: Dissimilation (Obligatory Contour Principle)
  • Li (2003)
    • Neutral tone: A post-lexical prosodic word boundary L tone
    • Prosodic word:
      • Regardless of their morphosyntactic structure
      • A neutral tone syllable or a sequence of neutral tone syllables

Speech models of coarticulation


  • van Santen et al. (1998)
    • Neutral tone: A single mid target slightly before the end of a syllable
    • Target shifts from mid to low when following a falling tone
    • Target interpolation
  • The STEM-ML model (Kochanski & Shih, 2003; etc.)
    • Neutral tone: No soft template
    • Bi-directional assimilation effects
  • The PENTA model (Xu & Wang, 2001; etc.)
    • Neutral tone: A static and mid target (Chen & Xu, 2006)
    • Target approximation

Typology: Tone loss versus de-stressing


  • Shen (1992)
    • Toneless (underlyingly toneless)
      • Functional morphemes
      • Pitch does not belong to or derive from the four lexical tones
    • Detonic (underlyingly toned)
      • Obligatory; lead to phonemic distinctions in about 200 minimal pairs
    • Atonic (underlyingly toned)
      • Idiosyncratic and optional neutral tones
      • Resulting from fast, careless speech
  • Zhang (2022)
    • Intrinsic (underlyingly toneless)
      • Functional morphemes
      • Underspecified tonal target
      • Lexicalised early, before the Ming dynasty (mid-14th century or earlier)
    • Derived (underlyingly toned)
      • Underlyingly the same as their lexical tones
      • Triggered by stress shift in compounds and phrases, motivated by semantic shifts, the disappearance of tone, and contact with Manchu

Research Question




1. How do neutral tone patterns vary in different contexts?

2. Is there a pitch target for neutral tone?

3. How do neutral tone patterns differ between Standard Mandarin and Plastic Mandarin?

Data

Fieldwork and recording


Changsha Nanya Middle School

  • Boarding school:
    • Dense and multiplex social networks
    • High frequencies of interaction
    • Exact context where Plastic Mandarin predominates
  • Premises: Music room
  • Participants: 21 in total
    • 16 females and 5 males
    • Age: 17.24 \(\pm\) 0.7 years
    • On average 15.71 years in Changsha

The music room, with porous sound-absorbing wall materials, used for speech recording at Changsha Nanya Middle School, Hunan, China.

Fieldwork and recording


Phonetics Laboratory, University of Oxford

  • Premises: Audio Studio
  • Participants: 14 in total
    • 9 females and 5 males
    • Age: 24 \(\pm\) 1.96 years
    • From Mandarin-speaking regions in northern China
    • Eight spent more than 10 years in Beijing
    • None of them claimed to speak or frequently use another Chinese variety or Mandarin dialect

The audio studio with soft furnishings for speech recording in the Phonetics Laboratory, University of Oxford.

Disguised friendship game

  • Peer group pair design
  • Informal context for a non-standard accent
  • Embedded with carefully designed speech materials
    • Information structure
    • Syllable structure and segments
    • Tonal combination
    • Repetition

“to allow the interaction of actual peer group itself to control the level of language produced” (Labov, 1972, p.115)

Disguised friendship game


An excerpt of the transcript of a conversation during the game.

An example prompt used in the word guessing game. The scenario is “the (blank to be filled) they built is very stable”. The three keywords are snowman, castle, and stairs.

Speech materials


X-de

  • The de-marked modifiers: full-fledged relative clauses
  • Most disyllabic de constructions: pre-focus position

Speech materials


X1X2menX4

  • All appear at the beginning of an eight-syllable sentence
  • The first two syllables are reduplicatives
  • Particle zheng does not usually cliticise to the left

Method

Acoustic Preprocessing


  • Transcription and annotation
  • Forced alignment
  • Sound interval extraction
  • Acoustic measurement

Statistical Techniques

  • Hierarchical \(f_0\) time series: dynamic trajectories
    • \(f_0\) at various time points arranged into \(f_0\) contours
    • \(f_0\) contours grouped according to speakers, tones, items
  • Data Normalisation
    • Inter-speaker comparison: \(f_0\,(\text{semitone}) = 12 \log_2\left(\dfrac{f_0}{f_s}\right)\)
    • Inter-token comparison: linear scaling of time at syllabic level
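
A minimal sketch of the two normalisation steps in base R. The function names and toy values are illustrative only, and taking \(f_s\) to be the speaker's mean \(f_0\) is an assumption; the slides do not state which reference value was used.

```r
# Convert f0 in Hz to semitones relative to a speaker-specific reference.
# Assumption: the reference f_s is the speaker's mean f0.
hz_to_semitone <- function(f0_hz, f_ref) {
  12 * log2(f0_hz / f_ref)
}

# Linearly rescale the sample times of one syllable to the interval [0, 1].
normalise_time <- function(t) {
  (t - min(t)) / (max(t) - min(t))
}

# Toy contour (hypothetical values): five f0 samples from one syllable.
contour <- data.frame(
  speaker = "S01",
  time_s  = c(0.50, 0.52, 0.54, 0.56, 0.58),
  f0_hz   = c(220, 210, 200, 190, 180)
)

f_s <- mean(contour$f0_hz)  # speaker reference (assumed: mean f0)
contour$f0_st  <- hz_to_semitone(contour$f0_hz, f_s)
contour$t_norm <- normalise_time(contour$time_s)

print(contour)
```

A sample at the reference frequency maps to 0 semitones and a sample one octave above it to 12, so contours from speakers with different pitch ranges become comparable. After time scaling, every syllable runs from 0 to 1 regardless of its raw duration.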

Statistical Techniques


  • Modelling variation: Generalised Additive Mixed Model (GAMM)
    • The relaxation of the linearity assumption: a sum of smooth functions of one or more independent variables (Wood, 2006)
    • Capturing the nonlinear patterns: \(f_0\) contours with temporal autocorrelational structure (Chuang et al., 2021)
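
For orientation, the general form of an additive model can be written as follows. The notation is generic and is not taken from the slides.

\[ y_i = \beta_0 + \sum_{j} f_j(x_{ij}) + \varepsilon_i \]

Here \(y_i\) would be a semitone-scaled \(f_0\) value, each \(f_j\) is a smooth function estimated from the data (for example, of normalised time or duration), and \(\varepsilon_i\) is the error term. A mixed model additionally includes random effects, such as by-speaker smooths.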

Model fitting (a glimpse)


library(mgcv)  # provides bam(), s(), ti()

# GAMM of semitone f0 (f0_st) over normalised time (t_norm), with
# scaled-t errors and an AR(1) error model
model.ar1.scat <- bam(
  f0_st ~ prevtone.ord + foltone.ord +  # parametric terms
    # reference smooth curves
    s(t_norm) +
    s(duration, k = 15) +
    # difference curves
    s(t_norm, by = prevtone.ord) +
    s(t_norm, by = foltone.ord) +
    # reference + difference surfaces
    ti(t_norm, duration) +
    ti(t_norm, duration, by = prevtone.ord) +
    ti(t_norm, duration, by = foltone.ord) +
    # by-speaker random smooths for the effect of prevtone/foltone
    s(t_norm, speakerpf, bs = "fs", m = 1, k = 5),
  data = f0mens, method = "fREML",
  family = "scat", discrete = TRUE,
  AR.start = f0mens$start.event,  # marks the first sample of each time series
  rho = f0.acf1s                  # AR(1) correlation parameter
)
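
The call above takes two inputs that are prepared beforehand: a logical start.event column and an AR(1) parameter rho. The slides do not show how these were obtained. The sketch below illustrates one common approach in base R, under the assumption that rho is set to the lag-1 autocorrelation of residuals from a model fitted without the AR(1) term. The data frame, column values, and helper function are hypothetical.

```r
# Toy long-format data (hypothetical): two f0 contours, with rows sorted by
# contour and then by time. The `resid` column stands in for residuals of a
# model fitted without an AR(1) error structure.
d <- data.frame(
  contour_id = rep(c("c1", "c2"), each = 5),
  t_norm     = rep(seq(0, 1, length.out = 5), times = 2),
  resid      = c(0.4, 0.3, 0.1, -0.2, -0.3, -0.5, -0.2, 0.0, 0.3, 0.4)
)

# start.event: TRUE on the first row of each contour, FALSE elsewhere.
d$start.event <- !duplicated(d$contour_id)

# Lag-1 autocorrelation computed within contours, so the last sample of one
# contour is never paired with the first sample of the next.
lag1_within <- function(x, start) {
  prev <- c(NA, head(x, -1))  # previous sample for every row
  prev[start] <- NA           # no predecessor at the start of a contour
  cor(x, prev, use = "complete.obs")
}

rho_est <- lag1_within(d$resid, d$start.event)
print(d$start.event)
print(rho_est)
```

A clearly positive rho_est indicates that neighbouring \(f_0\) samples within a contour have correlated errors, which is the situation the AR.start and rho arguments are meant to address.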

Analysis and Discussion

Preceding tone matters

Plastic Mandarin /(t)a tə/ and /(C)uei tə/

Preceding tone matters

Standard Mandarin /(t)a tə/ and /(C)uei tə/

Duration matters


Standard Mandarin disyllabic tokens of dǎ de

Duration matters


Speaker 208 dǎ de

Speaker 205 dǎ de

Following tone: not so much

Plastic Mandarin [X1X2men]

Following tone: not so much

Standard Mandarin [X1X2men]

Pitch target? Promising!

Plastic Mandarin

Standard Mandarin

Pitch target? Promising!

[X1X2men de]

Plastic Mandarin

Standard Mandarin

Invariant pitch target of neutral tone

Discussion


  • Neutral tone(s): Approaching a low pitch target
  • Extent of preceding tone influence: Two neutral tone syllables
  • Asymmetric influence of following tone: Boundary effect?
  • Domain of reduplication: Entire reduplicated structure [X1X2] tends to share the lexical tone pattern of X1?

Fitzpatrick-Cole (1996) shows that in Bengali the entire reduplicated phrasal structure, including the base phrase and the copy, shares one expected contour, such as the tune for a focus nucleus (L*Hφ), and argues on that basis that the entire structure forms a single Phonological Phrase.

Discussion


  • Neutral tone syllables lean leftwards prosodically, similar to encliticising function words in Germanic languages
  • The location of a neutral tone syllable always coincides with the right edge of a prosodic constituent

Attracting the boundary L tone?

  • Boundary tone L: attached to the right edge of the outermost prosodic constituent in a recursive structure.

Discussion

Standard-Plastic language variation

  • Lexical tones: Systematic tone change
  • Neutral tone: Prosodically conditioned in a similar fashion
    • Pitch target
    • Strength of effects of neighbouring tones

Cross-dialectal insight: Prosodic constancy?

  • Li and Chen (2019): a mid-low neutral tone target in Tianjin Mandarin

Thank you! Any questions?