My research program uses computational, quantitative, and experimental methods to investigate the cognitive representation of speech. At the same time, I provide concrete descriptions of language production, variation, and learning, with a specific focus on the languages of Pakistan. I am convinced that sustained research on individual languages often serves as the catalyst for major advances in linguistic theory. In addition, as a linguist who studies the structure of human languages, which are intimately intertwined with the cultures of the people who speak them, I believe I have a responsibility to help preserve these languages for the benefit of the academic community and, far more importantly, for the benefit of the communities to which these languages belong. My recent documentation efforts concentrate on Mankiyali and Pakistani Punjabi, but I have also worked with speakers of Hindko and Pakistani Pashto, among other understudied languages of Pakistan.
Uniform Moraic Quantity and Moraic Sonority
One area of research in which I am particularly interested is the typological analysis of syllable-weight phenomena. Most phonological processes sensitive to syllable structure (stress, tone, etc.) exhibit a shared implicational hierarchy, in which syllable types stand in a consistent weighting relationship. For example, if a language has a weight-sensitive stress system that treats {CVC} syllables as heavy, so that they attract stress, {CVː} syllables virtually always attract stress as well. On the other hand, if a stress system treats {CVː} syllables as heavy, this does not necessarily imply that {CVC} syllables are also heavy. This type of implicational relationship, among several others, generally holds across weight-sensitive processes. Taking advantage of this universal weight hierarchy, I have developed a theory of syllable weight that requires every coda consonant to be linked to a weight-bearing unit, commonly referred to as a ‘mora’. The theory also enriches the conception of moras by arguing that they are imbued with the sonority of the segment to which they are linked. It introduces a new syllable weight metric, the Moraic Sonority Metric, in which syllable weight is not simply the summation of moras in a syllable, but rather the summation of moras of a specified sonority. Importantly, the moraic structure of syllables remains constant under this proposal, but different processes can treat moras of different sonorities distinctly using the Moraic Sonority Metric. Previous theories of syllable weight have difficulty simultaneously accounting for syllable weight variation both cross-linguistically and within individual languages, but the current approach does so straightforwardly. A paper outlining the central concepts of the theory and a formal account of the cross-linguistic typology of weight-sensitive stress is set to appear in the journal Phonology.
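As a rough illustration of how a sonority-sensitive weight metric can hold moraic structure constant while letting individual processes differ, consider the following minimal sketch. It is not the formal proposal itself: the three-way sonority scale, the names `SONORITY`, `moras`, and `weight`, and the reading of "moras of a specified sonority" as "moras at least as sonorous as a process-specific threshold" are all assumptions made for the example.

```python
# Hypothetical three-step sonority scale (higher = more sonorous).
# The scale and its labels are illustrative assumptions.
SONORITY = {"obstruent": 1, "sonorant": 2, "vowel": 3}


def moras(nucleus_length, coda):
    """Return the sonority labels of a syllable's moras.

    Assumes one mora per vowel position in the nucleus and one mora for
    every coda consonant, so moraic structure never varies by process.
    """
    return ["vowel"] * nucleus_length + list(coda)


def weight(syllable_moras, threshold):
    """Count the moras that are at least as sonorous as `threshold`."""
    floor = SONORITY[threshold]
    return sum(1 for m in syllable_moras if SONORITY[m] >= floor)


CV = moras(1, [])              # light open syllable
CVC = moras(1, ["obstruent"])  # closed syllable with an obstruent coda
CVV = moras(2, [])             # syllable with a long vowel

# A process that counts every mora treats CVC and CVV alike.
all_moras = [weight(s, "obstruent") for s in (CV, CVC, CVV)]
# A process that counts only vocalic moras treats CVC like CV.
vocalic_only = [weight(s, "vowel") for s in (CV, CVC, CVV)]
```

Under these assumptions, `all_moras` is `[1, 2, 2]` and `vocalic_only` is `[1, 1, 2]`. Raising the threshold can remove {CVC} from the heavy class but never {CVː}, which mirrors the implicational relationship described above.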
In future work, I plan to explore how the framework outlined above accounts for other phonological processes that are sensitive to syllable weight, such as tone, word minimality, compensatory lengthening, and poetic meter. The languages of Pakistan provide fertile ground for further exploration of these issues, as many languages in the region have been found to exhibit a variety of weight-sensitive phonological processes.
Abstraction in speech representations
In another project, which makes up the core of my dissertation, I use quantitative evidence from original experiments measuring vowel nasality in Punjabi and Mankiyali to demonstrate that linguistic representations must be permitted to be completely abstract with respect to how the ‘basic’ forms of words and morphemes are stored in memory. In the Punjabi experiment, I provide oro-nasal air pressure data to show that vowels occurring before a nasal consonant are phonetically identical in nasality to contrastive nasal vowels. I argue that vowels preceding nasal consonants are underlyingly oral, but undergo a predictable process of nasalization. Nevertheless, only contrastive nasal vowels trigger regressive nasal harmony. Harmony is thus sensitive to whether a vowel is oral or nasal in its basic, underlying form – even for vowels which are always phonetically nasal. This implies that some vowels have completely abstract representations, distinct from their phonetic forms.
In a second experiment measuring oro-nasal air pressure on vowels in Mankiyali, I show that, though oral and nasal vowels are contrastive in the language, this contrast is neutralized before nasal consonant suffixes. Moreover, the neutralization is phonetically complete, in that no phonetic differences in nasal air pressure arise between the two underlyingly distinct vowels. This kind of phonetically complete neutralization is surprising within exemplar-based models that argue that the production of various allomorphs of a morpheme should be influenced by the productions of related allomorphs. However, within generative frameworks, in which abstract, symbolic phonological processes can completely merge two distinct sounds, it is unsurprising that two underlyingly distinct words would surface with phonetically identical realizations.
While the results of my research on vowel nasality in Punjabi and Mankiyali suggest that a level of representation exists in which sounds and words are stored in a relatively abstract form, an abundance of recent work does demonstrate that speech representations are also highly detailed and context-dependent. For example, speech production and perception have been found to be heavily modulated by the identity and social setting of the speaker (Johnson, 2006), and other studies show that the specific phonetic pronunciations that listeners are exposed to alter their future productions (Goldinger, 1998). Unless a level of representation exists where detailed phonetic information and social context are stored for individual lexical items, it is difficult to account for these findings. Given this, I argue that an accurate model of speech production and perception must necessarily include at least two levels of representation: one that is abstract (as in classic generative theories) and another that contains richly detailed, episodic, and redundant information (as in exemplar-based models). In the final chapter of my dissertation, I implement a neural network-based computational model of speech production and perception to demonstrate one approach for how these two levels of representation might be unified into a single system.
Moving forward on this project, I would like to further explore the ramifications of combining exemplar and generative theories into a single model. Both research programs have yielded many insights into the nature of language, but the two approaches often ask different questions and therefore account for different arrays of facts. A model like the one proposed in my dissertation takes a first step toward combining the strengths of both theories, but many questions remain unanswered.
Other Current Projects
Acoustic correlates of stress in Mankiyali
Together with Aurangzeb, a native speaker and researcher of Mankiyali, I explore the phonetic properties of weight-sensitive stress in Mankiyali, an endangered language spoken by around 500 people in the Khyber Pakhtunkhwa Province of northwest Pakistan. In a previous description of the language’s phonology, I used native-speaker judgments to establish that the language uses at least three distinct levels of weight when determining stress placement. Through detailed phonetic and quantitative analysis of the acoustic correlates of stress, we now provide evidence suggesting that the Mankiyali stress criterion uses a five-tiered scale. This paper is currently undergoing a second round of review at the Journal of the International Phonetic Association.
Learning non-alternating abstract URs
In one part of my dissertation research, I test several learning algorithms on pre-nasal vowel data from Punjabi. I find that previous computational learners fail to prefer a minimally abstract underlying representation (UR) over increasingly abstract alternatives. However, I demonstrate that if a computational learner is equipped with a bias toward minimizing the disparities between a UR and its corresponding surface representation (SR), the learner favors the minimally abstract UR. Given this bias, the search through the set of potential URs can be structured so that the URs with the fewest disparities from the SR are examined first: the search begins with the fully faithful UR and proceeds through URs with an increasing number of disparities until it finds a UR that assigns a sufficiently high probability to the observed data.
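The disparity-ordered search can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the learner from the dissertation: the function names `candidate_urs` and `find_ur`, the `alternatives` table of possible underlying correspondents, the stand-in `toy_likelihood` function, and the 0.5 threshold are all hypothetical, and a real learner would compute the probability of the observed data from a trained grammar.

```python
from itertools import combinations, product


def candidate_urs(sr, alternatives, k):
    """Yield every UR that differs from `sr` in exactly k positions.

    `alternatives` maps a surface segment to the segments it could
    correspond to underlyingly (a hypothetical table for this sketch).
    """
    positions = [i for i, seg in enumerate(sr) if alternatives.get(seg)]
    for combo in combinations(positions, k):
        for choice in product(*(alternatives[sr[i]] for i in combo)):
            ur = list(sr)
            for i, seg in zip(combo, choice):
                ur[i] = seg
            yield tuple(ur)


def find_ur(sr, alternatives, likelihood, threshold):
    """Search URs in order of increasing disparity from the SR.

    Starts with the fully faithful UR (k = 0) and returns the first UR
    whose likelihood reaches `threshold`, with its disparity count.
    """
    for k in range(len(sr) + 1):
        for ur in candidate_urs(sr, alternatives, k):
            if likelihood(ur) >= threshold:
                return ur, k
    return None


# Toy data: a surface nasal vowel before a nasal consonant.
sr = ("ã", "n")
alternatives = {"ã": ["a"], "n": ["m"]}


def toy_likelihood(ur):
    # Stand-in scores, invented for illustration only: URs with an
    # oral vowel are treated as fitting the observed data well.
    return 0.9 if ur[0] == "a" else 0.1


result = find_ur(sr, alternatives, toy_likelihood, threshold=0.5)
```

With these toy scores, the faithful candidate is rejected and the search stops at the first single-disparity candidate with an oral vowel, so `result` is `(("a", "n"), 1)`. Candidates with two disparities are never examined, which is the point of ordering the search by disparity count.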
Typological analysis of default stress patterns
Brett Hyde and I use the Weak Bracketing framework (Hyde 2002, 2016) to analyze an interesting stress pattern in Sentani, a Papuan language of New Guinea. In odd-parity forms, Sentani’s stress pattern is identical to iambic minimal alternation patterns, in which syllables are footed as iambs with stress on every even-numbered syllable from the left (e.g., ha.ˌxo.mi.ˈbo.xe ‘he followed them’). In four-syllable forms, Sentani exhibits a stress clash, with the final iamb transforming into a trochee to avoid word-final stress (e.g., mo.ˌxa.ˈna.le ‘I do (it) for him’), but in even-parity forms with six or more syllables, a word-medial stress lapse emerges such that stress on the penultimate foot is not realized (e.g., mo.ˌlo.ko.xa.wa.ˈlɛ.ne ‘I wrote to you’). In our analysis, we show that, with one constraint requiring word-level stress on the final foot and another requiring initial stress, the Sentani pattern is captured straightforwardly under the Weak Bracketing framework.