Episode 42: Privacy-Preserving Natural Language Processing with Patricia Thaine

Published: Sept. 11, 2020, 1 p.m.

Show Notes

  • (2:55) Patricia talked about her interest in learning languages and living in different cultures.
  • (4:05) Patricia talked about her experience volunteering as a translator at the International Network of Street Papers.
  • (5:00) Patricia studied Liberal Arts at John Abbott College, English Literature at Concordia University, and Computer Science and Linguistics at McGill University during her undergraduate years.
  • (8:06) Patricia worked as a Research Assistant at the McGill Language Development Lab, which studied how children learn different types of words and sentences.
  • (9:15) Patricia described her graduate school experience at the University of Toronto, where she researched lost language decipherment and writing systems.
  • (11:19) Patricia talked about MedStory, which is a text-oriented visual prototype built to support the complexity of medical narratives (spearheaded by Nicole Sultanum).
  • (12:35) Patricia explained her research paper, “Vowel and Consonant Classification through Spectral Decomposition.”
  • (15:29) Patricia unpacked her blog post, “Why is Privacy-Preserving NLP Important?”
  • (19:02) Patricia dissected her paper “Privacy-Preserving Character Language Modelling” that proposes a method for calculating character bigram and trigram probabilities over sensitive data using homomorphic encryption.
  • (21:13) Patricia wrote a two-part series called “Homomorphic Encryption for Beginners.”
  • (22:21) Patricia unwrapped her paper “Efficient Evaluation of Activation Functions over Encrypted Data” that shows how to represent the value of any function over a defined and bounded interval, given encrypted input data, without needing to decrypt any intermediate values before obtaining the function’s output.
  • (25:33) Patricia elaborated on her paper “Extracting Bark-Frequency Cepstral Coefficients from Encrypted Signals,” which claims that extracting spectral features from encrypted signals is the first step towards achieving secure end-to-end automatic speech recognition over encrypted data.
  • (27:38) Patricia explained why privacy is an essential attribute for speech recognition applications.
  • (29:53) Patricia discussed her comprehensive guide, “Perfectly Privacy-Preserving AI,” which dives into the four pillars of perfectly privacy-preserving AI and outlines potential combinatorial solutions to satisfy all four pillars.
  • (37:53) Patricia shared her take on the differences between working in academic and commercial settings (she is the founder and CEO of Private AI).
  • (40:50) Patricia talked about Private AI’s GALATEA Anonymization Suite, which anonymizes data at the source and encrypts it using quantum-safe cryptography.
  • (45:05) Patricia emphasized the importance of talking to customers when building a commercial product.
  • (46:58) Patricia shared her experience as a Postgraduate Affiliate at Vector Institute, which works with institutions, industry, startups, incubators, and accelerators to advance AI research and drive its application, adoption, and commercialization across Canada.
  • (49:09) Patricia shared her advice for young researchers: go deep into at least two domains and combine the knowledge.
  • (50:30) Patricia shared her excitement for privacy and NLP research in the upcoming years.
  • (52:36) Closing segment.

Her Contact Info

Her Recommended Resources