My research involves developing NLP and machine learning models for inference over subjective social media content. I am particularly interested in how to build these models in low-resource settings. I am also interested in using machine learning and data science for applications with a positive personal, social or humanitarian impact.
My work has been driven by real-world problems with an emphasis on applications for public health research and the social sciences.
I developed Twitter sentiment analysis models for journalists to gauge public reactions to newsworthy events. I also developed models for a political opinion mining system that tracks the popularity of, and attitudes toward, Portuguese politicians on social media over time. The indicators produced by the system were aligned with traditional polling data and published daily on the POPSTAR website. These indicators have been used by political scientists, such as Pedro Magalhães, for analyses of public opinion.
I developed User2Vec, a tool to infer neural embeddings (i.e. learned vector representations) of users, given their social media posts. The resulting user vectors capture latent personal traits, which can provide context to model highly subjective and ambiguous content. For example, I used them to build deep neural networks for sarcasm detection on Twitter.
I also used User2Vec embeddings to build models that estimate the likelihood of a person being affected by a mental illness, such as depression or PTSD, given their Twitter data.
Currently, I am investigating how to harness these methods to improve clinical and public health practices. One research thread that I am actively pursuing is how to build digital epidemiology systems to support large-scale, longitudinal and real-time studies over social media. Specifically, I am building systems to investigate behavioral disorders and how they affect different segments of the population, particularly underrepresented groups.
More broadly, I hope this research contributes to increasing our knowledge of complex and poorly understood illnesses and ultimately brings about a platform for precision public health, thereby enabling more responsive and deliberate health interventions.
|Social Media Analysis||Natural Language Processing||Machine Learning|
|Neural Networks||Deep Learning||Low-resource Learning|
|Computational Social Sciences||Mental-health Applications||Digital Epidemiology|
- Research Assistant @ Northeastern University NLP lab
- Visiting Researcher @ UTA iSchool
- Research Assistant @ INESC-ID Lisboa
- Junior Research Assistant @ LaSIGE
Master in Software Engineering @ Faculdade de Ciências, Universidade de Lisboa
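Thesis: Desenvolvimento e Reengenharia de Aplicações Web de Suporte ao Negócio e Integração com Sistemas de Business Intelligence (Development and Reengineering of Business-Support Web Applications and Integration with Business Intelligence Systems)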
Licentiate in Computer Science @ Faculdade de Ciências, Universidade de Lisboa